359 Commits

Author SHA1 Message Date
Rob Armstrong
c94ff366ae
Merge pull request #379 from ntalpallikar/bug_matrixmulDynlinkJIT_crash
Fix null pointer reference issue with cuda driver API function pointer.
2025-09-05 09:29:17 -07:00
Nikhil Talpallikar
f6504f40a5 fix formatting 2025-08-29 13:28:40 -07:00
Nikhil Talpallikar
b4aaab387e fixed formatting 2025-08-29 12:37:12 -07:00
ntalpallikar
0861db73ad
fixing indentation 2025-08-06 10:56:14 -07:00
Nikhil Talpallikar
6df7127c23 Fixed dlopen on linux with lazy load flag 2025-08-06 00:36:35 -07:00
Nikhil Talpallikar
d2c52db3e0 Fixed the error path to initialize error path function pointers. Exit with error in case of LOADLIBRARY failure, as initialization of function pointers will fail in that case 2025-08-06 00:29:22 -07:00
Nikhil Talpallikar
527b29dbd0 Clean implementation for failure path when cuInit fails. Removed CHECKED_CALL macro which returned prematurely 2025-08-05 13:49:28 -07:00
Nikhil Talpallikar
f8aab0053f Clean implementation for failure path when cuInit fails 2025-08-05 13:46:44 -07:00
Matt Davis
1b36fefcd5 Test the compute capability minor number prior to using the result value. 2025-08-04 18:00:26 +00:00
Nikhil Talpallikar
fd513b4846 Fix null pointer reference issue with cuda driver API function pointers in case cuInit fails 2025-08-01 10:27:09 -07:00
Rob Armstrong
a5267b83a5 Merge branch 'shawnz_bug_fix' into 'master'
Bug 5412815: Fix the issue of cudaTensorCoreGemm.cu

See merge request cuda-samples/cuda-samples!125
2025-07-28 10:18:27 -07:00
Matt Davis
903c840a8c Update the README mentioning testing requires a compatible driver. 2025-07-25 12:40:16 +00:00
Matt Davis
24e7c5b428 Update compute capability tests to look for >=sm75. 2025-07-25 10:28:53 +00:00
shawnz
b38ed29c95 Bug 5412815: Fix the issue of cudaTensorCoreGemm.cu 2025-07-25 15:16:12 +08:00
shawnz
2369d4c649 Fix the FindNVSCI.cmake syntax error and the typo in README.md 2025-07-10 17:13:31 +08:00
Rob Armstrong
8433f89993 Globally remove VSCode tasks.json referencing obsolete build steps 2025-07-07 08:31:22 -07:00
Rob Armstrong
45314d7ff8 README formatting changes 2025-06-24 12:44:08 -07:00
shawnz
0d61031846 Bug 5355361: Update README.md of 7_libNVVM for forward compatibility sample build 2025-06-24 17:04:28 +08:00
shawnz
e674cc36fe Bug 5339530: Set socket creating folder to /tmp for QNX 2025-06-13 16:34:40 +08:00
shawnz
ce28796d6c Bug 5189457: Disable -no-pie for hpc 2025-06-11 16:25:22 +08:00
shawnz
9424c15848 Bug 5331767: Specify sm list of cdp samples for QNX 2025-06-10 15:33:07 +08:00
shawnz
9075c50a3d Bug 5323034 and 5323144: Disable .rsp for linking as qcc doesn't support lib path with double quotes in .rsp on QNX 2025-06-10 15:21:29 +08:00
shawnz
de5fa98e6e Bug 5323118: Remove the -lpthread and -lrt which are not supported on QNX 2025-06-10 15:18:04 +08:00
shawnz
6c9e9d3cd2 Bug 5323163: Get correct cuda include path for finding header files 2025-06-10 15:15:56 +08:00
shawnz
7f5390cec3 Bug 5323124: Waive simpleAWBarrier on QNX 2025-06-10 15:14:33 +08:00
shawnz
5f6d46dfea Bug 5323018: Update the CMakeLists.txt and Common/helper_multiprocess.cpp of ptxjit and memMapIPCDrv for QNX cross build 2025-06-09 19:53:25 +08:00
shawnz
49307463b5 Bug 5133216: Add QNX toolchain and cross build support 2025-05-29 16:57:28 +08:00
Rob Armstrong
9c0d5aaae6 Merge branch 'shawnz_bug_fix' into 'master'
Bug 5305842 and 5305854: Update CMakeLists for SBSA CUDA Toolkit support on aarch64-linux platforms

See merge request cuda-samples/cuda-samples!115
2025-05-28 08:26:16 -07:00
shawnz
0d08748ffa Bug 5305842: Update CMakeLists.txt of cdp samples for SBSA CUDA Toolkit support on aarch64-linux platforms 2025-05-28 11:42:49 +08:00
Rob Armstrong
27f47634a0 Merge branch 'shawnz_bug_fix' into 'master'
Bug 5300528: Add MPI_C_LIBRARIES for user defined MPI path

See merge request cuda-samples/cuda-samples!114
2025-05-23 07:57:54 -07:00
shawnz
40d297dfe7 Bug 5300528: Add MPI_C_LIBRARIES for user defined MPI path 2025-05-23 14:47:55 +08:00
Rob Armstrong
20b90b5063 Merge public 12.9 changes into 13.0 dev 2025-05-22 11:45:07 -07:00
Rob Armstrong
8a9e2c830c
Update 1_Utilities/README.md to redirect bandwidthTest to NVBandwidth (#371) 2025-05-22 11:43:14 -07:00
Rob Armstrong
f47f06f077 Merge branch 'shawnz_bug_fix' into 'master'
Update the CMAKE_CUDA_ARCHITECTURES from 50 to 75 for Samples/7_libNVVM/cuda-c-linking

See merge request cuda-samples/cuda-samples!113
2025-05-22 08:16:44 -07:00
shawnz
0c78a3e0de Update the CMAKE_CUDA_ARCHITECTURES from 50 to 75 for Samples/7_libNVVM/cuda-c-linking 2025-05-22 15:52:51 +08:00
Rob Armstrong
8219570c15 Merge branch 'shawnz_bug_fix' into 'master'
Bug fix for 5280038, 5277193, 5281036 and 5294720

See merge request cuda-samples/cuda-samples!112
2025-05-21 10:51:12 -07:00
Rob Armstrong
fc88988f23 Merge public changes to internal ToT 2025-05-21 09:28:35 -07:00
shawnz
1141dd7af4 Bug 5281036: Limit the register number of debug version for cdpAdvancedQuicksort 2025-05-20 14:47:44 +08:00
shawnz
da3b7a2b3c Update the vulkanImageCUDA/vulkanImageCUDA.cu for Windows headers 2025-05-19 17:43:08 +08:00
shawnz
5987a9e9fa Update transpose for code format check 2025-05-19 17:38:42 +08:00
shawnz
107f3f537f Update the include files sequence for vulkan samples on Windows 2025-05-19 17:38:22 +08:00
shawnz
c1b03b9f81 Bug 5277193: Remove the CUFFT_LICENSE_ERROR checking as it is deprecated since CUDA 13.0 2025-05-16 11:21:07 +08:00
shawnz
ee0ee417c9 Update the sample list for API changes in CHANGELOG 2025-05-14 16:02:17 +08:00
shawnz
ebc1078379 Bug 5280038: Update cuda-c-linking as per CUDA 13.0 API change 2025-05-14 15:54:41 +08:00
Rob Armstrong
2861e78272 Merge branch 'peggyt_bug_fix' into 'master'
Bug 5056055: limit register usage to 128 per thread in debug mode

See merge request cuda-samples/cuda-samples!110
2025-05-12 08:55:45 -07:00
shawnz
8f33cc6094 Bug 5274280: Enable 8_Platform_Specific/Tegra/EGLSync_CUDAEvent_Interop 2025-05-12 15:02:31 +08:00
shawnz
2ec9cf394a Bug 5272236: Update the include file copy path as path changes on 13.0 2025-05-12 15:00:52 +08:00
Peggy Tian
770e433a9e Bug 5056055: limit register usage to 128 per thread in debug mode to comply with the maximum number of 32-bit registers per SM 2025-05-12 06:04:22 +00:00
Rob Armstrong
c6af90553e Merge branch 'master' into 'master'
Bug 5236593: Increase the pending kernel launch limit to 4096

See merge request cuda-samples/cuda-samples!109
2025-05-07 09:47:51 -07:00
Peggy Tian
611008fa86 Bug 5236593: Increase the pending kernel launch limit to 4096 2025-05-07 17:38:52 +08:00