Commit Graph

  • c94ff366ae Merge pull request #379 from ntalpallikar/bug_matrixmulDynlinkJIT_crash master Rob Armstrong 2025-09-05 09:29:17 -07:00
  • f6504f40a5 fix formatting Nikhil Talpallikar 2025-08-29 13:28:40 -07:00
  • b4aaab387e fixed formatting Nikhil Talpallikar 2025-08-29 12:37:12 -07:00
  • 0861db73ad fixing indentation ntalpallikar 2025-08-06 10:56:14 -07:00
  • 6df7127c23 Fixed dlopen on linux with lazy load flag Nikhil Talpallikar 2025-08-06 00:36:35 -07:00
  • d2c52db3e0 Fixed the error path to initialize error path function pointers. Exit with error in case of LOADLIBRARY failure, as initialization of function pointers will fail in that case Nikhil Talpallikar 2025-08-06 00:29:22 -07:00
  • 3f1c509650 Merge branch 'shawnz_bug_fix' into 'master' v13.0 Rob Armstrong 2025-08-05 21:22:57 -07:00
  • 527b29dbd0 Clean implementation for failure path when cuInit fails. Removed CHECKED_CALL macro which returned prematurely Nikhil Talpallikar 2025-08-05 13:49:28 -07:00
  • f8aab0053f Clean implementation for failure path when cuInit fails Nikhil Talpallikar 2025-08-05 13:46:44 -07:00
  • fbc40da311 Merge branch 'dev/mattd/update-simple-cc-check' into 'master' Rob Armstrong 2025-08-04 11:06:48 -07:00
  • 1b36fefcd5 Test the compute capability minor number prior to using the result value. Matt Davis 2025-08-04 18:00:26 +00:00
  • fd513b4846 Fix null pointer reference issue with CUDA driver API function pointers in case cuInit fails Nikhil Talpallikar 2025-08-01 10:27:09 -07:00
  • 13c2fd9717 Update README with removing the old auto-linux part shawnz 2025-08-01 19:02:37 +08:00
  • a5267b83a5 Merge branch 'shawnz_bug_fix' into 'master' Rob Armstrong 2025-07-28 10:18:27 -07:00
  • 2ab16e6d15 Merge branch 'dev/mattd/r13-nvvm-update-master' into 'master' Rob Armstrong 2025-07-28 10:17:45 -07:00
  • 903c840a8c Update the README mentioning testing requires a compatible driver. Matt Davis 2025-07-25 12:40:16 +00:00
  • 24e7c5b428 Update compute capability tests to look for >=sm75. Matt Davis 2025-07-25 10:28:53 +00:00
  • b38ed29c95 Bug 5412815: Fix the issue of cudaTensorCoreGemm.cu shawnz 2025-07-25 15:16:12 +08:00
  • 4a631c9fd6 Merge branch 'shawnz_bug_fix' into 'master' Rob Armstrong 2025-07-24 16:57:29 -07:00
  • 98afcd6515 Bug 5314090: Update README.md for Sample build for auto-linux from DriveOS Docker Container shawnz 2025-07-21 18:58:43 +08:00
  • 8013610840 Merge branch 'shawnz_bug_fix' into 'master' Rob Armstrong 2025-07-10 13:44:45 -07:00
  • 2369d4c649 Fix the FindNVSCI.cmake syntax error and the typo in README.md shawnz 2025-07-10 17:13:31 +08:00
  • 8433f89993 Globally remove VSCode tasks.json referencing obsolete build steps Rob Armstrong 2025-07-07 08:31:22 -07:00
  • a9a4db666f Merge branch 'shawnz_bug_fix' into 'master' Rob Armstrong 2025-06-26 12:07:06 -07:00
  • 87d03ef003 Bug 5339530: Bug 4854664: Set socket creating folder to /tmp for QNX shawnz 2025-06-26 15:49:16 +08:00
  • 45314d7ff8 README formatting changes Rob Armstrong 2025-06-24 12:44:08 -07:00
  • 2f0a55d7dc Merge branch 'shawnz_qnx_crossbuild' into 'master' Rob Armstrong 2025-06-24 12:25:55 -07:00
  • 0d61031846 Bug 5355361: Update README.md of 7_libNVVM for forward compatibility sample build shawnz 2025-06-18 16:24:55 +08:00
  • 225f84d433 Bug 5295515: Update README.md for forward compatibility sample build shawnz 2025-06-18 16:17:46 +08:00
  • e674cc36fe Bug 5339530: Set socket creating folder to /tmp for QNX shawnz 2025-06-13 16:03:05 +08:00
  • a47b422205 Update CHANGELOG.md and README.md for QNX cross build shawnz 2025-06-11 16:32:09 +08:00
  • ce28796d6c Bug 5189457: Disable -no-pie for hpc shawnz 2025-06-11 16:25:22 +08:00
  • 9424c15848 Bug 5331767: Specify sm list of cdp samples for QNX shawnz 2025-06-10 15:33:07 +08:00
  • 9075c50a3d Bug 5323034 and 5323144: Disable .rsp for linking as qcc doesn't support lib path with double quotes in .rsp on QNX shawnz 2025-06-10 15:21:29 +08:00
  • de5fa98e6e Bug 5323118: Remove the -lpthread and -lrt which are not supported on QNX shawnz 2025-06-10 15:18:04 +08:00
  • 6c9e9d3cd2 Bug 5323163: Get correct cuda include path for finding header files shawnz 2025-06-10 15:15:56 +08:00
  • 7f5390cec3 Bug 5323124: Waive simpleAWBarrier on QNX shawnz 2025-06-10 15:14:33 +08:00
  • 5f6d46dfea Bug 5323018: Update the CMakeLists.txt and Common/helper_multiprocess.cpp of ptxjit and memMapIPCDrv for QNX cross build shawnz 2025-06-09 19:49:44 +08:00
  • 49307463b5 Bug 5133216: Add QNX toolchain and cross build support shawnz 2025-05-29 16:57:28 +08:00
  • 9c0d5aaae6 Merge branch 'shawnz_bug_fix' into 'master' Rob Armstrong 2025-05-28 08:26:16 -07:00
  • 74a9a08887 Bug 5305854: Update aarch64 toolchain for SBSA CUDA Toolkit support shawnz 2025-05-28 12:43:16 +08:00
  • 0d08748ffa Bug 5305842: Update CMakeLists.txt for cdp samples for sbsa CUDA Toolkit supporting on aarch64-linux platforms shawnz 2025-05-28 11:42:49 +08:00
  • 27f47634a0 Merge branch 'shawnz_bug_fix' into 'master' Rob Armstrong 2025-05-23 07:57:54 -07:00
  • 40d297dfe7 Bug 5300528: Add MPI_C_LIBRARIES for user defined MPI path shawnz 2025-05-23 14:47:55 +08:00
  • 20b90b5063 Merge public 12.9 changes into 13.0 dev Rob Armstrong 2025-05-22 11:45:07 -07:00
  • 8a9e2c830c Update 1_Utilities/README.md to redirect bandwidthTest to NVBandwidth (#371) Rob Armstrong 2025-05-22 11:43:14 -07:00
  • 61225e22f0 Remove erroneous CMAKE_MODULE_PATH from top-level CMakeLists.txt Rob Armstrong 2025-05-22 11:35:00 -07:00
  • f47f06f077 Merge branch 'shawnz_bug_fix' into 'master' Rob Armstrong 2025-05-22 08:16:44 -07:00
  • 0c78a3e0de Update the CMAKE_CUDA_ARCHITECTURES from 50 to 75 for Samples/7_libNVVM/cuda-c-linking shawnz 2025-05-22 15:52:51 +08:00
  • 8219570c15 Merge branch 'shawnz_bug_fix' into 'master' Rob Armstrong 2025-05-21 10:51:12 -07:00
  • fc88988f23 Merge public changes to internal ToT Rob Armstrong 2025-05-21 09:28:35 -07:00
  • adacf1cffd Merge pull request #368 from XSShawnZeng/master Rob Armstrong 2025-05-21 09:27:13 -07:00
  • 3e631f467a Bug 5294720: Update GpuArch Cores and Name for SM103, 110 and 121 shawnz 2025-05-21 14:16:48 +08:00
  • 1141dd7af4 Bug 5281036: Limit the register number of debug version for cdpAdvancedQuicksort shawnz 2025-05-20 14:47:44 +08:00
  • da3b7a2b3c Update the vulkanImageCUDA/vulkanImageCUDA.cu for Windows headers shawnz 2025-05-19 17:43:08 +08:00
  • 5987a9e9fa Update transpose for code format check shawnz 2025-05-19 17:38:42 +08:00
  • 107f3f537f Update the include files sequence for vulkan samples on Windows shawnz 2025-05-19 17:38:22 +08:00
  • c1b03b9f81 Bug 5277193: Remove the CUFFT_LICENSE_ERROR checking as it is deprecated since CUDA13.0 shawnz 2025-05-16 11:21:07 +08:00
  • ee0ee417c9 Update the sample list for API changes in CHANGELOG shawnz 2025-05-14 16:02:17 +08:00
  • ebc1078379 Bug 5280038: Update cuda-c-linking as per CUDA 13.0 API change shawnz 2025-05-14 15:54:41 +08:00
  • 2861e78272 Merge branch 'peggyt_bug_fix' into 'master' Rob Armstrong 2025-05-12 08:55:45 -07:00
  • 494d598f86 Merge branch 'shawnz_bugs_fix' into 'master' Rob Armstrong 2025-05-12 08:54:40 -07:00
  • c6208f5897 Bug 5263330: Update CUFFT errors as per latest changes on CUDA 13.0 shawnz 2025-05-12 15:39:08 +08:00
  • 8f33cc6094 Bug 5274280: Enable 8_Platform_Specific/Tegra/EGLSync_CUDAEvent_Interop shawnz 2025-05-12 15:02:31 +08:00
  • 2ec9cf394a Bug 5272236: Update the include file copy path as path changes on 13.0 shawnz 2025-05-12 15:00:52 +08:00
  • 770e433a9e Bug 5056055: limit register usage to 128 per thread in debug mode to comply with the maximum number of 32-bit registers per SM Peggy Tian 2025-05-12 06:04:22 +00:00
  • c6af90553e Merge branch 'master' into 'master' Rob Armstrong 2025-05-07 09:47:51 -07:00
  • 7989d1fcc8 Merge branch 'shawnz_bug_fix' into 'master' Rob Armstrong 2025-05-07 09:46:41 -07:00
  • 611008fa86 Bug 5236593: Increase the pending kernel launch limit to 4096 Peggy Tian 2025-05-07 17:38:52 +08:00
  • bf628887f1 Bug 5217339: Replace SM 101 with 110 for Thor shawnz 2025-05-07 14:10:35 +08:00
  • 330dd8472f Merge external PR #363 Rob Armstrong 2025-05-05 08:45:01 -07:00
  • b530f1cf42 Fix bug in 6_Performance/transpose: copy sharedmem kernel (#363) Francesco Rizzi 2025-05-05 17:43:23 +02:00
  • 8ee551c99a Update README for 13.0 Rob Armstrong 2025-05-01 15:23:36 -07:00
  • 148014e709 Merge cuda_a_dev 13.0 changes to master Rob Armstrong 2025-05-01 15:22:40 -07:00
  • cab7c66b4f Update pre-config to include Python and JSON for EOL, whitespace checks v12.9 Rob Armstrong 2025-05-01 10:17:42 -07:00
  • 8d400cfb7f Additional minor changes to run_tests.py output formatting Rob Armstrong 2025-05-01 10:14:09 -07:00
  • f2645c5df8 Final merge of 12.9 changes into cuda_a_dev Rob Armstrong 2025-05-01 09:55:03 -07:00
  • 6d6d964f97 Minor changes to run_tests.py output formatting Rob Armstrong 2025-05-01 09:54:25 -07:00
  • ab68d58d59 Remove unused bin/x86_64 directory hierarchy Rob Armstrong 2025-05-01 09:53:54 -07:00
  • c70d79cf3b Final 12.9 README updates Rob Armstrong 2025-05-01 09:39:06 -07:00
  • 9ac81370fa Update 12.9 changes from 'master' into 'cuda_a_dev' Rob Armstrong 2025-04-30 09:48:21 -07:00
  • 14b1bfdcc4 Replace README references to "CUDA Toolkit 12.5" with general "CUDA Toolkit" Rob Armstrong 2025-04-30 09:46:45 -07:00
  • c14a0114d6 Some samples require multiple GPUs. Update 'run_tests.py' to skip them on single- or no-GPU systems. Rob Armstrong 2025-04-30 09:45:20 -07:00
  • ee15cc0fe2 Merge branch 'shawnz_bugs_fix' into 'master' Rob Armstrong 2025-04-28 08:53:11 -07:00
  • 3438fd4875 Update README for OpenMP shawnz 2025-04-28 23:44:45 +08:00
  • b27b55ec70 Bug 5241914: Fix the error message for cuSolverDn_LinearSolver shawnz 2025-04-27 16:57:02 +08:00
  • 49159f3739 Bug 5164417 and 5097376: Fix the OpenMP finding issue for MSVC and Clang shawnz 2025-04-27 16:50:12 +08:00
  • 93cafa8fe9 Update 12.9 changes from 'master' into 'cuda_a_dev' Rob Armstrong 2025-04-21 09:22:29 -07:00
  • 1680a1dc7f Update Windows FreeImage configuration instructions in README.md Rob Armstrong 2025-04-21 09:20:22 -07:00
  • 49daf0e4e0 Merge Bug 5199167: Fix the includes issue for 5_Domain_Specific\simpleD3D12 Rob Armstrong 2025-04-21 08:11:52 -07:00
  • a45fd3bd7c Bug 5199167: Fix the includes issue for 5_Domain_Specific\simpleD3D12 shawnz 2025-04-21 11:52:33 +08:00
  • 1627e96677 Merge branch 'shawnz_bugs_fix_cuda_a_dev' into 'cuda_a_dev' Rob Armstrong 2025-04-17 09:29:53 -07:00
  • 7e90d36120 Bug 5196362: Update parameters of cuCtxCreate for vectorAddMMAP shawnz 2025-04-17 10:53:03 +08:00
  • 9e50fdc01f Merge branch 'shawnz_bugs_fix_cuda_a_dev' into 'cuda_a_dev' Rob Armstrong 2025-04-15 09:55:49 -07:00
  • 2c0b36a967 Bug 5214721: Correct the path of nvvm64_40_0.dll shawnz 2025-04-15 14:27:58 +08:00
  • 83397dc811 Merge branch 'shawnz_bugs_fix_cuda_a_dev' into 'cuda_a_dev' Rob Armstrong 2025-04-14 09:48:09 -07:00
  • 640b566412 Bug 5214721: Update path for nvvm64_40_0.dll on CUDA 13.0 shawnz 2025-04-14 16:34:24 +08:00
  • da24673a9f Update CHANGELOG.md for CUDA 13.0 changes shawnz 2025-04-14 16:33:12 +08:00
  • bded2585a4 Merge branch 'shawnz_bugs_fix_cuda_a_dev' into 'cuda_a_dev' Rob Armstrong 2025-04-11 07:07:55 -07:00
  • 5384563c57 Remove SM < 75 for cudaNvSci shawnz 2025-04-11 15:06:52 +08:00