312 Commits

Author SHA1 Message Date
shawnz
2ec9cf394a Bug 5272236: Update the include file copy path as path changes on 13.0 2025-05-12 15:00:52 +08:00
Rob Armstrong
c6af90553e Merge branch 'master' into 'master'
Bug 5236593: Increase the pending kernel launch limit to 4096

See merge request cuda-samples/cuda-samples!109
2025-05-07 09:47:51 -07:00
Peggy Tian
611008fa86 Bug 5236593: Increase the pending kernel launch limit to 4096 2025-05-07 17:38:52 +08:00
shawnz
bf628887f1 Bug 5217339: Replace SM 101 with 110 for Thor 2025-05-07 14:10:35 +08:00
Rob Armstrong
330dd8472f Merge external PR #363 2025-05-05 08:45:01 -07:00
Francesco Rizzi
b530f1cf42
Fix bug in 6_Performance/transpose: copy sharedmem kernel (#363)
Update kernel loop bounds handling, main loop data copy to avoid incorrect reuse of output results.

---------

Authored-by: Francesco Rizzi <francesco.rizzi@ng-analytics.com>
2025-05-05 08:43:23 -07:00
Rob Armstrong
f2645c5df8 Final merge of 12.9 changes into cuda_a_dev 2025-05-01 09:55:03 -07:00
Rob Armstrong
c70d79cf3b Final 12.9 README updates 2025-05-01 09:39:06 -07:00
Rob Armstrong
9ac81370fa Update 12.9 changes from 'master' into 'cuda_a_dev' 2025-04-30 09:48:21 -07:00
Rob Armstrong
14b1bfdcc4 Replace README references to "CUDA Toolkit 12.5" with general "CUDA Toolkit" 2025-04-30 09:46:45 -07:00
shawnz
b27b55ec70 Bug 5241914: Fix the error message for cuSolverDn_LinearSolver 2025-04-27 16:57:02 +08:00
shawnz
49159f3739 Bug 5164417 and 5097376: Fix the OpenMP detection issue for MSVC and Clang 2025-04-27 16:50:12 +08:00
Rob Armstrong
93cafa8fe9 Update 12.9 changes from 'master' into 'cuda_a_dev' 2025-04-21 09:22:29 -07:00
shawnz
a45fd3bd7c Bug 5199167: Fix the includes issue for 5_Domain_Specific\simpleD3D12 2025-04-21 11:52:33 +08:00
shawnz
7e90d36120 Bug 5196362: Update parameters of cuCtxCreate for vectorAddMMAP 2025-04-17 10:53:03 +08:00
shawnz
2c0b36a967 Bug 5214721: Correct the path of nvvm64_40_0.dll 2025-04-15 14:27:58 +08:00
shawnz
640b566412 Bug 5214721: Update path for nvvm64_40_0.dll on CUDA 13.0 2025-04-14 16:34:24 +08:00
shawnz
5384563c57 Remove SM < 75 for cudaNvSci 2025-04-11 15:06:52 +08:00
shawnz
01a62e2bc0 Bug 5184356: Update the computeMode for remaining 3 samples 2025-04-11 10:40:35 +08:00
shawnz
02fdb070ad Bug 5196362, 5184356, 5212196, 5214258 and 5214259: Update samples for CUDA 13.0 API changes 2025-04-10 18:27:18 +08:00
Rob Armstrong
278f4adbd2 Merge branch 'master' into cuda_a_dev 2025-04-09 08:33:37 -07:00
shawnz
4672b8ba2b Bug 5163983: Remove SM < 75 in CMakeLists.txt of some samples 2025-04-09 15:10:40 +08:00
shawnz
e77d6eb5ab Bug 5207005: Append pid in shmName for Linux only as this is for MIG scenario 2025-04-07 17:17:17 +08:00
Rob Armstrong
ac700327a2 Add folders to CMakeLists.txt for supporting generators and IDEs 2025-04-05 09:54:24 -07:00
shawnz
a1b5a6f6e3 Bug 5163983: Remove SM < 75 in CMakeLists.txt of some samples 2025-04-03 15:51:59 +08:00
shawnz
a32d5badf7 Bug 5196977: Update includes for nbody 2025-04-03 15:30:05 +08:00
Rob Armstrong
1fd22429c3 Merge branch 'shawnz_bugs_fix' into 'master'
Change for fixing bugs: 5196977, 4914019, 4191696 and 5199167.

See merge request cuda-samples/cuda-samples!97
2025-04-02 22:28:17 -07:00
Rob Armstrong
00ac0a1673 Remove bandwidthTest subdirectory from CMakeLists.txt 2025-04-02 22:27:30 -07:00
shawnz
b013387a39 Update code format 2025-04-03 11:23:26 +08:00
Rob Armstrong
7d1730f348 Remove outdated bandwidthTest sample 2025-04-02 11:19:48 -07:00
Rob Armstrong
a4fba501a6 Merge branch 'master' into cuda_a_dev 2025-04-02 08:40:31 -07:00
shawnz
718fe6486d Bug 5199167: Adjust the include header files sequence for simpleD3D11/simpleD3D11Texture 2025-04-02 15:10:29 +08:00
shawnz
ad9908e32b Bug 4914019 & 4191696: Append pid in shmName for MIG multiple thread scenario 2025-04-02 11:20:09 +08:00
shawnz
952d6edf92 Bug 5196977: Include helper_gl.h before cuda_gl_interop.h 2025-04-01 16:07:32 +08:00
shawnz
4abbdf4e80 Bug 5194249: Need to include cuda_runtime.h for cudaNvSci after the clang format change 2025-03-31 14:57:31 +08:00
Rob Armstrong
69522dd5b7 CUDA 13.0 removes support for Maxwell, Pascal, and Volta architecture offline compilation 2025-03-27 10:43:00 -07:00
Rob Armstrong
ceab6e8bcc Apply consistent code formatting across the repo. Add clang-format and pre-commit hooks. 2025-03-27 10:30:07 -07:00
Rob Armstrong
c0ab53f986 Update all sample CMakeLists.txt to include ENABLE_CUDA_DEBUG flag to enable cuda-gdb 2025-03-26 10:08:59 -07:00
Rob Armstrong
b87c243bbb Add -lineinfo flag to all targets to include line information for developer tools 2025-03-26 09:44:20 -07:00
Rob Armstrong
e214cd29aa Update gencode arguments for separate kernel fatbin builds 2025-03-26 09:28:37 -07:00
shawnz
2848d3bd21 Bug 5176886: Enable nvJPEG samples for aarch64 2025-03-21 13:02:14 +08:00
shawnz
bd0f630bf4 Bug 5133197: Add CMake toolchain and update the CMakeLists.txt of some samples for Tegra Linux cross build 2025-03-20 12:43:44 +08:00
shawnz
ab9166a6b2 Bug 5139353 and 5139213: Enhancement for streamOrderedAllocationIPC 2025-03-12 15:28:54 +08:00
Rob Armstrong
9370f11e69 graphConditionalNodes: Additional tweaks to launch dimension initialization (#348) 2025-03-05 18:18:37 -08:00
Rob Armstrong
8d901e745d graphConditionalNodes: Change launch dimension initialization for better cross-platform compatibility (#346) 2025-03-05 08:33:35 -08:00
Shawn Zeng
9adce9d9f2 Update file CMakeLists.txt 2025-03-03 19:19:50 -08:00
Rob Armstrong
bcad2c9e61 graphConditionalNodes: Add switch, while, if/else conditional examples and minor cleanup (#344) 2025-03-03 17:50:22 -08:00
Shawn Zeng
310e7f2a11 Bug 5143332: Remove the redundant content in 0_Introduction/CMakeLists.txt 2025-03-03 17:37:48 -08:00
Shawn Zeng
7f0f63f311 Bug 5034785: Update all non-ctx nppi APIs to ctx APIs as per latest change on NPP 2025-02-27 03:01:47 -08:00
Shawn Zeng
acd3a015c8 Revert "Bug 5034785: Update all non-ctx nppi APIs to ctx APIs as per latest change on NPP"
This reverts commit a9869fd6eaeecc748fc5f10f4b331fa41efbdaca
2025-02-27 02:48:03 -08:00