Commit Graph

68 Commits

Author SHA1 Message Date
aioprli
6aac4717b8
Update globalToShmemAsyncCopy.cu
Fix two obvious errors. The first is that five tasks were submitted to the pipeline at the same time, so task 4 conflicts with task 0; the remaining two are copy errors
2024-05-06 22:03:16 +08:00
Rob Nertney
5f97d7d0df Updating graphConditionalNodes orphan directory 2024-04-10 19:44:42 +00:00
Rob Nertney
cd3bc1fa8e Updating samples for CUDA 12.4 2024-03-05 20:53:50 +00:00
Rob Nertney
e8568c4173 Fixing jitlto regression, including missing cuDLA source files for bug #235, and updating changelogs 2023-11-09 16:52:00 +00:00
Rob Nertney
b5c84e6996 Updating Samples for 12.3 and updating props files 2023-10-23 18:44:49 +00:00
Rob Nertney
c46754b877 Update samples for 12.3 2023-10-20 17:38:48 +00:00
Rob Nertney
03309a2d42 Changelog updates 2023-06-29 19:33:40 +00:00
Rob Nertney
5688ee0013 Removing stray cpp from master 2023-05-31 17:48:13 +00:00
Rob Nertney
8004ad59ab Fix #194 and add Large Kernel Parameters Sample 2023-05-31 04:43:22 +00:00
Rob Nertney
e612904184
Merge pull request #182 from Wenlong-Zhu/master
Fix cudaExtent.width set error.
2023-03-27 20:53:45 -07:00
Rob Nertney
81cf058e30 Updating Samples for 12.1 2023-03-01 01:41:29 +00:00
Rob Nertney
00bb9bc367 Updating files for Ada architecture 2023-02-27 22:33:19 +00:00
Rob Nertney
e4789153d5 Updating License Header 2023-02-09 19:02:33 +00:00
Rob Nertney
3d553b2ea1 Adding JIT LTO Sample 2023-02-07 19:06:38 +00:00
wenlong-zhu
9316529638 Fix cudaExtent.width set error.
unit: 4_CUDA_Libraries/cudaNvSciNvMedia/cuda_consumer.cu
Because of the change of padding size in NvSciBuf,
the cudaExtent.width and cudaExtent.height should be changed

Bug 3880762
2023-02-04 00:00:44 +08:00
Rob Nertney
2b689228b7 Updating samples for 12.0 2022-12-08 20:19:55 +00:00
Rob Nertney
81992093d2 Update samples for CUDA 11.8 with correct props 2022-10-14 17:43:37 -07:00
Rutwik Choughule
8f21b899b6 update dependency related links in README files 2022-01-27 17:58:13 +05:30
Rutwik Choughule
0cbe5f2d82 update makefiles to waive unsupported samples on QNX 2022-01-27 17:57:02 +05:30
Rutwik Choughule
805e60bdfc update lib path for conda 2022-01-27 17:55:38 +05:30
Rutwik Choughule
9d4c014f60 update sample cudaNvSci 2022-01-25 17:22:31 +05:30
Rutwik Choughule
bf8c6dd043 update lib path for conda 2022-01-14 02:31:40 +05:30
Rutwik Choughule
2e41896e1b add and update samples for CUDA 11.6 2022-01-13 11:35:24 +05:30
Rutwik Choughule
471dd47f84 update simpleVulkan
fix typo in importCudaExternalMemory()
2021-11-26 18:47:48 +05:30
Rutwik Choughule
0563025cde update simpleVulkan sample 2021-11-23 18:51:00 +05:30
Rutwik Choughule
3a05f29b94 update cuDLA samples
fix missing DPRINTF
fix Makefile
2021-11-23 14:34:52 +05:30
Rutwik Choughule
e64c65a0d3 update sample conjugateGradientMultiDeviceCG
remove use of deprecated function cudaLaunchCooperativeKernelMultiDevice()
2021-11-18 10:16:22 +05:30
Rutwik Choughule
af0e1af181 remove duplicate sample 2021-11-18 10:15:10 +05:30
Rutwik Choughule
01789304f0 update sample bf16TensorCoreGemm to add explicit casting 2021-11-01 13:22:44 +05:30
Rutwik Choughule
1f76a2d110 add and update samples for CUDA 11.5 2021-10-21 16:34:49 +05:30
Rutwik Choughule
3342d604fe update sample conjugateGradientMultiDeviceCG to remove use of deprecated function cudaLaunchCooperativeKernelMultiDevice() 2021-08-09 22:52:15 +05:30
Rutwik Choughule
8789eb6266 add and update samples with CUDA 11.4 update 1 support 2021-08-03 19:02:58 +05:30
Rutwik Choughule
e950012e72 add and update samples with CUDA 11.4 support 2021-06-30 11:26:41 +05:30
Rutwik Choughule
95b7cea7bc Merge branch 'master' of https://github.com/NVIDIA/cuda-samples 2021-06-10 18:37:53 +05:30
Rutwik Choughule
2aeaf51b11 cudaNvSciNvMedia plane offset correction 2021-06-10 17:33:24 +05:30
Rutwik Choughule
ba5a483c6e update vulkan samples with SPIR-V shaders 2021-06-10 17:30:25 +05:30
Rutwik Choughule
7a5b3e6c8c update vulkan samples with SPIR-V shaders 2021-06-02 17:17:21 +05:30
Rutwik Choughule
0787bc0489 correction in include path 2021-05-04 16:36:42 +05:30
Rutwik Choughule
5c3ec60fae correction in include path 2021-05-03 14:17:55 +05:30
Rutwik Choughule
568b39bd5b add and update samples with CUDA 11.3 support 2021-04-16 11:54:26 +05:30
Mahesh Doijade
067cb65523 -- Add partitioned cuda pipeline prod-cons gemm kernel
-- Add cudaCompressibleMemory sample to use copy engine vs SM writes
   depending on arch
2021-03-03 22:53:02 +05:30
Mahesh Doijade
b882fa00ee -- Add freeglut and glew64 libs for simpleGL
-- delete freeimage libs and headers; users need to install them to
build samples that depend on them
2020-12-11 17:34:13 +05:30
Mahesh Doijade
1cd3264681 Add and update samples with CUDA 11.2 support 2020-12-10 01:05:32 +05:30
Mahesh Doijade
92b0568792 Add cudaNvSciNvMedia sample with/without nvsci* APIs. It takes an RGBA image as input and produces YUV via nvmedia;
this YUV is consumed by CUDA, which converts it to a grayscale image that is written to a file as output
2020-11-24 16:28:04 +05:30
Mahesh Doijade
c4e2869a2b add multi-warp cooperative groups based reduction kernel in reduction sample 2020-09-24 16:49:58 +05:30
Mahesh Doijade
44052982a7 add ewp tags to VS project of conjugateGradientMultiDeviceCG 2020-09-17 20:01:41 +05:30
Mahesh Doijade
cd76533c3f Add and update samples for cuda 11.1 support 2020-09-15 23:45:56 +05:30
Mahesh Doijade
e6ce58fef4 [p2pBandwidthLatency] increase default buffer size and add support for specifying buffer size through the command line 2020-06-30 19:05:55 +05:30
Mahesh Doijade
0ec4bd58e5 cudaCompressibleMemory: when unable to allocate compressible memory waive the execution 2020-06-01 20:22:22 +05:30
Mahesh Doijade
908dddb207 cudaCompressibleMemory: refactor and refine, and use only cuMemMap API for non-compressible allocation.
hence, remove use of cudaMalloc.
2020-05-27 19:00:33 +05:30