cuda-samples/Samples/4_CUDA_Libraries

4. CUDA Libraries

batchCUBLAS

A CUDA Sample that demonstrates how to use batched CUBLAS API calls to improve overall performance.

batchedLabelMarkersAndLabelCompressionNPP

An NPP CUDA Sample that demonstrates how to use the NPP label markers generation and label compression functions, which are based on a Union Find (UF) algorithm. Both single-image and batched-image versions are included.

boxFilterNPP

An NPP CUDA Sample that demonstrates how to use the NPP FilterBox function to perform a box filter.
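
For readers unfamiliar with the operation itself, a box filter replaces each pixel with the mean of its square neighborhood. The following is a minimal CPU sketch in plain C++, not the NPP implementation: the function name `boxFilter`, the row-major single-channel `float` layout, and the clamp-to-edge border handling are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference for a box (mean) filter on a row-major, single-channel image.
// Border handling (clamp to the nearest edge pixel) is an assumption of this
// sketch and may differ from what the NPP function does.
std::vector<float> boxFilter(const std::vector<float>& src,
                             int width, int height, int radius) {
    assert(width > 0 && height > 0 && radius >= 0);
    assert(src.size() ==
           static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

    std::vector<float> dst(src.size());
    const int side = 2 * radius + 1;
    const float area = static_cast<float>(side * side);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int dy = -radius; dy <= radius; ++dy) {
                // Clamp the sampled row into the image.
                const int sy = std::min(std::max(y + dy, 0), height - 1);
                for (int dx = -radius; dx <= radius; ++dx) {
                    // Clamp the sampled column into the image.
                    const int sx = std::min(std::max(x + dx, 0), width - 1);
                    sum += src[static_cast<std::size_t>(sy) * width + sx];
                }
            }
            dst[static_cast<std::size_t>(y) * width + x] = sum / area;
        }
    }
    return dst;
}
```

Calling `boxFilter(image, width, height, 1)` applies a 3x3 window. A reference like this can be handy for checking GPU output on small images.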

cannyEdgeDetectorNPP

An NPP CUDA Sample that demonstrates the recommended parameters to use with the nppiFilterCannyBorder_8u_C1R Canny Edge Detection image filter function. This function expects a single channel 8-bit grayscale input image. You can generate a grayscale image from a color image by first calling nppiColorToGray() or nppiRGBToGray(). The Canny Edge Detection function combines and improves on the techniques required to produce an edge detection image using multiple steps.

conjugateGradient

This sample implements a conjugate gradient solver on the GPU using the CUBLAS and CUSPARSE libraries.
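
As a rough guide to what the conjugate gradient samples in this directory compute, here is a CPU sketch of the textbook (unpreconditioned) conjugate gradient iteration for a symmetric positive-definite system `A x = b`. It is not the sample's code: it assumes a small dense row-major matrix instead of the sparse format a CUSPARSE-based solver would use, and the names `conjugateGradient`, `matVec`, and `dot` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Dot product of two equally sized vectors.
static double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Dense row-major matrix-vector product y = A * x (A is n x n).
static Vec matVec(const Vec& A, const Vec& x) {
    const std::size_t n = x.size();
    Vec y(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) y[i] += A[i * n + j] * x[j];
    return y;
}

// Solves A x = b for symmetric positive-definite A, starting from x = 0.
Vec conjugateGradient(const Vec& A, const Vec& b,
                      double tol = 1e-10, int maxIter = 1000) {
    const std::size_t n = b.size();
    assert(A.size() == n * n);

    Vec x(n, 0.0);
    Vec r = b;  // residual r = b - A*x, with x = 0
    Vec p = r;  // initial search direction
    double rsOld = dot(r, r);

    for (int k = 0; k < maxIter && std::sqrt(rsOld) > tol; ++k) {
        const Vec Ap = matVec(A, p);
        const double alpha = rsOld / dot(p, Ap);  // step length along p
        for (std::size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        const double rsNew = dot(r, r);
        const double beta = rsNew / rsOld;  // keeps directions A-conjugate
        for (std::size_t i = 0; i < n; ++i) p[i] = r[i] + beta * p[i];
        rsOld = rsNew;
    }
    return x;
}
```

In a GPU version, the matrix-vector product would typically map to a sparse matrix-vector routine and the dot products and vector updates to BLAS level-1 calls.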

conjugateGradientCudaGraphs

This sample implements a conjugate gradient solver on the GPU using CUBLAS and CUSPARSE library calls that are captured and launched using the CUDA Graph APIs.

conjugateGradientMultiBlockCG

This sample implements a conjugate gradient solver on the GPU using Multi Block Cooperative Groups. It also uses Unified Memory.

conjugateGradientMultiDeviceCG

This sample implements a conjugate gradient solver on multiple GPUs using Multi Device Cooperative Groups. It also uses Unified Memory, optimized with prefetching and usage hints.

conjugateGradientPrecond

This sample implements a preconditioned conjugate gradient solver on the GPU using the CUBLAS and CUSPARSE libraries.

conjugateGradientUM

This sample implements a conjugate gradient solver on the GPU using the CUBLAS and CUSPARSE libraries, with Unified Memory.

cudaNvSci

This sample demonstrates CUDA-NvSciBuf/NvSciSync interop. Two CPU threads import the NvSciBuf and NvSciSync into CUDA to perform two image processing algorithms on a PPM image: image rotation in the first thread, and RGBA-to-grayscale conversion of the rotated image in the second thread. Currently supported only on Ubuntu 18.04.

cudaNvSciNvMedia

This sample demonstrates CUDA-NvMedia interop via the NvSciBuf/NvSciSync APIs. Note that this sample only supports cross builds from x86_64 to aarch64; native aarch64 builds are not supported. For the detailed workflow of the sample, see cudaNvSciNvMedia_Readme.pdf in the sample directory.

cuDLAErrorReporting

This sample demonstrates how DLA errors can be detected via CUDA.

cuDLAHybridMode

This sample demonstrates cuDLA hybrid mode wherein DLA can be programmed using CUDA.

cuDLAStandaloneMode

This sample demonstrates cuDLA standalone mode wherein DLA can be programmed without using CUDA.

cuSolverDn_LinearSolver

A CUDA Sample that demonstrates cuSolverDN's LU, QR and Cholesky factorization.

cuSolverRf

A CUDA Sample that demonstrates cuSolver's refactorization library - CUSOLVERRF.

cuSolverSp_LinearSolver

A CUDA Sample that demonstrates cuSolverSP's LU, QR and Cholesky factorization.

cuSolverSp_LowlevelCholesky

A CUDA Sample that demonstrates Cholesky factorization using cuSolverSP's low level APIs.

cuSolverSp_LowlevelQR

A CUDA Sample that demonstrates QR factorization using cuSolverSP's low level APIs.

FilterBorderControlNPP

This sample demonstrates how any border version of an NPP filtering function can be used in the most common mode, with border control enabled. These functions can be used to duplicate the results of the equivalent non-border versions of the NPP functions. They can also be used to enable and disable border control on individual source image edges, depending on which portion of the source image is being used as input.

freeImageInteropNPP

A simple CUDA Sample that demonstrates how to use the FreeImage library with NPP.

histEqualizationNPP

This CUDA Sample demonstrates how to use NPP for histogram equalization of image data.
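
To illustrate the underlying technique, the sketch below shows classic histogram equalization on an 8-bit grayscale image on the CPU: build a histogram, accumulate it into a cumulative distribution, and remap each gray level so the output spans the full 0 to 255 range. This is a generic textbook formulation, not the NPP call sequence used by the sample, and the function name `equalizeHistogram` is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Classic histogram equalization for an 8-bit grayscale image (CPU sketch).
std::vector<std::uint8_t> equalizeHistogram(
    const std::vector<std::uint8_t>& image) {
    assert(!image.empty());

    // 1. Count how often each gray level occurs.
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t v : image) ++hist[v];

    // 2. Cumulative distribution, remembering its first non-zero value.
    std::array<std::size_t, 256> cdf{};
    std::size_t running = 0;
    std::size_t cdfMin = 0;
    for (int v = 0; v < 256; ++v) {
        running += hist[v];
        cdf[v] = running;
        if (cdfMin == 0 && running > 0) cdfMin = running;
    }

    // A single-valued image cannot be stretched; return it unchanged.
    const std::size_t denom = image.size() - cdfMin;
    if (denom == 0) return image;

    // 3. Remap each pixel, rounding to the nearest integer level.
    std::vector<std::uint8_t> out(image.size());
    for (std::size_t i = 0; i < image.size(); ++i) {
        const std::size_t num = cdf[image[i]] - cdfMin;
        out[i] = static_cast<std::uint8_t>((num * 255 + denom / 2) / denom);
    }
    return out;
}
```

For example, an image whose pixels use only levels 0 through 3 in equal proportion is stretched to levels 0, 85, 170, and 255.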

lineOfSight

This sample is an implementation of a simple line-of-sight algorithm: Given a height map and a ray originating at some observation point, it computes all the points along the ray that are visible from the observation point. The implementation is based on the Thrust library.
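
The visibility test described above can be expressed as a running maximum: a point is visible if the slope of the line from the observer to that point is at least as steep as the slope to every earlier point on the ray. The following CPU sketch shows that idea in plain C++. It assumes evenly spaced samples along the ray, and the name `lineOfSight` and its parameters are hypothetical; a GPU version would typically replace the sequential loop with a parallel max-scan.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Marks which height samples along a ray are visible from the observer.
// heights[i] is the terrain height at distance step * (i + 1) from the
// observer; the uniform spacing is an assumption of this sketch.
std::vector<bool> lineOfSight(const std::vector<double>& heights,
                              double observerHeight, double step) {
    assert(step > 0.0);

    std::vector<bool> visible(heights.size());
    double maxSlope = -std::numeric_limits<double>::infinity();

    for (std::size_t i = 0; i < heights.size(); ++i) {
        const double distance = step * static_cast<double>(i + 1);
        const double slope = (heights[i] - observerHeight) / distance;

        // Visible if nothing closer to the observer rises above this slope.
        visible[i] = slope >= maxSlope;
        if (slope > maxSlope) maxSlope = slope;
    }
    return visible;
}
```

With heights `{1, 3, 2, 5, 4, 10}`, an observer at height 0, and unit spacing, the second sample hides the next three, and the last sample is tall enough to be seen again.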

matrixMulCUBLAS

This sample implements matrix multiplication from Chapter 3 of the programming guide. To illustrate GPU performance for matrix multiply, this sample also shows how to use the CUBLAS interface introduced in CUDA 4.0 to achieve high performance for matrix multiplication.

MersenneTwisterGP11213

This sample demonstrates the Mersenne Twister random number generator GP11213 in cuRAND.

nvJPEG

A CUDA Sample that demonstrates single and batched decoding of JPEG images using the NVJPEG library.

nvJPEG_encoder

A CUDA Sample that demonstrates single encoding of JPEG images using the NVJPEG library.

oceanFFT

This sample simulates an ocean height field using the CUFFT library and renders the result using OpenGL.

randomFog

This sample illustrates pseudo- and quasi-random numbers produced by CURAND.

simpleCUBLAS

Example of using the CUBLAS API to perform GEMM operations.
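
For reference, GEMM computes `C = alpha * A * B + beta * C`. The sketch below is a naive CPU version in plain C++ that uses column-major storage, the layout BLAS-style libraries conventionally expect. The function name `gemmColumnMajor` and its argument order are hypothetical, and the sketch handles only the non-transposed case with tightly packed matrices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive reference GEMM: C = alpha * A * B + beta * C.
// A is m x k, B is k x n, C is m x n, all column-major and tightly packed,
// so element (row, col) of an m-row matrix lives at index row + col * m.
void gemmColumnMajor(std::size_t m, std::size_t n, std::size_t k,
                     float alpha, const std::vector<float>& A,
                     const std::vector<float>& B, float beta,
                     std::vector<float>& C) {
    assert(A.size() == m * k);
    assert(B.size() == k * n);
    assert(C.size() == m * n);

    for (std::size_t col = 0; col < n; ++col) {
        for (std::size_t row = 0; row < m; ++row) {
            float sum = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                sum += A[row + p * m] * B[p + col * k];
            }
            C[row + col * m] = alpha * sum + beta * C[row + col * m];
        }
    }
}
```

A loop like this is a common way to validate the result of a GPU GEMM on small inputs before timing larger ones.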

simpleCUBLAS_LU

A CUDA Sample that demonstrates the cuBLAS API cublasDgetrfBatched() for lower-upper (LU) decomposition of a matrix.
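
As background on the factorization itself, the sketch below performs an in-place LU decomposition with partial (row) pivoting of a single matrix on the CPU. It is a generic textbook version, not a wrapper around the cuBLAS routine: the name `luDecompose` is hypothetical, the matrix is assumed to be row-major, and pivot indices are 0-based here, which may differ from the conventions of the library call.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// In-place LU decomposition with partial pivoting of a row-major n x n matrix.
// On return, the strict lower triangle holds L (unit diagonal implied) and the
// upper triangle holds U. pivots[c] is the row swapped with row c at step c.
std::vector<std::size_t> luDecompose(std::vector<double>& a, std::size_t n) {
    assert(n > 0 && a.size() == n * n);

    std::vector<std::size_t> pivots(n);
    for (std::size_t col = 0; col < n; ++col) {
        // Pick the row with the largest magnitude in this column.
        std::size_t pivotRow = col;
        for (std::size_t row = col + 1; row < n; ++row) {
            if (std::fabs(a[row * n + col]) > std::fabs(a[pivotRow * n + col]))
                pivotRow = row;
        }
        pivots[col] = pivotRow;
        if (pivotRow != col) {
            for (std::size_t j = 0; j < n; ++j)
                std::swap(a[col * n + j], a[pivotRow * n + j]);
        }

        const double pivot = a[col * n + col];
        if (pivot == 0.0) continue;  // singular column: nothing to eliminate

        // Eliminate entries below the pivot, storing the multipliers as L.
        for (std::size_t row = col + 1; row < n; ++row) {
            const double factor = a[row * n + col] / pivot;
            a[row * n + col] = factor;
            for (std::size_t j = col + 1; j < n; ++j)
                a[row * n + j] -= factor * a[col * n + j];
        }
    }
    return pivots;
}
```

A batched routine applies the same factorization independently to many small matrices, which is where running them together on the GPU pays off.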

simpleCUBLASXT

Example of using the CUBLAS-XT library, which performs GEMM operations over multiple GPUs.

simpleCUFFT

Example of using CUFFT. In this example, CUFFT is used to compute the 1D convolution of a signal with a filter by transforming both into the frequency domain, multiplying them together, and transforming the result back to the time domain. CUFFT plans are created using simple and advanced API functions.
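
The transform, multiply, inverse-transform pipeline used by the CUFFT convolution examples can be sketched on the CPU as follows. To stay short, this version uses a naive O(n^2) discrete Fourier transform instead of an FFT, and it computes a circular convolution of two equal-length real sequences; padding the inputs to avoid wraparound is left out. The names `dft` and `circularConvolve` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive discrete Fourier transform; the inverse also scales by 1/n.
std::vector<Complex> dft(const std::vector<Complex>& in, bool inverse) {
    const std::size_t n = in.size();
    const double pi = std::acos(-1.0);
    const double sign = inverse ? 1.0 : -1.0;

    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = sign * 2.0 * pi * static_cast<double>(k) *
                                 static_cast<double>(t) /
                                 static_cast<double>(n);
            sum += in[t] * Complex(std::cos(angle), std::sin(angle));
        }
        out[k] = inverse ? sum / static_cast<double>(n) : sum;
    }
    return out;
}

// Circular convolution via the convolution theorem:
// transform both inputs, multiply pointwise, transform back.
std::vector<double> circularConvolve(const std::vector<double>& signal,
                                     const std::vector<double>& filter) {
    assert(signal.size() == filter.size());

    std::vector<Complex> s(signal.begin(), signal.end());
    std::vector<Complex> f(filter.begin(), filter.end());
    const std::vector<Complex> S = dft(s, false);
    const std::vector<Complex> F = dft(f, false);

    std::vector<Complex> product(S.size());
    for (std::size_t i = 0; i < S.size(); ++i) product[i] = S[i] * F[i];

    const std::vector<Complex> y = dft(product, true);
    std::vector<double> result(y.size());
    for (std::size_t i = 0; i < y.size(); ++i) result[i] = y[i].real();
    return result;
}
```

To get a linear rather than circular convolution, both inputs would first be zero-padded to at least the sum of their lengths minus one.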

simpleCUFFT_2d_MGPU

Example of using CUFFT. In this example, CUFFT is used to compute the 2D convolution of a signal with a filter by transforming both into the frequency domain, multiplying them together, and transforming the result back to the time domain, running on multiple GPUs.

simpleCUFFT_callback

Example of using CUFFT. In this example, CUFFT is used to compute the 1D convolution of a signal with a filter by transforming both into the frequency domain, multiplying them together, and transforming the result back to the time domain. The difference between this example and the Simple CUFFT example is that the multiplication step is done by the CUFFT kernel with a user-supplied CUFFT callback routine, rather than by a separate kernel call.

simpleCUFFT_MGPU

Example of using CUFFT. In this example, CUFFT is used to compute the 1D convolution of a signal with a filter by transforming both into the frequency domain, multiplying them together, and transforming the result back to the time domain, running on multiple GPUs.

watershedSegmentationNPP

An NPP CUDA Sample that demonstrates how to use the NPP watershed segmentation function.