mirror of https://github.com/NVIDIA/cuda-samples.git synced 2026-07-16 21:06:52 +08:00

wenlong-zhu 9316529638 Fix cudaExtent.width set error.

unit: 4_CUDA_Libraries/cudaNvSciNvMedia/cuda_consumer.cu
Because of the change of padding size in NvSciBuf,
the cudaExtent.width and cudaExtent.height should be change

Bug 3880762

2023-02-04 00:00:44 +08:00

batchCUBLAS

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

batchedLabelMarkersAndLabelCompressionNPP

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

boxFilterNPP

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

cannyEdgeDetectorNPP

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

conjugateGradient

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

conjugateGradientCudaGraphs

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

conjugateGradientMultiBlockCG

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

conjugateGradientMultiDeviceCG

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

conjugateGradientPrecond

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

conjugateGradientUM

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

cudaNvSci

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

cudaNvSciNvMedia

Fix cudaExtent.width set error.

2023-02-04 00:00:44 +08:00

cuDLAErrorReporting

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

cuDLAHybridMode

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

cuDLAStandaloneMode

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

cuSolverDn_LinearSolver

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

cuSolverRf

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

cuSolverSp_LinearSolver

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

cuSolverSp_LowlevelCholesky

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

cuSolverSp_LowlevelQR

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

FilterBorderControlNPP

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

freeImageInteropNPP

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

histEqualizationNPP

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

lineOfSight

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

matrixMulCUBLAS

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

MersenneTwisterGP11213

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

nvJPEG

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

nvJPEG_encoder

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

oceanFFT

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

randomFog

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

simpleCUBLAS

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

simpleCUBLAS_LU

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

simpleCUBLASXT

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

simpleCUFFT

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

simpleCUFFT_2d_MGPU

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

simpleCUFFT_callback

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

simpleCUFFT_MGPU

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

watershedSegmentationNPP

Updating samples for 12.0

2022-12-08 20:19:55 +00:00

README.md

add and update samples for CUDA 11.6

2022-01-13 11:35:24 +05:30

4. CUDA Libraries

batchCUBLAS

A CUDA Sample that demonstrates how to use batched CUBLAS API calls to improve overall performance.

batchedLabelMarkersAndLabelCompressionNPP

An NPP CUDA Sample that demonstrates how to use the NPP label markers generation and label compression functions based on a Union Find (UF) algorithm including both single image and batched image versions.
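The sample itself relies on NPP functions; the following is only a minimal CPU sketch of the underlying idea, assuming a small single-channel image and 4-connectivity. A union-find structure merges neighbouring pixels of equal value, and a second pass compresses the resulting root indices into consecutive labels. The names `UnionFind` and `labelRegions` are hypothetical and do not come from NPP.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper, not an NPP type: a minimal disjoint-set structure.
struct UnionFind {
    std::vector<int> parent;
    explicit UnionFind(int n) : parent(n) {
        std::iota(parent.begin(), parent.end(), 0);  // every pixel starts as its own root
    }
    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving keeps trees shallow
            x = parent[x];
        }
        return x;
    }
    void unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a != b) parent[std::max(a, b)] = std::min(a, b);
    }
};

// Labels 4-connected regions of equal pixel value in a row-major image.
// Returns one label per pixel, compressed to 0, 1, 2, ... in scan order.
std::vector<int> labelRegions(const std::vector<int>& image, int width, int height) {
    assert(static_cast<int>(image.size()) == width * height);
    UnionFind uf(width * height);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const int i = y * width + x;
            if (x > 0 && image[i] == image[i - 1]) uf.unite(i, i - 1);          // left neighbour
            if (y > 0 && image[i] == image[i - width]) uf.unite(i, i - width);  // top neighbour
        }
    }
    // Label compression: map each root to the next unused consecutive label.
    std::vector<int> rootToLabel(width * height, -1);
    std::vector<int> labels(width * height);
    int next = 0;
    for (int i = 0; i < width * height; ++i) {
        const int root = uf.find(i);
        if (rootToLabel[root] < 0) rootToLabel[root] = next++;
        labels[i] = rootToLabel[root];
    }
    return labels;
}
```

For a 3x2 image with rows `1 1 0` and `0 1 0`, the three regions receive the labels 0, 1 and 2 in the order they are first encountered.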

boxFilterNPP

An NPP CUDA Sample that demonstrates how to use the NPP FilterBox function to perform a box filter.

cannyEdgeDetectorNPP

An NPP CUDA Sample that demonstrates the recommended parameters to use with the nppiFilterCannyBorder_8u_C1R Canny Edge Detection image filter function. This function expects a single channel 8-bit grayscale input image. You can generate a grayscale image from a color image by first calling nppiColorToGray() or nppiRGBToGray(). The Canny Edge Detection function combines and improves on the techniques required to produce an edge detection image using multiple steps.

conjugateGradient

This sample implements a conjugate gradient solver on the GPU using the CUBLAS and CUSPARSE libraries.
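As a point of reference, the conjugate gradient iteration can be sketched on the CPU. The code below is not the sample's code: it assumes a small dense, symmetric positive-definite matrix stored row-major instead of the sparse CUSPARSE format, and the names `dot`, `matVec` and `conjugateGradient` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

static double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// y = A * x for a dense n x n matrix stored row-major (assumption of this sketch).
static Vec matVec(const Vec& A, const Vec& x) {
    const std::size_t n = x.size();
    Vec y(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) y[i] += A[i * n + j] * x[j];
    return y;
}

// Solves A x = b for symmetric positive-definite A, starting from x = 0.
Vec conjugateGradient(const Vec& A, const Vec& b, double tol = 1e-10, int maxIter = 1000) {
    const std::size_t n = b.size();
    assert(A.size() == n * n);
    Vec x(n, 0.0);
    Vec r = b;  // residual b - A x with x = 0
    Vec p = b;  // first search direction
    double rr = dot(r, r);
    for (int k = 0; k < maxIter && std::sqrt(rr) > tol; ++k) {
        const Vec Ap = matVec(A, p);
        const double alpha = rr / dot(p, Ap);  // step length along p
        for (std::size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        const double rrNew = dot(r, r);
        const double beta = rrNew / rr;  // keeps the new direction conjugate to the old ones
        for (std::size_t i = 0; i < n; ++i) p[i] = r[i] + beta * p[i];
        rr = rrNew;
    }
    return x;
}
```

For `A = [[4, 1], [1, 3]]` and `b = [1, 2]` the exact solution is `x = [1/11, 7/11]`. In a GPU version the dot products, vector updates and matrix-vector product are the operations that would map onto library calls.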

conjugateGradientCudaGraphs

This sample implements a conjugate gradient solver on the GPU using CUBLAS and CUSPARSE library calls that are captured and launched using CUDA Graph APIs.

conjugateGradientMultiBlockCG

This sample implements a conjugate gradient solver on the GPU using Multi Block Cooperative Groups. It also uses Unified Memory.

conjugateGradientMultiDeviceCG

This sample implements a conjugate gradient solver on multiple GPUs using Multi Device Cooperative Groups. It also uses Unified Memory, optimized using prefetching and usage hints.

conjugateGradientPrecond

This sample implements a preconditioned conjugate gradient solver on the GPU using the CUBLAS and CUSPARSE libraries.

conjugateGradientUM

This sample implements a conjugate gradient solver on the GPU using the CUBLAS and CUSPARSE libraries with Unified Memory.

cudaNvSci

This sample demonstrates CUDA-NvSciBuf/NvSciSync interop. Two CPU threads import the NvSciBuf and NvSciSync into CUDA to perform two image processing algorithms on a PPM image: image rotation in the first thread, and RGBA-to-grayscale conversion of the rotated image in the second thread. Currently only supported on Ubuntu 18.04.

cudaNvSciNvMedia

This sample demonstrates CUDA-NvMedia interop via NvSciBuf/NvSciSync APIs. Note that this sample only supports cross builds from x86_64 to aarch64; native aarch64 builds are not supported. For the detailed workflow of the sample, see cudaNvSciNvMedia_Readme.pdf in the sample directory.

cuDLAErrorReporting

This sample demonstrates how DLA errors can be detected via CUDA.

cuDLAHybridMode

This sample demonstrates cuDLA hybrid mode wherein DLA can be programmed using CUDA.

cuDLAStandaloneMode

This sample demonstrates cuDLA standalone mode wherein DLA can be programmed without using CUDA.

cuSolverDn_LinearSolver

A CUDA Sample that demonstrates cuSolverDN's LU, QR and Cholesky factorization.

cuSolverRf

A CUDA Sample that demonstrates cuSolver's refactorization library - CUSOLVERRF.

cuSolverSp_LinearSolver

A CUDA Sample that demonstrates cuSolverSP's LU, QR and Cholesky factorization.

cuSolverSp_LowlevelCholesky

A CUDA Sample that demonstrates Cholesky factorization using cuSolverSP's low level APIs.

cuSolverSp_LowlevelQR

A CUDA Sample that demonstrates QR factorization using cuSolverSP's low level APIs.

FilterBorderControlNPP

This sample demonstrates how any border version of an NPP filtering function can be used in the most common mode, with border control enabled. These functions can be used to duplicate the results of the equivalent non-border versions of the NPP functions. They can also be used to enable and disable border control on various source image edges, depending on what portion of the source image is being used as input.

freeImageInteropNPP

A simple CUDA Sample that demonstrates how to use the FreeImage library with NPP.

histEqualizationNPP

This CUDA Sample demonstrates how to use NPP for histogram equalization of image data.
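For readers unfamiliar with the technique, here is a minimal CPU sketch of textbook histogram equalization for 8-bit grayscale pixels. It is an illustration under stated assumptions, not the NPP implementation: the function name `equalizeHistogram` is hypothetical, and the exact mapping NPP applies may differ from the formula used here.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Textbook histogram equalization: remap each intensity through the
// normalized cumulative distribution of the image's intensities.
std::vector<std::uint8_t> equalizeHistogram(const std::vector<std::uint8_t>& pixels) {
    assert(!pixels.empty());

    // 1. Histogram of the 256 possible intensities.
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t p : pixels) ++hist[p];

    // 2. Cumulative distribution, plus its value at the darkest intensity present.
    std::array<std::size_t, 256> cdf{};
    std::size_t running = 0;
    std::size_t cdfMin = 0;
    for (int v = 0; v < 256; ++v) {
        running += hist[v];
        cdf[v] = running;
        if (cdfMin == 0 && hist[v] != 0) cdfMin = running;
    }

    // 3. A constant image cannot be stretched; return it unchanged.
    const std::size_t total = pixels.size();
    if (total == cdfMin) return pixels;

    // 4. Stretch the CDF so the darkest value maps to 0 and the brightest to 255.
    std::vector<std::uint8_t> out(pixels.size());
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        const double scaled =
            static_cast<double>(cdf[pixels[i]] - cdfMin) * 255.0 / static_cast<double>(total - cdfMin);
        out[i] = static_cast<std::uint8_t>(std::lround(scaled));
    }
    return out;
}
```

With the four input pixels `10, 20, 30, 40`, the narrow input range is spread across the full 8-bit range as `0, 85, 170, 255`.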

lineOfSight

This sample is an implementation of a simple line-of-sight algorithm: Given a height map and a ray originating at some observation point, it computes all the points along the ray that are visible from the observation point. The implementation is based on the Thrust library.
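The visibility test described above can be sketched sequentially on the CPU. This is an illustration only, not the sample's Thrust code: it assumes evenly spaced height samples along the ray, compares slopes rather than angles (equivalent, since the angle grows monotonically with the slope), and treats ties as visible. The function name `lineOfSight` and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// heights[i] is the terrain height at distance (i + 1) * spacing from the observer.
// A point is visible when its slope from the observer is at least as large as
// the slope of every point closer to the observer (a running maximum).
std::vector<bool> lineOfSight(const std::vector<double>& heights,
                              double observerHeight, double spacing) {
    assert(spacing > 0.0);
    std::vector<bool> visible(heights.size());
    double maxSlope = -std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < heights.size(); ++i) {
        const double distance = static_cast<double>(i + 1) * spacing;
        const double slope = (heights[i] - observerHeight) / distance;
        visible[i] = slope >= maxSlope;
        maxSlope = std::max(maxSlope, slope);  // running maximum of slopes seen so far
    }
    return visible;
}
```

The running maximum is what makes the problem a natural fit for a parallel scan: each point's visibility depends only on its own slope and the maximum over the prefix before it.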

matrixMulCUBLAS

This sample implements matrix multiplication from Chapter 3 of the programming guide. To illustrate GPU performance for matrix multiply, it also shows how to use the new CUDA 4.0 interface for CUBLAS to achieve high-performance matrix multiplication.

MersenneTwisterGP11213

This sample demonstrates the Mersenne Twister random number generator GP11213 in cuRAND.

nvJPEG

A CUDA Sample that demonstrates single and batched decoding of jpeg images using NVJPEG Library.

nvJPEG_encoder

A CUDA Sample that demonstrates single encoding of jpeg images using NVJPEG Library.

oceanFFT

This sample simulates an ocean height field using the CUFFT library and renders the result using OpenGL.

randomFog

This sample illustrates pseudo- and quasi-random numbers produced by CURAND.

simpleCUBLAS

Example of using the CUBLAS API to perform GEMM operations.
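A GEMM operation computes `C = alpha * A * B + beta * C`. The CPU reference below is a sketch of that operation only, useful for checking GPU results; it assumes column-major storage (the layout cuBLAS uses), no transposition, and leading dimensions equal to the row counts. The function name `gemmReference` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference GEMM: C = alpha * A * B + beta * C.
// A is m x k, B is k x n, C is m x n, all stored column-major.
void gemmReference(int m, int n, int k, float alpha,
                   const std::vector<float>& A, const std::vector<float>& B,
                   float beta, std::vector<float>& C) {
    assert(A.size() == static_cast<std::size_t>(m) * k);
    assert(B.size() == static_cast<std::size_t>(k) * n);
    assert(C.size() == static_cast<std::size_t>(m) * n);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                acc += A[i + p * m] * B[p + j * k];  // element (i, p) times element (p, j)
            }
            C[i + j * m] = alpha * acc + beta * C[i + j * m];
        }
    }
}
```

For example, with `A = [[1, 2], [3, 4]]`, `B = [[5, 6], [7, 8]]`, `C` initialized to all ones, `alpha = 1` and `beta = 2`, the result is `[[21, 24], [45, 52]]`.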

simpleCUBLAS_LU

CUDA sample demonstrating cuBLAS API cublasDgetrfBatched() for lower-upper (LU) decomposition of a matrix.

simpleCUBLASXT

Example of using the CUBLAS-XT library, which performs GEMM operations over multiple GPUs.

simpleCUFFT

simpleCUFFT_2d_MGPU

Example of using CUFFT. In this example, CUFFT is used to compute the 2D-convolution of some signal with some filter by transforming both into the frequency domain, multiplying them together, and transforming the result back to the time domain on multiple GPUs.

simpleCUFFT_callback

Example of using CUFFT. In this example, CUFFT is used to compute the 1D-convolution of some signal with some filter by transforming both into the frequency domain, multiplying them together, and transforming the result back to the time domain. The difference between this example and the Simple CUFFT example is that the multiplication step is done by the CUFFT kernel with a user-supplied CUFFT callback routine, rather than by a separate kernel call.
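The transform, multiply, inverse-transform sequence can be sketched on the CPU. The code below is an illustration only and does not use CUFFT or callbacks: it swaps in a naive O(n²) DFT for the FFT, and it computes a circular convolution, so inputs would need zero-padding to obtain a linear convolution. The names `dft` and `circularConvolve` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive O(n^2) discrete Fourier transform, standing in for an FFT library.
// sign = -1 gives the forward transform, sign = +1 the unnormalized inverse.
static std::vector<Complex> dft(const std::vector<Complex>& in, int sign) {
    const std::size_t n = in.size();
    const double pi = std::acos(-1.0);
    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = sign * 2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += in[t] * Complex(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// Circular convolution of two equal-length real sequences via the frequency domain.
std::vector<double> circularConvolve(const std::vector<double>& signal,
                                     const std::vector<double>& filter) {
    assert(!signal.empty() && signal.size() == filter.size());
    const std::size_t n = signal.size();
    const std::vector<Complex> S = dft(std::vector<Complex>(signal.begin(), signal.end()), -1);
    const std::vector<Complex> F = dft(std::vector<Complex>(filter.begin(), filter.end()), -1);

    // Pointwise multiplication in the frequency domain is the step that the
    // sample description says is performed inside the transform via a callback.
    std::vector<Complex> product(n);
    for (std::size_t k = 0; k < n; ++k) product[k] = S[k] * F[k];

    const std::vector<Complex> inverse = dft(product, +1);
    std::vector<double> result(n);
    for (std::size_t i = 0; i < n; ++i) result[i] = inverse[i].real() / static_cast<double>(n);
    return result;
}
```

Convolving the signal `1, 2, 3, 4` with the filter `1, 1, 0, 0` adds each element to its circular predecessor, giving `5, 3, 5, 7` up to floating-point rounding.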

simpleCUFFT_MGPU

watershedSegmentationNPP

An NPP CUDA Sample that demonstrates how to use the NPP watershed segmentation function.