Dheemanth aeab82ff30
CUDA 13.2 samples update (#432)
- Added Python samples for CUDA Python 1.0 release
- Renamed top-level `Samples` directory to `cpp` to accommodate Python samples.
2026-05-13 17:13:18 -05:00

2.1 KiB

Sample: Kernel Nsys Profiling - CUDA C++ Kernel Profiling with cuda.core (Python)

Description

This sample demonstrates how to profile custom CUDA C++ kernels compiled and launched with cuda.core using NVIDIA Nsight Systems. It implements three GPU operations (vector addition, SAXPY, vector transform) as custom kernels and shows how to instrument code with NVTX markers for profiling analysis.

What you will learn

  • How to write and compile CUDA C++ kernels with cuda.core.Program
  • How to launch kernels with LaunchConfig and manage CUDA streams
  • How to use NVTX markers (nvtx.annotate()) to annotate code sections
  • How to profile kernels with Nsight Systems and analyze performance
  • Modern CUDA Python workflow with cuda.core.Device and proper resource cleanup

Requirements

  • NVIDIA GPU with Compute Capability 7.0+
  • CUDA Toolkit 13.0+
  • Python 3.10+
  • Packages: numpy, cuda-python, cuda-core, cupy-cuda13x, nvtx (see requirements.txt; NumPy >=2.3.2)

Install:

pip install -r requirements.txt

How to run

python kernelNsysProfile.py
python kernelNsysProfile.py --array-size 10000000  # Custom size

Nsys Profiling

Basic profile:

nsys profile -o gpu_profile python kernelNsysProfile.py
nsys-ui gpu_profile.nsys-rep  # View results

The program uses color-coded NVTX markers:

  • Purple: Phase 2 (cuda.core Custom Kernels - main focus)
  • Yellow/Blue/Green: Other phases
  • Cyan: Nested operations

Focus on Phase 2 to analyze kernel execution times, launch overhead, and GPU utilization.

For detailed Nsys usage and analysis techniques, see the NVIDIA Nsight Systems documentation.

Troubleshooting

Missing packages:

pip install -r requirements.txt

Out of memory:

python kernelNsysProfile.py -n 10000000  # Reduce array size

Nsys not found:

export PATH=/usr/local/cuda/bin:$PATH

See Also