CUDA Python Utilities
Common utilities for CUDA Python samples using the cuda.core API.
Overview
This module provides reusable utility functions for CUDA samples to reduce code duplication. Samples import from cuda_samples_utils.py using simple path-based imports (no package structure needed).
Installation Requirements
Install from the Python samples directory:
cd /path/to/cuda-samples/Python
pip install -r requirements.txt
This installs a common CUDA 13 stack (see python/requirements.txt):
- cuda-python (>=13.0.0)
- cuda-core (>=0.6.0)
- cupy-cuda13x (>=13.0.0)
- numpy (>=2.3.2)
How to Use in Samples
Import utilities using path-based import:
import sys
from pathlib import Path
# Add Utilities directory to path
sys.path.insert(0, str(Path(__file__).parent.parent.parent / "Utilities"))
from cuda_samples_utils import verify_array_result
# Use the utility
if verify_array_result(result, expected):
    print("Success!")
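To see what the `sys.path` line resolves to, the sketch below walks the same `parent.parent.parent` chain on a fixed example path. The sample location is an assumption built from the placeholder path used in the installation step; `PurePosixPath` is used only so the result is the same on every platform.

```python
from pathlib import PurePosixPath

# Assumed location of a sample script (illustrative placeholder path)
sample_file = PurePosixPath(
    "/path/to/cuda-samples/Python/1_GettingStarted/vectorAdd/vectorAdd.py"
)

# .parent                -> .../1_GettingStarted/vectorAdd
# .parent.parent         -> .../1_GettingStarted
# .parent.parent.parent  -> .../Python
utilities_dir = sample_file.parent.parent.parent / "Utilities"

print(utilities_dir)
```

A sample nested at a different depth would need a different number of `.parent` steps.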
Available Functions
Result Verification
verify_array_result(result, expected, rtol=1e-5, atol=1e-8, verbose=True)
Verify computed results match expected values. The helper detects whether both
arguments are NumPy arrays or both are CuPy arrays and uses the matching
library's allclose (no unnecessary cross-device transfers).
Parameters:
- result: NumPy or CuPy array with computed results
- expected: NumPy or CuPy array with expected values (same kind as result)
- rtol: Relative tolerance (default: 1e-5)
- atol: Absolute tolerance (default: 1e-8)
- verbose: Print test result (default: True)
Returns:
True if results match within tolerance, False otherwise
Example:
expected = a + b
if verify_array_result(c, expected):
    print("Computation correct!")
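The dispatch described above could look roughly like the following. This is a hypothetical re-implementation, not the code in cuda_samples_utils.py: the name `verify_array_result_sketch`, the printed messages, and the `TypeError` for mixed inputs are all assumptions. CuPy is imported optionally so the sketch also runs on a machine without it.

```python
import numpy as np

try:  # CuPy is optional here so the sketch also runs without it
    import cupy as cp
except ImportError:
    cp = None


def verify_array_result_sketch(result, expected, rtol=1e-5, atol=1e-8, verbose=True):
    """Hypothetical sketch; not the actual helper from cuda_samples_utils.py."""
    if isinstance(result, np.ndarray) and isinstance(expected, np.ndarray):
        xp = np  # both on the host: compare with NumPy
    elif cp is not None and isinstance(result, cp.ndarray) and isinstance(expected, cp.ndarray):
        xp = cp  # both on the device: compare with CuPy, no host copy of the inputs
    else:
        raise TypeError("result and expected must both be NumPy arrays or both be CuPy arrays")

    # allclose passes when |result - expected| <= atol + rtol * |expected| element-wise
    ok = bool(xp.allclose(result, expected, rtol=rtol, atol=atol))
    if verbose:
        print("Test PASSED" if ok else "Test FAILED")
    return ok
```

With the default tolerances, an absolute error of 1e-9 on values near 1.0 passes, while an error of 1.0 fails.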
Package Check
check_cuda_requirements()
Check if required CUDA packages are available.
Returns:
True if requirements are met, False otherwise
Example:
import sys

if not check_cuda_requirements():
    sys.exit(1)
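One way such a check might be written is with `importlib.util.find_spec`, which looks a module up without fully importing it. The sketch below is an assumption, not the helper's actual implementation: the function names, the default module list, and the printed message are hypothetical.

```python
import importlib.util


def missing_packages(module_names):
    """Return the names in module_names that cannot be located for import."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec imports parent packages first, so a dotted name such as
            # "cuda.core" raises ModuleNotFoundError when "cuda" is absent
            found = False
        if not found:
            missing.append(name)
    return missing


def check_requirements_sketch(module_names=("cuda.core", "cupy", "numpy")):
    """Hypothetical stand-in for check_cuda_requirements(); module list is assumed."""
    missing = missing_packages(module_names)
    for name in missing:
        print(f"Missing required module: {name}")
    return not missing
```

This only confirms that the modules can be found; it does not check installed versions against requirements.txt.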
Design Philosophy
These utilities cover common operations that are not part of the cuda.core API:
- Result verification for NumPy or CuPy arrays
- Package requirements checking
For CUDA operations such as device initialization, kernel compilation, and grid size calculation, samples should call the cuda.core API directly so that they demonstrate its proper usage patterns.
Complete Example
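See ../1_GettingStarted/vectorAdd/vectorAdd.py for a complete example. The excerpt below shows the cuda.core calls; the kernel source and array setup (kernel_source, a, b, c, expected, num_elements) are defined in the full sample: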
import sys
from pathlib import Path
# Import utility
sys.path.insert(0, str(Path(__file__).parent.parent.parent / "Utilities"))
from cuda_samples_utils import verify_array_result
import cupy as cp
from cuda.core import Device, Program, ProgramOptions, LaunchConfig, launch
# Use cuda.core directly for device and kernel operations
device = Device(0)
device.set_current()
stream = device.create_stream()  # stream passed to launch() below
program_options = ProgramOptions(std="c++17", arch=f"sm_{device.arch}")
program = Program(kernel_source, code_type="c++", options=program_options)
module = program.compile("cubin", name_expressions=("kernel_name",))
kernel = module.get_kernel("kernel_name")
# Calculate grid size inline
threads_per_block = 256
blocks_per_grid = (num_elements + threads_per_block - 1) // threads_per_block
# Launch kernel - pass cupy arrays directly
config = LaunchConfig(grid=blocks_per_grid, block=threads_per_block)
launch(stream, config, kernel, a, b, c, cp.int32(num_elements))
# Wait for the kernel to finish, then verify results using the utility
stream.sync()
verify_array_result(c, expected)
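The inline grid size calculation in the excerpt is a ceiling division. The sketch below works through it for an assumed problem size of 50,000 elements; the function name `blocks_per_grid_for` is hypothetical and the size is chosen only for illustration.

```python
def blocks_per_grid_for(num_elements, threads_per_block=256):
    """Ceiling division: smallest block count whose threads cover every element."""
    return (num_elements + threads_per_block - 1) // threads_per_block


num_elements = 50_000  # assumed problem size for illustration
threads_per_block = 256

blocks = blocks_per_grid_for(num_elements, threads_per_block)
total_threads = blocks * threads_per_block

# (50000 + 255) // 256 = 196 blocks, i.e. 50176 threads for 50000 elements
print(blocks, total_threads)
```

Because the launch rounds up, some threads in the last block have no element to process. Passing `num_elements` to the kernel, as the excerpt does, presumably lets the kernel bounds-check its index for that reason.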
Benefits
- Code Reuse: Write common functionality once
- Consistency: All samples use the same patterns
- Maintainability: Bug fixes benefit all samples
- Transparency: Samples show cuda.core API usage directly
- Simplicity: No complex package structure needed