Dheemanth aeab82ff30
CUDA 13.2 samples update (#432)
- Added Python samples for CUDA Python 1.0 release
- Renamed top-level `Samples` directory to `cpp` to accommodate Python samples.
2026-05-13 17:13:18 -05:00

135 lines
3.7 KiB
Markdown

# CUDA Python Utilities
Common utilities for CUDA Python samples using the `cuda.core` API.
## Overview
This module provides reusable utility functions for CUDA samples to reduce code duplication. Samples import from `cuda_samples_utils.py` using simple path-based imports (no package structure needed).
## Installation Requirements
Install from the Python samples directory:
```bash
cd /path/to/cuda-samples/Python
pip install -r requirements.txt
```
This installs a common CUDA 13 stack (see `python/requirements.txt`):
- `cuda-python` (>=13.0.0)
- `cuda-core` (>=0.6.0)
- `cupy-cuda13x` (>=13.0.0)
- `numpy` (>=2.3.2)
## How to Use in Samples
Import utilities using path-based import:
```python
import sys
from pathlib import Path
# Add Utilities directory to path
sys.path.insert(0, str(Path(__file__).parent.parent.parent / "Utilities"))
from cuda_samples_utils import verify_array_result
# Use the utility
if verify_array_result(result, expected):
print("Success!")
```
## Available Functions
### Result Verification
#### `verify_array_result(result, expected, rtol=1e-5, atol=1e-8, verbose=True)`
Verify computed results match expected values. The helper detects whether both
arguments are NumPy arrays or both are CuPy arrays and uses the matching
library's `allclose` (no unnecessary cross-device transfers).
**Parameters:**
- `result`: NumPy or CuPy array with computed results
- `expected`: NumPy or CuPy array with expected values (same kind as `result`)
- `rtol`: Relative tolerance (default: 1e-5)
- `atol`: Absolute tolerance (default: 1e-8)
- `verbose`: Print test result (default: True)
**Returns:**
- `True` if results match within tolerance, `False` otherwise
**Example:**
```python
expected = a + b
if verify_array_result(c, expected):
print("Computation correct!")
```
### Package Check
#### `check_cuda_requirements()`
Check if required CUDA packages are available.
**Returns:**
- `True` if requirements are met, `False` otherwise
**Example:**
```python
if not check_cuda_requirements():
sys.exit(1)
```
## Design Philosophy
These utilities focus on common operations that are **not** part of `cuda.core` API:
- Result verification for NumPy or CuPy arrays
- Package requirements checking
For CUDA operations like device initialization, kernel compilation, and grid size calculations, samples should use `cuda.core` API directly to demonstrate the proper usage patterns.
## Complete Example
See `../1_GettingStarted/vectorAdd/vectorAdd.py` for a complete example:
```python
import sys
from pathlib import Path
# Import utility
sys.path.insert(0, str(Path(__file__).parent.parent.parent / "Utilities"))
from cuda_samples_utils import verify_array_result
import cupy as cp
from cuda.core import Device, Program, ProgramOptions, LaunchConfig, launch
# Use cuda.core directly for device and kernel operations
device = Device(0)
device.set_current()
program_options = ProgramOptions(std="c++17", arch=f"sm_{device.arch}")
program = Program(kernel_source, code_type="c++", options=program_options)
module = program.compile("cubin", name_expressions=("kernel_name",))
kernel = module.get_kernel("kernel_name")
# Calculate grid size inline
threads_per_block = 256
blocks_per_grid = (num_elements + threads_per_block - 1) // threads_per_block
# Launch kernel - pass cupy arrays directly
config = LaunchConfig(grid=blocks_per_grid, block=threads_per_block)
launch(stream, config, kernel, a, b, c, cp.int32(num_elements))
# Verify results using utility
verify_array_result(c, expected)
```
## Benefits
- **Code Reuse**: Write common functionality once
- **Consistency**: All samples use the same patterns
- **Maintainability**: Bug fixes benefit all samples
- **Transparency**: Samples show cuda.core API usage directly
- **Simplicity**: No complex package structure needed