mirror of
https://github.com/NVIDIA/cuda-samples.git
synced 2026-05-14 14:06:53 +08:00
- Added Python samples for CUDA Python 1.0 release - Renamed top-level `Samples` directory to `cpp` to accommodate Python samples.
131 lines
3.4 KiB
Markdown
131 lines
3.4 KiB
Markdown
# Sample: Vector Addition (Python)
|
|
|
|
## Description
|
|
|
|
Run your first GPU kernel: add two vectors element-wise on the GPU using the [`cuda.core`](https://nvidia.github.io/cuda-python/cuda-core/latest/) API with runtime compilation.
|
|
|
|
## What You'll Learn
|
|
|
|
- Writing CUDA kernels in C++ with template support
|
|
- Runtime compilation of CUDA kernels from Python
|
|
- Using `cuda.core` for device management, programs, and launches
|
|
- Configuring and launching kernels with grid and block dimensions
|
|
- Using CuPy for GPU memory management
|
|
- Verifying GPU results against CPU computation
|
|
|
|
## Key Libraries
|
|
|
|
- [`cuda.core`](https://nvidia.github.io/cuda-python/cuda-core/latest/) — Pythonic access to CUDA runtime and compilation
|
|
- `cupy` — GPU array library for Python
|
|
|
|
## Key APIs
|
|
|
|
### From `cuda.core`
|
|
|
|
- `Device` — Initialize and manage CUDA device
|
|
- `Program` — Create program from kernel source code
|
|
- `ProgramOptions` — Set compilation options (C++ standard, architecture)
|
|
- `LaunchConfig` — Configure kernel launch parameters
|
|
- `launch` — Execute kernel on specified stream
|
|
|
|
Import stable symbols from the top-level package (not `cuda.core.experimental`). See the [cuda.core documentation](https://nvidia.github.io/cuda-python/cuda-core/latest/).
|
|
|
|
### From CuPy
|
|
|
|
- `cp.random.rand()` — Generate random arrays on GPU
|
|
- `cp.empty()` — Allocate uninitialized GPU arrays
|
|
- `cp.allclose()` — Verify results with tolerance
|
|
|
|
### From `cuda_samples_utils`
|
|
|
|
- `verify_array_result()` — Verify computation results
|
|
|
|
## Kernel Techniques
|
|
|
|
- **1D Grid-Stride Loop** — Handle arbitrary array sizes with fixed grid
|
|
- **Template Programming** — Generic kernel for different data types
|
|
- **Bounds Checking** — Prevent out-of-bounds memory access
|
|
|
|
## Requirements
|
|
|
|
### Hardware
|
|
|
|
- NVIDIA GPU with Compute Capability 7.0 or higher
|
|
- Minimum GPU memory: 512 MB
|
|
|
|
### Software
|
|
|
|
- CUDA Toolkit 13.0 or newer (matches `cuda-python` 13.x)
|
|
- Python 3.10 or newer
|
|
- `cuda-python` (>=13.0.0)
|
|
- `cuda-core` (>=0.6.0)
|
|
- `cupy-cuda13x` (>=13.0.0)
|
|
|
|
## Installation
|
|
|
|
Install the required packages from requirements.txt:
|
|
|
|
```bash
|
|
cd /path/to/cuda-samples/python/1_GettingStarted/vectorAdd
|
|
pip install -r requirements.txt
|
|
```
|
|
|
|
The requirements.txt installs:
|
|
|
|
- `cuda-python` (>=13.0.0)
|
|
- `cuda-core` (>=0.6.0)
|
|
- `cupy-cuda13x` (>=13.0.0)
|
|
|
|
## How to Run
|
|
|
|
### Basic usage
|
|
|
|
```bash
|
|
cd samples/python/1_GettingStarted/vectorAdd
|
|
python vectorAdd.py
|
|
```
|
|
|
|
### With custom parameters
|
|
|
|
```bash
|
|
# Custom vector size
|
|
python vectorAdd.py --elements 1000000
|
|
|
|
# Use specific GPU
|
|
python vectorAdd.py --device 1
|
|
|
|
# Skip verification for benchmarking
|
|
python vectorAdd.py --no-verify
|
|
```
|
|
|
|
## Expected Output
|
|
|
|
```
|
|
[Vector addition using CUDA Core API]
|
|
Device: <Your GPU Name>
|
|
Compute Capability: sm_<XX>
|
|
Compiling kernel 'vectorAdd<float>'...
|
|
Kernel compiled successfully
|
|
[Vector addition of 50000 elements]
|
|
CUDA kernel launch with 196 blocks of 256 threads
|
|
Verifying result...
|
|
Test PASSED
|
|
|
|
Done
|
|
```
|
|
|
|
**Note:** Device name and compute capability will vary based on your GPU.
|
|
|
|
## Files
|
|
|
|
- `vectorAdd.py` — Python implementation using cuda.core API
|
|
- `README.md` — This file
|
|
- `requirements.txt` — Sample dependencies
|
|
- `../../Utilities/cuda_samples_utils.py` — Common utilities (imported by this sample)
|
|
|
|
## See Also
|
|
|
|
- [CUDA Python Documentation](https://nvidia.github.io/cuda-python/)
|
|
- [cuda.core API](https://nvidia.github.io/cuda-python/cuda-core/latest/)
|
|
- [CuPy Documentation](https://docs.cupy.dev/)
|