
ipcMemoryPool (Python)

Description

This sample demonstrates how to share GPU memory between Python processes using CUDA Inter-Process Communication (IPC) and cuda.core's IPC-enabled memory pools.

By default, each process has its own CUDA virtual address space and cannot see allocations made by another process. With an IPC-enabled DeviceMemoryResource, the parent allocates once, and the child process maps the same physical GPU memory into its own address space, so both processes read and write the same bytes. The sample performs a round-trip test:

  1. Parent creates an IPC-enabled DeviceMemoryResource and allocates a Buffer.
  2. Parent fills the buffer with a known pattern.
  3. Parent sends the Buffer to a child process through a multiprocessing.Queue. cuda.core's pickle reducers re-create the memory resource and map the buffer in the child.
  4. Child verifies the parent's pattern, writes a new pattern, and signals completion.
  5. Parent verifies the child's writes.
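The round-trip above can be sketched as follows. The pattern helpers are plain NumPy; the cuda.core calls mirror the API names listed under "Key APIs" below, but the `cuda.core.experimental` import path and the overall flow are an illustrative assumption, not the sample's actual code, and the GPU portion only runs on an IPC-capable Linux device.

```python
import multiprocessing as mp
import numpy as np

NBYTES = 4096  # 1024 float32 elements, matching the expected output

def parent_pattern(n):
    # Known pattern the parent writes: 100, 101, 102, ...
    return np.arange(100, 100 + n, dtype=np.float32)

def child_pattern(n):
    # Pattern the child writes back: -0, -1, -2, ...
    return -np.arange(n, dtype=np.float32)

def run_round_trip():
    # Requires a Linux GPU with memory-pool IPC support.
    from cuda.core.experimental import (
        Device, DeviceMemoryResource, DeviceMemoryResourceOptions)
    dev = Device()
    dev.set_current()
    opts = DeviceMemoryResourceOptions(ipc_enabled=True)
    mr = DeviceMemoryResource(dev, options=opts)
    buf = mr.allocate(NBYTES)                 # IPC-shareable Buffer
    queue = mp.get_context("spawn").Queue()   # CUDA requires "spawn"
    queue.put(buf)                            # pickled via cuda.core reducers
    # ... spawn the child, wait for its signal, then verify child_pattern.

if __name__ == "__main__":
    print(parent_pattern(5))  # [100. 101. 102. 103. 104.]
```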

What You'll Learn

  • Enabling IPC on a DeviceMemoryResource with ipc_enabled=True
  • Sending Buffer objects across process boundaries via mp.Queue
  • How cuda.core's pickle reducers rebuild the MR and map the buffer in the receiving process
  • Why multiprocessing must use the "spawn" start method with CUDA
  • Detecting IPC support at runtime (POSIX file-descriptor handle type, memory-pool support, Linux-only)
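The "spawn" requirement can be seen with plain multiprocessing, no CUDA needed: spawn starts the child with a fresh interpreter, so it never inherits the parent's CUDA context (which fork would duplicate in an unusable state). A minimal sketch:

```python
import multiprocessing as mp

def child(queue):
    # Under "spawn" this module is re-imported from scratch in the child,
    # so no parent-process CUDA state is inherited.
    queue.put("hello from child")

if __name__ == "__main__":
    ctx = mp.get_context("spawn")  # explicit start method, as CUDA requires
    q = ctx.Queue()
    p = ctx.Process(target=child, args=(q,))
    p.start()
    print(q.get())  # hello from child
    p.join()
```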

Key Libraries

  • cuda.core - IPC-enabled memory resources and buffer reducers
  • cupy - zero-copy views over the shared device memory via DLPack
  • multiprocessing - standard library process management
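The zero-copy behaviour cupy relies on can be illustrated with NumPy's own DLPack support (same protocol, host memory standing in for device memory): the consumer of a DLPack capsule gets a view over the producer's buffer, not a copy.

```python
import numpy as np

# Producer: any array implementing __dlpack__. In the sample this role is
# played by the shared device buffer; here a host array stands in.
a = np.arange(8, dtype=np.float32)

# Consumer: from_dlpack wraps the same memory without copying.
b = np.from_dlpack(a)

a[0] = 99.0     # write through the producer...
print(b[0])     # 99.0  ...is visible through the consumer's view
```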

Key APIs

From cuda.core

  • DeviceMemoryResource(device, options=DeviceMemoryResourceOptions(ipc_enabled=True)) - create an IPC-enabled memory pool
  • DeviceMemoryResourceOptions(max_size=..., ipc_enabled=True) - configure the underlying pool
  • mr.allocate(nbytes) - allocate a Buffer from the IPC pool
  • Buffer.is_mapped - True when the buffer is usable in the current process
  • Device.properties.memory_pools_supported - runtime feature check
  • Device.properties.handle_type_posix_file_descriptor_supported - runtime feature check

From cuda_samples_utils

  • print_gpu_info() - print device name and compute capability

Requirements

Hardware

  • NVIDIA GPU with Compute Capability 7.0 or higher
  • Device that supports CUDA memory pools and POSIX file-descriptor IPC handles (the sample detects and reports this at startup)
  • Minimum GPU memory: 512 MB

Software

  • Linux x86_64 (POSIX file-descriptor IPC handles are not available on Windows or macOS)
  • CUDA Toolkit 13.0 or newer (matches cuda-python 13.x)
  • Python 3.10 or newer
  • cuda-python (>=13.0.0)
  • cuda-core (>=0.6.0)
  • cupy-cuda13x (>=13.0.0)

Installation

Install the required packages from requirements.txt:

cd /path/to/cuda-samples/python/4_DistributedComputing/ipcMemoryPool
pip install -r requirements.txt

The requirements.txt installs:

  • cuda-python (>=13.0.0)
  • cuda-core (>=0.6.0)
  • cupy-cuda13x (>=13.0.0)

How to Run

Basic usage

cd cuda-samples/python/4_DistributedComputing/ipcMemoryPool
python ipcMemoryPool.py

With custom parameters

# Larger shared buffer
python ipcMemoryPool.py --elements 65536

# Use a specific GPU
python ipcMemoryPool.py --device 1

On platforms or devices that do not support CUDA IPC, the sample prints a diagnostic and exits cleanly with status 0.

Expected Output

Device: <Your GPU Name>
Compute Capability: <X.Y>

Created IPC-enabled DeviceMemoryResource (is_ipc_enabled=True)
Parent wrote pattern (first 5 values): [100. 101. 102. 103. 104.]
Parent sent buffer to child pid=<pid>; waiting...
[child pid=<pid>] received buffer: is_mapped=True, size=4096
Parent sees child's pattern (first 5 values): [-0. -1. -2. -3. -4.]
IPC round-trip: OK

Note: Device name, compute capability, and child PID will vary based on your system.

Files

  • ipcMemoryPool.py - Python implementation using cuda.core IPC memory pools
  • README.md - This file
  • requirements.txt - Sample dependencies
  • ../../Utilities/cuda_samples_utils.py - Common utilities (imported by this sample)

See Also