2026-05-27 21:03:57 +00:00

cubDeviceSegmentedScan - CUB DeviceSegmentedScan

Description

This sample demonstrates cub::DeviceSegmentedScan. A segmented scan computes an independent scan over each of many contiguous segments in a single device-wide call. Two operations are shown: ExclusiveSegmentedSum across three independent segments, and InclusiveSegmentedScan with a custom binary operator (running maximum via cuda::maximum<>).
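
To make the semantics concrete, here is a minimal host-side reference sketch in plain C++. It is not the CUB API and not the sample's code. The function names and the offsets layout (segment s covers the half-open range [offsets[s], offsets[s+1])) are assumptions chosen for illustration, and std::max stands in for cuda::maximum<>.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference: exclusive prefix sum restarted at every segment.
// Assumed layout: segment s covers the indices [offsets[s], offsets[s + 1]).
std::vector<int> exclusive_segmented_sum(const std::vector<int>& in,
                                         const std::vector<std::size_t>& offsets)
{
    std::vector<int> out(in.size(), 0);
    for (std::size_t s = 0; s + 1 < offsets.size(); ++s) {
        assert(offsets[s] <= offsets[s + 1] && offsets[s + 1] <= in.size());
        int running = 0;  // each segment starts again from the identity of +
        for (std::size_t i = offsets[s]; i < offsets[s + 1]; ++i) {
            out[i] = running;  // exclusive: write the sum of the elements before i
            running += in[i];
        }
    }
    return out;
}

// Hypothetical CPU reference: inclusive scan with a caller-supplied binary
// operator, restarted at every segment. Empty segments are skipped.
template <typename T, typename Op>
std::vector<T> inclusive_segmented_scan(const std::vector<T>& in,
                                        const std::vector<std::size_t>& offsets,
                                        Op op)
{
    std::vector<T> out(in.size());
    for (std::size_t s = 0; s + 1 < offsets.size(); ++s) {
        assert(offsets[s] <= offsets[s + 1] && offsets[s + 1] <= in.size());
        if (offsets[s] == offsets[s + 1]) continue;
        T running = in[offsets[s]];  // inclusive: first output is the first input
        out[offsets[s]] = running;
        for (std::size_t i = offsets[s] + 1; i < offsets[s + 1]; ++i) {
            running = op(running, in[i]);
            out[i] = running;
        }
    }
    return out;
}

// Running maximum per segment; std::max plays the role of cuda::maximum<>.
std::vector<int> inclusive_segmented_max(const std::vector<int>& in,
                                         const std::vector<std::size_t>& offsets)
{
    return inclusive_segmented_scan(in, offsets,
                                    [](int a, int b) { return std::max(a, b); });
}
```

With offsets {0, 3, 5, 9}, the input {1, 2, 3, 4, 5, 6, 7, 8, 9} yields the exclusive sums {0, 1, 3, 0, 4, 0, 6, 13, 21}, and the input {3, 1, 4, 1, 5, 9, 2, 6, 5} yields the running maxima {3, 3, 4, 1, 5, 9, 9, 9, 9}. A reference like this can be used to check the output of the device-side calls.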

Key Concepts

CUB Device Algorithms, Segmented Scan, Prefix Sum

Supported SM Architectures

SM 7.0 SM 7.5 SM 8.0 SM 8.6 SM 8.9 SM 9.0 SM 10.0 SM 11.0 SM 12.0

Supported OSes

Linux, Windows

Supported CPU Architecture

x86_64, aarch64

CUDA APIs involved

CCCL CUB

cub::DeviceSegmentedScan::ExclusiveSegmentedSum, cub::DeviceSegmentedScan::InclusiveSegmentedScan

CCCL libcu++

cuda::maximum

CUDA Runtime API

cudaDeviceSynchronize, cudaGetDeviceProperties

Dependencies needed to build/run

CCCL 3.3 or later. It is fetched automatically via CPM at configure time (pinned to v3.3.3). To use a local checkout instead, pass -DCCCL_SOURCE_DIR=/path/to/cccl.

Prerequisites

Download and install the CUDA Toolkit for your platform. Make sure the dependencies listed in the "Dependencies needed to build/run" section above are available.

References (for more details)

CCCL 3.3 release notes, cub::DeviceSegmentedScan header