mirror of
https://github.com/NVIDIA/cuda-samples.git
synced 2024-11-25 03:39:16 +08:00
24 lines
944 B
Plaintext
24 lines
944 B
Plaintext
|
starting Simple Print (CUDA Dynamic Parallelism)
|
||
|
GPU Device 0: "Hopper" with compute capability 9.0
|
||
|
|
||
|
***************************************************************************
|
||
|
The CPU launches 2 blocks of 2 threads each. On the device each thread will
|
||
|
launch 2 blocks of 2 threads each. The GPU we will do that recursively
|
||
|
until it reaches max_depth=2
|
||
|
|
||
|
In total 2+8=10 blocks are launched!!! (8 from the GPU)
|
||
|
***************************************************************************
|
||
|
|
||
|
Launching cdp_kernel() with CUDA Dynamic Parallelism:
|
||
|
|
||
|
BLOCK 1 launched by the host
|
||
|
BLOCK 0 launched by the host
|
||
|
| BLOCK 3 launched by thread 0 of block 1
|
||
|
| BLOCK 2 launched by thread 0 of block 1
|
||
|
| BLOCK 4 launched by thread 0 of block 0
|
||
|
| BLOCK 5 launched by thread 0 of block 0
|
||
|
| BLOCK 7 launched by thread 1 of block 0
|
||
|
| BLOCK 6 launched by thread 1 of block 0
|
||
|
| BLOCK 9 launched by thread 1 of block 1
|
||
|
| BLOCK 8 launched by thread 1 of block 1
|