mirror of
https://github.com/NVIDIA/cuda-samples.git
synced 2024-11-25 00:19:18 +08:00
24 lines
1.1 KiB
Plaintext
GPU Device 0: "Hopper" with compute capability 9.0

16777216 elements
threads per block = 512
Graph Launch iterations = 3

Num of nodes in the graph created manually = 7
[cudaGraphsManual] Host callback final reduced sum = 0.996214
[cudaGraphsManual] Host callback final reduced sum = 0.996214
[cudaGraphsManual] Host callback final reduced sum = 0.996214
Cloned Graph Output..
[cudaGraphsManual] Host callback final reduced sum = 0.996214
[cudaGraphsManual] Host callback final reduced sum = 0.996214
[cudaGraphsManual] Host callback final reduced sum = 0.996214

Num of nodes in the graph created using stream capture API = 7
[cudaGraphsUsingStreamCapture] Host callback final reduced sum = 0.996214
[cudaGraphsUsingStreamCapture] Host callback final reduced sum = 0.996214
[cudaGraphsUsingStreamCapture] Host callback final reduced sum = 0.996214
Cloned Graph Output..
[cudaGraphsUsingStreamCapture] Host callback final reduced sum = 0.996214
[cudaGraphsUsingStreamCapture] Host callback final reduced sum = 0.996214
[cudaGraphsUsingStreamCapture] Host callback final reduced sum = 0.996214