mirror of
https://github.com/NVIDIA/cuda-samples.git
synced 2024-11-25 02:59:15 +08:00
24 lines
717 B
Plaintext
24 lines
717 B
Plaintext
[ simpleStreams ]
|
|
|
|
Device synchronization method set to = 0 (Automatic Blocking)
|
|
Setting reps to 100 to demonstrate steady state
|
|
|
|
> GPU Device 0: "Hopper" with compute capability 9.0
|
|
|
|
Device: <NVIDIA H100 PCIe> canMapHostMemory: Yes
|
|
> CUDA Capable: SM 9.0 hardware
|
|
> 114 Multiprocessor(s) x 128 (Cores/Multiprocessor) = 14592 (Cores)
|
|
> scale_factor = 1.0000
|
|
> array_size = 16777216
|
|
|
|
> Using CPU/GPU Device Synchronization method (cudaDeviceScheduleAuto)
|
|
> mmap() allocating 64.00 Mbytes (generic page-aligned system memory)
|
|
> cudaHostRegister() registering 64.00 Mbytes of generic allocated system memory
|
|
|
|
Starting Test
|
|
memcopy: 2.45
|
|
kernel: 0.12
|
|
non-streamed: 2.80
|
|
4 streams: 2.66
|
|
-------------------------------
|