mirror of
https://github.com/NVIDIA/cuda-samples.git
synced 2024-11-28 19:19:17 +08:00
36 lines
1009 B
Plaintext
[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
Device: 0, NVIDIA H100 PCIe, pciBusID: c1, pciDeviceID: 0, pciDomainID:0

***NOTE: In case a device doesn't have P2P access to other one, it falls back to normal memcopy procedure.
So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.

P2P Connectivity Matrix
   D\D       0
     0       1
Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D       0
     0 1628.72
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
   D\D       0
     0 1625.75
Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D       0
     0 1668.11
Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D       0
     0 1668.39
P2P=Disabled Latency Matrix (us)
   GPU       0
     0    2.67

   CPU       0
     0    2.04
P2P=Enabled Latency (P2P Writes) Matrix (us)
   GPU       0
     0    2.68

   CPU       0
     0    2.02

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.