Name: p2pBandwidthLatencyTest
Title: Peer-to-Peer Bandwidth Latency Test with Multi-GPUs
Type: exe
Primary file: p2pBandwidthLatencyTest.cu

CUDA API list:
  cudaDeviceEnablePeerAccess
  cudaOccupancyMaxPotentialBlockSize
  cudaStreamCreateWithFlags
  cudaDeviceCanAccessPeer
  cudaStreamDestroy
  cudaHostAlloc
  cudaEventCreate
  cudaMalloc
  cudaEventDestroy
  cudaSetDevice
  cudaMemcpyPeerAsync
  cudaGetDeviceProperties
  cudaCheckError
  cudaGetDeviceCount
  cudaEventElapsedTime
  cudaGetLastError
  cudaDeviceDisablePeerAccess
  cudaStreamSynchronize
  cudaGetErrorString
  cudaStreamWaitEvent
  cudaMemset
  cudaFree
  cudaEventRecord
  cudaFreeHost

Device compilation: whole
Include paths: ./, ../, ../../../Common

Key concepts:
  Performance Strategies
  Asynchronous Data Transfers
  Unified Virtual Address Space
  Peer to Peer Data Transfers
  Multi-GPU

Keywords: CUDA, Performance, multi-GPU support, peer to peer
Nsight Eclipse: true

Scopes:
  1:CUDA Basic Topics
  1:Performance Strategies

SM architectures: sm35, sm37, sm50, sm52, sm53, sm60, sm61, sm70, sm72, sm75, sm80, sm86, sm87

Supported environments:
  x86_64 linux
  windows7
  x86_64 macosx
  arm
  sbsa
  ppc64le linux

Supported SM architectures: all