Name: streamOrderedAllocationP2P
Compiler flags: --std=c++11
CUDA APIs: cudaDeviceGetDefaultMemPool, cudaFreeAsync, cudaStreamCreateWithFlags, cudaMemPoolSetAccess, cudaStreamDestroy, cudaDeviceGetAttribute, cudaMallocAsync, cudaSetDevice, cudaGetDeviceCount, cudaEventRecord, cudaStreamSynchronize, cudaStreamWaitEvent, cudaMemcpyAsync, cudaDeviceCanAccessPeer, cudaEventCreate
Device compilation: whole
Include paths: ./, ../, ../../../Common
Key concepts: Performance Strategies
Nsight Eclipse: true
Primary file: streamOrderedAllocationP2P.cu
Scopes: 1:CUDA Basic Topics, 1:Performance Strategies
SM architectures: sm60, sm61, sm70, sm72, sm75, sm80, sm86, sm87, sm89, sm90
Supported environments: x86_64 linux; windows7; arm; sbsa; ppc64le linux
Minimum SM architecture: 6.0
Title: stream Ordered Allocation Peer-to-Peer access
Type: exe