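Name: asyncAPI
CUDA API calls: cudaEventCreate, cudaEventRecord, cudaEventQuery, cudaEventDestroy, cudaEventElapsedTime, cudaMemcpyAsync
Device compilation: whole
Include paths: ./, ../, ../../Common
Key concepts: Asynchronous Data Transfers, CUDA Streams and Events
Keywords: GPGPU
Nsight Eclipse: true
Primary file: asyncAPI.cu
Scopes: 1:CUDA Basic Topics, 1:Performance Strategies
SM architectures: sm35, sm37, sm50, sm52, sm60, sm61, sm70, sm72, sm75, sm80, sm86
Supported environments: x86_64 linux windows7 x86_64 macosx arm ppc64le linux
Supported SM architectures: all
Title: asyncAPI
Type: exe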