globalToShmemAsyncCopy
--std=c++11
cudaEventCreate
cudaEventRecord
cudaEventQuery
cudaEventDestroy
cudaEventElapsedTime
cudaEventSynchronize
cudaMalloc
cudaFree
cudaMemcpy
whole
./
../
../../common/inc
CUDA Runtime API
Linear Algebra
CPP11 CUDA
CUDA
matrix multiply
Async copy
CPP11
GCC 5.0.0
true
globalToShmemAsyncCopy.cu
CPP11
1:CUDA Basic Topics
3:Linear Algebra
sm35
sm37
sm50
sm52
sm60
sm61
sm70
sm72
sm75
sm80
x86_64
linux
x86_64
macosx
arm
ppc64le
linux
aarch64
linux
aarch64
qnx
windows7
all
Global Memory to Shared Memory Async Copy