matrixMul
cudaStreamCreateWithFlags
cudaProfilerStop
cudaMalloc
cudaFree
cudaMallocHost
cudaProfilerStart
cudaEventSynchronize
cudaEventRecord
cudaFreeHost
cudaStreamSynchronize
cudaEventDestroy
cudaEventElapsedTime
cudaMemcpyAsync
cudaEventCreate
whole
./
../
../../../Common
CUDA Runtime API
Linear Algebra
CUDA
matrix multiply
true
matrixMul.cu
1:CUDA Basic Topics
3:Linear Algebra
sm50
sm52
sm53
sm60
sm61
sm70
sm72
sm75
sm80
sm86
sm87
sm90
x86_64
linux
windows7
x86_64
macosx
arm
aarch64
sbsa
ppc64le
linux
all
Matrix Multiplication (CUDA Runtime API Version)