Mirror of https://github.com/NVIDIA/cuda-samples.git (synced 2024-11-28 17:49:18 +08:00)
[./convolutionFFT2D] - Starting...
GPU Device 0: "Hopper" with compute capability 9.0
Testing built-in R2C / C2R FFT-based convolution
...allocating memory
...generating random input data
...creating R2C & C2R FFT plans for 2048 x 2048
...uploading to GPU and padding convolution kernel and input data
...transforming convolution kernel
...running GPU FFT convolution: 33613.444604 MPix/s (0.119000 ms)
...reading back GPU convolution results
...running reference CPU convolution
...comparing the results: rel L2 = 9.395370E-08 (max delta = 1.208283E-06)
L2norm Error OK
...shutting down
Testing custom R2C / C2R FFT-based convolution
...allocating memory
...generating random input data
...creating C2C FFT plan for 2048 x 1024
...uploading to GPU and padding convolution kernel and input data
...transforming convolution kernel
...running GPU FFT convolution: 29197.081461 MPix/s (0.137000 ms)
...reading back GPU FFT results
...running reference CPU convolution
...comparing the results: rel L2 = 1.067915E-07 (max delta = 9.817303E-07)
L2norm Error OK
...shutting down
Testing updated custom R2C / C2R FFT-based convolution
...allocating memory
...generating random input data
...creating C2C FFT plan for 2048 x 1024
...uploading to GPU and padding convolution kernel and input data
...transforming convolution kernel
...running GPU FFT convolution: 39603.959017 MPix/s (0.101000 ms)
...reading back GPU FFT results
...running reference CPU convolution
...comparing the results: rel L2 = 1.065127E-07 (max delta = 9.817303E-07)
L2norm Error OK
...shutting down
Test Summary: 0 errors
Test passed