
cuDLALayerwiseStatsStandalone - cuDLA Layerwise Statistics Standalone Mode

Description

This sample shows how an application can obtain layerwise statistics in cuDLA standalone mode, in which the DLA is programmed without using CUDA.

Key Concepts

cuDLA, Data Parallel Algorithms, Image Processing

Supported SM Architectures

SM 6.0 SM 6.1 SM 7.0 SM 7.2 SM 7.5 SM 8.0 SM 8.6 SM 8.7 SM 8.9 SM 9.0

Supported OSes

Linux, QNX

Supported CPU Architecture

aarch64

CUDA APIs involved

Dependencies needed to build/run

NVSCI

Prerequisites

Download and install the CUDA Toolkit 12.4 for your platform. Make sure the dependencies listed in the Dependencies section above are installed.

Build and Run

Linux

The Linux samples are built using makefiles. To use the makefiles, change the current directory to the sample directory you wish to build, and run make:

$ cd <sample_dir>
$ make

The samples makefiles can take advantage of certain options:

  • TARGET_ARCH=<arch> - cross-compile targeting a specific architecture. The only allowed architecture for this sample is aarch64. By default, TARGET_ARCH is set to HOST_ARCH. On an x86_64 machine, not setting TARGET_ARCH is equivalent to setting TARGET_ARCH=x86_64.

    $ make TARGET_ARCH=aarch64

    See here for more details.

  • dbg=1 - build with debug symbols

    $ make dbg=1
    
  • SMS="A B ..." - override the SM architectures for which the sample will be built, where "A B ..." is a space-delimited list of SM architectures. For example, to generate SASS for SM 50 and SM 60, use SMS="50 60".

    $ make SMS="50 60"
    
  • HOST_COMPILER=<host_compiler> - override the default g++ host compiler. See the Linux Installation Guide for a list of supported host compilers.

    $ make HOST_COMPILER=g++
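
The options above can be combined on a single make command line. The following is a minimal sketch rather than part of the sample: suggest_make_cmd is a hypothetical helper, and the SM value 87 is an assumption chosen for illustration. It only prints a suggested command and does not run make, adding TARGET_ARCH=aarch64 when the host reported by uname -m is not already aarch64.

```shell
#!/bin/sh
# Sketch only: print (do not run) a make command line for this sample.
# suggest_make_cmd is a hypothetical helper, not part of cuda-samples.
#   $1: host architecture, as reported by `uname -m`
#   $2: target SM architecture (87 below is an assumed example value)
suggest_make_cmd() {
    host_arch=$1
    sm=$2
    cmd="make"
    # The sample supports only aarch64, so request cross-compilation
    # whenever the host is a different architecture (e.g. x86_64).
    if [ "$host_arch" != "aarch64" ]; then
        cmd="$cmd TARGET_ARCH=aarch64"
    fi
    # Debug symbols plus an explicit SM override.
    cmd="$cmd dbg=1 SMS=$sm"
    printf '%s\n' "$cmd"
}

# Print the suggestion for the machine this script runs on.
suggest_make_cmd "$(uname -m)" 87
```

Review the printed command before running it from the sample directory. A cross-compile may also need HOST_COMPILER set to a suitable cross compiler, which this sketch leaves out.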

References (for more details)