trtorchc

trtorchc is a CLI application for using the TRTorch compiler. It serves as an easy way to compile a TorchScript module with TRTorch from the command line, whether to quickly check support or as part of a deployment pipeline. All basic features of the compiler are supported, including post-training quantization (though you must already have a calibration cache file to use the PTQ feature). The compiler can output two formats: either a TorchScript program with the TensorRT engine embedded, or the TensorRT engine itself as a PLAN file.

All that is required to run a compiled program is linking against libtrtorch.so in C++, or importing the trtorch package in Python. All other aspects of using compiled modules are identical to standard TorchScript: load them with torch.jit.load() and run them like any other module.
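For example, a module compiled by trtorchc can be loaded and run from Python. Below is a minimal sketch; it assumes the ssd_trt.ts output of the example invocation at the end of this page, compiled with -p f16 for 1x3x300x300 inputs (the file name and shape are placeholders taken from that example, not fixed by the tool):

    import torch
    import trtorch  # importing trtorch makes the TensorRT engine runtime available to TorchScript

    # Placeholder path: matches the example invocation at the end of this page
    trt_module = torch.jit.load("ssd_trt.ts")

    # Inputs must live on the GPU; .half() matches the f16 precision the
    # module is assumed to have been compiled with
    input_data = torch.randn((1, 3, 300, 300)).half().to("cuda")
    result = trt_module(input_data)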

trtorchc [input_file_path] [output_file_path]
    [input_shapes...] {OPTIONS}

    TRTorch is a compiler for TorchScript; it will compile and optimize
    TorchScript programs to run on NVIDIA GPUs using TensorRT

OPTIONS:

    -h, --help                        Display this help menu
    Verbosity of the compiler
        -v, --verbose                     Dumps debugging information about the
                                        compilation process onto the console
        -w, --warnings                    Disables printing of warnings
                                        generated during compilation to the
                                        console (warnings are on by default)
        --info                            Dumps info messages generated during
                                        compilation onto the console
    --build-debuggable-engine         Creates a debuggable engine
    --use-strict-types                Restrict operating types to only the
                                        set default operation precision
                                        (op_precision)
    --allow-gpu-fallback              (Only used when targeting DLA
                                        (device-type)) Lets engine run layers on
                                        GPU if they are not supported on DLA
    -p[precision],
    --default-op-precision=[precision]
                                        Default operating precision for the
                                        engine (Int8 requires a
                                        calibration-cache argument) [ float |
                                        float32 | f32 | half | float16 | f16 |
                                        int8 | i8 ] (default: float)
    -d[type], --device-type=[type]    The type of device the engine should be
                                        built for [ gpu | dla ] (default: gpu)
    --engine-capability=[capability]  The capability the engine should be
                                        built for [ default | safe_gpu |
                                        safe_dla ]
    --calibration-cache-file=[file_path]
                                        Path to calibration cache file to use
                                        for post training quantization
    --num-min-timing-iter=[num_iters] Number of minimization timing iterations
                                        used to select kernels
    --num-avg-timing-iters=[num_iters]
                                        Number of averaging timing iterations
                                        used to select kernels
    --workspace-size=[workspace_size] Maximum size of workspace given to
                                        TensorRT
    --max-batch-size=[max_batch_size] Maximum batch size (must be >= 1 to be
                                        set, 0 means not set)
    -t[threshold],
    --threshold=[threshold]           Maximum acceptable numerical deviation
                                        from standard TorchScript output
                                        (default 2e-5)
    --save-engine                     Instead of compiling a full
                                        TorchScript program, save the created
                                        engine to the path specified as the
                                        output path
    input_file_path                   Path to input TorchScript file
    output_file_path                  Path for compiled TorchScript (or
                                        TensorRT engine) file
    input_shapes...                   Sizes for inputs to the engine; can
                                        either be a single size or a range
                                        defined by Min, Optimal, and Max
                                        sizes, e.g.
                                        "(N,..,C,H,W)"
                                        "[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]"
    "--" can be used to terminate flag options and force all following
    arguments to be treated as positional options

e.g.

trtorchc tests/modules/ssd_traced.jit.pt ssd_trt.ts "[(1,3,300,300); (1,3,512,512); (1, 3, 1024, 1024)]" -p f16
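
As a further illustration, the flags above can be combined to save the raw TensorRT engine as a PLAN file instead of a TorchScript program. This invocation is a sketch: the calibration cache path is a placeholder and must point to a cache you have already generated, as int8 precision requires one:

trtorchc tests/modules/ssd_traced.jit.pt ssd_trt.plan "(1,3,300,300)" -p i8 --calibration-cache-file=ssd_calibration.cache --save-engine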