Struct CompileSpec

Struct Documentation

struct trtorch :: CompileSpec

Settings data structure for TRTorch compilation
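
As a rough end-to-end sketch (assuming a TorchScript module saved at a placeholder path "model.ts", the trtorch::CompileGraph entry point from trtorch/trtorch.h, and libtorch's torch::jit::load), a CompileSpec is populated and handed to the compiler:

    #include <torch/script.h>
    #include "trtorch/trtorch.h"

    int main() {
      // Load a TorchScript module ("model.ts" is a placeholder path)
      auto mod = torch::jit::load("model.ts");

      // One statically shaped input; the remaining settings are described below
      std::vector<std::vector<int64_t>> fixed_sizes = {{1, 3, 224, 224}};
      auto spec = trtorch::CompileSpec(fixed_sizes);

      // Returns a new TorchScript module whose forward method runs a TensorRT engine
      auto trt_mod = trtorch::CompileGraph(mod, spec);
      return 0;
    }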

Public Types

enum EngineCapability

Enum for selecting engine capability

Values:

enumerator kSTANDARD
enumerator kSAFETY
enumerator kDLA_STANDALONE

Public Functions

CompileSpec ( std::vector< InputRange > input_ranges )

Construct a new CompileSpec object from input ranges. Each entry in the vector represents an input and should be provided in call order.

Use this constructor if you want to use dynamic shapes (a usage sketch follows the parameter list).

Parameters
  • input_ranges :
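
For example, a minimal sketch of a spec for a single input whose batch dimension is allowed to vary (nested type names are qualified as documented on this page):

    std::vector<trtorch::CompileSpec::InputRange> ranges;
    ranges.push_back(trtorch::CompileSpec::InputRange(
        std::vector<int64_t>{1, 3, 224, 224},     // min
        std::vector<int64_t>{16, 3, 224, 224},    // opt
        std::vector<int64_t>{32, 3, 224, 224}));  // max
    trtorch::CompileSpec spec(ranges);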

CompileSpec ( std::vector<std::vector<int64_t>> fixed_sizes )

Construct a new CompileSpec object. Convenience constructor to set fixed input sizes from vectors describing the sizes of the input tensors. Each entry in the vector represents an input and should be provided in call order.

This constructor should be used as a convenience when all inputs are statically sized and the default input dtypes and formats are acceptable (FP32 for FP32 and INT8 weights, FP16 for FP16 weights, contiguous). A usage sketch follows the parameter list.

Parameters
  • fixed_sizes :
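
For example, a minimal sketch for a module that takes two statically shaped inputs, listed in call order:

    std::vector<std::vector<int64_t>> fixed_sizes = {{1, 3, 224, 224}, {1, 10}};
    trtorch::CompileSpec spec(fixed_sizes);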

CompileSpec ( std::vector<c10::ArrayRef<int64_t>> fixed_sizes )

Construct a new CompileSpec object. Convenience constructor to set fixed input sizes from c10::ArrayRefs (the output of tensor.sizes()) describing the sizes of the input tensors. Each entry in the vector represents an input and should be provided in call order.

This constructor should be used as a convenience when all inputs are statically sized and the default input dtypes and formats are acceptable (FP32 for FP32 and INT8 weights, FP16 for FP16 weights, contiguous). A usage sketch follows the parameter list.

Parameters
  • fixed_sizes :
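
For example, a sketch that derives the static input shape from an example tensor; note that the ArrayRef returned by sizes() is only valid while the tensor it came from is alive:

    auto example = torch::randn({1, 3, 224, 224});
    std::vector<c10::ArrayRef<int64_t>> fixed_sizes = {example.sizes()};
    trtorch::CompileSpec spec(fixed_sizes);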

CompileSpec ( std::vector< Input > inputs )

Construct a new CompileSpec object from Input specs. Each entry in the vector represents an input and should be provided in call order.

Use this constructor to define inputs with dynamic shapes, specific input data types, or tensor formats (a usage sketch follows the parameter list).

Parameters
  • inputs :
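
For example, a sketch with one dynamically shaped input and one static FP16, channels-last input, listed in call order (nested type names follow the nesting shown on this page; adjust the qualification if your header exposes DataType and TensorFormat at namespace scope):

    std::vector<trtorch::CompileSpec::Input> inputs = {
        // Dynamic batch dimension; default dtype and contiguous format
        trtorch::CompileSpec::Input(std::vector<int64_t>{1, 3, 224, 224},
                                    std::vector<int64_t>{8, 3, 224, 224},
                                    std::vector<int64_t>{32, 3, 224, 224}),
        // Static shape, FP16, channels-last (NHWC)
        trtorch::CompileSpec::Input(std::vector<int64_t>{1, 3, 300, 300},
                                    trtorch::CompileSpec::DataType::kHalf,
                                    trtorch::CompileSpec::TensorFormat::kChannelsLast)};
    trtorch::CompileSpec spec(inputs);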

Public Members

std::vector< Input > inputs

Specifications for inputs to the engine; each can either be a single size or a range defined by min, opt, and max sizes. Users can also specify the expected input data type as well as the tensor memory format.

The order in the vector should match the call order of the function.

std::vector< InputRange > input_ranges

Sizes of the inputs to the engine; each can either be a single size or a range defined by min, optimal, and max sizes.

The order should match the call order of the function.

DataType op_precision = DataType :: kFloat

Default operating precision for the engine

std::set< DataType > enabled_precisions = { DataType :: kFloat }

The set of precisions TensorRT is allowed to use for kernels during compilation.
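
For example, a fragment (continuing any of the construction sketches above) that additionally lets TensorRT choose FP16 kernels:

    spec.enabled_precisions.insert(trtorch::CompileSpec::DataType::kHalf);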

bool disable_tf32 = false

Prevent Float32 layers from using TF32 data format

TF32 computes inner products by rounding the inputs to 10-bit mantissas before multiplying, but accumulates the sum using 23-bit mantissas. This is the behavior of FP32 layers by default.

bool sparse_weights = false

Enable sparsity for weights of conv and FC layers

bool refit = false

Build a refittable engine

bool debug = false

Build a debuggable engine

bool truncate_long_and_double = false

Truncate long/double types to int/float types

bool strict_types = false

Restrict operating type to only the lowest enabled operation precision (enabled_precisions)

Device device

Target Device

TorchFallback torch_fallback

Settings related to partial compilation.

EngineCapability capability = EngineCapability :: kSTANDARD

Sets the restrictions for the engine (CUDA Safety)

uint64_t num_min_timing_iters = 2

Number of minimization timing iterations used to select kernels

uint64_t num_avg_timing_iters = 1

Number of averaging timing iterations used to select kernels

uint64_t workspace_size = 0

Maximum size of workspace given to TensorRT

uint64_t max_batch_size = 0

Maximum batch size (must be >= 1 to be set, 0 means not set)
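
For example, a fragment adjusting these engine build knobs on an already constructed spec:

    spec.workspace_size = 1 << 30;  // 1 GiB of scratch space for TensorRT tactics
    spec.max_batch_size = 32;       // leaving this at 0 means "not set"
    spec.num_avg_timing_iters = 2;  // average more timing runs when selecting kernels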

nvinfer1::IInt8Calibrator * ptq_calibrator = nullptr

Calibration dataloaders for each input for post training quantization
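
A fragment sketching INT8 post training quantization; it assumes a calibration DataLoader named calib_dataloader has already been built and that the trtorch::ptq::make_int8_calibrator helper from trtorch/ptq.h is available (see the PTQ documentation), so treat the exact call as illustrative:

    // calib_dataloader and the cache path are placeholders
    auto calibrator = trtorch::ptq::make_int8_calibrator(
        std::move(calib_dataloader), "/tmp/calibration.cache", /*use_cache=*/false);

    spec.enabled_precisions.insert(trtorch::CompileSpec::DataType::kChar);  // INT8 kernels
    spec.ptq_calibrator = calibrator;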

class DataType

Supported Data Types that can be used with TensorRT engines

This class is compatible with c10::DataTypes (but will check for TRT support) so there should not be a reason you need to use this type explicitly.

Public Types

enum Value

Underlying enum class to support the DataType class

In the case that you need to use the DataType class itself, interface with it using this enum rather than normal instantiation.

ex. trtorch::DataType type = DataType::kFloat;

Values:

enumerator kFloat

FP32.

enumerator kHalf

FP16.

enumerator kChar

INT8.

enumerator kInt

INT.

enumerator kBool

Bool.

enumerator kUnknown

Sentinel value.

Public Functions

DataType ( ) = default

Construct a new Data Type object.

constexpr DataType ( Value t )

DataType constructor from enum.

DataType ( c10::ScalarType t )

Construct a new Data Type object from torch type enums.

Parameters
  • t :
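
For example, a small fragment showing the implicit conversion from a torch scalar type (DataType qualified as documented on this page):

    // torch::kHalf is a c10::ScalarType; the DataType constructor accepts it directly
    trtorch::CompileSpec::DataType dtype = torch::kHalf;
    bool is_half = (dtype == trtorch::CompileSpec::DataType::kHalf);  // true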

operator Value ( ) const

Get the enum value of the DataType object.

Return

Value

operator bool ( ) = delete
constexpr bool operator== ( DataType other ) const

Comparison operator for DataType.

Return

true if this DataType and other represent the same data type, false otherwise

Parameters
  • other : DataType to compare against

constexpr bool operator== ( DataType :: Value other ) const

Comparison operator for DataType.

Return

true if this DataType and other represent the same data type, false otherwise

Parameters
  • other : DataType::Value to compare against

constexpr bool operator!= ( DataType other ) const

Comparison operator for DataType.

Return

true if this DataType and other represent different data types, false otherwise

Parameters
  • other : DataType to compare against

constexpr bool operator!= ( DataType :: Value other ) const

Comparison operator for DataType.

Return

true if this DataType and other represent different data types, false otherwise

Parameters
  • other : DataType::Value to compare against

struct Device

Settings data structure for the target device. Holds target device related parameters such as device_type, gpu_id, and dla_core.

Public Functions

Device ( )

Constructor for Device structure

Public Members

DeviceType device_type

Type of the target device (GPU or DLA).

int64_t gpu_id
int64_t dla_core
bool allow_gpu_fallback

(Only used when targeting DLA) Lets the engine run layers on the GPU if they are not supported on DLA
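
For example, a fragment targeting a DLA core while letting unsupported layers fall back to the GPU (DeviceType is qualified as documented on this page; adjust the qualification if your header nests it inside Device):

    spec.device.device_type = trtorch::CompileSpec::DeviceType::kDLA;
    spec.device.dla_core = 0;
    spec.device.gpu_id = 0;                 // GPU used for layers that fall back
    spec.device.allow_gpu_fallback = true;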

class DeviceType

Supported Device Types that can be used with TensorRT engines

This class is compatible with c10::DeviceTypes (but will check for TRT support). The only applicable value is at::kCUDA, which maps to DeviceType::kGPU.

To use the DeviceType class itself, interface with it using the enum rather than normal instantiation.

ex. trtorch::DeviceType type = DeviceType::kGPU;

Public Types

enum Value

Underlying enum class to support the DeviceType class

In the case that you need to use the DeviceType class itself, interface with it using this enum rather than normal instantiation.

ex. trtorch::DeviceType type = DeviceType::kGPU;

Values:

enumerator kGPU

Target GPU to run engine.

enumerator kDLA

Target DLA to run engine.

Public Functions

DeviceType ( ) = default

Construct a new Device Type object.

constexpr DeviceType ( Value t )

Construct a new Device Type object from internal enum.

DeviceType ( c10::DeviceType t )

Construct a new Device Type object from torch device enums. Note: the only valid value is torch::kCUDA (torch::kCPU is not supported)

Parameters
  • t :

operator Value ( ) const

Get the internal value from the DeviceType object.

Return

Value

operator bool ( ) = delete
constexpr bool operator== ( DeviceType other ) const

Comparison operator for DeviceType.

Return

true if this DeviceType and other represent the same device type, false otherwise

Parameters
  • other : DeviceType to compare against

constexpr bool operator!= ( DeviceType other ) const

Comparison operator for DeviceType.

Return

true if this DeviceType and other represent different device types, false otherwise

Parameters
  • other : DeviceType to compare against

struct Input

A struct to hold information about an input to the engine: its shape or shape range, plus the expected data type and tensor format (used by the TensorRT optimization profile).

This struct can either hold a single vector representing a static input shape, or a set of three shapes representing the min, optimal, and max input shapes allowed for the engine.

Public Functions

Input ( std::vector<int64_t> shape , TensorFormat format = TensorFormat :: kContiguous )

Construct a new Input spec object for a static input size from a vector; the optional argument allows the user to configure the expected input tensor format. dtype (the expected data type for the input) defaults to the PyTorch / traditional TRT convention (FP32 for FP32 only, FP16 for FP32 and FP16, FP32 for Int8)

Parameters
  • shape : Input tensor shape

  • format : Expected tensor format for the input (Defaults to contiguous)

Input ( std::vector<int64_t> shape , DataType dtype , TensorFormat format = TensorFormat :: kContiguous )

Construct a new Input spec object for a static input size from a vector; optional arguments allow the user to configure the expected data type and tensor format (a usage sketch follows the parameter list).

Parameters
  • shape : Input tensor shape

  • dtype : Expected data type for the input (Defaults to Float32)

  • format : Expected tensor format for the input (Defaults to contiguous)
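
For example, a sketch of a static FP16 input expected in channels-last layout:

    auto in = trtorch::CompileSpec::Input(
        std::vector<int64_t>{1, 3, 224, 224},
        trtorch::CompileSpec::DataType::kHalf,
        trtorch::CompileSpec::TensorFormat::kChannelsLast);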

Input ( c10::ArrayRef<int64_t> shape , TensorFormat format = TensorFormat :: kContiguous )

Construct a new Input spec object for a static input size from a c10::ArrayRef (the type produced by tensor.sizes()); the optional argument allows the user to configure the expected input tensor format. dtype (the expected data type for the input) defaults to the PyTorch / traditional TRT convention (FP32 for FP32 only, FP16 for FP32 and FP16, FP32 for Int8)

Parameters
  • shape : Input tensor shape

  • format : Expected tensor format for the input (Defaults to contiguous)

Input ( c10::ArrayRef<int64_t> shape , DataType dtype , TensorFormat format = TensorFormat :: kContiguous )

Construct a new Input spec object for a static input size from a c10::ArrayRef (the type produced by tensor.sizes()); optional arguments allow the user to configure the expected data type and tensor format.

Parameters
  • shape : Input tensor shape

  • dtype : Expected data type for the input (Defaults to Float32)

  • format : Expected tensor format for the input (Defaults to contiguous)

Input ( std::vector<int64_t> min_shape , std::vector<int64_t> opt_shape , std::vector<int64_t> max_shape , TensorFormat format = TensorFormat :: kContiguous )

Construct a new Input spec object for a dynamic input size from vectors for the minimum, optimal, and maximum supported shapes. dtype (the expected data type for the input) defaults to the PyTorch / traditional TRT convention (FP32 for FP32 only, FP16 for FP32 and FP16, FP32 for Int8)

Parameters
  • min_shape : Minimum shape for input tensor

  • opt_shape : Target optimization shape for input tensor

  • max_shape : Maximum acceptable shape for input tensor

  • format : Expected tensor format for the input (Defaults to contiguous)

Input ( std::vector<int64_t> min_shape , std::vector<int64_t> opt_shape , std::vector<int64_t> max_shape , DataType dtype , TensorFormat format = TensorFormat :: kContiguous )

Construct a new Input spec object for a dynamic input size from vectors for the minimum, optimal, and maximum supported shapes; optional arguments allow the user to configure the expected data type and tensor format (a usage sketch follows the parameter list).

Parameters
  • min_shape : Minimum shape for input tensor

  • opt_shape : Target optimization shape for input tensor

  • max_shape : Maximum acceptable shape for input tensor

  • dtype : Expected data type for the input (Defaults to Float32)

  • format : Expected tensor format for the input (Defaults to contiguous)
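
For example, a sketch of an input whose spatial dimensions may vary while the data type is pinned to FP32:

    auto in = trtorch::CompileSpec::Input(
        std::vector<int64_t>{1, 3, 128, 128},   // min
        std::vector<int64_t>{1, 3, 256, 256},   // opt
        std::vector<int64_t>{1, 3, 512, 512},   // max
        trtorch::CompileSpec::DataType::kFloat);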

Input ( c10::ArrayRef<int64_t> min_shape , c10::ArrayRef<int64_t> opt_shape , c10::ArrayRef<int64_t> max_shape , TensorFormat format = TensorFormat :: kContiguous )

Construct a new Input spec object for a dynamic input size from c10::ArrayRefs (the type produced by tensor.sizes()) for the minimum, optimal, and maximum supported shapes. dtype (the expected data type for the input) defaults to the PyTorch / traditional TRT convention (FP32 for FP32 only, FP16 for FP32 and FP16, FP32 for Int8)

Parameters
  • min_shape : Minimum shape for input tensor

  • opt_shape : Target optimization shape for input tensor

  • max_shape : Maximum acceptable shape for input tensor

  • format : Expected tensor format for the input (Defaults to contiguous)

Input ( c10::ArrayRef<int64_t> min_shape , c10::ArrayRef<int64_t> opt_shape , c10::ArrayRef<int64_t> max_shape , DataType dtype , TensorFormat format = TensorFormat :: kContiguous )

Construct a new Input spec object for a dynamic input size from c10::ArrayRefs (the type produced by tensor.sizes()) for the minimum, optimal, and maximum supported shapes.

Parameters
  • min_shape : Minimum shape for input tensor

  • opt_shape : Target optimization shape for input tensor

  • max_shape : Maximum acceptable shape for input tensor

  • dtype : Expected data type for the input (Defaults to Float32)

  • format : Expected tensor format for the input (Defaults to contiguous)

bool get_explicit_set_dtype ( )

Public Members

std::vector<int64_t> min_shape

Minimum acceptable input size into the engine.

std::vector<int64_t> opt_shape

Optimal input size into the engine (the size the kernels are tuned for; any size in the min/max range is accepted)

std::vector<int64_t> max_shape

Maximum acceptable input size into the engine.

std::vector<int64_t> shape

Input shape to be fed to TensorRT, in the event of a dynamic shape, -1’s will hold the place of variable dimensions

DataType dtype

Expected data type for the input.

TensorFormat format

Expected tensor format for the input.

struct InputRange

A struct to hold an input range (used by TensorRT Optimization profile)

This struct can either hold a single vector representing a static input shape, or a set of three shapes representing the min, optimal, and max input shapes allowed for the engine.

Public Functions

InputRange ( std::vector<int64_t> opt )

Construct a new Input Range object for a static input size from a vector.

Parameters
  • opt :

InputRange ( c10::ArrayRef<int64_t> opt )

Construct a new Input Range object for a static input size from a c10::ArrayRef (the type produced by tensor.sizes())

Parameters
  • opt :

InputRange ( std::vector<int64_t> min , std::vector<int64_t> opt , std::vector<int64_t> max )

Construct a new Input Range object for a dynamic input size from vectors for the min, opt, and max supported sizes.

Parameters
  • min :

  • opt :

  • max :

InputRange ( c10::ArrayRef<int64_t> min , c10::ArrayRef<int64_t> opt , c10::ArrayRef<int64_t> max )

Construct a new Input Range object for a dynamic input size from c10::ArrayRefs (the type produced by tensor.sizes()) for the min, opt, and max supported sizes.

Parameters
  • min :

  • opt :

  • max :

Public Members

std::vector<int64_t> min

Minimum acceptable input size into the engine.

std::vector<int64_t> opt

Optimal input size into the engine (gets best performance)

std::vector<int64_t> max

Maximum acceptable input size into the engine.

class TensorFormat

Public Types

enum Value

Underlying enum class to support the TensorFormat class

In the case that you need to use the TensorFormat class itself, interface with it using this enum rather than normal instantiation.

ex. trtorch::TensorFormat type = TensorFormat::kContiguous;

Values:

enumerator kContiguous

Contiguous / NCHW / Linear.

enumerator kChannelsLast

Channel Last / NHWC.

enumerator kUnknown

Sentinel value.

Public Functions

TensorFormat ( ) = default

Construct a new TensorFormat object.

constexpr TensorFormat ( Value t )

TensorFormat constructor from enum.

TensorFormat ( at::MemoryFormat t )

Construct a new TensorFormat object from torch type enums.

Parameters
  • t :

operator Value ( ) const

Get the enum value of the TensorFormat object.

Return

Value

operator bool ( ) = delete
constexpr bool operator== ( TensorFormat other ) const

Comparison operator for TensorFormat.

Return

true if this TensorFormat and other represent the same tensor format, false otherwise

Parameters
  • other : TensorFormat to compare against

constexpr bool operator== ( TensorFormat :: Value other ) const

Comparison operator for TensorFormat.

Return

true if this TensorFormat and other represent the same tensor format, false otherwise

Parameters
  • other : TensorFormat::Value to compare against

constexpr bool operator!= ( TensorFormat other ) const

Comparison operator for TensorFormat.

Return

true if this TensorFormat and other represent different tensor formats, false otherwise

Parameters
  • other : TensorFormat to compare against

constexpr bool operator!= ( TensorFormat :: Value other ) const

Comparison operator for TensorFormat.

Return

true if this TensorFormat and other represent different tensor formats, false otherwise

Parameters
  • other : TensorFormat::Value to compare against

struct TorchFallback

A struct to hold fallback info.

Public Functions

TorchFallback ( ) = default

Construct a default Torch Fallback object; fallback will be off.

TorchFallback ( bool enabled )

Construct from a bool.

TorchFallback ( bool enabled , uint64_t min_size )

Constructor for setting min_block_size.

Public Members

bool enabled = false

Enable the automatic fallback feature

uint64_t min_block_size = 1

Minimum number of consecutive supported operations required for a block to be converted to TensorRT

std::vector<std::string> forced_fallback_ops

A list of names of operations that will explicitly run in PyTorch.

std::vector<std::string> forced_fallback_modules

A list of names of modules that will explicitly run in PyTorch.
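
For example, a fragment enabling partial compilation on an already constructed spec; the op and module names here are purely illustrative:

    spec.torch_fallback = trtorch::CompileSpec::TorchFallback(/*enabled=*/true, /*min_size=*/3);
    // Keep these in PyTorch even if they are convertible (illustrative names)
    spec.torch_fallback.forced_fallback_ops.push_back("aten::upsample_bilinear2d");
    spec.torch_fallback.forced_fallback_modules.push_back("torchvision.models.resnet.BasicBlock");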