Struct CompileSpec
-
Defined in File torch_tensorrt.h
Struct Documentation
-
struct torch_tensorrt::torchscript::CompileSpec
Settings data structure for Torch-TensorRT TorchScript compilation
Public Functions
-
CompileSpec(std::vector<std::vector<int64_t>> fixed_sizes)
Construct a new CompileSpec object. Convenience constructor that sets fixed input sizes from vectors describing the sizes of the input tensors. Each entry in the vector represents an input and should be provided in call order.
Use this constructor when all inputs are statically sized and the default input dtype and format are acceptable (FP32 for FP32 and INT8 weights, FP16 for FP16 weights, contiguous).
Parameters
fixed_sizes: sizes of the input tensors, one vector per input, in call order
-
CompileSpec(std::vector<c10::ArrayRef<int64_t>> fixed_sizes)
Construct a new CompileSpec object. Convenience constructor that sets fixed input sizes from c10::ArrayRef objects (the output of tensor.sizes()) describing the sizes of the input tensors. Each entry in the vector represents an input and should be provided in call order.
Use this constructor when all inputs are statically sized and the default input dtype and format are acceptable (FP32 for FP32 and INT8 weights, FP16 for FP16 weights, contiguous).
Parameters
fixed_sizes: sizes of the input tensors, one c10::ArrayRef per input, in call order
Public Members
-
std::vector<Input> inputs
Specifications for the inputs to the engine. Each input can be either a single size or a range defined by min, opt and max sizes. Users can also specify the expected input type and the tensor memory format.
The order in the vector should match the call order of the function.
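To make the "single size or a range defined by min, opt and max sizes" wording concrete, here is a minimal sketch of how such a range can be checked. `ShapeRange`, `is_consistent`, `accepts` and `fixed` are hypothetical names introduced for illustration; they are not part of the Torch-TensorRT API, and the real `Input` type may validate shapes differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a dynamic input specification (not the library's Input type).
struct ShapeRange {
  std::vector<int64_t> min;
  std::vector<int64_t> opt;
  std::vector<int64_t> max;
};

// True when the three shapes have equal rank and min <= opt <= max in every dimension.
bool is_consistent(const ShapeRange& r) {
  if (r.min.size() != r.opt.size() || r.opt.size() != r.max.size()) return false;
  for (std::size_t i = 0; i < r.min.size(); ++i) {
    if (r.min[i] > r.opt[i] || r.opt[i] > r.max[i]) return false;
  }
  return true;
}

// True when `shape` has the rank of the range and lies within [min, max] per dimension.
bool accepts(const ShapeRange& r, const std::vector<int64_t>& shape) {
  assert(is_consistent(r));  // precondition: the range itself must be well formed
  if (shape.size() != r.min.size()) return false;
  for (std::size_t i = 0; i < shape.size(); ++i) {
    if (shape[i] < r.min[i] || shape[i] > r.max[i]) return false;
  }
  return true;
}

// A static input is the special case where min, opt and max coincide.
ShapeRange fixed(const std::vector<int64_t>& shape) { return {shape, shape, shape}; }
```

Under these assumptions, a range with min {1, 3, 224, 224} and max {32, 3, 224, 224} accepts a batch of 16 but rejects a batch of 64 or a tensor of a different rank.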
-
std::set<DataType> enabled_precisions = {DataType::kFloat}
The set of precisions TensorRT is allowed to use for kernels during compilation.
-
bool disable_tf32 = false
Prevent Float32 layers from using the TF32 data format.
TF32 computes inner products by rounding the inputs to 10-bit mantissas before multiplying, but accumulates the sum using 23-bit mantissas. FP32 layers are allowed to use TF32 by default (disable_tf32 = false); set this flag to true to prevent it.
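The numerical effect of the TF32 description above can be emulated on the CPU. The sketch below rounds each FP32 input to a 10-bit mantissa and accumulates in full FP32. The helper names `round_to_tf32` and `dot_tf32` are hypothetical, and round-to-nearest-even is an assumption made for illustration; the rounding performed by actual hardware may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Round an FP32 value to a 10-bit mantissa by dropping the 13 low mantissa bits.
// Assumption: round-to-nearest, ties-to-even; chosen for illustration only.
float round_to_tf32(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  if ((bits & 0x7F800000u) == 0x7F800000u) return x;  // leave Inf and NaN untouched
  const std::uint32_t lsb = (bits >> 13) & 1u;         // lowest bit that is kept
  bits += 0x0FFFu + lsb;                               // carry into kept bits when rounding up
  bits &= 0xFFFFE000u;                                 // clear the 13 dropped bits
  float out;
  std::memcpy(&out, &bits, sizeof(out));
  return out;
}

// Inner product with TF32-style inputs: operands are rounded, the sum stays in FP32.
float dot_tf32(const std::vector<float>& a, const std::vector<float>& b) {
  assert(a.size() == b.size());
  float acc = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    acc += round_to_tf32(a[i]) * round_to_tf32(b[i]);
  }
  return acc;
}
```

With this emulation, an input such as 1 + 2^-11 loses its low-order bit before the multiply, which is the kind of small deviation that setting disable_tf32 to true avoids.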
-
bool sparse_weights = false
Enable sparsity for weights of conv and FC layers
-
bool refit = false
Build a refittable engine
-
bool debug = false
Build a debuggable engine
-
bool truncate_long_and_double = false
Truncate long/double type to int/float type
-
EngineCapability capability = EngineCapability::kSTANDARD
Sets the restrictions for the engine (CUDA Safety)
-
uint64_t num_min_timing_iters = 2
Number of minimization timing iterations used to select kernels
-
uint64_t num_avg_timing_iters = 1
Number of averaging timing iterations used to select kernels
-
uint64_t workspace_size = 0
Maximum size of workspace given to TensorRT
-
nvinfer1::IInt8Calibrator* ptq_calibrator = nullptr
Calibration dataloaders for each input for post-training quantization
-
bool require_full_compilation = false
Require that the full module be compiled to TensorRT instead of potentially running unsupported operations in PyTorch
-
uint64_t min_block_size = 3
Minimum number of contiguous supported operators to compile a subgraph to TensorRT
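As a rough illustration of what a minimum block size means, the sketch below scans a linear list of operators flagged as supported or unsupported and keeps only the runs that are long enough. This is a simplification under stated assumptions: real graphs are not linear, `trt_blocks` is a hypothetical helper, and the library's actual partitioning logic is not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Returns [begin, end) index pairs for every run of consecutive supported operators
// whose length is at least min_block_size. Shorter runs would stay in PyTorch.
std::vector<std::pair<std::size_t, std::size_t>> trt_blocks(
    const std::vector<bool>& supported, std::uint64_t min_block_size) {
  assert(min_block_size >= 1);  // a block of zero operators is not meaningful here
  std::vector<std::pair<std::size_t, std::size_t>> blocks;
  std::size_t start = 0;  // first index of the current candidate run
  for (std::size_t i = 0; i <= supported.size(); ++i) {
    const bool in_run = i < supported.size() && supported[i];
    if (!in_run) {
      // The run [start, i) just ended; keep it only if it is long enough.
      if (i - start >= min_block_size) blocks.emplace_back(start, i);
      start = i + 1;
    }
  }
  return blocks;
}
```

With the default of 3, a run of two supported operators between unsupported ones is left to PyTorch, while runs of three or more become candidates for TensorRT.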
-
std::vector<std::string> torch_executed_ops
List of aten operators that must be run in PyTorch. An error is thrown if this list is not empty while require_full_compilation is true.
-
std::vector<std::string> torch_executed_modules
List of modules that must be run in PyTorch. An error is thrown if this list is not empty while require_full_compilation is true.
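The conflict rule shared by torch_executed_ops and torch_executed_modules can be sketched as a small validation step. `PartitionSettings`, `is_valid` and `check_partition_settings` are hypothetical names, and `std::invalid_argument` is an assumed exception type; the page only states that an error is thrown, not which one. The operator name in the usage note is likewise just an example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in holding only the three partitioning fields (not the real CompileSpec).
struct PartitionSettings {
  bool require_full_compilation = false;
  std::vector<std::string> torch_executed_ops;
  std::vector<std::string> torch_executed_modules;
};

// Full compilation cannot be combined with anything that is forced to run in PyTorch.
bool is_valid(const PartitionSettings& s) {
  const bool has_fallback = !s.torch_executed_ops.empty() || !s.torch_executed_modules.empty();
  return !(s.require_full_compilation && has_fallback);
}

// Throws when the settings contradict each other (exception type is an assumption).
void check_partition_settings(const PartitionSettings& s) {
  if (!is_valid(s)) {
    throw std::invalid_argument(
        "require_full_compilation is set, but operators or modules are pinned to PyTorch");
  }
}
```

For example, listing "aten::where" in torch_executed_ops is accepted while require_full_compilation is false, and rejected once it is set to true.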
-