Compiler Phases

Lowering

Lowering Phase

The lowering phase is a set of passes (some from PyTorch, some specific to TRTorch) that run over the graph IR to map the large PyTorch opset to a reduced opset that is easier to convert to TensorRT.
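To make the idea of a lowering pass concrete, here is a minimal sketch in plain Python. It does not use the real PyTorch or TRTorch pass APIs: the `Node` type, the op names (`linear`, `dropout`, `matmul`, `add`, `relu`), and the two passes are hypothetical stand-ins chosen only to show how a sequence of graph rewrites can shrink the set of ops a later stage has to handle.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Node:
    """Toy IR node (hypothetical): one op, named inputs, one named output."""
    op: str
    inputs: List[str]
    output: str


Graph = List[Node]
Pass = Callable[[Graph], Graph]


def decompose_linear(graph: Graph) -> Graph:
    """Rewrite each 'linear' node into a 'matmul' followed by an 'add'."""
    out: Graph = []
    for node in graph:
        if node.op == "linear":
            x, w, b = node.inputs
            tmp = node.output + "_mm"  # name for the intermediate value
            out.append(Node("matmul", [x, w], tmp))
            out.append(Node("add", [tmp, b], node.output))
        else:
            out.append(node)
    return out


def remove_dropout(graph: Graph) -> Graph:
    """Drop 'dropout' nodes (treated as identity) and rewire their users."""
    alias: Dict[str, str] = {}
    out: Graph = []
    for node in graph:
        inputs = [alias.get(name, name) for name in node.inputs]
        if node.op == "dropout":
            alias[node.output] = inputs[0]
        else:
            out.append(Node(node.op, inputs, node.output))
    return out


def lower(graph: Graph, passes: List[Pass]) -> Graph:
    """Run every pass in order over the graph."""
    for graph_pass in passes:
        graph = graph_pass(graph)
    return graph


graph = [
    Node("linear", ["x", "w", "b"], "y"),
    Node("dropout", ["y"], "z"),
    Node("relu", ["z"], "out"),
]
lowered = lower(graph, [decompose_linear, remove_dropout])
for node in lowered:
    print(node.op, node.inputs, "->", node.output)
```

After both passes the toy graph contains only `matmul`, `add`, and `relu`, so a converter for this graph would need to support three ops instead of the original set.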

Conversion

Conversion Phase

In the conversion phase we traverse the lowered graph and construct an equivalent TensorRT graph. The conversion phase is made up of three main components: a context to manage compile-time data, an evaluator library that executes operations which can be resolved at compile time, and a converter library that maps an op from JIT to TensorRT.
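The interplay of the three components can be sketched with a toy example. Everything below is an assumption made for illustration: `ConversionCtx`, the `EVALUATORS` and `CONVERTERS` registries, and the op names are hypothetical, and the "network" is just a list of strings rather than a real TensorRT network.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Node:
    """Toy IR node (hypothetical): one op, named inputs, one named output."""
    op: str
    inputs: List[str]
    output: str


@dataclass
class ConversionCtx:
    """Holds compile-time data: known constant values and the layers built so far."""
    constants: Dict[str, Any] = field(default_factory=dict)
    layers: List[str] = field(default_factory=list)  # stand-in for the target network


def convert_relu(ctx: ConversionCtx, node: Node) -> None:
    """Converter: record an activation layer in the stand-in network."""
    ctx.layers.append(f"Activation(RELU) {node.inputs[0]} -> {node.output}")


# Evaluators compute a value at compile time; converters emit a layer.
EVALUATORS: Dict[str, Callable[..., Any]] = {
    "mul": lambda a, b: a * b,
}
CONVERTERS: Dict[str, Callable[[ConversionCtx, Node], None]] = {
    "relu": convert_relu,
}


def convert(graph: List[Node], ctx: ConversionCtx) -> ConversionCtx:
    """Walk the graph: evaluate what is statically known, convert the rest."""
    for node in graph:
        static = all(name in ctx.constants for name in node.inputs)
        if node.op in EVALUATORS and static:
            args = [ctx.constants[name] for name in node.inputs]
            ctx.constants[node.output] = EVALUATORS[node.op](*args)
        elif node.op in CONVERTERS:
            CONVERTERS[node.op](ctx, node)
        else:
            raise NotImplementedError(f"no evaluator or converter for {node.op}")
    return ctx


ctx = ConversionCtx(constants={"a": 2, "b": 3})
graph = [
    Node("mul", ["a", "b"], "c"),  # both inputs are constants: evaluated
    Node("relu", ["x"], "y"),      # runtime tensor: converted to a layer
]
convert(graph, ctx)
print(ctx.constants["c"], ctx.layers)
```

In this sketch the `mul` node never becomes a layer because its result is known at compile time and is stored in the context instead, while `relu` operates on a runtime input and is handed to a converter.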

Compilation and Runtime

Runtime Phase

The final compilation phase constructs a TorchScript program to run the converted TensorRT engine. It takes a serialized engine and instantiates it within an engine manager; the compiler then builds a JIT graph that references this engine and wraps it in a module to return to the user. When the user executes the module, the JIT program runs in the JIT runtime, extended by TRTorch, with the data provided by the user.
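The flow described above (deserialize an engine, register it with a manager, return a module that references it) can be sketched as follows. This is a toy model only: `EngineManager`, `CompiledModule`, and `build_module` are hypothetical names, and the "serialized engine" is a small JSON blob describing a scale factor rather than a real TensorRT engine.

```python
import json
from typing import Callable, Dict, List

Engine = Callable[[List[float]], List[float]]


class EngineManager:
    """Owns instantiated engines and hands out integer ids that refer to them."""

    def __init__(self) -> None:
        self._engines: Dict[int, Engine] = {}

    def register(self, serialized: bytes) -> int:
        """Deserialize the toy engine and store it under a new id."""
        spec = json.loads(serialized.decode("utf-8"))
        scale = spec["scale"]
        engine_id = len(self._engines)
        self._engines[engine_id] = lambda xs: [scale * x for x in xs]
        return engine_id

    def execute(self, engine_id: int, inputs: List[float]) -> List[float]:
        """Run the engine registered under engine_id on the given inputs."""
        return self._engines[engine_id](inputs)


class CompiledModule:
    """Stand-in for the returned module: it only holds a reference to an engine."""

    def __init__(self, manager: EngineManager, engine_id: int) -> None:
        self._manager = manager
        self._engine_id = engine_id

    def __call__(self, inputs: List[float]) -> List[float]:
        return self._manager.execute(self._engine_id, inputs)


def build_module(serialized: bytes, manager: EngineManager) -> CompiledModule:
    """Instantiate the engine in the manager and wrap a reference to it."""
    return CompiledModule(manager, manager.register(serialized))


manager = EngineManager()
blob = json.dumps({"scale": 2.0}).encode("utf-8")
module = build_module(blob, manager)
result = module([1.0, 2.5])
print(result)
```

The point of the indirection is that the module itself stays small: it carries only an id, and the manager owns the instantiated engine that does the work when the module is called.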