.. _dynamo_export:

Torch-TensorRT Dynamo Backend
=============================================
.. currentmodule:: torch_tensorrt.dynamo

.. automodule:: torch_tensorrt.dynamo
    :members:
    :undoc-members:
    :show-inheritance:

This guide presents the Torch-TensorRT dynamo backend, which optimizes PyTorch models
using TensorRT in an ahead-of-time fashion.

Using the Dynamo backend
----------------------------------------
PyTorch 2.1 introduced the ``torch.export`` APIs, which
can export graphs from PyTorch programs into ``ExportedProgram`` objects. The Torch-TensorRT dynamo
backend compiles these ``ExportedProgram`` objects and optimizes them using TensorRT. Here is a simple
usage of the dynamo backend:

.. code-block:: python

    import torch
    import torch_tensorrt

    model = MyModel().eval().cuda()
    inputs = [torch.randn((1, 3, 224, 224), dtype=torch.float32).cuda()]
    exp_program = torch.export.export(model, tuple(inputs))
    trt_gm = torch_tensorrt.dynamo.compile(exp_program, inputs)  # Output is a torch.fx.GraphModule
    trt_gm(*inputs)

.. note:: ``torch_tensorrt.dynamo.compile`` is the main API for users to interact with the Torch-TensorRT dynamo backend. The input to it should be an ``ExportedProgram`` (ideally the output of ``torch.export.export`` or ``torch_tensorrt.dynamo.trace``, discussed in the section below) and the output is a ``torch.fx.GraphModule`` object.

Customizable Settings
---------------------

There are a lot of options for users to customize their settings for optimizing with TensorRT.
Some of the frequently used options are as follows (a short usage sketch follows the list):

* ``inputs`` - For static shapes, this can be a list of torch tensors or ``torch_tensorrt.Input`` objects. For dynamic shapes, this should be a list of ``torch_tensorrt.Input`` objects.
* ``enabled_precisions`` - Set of precisions that the TensorRT builder can use during optimization.
* ``truncate_long_and_double`` - Truncates ``long`` and ``double`` values to ``int`` and ``float`` respectively.
* ``torch_executed_ops`` - Operators which are forced to be executed by Torch.
* ``min_block_size`` - Minimum number of consecutive operators required to be executed as a TensorRT segment.

The complete list of options can be found `here <https://github.com/pytorch/TensorRT/blob/123a486d6644a5bbeeec33e2f32257349acc0b8f/py/torch_tensorrt/dynamo/compile.py#L51-L77>`_.
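
As a rough sketch rather than an exhaustive reference, a few of these options can be passed to
``torch_tensorrt.dynamo.compile`` as keyword arguments. The model and inputs are reused from the
example above; the particular values, and the string format used for ``torch_executed_ops``, are
illustrative assumptions only.

.. code-block:: python

    import torch
    import torch_tensorrt

    model = MyModel().eval().cuda()
    inputs = [torch.randn((1, 3, 224, 224), dtype=torch.float32).cuda()]
    exp_program = torch.export.export(model, tuple(inputs))

    trt_gm = torch_tensorrt.dynamo.compile(
        exp_program,
        inputs,
        enabled_precisions={torch.float32, torch.float16},  # allow FP16 kernels alongside FP32
        truncate_long_and_double=True,  # demote int64/float64 values to int32/float32
        torch_executed_ops={"torch.ops.aten.sub.Tensor"},  # illustrative: keep this op in Torch
        min_block_size=3,  # require at least 3 consecutive ops to form a TensorRT segment
    )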

.. note:: We do not support INT precision currently in Dynamo. Support for this currently exists in
   our TorchScript IR. We plan to implement similar support for dynamo in our next release.

Under the hood
--------------

Under the hood, ``torch_tensorrt.dynamo.compile`` performs the following on the graph; a short inspection sketch follows the list.

* Lowering - Applies lowering passes to add/remove operators for optimal conversion.
* Partitioning - Partitions the graph into PyTorch and TensorRT segments based on the ``min_block_size`` and ``torch_executed_ops`` fields.
* Conversion - PyTorch ops get converted into TensorRT ops in this phase.
* Optimization - Post conversion, we build the TensorRT engine and embed this inside the PyTorch graph.

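The result of these steps is a ``torch.fx.GraphModule`` in which the converted segments appear as
submodules. As a minimal, non-authoritative sketch, the standard ``torch.fx`` utilities can be used
to inspect how a model was partitioned. This reuses ``trt_gm`` from the first example; the submodule
naming scheme is an implementation detail and may change between releases.

.. code-block:: python

    # Top-level graph: it calls into the partitioned submodules.
    print(trt_gm.graph)

    # TensorRT-accelerated segments and any Torch-fallback segments show up
    # as children of the returned GraphModule.
    for name, submodule in trt_gm.named_children():
        print(name, type(submodule))
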
Tracing
-------

``torch_tensorrt.dynamo.trace`` can be used to trace a PyTorch graph and produce an ``ExportedProgram``.
This internally performs some decompositions of operators for downstream optimization.
The ``ExportedProgram`` can then be used with the ``torch_tensorrt.dynamo.compile`` API.
If your model has dynamic input shapes, you can use ``torch_tensorrt.dynamo.trace`` to export
the model with dynamic shapes. Alternatively, you can use ``torch.export`` `with constraints <https://pytorch.org/docs/stable/export.html#expressing-dynamism>`_ directly as well.

.. code-block:: python

    import torch
    import torch_tensorrt

    inputs = [torch_tensorrt.Input(min_shape=(1, 3, 224, 224),
                                   opt_shape=(4, 3, 224, 224),
                                   max_shape=(8, 3, 224, 224),
                                   dtype=torch.float32)]
    model = MyModel().eval()
    exp_program = torch_tensorrt.dynamo.trace(model, inputs)
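
Continuing the snippet above, the ``ExportedProgram`` can be handed to ``torch_tensorrt.dynamo.compile``
together with the same ``Input`` specifications, and the compiled module can then be called with any
batch size within the declared range. This is a sketch; it assumes the model has been moved to the GPU
(e.g. ``MyModel().eval().cuda()``) as in the first example.

.. code-block:: python

    trt_gm = torch_tensorrt.dynamo.compile(exp_program, inputs)

    # Any batch size between min_shape and max_shape is valid at runtime.
    trt_gm(torch.randn((1, 3, 224, 224), dtype=torch.float32).cuda())
    trt_gm(torch.randn((8, 3, 224, 224), dtype=torch.float32).cuda())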