- Add support for building hardware-compatible TensorRT engines on NVIDIA Ampere and later architectures
- Add the necessary functions to support this feature throughout the stack, including the C++ and Python components
- Update the ABI version to reflect the new metadata format for TRT engines
- Update the engine serialization schema accordingly
- Add test cases to validate the feature
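As a rough sketch of how the new flag is intended to be used: the `hardware_compatible` keyword comes from the diff below, but the function body here is a stub standing in for the real `torch_tensorrt.compile` (which requires a GPU), so only the call shape and the default are shown.

```python
# Illustrative stub only: the real torch_tensorrt.compile lowers a module to
# TensorRT engines; this stand-in just shows the shape of the call and that
# the new hardware_compatible keyword defaults to False.
def compile(module, *, ir="dynamo", inputs=None, hardware_compatible=False, **kwargs):
    # In the real API, hardware_compatible=True builds engines that can run
    # on Ampere-and-newer GPUs other than the one they were built on.
    return {"ir": ir, "hardware_compatible": hardware_compatible}

settings = compile(object(), inputs=[], hardware_compatible=True)
print(settings["hardware_compatible"])  # True
```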
"""Compile a TorchScript module for NVIDIA GPUs using TensorRT
@@ -131,6 +134,7 @@ def compile(
         use_python_runtime: (bool): Return a graph using a pure Python runtime, reduces options for serialization
         use_fast_partitioner: (bool): Use the adjacency based partitioning scheme instead of the global partitioner. Adjacency partitioning is faster but may not be optimal. Use the global partitioner (``False``) if looking for best performance
         enable_experimental_decompositions (bool): Use the full set of operator decompositions. These decompositions may not be tested but serve to make the graph easier to convert to TensorRT, potentially increasing the number of graphs run in TensorRT.
+        hardware_compatible (bool): Build the TensorRT engines compatible with GPU architectures other than that of the GPU on which the engine was built (currently works for NVIDIA Ampere and newer)
         **kwargs: Any,
     Returns:
         torch.fx.GraphModule: Compiled FX Module, when run it will execute via TensorRT
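The "Ampere and newer" restriction corresponds to NVIDIA compute capability 8.0 and above. A minimal sketch of that gate (the capability tuples are hard-coded for illustration, not queried from a device):

```python
# Sketch of the Ampere-and-newer gate behind hardware_compatible: NVIDIA
# Ampere is SM 8.0, so older architectures cannot use hardware-compatible
# engines. The compute-capability tuples below are illustrative.
def supports_hardware_compat(compute_capability):
    return tuple(compute_capability) >= (8, 0)

print(supports_hardware_compat((8, 6)))  # True  -- Ampere (e.g. RTX 30-series)
print(supports_hardware_compat((7, 5)))  # False -- Turing
```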
py/torch_tensorrt/dynamo/_settings.py (+3)
@@ -6,6 +6,7 @@
 from torch_tensorrt.dynamo._defaults import (
     DEBUG,
     ENABLE_EXPERIMENTAL_DECOMPOSITIONS,
+    HARDWARE_COMPATIBLE,
     MAX_AUX_STREAMS,
     MIN_BLOCK_SIZE,
     OPTIMIZATION_LEVEL,
@@ -46,6 +47,7 @@ class CompilationSettings:
         device (Device): GPU to compile the model on
         require_full_compilation (bool): Whether to require that the graph is fully compiled in TensorRT.
             Only applicable for `ir="dynamo"`; has no effect for the `torch.compile` path
+        hardware_compatible (bool): Build the TensorRT engines compatible with GPU architectures other than that of the GPU on which the engine was built (currently works for NVIDIA Ampere and newer)
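A simplified sketch of how the new field slots into the settings object, assuming it is a dataclass; the field names are taken from the docstring diff, all other fields are omitted, and the `False` default is an assumption mirroring `HARDWARE_COMPATIBLE` in `_defaults`:

```python
from dataclasses import dataclass

HARDWARE_COMPATIBLE = False  # assumed default, standing in for _defaults

@dataclass
class CompilationSettingsSketch:
    # Trimmed stand-in for CompilationSettings: only fields named in the
    # docstring diff are shown here.
    require_full_compilation: bool = False
    hardware_compatible: bool = HARDWARE_COMPATIBLE

s = CompilationSettingsSketch(hardware_compatible=True)
print(s.hardware_compatible)  # True
```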