docs/_sources/contributors/execution.rst.txt

Execution Phase
================

The execution phase is responsible for constructing self-standing TorchScript graphs with embedded TensorRT engines and serving as the runtime
when these engines are called. The main interface accepts a serialized TensorRT engine. The execution phase
will deserialize and wrap this engine in a class which maintains an execution context for each engine
and some metadata about its inputs and outputs, and which is compatible with the TorchScript interpreter so that
it can be moved around and used like other TorchScript IValues. The engine is run by providing it and its inputs
to the ``trt::execute_engine`` operator, which takes the engine and its inputs and returns the results of engine execution.
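
To make the wrapper idea concrete, here is a minimal sketch of such a TorchBind custom class. The class name, members, and
registration namespace here are illustrative assumptions, not the exact TRTorch source:

.. code-block:: cpp

    #include <torch/custom_class.h>

    #include <string>
    #include <utility>

    // Hypothetical wrapper: holds a deserialized TensorRT engine, its
    // execution context, and I/O metadata, so that instances can travel
    // through the TorchScript interpreter like any other IValue.
    struct TRTEngine : torch::CustomClassHolder {
      explicit TRTEngine(std::string serialized_engine)
          : serialized_engine_(std::move(serialized_engine)) {
        // In the real runtime, the engine would be deserialized here and
        // an execution context created for it.
      }
      std::string serialized_engine_;  // placeholder for the engine state
    };

    // Registration makes instances storable as module attributes and
    // addressable from TorchScript as
    // __torch__.torch.classes.tensorrt.Engine.
    static auto engine_registration =
        torch::class_<TRTEngine>("tensorrt", "Engine")
            .def(torch::init<std::string>());
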
Background
------------

torch::jit::Value type).

TensorRT Engine Executor Op
----------------------------

When TRTorch is loaded, it registers an operator in the PyTorch JIT operator library called
``trt::execute_engine(Tensor[] inputs, __torch__.torch.classes.tensorrt.Engine engine) -> Tensor[]`` which takes an
instantiated engine and a list of inputs. Compiled graphs store this engine in an attribute so that it is portable and serializable.
When the op is called, an instantiated engine and input tensors are popped off the runtime stack. These inputs are passed into a generic engine execution function which
will run the tensors through the TensorRT engine and return new tensors as results. These tensors are pushed onto the
stack so that the next op, whatever it is, can use them.
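
A rough sketch of what that registration and execution function could look like, reusing the hypothetical ``TRTEngine``
class from the sketch above; the ``TORCH_LIBRARY`` namespace and the function body are assumptions for illustration:

.. code-block:: cpp

    #include <ATen/ATen.h>
    #include <torch/library.h>

    #include <vector>

    // Hypothetical execution function: binds the input tensors to the
    // engine's execution context, runs the engine, and returns the
    // output tensors.
    std::vector<at::Tensor> execute_engine(
        std::vector<at::Tensor> inputs,
        c10::intrusive_ptr<TRTEngine> engine) {
      std::vector<at::Tensor> outputs;
      // ... enqueue inputs on the TensorRT execution context and collect
      // the results into outputs ...
      return outputs;
    }

    // The schema string matches the one quoted above; custom class
    // arguments are written with their fully qualified TorchScript name.
    TORCH_LIBRARY(trt, m) {
      m.def(
          "execute_engine(Tensor[] inputs, __torch__.torch.classes.tensorrt.Engine engine) -> Tensor[]",
          execute_engine);
    }
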
Constructing the Resulting Graph
-----------------------------------

Once the engine is deserialized and instantiated, the compiler will construct a graph that will execute the engine when the module is called.
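
As a sketch of the shape of that graph, the following uses the ``torch::jit::Graph`` C++ API to emit a single call to
``trt::execute_engine``. For simplicity it takes the engine as a graph input, whereas the compiler described above stores
the engine in a module attribute and reads it with ``prim::GetAttr``; the structure of the emitted call is otherwise the
same, and all names are illustrative:

.. code-block:: cpp

    #include <torch/csrc/jit/ir/ir.h>

    #include <memory>

    // Builds a graph of roughly this form:
    //   graph(%engine, %input : Tensor):
    //     %inputs : Tensor[] = prim::ListConstruct(%input)
    //     %outputs : Tensor[] = trt::execute_engine(%inputs, %engine)
    //     return (%outputs)
    std::shared_ptr<torch::jit::Graph> make_engine_graph() {
      auto g = std::make_shared<torch::jit::Graph>();
      auto engine = g->addInput("engine");
      auto input = g->addInput("input");

      // Pack the tensor inputs into a Tensor[] to match the op's schema.
      auto pack = g->createList(c10::TensorType::get(), {input});
      g->appendNode(pack);

      // Call the executor op on the packed inputs and the engine.
      auto exec = g->create(
          c10::Symbol::fromQualString("trt::execute_engine"),
          {pack->output(), engine});
      g->appendNode(exec);

      g->registerOutput(exec->output());
      return g;
    }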
0 commit comments