@@ -19,79 +19,83 @@ to standard TorchScript. Load with ``torch.jit.load()`` and run like you would r
     trtorchc [input_file_path] [output_file_path]
     [input_specs...] {OPTIONS}
 
-    TRTorch is a compiler for TorchScript, it will compile and optimize
-    TorchScript programs to run on NVIDIA GPUs using TensorRT
+    TRTorch is a compiler for TorchScript, it will compile and optimize
+    TorchScript programs to run on NVIDIA GPUs using TensorRT
 
-    OPTIONS:
+    OPTIONS:
 
-      -h, --help                        Display this help menu
-                                        Verbiosity of the compiler
-      -v, --verbose                     Dumps debugging information about the
-                                        compilation process onto the console
-      -w, --warnings                    Disables warnings generated during
-                                        compilation onto the console (warnings
-                                        are on by default)
-      --i, --info                       Dumps info messages generated during
-                                        compilation onto the console
-      --build-debuggable-engine         Creates a debuggable engine
-      --use-strict-types                Restrict operating type to only use set
-                                        operation precision
-      --allow-gpu-fallback              (Only used when targeting DLA
-                                        (device-type)) Lets engine run layers on
-                                        GPU if they are not supported on DLA
-      --disable-tf32                    Prevent Float32 layers from using the
-                                        TF32 data format
-      -p[precision...],
-      --enabled-precison=[precision...] (Repeatable) Enabling an operating
-                                        precision for kernels to use when
-                                        building the engine (Int8 requires a
-                                        calibration-cache argument) [ float |
-                                        float32 | f32 | half | float16 | f16 |
-                                        int8 | i8 ] (default: float)
-      -d[type], --device-type=[type]    The type of device the engine should be
-                                        built for [ gpu | dla ] (default: gpu)
-      --gpu-id=[gpu_id]                 GPU id if running on multi-GPU platform
-                                        (defaults to 0)
-      --dla-core=[dla_core]             DLACore id if running on available DLA
-                                        (defaults to 0)
-      --engine-capability=[capability]  The type of device the engine should be
-                                        built for [ default | safe_gpu |
-                                        safe_dla ]
-      --calibration-cache-file=[file_path]
-                                        Path to calibration cache file to use
-                                        for post training quantization
-      --num-min-timing-iter=[num_iters] Number of minimization timing iterations
-                                        used to select kernels
-      --num-avg-timing-iters=[num_iters]
-                                        Number of averaging timing iterations
-                                        used to select kernels
-      --workspace-size=[workspace_size] Maximum size of workspace given to
-                                        TensorRT
-      --max-batch-size=[max_batch_size] Maximum batch size (must be >= 1 to be
-                                        set, 0 means not set)
-      -t[threshold],
-      --threshold=[threshold]           Maximum acceptable numerical deviation
-                                        from standard torchscript output
-                                        (default 2e-5)
-      --save-engine                     Instead of compiling a full a
-                                        TorchScript program, save the created
-                                        engine to the path specified as the
-                                        output path
-      input_file_path                   Path to input TorchScript file
-      output_file_path                  Path for compiled TorchScript (or
-                                        TensorRT engine) file
-      input_specs...                    Specs for inputs to engine, can either
-                                        be a single size or a range defined by
-                                        Min, Optimal, Max sizes, e.g.
-                                        "(N,..,C,H,W)"
-                                        "[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]".
-                                        Data Type and format can be specified by
-                                        adding an "@" followed by dtype and "%"
-                                        followed by format to the end of the
-                                        shape spec. e.g. "(3, 3, 32,
-                                        32)@f16%NHWC"
-      "--" can be used to terminate flag options and force all following
-      arguments to be treated as positional options
+      -h, --help                        Display this help menu
+                                        Verbiosity of the compiler
+      -v, --verbose                     Dumps debugging information about the
+                                        compilation process onto the console
+      -w, --warnings                    Disables warnings generated during
+                                        compilation onto the console (warnings
+                                        are on by default)
+      --i, --info                       Dumps info messages generated during
+                                        compilation onto the console
+      --build-debuggable-engine         Creates a debuggable engine
+      --use-strict-types                Restrict operating type to only use set
+                                        operation precision
+      --allow-gpu-fallback              (Only used when targeting DLA
+                                        (device-type)) Lets engine run layers on
+                                        GPU if they are not supported on DLA
+      --disable-tf32                    Prevent Float32 layers from using the
+                                        TF32 data format
+      -p[precision...],
+      --enabled-precison=[precision...] (Repeatable) Enabling an operating
+                                        precision for kernels to use when
+                                        building the engine (Int8 requires a
+                                        calibration-cache argument) [ float |
+                                        float32 | f32 | half | float16 | f16 |
+                                        int8 | i8 ] (default: float)
+      -d[type], --device-type=[type]    The type of device the engine should be
+                                        built for [ gpu | dla ] (default: gpu)
+      --gpu-id=[gpu_id]                 GPU id if running on multi-GPU platform
+                                        (defaults to 0)
+      --dla-core=[dla_core]             DLACore id if running on available DLA
+                                        (defaults to 0)
+      --engine-capability=[capability]  The type of device the engine should be
+                                        built for [ default | safe_gpu |
+                                        safe_dla ]
+      --calibration-cache-file=[file_path]
+                                        Path to calibration cache file to use
+                                        for post training quantization
+      --embed-engine                    Whether to treat input file as a
+                                        serialized TensorRT engine and embed it
+                                        into a TorchScript module (device spec
+                                        must be provided)
+      --num-min-timing-iter=[num_iters] Number of minimization timing iterations
+                                        used to select kernels
+      --num-avg-timing-iters=[num_iters]
+                                        Number of averaging timing iterations
+                                        used to select kernels
+      --workspace-size=[workspace_size] Maximum size of workspace given to
+                                        TensorRT
+      --max-batch-size=[max_batch_size] Maximum batch size (must be >= 1 to be
+                                        set, 0 means not set)
+      -t[threshold],
+      --threshold=[threshold]           Maximum acceptable numerical deviation
+                                        from standard torchscript output
+                                        (default 2e-5)
+      --save-engine                     Instead of compiling a full a
+                                        TorchScript program, save the created
+                                        engine to the path specified as the
+                                        output path
+      input_file_path                   Path to input TorchScript file
+      output_file_path                  Path for compiled TorchScript (or
+                                        TensorRT engine) file
+      input_specs...                    Specs for inputs to engine, can either
+                                        be a single size or a range defined by
+                                        Min, Optimal, Max sizes, e.g.
+                                        "(N,..,C,H,W)"
+                                        "[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]".
+                                        Data Type and format can be specified by
+                                        adding an "@" followed by dtype and "%"
+                                        followed by format to the end of the
+                                        shape spec. e.g. "(3, 3, 32,
+                                        32)@f16%NHWC"
+      "--" can be used to terminate flag options and force all following
+      arguments to be treated as positional options
 
 
     e.g.
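The input-spec grammar documented above — a single shape like `(3,3,32,32)`, or a `[min;opt;max]` range, with an optional `@dtype` and `%format` suffix — can be illustrated with a small parser. This is a sketch for understanding the format only, not code from TRTorch; the helper names are hypothetical:

```python
import re

def parse_shape(s):
    """Turn a "(N,..,C,H,W)" tuple string into a list of ints."""
    return [int(d) for d in re.findall(r"\d+", s)]

def parse_input_spec(spec):
    """Parse a trtorchc-style input spec, e.g. "(3,3,32,32)@f16%NHWC" or
    "[(1,3,32,32);(4,3,32,32);(8,3,32,32)]". Returns a dict with the
    shape (or min/opt/max shapes), dtype, and format; dtype and format
    are None when the optional "@"/"%" suffixes are omitted."""
    fmt = dtype = None
    if "%" in spec:                      # "%" introduces the memory format
        spec, fmt = spec.rsplit("%", 1)
    if "@" in spec:                      # "@" introduces the dtype
        spec, dtype = spec.rsplit("@", 1)
    spec = spec.strip()
    if spec.startswith("["):
        # Dynamic-shape form: [min;opt;max], three shapes separated by ";"
        mins, opts, maxs = spec.strip("[]").split(";")
        shape = {"min": parse_shape(mins),
                 "opt": parse_shape(opts),
                 "max": parse_shape(maxs)}
    else:
        shape = parse_shape(spec)
    return {"shape": shape, "dtype": dtype, "format": fmt}
```

For example, `parse_input_spec("(3,3,32,32)@f16%NHWC")` yields the shape `[3, 3, 32, 32]` with dtype `f16` and format `NHWC`, matching the static-shape example in the help text.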