
Commit aa9bf94
Upgrade Polygraphy to v0.33.0.

Prominent updates include (see the [CHANGELOG](tools/Polygraphy/CHANGELOG.md) for details):

- Added various examples, a CLI User Guide, and how-to guides.
- Added experimental support for DLA.
- Added a `data to-input` tool that can combine inputs/outputs created by `--save-inputs`/`--save-outputs`.
- Added a `PluginRefRunner` which provides CPU reference implementations for TensorRT plugins.
- Made several performance improvements in the Polygraphy CUDA wrapper.
- Removed the `to-json` tool, which was used to convert Pickled data generated by Polygraphy 0.26.1 and older to JSON.

Signed-off-by: Rajeev Rao <[email protected]>
1 parent b277416 commit aa9bf94
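Of the additions called out above, `PluginRefRunner` is the most API-facing; a minimal sketch of how it might be used (hedged: the `polygraphy.backend.pluginref` import path and the `GsFromOnnx` loader are assumptions to verify against the v0.33.0 API reference, and `model_with_plugin.onnx` plus the input name `x` are illustrative):

```python
import numpy as np

# Assumed import paths; check the Polygraphy v0.33.0 API reference.
from polygraphy.backend.onnx import GsFromOnnx, OnnxFromPath
from polygraphy.backend.pluginref import PluginRefRunner

# PluginRefRunner operates on an ONNX-GraphSurgeon graph and evaluates supported
# TensorRT plugin ops with CPU reference implementations.
load_graph = GsFromOnnx(OnnxFromPath("model_with_plugin.onnx"))

with PluginRefRunner(load_graph) as runner:
    outputs = runner.infer(feed_dict={"x": np.ones((1, 1, 2, 2), dtype=np.float32)})
```

The same comparison is available from the CLI via the `--pluginref` runner option to `polygraphy run`.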

File tree: 137 files changed, +2790 -938 lines


tools/Polygraphy/CHANGELOG.md (+63)

````diff
@@ -3,6 +3,69 @@
 Dates are in YYYY-MM-DD format.
 
 
+## v0.33.0 (2021-09-16)
+### Added
+- Added various examples, a [CLI User Guide](polygraphy/tools/) and [directory for how-to guides](./how-to).
+- Added an experimental `template trt-config` tool to generate template scripts that create TensorRT builder configurations.
+- Added `--hide-fail-output` to make `debug` subtools suppress output from failed iterations.
+- Added experimental support for DLA.
+- Added a `data to-input` tool that can combine inputs/outputs created by `--save-inputs`/`--save-outputs`.
+    The resulting file is compatible with `--load-inputs`.
+
+### Changed
+- Updated `debug` subtools to show captured output on failed iterations.
+- The logger will now emit all `CRITICAL` messages to `stderr` instead of `stdout`.
+- Renamed `CompareFunc.basic_compare_func` to `CompareFunc.simple`. The old name is preserved as an alias.
+- The `--good` and `--bad` arguments in `diff-tactics` can now also accept single files instead of directories.
+
+### Fixed
+- Fixed a bug where `debug reduce` would crash when ONNX models included `Constant` nodes whose outputs
+    needed to be marked as model outputs.
+
+
+## v0.32.0 (2021-08-10)
+### Added
+- Added support for `K`, `M`, and `G` suffixes to CLI arguments that expect a number of bytes (e.g. `--workspace`).
+    These correspond to `KiB`, `MiB`, and `GiB` respectively.
+    For example, `--workspace=16M` is equivalent to `--workspace=16777216`.
+- Added a `copy_outputs_to_host` parameter in `TrtRunner.infer()`, which, when set to `False`, will cause the runner
+    to return `DeviceView`s instead of NumPy arrays for inference outputs. This allows us to avoid a
+    device-to-host and host-to-device copy if we want outputs to remain on the device.
+- Added a `view()` method to `DeviceArray`s to create read-only `DeviceView`s over their data.
+- Added a `PluginRefRunner` which provides CPU reference implementations for TensorRT plugins
+    and a corresponding `--pluginref` runner option in `polygraphy run`.
+
+### Changed
+- Marked old shape syntax (`<name>,dim0xdim1x...xdimN,<dtype>`) as deprecated since it leads to ambiguity when
+    parsing shapes including named dynamic dimensions.
+
+    For example, compare:
+    ```
+    --input-shapes input0,xxyxz
+    ```
+
+    and:
+    ```
+    --input-shapes input0:[x,y,z]
+    ```
+
+    For now, the old syntax continues to work for shapes without named dimensions,
+    but it will be removed in a future version of Polygraphy.
+
+    The newer syntax, which was originally introduced in Polygraphy 0.25.0,
+    uses the list syntax already present in other parts of Polygraphy.
+    For example, `--val-range [0,1]` in `run` and `--attrs axes=[0,1]` in `surgeon insert` use the same syntax.
+- Made several performance improvements in the Polygraphy CUDA wrapper.
+- Added a loud warning when the deprecated `--int-min`/`--int-max` or `--float-min`/`--float-max` options are used.
+    These are superseded by `--val-range` which allows you to specify data ranges on a per-input basis.
+
+### Removed
+- Removed various deprecated aliases: `ModifyOnnx`, `SessionFromOnnxBytes`, `ModifyNetwork`, `ModifyGraph`
+- Removed the `to-json` tool which was used to convert Pickled data generated by Polygraphy 0.26.1 and older to JSON.
+    Polygraphy 0.27.0 and later only support reading and writing data in JSON format.
+- Removed deprecated legacy submodule `polygraphy.util.misc` which was just an alias for `polygraphy.util`.
+
+
 ## v0.31.1 (2021-07-16)
 ### Changed
 - Improved the quality of several examples and added information on how to load serialized TensorRT engines
````
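Of the v0.32.0 entries above, `copy_outputs_to_host` is easiest to grasp in code. A minimal sketch, assuming the `identity.onnx` model (with a single input `x` and output `output`) used throughout the examples in this commit:

```python
import numpy as np
from polygraphy.backend.trt import EngineFromNetwork, NetworkFromOnnxPath, TrtRunner

build_engine = EngineFromNetwork(NetworkFromOnnxPath("identity.onnx"))

with TrtRunner(build_engine) as runner:
    # With `copy_outputs_to_host=False`, outputs remain on the GPU as `DeviceView`s,
    # skipping the device-to-host copy that NumPy outputs would require.
    outputs = runner.infer(
        feed_dict={"x": np.ones((1, 1, 2, 2), dtype=np.float32)},
        copy_outputs_to_host=False,
    )
    print(type(outputs["output"]))  # Expect a DeviceView rather than an np.ndarray.
```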

tools/Polygraphy/Makefile (+7 -1)

```diff
@@ -4,10 +4,16 @@ NPROC ?= 8
 
 # Tests also check that docs can build
 test: docs
+	# Some tests need to be run serially - we annotate those with a `serial` marker.
 	export PYTHONPATH=$(CURDIR):$${PYTHONPATH} && \
+	export PATH=$(CURDIR)/bin:$${PATH} && \
+	export POLYGRAPHY_INTERNAL_CORRECTNESS_CHECKS=1 && \
+	python3 -m pytest tests/ -v -x --durations=5 -m "serial" && \
+	\
+	export PYTHONPATH=$(CURDIR):$${PYTHONPATH} && \
 	export PATH=$(CURDIR)/bin:$${PATH} && \
 	export POLYGRAPHY_INTERNAL_CORRECTNESS_CHECKS=1 && \
-	python3 -m pytest tests/ -v -x -n $(NPROC) --dist=loadscope --durations=5
+	python3 -m pytest tests/ -v -x -n $(NPROC) --dist=loadscope --durations=5 -m "not serial"
 
 leak_check:
 	export PYTHONPATH=$(CURDIR):$${PYTHONPATH} && \
```
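For context on the split above: the first pytest invocation runs only tests carrying the `serial` marker, and the second runs everything else in parallel via pytest-xdist (`-n $(NPROC)`). A minimal sketch of how a test opts in (the test name is illustrative, and this assumes the `serial` marker is registered in the project's pytest configuration):

```python
import pytest

@pytest.mark.serial
def test_that_must_not_run_in_parallel():
    # Tests marked `serial` run in the first, single-process pytest invocation,
    # e.g. because they contend for a shared resource such as GPU memory.
    ...
```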

tools/Polygraphy/README.md (+13 -28)

````diff
@@ -5,11 +5,10 @@
 
 - [Introduction](#introduction)
 - [Installation](#installation)
-- [Usage](#usage)
+- [Command-line Toolkit](#command-line-toolkit)
+- [Python API](#python-api)
 - [Examples](#examples)
-- [Advanced](#advanced)
-    - [Using The Python API](#using-the-python-api)
-    - [Enabling Internal Correctness Checks](#enabling-internal-correctness-checks)
+- [How-To Guides](#how-to-guides)
 - [Contributing](#contributing)
 
 
@@ -43,7 +42,7 @@ Among other things, Polygraphy lets you:
 ### Installing Prebuilt Wheels
 
 ```bash
-python -m pip install colored polygraphy --index-url https://pypi.ngc.nvidia.com
+python -m pip install colored polygraphy --extra-index-url https://pypi.ngc.nvidia.com
 ```
 
 **NOTE:** *When using this method, the command-line toolkit will be installed into `${HOME}/.local/bin` by default.*
@@ -137,41 +136,27 @@ You can install the additional packages manually with:
 python -m pip install <package_name>
 ```
 
-## Usage
 
-Polygraphy includes a command-line interface, [`polygraphy`](./bin/polygraphy), which provides various tools.
-For usage information, run `polygraphy --help`
+## Command-line Toolkit
 
-For details on the various tools included in the Polygraphy toolkit, see the
-[tools directory](./polygraphy/tools).
+For details on the various tools included in the Polygraphy toolkit,
+see the [CLI User Guide](./polygraphy/tools).
 
 
-## Examples
-
-For examples of both the CLI and Python API, see the [examples directory](./examples).
-
-
-## Advanced
-
-### Using The Python API
+### Python API
 
 For more information on the Polygraphy Python API, including a high-level overview and the
 Python API reference documentation, see the [API directory](./polygraphy).
 
 
-### Enabling Internal Correctness Checks
+## Examples
+
+For examples of both the CLI and Python API, see the [examples directory](./examples).
 
-Polygraphy includes various runtime checks for internal correctness, which are
-disabled by default. These checks can be enabled by setting the `POLYGRAPHY_INTERNAL_CORRECTNESS_CHECKS`
-environment variable to `1` or `polygraphy.config.INTERNAL_CORRECTNESS_CHECKS = True` in the Python API.
-A failure in this type of check indicates a bug in Polygraphy.
 
-When the checks are enabled, Polygraphy will ensure, for example, that loaders do not
-modify their state when they are called, and that runners will reset their state correctly in
-`deactivate()`.
+## How-To Guides
 
-**NOTE:** *`POLYGRAPHY_INTERNAL_CORRECTNESS_CHECKS` only relates to checks that validate Polygraphy's*
-*internal APIs. User input validation and public API checks are always enabled and cannot be disabled.*
+For how-to guides, see the [how-to guides directory](./how-to).
 
 
 ## Contributing
````
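Although this commit moves the internal-correctness-checks section out of the README, the mechanism described in the removed text is unchanged; a minimal sketch of the two ways to enable it (assumption: the environment variable is read when `polygraphy` is first imported, so set it beforehand):

```python
import os

# Option 1: environment variable (set before importing polygraphy).
os.environ["POLYGRAPHY_INTERNAL_CORRECTNESS_CHECKS"] = "1"

# Option 2: the Python API flag named in the removed README text.
import polygraphy.config

polygraphy.config.INTERNAL_CORRECTNESS_CHECKS = True
```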

tools/Polygraphy/docs/conf.py (+1 -1)

```diff
@@ -36,7 +36,7 @@
 autodoc_default_options = {
     "members": True,
     "show-inheritance": True,
-    "exclude-members": "activate_impl, deactivate_impl, get_input_metadata_impl, infer_impl, BaseNetworkFromOnnx, Encoder, Decoder, add_json_methods, constantmethod",
+    "exclude-members": "activate_impl, deactivate_impl, get_input_metadata_impl, BaseNetworkFromOnnx, Encoder, Decoder, add_json_methods, constantmethod",
     "special-members": "__call__, __getitem__, __bool__, __enter__, __exit__",
 }
 
```

tools/Polygraphy/examples/README.md (-2)

```diff
@@ -3,5 +3,3 @@
 This directory includes various examples covering the Polygraphy [CLI](./cli), [Python API](./api), and [development practices](./dev).
 
 The paths used in each example assume that the example is being run from within that example's directory.
-
-All the models used by these examples are provided in the [models directory](./models).
```

tools/Polygraphy/examples/api/00_inference_with_tensorrt/build_and_run.py (+1 -1)

```diff
@@ -29,7 +29,7 @@ def main():
     #
     # NOTE: `build_engine` is a *callable* that returns an engine, not the engine itself.
     # To get the engine directly, you can use the immediately evaluated functional API.
-    # See eexamples/api/06_immediate_eval_api for details.
+    # See examples/api/06_immediate_eval_api for details.
     build_engine = EngineFromNetwork(
         NetworkFromOnnxPath("identity.onnx"), config=CreateConfig(fp16=True)
     )  # Note that config is an optional argument.
```
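The NOTE fixed in the diff above is the crux of this example: lazy loaders return callables, while the immediately evaluated functional API returns objects directly. A side-by-side sketch using the same `identity.onnx` model, assembled from the two patterns shown elsewhere in this commit:

```python
from polygraphy.backend.trt import (
    CreateConfig,
    EngineFromNetwork,
    NetworkFromOnnxPath,
    create_config,
    engine_from_network,
    network_from_onnx_path,
)

# Lazy API: `build_engine` is a callable; nothing is built until it is invoked.
build_engine = EngineFromNetwork(NetworkFromOnnxPath("identity.onnx"), config=CreateConfig(fp16=True))
engine = build_engine()

# Immediate API: each call runs right away and returns the object itself,
# so we own (and must eventually free) everything it returns.
builder, network, parser = network_from_onnx_path("identity.onnx")
config = create_config(builder, network, fp16=True)
engine = engine_from_network((builder, network), config)
```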

tools/Polygraphy/examples/api/00_inference_with_tensorrt/load_and_run.py (+1 -1)

```diff
@@ -16,7 +16,7 @@
 #
 
 """
-This script loads the TensorRT engine built by `build_and_run.py` and then runs it.
+This script loads the TensorRT engine built by `build_and_run.py` and runs inference.
 """
 import numpy as np
 from polygraphy.backend.common import BytesFromPath
```

tools/Polygraphy/examples/api/01_comparing_frameworks/README.md (+1 -1)

```diff
@@ -8,7 +8,7 @@ different backends. This makes it possible to check the accuracy of one backend
 respect to another.
 
 In this example, we'll look at how you can use the Polygraphy API to run inference
-on a model using ONNX Runtime and TensorRT, and then compare the results.
+with synthetic input data using ONNX Runtime and TensorRT, and then compare the results.
 
 
 ## Running The Example
```

tools/Polygraphy/examples/api/01_comparing_frameworks/example.py (+6 -2)

```diff
@@ -21,7 +21,7 @@
 """
 from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx
 from polygraphy.backend.trt import EngineFromNetwork, NetworkFromOnnxPath, TrtRunner
-from polygraphy.comparator import Comparator
+from polygraphy.comparator import Comparator, CompareFunc
 
 
 def main():
@@ -46,7 +46,11 @@ def main():
     run_results = Comparator.run(runners)
 
     # `Comparator.compare_accuracy()` checks that outputs match between runners.
-    assert bool(Comparator.compare_accuracy(run_results))
+    #
+    # TIP: The `compare_func` parameter can be used to control how outputs are compared (see API reference for details).
+    # The default comparison function is created by `CompareFunc.simple()`, but we can construct it
+    # explicitly if we want to change the default parameters, such as tolerance.
+    assert bool(Comparator.compare_accuracy(run_results, compare_func=CompareFunc.simple(atol=1e-8)))
 
     # We can use `RunResults.save()` method to save the inference results to a JSON file.
     # This can be useful if you want to generate and compare results separately.
```
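Building on the `compare_func` tip added above, tolerances can also be set per output; a hedged sketch (the dict-valued `atol`/`rtol` form keyed by output name, with `""` as the default key, is an assumption to verify against the `CompareFunc.simple()` reference, and the output name is illustrative):

```python
from polygraphy.comparator import CompareFunc

# Assumed API: per-output tolerances via a dict, with "" supplying the default.
compare_func = CompareFunc.simple(
    rtol=1e-5,
    atol={"logits": 1e-3, "": 1e-5},  # "logits" is an illustrative output name.
)

# This would then be passed as:
#   Comparator.compare_accuracy(run_results, compare_func=compare_func)
```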

tools/Polygraphy/examples/api/04_int8_calibration_in_tensorrt/example.py (+1 -1)

```diff
@@ -29,7 +29,7 @@
 def calib_data():
     for _ in range(4):
         # TIP: If your calibration data is already on the GPU, you can instead provide GPU pointers
-        # (as `int`s) or Polygraphy `DeviceView`s instead of NumPy arrays.
+        # (as `int`s) or Polygraphy `DeviceView`s instead of NumPy arrays.
         #
         # For details on `DeviceView`, see `polygraphy/cuda/cuda.py`.
         yield {"x": np.ones(shape=(1, 1, 2, 2), dtype=np.float32)}  # Totally real data
```

tools/Polygraphy/examples/api/06_immediate_eval_api/README.md (+31 -5)

````diff
@@ -2,6 +2,7 @@
 
 ## Introduction
 
+<!-- Polygraphy Test: Ignore Start -->
 Most of the time, the lazy loaders included with Polygraphy have several advantages:
 
 - They allow us to defer the work until we actually need to do it, which can potentially save
@@ -16,6 +17,7 @@ Most of the time, the lazy loaders included with Polygraphy have several advanta
     ```python
     build_engine = EngineBytesFromNetwork(NetworkFromOnnxPath("/path/to/model.onnx"))
     ```
+
 - They allow for special semantics where if a callable is provided to a loader, it takes ownership
     of the return value, whereas otherwise it does not. These special semantics are useful for
     sharing objects between multiple loaders.
@@ -46,21 +48,45 @@ engine = build_engine()
 becomes:
 
 ```python
-builder, network = network_from_onnx_path("/path/to/model.onnx")
+builder, network, parser = network_from_onnx_path("/path/to/model.onnx")
 config = create_config(builder, network, fp16=True, tf32=True)
-engine = engine_from_network((builder, network), config)
+engine = engine_from_network((builder, network, parser), config)
 ```
+<!-- Polygraphy Test: Ignore End -->
+
+
+In this example, we'll look at how you can leverage the functional API to convert an ONNX
+model to a TensorRT network, modify the network, build a TensorRT engine with FP16 precision
+enabled, and run inference.
+We'll also save the engine to a file to see how you can load it again and run inference.
 
-`example.py` showcases basic usage of the immediately evaluated functional API.
 
 ## Running The Example
 
 1. Install prerequisites
     * Ensure that TensorRT is installed
    * Install other dependencies with `python3 -m pip install -r requirements.txt`
 
-2. Run the example:
+2. **[Optional]** Inspect the model before running the example:
+
+    ```bash
+    polygraphy inspect model identity.onnx
+    ```
+
+3. Run the script that builds and runs the engine:
+
+    ```bash
+    python3 build_and_run.py
+    ```
+
+4. **[Optional]** Inspect the TensorRT engine built by the example:
+
+    ```bash
+    polygraphy inspect model identity.engine
+    ```
+
+5. Run the script that loads the previously built engine, then runs it:
 
     ```bash
-    python3 example.py
+    python3 load_and_run.py
     ```
````
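The ownership semantics mentioned in the Introduction above deserve a concrete sketch: a loader given a *callable* owns whatever that callable produces, while a loader given objects directly does not (the model path is illustrative, and the tuple-passing form is an assumption based on the loader semantics this README describes):

```python
from polygraphy.backend.trt import (
    EngineBytesFromNetwork,
    NetworkFromOnnxPath,
    network_from_onnx_path,
)

# Case 1 - callable provided: the engine loader takes ownership of the network
# created by NetworkFromOnnxPath and frees it once the engine is built.
build_engine_bytes = EngineBytesFromNetwork(NetworkFromOnnxPath("/path/to/model.onnx"))

# Case 2 - objects provided directly: ownership stays with us, so the same
# network can be shared with other loaders (and we must free it ourselves).
builder, network, parser = network_from_onnx_path("/path/to/model.onnx")
build_engine_bytes = EngineBytesFromNetwork((builder, network, parser))
```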

tools/Polygraphy/examples/api/06_immediate_eval_api/example.py renamed to tools/Polygraphy/examples/api/06_immediate_eval_api/build_and_run.py (+12 -7)

```diff
@@ -18,12 +18,11 @@
 """
 This script uses Polygraphy's immediately evaluated functional APIs
 to load an ONNX model, convert it into a TensorRT network, add an identity
-layer to the end of it, build an engine with FP16 mode enabled, and finally
-run inference.
+layer to the end of it, build an engine with FP16 mode enabled,
+save the engine, and finally run inference.
 """
 import numpy as np
-
-from polygraphy.backend.trt import TrtRunner, create_config, engine_from_network, network_from_onnx_path
+from polygraphy.backend.trt import TrtRunner, create_config, engine_from_network, network_from_onnx_path, save_engine
 
 
 def main():
@@ -34,7 +33,10 @@ def main():
     # Since we are immediately evaluating, we take ownership of objects, and are responsible for freeing them.
     builder, network, parser = network_from_onnx_path("identity.onnx")
 
-    # Extend the network with an identity layer.
+    # Extend the network with an identity layer (purely for the sake of example).
+    # Note that unlike with lazy loaders, we don't need to do anything special to modify the network.
+    # If we were using lazy loaders, we would need to use `func.extend()` as described
+    # in example 03 and example 05.
     prev_output = network.get_output(0)
     network.unmark_output(prev_output)
     output = network.add_identity(prev_output).get_output(0)
@@ -45,11 +47,14 @@ def main():
     config = create_config(builder, network, fp16=True)
 
     # We can free everything we constructed above once we're done building the engine.
-    # NOTE: In TensorRT 8.0, we do *not* need to use a context manager here.
+    # NOTE: In TensorRT 8.0 and newer, we do *not* need to use a context manager here.
     with builder, network, parser, config:
         engine = engine_from_network((builder, network), config)
 
-    # NOTE: In TensorRT 8.0, we do *not* need to use a context manager to free `engine`.
+    # To reuse the engine elsewhere, we can serialize it and save it to a file.
+    save_engine(engine, path="identity.engine")
+
+    # NOTE: In TensorRT 8.0 and newer, we do *not* need to use a context manager to free `engine`.
     with engine, TrtRunner(engine) as runner:
         inp_data = np.ones((1, 1, 2, 2), dtype=np.float32)
```
tools/Polygraphy/examples/api/06_immediate_eval_api/load_and_run.py (new file, +44)

```diff
@@ -0,0 +1,44 @@
+#!/usr/bin/env python3
+#
+# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""
+This script uses Polygraphy's immediately evaluated functional APIs
+to load the TensorRT engine built by `build_and_run.py` and run inference.
+"""
+import numpy as np
+from polygraphy.backend.common import bytes_from_path
+from polygraphy.backend.trt import TrtRunner, engine_from_bytes
+
+
+def main():
+    engine = engine_from_bytes(bytes_from_path("identity.engine"))
+
+    # NOTE: In TensorRT 8.0 and newer, we do *not* need to use a context manager to free `engine`.
+    with engine, TrtRunner(engine) as runner:
+        inp_data = np.ones((1, 1, 2, 2), dtype=np.float32)
+
+        # NOTE: The runner owns the output buffers and is free to reuse them between `infer()` calls.
+        # Thus, if you want to store results from multiple inferences, you should use `copy.deepcopy()`.
+        outputs = runner.infer(feed_dict={"x": inp_data})
+
+        assert np.array_equal(outputs["output"], inp_data)  # It's an identity model!
+
+        print("Inference succeeded!")
+
+
+if __name__ == "__main__":
+    main()
```
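The NOTE about reused output buffers is worth seeing in runnable form; a minimal sketch, assuming `identity.engine` was produced by `build_and_run.py` as above:

```python
import copy

import numpy as np
from polygraphy.backend.common import bytes_from_path
from polygraphy.backend.trt import TrtRunner, engine_from_bytes

engine = engine_from_bytes(bytes_from_path("identity.engine"))
with engine, TrtRunner(engine) as runner:
    results = []
    for val in (0.0, 1.0):
        outputs = runner.infer(feed_dict={"x": np.full((1, 1, 2, 2), val, dtype=np.float32)})
        # Deep-copy before the next infer() call; otherwise both entries could
        # end up referencing the same reused buffer.
        results.append(copy.deepcopy(outputs))

assert np.array_equal(results[0]["output"], np.zeros((1, 1, 2, 2), dtype=np.float32))
```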
