Extending polygraphy run

Introduction

polygraphy run allows you to run inference with multiple backends, including TensorRT and ONNX-Runtime, and compare their outputs. It does provide mechanisms to load and compare against custom outputs from unsupported backends, but adding support for such a backend via an extension module integrates it more seamlessly and provides a better user experience.

In this example, we'll create an extension module for polygraphy run called polygraphy_reshape_destroyer, which will include the following:

  • A special loader that will replace no-op Reshape nodes in an ONNX model with Identity nodes.

  • A custom runner that supports ONNX models containing only Identity nodes.

  • Command-line options to:

    • Enable or disable renaming nodes when a transformation is applied by the loader.
    • Run the model in slow, medium, or fast mode. In slow and medium modes, we'll inject a time.sleep() during inference (this will result in massive performance gains in fast mode!).
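To make the loader's job concrete, here is a minimal, dependency-free sketch of the transformation described above. It does not use the ONNX or Polygraphy APIs: `Node`, `is_no_op_reshape`, and `replace_no_op_reshapes` are hypothetical stand-ins, a Reshape is treated as a no-op simply when its target shape equals its input shape, and the `destroyed_` name prefix is an assumption made for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for an ONNX node (not the real onnx.NodeProto)."""

    name: str
    op: str
    input_shape: Tuple[int, ...]
    target_shape: Optional[Tuple[int, ...]] = None  # Only meaningful for Reshape


def is_no_op_reshape(node: Node) -> bool:
    # Simplification: a Reshape is a no-op when it leaves the shape unchanged.
    return node.op == "Reshape" and node.target_shape == node.input_shape


def replace_no_op_reshapes(nodes: List[Node], rename_nodes: bool = False) -> List[Node]:
    """Returns a new node list with every no-op Reshape swapped for an Identity."""
    result = []
    for node in nodes:
        if is_no_op_reshape(node):
            # The "destroyed_" prefix is illustrative, not the example's actual naming.
            new_name = f"destroyed_{node.name}" if rename_nodes else node.name
            node = replace(node, name=new_name, op="Identity", target_shape=None)
        result.append(node)
    return result


graph = [
    Node("reshape_0", "Reshape", input_shape=(1, 2, 3), target_shape=(1, 2, 3)),
    Node("reshape_1", "Reshape", input_shape=(1, 2, 3), target_shape=(6,)),
]
converted = replace_no_op_reshapes(graph, rename_nodes=True)
print([(n.name, n.op) for n in converted])
```

Only the first node is converted, because the second Reshape actually changes the shape. The `rename_nodes` flag plays the role of the renaming command-line option listed above.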

Background

Although this example is self-contained and explains concepts as you encounter them, we still recommend that you first familiarize yourself with Polygraphy's Loader and Runner APIs, the Argument Group Interface, and the Script Interface.

After that, creating an extension module for polygraphy run is a simple matter of defining your custom Loaders/Runners and Argument Groups and making them visible to Polygraphy via setuptools's entry_points API.
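The discovery half of this mechanism can be illustrated with the standard library alone. The sketch below lists whatever entry points installed packages have registered under a given group. The group name `polygraphy.run.plugins` is an assumption about what Polygraphy looks up, so check it against the example's setup.py, and `find_plugins` is a hypothetical helper rather than part of Polygraphy.

```python
from importlib.metadata import entry_points


def find_plugins(group):
    """Returns the entry points registered under `group` by installed packages."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object that supports selection by group.
        return list(eps.select(group=group))
    # Older versions return a dict mapping group names to entry points.
    return list(eps.get(group, []))


# Assumed group name; each entry point maps a plugin name to "module:function".
for plugin in find_plugins("polygraphy.run.plugins"):
    print(plugin.name, "->", plugin.value)
```

If no extension module is installed, the loop prints nothing. After installing one, each registered plugin should appear with the import path of its entry-point function.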

NOTE: Defining a custom Loader is not strictly required, but will be covered in this example for the sake of completeness.

As a matter of convention, Polygraphy extension module names are prefixed with polygraphy_.

Reading The Example Code

We've structured our example extension module such that it somewhat mirrors the structure of the Polygraphy repository. This should make it easier to see the parallels between functionality in the extension module and that provided by Polygraphy natively. The structure is:

- extension_module/
    - polygraphy_reshape_destroyer/
        - backend/
            - __init__.py   # Controls submodule-level exports
            - loader.py     # Defines our custom loader
            - runner.py     # Defines our custom runner
        - args/
            - __init__.py   # Controls submodule-level exports
            - loader.py     # Defines command-line argument group for our custom loader
            - runner.py     # Defines command-line argument group for our custom runner
        - __init__.py       # Controls module-level exports
        - export.py         # Defines the entry-point for `polygraphy run`.
    - setup.py              # Builds our module

It is recommended that you read these files in the following order:

  1. backend/loader.py
  2. backend/runner.py
  3. backend/__init__.py
  4. args/loader.py
  5. args/runner.py
  6. args/__init__.py
  7. __init__.py
  8. export.py
  9. setup.py

Running The Example

  1. Build and install the extension module:

    Build using setup.py:

    python3 extension_module/setup.py bdist_wheel

    Install the wheel:

    python3 -m pip install extension_module/dist/polygraphy_reshape_destroyer-0.0.1-py3-none-any.whl \
        --extra-index-url https://pypi.ngc.nvidia.com

    TIP: If you make changes to the example extension module, you can update your installed version by rebuilding the wheel with the setup.py command above and then running:

    python3 -m pip install extension_module/dist/polygraphy_reshape_destroyer-0.0.1-py3-none-any.whl \
        --force-reinstall --no-deps
  2. Once the extension module is installed, you should see the options you added appear in the help output of polygraphy run:

    polygraphy run -h
  3. Next, we can try out our custom runner with an ONNX model containing a no-op Reshape:

    polygraphy run no_op_reshape.onnx --res-des
  4. We can also try some of the other command-line options we added:

    • Renaming replaced nodes:

      polygraphy run no_op_reshape.onnx --res-des --res-des-rename-nodes
    • Different inference speeds:

      polygraphy run no_op_reshape.onnx --res-des --res-des-speed=slow
      polygraphy run no_op_reshape.onnx --res-des --res-des-speed=medium
      polygraphy run no_op_reshape.onnx --res-des --res-des-speed=fast
  5. Lastly, let's compare our implementation against ONNX-Runtime to make sure it is functionally correct:

    polygraphy run no_op_reshape.onnx --res-des --onnxrt
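
As a rough mental model of what steps 4 and 5 exercise, the following dependency-free sketch shows an identity "runner" whose speed mode only controls an artificial delay, followed by an element-wise comparison against reference outputs. `ToyIdentityRunner`, `outputs_match`, the delay values, and the tolerance are all hypothetical; the real runner is built on Polygraphy's Runner API, and the real comparison is performed by polygraphy run.

```python
import math
import time

# Hypothetical delays in seconds; the example's actual values may differ.
DELAYS = {"slow": 0.02, "medium": 0.01, "fast": 0.0}


class ToyIdentityRunner:
    """Toy stand-in for a runner that supports only Identity nodes."""

    def __init__(self, speed="fast"):
        if speed not in DELAYS:
            raise ValueError(f"Unknown speed: {speed!r}; choose from {sorted(DELAYS)}")
        self.speed = speed

    def infer(self, feed_dict):
        time.sleep(DELAYS[self.speed])  # Artificial slowdown for slow/medium modes
        # An all-Identity model returns its inputs unchanged.
        return {name: list(values) for name, values in feed_dict.items()}


def outputs_match(actual, expected, abs_tol=1e-5):
    """Checks that both runs produced the same output names and close values."""
    if actual.keys() != expected.keys():
        return False
    return all(
        len(actual[name]) == len(expected[name])
        and all(math.isclose(a, e, abs_tol=abs_tol) for a, e in zip(actual[name], expected[name]))
        for name in expected
    )


inputs = {"x": [1.0, 2.0, 3.0]}
for speed in ("slow", "medium", "fast"):
    start = time.perf_counter()
    outputs = ToyIdentityRunner(speed).infer(inputs)
    elapsed = time.perf_counter() - start
    print(f"{speed}: {elapsed:.4f}s, matches reference: {outputs_match(outputs, inputs)}")
```

The speed setting changes only the latency, never the outputs, which is why a comparison against another backend should pass in every mode.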