🐛 [Bug] Cannot export traced model to TensorRT #1227

Closed
domef opened this issue Aug 3, 2022 · 1 comment · Fixed by #1236
Labels
bug Something isn't working channel: NGC Issues related to NGC

domef commented Aug 3, 2022

Bug Description

I get an error when compiling a model with TensorRT. I first trace the model, then compile it with TensorRT, and the compile step fails.
Unfortunately my codebase is private so I cannot share much information.

This is the code I use to export. The variable mymodel is an nn.Module in eval mode on a CUDA device:

import torch
import torch_tensorrt

# Trace with a fixed-shape dummy input on the GPU.
dummy_input = torch.rand((1, 3, 512, 512), device="cuda")
mymodel_trace = torch.jit.trace(
    mymodel,
    dummy_input,
    strict=False,
)
mymodel_trt = torch_tensorrt.ts.compile(
    mymodel_trace,
    inputs=[torch_tensorrt.Input(dummy_input.shape)],
    device={
        "device_type": torch_tensorrt.DeviceType.GPU,
        "gpu_id": 0,
        "dla_core": 0,
        "allow_gpu_fallback": True,
    },
    truncate_long_and_double=True,
)

The error is:

File "examples/tensorrt/demo.py", line 128, in demo
    mymodel_trt = torch_tensorrt.ts.compile(
  File "/opt/conda/lib/python3.8/site-packages/torch_tensorrt/ts/_compiler.py", line 113, in compile
    compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: 
Schema not found for node. File a bug report.
Node: %3874 : int[] = prim::ListConstruct(%16, %3868), scope: __module._merger/__module._merger._gatherer

Input types:int, int
no candidates found
within the graph

followed by hundreds of lines describing the graph.
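Since the message is buried in a long graph dump, it can help to pull out just the failing node when triaging. The following is a minimal sketch that assumes the error text has the same shape as the message quoted above. `parse_schema_error` and the two regular expressions are hypothetical helpers written for illustration; they are not part of torch or torch_tensorrt.

```python
import re

# Hypothetical helper, not a torch / torch_tensorrt API.
# Matches a line such as:
#   Node: %3874 : int[] = prim::ListConstruct(%16, %3868), scope: a/b
NODE_RE = re.compile(
    r"Node:\s*(?P<output>%\S+)\s*:\s*(?P<type>.+?)\s*=\s*"
    r"(?P<kind>[\w:]+)\((?P<args>[^)]*)\)"
    r"(?:,\s*scope:\s*(?P<scope>\S+))?"
)
INPUTS_RE = re.compile(r"Input types:\s*(?P<types>.*)")


def parse_schema_error(message):
    """Return the failing node's fields as a dict, or None if no node line is found."""
    node = NODE_RE.search(message)
    if node is None:
        return None
    inputs = INPUTS_RE.search(message)
    input_types = []
    if inputs:
        input_types = [t.strip() for t in inputs.group("types").split(",") if t.strip()]
    return {
        "output": node.group("output"),
        "type": node.group("type"),
        "kind": node.group("kind"),
        "args": [a.strip() for a in node.group("args").split(",") if a.strip()],
        "scope": node.group("scope"),
        "input_types": input_types,
    }


# The error text from this report, copied verbatim.
message = """Schema not found for node. File a bug report.
Node: %3874 : int[] = prim::ListConstruct(%16, %3868), scope: __module._merger/__module._merger._gatherer

Input types:int, int
no candidates found
within the graph"""

info = parse_schema_error(message)
print(info["kind"], info["scope"], info["input_types"])
```

Here the extracted fields point at a prim::ListConstruct node with two int inputs inside the _gatherer submodule, which narrows down where to look in the graph dump.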

ENVIRONMENT

I'm using the container nvcr.io/nvidia/pytorch:22.07-py3.
To describe the environment in more detail, I ran the collect_env.py script that the PyTorch bug report template on GitHub points to (https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py):

PyTorch version: 1.13.0a0+08820cb
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.4 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: version 3.23.2
Libc version: glibc-2.31

Python version: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10) [GCC 10.3.0] (64-bit runtime)
Python platform: Linux-5.15.0-41-generic-x86_64-with-glibc2.10
Is CUDA available: True
CUDA runtime version: 11.7.99
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 2080 Ti
Nvidia driver version: 510.73.05
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.4.1
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.23.1
[pip3] pytorch-lightning==1.6.5
[pip3] pytorch-quantization==2.1.2
[pip3] torch==1.13.0a0+08820cb
[pip3] torch-tensorrt==1.2.0a0
[pip3] torchmetrics==0.9.3
[pip3] torchtext==0.13.0a0
[pip3] torchvision==0.14.0a0
[conda] mkl 2020.4 h726a3e6_304 conda-forge
[conda] mkl-include 2020.4 h726a3e6_304 conda-forge
[conda] numpy 1.23.1 pypi_0 pypi
[conda] pytorch-lightning 1.6.5 pypi_0 pypi
[conda] pytorch-quantization 2.1.2 pypi_0 pypi
[conda] torch 1.13.0a0+08820cb pypi_0 pypi
[conda] torch-tensorrt 1.2.0a0 pypi_0 pypi
[conda] torchmetrics 0.9.3 pypi_0 pypi
[conda] torchtext 0.13.0a0 pypi_0 pypi
[conda] torchvision 0.14.0a0 pypi_0 pypi

@domef domef added the bug Something isn't working label Aug 3, 2022
@narendasan narendasan added the channel: NGC Issues related to NGC label Aug 3, 2022
bowang007 (Collaborator) commented Aug 5, 2022

@domef, could you check whether #1236 fixes your bug?


5 participants