🐛 [Bug] Encountered bug when using Torch-TensorRT #1687

zshn25 · 2023-02-21T14:28:22Z

Compiling a scripted model with torch_tensorrt.compile fails at torch.nn.functional.interpolate(x, scale_factor=2.0, mode="nearest") with the following error:

RuntimeError                              Traceback (most recent call last)
Cell In[14], line 8
      2 scripted_model = torch.jit.script(model.to("cuda"), input_image_pytorch.to("cuda"))
      4 # trt_encoder = torch_tensorrt.compile(torch.jit.script(encoder).to("cuda"), 
      5 #     inputs= [input_image_pytorch.to("cuda")],
      6 #     enabled_precisions= { torch.float32} # Run with FP16
      7 # )
----> 8 trt_model = torch_tensorrt.compile(scripted_model.to("cuda"), 
      9     inputs=  [input_image_pytorch.to("cuda")], # [torch_tensorrt.Input([1,3,480,768])], # [input_image_pytorch.detach().to("cuda")],
     10     enabled_precisions= { torch.float32} # Run with FP16
     11 # trt_decoder = torch_tensorrt.compile(torch.jit.script(depth_decoder).to("cuda"), 
     12 #     inputs= [features],
     13     # enabled_precisions= { torch.float32} # Run with FP16
     14 )

File ~/miniconda3/envs/inference/lib/python3.8/site-packages/torch_tensorrt/_compile.py:125, in compile(module, ir, inputs, enabled_precisions, **kwargs)
    120         logging.log(
    121             logging.Level.Info,
    122             "Module was provided as a torch.nn.Module, trying to script the module with torch.jit.script. In the event of a failure please preconvert your module to TorchScript",
    123         )
    124         ts_mod = torch.jit.script(module)
--> 125     return torch_tensorrt.ts.compile(
    126         ts_mod, inputs=inputs, enabled_precisions=enabled_precisions, **kwargs
    127     )
    128 elif target_ir == _IRType.fx:
    129     if (
    130         torch.float16 in enabled_precisions
    131         or torch_tensorrt.dtype.half in enabled_precisions
    132     ):

File ~/miniconda3/envs/inference/lib/python3.8/site-packages/torch_tensorrt/ts/_compiler.py:136, in compile(module, inputs, input_signature, device, disable_tf32, sparse_weights, enabled_precisions, refit, debug, capability, num_avg_timing_iters, workspace_size, dla_sram_size, dla_local_dram_size, dla_global_dram_size, calibrator, truncate_long_and_double, require_full_compilation, min_block_size, torch_executed_ops, torch_executed_modules)
    110     raise ValueError(
    111         f"require_full_compilation is enabled however the list of modules and ops to run in torch is not empty. Found: torch_executed_ops: {torch_executed_ops}, torch_executed_modules: {torch_executed_modules}"
    112     )
    114 spec = {
    115     "inputs": inputs,
    116     "input_signature": input_signature,
   (...)
    133     },
    134 }
--> 136 compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
    137 compiled_module = torch.jit._recursive.wrap_cpp_module(compiled_cpp_mod)
    138 return compiled_module

RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "/home/user/miniconda3/envs/inference/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 21, in forward
        x = self.quant(x)
        x = self.encoder(x)
        x = self.decoder(x)
            ~~~~~~~~~~~~ <--- HERE
        x = self.dequant(x[0])
        return x
  File "/home/user/playground/networks/depth_decoder_for_conversion.py", line 116, in forward
    
            # upsample and horizontal connections
            x = torch.nn.functional.interpolate(x, scale_factor=2.0, mode="nearest")
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            if self.use_skips and i > 0:
                x_concat = input_features[i - 1]
  File "/home/user/miniconda3/envs/inference/lib/python3.8/site-packages/torch/nn/functional.py", line 3922, in interpolate
        return torch._C._nn.upsample_nearest1d(input, output_size, scale_factors)
    if input.dim() == 4 and mode == "nearest":
        return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    if input.dim() == 5 and mode == "nearest":
        return torch._C._nn.upsample_nearest3d(input, output_size, scale_factors)
RuntimeError: Expected static_cast<int64_t>(scale_factors->size()) == spatial_dimensions to be true, but got false.  (Could this error message be improved?  If so, please report an enhancement request to PyTorch.)

To Reproduce

Steps to reproduce the behavior:

import torch
import torch_tensorrt

# `model` is the reporter's network; its definition is not included in this report.
input_image_pytorch = torch.randn((1, 3, 480, 768), requires_grad=False).to("cuda").detach()

# Scripting on its own succeeds; the error is raised by torch_tensorrt.compile below.
scripted_model = torch.jit.script(model.to("cuda"), input_image_pytorch.to("cuda"))
trt_model = torch_tensorrt.compile(
    scripted_model.to("cuda"),
    inputs=[input_image_pytorch.to("cuda")],
    enabled_precisions={torch.float32},
)

Expected behavior

Compilation completes without error.

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

Torch-TensorRT Version (e.g. 1.0.0): 1.3.0
PyTorch Version (e.g. 1.0): 1.13.1
CPU Architecture: x86
OS (e.g., Linux): Ubuntu 18.04 LTS
How you installed PyTorch (conda, pip, libtorch, source): conda
Build command you used (if compiling from source):
Are you using local sources or building from archives:
Python version: 3.8
CUDA version: 11.7
GPU models and configuration:
Any other relevant information:

Additional context

The error occurs at the torch.nn.functional.interpolate(x, scale_factor=2.0, mode="nearest") line in the network definition. Scripting the model with torch.jit.script on its own works; only the Torch-TensorRT compilation fails.


gcuendet · 2023-04-04T07:40:30Z

@narendasan @bowang007 any update on this? I am observing the same behaviour.
Could this be linked to this older issue? Perhaps the fix for that older issue broke support for passing a single float as scale_factor (at least at the interpolate level: torch.nn.functional.interpolate(x, scale_factor=2.0, mode="nearest")). It is hard to tell exactly what happened with that older issue, since no PR is linked to it.
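
To make the failing check concrete: the assertion in the traceback compares the number of scale factors with the number of spatial dimensions, which is 2 for a 4-D NCHW input. The sketch below is a pure-Python model of that invariant, not PyTorch's or Torch-TensorRT's actual code; `expand_scale_factor` and `check_scale_factors` are hypothetical helper names chosen for illustration. It shows how a scalar such as 2.0 has to be expanded to one factor per spatial dimension before the check can pass, and how a one-element list for a 4-D input would trip it.

```python
from typing import List, Sequence, Union


def expand_scale_factor(
    scale_factor: Union[float, Sequence[float]], input_ndim: int
) -> List[float]:
    """Return one scale factor per spatial dimension (hypothetical helper).

    A scalar is repeated for every spatial dimension; a sequence is passed
    through unchanged (converted to floats).
    """
    spatial_dims = input_ndim - 2  # drop the batch and channel dimensions
    if isinstance(scale_factor, (int, float)):
        return [float(scale_factor)] * spatial_dims
    return [float(s) for s in scale_factor]


def check_scale_factors(scale_factors: Sequence[float], input_ndim: int) -> None:
    """Model of the check named in the error message (hypothetical helper)."""
    spatial_dims = input_ndim - 2
    if len(scale_factors) != spatial_dims:
        raise RuntimeError(
            f"expected {spatial_dims} scale factors, got {len(scale_factors)}"
        )


# A 4-D input has two spatial dimensions, so the scalar 2.0 becomes [2.0, 2.0].
factors = expand_scale_factor(2.0, input_ndim=4)
check_scale_factors(factors, input_ndim=4)  # passes silently

# A one-element list for the same input violates the invariant.
try:
    check_scale_factors([2.0], input_ndim=4)
except RuntimeError as err:
    print(f"check failed: {err}")
```

Whether Torch-TensorRT 1.3.0 actually hands the op a list of the wrong length is not established in this thread; the sketch only illustrates what the error message is asserting.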

gcuendet · 2023-04-04T11:52:14Z

Actually I see some deprecation warnings when compiling Torch-TensorRT which seem to come from interpolate_plugin (amongst other files):

In file included from [...]/core/plugins/impl/interpolate_plugin.cpp:1:0:
[...]/core/plugins/impl/interpolate_plugin.h:127:107: warning: 'IPluginV2' is deprecated [-Wdeprecated-declarations]
    nvinfer1::IPluginV2* createPlugin(const char* name, const nvinfer1::PluginFieldCollection* fc) noexcept override;
                                                                                                            ^~~~~~~~
In file included from [...]/include/NvInferLegacyDims.h:16:0,
                 from [...]/include/NvInfer.h:16,
                 from [...]/core/plugins/impl/interpolate_plugin.h:12,
                 from [...]/core/plugins/impl/interpolate_plugin.cpp:1:
[...]/include/NvInferRuntimeCommon.h:393:22: note: declared here
  class TRT_DEPRECATED IPluginV2

Could that explain why interpolate is behaving unexpectedly? Are these warnings expected?
I am using TensorRT 8.5.3.1, which, as I understand it, is the supported version for Torch-TensorRT 1.3, right?

github-actions · 2023-07-04T00:02:39Z

This issue has not seen activity for 90 days, Remove stale label or comment or this will be closed in 10 days

philippewarren · 2023-07-13T14:29:29Z

I'd like to see this issue fixed as it is preventing us from using the software for our use case currently.

philippewarren · 2023-07-20T15:59:47Z

@narendasan or @bowang007 is anyone working on this? If not, what could be done to help?

bowang007 · 2023-07-20T23:30:36Z

Hi @philippewarren, I went through this several weeks ago but I don't think we have seen such issues previously.
Could you please provide a small reproducer? Also, a more detailed log would be helpful as well.
Thanks!

philippewarren · 2023-07-24T22:19:03Z

This seems to be caused by the combination of an upsampling operation and a forward function that returns multiple values.
Here is a minimal reproducible example:

import torch
import torch.nn as nn
import torch_tensorrt


class MRE(nn.Module):
    def __init__(self):
        super(MRE, self).__init__()

        self._upsample = nn.Upsample(scale_factor=2, mode='nearest')

    def forward(self, x):
        y = self._upsample(x)

        return [x, y]


model = MRE()
x = torch.ones((1, 3, 416, 416))

model.eval()

device = torch.device('cuda')
model = model.to(device)

trt_module = torch_tensorrt.compile(
    model,
    inputs=[x.to(device)],
    enabled_precisions={torch.float},
)

torch.jit.save(trt_module, "mre.trt.pth")

The same error happens if the list returned ([x, y]) is replaced by a tuple ((x, y)).
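
For anyone checking a candidate fix against this reproducer, the shapes the two returned tensors should have can be worked out by hand. The sketch below is plain Python and assumes the documented nearest-neighbour rule that each spatial size becomes floor(size * scale_factor); `upsampled_shape` is a hypothetical helper, not part of PyTorch or Torch-TensorRT.

```python
import math
from typing import Sequence, Tuple


def upsampled_shape(input_shape: Sequence[int], scale_factor: float) -> Tuple[int, ...]:
    """Expected output shape of an N-D upsample (hypothetical helper).

    The batch and channel dimensions are unchanged; every spatial
    dimension is scaled and floored.
    """
    batch, channels, *spatial = input_shape
    scaled = tuple(math.floor(size * scale_factor) for size in spatial)
    return (batch, channels) + scaled


mre_input_shape = (1, 3, 416, 416)

# forward() returns [x, y]: x keeps the input shape, y is upsampled by 2.
expected_shapes = [mre_input_shape, upsampled_shape(mre_input_shape, 2)]
print(expected_shapes)  # [(1, 3, 416, 416), (1, 3, 832, 832)]
```

Comparing these against the shapes returned by the eager model and by a successfully compiled module would be a quick sanity check that both outputs survive compilation.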

This is the associated output:

WARNING: [Torch-TensorRT] - For input x.1, found user specified input dtype as Float32. The compiler is going to use the user setting Float32
Traceback (most recent call last):
  File "./torch-tensort-mre.py", line 31, in <module>
    trt_module = torch_tensorrt.compile(
  File "/home/philippe/.local/lib/python3.8/site-packages/torch_tensorrt/_compile.py", line 125, in compile
    return torch_tensorrt.ts.compile(
  File "/home/philippe/.local/lib/python3.8/site-packages/torch_tensorrt/ts/_compiler.py", line 136, in compile
    compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "./torch-tensort-mre.py", line 18, in forward
    def forward(self, x):
        y = self._upsample(x)
            ~~~~~~~~~~~~~~ <--- HERE
    
        return [x, y]
  File "/home/philippe/.local/lib/python3.8/site-packages/torch/nn/modules/upsampling.py", line 156, in forward
    def forward(self, input: Tensor) -> Tensor:
        return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners,
               ~~~~~~~~~~~~~ <--- HERE
                             recompute_scale_factor=self.recompute_scale_factor)
  File "/home/philippe/.local/lib/python3.8/site-packages/torch/nn/functional.py", line 3922, in interpolate
        return torch._C._nn.upsample_nearest1d(input, output_size, scale_factors)
    if input.dim() == 4 and mode == "nearest":
        return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    if input.dim() == 5 and mode == "nearest":
        return torch._C._nn.upsample_nearest3d(input, output_size, scale_factors)
RuntimeError: Expected static_cast<int64_t>(scale_factors->size()) == spatial_dimensions to be true, but got false.  (Could this error message be improved?  If so, please report an enhancement request to PyTorch.)

torch.__version__ is 1.13.1+cu117
torch_tensorrt.__version__ is 1.3.0
CUDA version is 12.1 with driver version 530.30.02 on Ubuntu 20.04.
CUDNN version is 8.9.3.28-1+cuda12.1.

github-actions · 2023-10-23T00:02:17Z

This issue has not seen activity for 90 days, Remove stale label or comment or this will be closed in 10 days

domef · 2024-11-28T09:49:49Z

any update?

zshn25 added the bug Something isn't working label Feb 21, 2023

zshn25 mentioned this issue Feb 21, 2023

RuntimeError: Expected static_cast<int64_t>(scale_factors->size()) == spatial_dimensions to be true, but got false NVIDIA/TensorRT#2696

Closed

narendasan added the component: partitioning label Feb 22, 2023

github-actions bot assigned bowang007 Feb 22, 2023

philippewarren added a commit to introlab/t-top that referenced this issue May 16, 2023

WIP use torch_tensorrt, blocked by pytorch/TensorRT#1687

d55dbe7

github-actions bot added the No Activity label Jul 4, 2023

laikhtewari removed the No Activity label Jul 20, 2023

github-actions bot added the No Activity label Oct 23, 2023

github-actions bot closed this as completed Nov 2, 2023
