
✨[Feature] Setting the input data type of models, such as INT32, is not supported #388


Closed
ruoqianguo opened this issue Mar 4, 2021 · 5 comments · Fixed by #510
Labels: feature request (New feature or request)

@ruoqianguo
Contributor

Is your feature request related to a problem? Please describe.

When I tried to use TRTorch to convert PyTorch models whose inputs have data type INT32, such as BERT, it failed during model.forward(inputs). The error message is below. It appears that the TRT model's input data type is tied to op_precision.

Describe the solution you'd like

I would like to add an interface for specifying the data type of each input.
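For illustration, a minimal sketch of what such an interface could look like, assuming a per-input dtype field on the compile spec (the field name input_dtypes is hypothetical, not an existing TRTorch API):

// Hypothetical sketch: extending the compile spec with per-input dtypes.
// `input_dtypes` does not exist in TRTorch; it is illustrative only.
trtorch::CompileSpec spec(
    std::vector<trtorch::CompileSpec::InputRange>{in1.sizes().vec(), in2.sizes().vec()});
spec.input_dtypes = {torch::kInt32, torch::kInt32};  // hypothetical field
auto trt_mod = trtorch::CompileGraph(mod, spec);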

Describe alternatives you've considered

Additional context

// Build two INT32 input tensors (e.g. BERT-style token ids).
torch::Tensor in1 = torch::randint(0, 4, {128, 203}, torch::kCUDA).to(torch::kInt32);
torch::Tensor in2 = torch::randint(0, 4, {128, 203}, torch::kCUDA).to(torch::kInt32);
std::vector<at::Tensor> inputs_trt;
inputs_trt.push_back(in1);
inputs_trt.push_back(in2);

// Wrap the tensors as IValues for Module::forward().
std::vector<torch::jit::IValue> inputs_trt_;
for (const auto& in : inputs_trt) {
  inputs_trt_.push_back(torch::jit::IValue(in.clone()));
}

torch::jit::Module mod;
try {
  // Deserialize the ScriptModule from a file using torch::jit::load().
  mod = torch::jit::load(path);
} catch (const c10::Error& e) {
  std::cerr << "error loading the model\n";
}
mod.eval();
mod.to(torch::kCUDA);

// Compile with TRTorch; only input shapes can be specified here, not dtypes.
auto trt_mod = trtorch::CompileGraph(
    mod, std::vector<trtorch::CompileSpec::InputRange>{in1.sizes().vec(), in2.sizes().vec()});
auto trt_out = trt_mod.forward(inputs_trt_);

When executing trt_mod.forward(), the following error appears:
[screenshot of the runtime error message]

ruoqianguo added the feature request label on Mar 4, 2021
@peri044
Collaborator

peri044 commented Mar 5, 2021

@narendasan
I discovered the same bug while writing test cases for aten::cast and only had a workaround for those tests. TRTorch assumes the input is float via ctx->input_type.

I agree we need to provide an interface for specifying the data types of input tensors.

Also, TRT supports the INT32 input data type. When I had two inputs (float32, int32) to the graph, TRT worked fine. I have not encountered a case with only int32 inputs yet, but I believe TRT supports it: https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/classnvinfer1_1_1_i_network_definition.html#a06a61f560bdf6197afd3368937f62025
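For reference, a minimal sketch of declaring an INT32 network input through the raw TensorRT C++ API (the tensor name "input_ids" and the dimensions are made up for illustration):

#include "NvInfer.h"

// Declare an INT32-typed input on an existing INetworkDefinition.
// The name and shape here are illustrative only.
nvinfer1::ITensor* input_ids = network->addInput(
    "input_ids",                     // tensor name
    nvinfer1::DataType::kINT32,      // requested input dtype
    nvinfer1::Dims2{128, 203});      // batch size x sequence length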

@narendasan
Collaborator

I think those (float32, int32) cases are typically ones where the int32 inputs are shape tensors. I am not sure there is actual input data in int32 (I could be completely wrong).

@borisfom
Collaborator

Oh yes, there is actual input data in INT32 (and even INT64) for BERT and other transformers.
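For instance, BERT-style inputs are token ids and attention masks, i.e. real integer data rather than shape tensors (a small libtorch sketch; the vocabulary size and shapes are illustrative):

// BERT-style inputs: token ids and an attention mask, both integer data.
torch::Tensor input_ids = torch::randint(0, 30522, {1, 128}, torch::kInt64);  // vocab ids
torch::Tensor attention_mask = torch::ones({1, 128}, torch::kInt64);          // valid-token mask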

@narendasan
Collaborator

#412

@ncomly-nvidia
Contributor

Comment added by Nick Comly in Aha!

If Input provided: Use specified precision

Else: use engine precision (INT8 uses FP32 input)
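A minimal sketch of that resolution rule as code (a hypothetical helper, assuming the spec carries an optional per-input dtype):

#include <NvInfer.h>
#include <c10/util/Optional.h>

// Hypothetical helper: pick the dtype for a network input.
nvinfer1::DataType resolve_input_dtype(
    c10::optional<nvinfer1::DataType> user_dtype,  // from the proposed Input spec
    nvinfer1::DataType engine_precision) {
  if (user_dtype) {
    return *user_dtype;  // Input provided: use the specified precision
  }
  // Otherwise fall back to engine precision; INT8 engines still take FP32 inputs.
  return engine_precision == nvinfer1::DataType::kINT8
      ? nvinfer1::DataType::kFLOAT
      : engine_precision;
}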
