New API to register arbitrary TRT engines as TorchScript modules #431

Merged
narendasan merged 3 commits into master from arbitrary_trt_engines on Apr 21, 2021

Conversation

narendasan (Collaborator)

Description

This PR adds the ability to register TRT engines built outside TRTorch as TorchScript modules. Modules created through this method inherit all the abilities (and limitations) of TRTorch-compiled TorchScript modules, including serialization and portability through torch.jit.load and deployment with libtrtorchrt.so.
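A minimal sketch of the workflow this enables (embed_engine_in_new_module is the API added by this PR, as exercised in the tests below; the engine source, input shape, and file name here are illustrative assumptions, not part of the PR):

import torch
import trtorch

# Assume trt_engine holds a serialized TensorRT engine, e.g. produced by
# trtorch.convert_method_to_trt_engine or by the TensorRT Python API.
trt_mod = trtorch.embed_engine_in_new_module(trt_engine)

# The returned module executes like any TRTorch-compiled TorchScript module.
out = trt_mod(torch.randn((1, 3, 224, 224)).to("cuda"))

# It also round-trips through the standard TorchScript serialization path,
# so it can be deployed anywhere libtrtorchrt.so is available.
torch.jit.save(trt_mod, "trt_engine_module.ts")
reloaded = torch.jit.load("trt_engine_module.ts")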

Fixes #46

Type of change

Please delete options that are not relevant and/or add your own.

  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes

@narendasan requested a review from peri044 (April 17, 2021 03:06)
@github-actions bot added labels: component: api [Python], component: api [C++], component: core, component: tests (April 17, 2021)
@github-actions bot left a comment

There are some changes that do not conform to Python style guidelines:

Reformatting /workspace/cpp/ptq/training/vgg16/export_ckpt.py
Reformatting /workspace/cpp/ptq/training/vgg16/vgg16.py
Reformatting /workspace/cpp/ptq/training/vgg16/main.py

@github-actions bot left a comment

There are some changes that do not conform to C++ style guidelines:

diff --git a/workspace/core/conversion/var/Var.cpp b/tmp/changes.txt
index 71b77f5..8356934 100644
--- a/workspace/core/conversion/var/Var.cpp
+++ b/tmp/changes.txt
@@ -89,7 +89,7 @@ nvinfer1::ITensor* Var::ITensorOrFreeze(ConversionCtx* ctx) {
  if (isIValue()) {
    LOG_DEBUG(ctx->logger, "Found IValue containing object of type " << *(ptr_.ivalue->type()));
  }
-  
+
  TRTORCH_CHECK(
      isITensor() || (isIValue() && (ptr_.ivalue->isTensor() || ptr_.ivalue->isCustomClass())),
      "Requested either IValue containing a Tensor, or ITensor, however Var type is " << type_name());
@@ -100,8 +100,10 @@ nvinfer1::ITensor* Var::ITensorOrFreeze(ConversionCtx* ctx) {
    if (ptr_.ivalue->isTensor()) {
      auto weights = converters::Weights();
      auto tensor = ptr_.ivalue->toTensor();
-      if ((tensor.scalar_type() == at::kLong || tensor.scalar_type() == at::kDouble) && !ctx->settings.truncate_long_and_double) {
-        TRTORCH_THROW_ERROR("Unable to freeze tensor of type Int64/Float64 into constant layer, try to compile model with truncate_long_and_double enabled");
+      if ((tensor.scalar_type() == at::kLong || tensor.scalar_type() == at::kDouble) &&
+          !ctx->settings.truncate_long_and_double) {
+        TRTORCH_THROW_ERROR(
+            "Unable to freeze tensor of type Int64/Float64 into constant layer, try to compile model with truncate_long_and_double enabled");
      } else if (tensor.scalar_type() == at::kLong && ctx->settings.truncate_long_and_double) {
        weights = converters::Weights(ctx, tensor.toType(at::kInt));
        LOG_WARNING("Truncating weight (constant in the graph) from Int64 to Int32");
@@ -111,7 +113,7 @@ nvinfer1::ITensor* Var::ITensorOrFreeze(ConversionCtx* ctx) {
      } else {
        weights = converters::Weights(ctx, tensor);
      }
-      
+
      auto const_layer = ctx->net->addConstant(weights.shape, weights.data);
      TRTORCH_CHECK(const_layer, "Unable to freeze tensor into constant layer");
      out = const_layer->getOutput(0);
ERROR: Some files do not conform to style guidelines

@peri044 (Collaborator) left a comment

LGTM

Also adds tests to confirm TRT Python API intercompatibility

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
@narendasan force-pushed the arbitrary_trt_engines branch from ff8b6bc to 88d07a9 (April 21, 2021 22:50)
@github-actions bot left a comment

There are some changes that do not conform to Python style guidelines:

Reformatting /workspace/cpp/ptq/training/vgg16/export_ckpt.py
Reformatting /workspace/cpp/ptq/training/vgg16/vgg16.py
Reformatting /workspace/cpp/ptq/training/vgg16/main.py
Reformatting /workspace/py/trtorch/ptq.py
Reformatting /workspace/py/trtorch/_compiler.py
Reformatting /workspace/py/trtorch/_compile_spec.py
Reformatting /workspace/py/trtorch/logging.py
Reformatting /workspace/py/trtorch/_types.py
Reformatting /workspace/py/trtorch/__init__.py
Reformatting /workspace/py/setup.py
--- /workspace/tests/py/test_api.py	(original)
+++ /workspace/tests/py/test_api.py	(reformatted)
@@ -45,7 +45,9 @@
        same = (trt_mod(self.input) - self.scripted_model(self.input)).abs().max()
        self.assertTrue(same < 2e-3)

+
class TestPTtoTRTtoPT(ModelTestCase):
+
    def setUp(self):
        self.input = torch.randn((1, 3, 224, 224)).to("cuda")
        self.ts_model = torch.jit.script(self.model)
@@ -66,6 +68,7 @@
        trt_mod = trtorch.embed_engine_in_new_module(trt_engine)
        same = (trt_mod(self.input) - self.ts_model(self.input)).abs().max()
        self.assertTrue(same < 2e-3)
+

class TestCheckMethodOpSupport(unittest.TestCase):

Reformatting /workspace/tests/modules/hub.py
Reformatting /workspace/tests/py/test_to_backend_api.py
Reformatting /workspace/tests/py/test_api.py
--- /workspace/tests/py/test_trt_intercompatability.py	(original)
+++ /workspace/tests/py/test_trt_intercompatability.py	(reformatted)
@@ -8,6 +8,7 @@


class TestPyTorchToTRTEngine(ModelTestCase):
+
    def setUp(self):
        self.input = torch.randn((1, 3, 224, 224)).to("cuda:0")
        self.ts_model = torch.jit.script(self.model)
@@ -32,9 +33,12 @@
            with engine.create_execution_context() as ctx:
                out = torch.empty(size=tuple(engine.get_binding_shape(1))).to("cuda:0")
                bindings = [self.input.contiguous().data_ptr(), out.contiguous().data_ptr()]
-                ctx.execute_async(batch_size=1, bindings=bindings, stream_handle=torch.cuda.current_stream(device='cuda:0').cuda_stream)
+                ctx.execute_async(batch_size=1,
+                                  bindings=bindings,
+                                  stream_handle=torch.cuda.current_stream(device='cuda:0').cuda_stream)
                same = (out - self.ts_model(self.input)).abs().max()
                self.assertTrue(same < 2e-3)
+

def test_suite():
    suite = unittest.TestSuite()
Reformatting /workspace/tests/py/test_ptq_trt_calibrator.py
Reformatting /workspace/tests/py/test_multi_gpu.py
Reformatting /workspace/tests/py/test_api_dla.py
Reformatting /workspace/tests/py/test_trt_intercompatability.py
Reformatting /workspace/tests/py/test_ptq_to_backend.py
Reformatting /workspace/tests/py/model_test_case.py
Reformatting /workspace/tests/py/test_ptq_dataloader_calibrator.py
ERROR: Some files do not conform to style guidelines

@github-actions bot left a comment

Code conforms to C++ style guidelines

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
@github-actions bot left a comment

Code conforms to Python style guidelines

@github-actions bot left a comment

Code conforms to C++ style guidelines

@narendasan merged commit 88f0e81 into master (April 21, 2021)
@narendasan deleted the arbitrary_trt_engines branch (April 21, 2021 23:16)