
Automatic Fallback #406

Merged
merged 56 commits into from Apr 30, 2021

Conversation

@bowang007 bowang007 (Collaborator) commented Mar 19, 2021

Description

Implemented automatic fallback. This feature allows the compiler to identify which operations are supported by TRTorch, segment the graph accordingly, compile each supported segment into a TensorRT engine, and then link the TorchScript and TRTorch engine segments back together.

  • New feature
  • This change requires a documentation update
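The segmentation step described above can be sketched in plain C++, independent of Torch. `Segment`, `is_supported`, and `partition` below are hypothetical stand-ins, not the PR's actual types: the real partitioner consults the converter registry to decide support and also honors settings like `min_block_size`, which this sketch omits.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins: each "node" is just an op name, and is_supported()
// models the converter-registry lookup the real compiler performs.
struct Segment {
  bool trt;                        // true => compile with TensorRT, false => run in TorchScript
  std::vector<std::string> ops;
};

bool is_supported(const std::string& op) {
  // Assumed support set, for this sketch only.
  return op == "aten::conv2d" || op == "aten::relu" || op == "aten::add";
}

// Greedy partitioning: consecutive nodes with the same support status are
// grouped into one segment, mirroring the supported/unsupported split the PR
// performs before compiling each TensorRT segment.
std::vector<Segment> partition(const std::vector<std::string>& ops) {
  std::vector<Segment> segments;
  for (const auto& op : ops) {
    bool trt = is_supported(op);
    if (segments.empty() || segments.back().trt != trt) {
      segments.push_back({trt, {}});
    }
    segments.back().ops.push_back(op);
  }
  return segments;
}
```

An unsupported op in the middle of a supported run splits the graph into three segments: TensorRT, Torch, TensorRT.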

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes

@bowang007 bowang007 requested review from narendasan and peri044 March 19, 2021 06:00
@github-actions github-actions bot added component: api [Python] Issues re: Python API component: api [C++] Issues re: C++ API component: conversion Issues re: Conversion stage component: core Issues re: The core compiler component: lowering Issues re: The lowering / preprocessing passes labels Mar 19, 2021
@github-actions github-actions bot left a comment

Code conforms to C++ style guidelines

@github-actions github-actions bot left a comment

Code conforms to Python style guidelines


@peri044 peri044 (Collaborator) left a comment

Took an initial pass (80% complete); still have some functions left to review.
In general, the implementation looks good.

  • As discussed before, some functions can be restructured so that test cases can call them directly.
  • Add comments (with an example) at certain places (e.g. the stitching code) to make them easier for future readers to understand.

// Get required metadata about the engine out
auto num_io = engine_ptr->num_io;
auto name = engine_ptr->name;

//..
Collaborator left a comment

can be removed.

if (method.name().rfind("_", 0)) {
auto new_g = std::make_shared<torch::jit::Graph>();
auto graph_and_parameters = lowering::Lower(mod, method.name());
// LOG_INFO(*(method.graph()) << "Original graph\n");
Collaborator left a comment

can be removed

std::shared_ptr<torch::jit::Graph> g,
std::vector<conversion::InputRange>& input_ranges,
const conversion::TorchFallback& fallback_info) {
auto min_block_size = fallback_info.min_block_size;
Collaborator left a comment

Renaming this to min_segment_size might be better.

Collaborator left a comment

I think min_block_size is fine; we can consider the actual API more as we get closer to merge.

segmented_blocks.emplace_back(SegmentedBlock::kTorch, pytorch_nodes);
}

// register input/output torch::jit::Value for segmetned graphs
Collaborator left a comment

segmented*

auto* block = graph->block();
auto env = [&](torch::jit::Value* v) { return getOrAddInputForValue(v, graph, old_to_new); };

auto new_node = block->appendNode(graph->createClone(node, env));
Collaborator left a comment

It would be helpful if you could add comments on internal PyTorch functions like graph->createClone and what their arguments actually mean. What does env represent here?

Collaborator left a comment

env looks like it produces the new torch::jit::Value that carries the same metadata as the node we are cloning from? This should be given a clearer name.

Collaborator (Author) left a comment

Yes, graph->createClone is used to create a node in the current graph by copying all the metadata from a node in another graph. env here is a value map that translates the inputs of the original node into the inputs of the cloned node. I will comment this part for better understanding.
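The role of env can be modeled without Torch at all: cloning copies a node's structure, while a remapping callback translates each input value from the old graph into its counterpart in the new one. `Value`, `Node`, and `clone_node` below are hypothetical stand-ins for the PyTorch types, not the real API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Toy model of graph->createClone(node, env): Value and Node are simplified
// stand-ins for torch::jit::Value / torch::jit::Node.
struct Value { std::string name; };
struct Node  { std::string kind; std::vector<Value*> inputs; };

// env maps a value from the original graph to the corresponding value in the
// new (segmented) graph -- in the PR, getOrAddInputForValue plays this role,
// creating a graph input on first use.
Node clone_node(const Node& node, const std::function<Value*(Value*)>& env) {
  Node cloned{node.kind, {}};
  for (Value* in : node.inputs) {
    cloned.inputs.push_back(env(in));  // translate old input -> new input
  }
  return cloned;
}
```

The cloned node keeps the original kind but its inputs now point into the new graph's values.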

// find the corresponding raw values in original global graph for this segmented block's inputs/outputs
std::set<torch::jit::Value*> input_values;
for (auto& seg_block : segmented_blocks) {
seg_block.registerInputs();
Collaborator left a comment

Can you explain why we have this registerInputs function?
It looks like when we call SegmentedBlock(kTensorRT, tensorrt_nodes), it creates a mini_graph g_ with all nodes appended.
Do we actually need a separate registerInputs() to add the graph inputs to an inputs_ vector?
Is there a problem with doing this in the constructor itself?

@bowang007 bowang007 (Collaborator, Author) commented Mar 26, 2021

That's a good point. I think I did it this way because I didn't have the SegmentedBlock(kTensorRT, tensorrt_nodes) constructor at first, so I had to construct the SegmentedBlock first and then append nodes one by one; only after all of that is done can we identify all the required inputs and register them.
We could now move registering inputs into the constructors. However, for the feature we discussed yesterday (converting Int input values to nodes) we may still need to prepend some nodes to some mini-graphs, and those nodes may depend on some tensors, so we would need to add input values after construction is done. So maybe we still need registerInputs()?

Comment on lines 166 to 174
for (auto& mini_graph_input : input_values) {
for (auto& seg_block : segmented_blocks) {
if (std::find(seg_block.raw_inputs().begin(), seg_block.raw_inputs().end(), mini_graph_input) ==
seg_block.raw_inputs().end() &&
seg_block.contain_raw_input(mini_graph_input)) {
seg_block.registerOutput(mini_graph_input);
}
}
}
Collaborator left a comment

Can you add a comment with an example of what is happening here? This seems to be the stitching code.

Collaborator (Author) left a comment

Yes, I will add comments for this part. This piece of code identifies each mini-graph's outputs: we go through all the input values needed by the mini-graphs, and if an input value is produced inside a block, we register that value as one of the block's outputs.
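The output-registration step described above can be modeled with a dependency-free sketch. `Block` and `register_outputs` are hypothetical stand-ins: in the PR the values are `torch::jit::Value*` and the blocks are `SegmentedBlock`s, but the logic mirrors the quoted loop.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Simplified model of a segmented block: which values it produces, and which
// it lists as inputs/outputs. Strings stand in for torch::jit::Value*.
struct Block {
  std::set<std::string> produced;   // values defined inside this block
  std::vector<std::string> inputs;
  std::vector<std::string> outputs;
  bool contains(const std::string& v) const { return produced.count(v) > 0; }
};

// For every value some block needs as an input, the block that produced it
// must expose it as an output -- the stitching precondition described above.
void register_outputs(std::vector<Block>& blocks, const std::set<std::string>& needed_inputs) {
  for (const auto& v : needed_inputs) {
    for (auto& b : blocks) {
      bool is_own_input =
          std::find(b.inputs.begin(), b.inputs.end(), v) != b.inputs.end();
      if (!is_own_input && b.contains(v)) {
        b.outputs.push_back(v);  // expose v so the consuming block can read it
      }
    }
  }
}
```

For example, if block 0 produces "x" and block 1 consumes it, "x" is registered as an output of block 0.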

// store the mapping from lowering graph torch::jit::Value => torch::jit::IValue that we get by running segments
std::unordered_map<torch::jit::Value*, torch::jit::IValue> ivalues_maps;

std::vector<torch::jit::IValue> random_inputs = generateRandomInputs(input_ranges);
Collaborator left a comment

Should we move this generateRandomInputs function outside of the fallback code for more general usage, e.g. into utils?

Collaborator (Author) left a comment

Yes, I think that's fine. BTW, I found that my getFunctionSchema function in partitioning.cpp is the same as the GenerateGraphSchema function in compiler.cpp; I'm not sure whether we should delete both of them and move one copy to utils?
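What a generateRandomInputs-style helper does can be sketched without Torch: pick a shape from each input range and fill a buffer with random values for shape analysis. `InputRange` and `generate_random_inputs` here are hypothetical stand-ins; the real function works with `conversion::InputRange` and returns `torch::jit::IValue` tensors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Stand-in for conversion::InputRange (min/opt/max shapes). Only the opt
// shape is kept here, which is a natural choice for generating sample inputs.
struct InputRange { std::vector<int64_t> opt_shape; };

// Produce one random float buffer per input range. Plain vectors keep the
// sketch dependency-free; the values only need to be plausible, since they
// are used to run the segments and record intermediate shapes.
std::vector<std::vector<float>> generate_random_inputs(const std::vector<InputRange>& ranges) {
  std::mt19937 rng(0);  // fixed seed: reproducible shape analysis
  std::uniform_real_distribution<float> dist(-1.f, 1.f);
  std::vector<std::vector<float>> inputs;
  for (const auto& r : ranges) {
    std::size_t numel = 1;
    for (auto d : r.opt_shape) numel *= static_cast<std::size_t>(d);
    std::vector<float> buf(numel);
    for (auto& x : buf) x = dist(rng);
    inputs.push_back(std::move(buf));
  }
  return inputs;
}
```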

Comment on lines 65 to 77
c10::ArrayRef<torch::jit::Value*> inputs() {
return g_->inputs();
}

c10::ArrayRef<torch::jit::Value*> outputs() {
return g_->outputs();
}

const std::vector<torch::jit::Value*>& raw_inputs() const {
return inputs_;
}

const std::vector<torch::jit::Value*>& raw_outputs() const {
Collaborator left a comment

What would be the difference between raw_inputs() and inputs() in practice? Do we need both variants?

Collaborator (Author) left a comment

raw_inputs() returns the torch::jit::Values in the lowered global graph that are used as inputs to our mini-graph. When we construct the mini-graphs, these Values change because we are building a new graph. We need both: when we stitch the mini-graphs together, we use raw_inputs() to find the mappings, and we use inputs() to run inference on the segments and to do things like replaceAllUsesWith() for mini-graph inputs.



@bowang007 bowang007 requested a review from narendasan April 26, 2021 20:00

@github-actions github-actions bot left a comment

There are some changes that do not conform to C++ style guidelines:

diff --git a/workspace/core/partitioning/shape_analysis.cpp b/tmp/changes.txt
index 9727b11..42a0165 100644
--- a/workspace/core/partitioning/shape_analysis.cpp
+++ b/tmp/changes.txt
@@ -61,7 +61,7 @@ void getSegmentsOutputByRunning(
      jit_inputs_ivalues.push_back(ivalues_maps[input].toBool());
    } else if (input->type()->kind() == torch::jit::TypeKind::ListType) {
      jit_inputs_ivalues.push_back(ivalues_maps[input].toList());
-    } else if (input->type()->kind() == torch::jit::TypeKind::TupleType){
+    } else if (input->type()->kind() == torch::jit::TypeKind::TupleType) {
      jit_inputs_ivalues.push_back(ivalues_maps[input].toTuple());
    } else {
      TRTORCH_THROW_ERROR("Unable to find type for value: " << input->debugName() << " to get the ivalues.\n");
ERROR: Some files do not conform to style guidelines



Signed-off-by: Bo Wang <[email protected]>


Signed-off-by: Bo Wang <[email protected]>
@github-actions github-actions bot left a comment

There are some changes that do not conform to C++ style guidelines:

diff --git a/workspace/core/partitioning/partitioning.cpp b/tmp/changes.txt
index f546b02..abdfbb2 100644
--- a/workspace/core/partitioning/partitioning.cpp
+++ b/tmp/changes.txt
@@ -204,17 +204,16 @@ void registerSegmentsOutputs(PartitionedGraph& segmented_blocks, std::shared_ptr
      }
    }
  }
-  std::for_each(
-      segmented_blocks.begin(),
-      segmented_blocks.end(),
-      [](SegmentedBlock& seg_block) { torch::jit::EliminateDeadCode(seg_block.g()); });
-      // erase segments which still have no output
-      segmented_blocks.erase(
-          std::remove_if(
-              segmented_blocks.begin(),
-              segmented_blocks.end(),
-              [](SegmentedBlock& seg_block) { return seg_block.raw_outputs().empty(); }),
-          segmented_blocks.end());
+  std::for_each(segmented_blocks.begin(), segmented_blocks.end(), [](SegmentedBlock& seg_block) {
+    torch::jit::EliminateDeadCode(seg_block.g());
+  });
+  // erase segments which still have no output
+  segmented_blocks.erase(
+      std::remove_if(
+          segmented_blocks.begin(),
+          segmented_blocks.end(),
+          [](SegmentedBlock& seg_block) { return seg_block.raw_outputs().empty(); }),
+      segmented_blocks.end());

  return;
}
ERROR: Some files do not conform to style guidelines


Signed-off-by: Bo Wang <[email protected]>




}
}
}
// erase segments which still have no output
Collaborator left a comment

Yeah we should do that

const std::string& serialized_engine,
int engine_id = 0) {
auto engine_ptr =
c10::make_intrusive<runtime::TRTEngine>(mod._ivalue()->name() + std::to_string(engine_id), serialized_engine);
Collaborator left a comment

Does engine_id just need to be unique, or do we use the ids elsewhere?

If they only need to be unique, we should use the pointer trick to get something that is likely to be unique, so we don't really need to worry about conflicts.
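The "pointer trick" mentioned above can be illustrated with a minimal sketch: derive the id from the address of an object that is unique per live engine, so no global counter or registry is needed. `EngineStub` and `engine_name` are hypothetical stand-ins for the real `TRTEngine` naming code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for the real runtime::TRTEngine.
struct EngineStub { std::string serialized; };

// Distinct live objects have distinct addresses, so the suffix is unique
// among engines that exist at the same time -- no counter needed.
std::string engine_name(const std::string& base, const EngineStub& engine) {
  auto id = reinterpret_cast<std::uintptr_t>(&engine);
  return base + "_engine_" + std::to_string(id);
}
```

Note this only guarantees uniqueness among simultaneously live engines; an address can be reused after an engine is destroyed.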

for (auto& seg_block : segmented_blocks) {
LOG_INFO(*g << "(MiniGraphInSegmentedBlock)\n");
if (seg_block.target() == partitioning::SegmentedBlock::kTensorRT) {
std::vector<ir::InputRange> input_ranges;
Collaborator left a comment

That's probably higher priority than loops then, since we have unrolling that can be enabled. I also think it's pretty achievable in the time we have.



tests BUILD files

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
@narendasan narendasan merged commit e81367b into master Apr 30, 2021


Labels
component: api [C++] Issues re: C++ API component: api [Python] Issues re: Python API component: conversion Issues re: Conversion stage component: core Issues re: The core compiler component: evaluators Issues re: Specific op evaluators component: lowering Issues re: The lowering / preprocessing passes component: tests Issues re: Tests
4 participants