
Automatic Fallback #406

Merged
merged 56 commits into from Apr 30, 2021

Conversation

@bowang007 bowang007 (Collaborator) commented Mar 19, 2021

Description

Implemented automatic fallback. This feature allows the compiler to identify which operations are supported by TRTorch, segment the graph accordingly, compile each supported segment into a TensorRT engine, and then link the TorchScript and TRTorch engine segments back together.

  • New feature
  • This change requires a documentation update
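The segmentation step described above can be sketched in plain C++, independent of Torch. `Segment`, `is_supported`, and `partition` below are hypothetical stand-ins, not the PR's actual types: the real partitioner consults the converter registry to decide support and also honors settings like `min_block_size`, which this sketch omits.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins: each "node" is just an op name, and is_supported()
// models the converter-registry lookup the real compiler performs.
struct Segment {
  bool trt;                        // true => compile with TensorRT, false => run in TorchScript
  std::vector<std::string> ops;
};

bool is_supported(const std::string& op) {
  // Assumed support set, for this sketch only.
  return op == "aten::conv2d" || op == "aten::relu" || op == "aten::add";
}

// Greedy partitioning: consecutive nodes with the same support status are
// grouped into one segment, mirroring the supported/unsupported split the PR
// performs before compiling each TensorRT segment.
std::vector<Segment> partition(const std::vector<std::string>& ops) {
  std::vector<Segment> segments;
  for (const auto& op : ops) {
    bool trt = is_supported(op);
    if (segments.empty() || segments.back().trt != trt) {
      segments.push_back({trt, {}});
    }
    segments.back().ops.push_back(op);
  }
  return segments;
}
```

An unsupported op in the middle of a supported run splits the graph into three segments: TensorRT, Torch, TensorRT.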

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes

@bowang007 bowang007 requested review from narendasan and peri044 March 19, 2021 06:00
@github-actions github-actions bot added component: api [Python] Issues re: Python API component: api [C++] Issues re: C++ API component: conversion Issues re: Conversion stage component: core Issues re: The core compiler component: lowering Issues re: The lowering / preprocessing passes labels Mar 19, 2021
@github-actions github-actions bot left a comment

Code conforms to C++ style guidelines

@github-actions github-actions bot left a comment

Code conforms to Python style guidelines


@peri044 peri044 (Collaborator) left a comment

Took an initial pass (80% complete); still have some functions left to review.
In general, the implementation looks good.

  • As discussed before, some functions can be restructured so that test cases can call them directly.
  • Add comments (with an example) at certain places (e.g. the stitching code) to make them easier for future readers to understand.

// Get required metadata about the engine out
auto num_io = engine_ptr->num_io;
auto name = engine_ptr->name;

//..
Collaborator left a comment

can be removed.

if (method.name().rfind("_", 0)) {
auto new_g = std::make_shared<torch::jit::Graph>();
auto graph_and_parameters = lowering::Lower(mod, method.name());
// LOG_INFO(*(method.graph()) << "Original graph\n");
Collaborator left a comment

can be removed

std::shared_ptr<torch::jit::Graph> g,
std::vector<conversion::InputRange>& input_ranges,
const conversion::TorchFallback& fallback_info) {
auto min_block_size = fallback_info.min_block_size;
Collaborator left a comment

Renaming this to min_segment_size might be better.

Collaborator left a comment

I think min_block_size is fine; we can consider the actual API more as we get closer to merge.

segmented_blocks.emplace_back(SegmentedBlock::kTorch, pytorch_nodes);
}

// register input/output torch::jit::Value for segmetned graphs
Collaborator left a comment

segmented*

auto* block = graph->block();
auto env = [&](torch::jit::Value* v) { return getOrAddInputForValue(v, graph, old_to_new); };

auto new_node = block->appendNode(graph->createClone(node, env));
Collaborator left a comment

It would be helpful if you could add comments on internal PyTorch functions like graph->createClone and what their arguments actually mean. What does env represent here?

Collaborator left a comment

env looks like it produces the new torch::jit::Value that carries the same metadata as the node we are cloning from? This should be given a clearer name.

Collaborator (Author) left a comment

Yes, graph->createClone is used to create a node in the current graph by copying all the metadata from a node in another graph. env here is a value map that translates the inputs of the original node into the inputs of the cloned node. I will comment this part for better understanding.
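The role of env can be modeled without Torch at all: cloning copies a node's structure, while a remapping callback translates each input value from the old graph into its counterpart in the new one. `Value`, `Node`, and `clone_node` below are hypothetical stand-ins for the PyTorch types, not the real API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Toy model of graph->createClone(node, env): Value and Node are simplified
// stand-ins for torch::jit::Value / torch::jit::Node.
struct Value { std::string name; };
struct Node  { std::string kind; std::vector<Value*> inputs; };

// env maps a value from the original graph to the corresponding value in the
// new (segmented) graph -- in the PR, getOrAddInputForValue plays this role,
// creating a graph input on first use.
Node clone_node(const Node& node, const std::function<Value*(Value*)>& env) {
  Node cloned{node.kind, {}};
  for (Value* in : node.inputs) {
    cloned.inputs.push_back(env(in));  // translate old input -> new input
  }
  return cloned;
}
```

The cloned node keeps the original kind but its inputs now point into the new graph's values.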

// find the corresponding raw values in original global graph for this segmented block's inputs/outputs
std::set<torch::jit::Value*> input_values;
for (auto& seg_block : segmented_blocks) {
seg_block.registerInputs();
Collaborator left a comment

Can you explain why we have this registerInputs function?
It looks like when we call SegmentedBlock(kTensorRT, tensorrt_nodes), it creates a mini_graph g_ with all nodes appended.
Do we actually need a separate registerInputs() to add the graph inputs to an inputs_ vector?
Is there a problem with doing this in the constructor itself?

@bowang007 bowang007 (Collaborator, Author) commented Mar 26, 2021

That's a good point. I think I did it this way because I didn't have the SegmentedBlock(kTensorRT, tensorrt_nodes) constructor at first, so I had to construct the SegmentedBlock first and then append nodes one by one; only after all of that is done can we identify all the required inputs and register them.
We could now move registering inputs into the constructors. However, for the feature we discussed yesterday (converting Int input values to nodes) we may still need to prepend some nodes to some mini-graphs, and those nodes may depend on some tensors, so we would need to add input values after construction is done. So maybe we still need registerInputs()?

Comment on lines 166 to 174
for (auto& mini_graph_input : input_values) {
for (auto& seg_block : segmented_blocks) {
if (std::find(seg_block.raw_inputs().begin(), seg_block.raw_inputs().end(), mini_graph_input) ==
seg_block.raw_inputs().end() &&
seg_block.contain_raw_input(mini_graph_input)) {
seg_block.registerOutput(mini_graph_input);
}
}
}
Collaborator left a comment

Can you add a comment with an example of what is happening here? This seems to be the stitching code.

Collaborator (Author) left a comment

Yes, I will add comments for this part. This piece of code identifies each mini-graph's outputs: we go through all the input values needed by the mini-graphs, and if an input value is produced inside a block, we register that value as one of the block's outputs.
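The output-registration step described above can be modeled with a dependency-free sketch. `Block` and `register_outputs` are hypothetical stand-ins: in the PR the values are `torch::jit::Value*` and the blocks are `SegmentedBlock`s, but the logic mirrors the quoted loop.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Simplified model of a segmented block: which values it produces, and which
// it lists as inputs/outputs. Strings stand in for torch::jit::Value*.
struct Block {
  std::set<std::string> produced;   // values defined inside this block
  std::vector<std::string> inputs;
  std::vector<std::string> outputs;
  bool contains(const std::string& v) const { return produced.count(v) > 0; }
};

// For every value some block needs as an input, the block that produced it
// must expose it as an output -- the stitching precondition described above.
void register_outputs(std::vector<Block>& blocks, const std::set<std::string>& needed_inputs) {
  for (const auto& v : needed_inputs) {
    for (auto& b : blocks) {
      bool is_own_input =
          std::find(b.inputs.begin(), b.inputs.end(), v) != b.inputs.end();
      if (!is_own_input && b.contains(v)) {
        b.outputs.push_back(v);  // expose v so the consuming block can read it
      }
    }
  }
}
```

For example, if block 0 produces "x" and block 1 consumes it, "x" is registered as an output of block 0.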

// store the mapping from lowering graph torch::jit::Value => torch::jit::IValue that we get by running segments
std::unordered_map<torch::jit::Value*, torch::jit::IValue> ivalues_maps;

std::vector<torch::jit::IValue> random_inputs = generateRandomInputs(input_ranges);
Collaborator left a comment

Should we move this generateRandomInputs function outside of the fallback code for more general usage, e.g. into utils?

Collaborator (Author) left a comment

Yes, I think that's fine. BTW, I found that my getFunctionSchema function in partitioning.cpp is the same as the GenerateGraphSchema function in compiler.cpp; I'm not sure whether we should delete both of them and move one copy to utils?
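What a generateRandomInputs-style helper does can be sketched without Torch: pick a shape from each input range and fill a buffer with random values for shape analysis. `InputRange` and `generate_random_inputs` here are hypothetical stand-ins; the real function works with `conversion::InputRange` and returns `torch::jit::IValue` tensors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Stand-in for conversion::InputRange (min/opt/max shapes). Only the opt
// shape is kept here, which is a natural choice for generating sample inputs.
struct InputRange { std::vector<int64_t> opt_shape; };

// Produce one random float buffer per input range. Plain vectors keep the
// sketch dependency-free; the values only need to be plausible, since they
// are used to run the segments and record intermediate shapes.
std::vector<std::vector<float>> generate_random_inputs(const std::vector<InputRange>& ranges) {
  std::mt19937 rng(0);  // fixed seed: reproducible shape analysis
  std::uniform_real_distribution<float> dist(-1.f, 1.f);
  std::vector<std::vector<float>> inputs;
  for (const auto& r : ranges) {
    std::size_t numel = 1;
    for (auto d : r.opt_shape) numel *= static_cast<std::size_t>(d);
    std::vector<float> buf(numel);
    for (auto& x : buf) x = dist(rng);
    inputs.push_back(std::move(buf));
  }
  return inputs;
}
```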

Comment on lines 65 to 77
c10::ArrayRef<torch::jit::Value*> inputs() {
return g_->inputs();
}

c10::ArrayRef<torch::jit::Value*> outputs() {
return g_->outputs();
}

const std::vector<torch::jit::Value*>& raw_inputs() const {
return inputs_;
}

const std::vector<torch::jit::Value*>& raw_outputs() const {
Collaborator left a comment

What would be the difference between raw_inputs() and inputs() in practice? Do we need both variants?

Collaborator (Author) left a comment

raw_inputs() returns the torch::jit::Values in the lowered global graph that are used as inputs to our mini-graph. When we construct the mini-graphs, these Values change because we are building a new graph. We need both: when we stitch the mini-graphs together, we use raw_inputs() to find the mappings, and we use inputs() to run inference on the segments and to do things like replaceAllUsesWith() for mini-graph inputs.



@bowang007 bowang007 requested a review from narendasan April 26, 2021 20:00

@github-actions github-actions bot left a comment

There are some changes that do not conform to C++ style guidelines:

diff --git a/workspace/core/partitioning/shape_analysis.cpp b/tmp/changes.txt
index 9727b11..42a0165 100644
--- a/workspace/core/partitioning/shape_analysis.cpp
+++ b/tmp/changes.txt
@@ -61,7 +61,7 @@ void getSegmentsOutputByRunning(
      jit_inputs_ivalues.push_back(ivalues_maps[input].toBool());
    } else if (input->type()->kind() == torch::jit::TypeKind::ListType) {
      jit_inputs_ivalues.push_back(ivalues_maps[input].toList());
-    } else if (input->type()->kind() == torch::jit::TypeKind::TupleType){
+    } else if (input->type()->kind() == torch::jit::TypeKind::TupleType) {
      jit_inputs_ivalues.push_back(ivalues_maps[input].toTuple());
    } else {
      TRTORCH_THROW_ERROR("Unable to find type for value: " << input->debugName() << " to get the ivalues.\n");
ERROR: Some files do not conform to style guidelines



Signed-off-by: Bo Wang <[email protected]>


Signed-off-by: Bo Wang <[email protected]>
@github-actions github-actions bot left a comment

There are some changes that do not conform to C++ style guidelines:

diff --git a/workspace/core/partitioning/partitioning.cpp b/tmp/changes.txt
index f546b02..abdfbb2 100644
--- a/workspace/core/partitioning/partitioning.cpp
+++ b/tmp/changes.txt
@@ -204,17 +204,16 @@ void registerSegmentsOutputs(PartitionedGraph& segmented_blocks, std::shared_ptr
      }
    }
  }
-  std::for_each(
-      segmented_blocks.begin(),
-      segmented_blocks.end(),
-      [](SegmentedBlock& seg_block) { torch::jit::EliminateDeadCode(seg_block.g()); });
-      // erase segments which still have no output
-      segmented_blocks.erase(
-          std::remove_if(
-              segmented_blocks.begin(),
-              segmented_blocks.end(),
-              [](SegmentedBlock& seg_block) { return seg_block.raw_outputs().empty(); }),
-          segmented_blocks.end());
+  std::for_each(segmented_blocks.begin(), segmented_blocks.end(), [](SegmentedBlock& seg_block) {
+    torch::jit::EliminateDeadCode(seg_block.g());
+  });
+  // erase segments which still have no output
+  segmented_blocks.erase(
+      std::remove_if(
+          segmented_blocks.begin(),
+          segmented_blocks.end(),
+          [](SegmentedBlock& seg_block) { return seg_block.raw_outputs().empty(); }),
+      segmented_blocks.end());

  return;
}
ERROR: Some files do not conform to style guidelines


Signed-off-by: Bo Wang <[email protected]>




}
}
}
// erase segments which still have no output
Collaborator left a comment

Yeah we should do that

const std::string& serialized_engine,
int engine_id = 0) {
auto engine_ptr =
c10::make_intrusive<runtime::TRTEngine>(mod._ivalue()->name() + std::to_string(engine_id), serialized_engine);
Collaborator left a comment

Does engine_id just need to be unique, or do we use the ids elsewhere?

If they only need to be unique, we should use the pointer trick to get something that is likely to be unique, so we don't really need to worry about conflicts.
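The "pointer trick" mentioned above can be illustrated with a minimal sketch: derive the id from the address of an object that is unique per live engine, so no global counter or registry is needed. `EngineStub` and `engine_name` are hypothetical stand-ins for the real `TRTEngine` naming code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for the real runtime::TRTEngine.
struct EngineStub { std::string serialized; };

// Distinct live objects have distinct addresses, so the suffix is unique
// among engines that exist at the same time -- no counter needed.
std::string engine_name(const std::string& base, const EngineStub& engine) {
  auto id = reinterpret_cast<std::uintptr_t>(&engine);
  return base + "_engine_" + std::to_string(id);
}
```

Note this only guarantees uniqueness among simultaneously live engines; an address can be reused after an engine is destroyed.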

for (auto& seg_block : segmented_blocks) {
LOG_INFO(*g << "(MiniGraphInSegmentedBlock)\n");
if (seg_block.target() == partitioning::SegmentedBlock::kTensorRT) {
std::vector<ir::InputRange> input_ranges;
Collaborator left a comment

That's probably higher priority than loops then, since we have unrolling that can be enabled. I also think it's pretty achievable in the time we have.



tests BUILD files

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
@narendasan narendasan merged commit e81367b into master Apr 30, 2021


Labels
component: api [C++] Issues re: C++ API component: api [Python] Issues re: Python API component: conversion Issues re: Conversion stage component: core Issues re: The core compiler component: evaluators Issues re: Specific op evaluators component: lowering Issues re: The lowering / preprocessing passes component: tests Issues re: Tests
4 participants