Iteratively run SimplificationPipeline until code optimization converges #1261
Conversation
createTorchSimplificationPipeline is already running iteratively inside LowerToBackendContractPass after #1165. Is this still an issue?
  // inference), because Torch type promotion rules actually depend on the shape
  // of the operand.
  createTorchShapeRefinementPipeline(pm);
  for (int i = 0; i < options.maxIterations; ++i) {
maxIterations controls the iterations in LowerToBackendContractPass and should not be used this way.
createTorchSimplificationPipeline is already running iteratively inside LowerToBackendContractPass after #1165. Is this still an issue?
Yes. In my local development environment, ShapeRefinementPipeline only runs once, because satisfiesBackendContract returns true after one iteration. The output IR of the pipeline looks like this:
// -----// IR Dump After DropShapeCalculations (torch-drop-shape-calculations) //----- //
...
%5 = torch.vtensor.literal(dense<5> : tensor<si64>) : !torch.vtensor<[],si64>
%210 = torch.prim.NumToTensor.Scalar %int5 : !torch.int -> !torch.vtensor<[],unk>
%211 = torch.aten.div.Tensor_mode %210, %5, %str_4 : !torch.vtensor<[],unk>, !torch.vtensor<[],si64>, !torch.str -> !torch.vtensor<[],unk>
%212 = torch.aten.Int.Tensor %211 : !torch.vtensor<[],unk> -> !torch.int
%213 = torch.prim.ListConstruct %212, %int5, %int-1 : (!torch.int, !torch.int, !torch.int) -> !torch.list<int>
%214 = torch.aten.view %209, %213 : !torch.vtensor<[5,2048],unk>, !torch.list<int> -> !torch.vtensor<[?,5,?],unk>
...
I think the issue is related to RefineTypes. Shall we add PrimNumToTensorScalarOp to this list?

if (isa<CopyToValueTensorOp, CopyToNonValueTensorOp, AtenBatchNormOp,
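For concreteness, a minimal sketch of the proposed change, assuming the quoted branch in RefineTypes only needs the op appended to its isa<> list (the rest of the list and the branch body are elided in the quote above, so they are paraphrased here):

// Hypothetical: also cover PrimNumToTensorScalarOp in this branch so
// its result dtype gets refined like the other listed ops.
if (isa<CopyToValueTensorOp, CopyToNonValueTensorOp, AtenBatchNormOp,
        PrimNumToTensorScalarOp>(op)) {
  // ... existing dtype-propagation logic ...
}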
I think this issue is only about constant shapes. Could you explain more about RefineTypes' influence on this problem? In my experiments, type refinement for PrimNumToTensorScalarOp fails if we do so:
// -----// IR Dump After RefineTypes (torch-refine-types) //----- //
%210 = torch.prim.NumToTensor.Scalar %int5 : !torch.int -> !torch.vtensor<[],unk>
(Author force-pushed from 1adbd37 to edbe3ac.)
@silvasean The problem might be
Good point. I would recommend that we run until satisfiesBackendContract is true, and then keep iterating until no changes are made to the program (and of course not iterating more than maxIterations times). This can be done by hashing the module. I think OperationEquivalence can help with this: https://mlir.llvm.org/doxygen/structmlir_1_1OperationEquivalence.html
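A minimal sketch of that loop, assuming a hashOperation helper built on OperationEquivalence::computeHash; pm, maxIterations, and satisfiesBackendContract are the pass-internal names used elsewhere in this discussion:

// Inside LowerToBackendContractPass::runOnOperation() (sketch).
ModuleOp module = getOperation();
llvm::hash_code prevHash = hashOperation(module);
int i = 0;
do {
  if (failed(runPipeline(pm, module)))
    return signalPassFailure();
  llvm::hash_code newHash = hashOperation(module);
  // Converged: the backend contract holds and the IR stopped changing.
  if (satisfiesBackendContract(module) && newHash == prevHash)
    break;
  prevHash = newHash;
} while (++i < maxIterations);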
(Author force-pushed from edbe3ac to 0d06a56.)
Thanks for the suggestions. I have reimplemented this PR and added a new condition to control the iterative execution of SimplificationPipeline, ensuring that code optimization converges, i.e., that no further code change happens.
(Author force-pushed from 0d06a56 to 4f30ece.)
This approach also introduces problems. We need at least one additional
This problem already shows up in the CI failure. I'm not sure if there are other cases where it fails.
static llvm::hash_code hashOperation(Operation *op) {
  llvm::hash_code hash(0);
  llvm::hash_code opHash = OperationEquivalence::computeHash(
      op, OperationEquivalence::ignoreHashValue,
Can you add /*arg=*/ comments here so we know the arg names? Also, I think we should hash the types of the results so that we keep iterating if the types get refined.
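Putting both suggestions together, the helper might look roughly like this; a sketch only. The /*hashOperands=*/ and /*hashResults=*/ names match upstream MLIR's computeHash signature, and the explicit result-type folding is the requested addition (it is harmless even if computeHash already covers result types):

static llvm::hash_code hashOperation(Operation *op) {
  llvm::hash_code hash = OperationEquivalence::computeHash(
      op,
      /*hashOperands=*/OperationEquivalence::ignoreHashValue,
      /*hashResults=*/OperationEquivalence::ignoreHashValue,
      OperationEquivalence::IgnoreLocations);
  // Fold in the result types so that a pure type refinement
  // (same op, more refined result type) still changes the hash
  // and triggers another iteration.
  for (Type resultType : op->getResultTypes())
    hash = llvm::hash_combine(hash, resultType);
  return hash;
}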
static llvm::hash_code hashBlock(Block &block) {
  llvm::hash_code hash(0);
  for (Operation &op : block.getOperations()) {
    llvm::hash_code opHash = hashOperation(&op);
We should hash the block arguments too (at least their types).
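A sketch of hashBlock with that addition, reusing the hashOperation helper from the diff above:

static llvm::hash_code hashBlock(Block &block) {
  llvm::hash_code hash(0);
  // Hash the block argument types so that a refined argument type
  // changes the hash even when no operation in the block changed.
  for (BlockArgument arg : block.getArguments())
    hash = llvm::hash_combine(hash, arg.getType());
  for (Operation &op : block.getOperations())
    hash = llvm::hash_combine(hash, hashOperation(&op));
  return hash;
}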
@@ -234,7 +271,14 @@ class LowerToBackendContractPass

      if (failed(runPipeline(pm, module)))
        return signalPassFailure();
    } while (!satisfiesBackendContract(module));

    llvm::hash_code newModuleHash = hashOperation(module);
Can you add a unit test of the hashing logic?
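One hypothetical shape such a test could take; torch-mlir's actual C++ test scaffolding may differ, and the two module strings (kModuleWithUnkType, kModuleWithStaticType) are placeholders for IR that differs only in a refined result type:

TEST(ModuleHashTest, RefinedResultTypeChangesHash) {
  MLIRContext context;
  context.loadDialect<func::FuncDialect, torch::Torch::TorchDialect>();
  OwningOpRef<ModuleOp> before =
      parseSourceString<ModuleOp>(kModuleWithUnkType, &context);
  OwningOpRef<ModuleOp> after =
      parseSourceString<ModuleOp>(kModuleWithStaticType, &context);
  // If the hashes collide, iteration would stop before the type
  // refinement is propagated.
  EXPECT_NE(hashOperation(before->getOperation()),
            hashOperation(after->getOperation()));
}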
Thanks for the quick turnaround!
RefineTypes should support torch.aten.empty.memory_format -- that bug you are running into appears to be a legitimate bug -- can you file an isolated test case for RefineTypes?
I agree that after the backend contract is satisfied, only ShapeRefinementPipeline and canonicalizer should be run. That will be more efficient too. I would recommend adding a new pipeline createTorchBackendContractCleanupPipeline
with ShapeRefinementPipeline and canonicalizer. The documentation for the new pipeline would be "Clean up and optimize IR that satisfies the backend contract".
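A minimal sketch of that pipeline, following the shape of the existing createTorch*Pipeline helpers; the exact nesting and registration details are assumptions:

// "Clean up and optimize IR that satisfies the backend contract."
void createTorchBackendContractCleanupPipeline(OpPassManager &pm) {
  createTorchShapeRefinementPipeline(pm);
  pm.addNestedPass<func::FuncOp>(createCanonicalizerPass());
}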
Recently, I've been thinking about the execution order of the passes in SimplificationPipeline.
The order of the SimplificationPipeline is chosen so that it almost always finishes in one iteration. Changing the order like this will break that.
I support adding a pass that iteratively runs shape refinement and canonicalizer. But that pass should not run inside TorchSimplificationPipeline. We should clearly separate (1) lowering to the backend contract and (2) optimizing code once it satisfies the backend contract. The TorchSimplificationPipeline is part of (1) but should not be used as part of (2). (Maybe we need a better name than TorchSimplificationPipeline.)
This PR is proposed to solve the following problem.
In the motivating case, %arg1 has a constant size (5) along the 0-th dimension, so it can be deduced that the shape of %554 should be completely static; the actual output, however, is [?,5,?]. The cause is that %551 is not known to be a constant before the shape refinement pipeline runs. That, in turn, happens because the canonicalization of AtenDivTensorModeOp fails: it relies on the shape refinement pipeline itself to deduce that one of its operands, %550, is a constant.
The buggy IR pattern is the same one shown in the IR dump quoted earlier.
By executing the canonicalizer pass and shapeRefinementPipeline multiple times, this PR helps alleviate this problem.