Skip to content

[CIR][Lowering] Introduce HoistAllocasPass #887

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 9, 2024

Conversation

ChuanqiXu9
Copy link
Member

@ChuanqiXu9 ChuanqiXu9 commented Sep 26, 2024

Close #883. See the above issue for details

Copy link
Member

@Lancern Lancern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we invent a new pass for this, or just include it in the FlattenCFG pass?

@bcardosolopes
Copy link
Member

bcardosolopes commented Sep 26, 2024

Should we invent a new pass for this, or just include it in the FlattenCFG pass?

Yea, like I said in the issue, this can be done as we unwrap the scopes, no need to have a dedicated pass. Since FlattenCFG starts with the innermost, in theory we just need to keep hoisting to the parent entry block, naturally they are all gonna converge into the function entry block.

EDIT: thinking more about this, iterating the hoisting until it reaches the parent basic block will actually involve teaching all different OpRewritePattern in FlattenCFG into it. What I suggest instead is to apply the same logic you are doing here to a specific FuncOp OpRewritePattern within FlattenCFG, example (see FuncOp added at the end):

  // Collect operations to apply patterns.
  SmallVector<Operation *, 16> ops;
  getOperation()->walk<mlir::WalkOrder::PostOrder>([&](Operation *op) {
    if (isa<IfOp, ScopeOp, SwitchOp, LoopOpInterface, TernaryOp, TryOp, FuncOp>(op))
      ops.push_back(op);
  });

I think this is what @Lancern is originally referring to.

@ChuanqiXu9 ChuanqiXu9 changed the title [CIR] [Lowering] Introduce HoistAllocas Pass [CIR] [FlattenCFG] Hoist allocas to entry block of FuncOP in FlattenCFGPass Sep 29, 2024
@ChuanqiXu9 ChuanqiXu9 marked this pull request as ready for review September 29, 2024 07:44
@ChuanqiXu9 ChuanqiXu9 requested a review from lanza as a code owner September 29, 2024 07:44
@ChuanqiXu9
Copy link
Member Author

Should we invent a new pass for this, or just include it in the FlattenCFG pass?

Yea, like I said in the issue, this can be done as we unwrap the scopes, no need to have a dedicated pass. Since FlattenCFG starts with the innermost, in theory we just need to keep hoisting to the parent entry block, naturally they are all gonna converge into the function entry block.

EDIT: thinking more about this, iterating the hoisting until it reaches the parent basic block will actually involve teaching all different OpRewritePattern in FlattenCFG into it. What I suggest instead is to apply the same logic you are doing here to a specific FuncOp OpRewritePattern within FlattenCFG, example (see FuncOp added at the end):

  // Collect operations to apply patterns.
  SmallVector<Operation *, 16> ops;
  getOperation()->walk<mlir::WalkOrder::PostOrder>([&](Operation *op) {
    if (isa<IfOp, ScopeOp, SwitchOp, LoopOpInterface, TernaryOp, TryOp, FuncOp>(op))
      ops.push_back(op);
  });

I think this is what @Lancern is originally referring to.

Done.

@ChuanqiXu9
Copy link
Member Author

Update:

I try to match AllocaOp instead of FuncOp in the new version. Since the old version would take 15min+ to compile a single file in Spec2017 while the new version may only take 2min to compile that file (BTW, it takes 1min17s to compile the same file with the compiler without -fclangir. It is another concern or another optimization area).

A side effect change of this is, the MLIR may try to remove the trivial dead alloca op. So we have to update some tests. e.g.,Lowering/alloca.cir.

Copy link
Member

@Lancern Lancern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I try to match AllocaOp instead of FuncOp in the new version.

This is the wrong way to implement this transformation. The pattern rewriting framework requires that you match on the root operation and all transformations should be limited within the root operation. This means if you match on alloca you can only modify / replace / erase the matched alloca operations, but you cannot touch other parts of the function.

Comment on lines 893 to 896
// It is cheaper to call `mlir::Operation::moveBefore` than using rewriter.
// So we prefer to manually here.
mlir::Operation *insertPoint = &*entryBlock.begin();
allocaOp->moveBefore(insertPoint);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a rewrite pattern, all IR mutations must be performed through the rewriter, or it may crash. See https://mlir.llvm.org/docs/PatternRewriter/#restrictions :

Within the rewrite section of a pattern, the following constraints apply:

  • All IR mutations, including creation, must be performed by the given PatternRewriter. This class provides hooks for performing all of the possible mutations that may take place within a pattern. For example, this means that an operation should not be erased via its erase method. To erase an operation, the appropriate PatternRewriter hook (in this case eraseOp) should be used instead.
  • [...]

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://mlir.llvm.org/getting_started/Debugging/#detecting-invalid-api-usage mentions -DMLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS=ON, which might be useful for development? I haven't tried it though.

Copy link
Member

@bcardosolopes bcardosolopes Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wished we had a clang tidy thing for preventing us to use this idiom in rewriters.

Copy link
Member

@bcardosolopes bcardosolopes Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://mlir.llvm.org/getting_started/Debugging/#detecting-invalid-api-usage mentions -DMLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS=ON, which might be useful for development? I haven't tried it though.

Great idea. Maybe we can enable this for github testing, so we catch those in PRs. Now that Nathan disabled unrelated tests perhaps it won't feel like it's taking too much longer.

@bcardosolopes
Copy link
Member

Since the old version would take 15min+ to compile a single file in Spec2017 while the new version may only take 2min to compile that file (BTW, it takes 1min17s to compile the same file with the compiler without -fclangir. It is another concern or another optimization area).

Wow, this is interesting. Do you have the verifiers ON or OFF? Wonder how long it would take without verifiers (e.g. by using -Xclang -clangir-disable-verifier)

@bcardosolopes
Copy link
Member

The pattern rewriting framework requires that you match on the root operation and all transformations should be limited within the root operation

Nice catch!

@gitoleg
Copy link
Collaborator

gitoleg commented Oct 1, 2024

That's awesome! Eagerly awaiting for the PR merged!

There are many tests fail in the llvm-test-suite due to variables declared in loops.
For example, the next program fails with segfault:

int N = 132000;

void foo() {
  for (int i = 0; i < N; ++i) {
    int a = 1, b = 2, c = 3, d = 4;
  }
}

int main() {
  foo();
  return 0;
}

And does NOT fail if the variables are declared outside of the loop.

@ChuanqiXu9 ChuanqiXu9 changed the title [CIR] [FlattenCFG] Hoist allocas to entry block of FuncOP in FlattenCFGPass [CIR] [Lowering] Introduce HoistAllocasPass Oct 8, 2024
@ChuanqiXu9
Copy link
Member Author

Since the old version would take 15min+ to compile a single file in Spec2017 while the new version may only take 2min to compile that file (BTW, it takes 1min17s to compile the same file with the compiler without -fclangir. It is another concern or another optimization area).

Wow, this is interesting. Do you have the verifiers ON or OFF? Wonder how long it would take without verifiers (e.g. by using -Xclang -clangir-disable-verifier)

Yes, I have this ON without verifiers. I believe the root cause is that pattern rewritten process is applied iteratively until a fixed point. The reproduce case is insn-attrtab.c from 502.gcc_r in spec2017. I am not sure if I can share it here since it is not a free benchmark. But I know Meta has access to it. So you can look at it if you're interested.

@ChuanqiXu9
Copy link
Member Author

Update: due to the above long time compilation issue, I prefer to move this pass out of FlattenCIR pass into a standalone pass. If I understand correctly, the previous suggestion to move this to FlattenCIR is majorly for style and I feel the current one is not bad.

@bcardosolopes
Copy link
Member

The reproduce case is insn-attrtab.c from 502.gcc_r in spec2017

Thanks a bunch, this should be enough!

@bcardosolopes bcardosolopes changed the title [CIR] [Lowering] Introduce HoistAllocasPass [CIR][Lowering] Introduce HoistAllocasPass Oct 8, 2024
Copy link
Member

@Lancern Lancern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bcardosolopes bcardosolopes merged commit c3e1d1c into llvm:main Oct 9, 2024
6 checks passed
smeenai pushed a commit to smeenai/clangir that referenced this pull request Oct 9, 2024
keryell pushed a commit to keryell/clangir that referenced this pull request Oct 19, 2024
lanza pushed a commit that referenced this pull request Nov 5, 2024
Close #883. See the above issue
for details
lanza pushed a commit that referenced this pull request Mar 18, 2025
Close #883. See the above issue
for details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants