[SimplifyIndVar] Push more users to worklist for simplifyUsers #93598

v01dXYZ · 2024-05-28T18:44:17Z

Instead of considering only users that are inside the loop, consider all the users with parent dominated by the loop header.

This is a draft PR to discuss about the code before doing some QA. @etherzhhb @efriedma-quic Do you think my code is valid ?

Provides a solution to #90417.

github-actions · 2024-05-28T18:44:36Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

efriedma-quic · 2024-05-28T19:06:20Z

Two potential problems with this approach:

Some places might implicitly assume IV users are inside the loop.
I'm a bit concerned about compile-time. If you recursively visit every value derived from the exit value of the loop, you can easily end up with quadratic compile-time growth: if you have N successive loops, you visit instructions in the last loop N^2 times.

CC @fhahn

v01dXYZ · 2024-05-29T12:49:22Z

To answer your first question, it seems the Users are expected to be within the loop (test-suite fails with the assert L->contains(ICI->getParent()) && "LCSSA form broken?").

My goal was to allow simplification of a IV comparison in an exit block.

v01dXYZ · 2024-05-30T09:46:17Z

The following breaks:

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"

define void @foo(
       i32 noundef %n_real_zones,
        ptr noundef %real_zones
) {
entry:
  br label %for.cond

for.cond:                                         ; preds = %for.cond.cleanup3, %entry
  %j.0 = phi i32 [ 0, %entry ], [ %inc7, %for.cond.cleanup3 ]
  %n_real_zones.addr.0 = phi i32 [ %n_real_zones, %entry ], [ %n_real_zones.addr.1, %for.cond.cleanup3 ]
  %cmp = icmp ult i32 %j.0, 128
  br label %for.cond1

for.cond1:                                        ; preds = %for.body4, %for.cond
  %n_real_zones.addr.1 = phi i32 [ %n_real_zones.addr.0, %for.cond ], [ %inc, %for.body4 ]
  %i.0 = phi i32 [ 0, %for.cond ], [ %inc5, %for.body4 ]
  %cmp2 = icmp ult i32 %i.0, 128
  br i1 %cmp2, label %for.body4, label %for.cond.cleanup3

for.cond.cleanup3:                                ; preds = %for.cond1
  %inc7 = add nuw nsw i32 %j.0, 1
  br label %for.cond

for.body4:                                        ; preds = %for.cond1
  %idxprom = sext i32 %n_real_zones.addr.1 to i64
  %arrayidx = getelementptr inbounds i32, ptr %real_zones, i64 %idxprom
  store i32 %j.0, ptr %arrayidx, align 4
  %inc = add nsw i32 %n_real_zones.addr.1, 1
  %inc5 = add nuw nsw i32 %i.0, 1
  br label %for.cond1
}

I don't know why it seems to depend on datalayout (especially if n64).

v01dXYZ · 2024-05-31T10:02:17Z

I was able to pass the test-suite with the new changes.

etherzhhb · 2024-05-31T11:03:33Z

llvm/lib/Transforms/Utils/SimplifyIndVar.cpp

@@ -511,6 +511,7 @@ bool SimplifyIndvar::eliminateTrunc(TruncInst *TI) {
    ICmpInst *ICI = dyn_cast<ICmpInst>(U);
    if (!ICI) return false;
    assert(L->contains(ICI->getParent()) && "LCSSA form broken?");
+    //    assert(L->contains(ICI->getParent()) && "LCSSA form broken?");


Hi, i think this code assume we run the loop close SSA pass first. Could you include a LIT test to demonstrate the problem? I am not sure this is the right fix to your problem.

This check will break if there is an exit (or dominated) block with a use-chain trunc-icmp such as the following one from Bitcode/Regression/fft/x86_17.06.19_halide_runtime.bc halide_memoization_cache_lookup:

for.cond82.preheader: ; preds = %_ZNK15halide_buffer_t13size_in_bytesEv.exit %48 = trunc i64 %indvars.iv214 to i32 %cmp83195 = icmp sgt i32 %48, 0 br i1 %cmp83195, label %for.body85.preheader, label %cleanup111

or such as:

define i1 @foo(i32 %n) { entry: br label %loop loop: %i = phi i64 [0, %entry], [%i.inc, %loop] %i.inc = add i64 %i, 1 %i.ult.Icnt = icmp ult i64 %i, 128 br i1 %i.ult.Icnt, label %loop, label %loop.exit loop.exit: %i.32 = trunc i64 %i to i32 %i.32.ult.n = icmp ult i32 %i.32, %n ret i1 %i.32.ult.n }

I've got to take a closer look at the code to understand why LCSSA is required here.

It seems to me the utility class SimplifyIndVar don't have to be fed LCSSA-normalised loop but the loop pass IndVarSimplify needs it (I don't know exactly all the reasons, just spotted it uses llvm::rewriteExitValue which simply uses the Backedge Info + SCEV to replace the LCSSA phis with SCEV Constant if the phi is a SCEV Add Rec)

Maybe I'm making a mistake but eliminateTrunc checks all the trunc users are icmp that are eligible to simplifications. I don't see why the LCSSA is required here (except to ensure staying inside the loop).

v01dXYZ · 2024-06-03T13:37:19Z

So I added a cl opt to limit the traversal block-wise. I'm trying to benchmark to see if it hurts performance.

v01dXYZ · 2024-06-04T07:04:37Z

@efriedma-quic The simplification algorithm is almost greedy forward. It stops descending as soon as there is no possible simplification. There is one big exception with instructions representing a SCEV Add Rec associated to the current analysed loop which are always followed, but it's restricting and the user will be traversed only during the analysis of the right loop.

The quadratic behaviour could in theory appear when a simplification is always available. One way to avoid going too far is to keep a counter when traversing the use chain which is incremented everytime we jump to a out-of-loop block and reset when we go back inside the loop. It's a fast way to compute a overapproximation of dist(Loop, CurrentBlock).

I ran test-suite for compile_time but I don't see significant regressions. I would like to have some advice about performance regressions as I fear I could make a dumb mistake (a short sanity check list).

fhahn · 2024-06-06T12:56:32Z

FWIW it looks like we can handle the motivating case from #90417 in constraint-elimination: #94610

v01dXYZ · 2024-06-11T06:37:10Z

Ping.

v01dXYZ · 2024-06-17T15:54:32Z

ping

I could add a mechanism that will specify a radius of action for the simplification depending on the optimisation level.

nikic · 2024-06-17T16:14:37Z

If you want a review, move the PR out of draft.

llvmbot · 2024-06-18T14:39:55Z

@llvm/pr-subscribers-llvm-transforms

Author: None (v01dXYZ)

Changes

Instead of considering only users that are inside the loop, consider all the users with parent dominated by the loop header.

This is a draft PR to discuss about the code before doing some QA. @etherzhhb @efriedma-quic Do you think my code is valid ?

Provides a solution to #90417.

Full diff: https://github.com/llvm/llvm-project/pull/93598.diff

3 Files Affected:

(modified) llvm/include/llvm/Transforms/Utils/SimplifyIndVar.h (+5-6)
(modified) llvm/lib/Transforms/Scalar/IndVarSimplify.cpp (+9-2)
(modified) llvm/lib/Transforms/Utils/SimplifyIndVar.cpp (+47-29)

diff --git a/llvm/include/llvm/Transforms/Utils/SimplifyIndVar.h b/llvm/include/llvm/Transforms/Utils/SimplifyIndVar.h
index bd1718dd8cad7..dbf2bdd18467d 100644
--- a/llvm/include/llvm/Transforms/Utils/SimplifyIndVar.h
+++ b/llvm/include/llvm/Transforms/Utils/SimplifyIndVar.h
@@ -52,12 +52,11 @@ class IVVisitor {
 /// where the first entry indicates that the function makes changes and the
 /// second entry indicates that it introduced new opportunities for loop
 /// unswitching.
-std::pair<bool, bool> simplifyUsersOfIV(PHINode *CurrIV, ScalarEvolution *SE,
-                                        DominatorTree *DT, LoopInfo *LI,
-                                        const TargetTransformInfo *TTI,
-                                        SmallVectorImpl<WeakTrackingVH> &Dead,
-                                        SCEVExpander &Rewriter,
-                                        IVVisitor *V = nullptr);
+std::pair<bool, bool>
+simplifyUsersOfIV(PHINode *CurrIV, ScalarEvolution *SE, DominatorTree *DT,
+                  LoopInfo *LI, const TargetTransformInfo *TTI,
+                  SmallVectorImpl<WeakTrackingVH> &Dead, SCEVExpander &Rewriter,
+                  unsigned MaxDepthOutOfLoop = 1, IVVisitor *V = nullptr);
 
 /// SimplifyLoopIVs - Simplify users of induction variables within this
 /// loop. This does not actually change or add IVs.
diff --git a/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp b/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
index dd7c89034ca09..ea0f7a986e4b9 100644
--- a/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
+++ b/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
@@ -124,6 +124,12 @@ static cl::opt<bool>
 AllowIVWidening("indvars-widen-indvars", cl::Hidden, cl::init(true),
                 cl::desc("Allow widening of indvars to eliminate s/zext"));
 
+static cl::opt<unsigned> MaxDepthOutOfLoop(
+    "indvars-max-depth-out-of-loop", cl::Hidden, cl::init(1),
+    cl::desc(
+        "Strict upper bound for the number of successive out-of-loop blocks "
+        "when traversing use-def chains. 0 enables full traversal"));
+
 namespace {
 
 class IndVarSimplify {
@@ -624,8 +630,9 @@ bool IndVarSimplify::simplifyAndExtend(Loop *L,
       // Information about sign/zero extensions of CurrIV.
       IndVarSimplifyVisitor Visitor(CurrIV, SE, TTI, DT);
 
-      const auto &[C, U] = simplifyUsersOfIV(CurrIV, SE, DT, LI, TTI, DeadInsts,
-                                             Rewriter, &Visitor);
+      const auto &[C, U] =
+          simplifyUsersOfIV(CurrIV, SE, DT, LI, TTI, DeadInsts, Rewriter,
+                            MaxDepthOutOfLoop, &Visitor);
 
       Changed |= C;
       RunUnswitching |= U;
diff --git a/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp b/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
index 912c02c2ed3ae..d41298fcfa6a5 100644
--- a/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
@@ -62,13 +62,18 @@ namespace {
     bool Changed = false;
     bool RunUnswitching = false;
 
+    // When following the def-use chains, it can go outside the loop.
+    // Strict upper bound on number of traversed out-of-loop blocks.
+    unsigned MaxDepthOutOfLoop;
+
   public:
     SimplifyIndvar(Loop *Loop, ScalarEvolution *SE, DominatorTree *DT,
                    LoopInfo *LI, const TargetTransformInfo *TTI,
                    SCEVExpander &Rewriter,
-                   SmallVectorImpl<WeakTrackingVH> &Dead)
+                   SmallVectorImpl<WeakTrackingVH> &Dead,
+                   unsigned MaxDepthOutOfLoop = 1)
         : L(Loop), LI(LI), SE(SE), DT(DT), TTI(TTI), Rewriter(Rewriter),
-          DeadInsts(Dead) {
+          DeadInsts(Dead), MaxDepthOutOfLoop(MaxDepthOutOfLoop) {
       assert(LI && "IV simplification requires LoopInfo");
     }
 
@@ -509,8 +514,8 @@ bool SimplifyIndvar::eliminateTrunc(TruncInst *TI) {
         !DT->isReachableFromEntry(cast<Instruction>(U)->getParent()))
       continue;
     ICmpInst *ICI = dyn_cast<ICmpInst>(U);
-    if (!ICI) return false;
-    assert(L->contains(ICI->getParent()) && "LCSSA form broken?");
+    if (!ICI)
+      return false;
     if (!(ICI->getOperand(0) == TI && L->isLoopInvariant(ICI->getOperand(1))) &&
         !(ICI->getOperand(1) == TI && L->isLoopInvariant(ICI->getOperand(0))))
       return false;
@@ -839,10 +844,12 @@ bool SimplifyIndvar::strengthenRightShift(BinaryOperator *BO,
 }
 
 /// Add all uses of Def to the current IV's worklist.
-static void pushIVUsers(
-  Instruction *Def, Loop *L,
-  SmallPtrSet<Instruction*,16> &Simplified,
-  SmallVectorImpl< std::pair<Instruction*,Instruction*> > &SimpleIVUsers) {
+static void
+pushIVUsers(Instruction *Def, Loop *L, DominatorTree *DT,
+            SmallPtrSet<Instruction *, 16> &Simplified,
+            SmallVectorImpl<std::tuple<Instruction *, Instruction *, unsigned>>
+                &SimpleIVUsers,
+            unsigned OutOfLoopChainCounter, unsigned MaxOutOfLoopChainCounter) {
 
   for (User *U : Def->users()) {
     Instruction *UI = cast<Instruction>(U);
@@ -854,16 +861,23 @@ static void pushIVUsers(
     if (UI == Def)
       continue;
 
-    // Only change the current Loop, do not change the other parts (e.g. other
-    // Loops).
-    if (!L->contains(UI))
+    // Avoid adding Defs that SCEV expand to themselves, e.g. the LoopPhis
+    // of the outter loops.
+    if (!DT->dominates(L->getHeader(), UI->getParent()))
       continue;
 
     // Do not push the same instruction more than once.
     if (!Simplified.insert(UI).second)
       continue;
 
-    SimpleIVUsers.push_back(std::make_pair(UI, Def));
+    unsigned Counter =
+        L->contains(UI)
+            ? 0 // reset depth if we go back inside the loop.
+            : OutOfLoopChainCounter + (UI->getParent() != Def->getParent());
+
+    if (!MaxOutOfLoopChainCounter || Counter < MaxOutOfLoopChainCounter) {
+      SimpleIVUsers.push_back(std::make_tuple(UI, Def, Counter));
+    }
   }
 }
 
@@ -908,17 +922,17 @@ void SimplifyIndvar::simplifyUsers(PHINode *CurrIV, IVVisitor *V) {
   SmallPtrSet<Instruction*,16> Simplified;
 
   // Use-def pairs if IV users waiting to be processed for CurrIV.
-  SmallVector<std::pair<Instruction*, Instruction*>, 8> SimpleIVUsers;
+  SmallVector<std::tuple<Instruction *, Instruction *, unsigned>, 8>
+      SimpleIVUsers;
 
   // Push users of the current LoopPhi. In rare cases, pushIVUsers may be
   // called multiple times for the same LoopPhi. This is the proper thing to
   // do for loop header phis that use each other.
-  pushIVUsers(CurrIV, L, Simplified, SimpleIVUsers);
+  pushIVUsers(CurrIV, L, DT, Simplified, SimpleIVUsers, 0, MaxDepthOutOfLoop);
 
   while (!SimpleIVUsers.empty()) {
-    std::pair<Instruction*, Instruction*> UseOper =
-      SimpleIVUsers.pop_back_val();
-    Instruction *UseInst = UseOper.first;
+    auto [UseInst, IVOperand, OutOfLoopChainCounter] =
+        SimpleIVUsers.pop_back_val();
 
     // If a user of the IndVar is trivially dead, we prefer just to mark it dead
     // rather than try to do some complex analysis or transformation (such as
@@ -942,11 +956,11 @@ void SimplifyIndvar::simplifyUsers(PHINode *CurrIV, IVVisitor *V) {
     if ((isa<PtrToIntInst>(UseInst)) || (isa<TruncInst>(UseInst)))
       for (Use &U : UseInst->uses()) {
         Instruction *User = cast<Instruction>(U.getUser());
-        if (replaceIVUserWithLoopInvariant(User))
+        if (DT->dominates(L->getHeader(), User->getParent()) &&
+            replaceIVUserWithLoopInvariant(User))
           break; // done replacing
       }
 
-    Instruction *IVOperand = UseOper.second;
     for (unsigned N = 0; IVOperand; ++N) {
       assert(N <= Simplified.size() && "runaway iteration");
       (void) N;
@@ -960,7 +974,8 @@ void SimplifyIndvar::simplifyUsers(PHINode *CurrIV, IVVisitor *V) {
       continue;
 
     if (eliminateIVUser(UseInst, IVOperand)) {
-      pushIVUsers(IVOperand, L, Simplified, SimpleIVUsers);
+      pushIVUsers(IVOperand, L, DT, Simplified, SimpleIVUsers,
+                  OutOfLoopChainCounter, MaxDepthOutOfLoop);
       continue;
     }
 
@@ -968,14 +983,16 @@ void SimplifyIndvar::simplifyUsers(PHINode *CurrIV, IVVisitor *V) {
       if (strengthenBinaryOp(BO, IVOperand)) {
         // re-queue uses of the now modified binary operator and fall
         // through to the checks that remain.
-        pushIVUsers(IVOperand, L, Simplified, SimpleIVUsers);
+        pushIVUsers(IVOperand, L, DT, Simplified, SimpleIVUsers,
+                    OutOfLoopChainCounter, MaxDepthOutOfLoop);
       }
     }
 
     // Try to use integer induction for FPToSI of float induction directly.
     if (replaceFloatIVWithIntegerIV(UseInst)) {
       // Re-queue the potentially new direct uses of IVOperand.
-      pushIVUsers(IVOperand, L, Simplified, SimpleIVUsers);
+      pushIVUsers(IVOperand, L, DT, Simplified, SimpleIVUsers,
+                  OutOfLoopChainCounter, MaxDepthOutOfLoop);
       continue;
     }
 
@@ -985,7 +1002,8 @@ void SimplifyIndvar::simplifyUsers(PHINode *CurrIV, IVVisitor *V) {
       continue;
     }
     if (isSimpleIVUser(UseInst, L, SE)) {
-      pushIVUsers(UseInst, L, Simplified, SimpleIVUsers);
+      pushIVUsers(UseInst, L, DT, Simplified, SimpleIVUsers,
+                  OutOfLoopChainCounter, MaxDepthOutOfLoop);
     }
   }
 }
@@ -999,13 +1017,13 @@ void IVVisitor::anchor() { }
 ///  Returns a pair where the first entry indicates that the function makes
 ///  changes and the second entry indicates that it introduced new opportunities
 ///  for loop unswitching.
-std::pair<bool, bool> simplifyUsersOfIV(PHINode *CurrIV, ScalarEvolution *SE,
-                                        DominatorTree *DT, LoopInfo *LI,
-                                        const TargetTransformInfo *TTI,
-                                        SmallVectorImpl<WeakTrackingVH> &Dead,
-                                        SCEVExpander &Rewriter, IVVisitor *V) {
+std::pair<bool, bool>
+simplifyUsersOfIV(PHINode *CurrIV, ScalarEvolution *SE, DominatorTree *DT,
+                  LoopInfo *LI, const TargetTransformInfo *TTI,
+                  SmallVectorImpl<WeakTrackingVH> &Dead, SCEVExpander &Rewriter,
+                  unsigned MaxDepthOutOfLoop, IVVisitor *V) {
   SimplifyIndvar SIV(LI->getLoopFor(CurrIV->getParent()), SE, DT, LI, TTI,
-                     Rewriter, Dead);
+                     Rewriter, Dead, MaxDepthOutOfLoop);
   SIV.simplifyUsers(CurrIV, V);
   return {SIV.hasChanged(), SIV.runUnswitching()};
 }

v01dXYZ · 2024-06-24T09:18:40Z

ping

nikic

Please add a test in llvm/test/Transforms/IndVarSimplify.

nikic · 2024-06-24T09:22:21Z

llvm/include/llvm/Transforms/Utils/SimplifyIndVar.h

+simplifyUsersOfIV(PHINode *CurrIV, ScalarEvolution *SE, DominatorTree *DT,
+                  LoopInfo *LI, const TargetTransformInfo *TTI,
+                  SmallVectorImpl<WeakTrackingVH> &Dead, SCEVExpander &Rewriter,
+                  unsigned MaxDepthOutOfLoop = 1, IVVisitor *V = nullptr);


Do we need the MaxDepthOutOfLoop parameter? Can you move the cl::opt into SimplifyIndVar and directly use it instead?

Transforms/Utils/LoopUnroll.cpp uses it via simplifyLoopIVs. I didn't want to add it here as it could create regressions in passes I'm not interested in.

Okay, that makes sense. But as implemented, aren't you still calling this with the default of 1 from LoopUnroll? Shouldn't the default be 0 instead?

Could you please rebase over 6e3725d so we at least don't have to pass through this parameter everywhere?

The question you raise is actually quite important as this code is disabled with this current default config. I disabled it as I wanted to make it optimisation level dependent but I needed some input to calibrate it according to the level.

It is set to 1 which means the set of BBs that will be affected are dist(Loop, BB) < 1 => BB in Loop (the strict bound is on purpose). The 0 value is a special value for an +infinity bound.

I don't know if this encoding is satisfying. Another one is to use -1 but I don't know if cl::opt<unsigned> supports it.

+1 for making pushIVUsers a method instead of a function.

I think I'll put it to 0 now as it will show performance regressions and allow discussing about that. I'll add some tests before though.

nikic · 2024-06-24T09:23:15Z

llvm/lib/Transforms/Utils/SimplifyIndVar.cpp

-    SimpleIVUsers.push_back(std::make_pair(UI, Def));
+    unsigned Counter =
+        L->contains(UI)
+            ? 0 // reset depth if we go back inside the loop.


Do we need to do that?

No.

If we want to compute a overapproximation of the distance in the CFG dist(Loop, BB), this helps to tighten the approximation to the real distance.

llvm/lib/Transforms/Utils/SimplifyIndVar.cpp

nikic · 2024-06-24T09:32:19Z

llvm/lib/Transforms/Utils/SimplifyIndVar.cpp

+    unsigned Counter =
+        L->contains(UI)
+            ? 0 // reset depth if we go back inside the loop.
+            : OutOfLoopChainCounter + (UI->getParent() != Def->getParent());


Why is this a limit on the depth of the blocks rather than instructions? You can have a very long chain of instructions within a single block, or a very short one across blocks. It seems like that would be the more relevant quantity.

At first, I implemented a instruction-wise depth but I changed that as I visualised the algorithm at the CFG level, as accepting optimising basic blocks that are at given distance of the loop dist(Loop, BB) < limit.

Also, I didn't like the idea to have partially optimised basic-blocks.

llvm/lib/Transforms/Utils/SimplifyIndVar.cpp

nikic · 2024-06-24T10:35:06Z

No compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=e5a41f0afc152cc24b8fef3aa177ef53b2e77c43&to=96dd00646ea8346b9eb4efc0b5b5c78375e8189b&stat=instructions:u

Instead of considering only users that are inside the loop, consider all the users with parent dominated by the loop header.

v01dXYZ · 2024-06-26T22:31:15Z

No compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=e5a41f0afc152cc24b8fef3aa177ef53b2e77c43&to=96dd00646ea8346b9eb4efc0b5b5c78375e8189b&stat=instructions:u

I changed the default config which disabled the traversal of users that are out of loop.

I expect performance regressions with reenabling full traversal.

v01dXYZ · 2024-06-28T15:04:53Z

I don't understand why the CI pipeline failed. Is there a way to relaunch it for linux without repushing the last commit ?

v01dXYZ · 2024-07-01T13:31:28Z

Ping

nikic · 2024-07-01T16:41:47Z

It looks like the updated patch miscompiles clang: http://llvm-compile-time-tracker.com/show_error.php?commit=9236a1de08a0f522a4c123b3818ea1dcb0361c3c There are crashes when building llvm-test-suite with stage2 clang.

nikic · 2024-07-02T19:23:46Z

Can you please also add a test for your last fix?

We do get a working stage2 build now: https://llvm-compile-time-tracker.com/compare.php?from=3c50cbfda4fc3ad85349167132f7ed809ecc685a&to=8c47a1722b8b303e3262a77f992fea606a0a305c&stat=instructions:u

Out of loop, an IV user is loop invariant. Using L->isLoopInvariant is no more valid to distinguish which operand is the loop-constant one from the IV user.

v01dXYZ · 2024-07-03T14:53:44Z

I want to add another point to the ones mentioned by @efriedma-quic:

the optimisations done in-loop are much more susceptible to improve the perf compared to out-of-loop. It's more profitable compared to colder paths. It could be wiser to use a light version instead.

v01dXYZ mentioned this pull request May 31, 2024

[ConstraintElimination] Use SCEV ranges information for Loop counter #91457

Closed

etherzhhb reviewed May 31, 2024

View reviewed changes

v01dXYZ force-pushed the 90417-indvars-simplify-users-out-of-loop branch from 248e441 to 3af0c12 Compare June 18, 2024 14:37

v01dXYZ marked this pull request as ready for review June 18, 2024 14:39

llvmbot added the llvm:transforms label Jun 18, 2024

v01dXYZ changed the title ~~(Draft) [SimplifyIndVar] Push more users to worklist for simplifyUsers~~ [SimplifyIndVar] Push more users to worklist for simplifyUsers Jun 18, 2024

nikic reviewed Jun 24, 2024

View reviewed changes

v01dxyz added 4 commits June 27, 2024 00:16

[SimplifyIndVar] Push more users to worklist for simplifyUsers

d1a22f3

Instead of considering only users that are inside the loop, consider all the users with parent dominated by the loop header.

Add simple tests

e0d9d00

Default to enable full traversal (it shoud be changed before merge)

3511e43

typo

aeac9e8

v01dXYZ force-pushed the 90417-indvars-simplify-users-out-of-loop branch from 3af0c12 to aeac9e8 Compare June 26, 2024 22:27

v01dxyz added 2 commits June 27, 2024 12:12

Update tests

06970c7

Update test for LoopDeletion/LoopVectorize

522ba6e

v01dxyz added 2 commits July 3, 2024 16:31

pre-commit test eliminateTrunc miscompile

db8fb22

Fix eliminateTrunc

8a92def

Out of loop, an IV user is loop invariant. Using L->isLoopInvariant is no more valid to distinguish which operand is the loop-constant one from the IV user.

v01dXYZ force-pushed the 90417-indvars-simplify-users-out-of-loop branch from 2276def to 8a92def Compare July 3, 2024 14:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SimplifyIndVar] Push more users to worklist for simplifyUsers #93598

[SimplifyIndVar] Push more users to worklist for simplifyUsers #93598

v01dXYZ commented May 28, 2024 •

edited

Loading

github-actions bot commented May 28, 2024

efriedma-quic commented May 28, 2024

v01dXYZ commented May 29, 2024

v01dXYZ commented May 30, 2024 •

edited

Loading

v01dXYZ commented May 31, 2024

etherzhhb May 31, 2024

v01dXYZ May 31, 2024

v01dXYZ Jun 5, 2024

v01dXYZ commented Jun 3, 2024

v01dXYZ commented Jun 4, 2024

fhahn commented Jun 6, 2024

v01dXYZ commented Jun 11, 2024

v01dXYZ commented Jun 17, 2024

nikic commented Jun 17, 2024

llvmbot commented Jun 18, 2024

v01dXYZ commented Jun 24, 2024

nikic left a comment

nikic Jun 24, 2024

v01dXYZ Jun 24, 2024

nikic Jun 24, 2024

v01dXYZ Jun 26, 2024 •

edited

Loading

nikic Jun 24, 2024

v01dXYZ Jun 24, 2024

nikic Jun 24, 2024

v01dXYZ Jun 24, 2024

nikic commented Jun 24, 2024

v01dXYZ commented Jun 26, 2024

v01dXYZ commented Jun 28, 2024

v01dXYZ commented Jul 1, 2024

nikic commented Jul 1, 2024

nikic commented Jul 2, 2024

v01dXYZ commented Jul 3, 2024

[SimplifyIndVar] Push more users to worklist for simplifyUsers #93598

Are you sure you want to change the base?

[SimplifyIndVar] Push more users to worklist for simplifyUsers #93598

Conversation

v01dXYZ commented May 28, 2024 • edited Loading

github-actions bot commented May 28, 2024

efriedma-quic commented May 28, 2024

v01dXYZ commented May 29, 2024

v01dXYZ commented May 30, 2024 • edited Loading

v01dXYZ commented May 31, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

v01dXYZ commented Jun 3, 2024

v01dXYZ commented Jun 4, 2024

fhahn commented Jun 6, 2024

v01dXYZ commented Jun 11, 2024

v01dXYZ commented Jun 17, 2024

nikic commented Jun 17, 2024

llvmbot commented Jun 18, 2024

v01dXYZ commented Jun 24, 2024

nikic left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

v01dXYZ Jun 26, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nikic commented Jun 24, 2024

v01dXYZ commented Jun 26, 2024

v01dXYZ commented Jun 28, 2024

v01dXYZ commented Jul 1, 2024

nikic commented Jul 1, 2024

nikic commented Jul 2, 2024

v01dXYZ commented Jul 3, 2024

v01dXYZ commented May 28, 2024 •

edited

Loading

v01dXYZ commented May 30, 2024 •

edited

Loading

v01dXYZ Jun 26, 2024 •

edited

Loading