Skip to content

Commit be7cbe3

Browse files
committed
[LCSSA] Skip blocks in sub-loops when scanning for uses.
Summary: Scanning blocks in sub-loops for uses is unnecessary, as they were already handled while dealing with the containing sub-loop. This speeds up LCSSA for highly nested loops. For the test case in PR37202, it halves the time spent in LCSSA. In cases were we won't be able to skip any blocks, the additional lookup should be negligible. Time-passes without this patch for test case from PR37202: Total Execution Time: 48.5505 seconds (48.5511 wall clock) ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 10.0822 ( 21.0%) 0.1406 ( 27.0%) 10.2228 ( 21.1%) 10.2228 ( 21.1%) Loop-Closed SSA Form Pass 10.0417 ( 20.9%) 0.1467 ( 28.2%) 10.1884 ( 21.0%) 10.1890 ( 21.0%) Loop-Closed SSA Form Pass intel#2 4.2703 ( 8.9%) 0.0040 ( 0.8%) 4.2742 ( 8.8%) 4.2742 ( 8.8%) Unswitch loops 2.7376 ( 5.7%) 0.0229 ( 4.4%) 2.7605 ( 5.7%) 2.7611 ( 5.7%) Loop-Closed SSA Form Pass intel#5 2.7332 ( 5.7%) 0.0214 ( 4.1%) 2.7546 ( 5.7%) 2.7546 ( 5.7%) Loop-Closed SSA Form Pass intel#3 2.7088 ( 5.6%) 0.0230 ( 4.4%) 2.7319 ( 5.6%) 2.7324 ( 5.6%) Loop-Closed SSA Form Pass intel#4 2.6855 ( 5.6%) 0.0236 ( 4.5%) 2.7091 ( 5.6%) 2.7090 ( 5.6%) Loop-Closed SSA Form Pass intel#6 2.1648 ( 4.5%) 0.0018 ( 0.4%) 2.1666 ( 4.5%) 2.1664 ( 4.5%) Unroll loops 1.8371 ( 3.8%) 0.0009 ( 0.2%) 1.8379 ( 3.8%) 1.8380 ( 3.8%) Value Propagation 1.8149 ( 3.8%) 0.0021 ( 0.4%) 1.8170 ( 3.7%) 1.8169 ( 3.7%) Loop Invariant Code Motion 1.6755 ( 3.5%) 0.0226 ( 4.3%) 1.6981 ( 3.5%) 1.6980 ( 3.5%) Loop-Closed SSA Form Pass intel#7 Time-passes with this patch Total Execution Time: 29.9285 seconds (29.9276 wall clock) ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 5.2786 ( 17.7%) 0.0021 ( 1.2%) 5.2806 ( 17.6%) 5.2808 ( 17.6%) Unswitch loops 4.3739 ( 14.7%) 0.0303 ( 18.1%) 4.4042 ( 14.7%) 4.4042 ( 14.7%) Loop-Closed SSA Form Pass 4.2658 ( 14.3%) 0.0192 ( 11.5%) 4.2850 ( 14.3%) 4.2851 ( 14.3%) Loop-Closed SSA Form Pass intel#2 2.2307 ( 7.5%) 0.0013 ( 0.8%) 2.2320 ( 7.5%) 2.2318 ( 7.5%) Loop Invariant Code Motion 2.0888 ( 7.0%) 0.0012 ( 0.7%) 2.0900 ( 7.0%) 2.0897 ( 7.0%) Unroll loops 1.6761 ( 5.6%) 0.0013 ( 0.8%) 1.6774 ( 5.6%) 1.6774 ( 5.6%) Value Propagation 1.3686 ( 4.6%) 0.0029 ( 1.8%) 1.3716 ( 4.6%) 1.3714 ( 4.6%) Induction Variable Simplification 1.1457 ( 3.8%) 0.0010 ( 0.6%) 1.1468 ( 3.8%) 1.1468 ( 3.8%) Loop-Closed SSA Form Pass intel#4 1.1384 ( 3.8%) 0.0005 ( 0.3%) 1.1389 ( 3.8%) 1.1389 ( 3.8%) Loop-Closed SSA Form Pass intel#6 1.1360 ( 3.8%) 0.0027 ( 1.6%) 1.1387 ( 3.8%) 1.1387 ( 3.8%) Loop-Closed SSA Form Pass intel#5 1.1331 ( 3.8%) 0.0010 ( 0.6%) 1.1341 ( 3.8%) 1.1340 ( 3.8%) Loop-Closed SSA Form Pass intel#3 Reviewers: davide, efriedma, mzolotukhin Reviewed By: davide, efriedma Subscribers: hiraditya, dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D56848 llvm-svn: 351567
1 parent 0cddc39 commit be7cbe3

File tree

2 files changed

+6
-1
lines changed

2 files changed

+6
-1
lines changed

llvm/include/llvm/Transforms/Utils/LoopUtils.h

+2-1
Original file line numberDiff line numberDiff line change
@@ -79,7 +79,8 @@ bool formLCSSAForInstructions(SmallVectorImpl<Instruction *> &Worklist,
7979
///
8080
/// Looks at all instructions in the loop which have uses outside of the
8181
/// current loop. For each, an LCSSA PHI node is inserted and the uses outside
82-
/// the loop are rewritten to use this node.
82+
/// the loop are rewritten to use this node. Sub-loops must be in LCSSA form
83+
/// already.
8384
///
8485
/// LoopInfo and DominatorTree are required and preserved.
8586
///

llvm/lib/Transforms/Utils/LCSSA.cpp

+4
Original file line numberDiff line numberDiff line change
@@ -325,6 +325,10 @@ bool llvm::formLCSSA(Loop &L, DominatorTree &DT, LoopInfo *LI,
325325
// Look at all the instructions in the loop, checking to see if they have uses
326326
// outside the loop. If so, put them into the worklist to rewrite those uses.
327327
for (BasicBlock *BB : BlocksDominatingExits) {
328+
// Skip blocks that are part of any sub-loops, they must be in LCSSA
329+
// already.
330+
if (LI->getLoopFor(BB) != &L)
331+
continue;
328332
for (Instruction &I : *BB) {
329333
// Reject two common cases fast: instructions with no uses (like stores)
330334
// and instructions with one use that is in the same block as this.

0 commit comments

Comments
 (0)