[FixIrreducible] Use CycleInfo instead of a custom SCC traversal #101386
@llvm/pr-subscribers-llvm-adt @llvm/pr-subscribers-backend-amdgpu
Author: Sameer Sahasrabuddhe (ssahasra)
Changes
Patch is 63.29 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/101386.diff
9 Files Affected:
diff --git a/llvm/include/llvm/ADT/GenericCycleInfo.h b/llvm/include/llvm/ADT/GenericCycleInfo.h
index b601fc9bae38a..e6d76d5163b1e 100644
--- a/llvm/include/llvm/ADT/GenericCycleInfo.h
+++ b/llvm/include/llvm/ADT/GenericCycleInfo.h
@@ -107,6 +107,12 @@ template <typename ContextT> class GenericCycle {
return is_contained(Entries, Block);
}
+ /// \brief Replace all entries with \p Block as single entry.
+ void setSingleEntry(BlockT *Block) {
+ Entries.clear();
+ Entries.push_back(Block);
+ }
+
/// \brief Return whether \p Block is contained in the cycle.
bool contains(const BlockT *Block) const { return Blocks.contains(Block); }
@@ -189,6 +195,21 @@ template <typename ContextT> class GenericCycle {
//@{
using const_entry_iterator =
typename SmallVectorImpl<BlockT *>::const_iterator;
+ const_entry_iterator entry_begin() const {
+ return const_entry_iterator{Entries.begin()};
+ }
+ const_entry_iterator entry_end() const {
+ return const_entry_iterator{Entries.end()};
+ }
+
+ using const_reverse_entry_iterator =
+ typename SmallVectorImpl<BlockT *>::const_reverse_iterator;
+ const_reverse_entry_iterator entry_rbegin() const {
+ return const_reverse_entry_iterator{Entries.rbegin()};
+ }
+ const_reverse_entry_iterator entry_rend() const {
+ return const_reverse_entry_iterator{Entries.rend()};
+ }
size_t getNumEntries() const { return Entries.size(); }
iterator_range<const_entry_iterator> entries() const {
@@ -252,12 +273,6 @@ template <typename ContextT> class GenericCycleInfo {
/// the subtree.
void moveTopLevelCycleToNewParent(CycleT *NewParent, CycleT *Child);
- /// Assumes that \p Cycle is the innermost cycle containing \p Block.
- /// \p Block will be appended to \p Cycle and all of its parent cycles.
- /// \p Block will be added to BlockMap with \p Cycle and
- /// BlockMapTopLevel with \p Cycle's top level parent cycle.
- void addBlockToCycle(BlockT *Block, CycleT *Cycle);
-
public:
GenericCycleInfo() = default;
GenericCycleInfo(GenericCycleInfo &&) = default;
@@ -275,6 +290,12 @@ template <typename ContextT> class GenericCycleInfo {
unsigned getCycleDepth(const BlockT *Block) const;
CycleT *getTopLevelParentCycle(BlockT *Block);
+ /// Assumes that \p Cycle is the innermost cycle containing \p Block.
+ /// \p Block will be appended to \p Cycle and all of its parent cycles.
+ /// \p Block will be added to BlockMap with \p Cycle and
+ /// BlockMapTopLevel with \p Cycle's top level parent cycle.
+ void addBlockToCycle(BlockT *Block, CycleT *Cycle);
+
/// Methods for debug and self-test.
//@{
#ifndef NDEBUG
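Note on the GenericCycleInfo.h hunks above: the following is a minimal, self-contained sketch of how the new entry API is meant to be used, not the LLVM implementation. `ToyCycle` is a hypothetical stand-in built on `std::vector`; `reversedEntries()` is an invented helper that plays the role of `entry_rbegin()`/`entry_rend()`, and the definitions of `header()` and `isReducible()` (first entry is the header, exactly one entry means reducible) are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy model of a cycle; NOT llvm::GenericCycle.
struct ToyCycle {
  std::vector<std::string> Entries; // entry blocks, in discovery order

  // Assumption: the header is the first recorded entry.
  const std::string &header() const {
    assert(!Entries.empty() && "a cycle needs at least one entry");
    return Entries.front();
  }

  // Assumption: a cycle is reducible iff it has exactly one entry.
  bool isReducible() const { return Entries.size() == 1; }

  // Same effect as the patch's setSingleEntry(): drop every recorded
  // entry and keep only Block.
  void setSingleEntry(const std::string &Block) {
    Entries.clear();
    Entries.push_back(Block);
  }

  // Invented helper standing in for entry_rbegin()/entry_rend():
  // the entries in reverse discovery order.
  std::vector<std::string> reversedEntries() const {
    return std::vector<std::string>(Entries.rbegin(), Entries.rend());
  }
};
```

With entries {H, B}, the toy reports the cycle as irreducible and yields B before H when reversed; after `setSingleEntry("N")` it has N as its only entry and header, which is the state the pass wants to leave a fixed cycle in.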
diff --git a/llvm/lib/Transforms/Utils/FixIrreducible.cpp b/llvm/lib/Transforms/Utils/FixIrreducible.cpp
index 11e24d0585be4..9cad689c50e90 100644
--- a/llvm/lib/Transforms/Utils/FixIrreducible.cpp
+++ b/llvm/lib/Transforms/Utils/FixIrreducible.cpp
@@ -6,58 +6,55 @@
//
//===----------------------------------------------------------------------===//
//
-// An irreducible SCC is one which has multiple "header" blocks, i.e., blocks
-// with control-flow edges incident from outside the SCC. This pass converts a
-// irreducible SCC into a natural loop by applying the following transformation:
+// To convert an irreducible cycle C to a natural loop L:
//
-// 1. Collect the set of headers H of the SCC.
-// 2. Collect the set of predecessors P of these headers. These may be inside as
-// well as outside the SCC.
-// 3. Create block N and redirect every edge from set P to set H through N.
+// 1. Add a new node N to C.
+// 2. Redirect all external incoming edges through N.
+// 3. Redirect all edges incident on header H through N.
//
-// This converts the SCC into a natural loop with N as the header: N is the only
-// block with edges incident from outside the SCC, and all backedges in the SCC
-// are incident on N, i.e., for every backedge, the head now dominates the tail.
+// This is sufficient to ensure that:
//
-// INPUT CFG: The blocks A and B form an irreducible loop with two headers.
+// a. Every closed path in C also exists in L, with the modification that any
+// path passing through H now passes through N before reaching H.
+// b. Every external path incident on any entry of C is now incident on N and
+// then redirected to the entry.
+//
+// Thus, L is a strongly connected component dominated by N, and hence L is a
+// natural loop with header N.
+//
+// INPUT CFG: The blocks H and B form an irreducible cycle with two entries.
//
// Entry
// / \
// v v
-// A ----> B
+// H ----> B
// ^ /|
// `----' |
// v
// Exit
//
-// OUTPUT CFG: Edges incident on A and B are now redirected through a
-// new block N, forming a natural loop consisting of N, A and B.
+// OUTPUT CFG:
//
// Entry
// |
// v
-// .---> N <---.
-// / / \ \
-// | / \ |
-// \ v v /
-// `-- A B --'
+// N <---.
+// / \ \
+// / \ |
+// v v /
+// H --> B --'
// |
// v
// Exit
//
-// The transformation is applied to every maximal SCC that is not already
-// recognized as a loop. The pass operates on all maximal SCCs found in the
-// function body outside of any loop, as well as those found inside each loop,
-// including inside any newly created loops. This ensures that any SCC hidden
-// inside a maximal SCC is also transformed.
//
// The actual transformation is handled by function CreateControlFlowHub, which
// takes a set of incoming blocks (the predecessors) and outgoing blocks (the
-// headers). The function also moves every PHINode in an outgoing block to the
+// entries). The function also moves every PHINode in an outgoing block to the
// hub. Since the hub dominates all the outgoing blocks, each such PHINode
-// continues to dominate its uses. Since every header in an SCC has at least two
-// predecessors, every value used in the header (or later) but defined in a
-// predecessor (or earlier) is represented by a PHINode in a header. Hence the
+// continues to dominate its uses. Since every entry of the cycle has at least
+// two predecessors, every value used in the entry (or later) but defined in a
+// predecessor (or earlier) is represented by a PHINode in an entry. Hence the
// above handling of PHINodes is sufficient and no further processing is
// required to restore SSA.
//
@@ -68,8 +65,9 @@
#include "llvm/Transforms/Utils/FixIrreducible.h"
#include "llvm/ADT/SCCIterator.h"
+#include "llvm/Analysis/CycleAnalysis.h"
#include "llvm/Analysis/DomTreeUpdater.h"
-#include "llvm/Analysis/LoopIterator.h"
+#include "llvm/Analysis/LoopInfo.h"
#include "llvm/InitializePasses.h"
#include "llvm/Pass.h"
#include "llvm/Transforms/Utils.h"
@@ -88,8 +86,9 @@ struct FixIrreducible : public FunctionPass {
void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequired<DominatorTreeWrapperPass>();
- AU.addRequired<LoopInfoWrapperPass>();
+ AU.addRequired<CycleInfoWrapperPass>();
AU.addPreserved<DominatorTreeWrapperPass>();
+ AU.addPreserved<CycleInfoWrapperPass>();
AU.addPreserved<LoopInfoWrapperPass>();
}
@@ -113,27 +112,23 @@ INITIALIZE_PASS_END(FixIrreducible, "fix-irreducible",
// When a new loop is created, existing children of the parent loop may now be
// fully inside the new loop. Reconnect these as children of the new loop.
static void reconnectChildLoops(LoopInfo &LI, Loop *ParentLoop, Loop *NewLoop,
- SetVector<BasicBlock *> &Blocks,
- SetVector<BasicBlock *> &Headers) {
+ BasicBlock *OldHeader) {
auto &CandidateLoops = ParentLoop ? ParentLoop->getSubLoopsVector()
: LI.getTopLevelLoopsVector();
- // The new loop cannot be its own child, and any candidate is a
- // child iff its header is owned by the new loop. Move all the
- // children to a new vector.
+ // Any candidate is a child iff its header is owned by the new loop. Move all
+ // the children to a new vector.
auto FirstChild = std::partition(
- CandidateLoops.begin(), CandidateLoops.end(), [&](Loop *L) {
- return L == NewLoop || !Blocks.contains(L->getHeader());
- });
+ CandidateLoops.begin(), CandidateLoops.end(),
+ [&](Loop *L) { return !NewLoop->contains(L->getHeader()); });
SmallVector<Loop *, 8> ChildLoops(FirstChild, CandidateLoops.end());
CandidateLoops.erase(FirstChild, CandidateLoops.end());
for (Loop *Child : ChildLoops) {
LLVM_DEBUG(dbgs() << "child loop: " << Child->getHeader()->getName()
<< "\n");
- // TODO: A child loop whose header is also a header in the current
- // SCC gets destroyed since its backedges are removed. That may
- // not be necessary if we can retain such backedges.
- if (Headers.count(Child->getHeader())) {
+ // A child loop whose header was the old cycle header gets destroyed since
+ // its backedges are removed.
+ if (Child->getHeader() == OldHeader) {
for (auto *BB : Child->blocks()) {
if (LI.getLoopFor(BB) != Child)
continue;
@@ -161,21 +156,25 @@ static void reconnectChildLoops(LoopInfo &LI, Loop *ParentLoop, Loop *NewLoop,
// Given a set of blocks and headers in an irreducible SCC, convert it into a
// natural loop. Also insert this new loop at its appropriate place in the
// hierarchy of loops.
-static void createNaturalLoopInternal(LoopInfo &LI, DominatorTree &DT,
- Loop *ParentLoop,
- SetVector<BasicBlock *> &Blocks,
- SetVector<BasicBlock *> &Headers) {
-#ifndef NDEBUG
- // All headers are part of the SCC
- for (auto *H : Headers) {
- assert(Blocks.count(H));
- }
-#endif
+static bool fixIrreducible(Cycle &C, CycleInfo &CI, DominatorTree &DT,
+ LoopInfo *LI) {
+ if (C.isReducible())
+ return false;
SetVector<BasicBlock *> Predecessors;
- for (auto *H : Headers) {
- for (auto *P : predecessors(H)) {
+
+ // Redirect internal edges incident on the header.
+ BasicBlock *OldHeader = C.getHeader();
+ for (BasicBlock *P : predecessors(OldHeader)) {
+ if (C.contains(P))
Predecessors.insert(P);
+ }
+
+ // Redirect external incoming edges. This includes the edges on the header.
+ for (BasicBlock *E : C.entries()) {
+ for (BasicBlock *P : predecessors(E)) {
+ if (!C.contains(P))
+ Predecessors.insert(P);
}
}
@@ -189,21 +188,40 @@ static void createNaturalLoopInternal(LoopInfo &LI, DominatorTree &DT,
// Redirect all the backedges through a "hub" consisting of a series
// of guard blocks that manage the flow of control from the
// predecessors to the headers.
- SmallVector<BasicBlock *, 8> GuardBlocks;
+ SmallVector<BasicBlock *> GuardBlocks;
+
+ // Minor optimization: The cycle entries are discovered in an order that is
+ // the opposite of the order in which these blocks appear as branch targets.
+ // This results in a lot of condition inversions in the control flow out of
+ // the new ControlFlowHub, which can be mitigated if the orders match. So we
+ // reverse the entries when adding them to the hub.
+ SetVector<BasicBlock *> Entries;
+ Entries.insert(C.entry_rbegin(), C.entry_rend());
+
DomTreeUpdater DTU(DT, DomTreeUpdater::UpdateStrategy::Eager);
- CreateControlFlowHub(&DTU, GuardBlocks, Predecessors, Headers, "irr");
+ CreateControlFlowHub(&DTU, GuardBlocks, Predecessors, Entries, "irr");
#if defined(EXPENSIVE_CHECKS)
assert(DT.verify(DominatorTree::VerificationLevel::Full));
#else
assert(DT.verify(DominatorTree::VerificationLevel::Fast));
#endif
+ for (auto *G : GuardBlocks) {
+ LLVM_DEBUG(dbgs() << "added guard block: " << G->getName() << "\n");
+ CI.addBlockToCycle(G, &C);
+ }
+ C.setSingleEntry(GuardBlocks[0]);
+
+ if (!LI)
+ return true;
+
+ Loop *ParentLoop = LI->getLoopFor(OldHeader);
// Create a new loop from the now-transformed cycle
- auto NewLoop = LI.AllocateLoop();
+ auto *NewLoop = LI->AllocateLoop();
if (ParentLoop) {
ParentLoop->addChildLoop(NewLoop);
} else {
- LI.addTopLevelLoop(NewLoop);
+ LI->addTopLevelLoop(NewLoop);
}
// Add the guard blocks to the new loop. The first guard block is
@@ -213,16 +231,15 @@ static void createNaturalLoopInternal(LoopInfo &LI, DominatorTree &DT,
// are also propagated up the chain of parent loops.
for (auto *G : GuardBlocks) {
LLVM_DEBUG(dbgs() << "added guard block: " << G->getName() << "\n");
- NewLoop->addBasicBlockToLoop(G, LI);
+ NewLoop->addBasicBlockToLoop(G, *LI);
}
- // Add the SCC blocks to the new loop.
- for (auto *BB : Blocks) {
+ for (auto *BB : C.blocks()) {
NewLoop->addBlockEntry(BB);
- if (LI.getLoopFor(BB) == ParentLoop) {
+ if (LI->getLoopFor(BB) == ParentLoop) {
LLVM_DEBUG(dbgs() << "moved block from parent: " << BB->getName()
<< "\n");
- LI.changeLoopFor(BB, NewLoop);
+ LI->changeLoopFor(BB, NewLoop);
} else {
LLVM_DEBUG(dbgs() << "added block from child: " << BB->getName() << "\n");
}
@@ -230,129 +247,58 @@ static void createNaturalLoopInternal(LoopInfo &LI, DominatorTree &DT,
LLVM_DEBUG(dbgs() << "header for new loop: "
<< NewLoop->getHeader()->getName() << "\n");
- reconnectChildLoops(LI, ParentLoop, NewLoop, Blocks, Headers);
+ reconnectChildLoops(*LI, ParentLoop, NewLoop, OldHeader);
NewLoop->verifyLoop();
if (ParentLoop) {
ParentLoop->verifyLoop();
}
#if defined(EXPENSIVE_CHECKS)
- LI.verify(DT);
+ LI->verify(DT);
#endif // EXPENSIVE_CHECKS
-}
-
-namespace llvm {
-// Enable the graph traits required for traversing a Loop body.
-template <> struct GraphTraits<Loop> : LoopBodyTraits {};
-} // namespace llvm
-// Overloaded wrappers to go with the function template below.
-static BasicBlock *unwrapBlock(BasicBlock *B) { return B; }
-static BasicBlock *unwrapBlock(LoopBodyTraits::NodeRef &N) { return N.second; }
-
-static void createNaturalLoop(LoopInfo &LI, DominatorTree &DT, Function *F,
- SetVector<BasicBlock *> &Blocks,
- SetVector<BasicBlock *> &Headers) {
- createNaturalLoopInternal(LI, DT, nullptr, Blocks, Headers);
+ return true;
}
-static void createNaturalLoop(LoopInfo &LI, DominatorTree &DT, Loop &L,
- SetVector<BasicBlock *> &Blocks,
- SetVector<BasicBlock *> &Headers) {
- createNaturalLoopInternal(LI, DT, &L, Blocks, Headers);
-}
-
-// Convert irreducible SCCs; Graph G may be a Function* or a Loop&.
-template <class Graph>
-static bool makeReducible(LoopInfo &LI, DominatorTree &DT, Graph &&G) {
- bool Changed = false;
- for (auto Scc = scc_begin(G); !Scc.isAtEnd(); ++Scc) {
- if (Scc->size() < 2)
- continue;
- SetVector<BasicBlock *> Blocks;
- LLVM_DEBUG(dbgs() << "Found SCC:");
- for (auto N : *Scc) {
- auto BB = unwrapBlock(N);
- LLVM_DEBUG(dbgs() << " " << BB->getName());
- Blocks.insert(BB);
- }
- LLVM_DEBUG(dbgs() << "\n");
-
- // Minor optimization: The SCC blocks are usually discovered in an order
- // that is the opposite of the order in which these blocks appear as branch
- // targets. This results in a lot of condition inversions in the control
- // flow out of the new ControlFlowHub, which can be mitigated if the orders
- // match. So we discover the headers using the reverse of the block order.
- SetVector<BasicBlock *> Headers;
- LLVM_DEBUG(dbgs() << "Found headers:");
- for (auto *BB : reverse(Blocks)) {
- for (const auto P : predecessors(BB)) {
- // Skip unreachable predecessors.
- if (!DT.isReachableFromEntry(P))
- continue;
- if (!Blocks.count(P)) {
- LLVM_DEBUG(dbgs() << " " << BB->getName());
- Headers.insert(BB);
- break;
- }
- }
- }
- LLVM_DEBUG(dbgs() << "\n");
-
- if (Headers.size() == 1) {
- assert(LI.isLoopHeader(Headers.front()));
- LLVM_DEBUG(dbgs() << "Natural loop with a single header: skipped\n");
- continue;
- }
- createNaturalLoop(LI, DT, G, Blocks, Headers);
- Changed = true;
- }
- return Changed;
-}
-
-static bool FixIrreducibleImpl(Function &F, LoopInfo &LI, DominatorTree &DT) {
+static bool FixIrreducibleImpl(Function &F, CycleInfo &CI, DominatorTree &DT,
+ LoopInfo *LI) {
LLVM_DEBUG(dbgs() << "===== Fix irreducible control-flow in function: "
<< F.getName() << "\n");
assert(hasOnlySimpleTerminator(F) && "Unsupported block terminator.");
bool Changed = false;
- SmallVector<Loop *, 8> WorkList;
-
- LLVM_DEBUG(dbgs() << "visiting top-level\n");
- Changed |= makeReducible(LI, DT, &F);
-
- // Any SCCs reduced are now already in the list of top-level loops, so simply
- // add them all to the worklist.
- append_range(WorkList, LI);
-
- while (!WorkList.empty()) {
- auto L = WorkList.pop_back_val();
- LLVM_DEBUG(dbgs() << "visiting loop with header "
- << L->getHeader()->getName() << "\n");
- Changed |= makeReducible(LI, DT, *L);
- // Any SCCs reduced are now already in the list of child loops, so simply
- // add them all to the worklist.
- WorkList.append(L->begin(), L->end());
+ SmallVector<Cycle *> Worklist{CI.toplevel_cycles()};
+
+ while (!Worklist.empty()) {
+ Cycle *C = Worklist.pop_back_val();
+ Changed |= fixIrreducible(*C, CI, DT, LI);
+ append_range(Worklist, C->children());
}
return Changed;
}
bool FixIrreducible::runOnFunction(Function &F) {
- auto &LI = getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
+ auto *LIWP = getAnalysisIfAvailable<LoopInfoWrapperPass>();
+ LoopInfo *LI = LIWP ? &LIWP->getLoopInfo() : nullptr;
+ auto &CI = getAnalysis<CycleInfoWrapperPass>().getResult();
auto &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
- return FixIrreducibleImpl(F, LI, DT);
+ return FixIrreducibleImpl(F, CI, DT, LI);
}
PreservedAnalyses FixIrreduciblePass::run(Function &F,
FunctionAnalysisManager &AM) {
- auto &LI = AM.getResult<LoopAnalysis>(F);
+ auto *LI = AM.getCachedResult<LoopAnalysis>(F);
+ auto &CI = AM.getResult<CycleAnalysis>(F);
auto &DT = AM.getResult<DominatorTreeAnalysis>(F);
- if (!FixIrreducibleImpl(F, LI, DT))
+
+ if (!FixIrreducibleImpl(F, CI, DT, LI))
return PreservedAnalyses::all();
+
PreservedAnalyses PA;
PA.preserve<LoopAnalysis>();
+ PA.preserve<CycleAnalysis>();
PA.preserve<DominatorTreeAnalysis>();
return PA;
}
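Note on the FixIrreducible.cpp changes above: the three steps in the new file header (add N, redirect external incoming edges, redirect edges on the header H) can be sketched on a toy graph. This is a rough model under stated assumptions, not the pass itself: `ToyCFG`, `findEntries` and `fixIrreducibleToy` are hypothetical names, the CFG is a plain successor map, and the sketch redirects every edge from a collected predecessor to any entry, which is how the incoming/outgoing sets passed to the hub appear to be used. Branch conditions, PHI nodes and dominator-tree updates are left out entirely.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

using Block = std::string;
using BlockSet = std::set<Block>;
using ToyCFG = std::map<Block, BlockSet>; // block -> successors

// Entries of a cycle: blocks with at least one predecessor outside it.
BlockSet findEntries(const ToyCFG &G, const BlockSet &Cycle) {
  BlockSet Entries;
  for (const auto &KV : G) {
    if (Cycle.count(KV.first))
      continue; // only edges coming from outside the cycle matter
    for (const Block &S : KV.second)
      if (Cycle.count(S))
        Entries.insert(S);
  }
  return Entries;
}

// Route edges through a new block N so that N becomes the only entry.
// Returns true if the graph was changed.
bool fixIrreducibleToy(ToyCFG &G, BlockSet &Cycle, const Block &Header,
                       const Block &N) {
  assert(Cycle.count(Header) && "header must belong to the cycle");
  const BlockSet Entries = findEntries(G, Cycle);
  if (Entries.size() < 2)
    return false; // a single entry means the cycle is already reducible

  // Internal predecessors of the header plus external predecessors of
  // any entry, mirroring the two loops in fixIrreducible().
  BlockSet Preds;
  for (const auto &KV : G) {
    const bool Internal = Cycle.count(KV.first) != 0;
    for (const Block &S : KV.second) {
      if (Internal && S == Header)
        Preds.insert(KV.first);
      if (!Internal && Entries.count(S))
        Preds.insert(KV.first);
    }
  }

  // Replace each edge P -> E with P -> N and N -> E.
  for (const Block &P : Preds) {
    BlockSet &Succs = G[P];
    for (const Block &E : Entries) {
      if (Succs.erase(E)) {
        Succs.insert(N);
        G[N].insert(E);
      }
    }
  }
  Cycle.insert(N); // step 1 of the header comment: N joins the cycle
  return true;
}
```

Running this on the INPUT CFG from the file header (Entry -> H, Entry -> B, H -> B, B -> H, B -> Exit) produces the edges of the OUTPUT CFG: Entry -> N, N -> H, N -> B, H -> B, B -> N, B -> Exit. The edge H -> B is untouched because H is not a collected predecessor, which is the main difference from the old scheme that redirected edges on every header.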
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
index b61838c06a1f9..62b42c892a11e 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
@@ -68,8 +68,9 @@
; GCN-O0-NEXT: Uniformity Analysis
; GCN-O0-NEXT: Unify divergent function exit nodes
; GCN-O0-NEXT: Dominator Tree Construction
-; GCN-O0-NEXT: Natural Loop Information
+; GCN-O0-NEXT: Cycle Info Analysis
; GCN-O0-NEXT: Convert irreducible control-flow into natural loops
+; GCN-O0-NEXT: Natural Loop Information
; GCN-O0-NEXT: Fixup each natural loop to have a single exit block
; GCN-O0-NEXT: Post-Dominator Tree Construction
; GCN-O0-NEXT: Dominance Frontier Construction
@@ -262,8 +263,9 @@
; GCN-O1-NEXT: Post-Dominator Tree Construction
; GCN-O1-NEXT: Unify divergent function exit nodes
; GCN-O1-NEXT: Dominator Tree Construction
-; GCN-O1-NEXT: Natural Loop Information
+; GCN-O1-NEXT: Cycle Info Analysis
; GCN-O1-NEXT:...
[truncated]
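Note on the new driver loop in FixIrreducibleImpl: the SCC rediscovery inside each loop is replaced by a plain worklist walk over the cycle hierarchy. A minimal sketch of that traversal shape, assuming a hypothetical `ToyCycleNode` tree in place of `CycleInfo` and merely recording which cycles would be rewritten instead of transforming anything:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one cycle in the cycle hierarchy.
struct ToyCycleNode {
  std::string Name;
  bool Reducible;
  std::vector<ToyCycleNode *> Children;
};

// Visit every cycle once, starting from the top-level cycles, and
// return the names of the irreducible ones in visitation order.
std::vector<std::string>
collectIrreducible(const std::vector<ToyCycleNode *> &TopLevel) {
  std::vector<std::string> Fixed;
  std::vector<ToyCycleNode *> Worklist(TopLevel.begin(), TopLevel.end());
  while (!Worklist.empty()) {
    ToyCycleNode *C = Worklist.back();
    Worklist.pop_back();
    if (!C->Reducible)
      Fixed.push_back(C->Name); // the real pass would rewrite C here
    // Children are queued whether or not the parent was changed, so an
    // irreducible cycle nested inside a reducible one is still found.
    Worklist.insert(Worklist.end(), C->Children.begin(), C->Children.end());
  }
  return Fixed;
}
```

Because the stack is popped from the back, later top-level cycles are visited first; the order should not matter for correctness as long as every cycle is visited exactly once.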
- // Skip unreachable predecessors.
- if (!DT.isReachableFromEntry(P))
- continue;
- if (!Blocks.count(P)) {
- LLVM_DEBUG(dbgs() << " " << BB->getName());
- Headers.insert(BB);
- break;
- }
- }
- }
- LLVM_DEBUG(dbgs() << "\n");
-
- if (Headers.size() == 1) {
- assert(LI.isLoopHeader(Headers.front()));
- LLVM_DEBUG(dbgs() << "Natural loop with a single header: skipped\n");
- continue;
- }
- createNaturalLoop(LI, DT, G, Blocks, Headers);
- Changed = true;
- }
- return Changed;
-}
-
-static bool FixIrreducibleImpl(Function &F, LoopInfo &LI, DominatorTree &DT) {
+static bool FixIrreducibleImpl(Function &F, CycleInfo &CI, DominatorTree &DT,
+ LoopInfo *LI) {
LLVM_DEBUG(dbgs() << "===== Fix irreducible control-flow in function: "
<< F.getName() << "\n");
assert(hasOnlySimpleTerminator(F) && "Unsupported block terminator.");
bool Changed = false;
- SmallVector<Loop *, 8> WorkList;
-
- LLVM_DEBUG(dbgs() << "visiting top-level\n");
- Changed |= makeReducible(LI, DT, &F);
-
- // Any SCCs reduced are now already in the list of top-level loops, so simply
- // add them all to the worklist.
- append_range(WorkList, LI);
-
- while (!WorkList.empty()) {
- auto L = WorkList.pop_back_val();
- LLVM_DEBUG(dbgs() << "visiting loop with header "
- << L->getHeader()->getName() << "\n");
- Changed |= makeReducible(LI, DT, *L);
- // Any SCCs reduced are now already in the list of child loops, so simply
- // add them all to the worklist.
- WorkList.append(L->begin(), L->end());
+ SmallVector<Cycle *> Worklist{CI.toplevel_cycles()};
+
+ while (!Worklist.empty()) {
+ Cycle *C = Worklist.pop_back_val();
+ Changed |= fixIrreducible(*C, CI, DT, LI);
+ append_range(Worklist, C->children());
}
return Changed;
}
bool FixIrreducible::runOnFunction(Function &F) {
- auto &LI = getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
+ auto *LIWP = getAnalysisIfAvailable<LoopInfoWrapperPass>();
+ LoopInfo *LI = LIWP ? &LIWP->getLoopInfo() : nullptr;
+ auto &CI = getAnalysis<CycleInfoWrapperPass>().getResult();
auto &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
- return FixIrreducibleImpl(F, LI, DT);
+ return FixIrreducibleImpl(F, CI, DT, LI);
}
PreservedAnalyses FixIrreduciblePass::run(Function &F,
FunctionAnalysisManager &AM) {
- auto &LI = AM.getResult<LoopAnalysis>(F);
+ auto *LI = AM.getCachedResult<LoopAnalysis>(F);
+ auto &CI = AM.getResult<CycleAnalysis>(F);
auto &DT = AM.getResult<DominatorTreeAnalysis>(F);
- if (!FixIrreducibleImpl(F, LI, DT))
+
+ if (!FixIrreducibleImpl(F, CI, DT, LI))
return PreservedAnalyses::all();
+
PreservedAnalyses PA;
PA.preserve<LoopAnalysis>();
+ PA.preserve<CycleAnalysis>();
PA.preserve<DominatorTreeAnalysis>();
return PA;
}
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
index b61838c06a1f9..62b42c892a11e 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
@@ -68,8 +68,9 @@
; GCN-O0-NEXT: Uniformity Analysis
; GCN-O0-NEXT: Unify divergent function exit nodes
; GCN-O0-NEXT: Dominator Tree Construction
-; GCN-O0-NEXT: Natural Loop Information
+; GCN-O0-NEXT: Cycle Info Analysis
; GCN-O0-NEXT: Convert irreducible control-flow into natural loops
+; GCN-O0-NEXT: Natural Loop Information
; GCN-O0-NEXT: Fixup each natural loop to have a single exit block
; GCN-O0-NEXT: Post-Dominator Tree Construction
; GCN-O0-NEXT: Dominance Frontier Construction
@@ -262,8 +263,9 @@
; GCN-O1-NEXT: Post-Dominator Tree Construction
; GCN-O1-NEXT: Unify divergent function exit nodes
; GCN-O1-NEXT: Dominator Tree Construction
-; GCN-O1-NEXT: Natural Loop Information
+; GCN-O1-NEXT: Cycle Info Analysis
; GCN-O1-NEXT:...
[truncated]
Note that I have not yet finished verifying all the lit tests. I might also have to add a few more tests, especially involving a mix of irreducible and reducible cycles that are siblings and/or nested inside each other in various combinations. Especially with some overlap in the entry and header nodes. TODO:
@@ -189,6 +195,21 @@ template <typename ContextT> class GenericCycle {
//@{
using const_entry_iterator =
typename SmallVectorImpl<BlockT *>::const_iterator;
const_entry_iterator entry_begin() const {
return const_entry_iterator{Entries.begin()};
nit: you should be able to drop the redundant return types here and below
- return const_entry_iterator{Entries.begin()};
+ return {Entries.begin()};
Thanks! Fixed in an intermediate patch.
Fixed.
/// \brief Replace all entries with \p Block as single entry.
void setSingleEntry(BlockT *Block) {
Entries.clear();
Entries.push_back(Block);
Could this be nullptr? Should we add an assert to guard against this?
Good catch! Replaced it with a stronger test to check that Block is contained in the current cycle.
Fixed.
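The updated patch is not shown in this thread, so the following is only a minimal sketch of what a containment check in setSingleEntry could look like. ToyCycle, its integer block ids, and the assertion message are assumptions made for illustration; they are not the GenericCycle code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-in for GenericCycle; the type and its members are hypothetical.
struct ToyCycle {
  std::vector<int> Blocks;  // ids of the blocks that belong to the cycle
  std::vector<int> Entries; // ids of the blocks reachable from outside

  bool contains(int Block) const {
    return std::find(Blocks.begin(), Blocks.end(), Block) != Blocks.end();
  }

  // Replace all entries with Block as the single entry.
  // The assertion rejects any block that is not part of this cycle, which
  // is stronger than a plain null check.
  void setSingleEntry(int Block) {
    assert(contains(Block) && "new entry must be a block of this cycle");
    Entries.clear();
    Entries.push_back(Block);
  }
};
```

This ordering matches the diff above, where the guard blocks are first added to the cycle with CI.addBlockToCycle and only then is GuardBlocks[0] made the single entry.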
Current status: This is going to take a "bit" longer. Turns out that This needs a finer method that redirects only specific edges. Either that, or we let the pass destroy some cycles. But updating
CreateControlFlowHub is a method that redirects control flow edges from a set of incoming blocks to a set of outgoing blocks through a new set of "guard" blocks. This is now refactored into a separate file with one enhancement: The input to the method is now a set of branches rather than two sets of blocks. The original implementation reroutes every edge from incoming blocks to outgoing blocks. But it is possible that for some incoming block InBB, some successor S might be in the set of outgoing blocks, but that particular edge should not be rerouted. The new implementation makes this possible by allowing the user to specify the targets of each branch that need to be rerouted. This is needed when improving the implementation of FixIrreducible llvm#101386. Current uses in FixIrreducible and UnifyLoopExits do not demonstrate this finer control over the edges being rerouted.
CreateControlFlowHub is a method that redirects control flow edges from a set of incoming blocks to a set of outgoing blocks through a new set of "guard" blocks. This is now refactored into a separate file with one enhancement: The input to the method is now a set of branches rather than two sets of blocks. The original implementation reroutes every edge from incoming blocks to outgoing blocks. But it is possible that for some incoming block InBB, some successor S might be in the set of outgoing blocks, but that particular edge should not be rerouted. The new implementation makes this possible by allowing the user to specify the targets of each branch that need to be rerouted. This is needed when improving the implementation of FixIrreducible #101386. Current use in FixIrreducible does not demonstrate this finer control over the edges being rerouted. But in UnifyLoopExits, when only one successor of an exiting block is an exit block, this refinement now reroutes only the relevant control-flow through the edge; the non-exit successor is not rerouted. This results in fewer branches and PHI nodes in the hub.
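To make the difference between block-level and edge-level rerouting concrete, here is a small sketch on a toy CFG. It assumes integer block ids and collapses the set of guard blocks into one hub node; rerouteThroughHub and Edge are hypothetical names, not the ControlFlowHub API.

```cpp
#include <cassert>
#include <set>
#include <utility>

using Edge = std::pair<int, int>; // (source block id, target block id)

// Reroute only the edges listed in ToReroute through the block Hub.
// Every other edge is left untouched, even when its target is also
// reached through the hub by some rerouted edge.
std::set<Edge> rerouteThroughHub(const std::set<Edge> &Cfg,
                                 const std::set<Edge> &ToReroute, int Hub) {
  std::set<Edge> Result;
  for (const Edge &E : Cfg) {
    if (ToReroute.count(E)) {
      Result.insert(Edge(E.first, Hub));  // source now branches to the hub
      Result.insert(Edge(Hub, E.second)); // hub forwards to the old target
    } else {
      Result.insert(E);
    }
  }
  return Result;
}
```

With edges 1->3, 1->4 and 2->4, rerouting only 1->3 and 2->4 through hub 9 leaves 1->4 in place, although block 4 is one of the hub's targets. A block-level interface that takes "incoming blocks {1, 2}" and "outgoing blocks {3, 4}" cannot express that.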
Force-pushed from c60ea06 to eb8519e.
This now depends on the newly refactored ControlFlowHub, which correctly reroutes only the relevant edges. The effect was already caught in an existing test with nested cycles and a common header, so no new test needs to be written for this.
Force-pushed from 01beb62 to 5f6172f.
1. CycleInfo efficiently locates all cycles in a single pass, while the SCC is repeated inside every natural loop.
2. CycleInfo provides a hierarchy of irreducible cycles, and the new implementation transforms each cycle in this hierarchy separately instead of reducing an entire irreducible SCC in a single step. This reduces the number of control-flow paths that pass through the header of each newly created loop. This is evidenced by the reduced number of predecessors on the "guard" blocks in the lit tests, and fewer operands on the corresponding PHI nodes.
3. When an entry of an irreducible cycle is the header of a child natural loop, the original implementation destroyed that loop. This is now preserved, since the incoming edges on non-header entries are not touched.
4. In the new implementation, if an irreducible cycle is a superset of a natural loop with the same header, then that natural loop is destroyed and replaced by the newly created loop.
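The per-cycle traversal in points 1 and 2 can be sketched with a worklist over a toy hierarchy, roughly mirroring the loop in FixIrreducibleImpl in the diff above. CycleNode and fixAll are hypothetical names, and the toy simply keeps the first entry where the real pass would insert guard blocks.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a cycle in the hierarchy; all names are hypothetical.
struct CycleNode {
  std::vector<int> Entries;          // ids of the entry blocks
  std::vector<CycleNode *> Children; // cycles nested directly inside
  bool isIrreducible() const { return Entries.size() > 1; }
};

// Visit every cycle in the hierarchy exactly once and give each
// irreducible cycle a single entry. Returns the number of cycles changed.
int fixAll(const std::vector<CycleNode *> &TopLevel) {
  std::vector<CycleNode *> Worklist(TopLevel.begin(), TopLevel.end());
  int Changed = 0;
  while (!Worklist.empty()) {
    CycleNode *C = Worklist.back();
    Worklist.pop_back();
    if (C->isIrreducible()) {
      // The real pass creates guard blocks here; the toy keeps one entry.
      C->Entries.resize(1);
      ++Changed;
    }
    // Children are handled separately, not as part of the parent.
    Worklist.insert(Worklist.end(), C->Children.begin(), C->Children.end());
  }
  return Changed;
}
```

Because each cycle is handled on its own, a reducible child nested inside an irreducible parent is visited but left unchanged.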
Force-pushed from eb8519e to 6e5aaba.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/30/builds/4768 Here is the relevant piece of the build log for the reference
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/16/builds/4116 Here is the relevant piece of the build log for the reference
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/51/builds/2945 Here is the relevant piece of the build log for the reference
[FixIrreducible] Use CycleInfo instead of a custom SCC traversal
CycleInfo efficiently locates all cycles in a single pass, while the SCC is
repeated inside every natural loop.
CycleInfo provides a hierarchy of irreducible cycles, and the new
implementation transforms each cycle in this hierarchy separately instead of
reducing an entire irreducible SCC in a single step. This reduces the number
of control-flow paths that pass through the header of each newly created
loop. This is evidenced by the reduced number of predecessors on the "guard"
blocks in the lit tests, and fewer operands on the corresponding PHI nodes.
When an entry of an irreducible cycle is the header of a child natural loop,
the original implementation destroyed that loop. This is now preserved,
since the incoming edges on non-header entries are not touched.
In the new implementation, if an irreducible cycle is a superset of a natural
loop with the same header, then that natural loop is destroyed and replaced
by the newly created loop.
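For readers unfamiliar with the entry terminology used above, the following sketch computes the entries of a cycle on a toy CFG: a block of the cycle is an entry if it has a predecessor outside the cycle, and a cycle with more than one entry is irreducible. findEntries and the integer block ids are assumptions for illustration. CycleInfo records entries while it builds the hierarchy, so the pass does not need a scan like this one.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>; // (source block id, target block id)

// Collect every block of the cycle that is the target of an edge whose
// source lies outside the cycle.
std::set<int> findEntries(const std::set<int> &CycleBlocks,
                          const std::vector<Edge> &Cfg) {
  std::set<int> Entries;
  for (const Edge &E : Cfg) {
    bool TargetInside = CycleBlocks.count(E.second) != 0;
    bool SourceInside = CycleBlocks.count(E.first) != 0;
    if (TargetInside && !SourceInside)
      Entries.insert(E.second);
  }
  return Entries;
}
```

For the cycle {1, 2} with edges 0->1, 0->2, 1->2 and 2->1, both blocks are entries, so the cycle is irreducible. Dropping the edge 0->2 leaves block 1 as the only entry, which makes it the header of a natural loop.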