Skip to content

[VPlan] Replace VPRegionBlock with explicit CFG before execute (NFCI). #117506

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
May 24, 2025
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 14 additions & 10 deletions llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2778,13 +2778,13 @@ void InnerLoopVectorizer::fixVectorizedLoop(VPTransformState &State) {
PSE.getSE()->forgetLoop(OrigLoop);
PSE.getSE()->forgetBlockAndLoopDispositions();

// Don't apply optimizations below when no vector region remains, as they all
// require a vector loop at the moment.
if (!State.Plan->getVectorLoopRegion())
// Don't apply optimizations below when no (vector) loop remains, as they all
// require one at the moment.
VPBasicBlock *HeaderVPBB =
vputils::getTopLevelVectorLoopHeader(*State.Plan, State.VPDT);
if (!HeaderVPBB)
return;

VPRegionBlock *VectorRegion = State.Plan->getVectorLoopRegion();
VPBasicBlock *HeaderVPBB = VectorRegion->getEntryBasicBlock();
BasicBlock *HeaderBB = State.CFG.VPBB2IRBB[HeaderVPBB];

// Remove redundant induction instructions.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could cse() be applied as a VPlan2VPlan transformation rather than to the generated IR below? Possibly even before dismantling regions, when header blocks are easier to find.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this should also be moved to a VPlan-transform. The limitation to the loop region isn't really material for the VPlan version.

Expand All @@ -2809,7 +2809,7 @@ void InnerLoopVectorizer::fixVectorizedLoop(VPTransformState &State) {
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could setProfileInfoAfterUnrolling() be applied to the branch recipe as part of its VPIRMetadata, rather than above to the branch instruction it generates?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, but it would also need adding metadata to the branch in the scalar loop.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which calls for expanding VPlan's scope...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, although it will require modifying the original scalar loop in VPlan, which probably needs more thought what kind of modifications we need. Maybe adding metadata would be sufficient?


void InnerLoopVectorizer::fixNonInductionPHIs(VPTransformState &State) {
auto Iter = vp_depth_first_deep(Plan.getEntry());
auto Iter = vp_depth_first_shallow(Plan.getEntry());
for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(Iter)) {
for (VPRecipeBase &P : VPBB->phis()) {
VPWidenPHIRecipe *VPPhi = dyn_cast<VPWidenPHIRecipe>(&P);
Expand Down Expand Up @@ -7799,6 +7799,10 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
BestVPlan, BestVF,
TTI.getRegisterBitWidth(TargetTransformInfo::RGK_FixedWidthVector));
VPlanTransforms::removeDeadRecipes(BestVPlan);

VPBasicBlock *MiddleVPBB =
BestVPlan.getVectorLoopRegion() ? BestVPlan.getMiddleBlock() : nullptr;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introduce closer to use - why hoist (from line 7964 below)?
If set to null, will its uses still work ok?
Should getMiddleBlock() return null in the absence of a vector loop region?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done here, just before disolving the region. To sink it, we would need to find the middle-block w/o region.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. Deserves a comment.

Also worth noting phase ordering aspects of disolveLoopRegions() - intentionally after optimizeForVFAndUF()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done thanks.

VPlanTransforms::disolveLoopRegions(BestVPlan);
VPlanTransforms::convertToConcreteRecipes(BestVPlan,
*Legal->getWidestInductionType());

Expand Down Expand Up @@ -7894,14 +7898,15 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
// 2.6. Maintain Loop Hints
// Keep all loop hints from the original loop on the vector loop (we'll
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the Loop Hints be captured in VPIRMetadata?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should be possible, can do separately, thanks

// replace the vectorizer-specific hints below).
if (auto *LoopRegion = BestVPlan.getVectorLoopRegion()) {
VPBasicBlock *HeaderVPBB =
vputils::getTopLevelVectorLoopHeader(BestVPlan, State.VPDT);
if (HeaderVPBB) {
MDNode *OrigLoopID = OrigLoop->getLoopID();

std::optional<MDNode *> VectorizedLoopID =
makeFollowupLoopID(OrigLoopID, {LLVMLoopVectorizeFollowupAll,
LLVMLoopVectorizeFollowupVectorized});

VPBasicBlock *HeaderVPBB = LoopRegion->getEntryBasicBlock();
Loop *L = LI->getLoopFor(State.CFG.VPBB2IRBB[HeaderVPBB]);
if (VectorizedLoopID) {
L->setLoopID(*VectorizedLoopID);
Expand Down Expand Up @@ -7947,8 +7952,7 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
ILV.printDebugTracesAtEnd();

// 4. Adjust branch weight of the branch in the middle block.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this branch weight be captured in VPIRMetadata?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, will look into that separately, thanks

if (BestVPlan.getVectorLoopRegion()) {
auto *MiddleVPBB = BestVPlan.getMiddleBlock();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why hoist setting MiddleVPBB from here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, currently it relies on the region. Could also do a bit more work to find it w/o region if desired

if (HeaderVPBB) {
auto *MiddleTerm =
cast<BranchInst>(State.CFG.VPBB2IRBB[MiddleVPBB]->getTerminator());
if (MiddleTerm->isConditional() &&
Expand Down
129 changes: 81 additions & 48 deletions llvm/lib/Transforms/Vectorize/VPlan.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -207,6 +207,19 @@ VPBlockBase *VPBlockBase::getEnclosingBlockWithPredecessors() {
return Parent->getEnclosingBlockWithPredecessors();
}

bool VPBlockUtils::isHeader(const VPBlockBase *VPB,
const VPDominatorTree &VPDT) {
auto *VPBB = dyn_cast<VPBasicBlock>(VPB);
if (!VPBB)
return false;
if (auto *R = VPBB->getParent())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (auto *R = VPBB->getParent())
// If VPBB is in a region R, VPBB is a loop header if R is a loop region with VPBB as its entry, i.e., free of predecessors.
if (auto *R = VPBB->getParent())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

return !R->isReplicator() && VPBB->getNumPredecessors() == 0;

assert(!VPB->getParent() && "checking blocks in regions not implemented yet");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented above?

Suggested change
assert(!VPB->getParent() && "checking blocks in regions not implemented yet");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, removed, thanks

return VPB->getPredecessors().size() == 2 &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return VPB->getPredecessors().size() == 2 &&
// A header dominates its second predecessor (the latch), with the other predecessor being the preheader.
return VPB->getPredecessors().size() == 2 &&

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done thanks!

VPDT.dominates(VPB, VPB->getPredecessors()[1]);
}

VPBasicBlock::iterator VPBasicBlock::getFirstNonPhi() {
iterator It = begin();
while (It != end() && It->isPhi())
Expand Down Expand Up @@ -420,12 +433,17 @@ void VPBasicBlock::connectToPredecessors(VPTransformState &State) {
if (ParentLoop && !State.LI->getLoopFor(NewBB))
ParentLoop->addBasicBlockToLoop(NewBB, *State.LI);

auto Preds = to_vector(getHierarchicalPredecessors());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suffice to getPredecessors() now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, updated thanks

if (VPBlockUtils::isHeader(this, State.VPDT)) {
// There's no block yet for the latch, don't try to connect it yet.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// There's no block yet for the latch, don't try to connect it yet.
// There's no block for the latch yet, connect to the preheader only.

(Could create a block for the latch - like creating blocks for successors?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated thanks. We could create both predecessors and successors for blocks if they don't exist yet, but that would be a bigger change. Once we also unroll replicate regions for VF, it would be possible to do an initial pass over the whole plan, create all IR BBs w/o connecting them and a second pass creating the instructions in the block + connecting them.

Preds = {Preds[0]};
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can alternatively do:

Suggested change
Preds = {Preds[0]};
}
Preds = getPredecessors()[0];
} else {
Preds = to_vector(getPredecessors());
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks


// Hook up the new basic block to its predecessors.
for (VPBlockBase *PredVPBlock : getHierarchicalPredecessors()) {
for (VPBlockBase *PredVPBlock : Preds) {
VPBasicBlock *PredVPBB = PredVPBlock->getExitingBasicBlock();
auto &PredVPSuccessors = PredVPBB->getHierarchicalSuccessors();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: getHierarchical*() are still needed due to replicate regions, when VF>1 (but removed when unrolling only, with UF>1, VF=1?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep

BasicBlock *PredBB = CFG.VPBB2IRBB[PredVPBB];

BasicBlock *PredBB = CFG.VPBB2IRBB.lookup(PredVPBB);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can retrain

Suggested change
BasicBlock *PredBB = CFG.VPBB2IRBB.lookup(PredVPBB);
BasicBlock *PredBB = CFG.VPBB2IRBB[PredVPBB];

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restored, thanks

assert(PredBB && "Predecessor basic-block not found building successor.");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can alternatively introduce an assert(CFG.VPBB2IRBB.contains(PredVPBB) && ... before the CFG.VPBB2IRBB[PredVPBB].

auto *PredBBTerminator = PredBB->getTerminator();
LLVM_DEBUG(dbgs() << "LV: draw edge from" << PredBB->getName() << '\n');
Expand Down Expand Up @@ -487,11 +505,25 @@ void VPBasicBlock::execute(VPTransformState *State) {
bool Replica = bool(State->Lane);
BasicBlock *NewBB = State->CFG.PrevBB; // Reuse it if possible.

if (VPBlockUtils::isHeader(this, State->VPDT)) {
// Create and register the new vector loop.
Loop *PrevParentLoop = State->CurrentParentLoop;
State->CurrentParentLoop = State->LI->AllocateLoop();

// Insert the new loop into the loop nest and register the new basic blocks
// before calling any utilities such as SCEV that require valid LoopInfo.
if (PrevParentLoop)
PrevParentLoop->addChildLoop(State->CurrentParentLoop);
else
State->LI->addTopLevelLoop(State->CurrentParentLoop);
}

auto IsReplicateRegion = [](VPBlockBase *BB) {
auto *R = dyn_cast_or_null<VPRegionBlock>(BB);
return R && R->isReplicator();
assert((!R || R->isReplicator()) &&
"only replicate region blocks should remain");
return R;
};

// 1. Create an IR basic block.
if ((Replica && this == getParent()->getEntry()) ||
IsReplicateRegion(getSingleHierarchicalPredecessor())) {
Expand All @@ -514,6 +546,11 @@ void VPBasicBlock::execute(VPTransformState *State) {

// 2. Fill the IR basic block with IR instructions.
executeRecipes(State, NewBB);

// If this block is a latch, update CurrentParentLoop.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking if current canonical block isLatch could be done by checking if its 2nd successor isHeader?
(Preheader has a single successor whereas latch of canonicalized vector loop has two, with header as 2nd).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplified thanks

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// If this block is a latch, update CurrentParentLoop.
// If this block is a latch, update CurrentParentLoop. The second successor of a latch is a header, with the other successor being an exit/middle.

Wrapping in VPBlockUtils::isLatch() may be clearer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done thanks

if (getNumSuccessors() == 2 &&
VPBlockUtils::isHeader(getSuccessors()[1], State->VPDT))
State->CurrentParentLoop = State->CurrentParentLoop->getParentLoop();
}

VPBasicBlock *VPBasicBlock::clone() {
Expand Down Expand Up @@ -725,35 +762,13 @@ VPRegionBlock *VPRegionBlock::clone() {
}

void VPRegionBlock::execute(VPTransformState *State) {
ReversePostOrderTraversal<VPBlockShallowTraversalWrapper<VPBlockBase *>>
RPOT(Entry);

if (!isReplicator()) {
// Create and register the new vector loop.
Loop *PrevParentLoop = State->CurrentParentLoop;
State->CurrentParentLoop = State->LI->AllocateLoop();

// Insert the new loop into the loop nest and register the new basic blocks
// before calling any utilities such as SCEV that require valid LoopInfo.
if (PrevParentLoop)
PrevParentLoop->addChildLoop(State->CurrentParentLoop);
else
State->LI->addTopLevelLoop(State->CurrentParentLoop);

// Visit the VPBlocks connected to "this", starting from it.
for (VPBlockBase *Block : RPOT) {
LLVM_DEBUG(dbgs() << "LV: VPBlock in RPO " << Block->getName() << '\n');
Block->execute(State);
}

State->CurrentParentLoop = PrevParentLoop;
return;
}

assert(isReplicator() &&
"Loop regions should have been lowered to plain CFG");
assert(!State->Lane && "Replicating a Region with non-null instance.");

// Enter replicating mode.
assert(!State->VF.isScalable() && "VF is assumed to be non scalable.");

ReversePostOrderTraversal<VPBlockShallowTraversalWrapper<VPBlockBase *>> RPOT(
Entry);
State->Lane = VPLane(0);
for (unsigned Lane = 0, VF = State->VF.getKnownMinValue(); Lane < VF;
++Lane) {
Expand Down Expand Up @@ -847,6 +862,23 @@ void VPRegionBlock::print(raw_ostream &O, const Twine &Indent,
}
#endif

void VPRegionBlock::removeRegion() {
auto *Header = cast<VPBasicBlock>(getEntry());
VPBlockBase *Preheader = getSinglePredecessor();
auto *Exiting = cast<VPBasicBlock>(getExiting());

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done thanks

VPBlockBase *Middle = getSingleSuccessor();
VPBlockUtils::disconnectBlocks(Preheader, this);
VPBlockUtils::disconnectBlocks(this, Middle);

for (VPBlockBase *VPB : vp_depth_first_shallow(Entry))
VPB->setParent(getParent());

VPBlockUtils::connectBlocks(Preheader, Header);
VPBlockUtils::connectBlocks(Exiting, Middle);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the backedge be added too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, moved here, thanks!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should Exiting be called Latch? It is admittedly both, so could also be ExitingLatch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to ExitingLatch, thanks!

VPBlockUtils::connectBlocks(Exiting, Header);
}

VPlan::VPlan(Loop *L) {
setEntry(createVPIRBasicBlock(L->getLoopPreheader()));
ScalarHeader = createVPIRBasicBlock(L->getHeader());
Expand Down Expand Up @@ -956,18 +988,16 @@ void VPlan::execute(VPTransformState *State) {
for (VPBlockBase *Block : RPOT)
Block->execute(State);

State->CFG.DTU.flush();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to flush now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is independent, will submit separately, thanks!


auto *LoopRegion = getVectorLoopRegion();
if (!LoopRegion)
VPBasicBlock *Header =
vputils::getTopLevelVectorLoopHeader(*this, State->VPDT);
if (!Header)
return;

VPBasicBlock *LatchVPBB = LoopRegion->getExitingBasicBlock();
auto *LatchVPBB = cast<VPBasicBlock>(Header->getPredecessors()[1]);
BasicBlock *VectorLatchBB = State->CFG.VPBB2IRBB[LatchVPBB];

// Fix the latch value of canonical, reduction and first-order recurrences
// phis in the vector loop.
VPBasicBlock *Header = LoopRegion->getEntryBasicBlock();
for (VPRecipeBase &R : Header->phis()) {
// Skip phi-like recipes that generate their backedege values themselves.
if (isa<VPWidenPHIRecipe>(&R))
Expand Down Expand Up @@ -1003,8 +1033,10 @@ void VPlan::execute(VPTransformState *State) {
bool NeedsScalar = isa<VPInstruction>(PhiR) ||
(isa<VPReductionPHIRecipe>(PhiR) &&
cast<VPReductionPHIRecipe>(PhiR)->isInLoop());

Value *Phi = State->get(PhiR, NeedsScalar);
// VPHeaderPHIRecipe supports getBackedgeValue() but VPInstruction does not.
// VPHeaderPHIRecipe supports getBackedgeValue() but VPInstruction does
// not.
Value *Val = State->get(PhiR->getOperand(1), NeedsScalar);
cast<PHINode>(Phi)->addIncoming(Val, VectorLatchBB);
}
Expand Down Expand Up @@ -1365,17 +1397,18 @@ void VPlanPrinter::dumpRegion(const VPRegionBlock *Region) {

#endif

/// Returns true if there is a vector loop region and \p VPV is defined in a
/// loop region.
static bool isDefinedInsideLoopRegions(const VPValue *VPV) {
const VPRecipeBase *DefR = VPV->getDefiningRecipe();
return DefR && (!DefR->getParent()->getPlan()->getVectorLoopRegion() ||
DefR->getParent()->getEnclosingLoopRegion());
}
bool VPValue::isDefinedOutsideLoop() const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not adjusted the name of isDefinedInsideLoopRegions yet, as it's code is now not touched in the PR. can do separately or pull in here.

As you prefer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restored original name for now

auto *DefR = getDefiningRecipe();
if (!DefR)
return true;

bool VPValue::isDefinedOutsideLoopRegions() const {
return !isDefinedInsideLoopRegions(this);
// For non-live-ins, check if is in a region only if the top-level loop region
// still exits.
const VPBasicBlock *DefVPBB = DefR->getParent();
auto *Plan = DefVPBB->getPlan();
return Plan->getVectorLoopRegion() && !DefVPBB->getEnclosingLoopRegion();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How/Does this differ from the existing isDefinedOutsideLoopRegions(), if it still depends on (loop) regions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previousisDefinedOutsideLoopRegions would return true if there are no loop regions (because the only possible case was when it had been removed). Now we need to be more conservative, because the regions will be removed, but loops still remain.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

De Morgan'izing previous isDefinedOutsideLoopRegions():

!(DefR && (!DefR->getParent()->getPlan()->getVectorLoopRegion() ||
           DefR->getParent()->getEnclosingLoopRegion()))
==
!DefR || !(!DefR->getParent()->getPlan()->getVectorLoopRegion() ||
           DefR->getParent()->getEnclosingLoopRegion())
==
!DefR || (DefR->getParent()->getPlan()->getVectorLoopRegion() &&
          !DefR->getParent()->getEnclosingLoopRegion())

which appears to be what the current isDefinedOutsideLoop() is doing?

This is already conservative in the sense that answering false means it may be defined inside loops - those that remain after loop regions are removed? Perhaps more accurately called mustBeDefinedOutsideLoops(), instead of negating isDefinedInsideLoopRegions() which is precisely documented to return "true if there is a vector loop region and \p VPV is defined in a loop region".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, messed something up on my side when trying to restore, should be done now. Still kept the new name isDefinedOutsideLoop.

I've not adjusted the name of isDefinedInsideLoopRegions yet, as it's code is now not touched in the PR. can do separately or pull in here.

}

void VPValue::replaceAllUsesWith(VPValue *New) {
replaceUsesWithIf(New, [](VPUser &, unsigned) { return true; });
}
Expand Down
17 changes: 9 additions & 8 deletions llvm/lib/Transforms/Vectorize/VPlan.h
Original file line number Diff line number Diff line change
Expand Up @@ -1589,9 +1589,7 @@ struct VPWidenSelectRecipe : public VPRecipeWithIRFlags, public VPIRMetadata {
return getOperand(0);
}

bool isInvariantCond() const {
return getCond()->isDefinedOutsideLoopRegions();
}
bool isInvariantCond() const { return getCond()->isDefinedOutsideLoop(); }

/// Returns true if the recipe only uses the first lane of operand \p Op.
bool onlyFirstLaneUsed(const VPValue *Op) const override {
Expand All @@ -1604,17 +1602,16 @@ struct VPWidenSelectRecipe : public VPRecipeWithIRFlags, public VPIRMetadata {
/// A recipe for handling GEP instructions.
class VPWidenGEPRecipe : public VPRecipeWithIRFlags {
bool isPointerLoopInvariant() const {
return getOperand(0)->isDefinedOutsideLoopRegions();
return getOperand(0)->isDefinedOutsideLoop();
}

bool isIndexLoopInvariant(unsigned I) const {
return getOperand(I + 1)->isDefinedOutsideLoopRegions();
return getOperand(I + 1)->isDefinedOutsideLoop();
}

bool areAllOperandsInvariant() const {
return all_of(operands(), [](VPValue *Op) {
return Op->isDefinedOutsideLoopRegions();
});
return all_of(operands(),
[](VPValue *Op) { return Op->isDefinedOutsideLoop(); });
}

public:
Expand Down Expand Up @@ -3566,6 +3563,10 @@ class VPRegionBlock : public VPBlockBase {
/// Clone all blocks in the single-entry single-exit region of the block and
/// their recipes without updating the operands of the cloned recipes.
VPRegionBlock *clone() override;

/// Remove the current region from its VPlan, connecting its predecessor to
/// its entry and exiting block to its successor.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// its entry and exiting block to its successor.
/// its entry, and its exiting block to its successor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated ,thanks

void removeRegion();
};

/// VPlan models a candidate for vectorization, encoding various decisions take
Expand Down
Loading