
[RISCV] Move exact VLEN VLMAX encoding to RISCVInsertVSETVLI. NFC-ish #79338


Closed

Conversation

lukel97
Contributor

@lukel97 lukel97 commented Jan 24, 2024

In #75412 and #75509 we started to use the vsetvli a0, zero encoding to reduce
register pressure when we know the exact VLEN.
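As a minimal illustration (values hypothetical, assuming a subtarget where VLEN is known to be exactly 256): with SEW=8 and LMUL=1, VLMAX is 256/8 = 32, which doesn't fit in the 5-bit immediate of vsetivli, so the AVL would otherwise have to be materialized into a register first:

  # AVL held in a GPR, keeping the constant live up to the vsetvli:
  li      a0, 32
  vsetvli zero, a0, e8, m1, ta, ma

  # VLMAX encoding (AVL operand is x0, rd is not x0): vl becomes VLMAX
  # directly, so the li above is no longer needed:
  vsetvli a0, zero, e8, m1, ta, ma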

We do this by canonicalizing VL ops to X0 in RISCVISelLowering.

However, this is only needed when the AVL doesn't fit inside an immediate,
since we already have code in RISCVInsertVSETVLI that detects whether the
exact VLEN is known and uses the immediate encoding if possible.
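Concretely (same hypothetical VLEN=256 subtarget): with SEW=32 and LMUL=1, VLMAX is 256/32 = 8, which does fit in a uimm5, so RISCVInsertVSETVLI can already emit the immediate form without involving a register at all:

  vsetivli zero, 8, e32, m1, ta, ma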

This patch removes the need for the canonicalization in RISCVISelLowering by
handling that case in RISCVInsertVSETVLI itself.

There are a handful of reasons I'm proposing to do this:

  • It keeps all of the vsetvli logic in one pass
  • We get to move code out of SelectionDAG so we don't forget about it in
    GlobalISel when the time comes
  • Canonicalizing the VL in RISCVISelLowering means that we lose the original VL
    for fixed length vectors, which might be useful information for passes that
    sit between lowering and RISCVInsertVSETVLI; see the sketch after this list.
    (I discovered this when working on a patch for RISCVFoldMasks.cpp that
    operated on fixed length vectors)
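A sketch of that last point, in illustrative pseudo SelectionDAG output (node and value names hypothetical; assuming VLEN is known to be exactly 128, so VLMAX for v4i32 is 4):

  ; with the original VL kept, later passes can see only 4 elements matter:
  t5: v4i32 = RISCVISD::ADD_VL t2, t3, mask, VL=4
  ; after canonicalizing to X0, that information is gone before
  ; RISCVInsertVSETVLI runs:
  t5: v4i32 = RISCVISD::ADD_VL t2, t3, mask, VL=X0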

@llvmbot
Member

llvmbot commented Jan 24, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)


Full diff: https://github.com/llvm/llvm-project/pull/79338.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+9-21)
  • (modified) llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp (+14-2)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 56a5ab14a4a9f8d..4a3474803dcf831 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -2592,16 +2592,8 @@ static SDValue getAllOnesMask(MVT VecVT, SDValue VL, const SDLoc &DL,
   return DAG.getNode(RISCVISD::VMSET_VL, DL, MaskVT, VL);
 }
 
-static SDValue getVLOp(uint64_t NumElts, MVT ContainerVT, const SDLoc &DL,
-                       SelectionDAG &DAG, const RISCVSubtarget &Subtarget) {
-  // If we know the exact VLEN, and our VL is exactly equal to VLMAX,
-  // canonicalize the representation.  InsertVSETVLI will pick the immediate
-  // encoding later if profitable.
-  const auto [MinVLMAX, MaxVLMAX] =
-      RISCVTargetLowering::computeVLMAXBounds(ContainerVT, Subtarget);
-  if (MinVLMAX == MaxVLMAX && NumElts == MinVLMAX)
-    return DAG.getRegister(RISCV::X0, Subtarget.getXLenVT());
-
+static SDValue getVLOp(uint64_t NumElts, const SDLoc &DL, SelectionDAG &DAG,
+                       const RISCVSubtarget &Subtarget) {
   return DAG.getConstant(NumElts, DL, Subtarget.getXLenVT());
 }
 
@@ -2618,7 +2610,7 @@ static std::pair<SDValue, SDValue>
 getDefaultVLOps(uint64_t NumElts, MVT ContainerVT, const SDLoc &DL,
                 SelectionDAG &DAG, const RISCVSubtarget &Subtarget) {
   assert(ContainerVT.isScalableVector() && "Expecting scalable container type");
-  SDValue VL = getVLOp(NumElts, ContainerVT, DL, DAG, Subtarget);
+  SDValue VL = getVLOp(NumElts, DL, DAG, Subtarget);
   SDValue Mask = getAllOnesMask(ContainerVT, VL, DL, DAG);
   return {Mask, VL};
 }
@@ -8908,8 +8900,7 @@ SDValue RISCVTargetLowering::LowerINTRINSIC_W_CHAIN(SDValue Op,
     MVT VT = Op->getSimpleValueType(0);
     MVT ContainerVT = getContainerForFixedLengthVector(VT);
 
-    SDValue VL = getVLOp(VT.getVectorNumElements(), ContainerVT, DL, DAG,
-                         Subtarget);
+    SDValue VL = getVLOp(VT.getVectorNumElements(), DL, DAG, Subtarget);
     SDValue IntID = DAG.getTargetConstant(VlsegInts[NF - 2], DL, XLenVT);
     auto *Load = cast<MemIntrinsicSDNode>(Op);
     SmallVector<EVT, 9> ContainerVTs(NF, ContainerVT);
@@ -9029,8 +9020,7 @@ SDValue RISCVTargetLowering::LowerINTRINSIC_VOID(SDValue Op,
     MVT VT = Op->getOperand(2).getSimpleValueType();
     MVT ContainerVT = getContainerForFixedLengthVector(VT);
 
-    SDValue VL = getVLOp(VT.getVectorNumElements(), ContainerVT, DL, DAG,
-                         Subtarget);
+    SDValue VL = getVLOp(VT.getVectorNumElements(), DL, DAG, Subtarget);
     SDValue IntID = DAG.getTargetConstant(VssegInts[NF - 2], DL, XLenVT);
     SDValue Ptr = Op->getOperand(NF + 2);
 
@@ -9534,7 +9524,7 @@ SDValue RISCVTargetLowering::lowerINSERT_SUBVECTOR(SDValue Op,
     // Set the vector length to only the number of elements we care about. Note
     // that for slideup this includes the offset.
     unsigned EndIndex = OrigIdx + SubVecVT.getVectorNumElements();
-    SDValue VL = getVLOp(EndIndex, ContainerVT, DL, DAG, Subtarget);
+    SDValue VL = getVLOp(EndIndex, DL, DAG, Subtarget);
 
     // Use tail agnostic policy if we're inserting over Vec's tail.
     unsigned Policy = RISCVII::TAIL_UNDISTURBED_MASK_UNDISTURBED;
@@ -9711,8 +9701,7 @@ SDValue RISCVTargetLowering::lowerEXTRACT_SUBVECTOR(SDValue Op,
         getDefaultVLOps(VecVT, ContainerVT, DL, DAG, Subtarget).first;
     // Set the vector length to only the number of elements we care about. This
     // avoids sliding down elements we're going to discard straight away.
-    SDValue VL = getVLOp(SubVecVT.getVectorNumElements(), ContainerVT, DL, DAG,
-                         Subtarget);
+    SDValue VL = getVLOp(SubVecVT.getVectorNumElements(), DL, DAG, Subtarget);
     SDValue SlidedownAmt = DAG.getConstant(OrigIdx, DL, XLenVT);
     SDValue Slidedown =
         getVSlidedown(DAG, Subtarget, DL, ContainerVT,
@@ -10132,7 +10121,7 @@ RISCVTargetLowering::lowerFixedLengthVectorLoadToRVV(SDValue Op,
     return DAG.getMergeValues({Result, NewLoad.getValue(1)}, DL);
   }
 
-  SDValue VL = getVLOp(VT.getVectorNumElements(), ContainerVT, DL, DAG, Subtarget);
+  SDValue VL = getVLOp(VT.getVectorNumElements(), DL, DAG, Subtarget);
 
   bool IsMaskOp = VT.getVectorElementType() == MVT::i1;
   SDValue IntID = DAG.getTargetConstant(
@@ -10189,8 +10178,7 @@ RISCVTargetLowering::lowerFixedLengthVectorStoreToRVV(SDValue Op,
     return DAG.getStore(Store->getChain(), DL, NewValue, Store->getBasePtr(),
                         Store->getMemOperand());
 
-  SDValue VL = getVLOp(VT.getVectorNumElements(), ContainerVT, DL, DAG,
-                       Subtarget);
+  SDValue VL = getVLOp(VT.getVectorNumElements(), DL, DAG, Subtarget);
 
   bool IsMaskOp = VT.getVectorElementType() == MVT::i1;
   SDValue IntID = DAG.getTargetConstant(
diff --git a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
index a14f9a283547370..7ba93354df9bbc6 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -858,13 +858,14 @@ static VSETVLIInfo computeInfoForInstr(const MachineInstr &MI, uint64_t TSFlags,
 
   if (RISCVII::hasVLOp(TSFlags)) {
     const MachineOperand &VLOp = MI.getOperand(getVLOpNum(MI));
+    const unsigned VLMAX = computeVLMAX(ST.getRealMaxVLen(), SEW, VLMul);
+
     if (VLOp.isImm()) {
       int64_t Imm = VLOp.getImm();
      // Convert the VLMax sentinel to X0 register.
       if (Imm == RISCV::VLMaxSentinel) {
         // If we know the exact VLEN, see if we can use the constant encoding
         // for the VLMAX instead.  This reduces register pressure slightly.
-        const unsigned VLMAX = computeVLMAX(ST.getRealMaxVLen(), SEW, VLMul);
         if (ST.getRealMinVLen() == ST.getRealMaxVLen() && VLMAX <= 31)
           InstrInfo.setAVLImm(VLMAX);
         else
@@ -873,7 +874,18 @@ static VSETVLIInfo computeInfoForInstr(const MachineInstr &MI, uint64_t TSFlags,
       else
         InstrInfo.setAVLImm(Imm);
     } else {
-      InstrInfo.setAVLReg(VLOp.getReg());
+      Register Reg = VLOp.getReg();
+      // If VL is a register, it may be a materialized constant that didn't fit
+      // into an uimm5. If we also know the exact VLEN, and the VL is equal to
+      // the exact VLEN, use the X0 encoding so we don't need the ADDI.
+      // doLocalPostpass will remove the ADDI if it's dead.
+      if (ST.getRealMinVLen() == ST.getRealMaxVLen() &&
+          VLOp.getReg().isVirtual())
+        if (auto *VLDef = MRI->getVRegDef(VLOp.getReg());
+            VLDef && isNonZeroLoadImmediate(*VLDef) &&
+            VLDef->getOperand(2).getImm() == VLMAX)
+          Reg = RISCV::X0;
+      InstrInfo.setAVLReg(Reg);
     }
   } else {
     assert(isScalarExtractInstr(MI));
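
For context on the helper used in the new code path: from its use above, isNonZeroLoadImmediate is expected to match an ADDI whose source is x0 (i.e. a small constant materialized into a register), which is why operand 2 is the immediate compared against VLMAX. A minimal sketch of such a predicate, inferred from this usage rather than copied verbatim from the LLVM sources:

  // Sketch; a predicate like this would live alongside computeInfoForInstr
  // in RISCVInsertVSETVLI.cpp. Returns true for "ADDI rd, x0, imm" with
  // imm != 0, i.e. a materialized non-zero constant.
  static bool isNonZeroLoadImmediate(const MachineInstr &MI) {
    return MI.getOpcode() == RISCV::ADDI &&
           MI.getOperand(1).isReg() && MI.getOperand(1).getReg() == RISCV::X0 &&
           MI.getOperand(2).isImm() && MI.getOperand(2).getImm() != 0;
  }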

@lukel97 lukel97 force-pushed the canonicalize-exact-vlen-x0-insertvsetvli branch from f5c7b2e to 6a8d675 on February 22, 2024, 08:27
@lukel97
Contributor Author

lukel97 commented Jul 25, 2024

Superseded by #100551

@lukel97 lukel97 closed this Jul 25, 2024