Skip to content

[SPIR-V] Improve correctness of emitted MIR between passes for branching instructions #106966

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

VyacheslavLevytskyy
Copy link
Contributor

This PR improves correctness of emitted MIR between passes for branching instructions and thus increase number of passing tests when expensive checks are on. Specifically, we address here such issues with machine verifier as:

  • fix switch generation: generate correct successors and undo the "address taken" status to reflect that a successor doesn't actually correspond to an IR-level basic block;
  • fix incorrect definition of OpBranch and OpBranchConditional in TableGen (SPIRVInstrInfo.td) to set isBarrier status properly and set a correct type of virtual registers;
  • fix a case when Phi refers to a type definition that goes after the Phi instruction, so that the virtual register definition of the type doesn't dominate all uses.

This PR decrease number of failing tests under expensive checks from 56 to 50.

@llvmbot
Copy link
Member

llvmbot commented Sep 2, 2024

@llvm/pr-subscribers-backend-spir-v

Author: Vyacheslav Levytskyy (VyacheslavLevytskyy)

Changes

This PR improves correctness of emitted MIR between passes for branching instructions and thus increase number of passing tests when expensive checks are on. Specifically, we address here such issues with machine verifier as:

  • fix switch generation: generate correct successors and undo the "address taken" status to reflect that a successor doesn't actually correspond to an IR-level basic block;
  • fix incorrect definition of OpBranch and OpBranchConditional in TableGen (SPIRVInstrInfo.td) to set isBarrier status properly and set a correct type of virtual registers;
  • fix a case when Phi refers to a type definition that goes after the Phi instruction, so that the virtual register definition of the type doesn't dominate all uses.

This PR decrease number of failing tests under expensive checks from 56 to 50.


Full diff: https://github.com/llvm/llvm-project/pull/106966.diff

9 Files Affected:

  • (modified) llvm/lib/Target/SPIRV/SPIRVGlobalRegistry.cpp (-6)
  • (modified) llvm/lib/Target/SPIRV/SPIRVISelLowering.cpp (+16)
  • (modified) llvm/lib/Target/SPIRV/SPIRVInstrInfo.td (+4-2)
  • (modified) llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp (+13-2)
  • (modified) llvm/test/CodeGen/SPIRV/branching/OpSwitchBranches.ll (+5-1)
  • (modified) llvm/test/CodeGen/SPIRV/branching/OpSwitchEmpty.ll (+5-1)
  • (modified) llvm/test/CodeGen/SPIRV/branching/OpSwitchUnreachable.ll (+4-1)
  • (modified) llvm/test/CodeGen/SPIRV/branching/Two_OpSwitch_same_register.ll (+4-1)
  • (modified) llvm/test/CodeGen/SPIRV/transcoding/GlobalFunAnnotate.ll (+1-1)
diff --git a/llvm/lib/Target/SPIRV/SPIRVGlobalRegistry.cpp b/llvm/lib/Target/SPIRV/SPIRVGlobalRegistry.cpp
index 54bcb96772e7ab..553d86efa3df34 100644
--- a/llvm/lib/Target/SPIRV/SPIRVGlobalRegistry.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVGlobalRegistry.cpp
@@ -168,8 +168,6 @@ SPIRVGlobalRegistry::getOrCreateConstIntReg(uint64_t Val, SPIRVType *SpvType,
   ConstantInt *CI = ConstantInt::get(const_cast<IntegerType *>(LLVMIntTy), Val);
   Register Res = DT.find(CI, CurMF);
   if (!Res.isValid()) {
-    // TODO: handle cases where the type is not 32bit wide
-    // TODO: https://github.com/llvm/llvm-project/issues/88129
     Res =
         CurMF->getRegInfo().createGenericVirtualRegister(LLT::scalar(BitWidth));
     CurMF->getRegInfo().setRegClass(Res, &SPIRV::iIDRegClass);
@@ -197,8 +195,6 @@ SPIRVGlobalRegistry::getOrCreateConstFloatReg(APFloat Val, SPIRVType *SpvType,
   auto *const CI = ConstantFP::get(Ctx, Val);
   Register Res = DT.find(CI, CurMF);
   if (!Res.isValid()) {
-    // TODO: handle cases where the type is not 32bit wide
-    // TODO: https://github.com/llvm/llvm-project/issues/88129
     Res =
         CurMF->getRegInfo().createGenericVirtualRegister(LLT::scalar(BitWidth));
     CurMF->getRegInfo().setRegClass(Res, &SPIRV::fIDRegClass);
@@ -391,8 +387,6 @@ Register SPIRVGlobalRegistry::getOrCreateCompositeOrNull(
       SpvScalConst =
           getOrCreateBaseRegister(Val, I, SpvType, TII, BitWidth, ZeroAsNull);
 
-    // TODO: handle cases where the type is not 32bit wide
-    // TODO: https://github.com/llvm/llvm-project/issues/88129
     LLT LLTy = LLT::scalar(64);
     Register SpvVecConst =
         CurMF->getRegInfo().createGenericVirtualRegister(LLTy);
diff --git a/llvm/lib/Target/SPIRV/SPIRVISelLowering.cpp b/llvm/lib/Target/SPIRV/SPIRVISelLowering.cpp
index 8db9808bb87e1d..682fca7cc7747c 100644
--- a/llvm/lib/Target/SPIRV/SPIRVISelLowering.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVISelLowering.cpp
@@ -353,6 +353,7 @@ void SPIRVTargetLowering::finalizeLowering(MachineFunction &MF) const {
   GR.setCurrentFunc(MF);
   for (MachineFunction::iterator I = MF.begin(), E = MF.end(); I != E; ++I) {
     MachineBasicBlock *MBB = &*I;
+    SmallPtrSet<MachineInstr *, 8> ToMove;
     for (MachineBasicBlock::iterator MBBI = MBB->begin(), MBBE = MBB->end();
          MBBI != MBBE;) {
       MachineInstr &MI = *MBBI++;
@@ -456,8 +457,23 @@ void SPIRVTargetLowering::finalizeLowering(MachineFunction &MF) const {
             MI.removeOperand(i);
         }
       } break;
+      case SPIRV::OpPhi: {
+        // Phi refers to a type definition that goes after the Phi
+        // instruction, so that the virtual register definition of the type
+        // doesn't dominate all uses. Let's place the type definition
+        // instruction at the end of the predecessor.
+        MachineBasicBlock *Curr = MI.getParent();
+        SPIRVType *Type = GR.getSPIRVTypeForVReg(MI.getOperand(1).getReg());
+        if (Type->getParent() == Curr && !Curr->pred_empty())
+          ToMove.insert(const_cast<MachineInstr *>(Type));
+      } break;
       }
     }
+    for (MachineInstr *MI : ToMove) {
+      MachineBasicBlock *Curr = MI->getParent();
+      MachineBasicBlock *Pred = *Curr->pred_begin();
+      Pred->insert(Pred->getFirstTerminator(), Curr->remove_instr(MI));
+    }
   }
   ProcessedMF.insert(&MF);
   TargetLowering::finalizeLowering(MF);
diff --git a/llvm/lib/Target/SPIRV/SPIRVInstrInfo.td b/llvm/lib/Target/SPIRV/SPIRVInstrInfo.td
index 79b9bb87739fec..0cbd2f4536075d 100644
--- a/llvm/lib/Target/SPIRV/SPIRVInstrInfo.td
+++ b/llvm/lib/Target/SPIRV/SPIRVInstrInfo.td
@@ -622,9 +622,11 @@ def OpLoopMerge: Op<246, (outs), (ins ID:$merge, ID:$continue, LoopControl:$lc,
 def OpSelectionMerge: Op<247, (outs), (ins ID:$merge, SelectionControl:$sc),
                   "OpSelectionMerge $merge $sc">;
 def OpLabel: Op<248, (outs ID:$label), (ins), "$label = OpLabel">;
+let isBarrier = 1, isTerminator=1 in {
+  def OpBranch: Op<249, (outs), (ins unknown:$label), "OpBranch $label">;
+}
 let isTerminator=1 in {
-  def OpBranch: Op<249, (outs), (ins ID:$label), "OpBranch $label">;
-  def OpBranchConditional: Op<250, (outs), (ins ID:$cond, ID:$true, ID:$false, variable_ops),
+  def OpBranchConditional: Op<250, (outs), (ins ID:$cond, unknown:$true, unknown:$false, variable_ops),
                   "OpBranchConditional $cond $true $false">;
   def OpSwitch: Op<251, (outs), (ins ID:$sel, ID:$dflt, variable_ops), "OpSwitch $sel $dflt">;
 }
diff --git a/llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp b/llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp
index 4e45aa4b888a92..6bf08be6bb9194 100644
--- a/llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp
@@ -778,8 +778,10 @@ static void processSwitches(MachineFunction &MF, SPIRVGlobalRegistry *GR,
   }
 
   SmallPtrSet<MachineInstr *, 8> ToEraseMI;
+  SmallPtrSet<MachineBasicBlock *, 8> ClearAddressTaken;
   for (auto &SwIt : Switches) {
     MachineInstr &MI = *SwIt.first;
+    MachineBasicBlock *MBB = MI.getParent();
     SmallVector<MachineInstr *, 8> &Ins = SwIt.second;
     SmallVector<MachineOperand, 8> NewOps;
     for (unsigned i = 0; i < Ins.size(); ++i) {
@@ -790,8 +792,11 @@ static void processSwitches(MachineFunction &MF, SPIRVGlobalRegistry *GR,
         if (It == BB2MBB.end())
           report_fatal_error("cannot find a machine basic block by a basic "
                              "block in a switch statement");
-        NewOps.push_back(MachineOperand::CreateMBB(It->second));
-        MI.getParent()->addSuccessor(It->second);
+        MachineBasicBlock *Succ = It->second;
+        ClearAddressTaken.insert(Succ);
+        NewOps.push_back(MachineOperand::CreateMBB(Succ));
+        if (!llvm::is_contained(MBB->successors(), Succ))
+          MBB->addSuccessor(Succ);
         ToEraseMI.insert(Ins[i]);
       } else {
         NewOps.push_back(
@@ -830,6 +835,12 @@ static void processSwitches(MachineFunction &MF, SPIRVGlobalRegistry *GR,
     }
     BlockAddrI->eraseFromParent();
   }
+
+  // BlockAddress operands were used to keep information between passes,
+  // let's undo the "address taken" status to reflect that Succ doesn't
+  // actually correspond to an IR-level basic block.
+  for (MachineBasicBlock *Succ: ClearAddressTaken)
+    Succ->setAddressTakenIRBlock(nullptr);
 }
 
 static bool isImplicitFallthrough(MachineBasicBlock &MBB) {
diff --git a/llvm/test/CodeGen/SPIRV/branching/OpSwitchBranches.ll b/llvm/test/CodeGen/SPIRV/branching/OpSwitchBranches.ll
index 454b51d952a365..f5dc1efa2fb957 100644
--- a/llvm/test/CodeGen/SPIRV/branching/OpSwitchBranches.ll
+++ b/llvm/test/CodeGen/SPIRV/branching/OpSwitchBranches.ll
@@ -1,4 +1,8 @@
-; RUN: llc -O0 -mtriple=spirv32-unknown-unknown %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
+; RUN: llc -verify-machineinstrs -O0 -mtriple=spirv64-unknown-unknown %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv64-unknown-unknown %s -o - -filetype=obj | spirv-val %}
+
+; RUN: llc -verify-machineinstrs -O0 -mtriple=spirv32-unknown-unknown %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv32-unknown-unknown %s -o - -filetype=obj | spirv-val %}
 
 define i32 @test_switch_branches(i32 %a) {
 entry:
diff --git a/llvm/test/CodeGen/SPIRV/branching/OpSwitchEmpty.ll b/llvm/test/CodeGen/SPIRV/branching/OpSwitchEmpty.ll
index c309883577405b..768ab6ed2a05c2 100644
--- a/llvm/test/CodeGen/SPIRV/branching/OpSwitchEmpty.ll
+++ b/llvm/test/CodeGen/SPIRV/branching/OpSwitchEmpty.ll
@@ -8,7 +8,11 @@
 ;; Command:
 ;; clang -cc1 -triple spir -emit-llvm -o test/SPIRV/OpSwitchEmpty.ll OpSwitchEmpty.cl -disable-llvm-passes
 
-; RUN: llc -O0 -mtriple=spirv32-unknown-unknown %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
+; RUN: llc -verify-machineinstrs -O0 -mtriple=spirv64-unknown-unknown %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv64-unknown-unknown %s -o - -filetype=obj | spirv-val %}
+
+; RUN: llc -verify-machineinstrs -O0 -mtriple=spirv32-unknown-unknown %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv32-unknown-unknown %s -o - -filetype=obj | spirv-val %}
 
 ; CHECK-SPIRV: %[[#X:]] = OpFunctionParameter %[[#]]
 ; CHECK-SPIRV: OpSwitch %[[#X]] %[[#DEFAULT:]]{{$}}
diff --git a/llvm/test/CodeGen/SPIRV/branching/OpSwitchUnreachable.ll b/llvm/test/CodeGen/SPIRV/branching/OpSwitchUnreachable.ll
index 6eb36e5756ecf6..7fbd06c67b56e0 100644
--- a/llvm/test/CodeGen/SPIRV/branching/OpSwitchUnreachable.ll
+++ b/llvm/test/CodeGen/SPIRV/branching/OpSwitchUnreachable.ll
@@ -1,4 +1,7 @@
-; RUN: llc -O0 -mtriple=spirv32-unknown-unknown %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
+; RUN: llc -verify-machineinstrs -O0 -mtriple=spirv64-unknown-unknown %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv64-unknown-unknown %s -o - -filetype=obj | spirv-val %}
+
+; RUN: llc -verify-machineinstrs -O0 -mtriple=spirv32-unknown-unknown %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
 ; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv32-unknown-unknown %s -o - -filetype=obj | spirv-val %}
 
 define void @test_switch_with_unreachable_block(i1 %a) {
diff --git a/llvm/test/CodeGen/SPIRV/branching/Two_OpSwitch_same_register.ll b/llvm/test/CodeGen/SPIRV/branching/Two_OpSwitch_same_register.ll
index f5c12e1c0f5a3f..265226cbfe2380 100644
--- a/llvm/test/CodeGen/SPIRV/branching/Two_OpSwitch_same_register.ll
+++ b/llvm/test/CodeGen/SPIRV/branching/Two_OpSwitch_same_register.ll
@@ -1,6 +1,9 @@
-; RUN: llc -O0 -mtriple=spirv64-unknown-unknown %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
+; RUN: llc -verify-machineinstrs -O0 -mtriple=spirv64-unknown-unknown %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
 ; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv64-unknown-unknown %s -o - -filetype=obj | spirv-val %}
 
+; RUN: llc -verify-machineinstrs -O0 -mtriple=spirv32-unknown-unknown %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv32-unknown-unknown %s -o - -filetype=obj | spirv-val %}
+
 define spir_kernel void @test_two_switch_same_register(i32 %value) {
 ; CHECK-SPIRV:      OpSwitch %[[#REGISTER:]] %[[#DEFAULT1:]] 1 %[[#CASE1:]] 0 %[[#CASE2:]]
   switch i32 %value, label %default1 [
diff --git a/llvm/test/CodeGen/SPIRV/transcoding/GlobalFunAnnotate.ll b/llvm/test/CodeGen/SPIRV/transcoding/GlobalFunAnnotate.ll
index 33bece5b9c00f7..d17e228c2ef88d 100644
--- a/llvm/test/CodeGen/SPIRV/transcoding/GlobalFunAnnotate.ll
+++ b/llvm/test/CodeGen/SPIRV/transcoding/GlobalFunAnnotate.ll
@@ -1,4 +1,4 @@
-; RUN: llc -O0 -mtriple=spirv64-unknown-linux %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
+; RUN: llc -verify-machineinstrs -O0 -mtriple=spirv64-unknown-linux %s -o - | FileCheck %s --check-prefix=CHECK-SPIRV
 ; TODO: %if spirv-tools %{ llc -O0 -mtriple=spirv64-unknown-unknown %s -o - -filetype=obj | spirv-val %}
 
 ; CHECK-SPIRV: OpDecorate %[[#]] UserSemantic "annotation_on_function"

Copy link

github-actions bot commented Sep 2, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@VyacheslavLevytskyy VyacheslavLevytskyy merged commit ebdadcf into llvm:main Sep 3, 2024
9 checks passed
VyacheslavLevytskyy added a commit that referenced this pull request Sep 26, 2024
…Sel (#108893)

Two main goals of this PR are:
* to support "Arithmetic with Overflow" intrinsics, including the
special case when those intrinsics are being generated by the
CodeGenPrepare pass during translations with optimization;
* to redirect intrinsics with aggregate return type to be lowered via
GlobalISel operations instead of SPIRV-specific unfolding/lowering (see
#95012).

There is a new test case
`llvm/test/CodeGen/SPIRV/passes/translate-aggregate-uaddo.ll` that
describes and checks the general logics of the translation.

This PR continues a series of PRs aimed to identify and fix flaws in
code emission, to improve pass rates for the mode with expensive checks
set on (see #101732,
#104104,
#106966), having in mind the
ultimate goal of proceeding towards the non-experimental status of
SPIR-V Backend.

The reproducers are:

1) consider `llc -O3 -mtriple=spirv64-unknown-unknown ...` with:

```
define spir_func i32 @foo(i32 %a, ptr addrspace(4) %p) {
entry:
  br label %l1

l1:
  %e = phi i32 [ %a, %entry ], [ %i, %body ]
  %i = add nsw i32 %e, 1
  %fl = icmp eq i32 %i, 0
  br i1 %fl, label %exit, label %body

body:
  store i8 42, ptr addrspace(4) %p
  br label %l1

exit:
  ret i32 %i
}
```

2) consider `llc -O0 -mtriple=spirv64-unknown-unknown ...` with:

```
define spir_func i32 @foo(i32 %a, ptr addrspace(4) %p) {
entry:
  br label %l1

l1:                                               ; preds = %body, %entry
  %e = phi i32 [ %a, %entry ], [ %math, %body ]
  %0 = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %e, i32 1)
  %math = extractvalue { i32, i1 } %0, 0
  %ov = extractvalue { i32, i1 } %0, 1
  br i1 %ov, label %exit, label %body

body:                                             ; preds = %l1
  store i8 42, ptr addrspace(4) %p, align 1
  br label %l1

exit:                                             ; preds = %l1
  ret i32 %math
}
```
Sterling-Augustine pushed a commit to Sterling-Augustine/llvm-project that referenced this pull request Sep 27, 2024
…Sel (llvm#108893)

Two main goals of this PR are:
* to support "Arithmetic with Overflow" intrinsics, including the
special case when those intrinsics are being generated by the
CodeGenPrepare pass during translations with optimization;
* to redirect intrinsics with aggregate return type to be lowered via
GlobalISel operations instead of SPIRV-specific unfolding/lowering (see
llvm#95012).

There is a new test case
`llvm/test/CodeGen/SPIRV/passes/translate-aggregate-uaddo.ll` that
describes and checks the general logics of the translation.

This PR continues a series of PRs aimed to identify and fix flaws in
code emission, to improve pass rates for the mode with expensive checks
set on (see llvm#101732,
llvm#104104,
llvm#106966), having in mind the
ultimate goal of proceeding towards the non-experimental status of
SPIR-V Backend.

The reproducers are:

1) consider `llc -O3 -mtriple=spirv64-unknown-unknown ...` with:

```
define spir_func i32 @foo(i32 %a, ptr addrspace(4) %p) {
entry:
  br label %l1

l1:
  %e = phi i32 [ %a, %entry ], [ %i, %body ]
  %i = add nsw i32 %e, 1
  %fl = icmp eq i32 %i, 0
  br i1 %fl, label %exit, label %body

body:
  store i8 42, ptr addrspace(4) %p
  br label %l1

exit:
  ret i32 %i
}
```

2) consider `llc -O0 -mtriple=spirv64-unknown-unknown ...` with:

```
define spir_func i32 @foo(i32 %a, ptr addrspace(4) %p) {
entry:
  br label %l1

l1:                                               ; preds = %body, %entry
  %e = phi i32 [ %a, %entry ], [ %math, %body ]
  %0 = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %e, i32 1)
  %math = extractvalue { i32, i1 } %0, 0
  %ov = extractvalue { i32, i1 } %0, 1
  br i1 %ov, label %exit, label %body

body:                                             ; preds = %l1
  store i8 42, ptr addrspace(4) %p, align 1
  br label %l1

exit:                                             ; preds = %l1
  ret i32 %math
}
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants