Skip to content

[DebugInfo] Clone dbg.values in SimplifyCFG like normal instructions #72526

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

jmorse
Copy link
Member

@jmorse jmorse commented Nov 16, 2023

The code in the CloneInstructionsIntoPredec... function modified by this patch has a long history that dates back to 2011, see d715ec8. There, when folding branches, all dbg.value intrinsics seen when folding would be saved and then re-inserted at the end of whatever was folded. Over the last 12 years this behaviour has been preserved.

However, IMO it's bad behaviour. If we have:

inst1
dbg.value1
inst2
dbg.value2

And we fold that sequence into a different block, then we would want the instructions and variable assignments to appear in the same order. However because of this old behaviour, the dbg.values are sunk, and we get:

inst1
inst2
dbg.value1
dbg.value2

This clustering of dbg.values can make assignments to the same variable invisible, as well as reducing the coverage of other assignments.

This patch relaxes the CloneInstructions... function and allows it to clone and update dbg.values in-place, causing them to appear in the original order in the destination block. I've added some extra dbg.values to the updated test: without the changes to the pass, the dbg.values sink into a blob ahead of the select.

(Metadata changes to make the LLVM-IR parser not drop the debug-info for it being out of date. The RemoveDIs related RUN line has been removed because it was spuriously passing due to the debug-info being dropped).

~

This has come up as part of the RemoveDIs stuff because it's a weird behaviour that requires explicit instrumentation. IMO it doesn't need to be so abnormal, the dbg.values can be treated like normal instructions. And that makes it easier to move from one representation to the other, as all the utilities we've written are geared towards updating things like they're dbg.values.

The code in the CloneInstructionsIntoPredec... function modified by this
patch has a long history that dates back to 2011, see d715ec8.
There, when folding branches, all dbg.value intrinsics seen when folding
would be saved and then re-inserted at the end of whatever was folded. Over
the last 12 years this behaviour has been preserved.

However, IMO it's bad behaviour. If we have:

  inst1
  dbg.value1
  inst2
  dbg.value2

And we fold that sequence into a different block, then we would want the
instructions and variable assignments to appear in the same order. However
because of this old behaviour, the dbg.values are sunk, and we get:

  inst1
  inst2
  dbg.value1
  dbg.value2

This clustering of dbg.values can make assignments to the same variable
invisible, as well as reducing the coverage of other assignments.

This patch relaxes the CloneInstructions... function and allows it to clone
and update dbg.values in-place, causing them to appear in the original
order in the destination block. I've added some extra dbg.values to the
updated test: without the changes to the pass, the dbg.values sink into a
blob ahead of the select.

(Metadata changes to make the LLVM-IR parser not drop the debug-info for it
being out of date. The RemoveDIs related RUN line has been removed because
it was spuriously passing due to the debug-info being dropped).
@llvmbot
Copy link
Member

llvmbot commented Nov 16, 2023

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-debuginfo

Author: Jeremy Morse (jmorse)

Changes

The code in the CloneInstructionsIntoPredec... function modified by this patch has a long history that dates back to 2011, see d715ec8. There, when folding branches, all dbg.value intrinsics seen when folding would be saved and then re-inserted at the end of whatever was folded. Over the last 12 years this behaviour has been preserved.

However, IMO it's bad behaviour. If we have:

inst1
dbg.value1
inst2
dbg.value2

And we fold that sequence into a different block, then we would want the instructions and variable assignments to appear in the same order. However because of this old behaviour, the dbg.values are sunk, and we get:

inst1
inst2
dbg.value1
dbg.value2

This clustering of dbg.values can make assignments to the same variable invisible, as well as reducing the coverage of other assignments.

This patch relaxes the CloneInstructions... function and allows it to clone and update dbg.values in-place, causing them to appear in the original order in the destination block. I've added some extra dbg.values to the updated test: without the changes to the pass, the dbg.values sink into a blob ahead of the select.

(Metadata changes to make the LLVM-IR parser not drop the debug-info for it being out of date. The RemoveDIs related RUN line has been removed because it was spuriously passing due to the debug-info being dropped).

~

This has come up as part of the RemoveDIs stuff because it's a weird behaviour that requires explicit instrumentation. IMO it doesn't need to be so abnormal, the dbg.values can be treated like normal instructions. And that makes it easier to move from one representation to the other, as all the utilities we've written are geared towards updating things like they're dbg.values.


Full diff: https://github.com/llvm/llvm-project/pull/72526.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/SimplifyCFG.cpp (+6-12)
  • (modified) llvm/test/Transforms/SimplifyCFG/branch-fold-dbg.ll (+21-13)
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index 4dae52a8ecffdf6..401a71d59f3324e 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -1101,12 +1101,12 @@ static void CloneInstructionsIntoPredecessorBlockAndUpdateSSAUses(
   // Note that there may be multiple predecessor blocks, so we cannot move
   // bonus instructions to a predecessor block.
   for (Instruction &BonusInst : *BB) {
-    if (isa<DbgInfoIntrinsic>(BonusInst) || BonusInst.isTerminator())
+    if (BonusInst.isTerminator())
       continue;
 
     Instruction *NewBonusInst = BonusInst.clone();
 
-    if (PTI->getDebugLoc() != NewBonusInst->getDebugLoc()) {
+    if (!isa<DbgInfoIntrinsic>(BonusInst) && PTI->getDebugLoc() != NewBonusInst->getDebugLoc()) {
       // Unless the instruction has the same !dbg location as the original
       // branch, drop it. When we fold the bonus instructions we want to make
       // sure we reset their debug locations in order to avoid stepping on
@@ -1126,6 +1126,10 @@ static void CloneInstructionsIntoPredecessorBlockAndUpdateSSAUses(
     NewBonusInst->dropUBImplyingAttrsAndMetadata();
 
     NewBonusInst->insertInto(PredBlock, PTI->getIterator());
+
+    if (isa<DbgInfoIntrinsic>(BonusInst))
+      continue;
+
     NewBonusInst->takeName(&BonusInst);
     BonusInst.setName(NewBonusInst->getName() + ".old");
 
@@ -3744,16 +3748,6 @@ static bool performBranchToCommonDestFolding(BranchInst *BI, BranchInst *PBI,
   PBI->setCondition(
       createLogicalOp(Builder, Opc, PBI->getCondition(), BICond, "or.cond"));
 
-  // Copy any debug value intrinsics into the end of PredBlock.
-  for (Instruction &I : *BB) {
-    if (isa<DbgInfoIntrinsic>(I)) {
-      Instruction *NewI = I.clone();
-      RemapInstruction(NewI, VMap,
-                       RF_NoModuleLevelChanges | RF_IgnoreMissingLocals);
-      NewI->insertBefore(PBI);
-    }
-  }
-
   ++NumFoldBranchToCommonDest;
   return true;
 }
diff --git a/llvm/test/Transforms/SimplifyCFG/branch-fold-dbg.ll b/llvm/test/Transforms/SimplifyCFG/branch-fold-dbg.ll
index 41486c5935a392c..e8b8cc6edafe0c2 100644
--- a/llvm/test/Transforms/SimplifyCFG/branch-fold-dbg.ll
+++ b/llvm/test/Transforms/SimplifyCFG/branch-fold-dbg.ll
@@ -8,23 +8,26 @@
 define i1 @foo(i32) nounwind ssp !dbg !0 {
 ; CHECK-LABEL: @foo(
 ; CHECK-NEXT:  Entry:
-; CHECK-NEXT:    [[TMP1:%.*]] = icmp slt i32 [[TMP0:%.*]], 0
-; CHECK-NEXT:    [[TMP2:%.*]] = icmp sgt i32 [[TMP0]], 4
-; CHECK-NEXT:    [[OR_COND:%.*]] = or i1 [[TMP1]], [[TMP2]]
-; CHECK-NEXT:    br i1 [[OR_COND]], label [[COMMON_RET:%.*]], label [[BB2:%.*]]
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp slt i32 [[TMP0:%.*]], 0, !dbg [[DBG7:![0-9]+]]
+; CHECK-NEXT:    [[TMP2:%.*]] = icmp sgt i32 [[TMP0]], 4, !dbg [[DBG7]]
+; CHECK-NEXT:    [[OR_COND:%.*]] = or i1 [[TMP1]], [[TMP2]], !dbg [[DBG7]]
+; CHECK-NEXT:    br i1 [[OR_COND]], label [[COMMON_RET:%.*]], label [[BB2:%.*]], !dbg [[DBG7]]
 ; CHECK:       BB2:
-; CHECK-NEXT:    [[TMP3:%.*]] = shl i32 1, [[TMP0]]
-; CHECK-NEXT:    [[TMP4:%.*]] = and i32 [[TMP3]], 31
-; CHECK-NEXT:    [[TMP5:%.*]] = icmp eq i32 [[TMP4]], 0
+; CHECK-NEXT:    [[TMP3:%.*]] = shl i32 1, [[TMP0]], !dbg [[DBG7]]
+; CHECK-NEXT:    [[TMP4:%.*]] = and i32 [[TMP3]], 31, !dbg [[DBG7]]
+; CHECK-NEXT:    [[TMP5:%.*]] = icmp eq i32 [[TMP4]], 0, !dbg [[DBG7]]
+; CHECK-NEXT:    call void @llvm.dbg.value(metadata ptr null, metadata [[META8:![0-9]+]], metadata !DIExpression()), !dbg [[DBG13:![0-9]+]]
 ; CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds [5 x %0], ptr @[[GLOB0:[0-9]+]], i32 0, i32 [[TMP0]]
+; CHECK-NEXT:    call void @llvm.dbg.value(metadata ptr [[TMP6]], metadata [[META8]], metadata !DIExpression()), !dbg [[DBG13]]
 ; CHECK-NEXT:    [[TMP7:%.*]] = icmp eq ptr [[TMP6]], null
-; CHECK-NEXT:    [[OR_COND2:%.*]] = select i1 [[TMP5]], i1 true, i1 [[TMP7]]
+; CHECK-NEXT:    call void @llvm.dbg.value(metadata ptr [[TMP6]], metadata [[META8]], metadata !DIExpression()), !dbg [[DBG13]]
+; CHECK-NEXT:    [[OR_COND2:%.*]] = select i1 [[TMP5]], i1 true, i1 [[TMP7]], !dbg [[DBG7]]
 ; CHECK-NEXT:    [[TMP8:%.*]] = icmp slt i32 [[TMP0]], 0
-; CHECK-NEXT:    [[SPEC_SELECT:%.*]] = select i1 [[OR_COND2]], i1 false, i1 [[TMP8]]
-; CHECK-NEXT:    br label [[COMMON_RET]]
+; CHECK-NEXT:    [[SPEC_SELECT:%.*]] = select i1 [[OR_COND2]], i1 false, i1 [[TMP8]], !dbg [[DBG7]]
+; CHECK-NEXT:    br label [[COMMON_RET]], !dbg [[DBG7]]
 ; CHECK:       common.ret:
 ; CHECK-NEXT:    [[COMMON_RET_OP:%.*]] = phi i1 [ false, [[ENTRY:%.*]] ], [ [[SPEC_SELECT]], [[BB2]] ]
-; CHECK-NEXT:    ret i1 [[COMMON_RET_OP]]
+; CHECK-NEXT:    ret i1 [[COMMON_RET_OP]], !dbg [[DBG14:![0-9]+]]
 ;
 Entry:
   %1 = icmp slt i32 %0, 0, !dbg !5
@@ -42,9 +45,11 @@ BB2:                                              ; preds = %BB1
 
 
 BB3:                                              ; preds = %BB2
+  call void @llvm.dbg.value(metadata ptr null, metadata !7, metadata !DIExpression()), !dbg !12
   %6 = getelementptr inbounds [5 x %0], ptr @0, i32 0, i32 %0, !dbg !6
-  call void @llvm.dbg.value(metadata ptr %6, metadata !7, metadata !{}), !dbg !12
+  call void @llvm.dbg.value(metadata ptr %6, metadata !7, metadata !DIExpression()), !dbg !12
   %7 = icmp eq ptr %6, null, !dbg !13
+  call void @llvm.dbg.value(metadata ptr %6, metadata !7, metadata !DIExpression()), !dbg !12
   br i1 %7, label %BB5, label %BB4, !dbg !13
 
 BB4:                                              ; preds = %BB3
@@ -58,12 +63,13 @@ BB5:                                              ; preds = %BB3, %BB2, %BB1, %E
 declare void @llvm.dbg.value(metadata, metadata, metadata) nounwind readnone
 
 !llvm.dbg.cu = !{!2}
+!llvm.module.flags = !{!16, !17}
 
 !0 = distinct !DISubprogram(name: "foo", line: 231, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: false, unit: !2, file: !15, scope: !1, type: !3)
 !1 = !DIFile(filename: "a.c", directory: "/private/tmp")
 !2 = distinct !DICompileUnit(language: DW_LANG_C99, producer: "clang (trunk 129006)", isOptimized: true, emissionKind: FullDebug, file: !15, enums: !4, retainedTypes: !4)
 !3 = !DISubroutineType(types: !4)
-!4 = !{null}
+!4 = !{}
 !5 = !DILocation(line: 131, column: 2, scope: !0)
 !6 = !DILocation(line: 134, column: 2, scope: !0)
 !7 = !DILocalVariable(name: "bar", line: 232, scope: !8, file: !1, type: !9)
@@ -75,3 +81,5 @@ declare void @llvm.dbg.value(metadata, metadata, metadata) nounwind readnone
 !13 = !DILocation(line: 234, column: 2, scope: !8)
 !14 = !DILocation(line: 274, column: 1, scope: !8)
 !15 = !DIFile(filename: "a.c", directory: "/private/tmp")
+!16 = !{i32 1, !"Debug Info Version", i32 3}
+!17 = !{i32 2, !"Dwarf Version", i32 4}

Copy link

github-actions bot commented Nov 16, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good to be removing weird behaviour. Hurray!

It looks like in llvm/test/Transforms/SimplifyCFG/branch-fold-dbg.ll some conditional assignments become unconditional, which seems bad/undesirable. But the existing behaviour of pushing the dbg.values to the end of the block doesn't solve that, so we might as well preserve the order (as you're doing here). Does that line up with your thoughts on this?

I wonder if there are compile time implications - I assume previously lots of block folding would result in a lot of redundant dbg.values (because of how they get clumped without this patch) which would result in many getting removed in cleanup passes. That might've artificially been improving performance over the baseline correct behaviour. Not much we can do about it I guess.

@jmorse
Copy link
Member Author

jmorse commented Nov 16, 2023

/me squints -- right you are, I hadn't spotted that those dbg.values should actually be dropped. I think this change doesn't make anything worse seeing how all the source-locations are dropped too, and it removes a special-debug-juggling location, which is desirable for RemoveDIs. Pushed up to the compile-time-tracker for testing, just in case.

This is actually something moderately important we're going to have to track -- it exposes one of our favourite pieces of IR, "what happens when we merge a PHI-value variable to be calculated by a select in a single block".

Copy link
Contributor

@OCHyams OCHyams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 LGTM pending compile time tracker results - as discussed, this is an improvement, even if the end result isn't perfect.

@jmorse
Copy link
Member Author

jmorse commented Nov 17, 2023

jmorse added a commit that referenced this pull request Nov 24, 2023
…72526)

The code in the CloneInstructionsIntoPredec... function modified by this
patch has a long history that dates back to 2011, see d715ec8.
There, when folding branches, all dbg.value intrinsics seen when folding
would be saved and then re-inserted at the end of whatever was folded. Over
the last 12 years this behaviour has been preserved.

However, IMO it's bad behaviour. If we have:

  inst1
  dbg.value1
  inst2
  dbg.value2

And we fold that sequence into a different block, then we would want the
instructions and variable assignments to appear in the same order. However
because of this old behaviour, the dbg.values are sunk, and we get:

  inst1
  inst2
  dbg.value1
  dbg.value2

This clustering of dbg.values can make assignments to the same variable
invisible, as well as reducing the coverage of other assignments.

This patch relaxes the CloneInstructions... function and allows it to clone
and update dbg.values in-place, causing them to appear in the original
order in the destination block. I've added some extra dbg.values to the
updated test: without the changes to the pass, the dbg.values sink into a
blob ahead of the select. The RemoveDIs code can't cope with this right now
so I've removed the "--try..." flag, restored in a commit to land in a
couple of hours.

(Metadata changes to make the LLVM-IR parser not drop the debug-info for it
being out of date. The RemoveDIs related RUN line has been removed because
it was spuriously passing due to the debug-info being dropped).
@jmorse
Copy link
Member Author

jmorse commented Nov 24, 2023

Landed in 84f6e1d -- I removed the "--try..." RUN line from the modified test, which had been added in the meantime. Seeing how the debug-info in the test was broken and being dropped, right up to the point where 84f6e1d landed, it wasn't actually covering anything.

RemoveDIs coverage for that test will be restored in #72546

@jmorse jmorse closed this Nov 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants