Skip to content

Commit 80a817c

Browse files
SLTozerIanWood1
authored andcommitted
[DLCov] Implement DebugLoc coverage tracking (llvm#107279)
This is part of a series of patches that tries to improve DILocation bug detection in Debugify; see the review for more details. This is the patch that adds the main feature, adding a set of `DebugLoc::get<Kind>` functions that can be used for instructions with intentionally empty DebugLocs to prevent Debugify from treating them as bugs, removing the currently-pervasive false positives and allowing us to use Debugify (in its original DI preservation mode) to reliably detect existing bugs and regressions. This patch does not add uses of these functions, except for once in Clang before optimizations, and in `Instruction::dropLocation()`, since that is an obvious case that immediately removes a set of false positives.
1 parent a157a32 commit 80a817c

File tree

7 files changed

+179
-11
lines changed

7 files changed

+179
-11
lines changed

clang/lib/CodeGen/BackendUtil.cpp

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -961,6 +961,22 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
961961
Debugify.setOrigDIVerifyBugsReportFilePath(
962962
CodeGenOpts.DIBugsReportFilePath);
963963
Debugify.registerCallbacks(PIC, MAM);
964+
965+
#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
966+
// If we're using debug location coverage tracking, mark all the
967+
// instructions coming out of the frontend without a DebugLoc as being
968+
// compiler-generated, to prevent both those instructions and new
969+
// instructions that inherit their location from being treated as
970+
// incorrectly empty locations.
971+
for (Function &F : *TheModule) {
972+
if (!F.getSubprogram())
973+
continue;
974+
for (BasicBlock &BB : F)
975+
for (Instruction &I : BB)
976+
if (!I.getDebugLoc())
977+
I.setDebugLoc(DebugLoc::getCompilerGenerated());
978+
}
979+
#endif
964980
}
965981
// Attempt to load pass plugins and register their callbacks with PB.
966982
for (auto &PluginFN : CodeGenOpts.PassPlugins) {

llvm/docs/HowToUpdateDebugInfo.rst

Lines changed: 41 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -169,6 +169,47 @@ See the discussion in the section about
169169
:ref:`merging locations<WhenToMergeLocation>` for examples of when the rule for
170170
dropping locations applies.
171171

172+
.. _NewInstLocations:
173+
174+
Setting locations for new instructions
175+
--------------------------------------
176+
177+
Whenever a new instruction is created and there is no suitable location for that
178+
instruction, that instruction should be annotated accordingly. There are a set
179+
of special ``DebugLoc`` values that can be set on an instruction to annotate the
180+
reason that it does not have a valid location. These are as follows:
181+
182+
* ``DebugLoc::getCompilerGenerated()``: This indicates that the instruction is a
183+
compiler-generated instruction, i.e. it is not associated with any user source
184+
code.
185+
186+
* ``DebugLoc::getDropped()``: This indicates that the instruction has
187+
intentionally had its source location removed, according to the rules for
188+
:ref:`dropping locations<WhenToDropLocation>`; this is set automatically by
189+
``Instruction::dropLocation()``.
190+
191+
* ``DebugLoc::getUnknown()``: This indicates that the instruction does not have
192+
a known or currently knowable source location, e.g. that it is infeasible to
193+
determine the correct source location, or that the source location is
194+
ambiguous in a way that LLVM cannot currently represent.
195+
196+
* ``DebugLoc::getTemporary()``: This is used for instructions that we don't
197+
expect to be emitted (e.g. ``UnreachableInst``), and so should not need a
198+
valid location; if we ever try to emit a temporary location into an object/asm
199+
file, this indicates that something has gone wrong.
200+
201+
Where applicable, these should be used instead of leaving an instruction without
202+
an assigned location or explicitly setting the location as ``DebugLoc()``.
203+
Ordinarily these special locations are identical to an absent location, but LLVM
204+
built with coverage-tracking
205+
(``-DLLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING="COVERAGE"``) will keep track of
206+
these special locations in order to detect unintentionally-missing locations;
207+
for this reason, the most important rule is to *not* apply any of these if it
208+
isn't clear which, if any, is appropriate - an absent location can be detected
209+
and fixed, while an incorrectly annotated instruction is much harder to detect.
210+
On the other hand, if any of these clearly apply, then they should be used to
211+
prevent false positives from being flagged up.
212+
172213
Rules for updating debug values
173214
===============================
174215

llvm/include/llvm/IR/DebugLoc.h

Lines changed: 95 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,7 @@
1414
#ifndef LLVM_IR_DEBUGLOC_H
1515
#define LLVM_IR_DEBUGLOC_H
1616

17+
#include "llvm/Config/config.h"
1718
#include "llvm/IR/TrackingMDRef.h"
1819
#include "llvm/Support/DataTypes.h"
1920

@@ -22,6 +23,73 @@ namespace llvm {
2223
class LLVMContext;
2324
class raw_ostream;
2425
class DILocation;
26+
class Function;
27+
28+
#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
29+
// Used to represent different "kinds" of DebugLoc, expressing that the
30+
// instruction it is part of is either normal and should contain a valid
31+
// DILocation, or otherwise describing the reason why the instruction does
32+
// not contain a valid DILocation.
33+
enum class DebugLocKind : uint8_t {
34+
// The instruction is expected to contain a valid DILocation.
35+
Normal,
36+
// The instruction is compiler-generated, i.e. it is not associated with any
37+
// line in the original source.
38+
CompilerGenerated,
39+
// The instruction has intentionally had its source location removed,
40+
// typically because it was moved outside of its original control-flow and
41+
// presenting the prior source location would be misleading for debuggers
42+
// or profilers.
43+
Dropped,
44+
// The instruction does not have a known or currently knowable source
45+
// location, e.g. the attribution is ambiguous in a way that can't be
46+
// represented, or determining the correct location is complicated and
47+
// requires future developer effort.
48+
Unknown,
49+
// DebugLoc is attached to an instruction that we don't expect to be
50+
// emitted, and so can omit a valid DILocation; we don't expect to ever try
51+
// and emit these into the line table, and trying to do so is a sign that
52+
// something has gone wrong (most likely a DebugLoc leaking from a transient
53+
// compiler-generated instruction).
54+
Temporary
55+
};
56+
57+
// Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify
58+
// to ignore intentionally-empty DebugLocs.
59+
class DILocAndCoverageTracking : public TrackingMDNodeRef {
60+
public:
61+
DebugLocKind Kind;
62+
// Default constructor for empty DebugLocs.
63+
DILocAndCoverageTracking()
64+
: TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {}
65+
// Valid or nullptr MDNode*, normal DebugLocKind.
66+
DILocAndCoverageTracking(const MDNode *Loc)
67+
: TrackingMDNodeRef(const_cast<MDNode *>(Loc)),
68+
Kind(DebugLocKind::Normal) {}
69+
DILocAndCoverageTracking(const DILocation *Loc);
70+
// Explicit DebugLocKind, which always means a nullptr MDNode*.
71+
DILocAndCoverageTracking(DebugLocKind Kind)
72+
: TrackingMDNodeRef(nullptr), Kind(Kind) {}
73+
};
74+
template <> struct simplify_type<DILocAndCoverageTracking> {
75+
using SimpleType = MDNode *;
76+
77+
static MDNode *getSimplifiedValue(DILocAndCoverageTracking &MD) {
78+
return MD.get();
79+
}
80+
};
81+
template <> struct simplify_type<const DILocAndCoverageTracking> {
82+
using SimpleType = MDNode *;
83+
84+
static MDNode *getSimplifiedValue(const DILocAndCoverageTracking &MD) {
85+
return MD.get();
86+
}
87+
};
88+
89+
using DebugLocTrackingRef = DILocAndCoverageTracking;
90+
#else
91+
using DebugLocTrackingRef = TrackingMDNodeRef;
92+
#endif // ENABLE_DEBUGLOC_COVERAGE_TRACKING
2593

2694
/// A debug info location.
2795
///
@@ -31,7 +99,8 @@ namespace llvm {
3199
/// To avoid extra includes, \a DebugLoc doubles the \a DILocation API with a
32100
/// one based on relatively opaque \a MDNode pointers.
33101
class DebugLoc {
34-
TrackingMDNodeRef Loc;
102+
103+
DebugLocTrackingRef Loc;
35104

36105
public:
37106
DebugLoc() = default;
@@ -47,6 +116,31 @@ namespace llvm {
47116
/// IR.
48117
explicit DebugLoc(const MDNode *N);
49118

119+
#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
120+
DebugLoc(DebugLocKind Kind) : Loc(Kind) {}
121+
DebugLocKind getKind() const { return Loc.Kind; }
122+
#endif
123+
124+
#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
125+
static inline DebugLoc getTemporary() {
126+
return DebugLoc(DebugLocKind::Temporary);
127+
}
128+
static inline DebugLoc getUnknown() {
129+
return DebugLoc(DebugLocKind::Unknown);
130+
}
131+
static inline DebugLoc getCompilerGenerated() {
132+
return DebugLoc(DebugLocKind::CompilerGenerated);
133+
}
134+
static inline DebugLoc getDropped() {
135+
return DebugLoc(DebugLocKind::Dropped);
136+
}
137+
#else
138+
static inline DebugLoc getTemporary() { return DebugLoc(); }
139+
static inline DebugLoc getUnknown() { return DebugLoc(); }
140+
static inline DebugLoc getCompilerGenerated() { return DebugLoc(); }
141+
static inline DebugLoc getDropped() { return DebugLoc(); }
142+
#endif // ENABLE_DEBUGLOC_COVERAGE_TRACKING
143+
50144
/// Get the underlying \a DILocation.
51145
///
52146
/// \pre !*this or \c isa<DILocation>(getAsMDNode()).

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2098,6 +2098,10 @@ void DwarfDebug::beginInstruction(const MachineInstr *MI) {
20982098
}
20992099

21002100
if (!DL) {
2101+
// FIXME: We could assert that `DL.getKind() != DebugLocKind::Temporary`
2102+
// here, or otherwise record any temporary DebugLocs seen to ensure that
2103+
// transient compiler-generated instructions aren't leaking their DLs to
2104+
// other instructions.
21012105
// We have an unspecified location, which might want to be line 0.
21022106
// If we have already emitted a line-0 record, don't repeat it.
21032107
if (LastAsmLine == 0)

llvm/lib/IR/DebugInfo.cpp

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -987,8 +987,10 @@ void Instruction::updateLocationAfterHoist() { dropLocation(); }
987987

988988
void Instruction::dropLocation() {
989989
const DebugLoc &DL = getDebugLoc();
990-
if (!DL)
990+
if (!DL) {
991+
setDebugLoc(DebugLoc::getDropped());
991992
return;
993+
}
992994

993995
// If this isn't a call, drop the location to allow a location from a
994996
// preceding instruction to propagate.
@@ -1000,7 +1002,7 @@ void Instruction::dropLocation() {
10001002
}
10011003

10021004
if (!MayLowerToCall) {
1003-
setDebugLoc(DebugLoc());
1005+
setDebugLoc(DebugLoc::getDropped());
10041006
return;
10051007
}
10061008

@@ -1019,7 +1021,7 @@ void Instruction::dropLocation() {
10191021
//
10201022
// One alternative is to set a line 0 location with the existing scope and
10211023
// inlinedAt info. The location might be sensitive to when inlining occurs.
1022-
setDebugLoc(DebugLoc());
1024+
setDebugLoc(DebugLoc::getDropped());
10231025
}
10241026

10251027
//===----------------------------------------------------------------------===//

llvm/lib/IR/DebugLoc.cpp

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,12 @@
1111
#include "llvm/IR/DebugInfo.h"
1212
using namespace llvm;
1313

14+
#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
15+
DILocAndCoverageTracking::DILocAndCoverageTracking(const DILocation *L)
16+
: TrackingMDNodeRef(const_cast<DILocation *>(L)),
17+
Kind(DebugLocKind::Normal) {}
18+
#endif // ENABLE_DEBUGLOC_COVERAGE_TRACKING
19+
1420
//===----------------------------------------------------------------------===//
1521
// DebugLoc Implementation
1622
//===----------------------------------------------------------------------===//

llvm/lib/Transforms/Utils/Debugify.cpp

Lines changed: 12 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -290,6 +290,16 @@ bool llvm::stripDebugifyMetadata(Module &M) {
290290
return Changed;
291291
}
292292

293+
bool hasLoc(const Instruction &I) {
294+
const DILocation *Loc = I.getDebugLoc().get();
295+
#if ENABLE_DEBUGLOC_COVERAGE_TRACKING
296+
DebugLocKind Kind = I.getDebugLoc().getKind();
297+
return Loc || Kind != DebugLocKind::Normal;
298+
#else
299+
return Loc;
300+
#endif
301+
}
302+
293303
bool llvm::collectDebugInfoMetadata(Module &M,
294304
iterator_range<Module::iterator> Functions,
295305
DebugInfoPerPass &DebugInfoBeforePass,
@@ -362,9 +372,7 @@ bool llvm::collectDebugInfoMetadata(Module &M,
362372
LLVM_DEBUG(dbgs() << " Collecting info for inst: " << I << '\n');
363373
DebugInfoBeforePass.InstToDelete.insert({&I, &I});
364374

365-
const DILocation *Loc = I.getDebugLoc().get();
366-
bool HasLoc = Loc != nullptr;
367-
DebugInfoBeforePass.DILocations.insert({&I, HasLoc});
375+
DebugInfoBeforePass.DILocations.insert({&I, hasLoc(I)});
368376
}
369377
}
370378
}
@@ -607,10 +615,7 @@ bool llvm::checkDebugInfoMetadata(Module &M,
607615

608616
LLVM_DEBUG(dbgs() << " Collecting info for inst: " << I << '\n');
609617

610-
const DILocation *Loc = I.getDebugLoc().get();
611-
bool HasLoc = Loc != nullptr;
612-
613-
DebugInfoAfterPass.DILocations.insert({&I, HasLoc});
618+
DebugInfoAfterPass.DILocations.insert({&I, hasLoc(I)});
614619
}
615620
}
616621
}

0 commit comments

Comments
 (0)