Commit 4f21fb8

[PGO] Reland PGO's Counter Reset and File Dumping APIs #76471 (#78285)
#76471 caused buildbot failures on Windows. For more details, see #77546. This PR revises the test and relands #76471.
1 parent 02aa695 commit 4f21fb8

12 files changed

+310
-57
lines changed

clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp

+1-1
@@ -100,7 +100,7 @@ ExpandModularHeadersPPCallbacks::ExpandModularHeadersPPCallbacks(
       /*OwnsHeaderSearch=*/false);
   PP->Initialize(Compiler.getTarget(), Compiler.getAuxTarget());
   InitializePreprocessor(*PP, *PO, Compiler.getPCHContainerReader(),
-                         Compiler.getFrontendOpts());
+                         Compiler.getFrontendOpts(), Compiler.getCodeGenOpts());
   ApplyHeaderSearchOptions(*HeaderInfo, *HSO, LangOpts,
                            Compiler.getTarget().getTriple());
 }

clang/docs/UsersManual.rst

+104
@@ -2809,6 +2809,110 @@ indexed format, regardeless whether it is produced by frontend or the IR pass.
   overhead. ``prefer-atomic`` will be transformed to ``atomic`` when supported
   by the target, or ``single`` otherwise.

+Fine Tuning Profile Collection
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The PGO infrastructure provides user program knobs to fine tune profile
+collection. Specifically, the PGO runtime provides the following functions
+that can be used to control the regions in the program where profiles should
+be collected.
+
+ * ``void __llvm_profile_set_filename(const char *Name)``: changes the name of
+   the profile file to ``Name``.
+ * ``void __llvm_profile_reset_counters(void)``: resets all counters to zero.
+ * ``int __llvm_profile_dump(void)``: write the profile data to disk.
+ * ``int __llvm_orderfile_dump(void)``: write the order file to disk.
+
+For example, the following pattern can be used to skip profiling program
+initialization, profile two specific hot regions, and skip profiling program
+cleanup:
+
+.. code-block:: c
+
+  int main() {
+    initialize();
+
+    // Reset all profile counters to 0 to omit profile collected during
+    // initialize()'s execution.
+    __llvm_profile_reset_counters();
+    ... hot region 1
+    // Dump the profile for hot region 1.
+    __llvm_profile_set_filename("region1.profraw");
+    __llvm_profile_dump();
+
+    // Reset counters before proceeding to hot region 2.
+    __llvm_profile_reset_counters();
+    ... hot region 2
+    // Dump the profile for hot region 2.
+    __llvm_profile_set_filename("region2.profraw");
+    __llvm_profile_dump();
+
+    // Since the profile has been dumped, no further profile data
+    // will be collected beyond the above __llvm_profile_dump().
+    cleanup();
+    return 0;
+  }
+
+These APIs' names can be introduced to user programs in two ways.
+They can be declared as weak symbols on platforms which support
+treating weak symbols as ``null`` during linking. For example, the user can
+have
+
+.. code-block:: c
+
+  __attribute__((weak)) int __llvm_profile_dump(void);
+
+  // Then later in the same source file
+  if (__llvm_profile_dump)
+    if (__llvm_profile_dump() != 0) { ... }
+  // The first if condition tests if the symbol is actually defined.
+  // Profile dumping only happens if the symbol is defined. Hence,
+  // the user program works correctly during normal (not profile-generate)
+  // executions.
+
+Alternatively, the user program can include the header
+``profile/instr_prof_interface.h``, which contains the API names. For example,
+
+.. code-block:: c
+
+  #include "profile/instr_prof_interface.h"
+
+  // Then later in the same source file
+  if (__llvm_profile_dump() != 0) { ... }
+
+The user code does not need to check if the API names are defined, because
+these names are automatically replaced by ``(0)`` or the equivalence of noop
+if the ``clang`` is not compiling for profile generation.
+
+Such replacement can happen because ``clang`` adds one of two macros depending
+on the ``-fprofile-generate`` and the ``-fprofile-use`` flags.
+
+ * ``__LLVM_INSTR_PROFILE_GENERATE``: defined when one of
+   ``-fprofile[-instr]-generate``/``-fcs-profile-generate`` is in effect.
+ * ``__LLVM_INSTR_PROFILE_USE``: defined when one of
+   ``-fprofile-use``/``-fprofile-instr-use`` is in effect.
+
+The two macros can be used to provide more flexibility so a user program
+can execute code specifically intended for profile generate or profile use.
+For example, a user program can have special logging during profile generate:
+
+.. code-block:: c
+
+  #if __LLVM_INSTR_PROFILE_GENERATE
+  expensive_logging_of_full_program_state();
+  #endif
+
+The logging is automatically excluded during a normal build of the program,
+hence it does not impact performance during a normal execution.
+
+It is advised to use such fine tuning only in a program's cold regions. The weak
+symbols can introduce extra control flow (the ``if`` checks), while the macros
+(hence declarations they guard in ``profile/instr_prof_interface.h``)
+can change the control flow of the functions that use them between profile
+generation and profile use (which can lead to discarded counters in such
+functions). Using these APIs in the program's cold regions introduces less
+overhead and leads to more optimized code.
+
 Disabling Instrumentation
 ^^^^^^^^^^^^^^^^^^^^^^^^^
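
The weak-symbol pattern described in the documentation hunk above can be wrapped in a small helper. The following is a minimal sketch and not part of the commit: `dump_region_profile` is a hypothetical name, and the code assumes a toolchain (for example GCC or Clang targeting an ELF platform) where an undefined weak symbol resolves to a null address at link time.

```c
/* Weak declarations: in a build without profile generation the profile
 * runtime is not linked, so these addresses are expected to be null
 * (assumption: the platform supports undefined weak symbols). */
__attribute__((weak)) void __llvm_profile_set_filename(const char *Name);
__attribute__((weak)) void __llvm_profile_reset_counters(void);
__attribute__((weak)) int __llvm_profile_dump(void);

/* Hypothetical helper: write the counters collected so far to `path`,
 * then reset them so the next region starts from zero.
 * Returns -1 when the runtime is absent, otherwise the result of
 * __llvm_profile_dump(). The runtime does not copy `path`, so it must
 * remain valid while the runtime may still use it. */
static int dump_region_profile(const char *path) {
  if (!__llvm_profile_set_filename || !__llvm_profile_reset_counters ||
      !__llvm_profile_dump)
    return -1;
  __llvm_profile_set_filename(path);
  int rc = __llvm_profile_dump();
  __llvm_profile_reset_counters();
  return rc;
}
```

Calling `dump_region_profile("region1.profraw")` after a hot region keeps the null checks in one place, which fits the documentation's advice to confine the extra control flow to cold code.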

clang/include/clang/Basic/CodeGenOptions.h

+3
@@ -494,6 +494,9 @@ class CodeGenOptions : public CodeGenOptionsBase {
     return getProfileInstr() == ProfileCSIRInstr;
   }

+  /// Check if any form of instrumentation is on.
+  bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; }
+
   /// Check if Clang profile use is on.
   bool hasProfileClangUse() const {
     return getProfileUse() == ProfileClangInstr;

clang/include/clang/Frontend/Utils.h

+3-1
@@ -43,12 +43,14 @@ class PCHContainerReader;
 class Preprocessor;
 class PreprocessorOptions;
 class PreprocessorOutputOptions;
+class CodeGenOptions;

 /// InitializePreprocessor - Initialize the preprocessor getting it and the
 /// environment ready to process a single file.
 void InitializePreprocessor(Preprocessor &PP, const PreprocessorOptions &PPOpts,
                             const PCHContainerReader &PCHContainerRdr,
-                            const FrontendOptions &FEOpts);
+                            const FrontendOptions &FEOpts,
+                            const CodeGenOptions &CodeGenOpts);

 /// DoPrintPreprocessedInput - Implement -E mode.
 void DoPrintPreprocessedInput(Preprocessor &PP, raw_ostream *OS,

clang/lib/Frontend/CompilerInstance.cpp

+1-1
@@ -470,7 +470,7 @@ void CompilerInstance::createPreprocessor(TranslationUnitKind TUKind) {

   // Predefine macros and configure the preprocessor.
   InitializePreprocessor(*PP, PPOpts, getPCHContainerReader(),
-                         getFrontendOpts());
+                         getFrontendOpts(), getCodeGenOpts());

   // Initialize the header search object. In CUDA compilations, we use the aux
   // triple (the host triple) to initialize our header search, since we need to

clang/lib/Frontend/InitPreprocessor.cpp

+19-4
@@ -1364,12 +1364,22 @@ static void InitializePredefinedMacros(const TargetInfo &TI,
   TI.getTargetDefines(LangOpts, Builder);
 }

+static void InitializePGOProfileMacros(const CodeGenOptions &CodeGenOpts,
+                                       MacroBuilder &Builder) {
+  if (CodeGenOpts.hasProfileInstr())
+    Builder.defineMacro("__LLVM_INSTR_PROFILE_GENERATE");
+
+  if (CodeGenOpts.hasProfileIRUse() || CodeGenOpts.hasProfileClangUse())
+    Builder.defineMacro("__LLVM_INSTR_PROFILE_USE");
+}
+
 /// InitializePreprocessor - Initialize the preprocessor getting it and the
 /// environment ready to process a single file.
-void clang::InitializePreprocessor(
-    Preprocessor &PP, const PreprocessorOptions &InitOpts,
-    const PCHContainerReader &PCHContainerRdr,
-    const FrontendOptions &FEOpts) {
+void clang::InitializePreprocessor(Preprocessor &PP,
+                                   const PreprocessorOptions &InitOpts,
+                                   const PCHContainerReader &PCHContainerRdr,
+                                   const FrontendOptions &FEOpts,
+                                   const CodeGenOptions &CodeGenOpts) {
   const LangOptions &LangOpts = PP.getLangOpts();
   std::string PredefineBuffer;
   PredefineBuffer.reserve(4080);
@@ -1416,6 +1426,11 @@ void clang::InitializePreprocessor(
   InitializeStandardPredefinedMacros(PP.getTargetInfo(), PP.getLangOpts(),
                                      FEOpts, Builder);

+  // The PGO instrumentation profile macros are driven by options
+  // -fprofile[-instr]-generate/-fcs-profile-generate/-fprofile[-instr]-use,
+  // hence they are not guarded by InitOpts.UsePredefines.
+  InitializePGOProfileMacros(CodeGenOpts, Builder);
+
   // Add on the predefines from the driver. Wrap in a #line directive to report
   // that they come from the command line.
   Builder.append("# 1 \"<command line>\" 1");
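
From the user-program side, the two predefined macros added above can be queried with ordinary preprocessor conditionals. The sketch below is illustrative only: the enum `pgo_build_mode` and the function `current_pgo_build_mode` are hypothetical names, not part of the commit. Because the two `if` statements in `InitializePGOProfileMacros` are independent, the sketch assumes both macros could be defined at once and checks the generate macro first.

```c
/* Hypothetical classification of how this translation unit was built. */
enum pgo_build_mode { PGO_MODE_NONE, PGO_MODE_GENERATE, PGO_MODE_USE };

/* Report the PGO mode based on the macros clang predefines.
 * The generate macro is tested first in case both are defined. */
static enum pgo_build_mode current_pgo_build_mode(void) {
#if defined(__LLVM_INSTR_PROFILE_GENERATE)
  return PGO_MODE_GENERATE;
#elif defined(__LLVM_INSTR_PROFILE_USE)
  return PGO_MODE_USE;
#else
  return PGO_MODE_NONE;
#endif
}
```

A build without any profile flags leaves both macros undefined, so the function reports `PGO_MODE_NONE`.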

clang/test/Profile/c-general.c

+10
@@ -9,6 +9,16 @@
 // Also check compatibility with older profiles.
 // RUN: %clang_cc1 -triple x86_64-apple-macosx10.9 -main-file-name c-general.c %s -o - -emit-llvm -fprofile-instrument-use-path=%S/Inputs/c-general.profdata.v1 | FileCheck -allow-deprecated-dag-overlap -check-prefix=PGOUSE %s

+// RUN: %clang -fprofile-generate -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFGENMACRO %s
+// RUN: %clang -fprofile-instr-generate -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFGENMACRO %s
+// RUN: %clang -fcs-profile-generate -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFGENMACRO %s
+//
+// RUN: %clang -fprofile-use=%t.profdata -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFUSEMACRO %s
+// RUN: %clang -fprofile-instr-use=%t.profdata -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFUSEMACRO %s
+
+// PROFGENMACRO:#define __LLVM_INSTR_PROFILE_GENERATE 1
+// PROFUSEMACRO:#define __LLVM_INSTR_PROFILE_USE 1
+
 // PGOGEN: @[[SLC:__profc_simple_loops]] = private global [4 x i64] zeroinitializer
 // PGOGEN: @[[IFC:__profc_conditionals]] = private global [13 x i64] zeroinitializer
 // PGOGEN: @[[EEC:__profc_early_exits]] = private global [9 x i64] zeroinitializer

compiler-rt/include/CMakeLists.txt

+1
@@ -44,6 +44,7 @@ endif(COMPILER_RT_BUILD_ORC)
 if (COMPILER_RT_BUILD_PROFILE)
   set(PROFILE_HEADERS
     profile/InstrProfData.inc
+    profile/instr_prof_interface.h
     )
 endif(COMPILER_RT_BUILD_PROFILE)

compiler-rt/include/profile/instr_prof_interface.h

+92
@@ -0,0 +1,92 @@
+/*===---- instr_prof_interface.h - Instrumentation PGO User Program API ----===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+ *
+ *===-----------------------------------------------------------------------===
+ *
+ * This header provides a public interface for fine-grained control of counter
+ * reset and profile dumping. These interface functions can be directly called
+ * in user programs.
+ *
+\*===---------------------------------------------------------------------===*/
+
+#ifndef COMPILER_RT_INSTR_PROFILING
+#define COMPILER_RT_INSTR_PROFILING
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifdef __LLVM_INSTR_PROFILE_GENERATE
+// Profile file reset and dump interfaces.
+// When `-fprofile[-instr]-generate`/`-fcs-profile-generate` is in effect,
+// clang defines __LLVM_INSTR_PROFILE_GENERATE to pick up the API calls.
+
+/*!
+ * \brief Set the filename for writing instrumentation data.
+ *
+ * Sets the filename to be used for subsequent calls to
+ * \a __llvm_profile_write_file().
+ *
+ * \c Name is not copied, so it must remain valid. Passing NULL resets the
+ * filename logic to the default behaviour.
+ *
+ * Note: There may be multiple copies of the profile runtime (one for each
+ * instrumented image/DSO). This API only modifies the filename within the
+ * copy of the runtime available to the calling image.
+ *
+ * Warning: This is a no-op if continuous mode (\ref
+ * __llvm_profile_is_continuous_mode_enabled) is on. The reason for this is
+ * that in continuous mode, profile counters are mmap()'d to the profile at
+ * program initialization time. Support for transferring the mmap'd profile
+ * counts to a new file has not been implemented.
+ */
+void __llvm_profile_set_filename(const char *Name);
+
+/*!
+ * \brief Interface to set all PGO counters to zero for the current process.
+ *
+ */
+void __llvm_profile_reset_counters(void);
+
+/*!
+ * \brief this is a wrapper interface to \c __llvm_profile_write_file.
+ * After this interface is invoked, an already dumped flag will be set
+ * so that profile won't be dumped again during program exit.
+ * Invocation of interface __llvm_profile_reset_counters will clear
+ * the flag. This interface is designed to be used to collect profile
+ * data from user selected hot regions. The use model is
+ *      __llvm_profile_reset_counters();
+ *      ... hot region 1
+ *      __llvm_profile_dump();
+ *      .. some other code
+ *      __llvm_profile_reset_counters();
+ *      ... hot region 2
+ *      __llvm_profile_dump();
+ *
+ * It is expected that on-line profile merging is on with \c %m specifier
+ * used in profile filename . If merging is not turned on, user is expected
+ * to invoke __llvm_profile_set_filename to specify different profile names
+ * for different regions before dumping to avoid profile write clobbering.
+ */
+int __llvm_profile_dump(void);
+
+// Interface to dump the current process' order file to disk.
+int __llvm_orderfile_dump(void);
+
+#else
+
+#define __llvm_profile_set_filename(Name)
+#define __llvm_profile_reset_counters()
+#define __llvm_profile_dump() (0)
+#define __llvm_orderfile_dump() (0)
+
+#endif
+
+#ifdef __cplusplus
+} // extern "C"
+#endif
+
+#endif
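
A user program could consume this header roughly as follows. This is a hedged sketch rather than code from the commit: `profile_region`, `sample_region`, `g_region_calls`, and `HAVE_INSTR_PROF_INTERFACE` are hypothetical names, and the `__has_include` fallback is an assumption added so the example still compiles with toolchains that do not ship the header. The fallback macros mirror the header's `#else` branch.

```c
/* Use the real header when the toolchain provides it. */
#if defined(__has_include)
#if __has_include("profile/instr_prof_interface.h")
#include "profile/instr_prof_interface.h"
#define HAVE_INSTR_PROF_INTERFACE 1
#endif
#endif

#ifndef HAVE_INSTR_PROF_INTERFACE
/* Assumed fallback: the same no-op expansions as the header's #else branch. */
#define __llvm_profile_set_filename(Name)
#define __llvm_profile_reset_counters()
#define __llvm_profile_dump() (0)
#endif

/* Hypothetical stand-in for a hot region of the program. */
static int g_region_calls;
static void sample_region(void) { ++g_region_calls; }

/* Hypothetical helper: profile only `region` and write the result to
 * `profile_path`. In a build without profile generation every API call
 * expands to nothing or to (0), so the helper just runs the region. */
static int profile_region(void (*region)(void), const char *profile_path) {
  (void)profile_path; /* unused when the calls expand to nothing */
  __llvm_profile_reset_counters();
  region();
  __llvm_profile_set_filename(profile_path);
  return __llvm_profile_dump();
}
```

A call such as `profile_region(sample_region, "region1.profraw")` needs no null checks at the call site, which is the main difference from the weak-symbol approach.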

compiler-rt/lib/profile/InstrProfiling.h

+11-50
@@ -12,6 +12,17 @@
 #include "InstrProfilingPort.h"
 #include <stdio.h>

+// Make sure __LLVM_INSTR_PROFILE_GENERATE is always defined before
+// including instr_prof_interface.h so the interface functions are
+// declared correctly for the runtime.
+// __LLVM_INSTR_PROFILE_GENERATE is always `#undef`ed after the header,
+// because compiler-rt does not support profiling the profiling runtime itself.
+#ifndef __LLVM_INSTR_PROFILE_GENERATE
+#define __LLVM_INSTR_PROFILE_GENERATE
+#endif
+#include "profile/instr_prof_interface.h"
+#undef __LLVM_INSTR_PROFILE_GENERATE
+
 #define INSTR_PROF_VISIBILITY COMPILER_RT_VISIBILITY
 #include "profile/InstrProfData.inc"

@@ -100,12 +111,6 @@ ValueProfNode *__llvm_profile_begin_vnodes();
 ValueProfNode *__llvm_profile_end_vnodes();
 uint32_t *__llvm_profile_begin_orderfile();

-/*!
- * \brief Clear profile counters to zero.
- *
- */
-void __llvm_profile_reset_counters(void);
-
 /*!
  * \brief Merge profile data from buffer.
  *
@@ -156,50 +161,6 @@ void __llvm_profile_instrument_target_value(uint64_t TargetValue, void *Data,
 int __llvm_profile_write_file(void);

 int __llvm_orderfile_write_file(void);
-/*!
- * \brief this is a wrapper interface to \c __llvm_profile_write_file.
- * After this interface is invoked, an already dumped flag will be set
- * so that profile won't be dumped again during program exit.
- * Invocation of interface __llvm_profile_reset_counters will clear
- * the flag. This interface is designed to be used to collect profile
- * data from user selected hot regions. The use model is
- *      __llvm_profile_reset_counters();
- *      ... hot region 1
- *      __llvm_profile_dump();
- *      .. some other code
- *      __llvm_profile_reset_counters();
- *      ... hot region 2
- *      __llvm_profile_dump();
- *
- * It is expected that on-line profile merging is on with \c %m specifier
- * used in profile filename . If merging is not turned on, user is expected
- * to invoke __llvm_profile_set_filename to specify different profile names
- * for different regions before dumping to avoid profile write clobbering.
- */
-int __llvm_profile_dump(void);
-
-int __llvm_orderfile_dump(void);
-
-/*!
- * \brief Set the filename for writing instrumentation data.
- *
- * Sets the filename to be used for subsequent calls to
- * \a __llvm_profile_write_file().
- *
- * \c Name is not copied, so it must remain valid. Passing NULL resets the
- * filename logic to the default behaviour.
- *
- * Note: There may be multiple copies of the profile runtime (one for each
- * instrumented image/DSO). This API only modifies the filename within the
- * copy of the runtime available to the calling image.
- *
- * Warning: This is a no-op if continuous mode (\ref
- * __llvm_profile_is_continuous_mode_enabled) is on. The reason for this is
- * that in continuous mode, profile counters are mmap()'d to the profile at
- * program initialization time. Support for transferring the mmap'd profile
- * counts to a new file has not been implemented.
- */
-void __llvm_profile_set_filename(const char *Name);

 /*!
  * \brief Set the FILE object for writing instrumentation data. Return 0 if set
