@@ -2809,6 +2809,110 @@ indexed format, regardeless whether it is produced by frontend or the IR pass.
overhead. ``prefer-atomic`` will be transformed to ``atomic`` when supported
by the target, or ``single`` otherwise.

+ Fine Tuning Profile Collection
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The PGO infrastructure provides knobs that let a user program fine-tune
+ profile collection. Specifically, the PGO runtime provides the following
+ functions, which can be used to control the regions of the program in which
+ profiles are collected.
+
+ * ``void __llvm_profile_set_filename(const char *Name)``: changes the name of
+   the profile file to ``Name``.
+ * ``void __llvm_profile_reset_counters(void)``: resets all counters to zero.
+ * ``int __llvm_profile_dump(void)``: writes the profile data to disk.
+ * ``int __llvm_orderfile_dump(void)``: writes the order file to disk.
+
+ For example, the following pattern can be used to skip profiling program
+ initialization, profile two specific hot regions, and skip profiling program
+ cleanup:
+
+ .. code-block:: c
+
+   int main() {
+     initialize();
+
+     // Reset all profile counters to 0 to omit profile collected during
+     // initialize()'s execution.
+     __llvm_profile_reset_counters();
+     ... hot region 1
+     // Dump the profile for hot region 1.
+     __llvm_profile_set_filename("region1.profraw");
+     __llvm_profile_dump();
+
+     // Reset counters before proceeding to hot region 2.
+     __llvm_profile_reset_counters();
+     ... hot region 2
+     // Dump the profile for hot region 2.
+     __llvm_profile_set_filename("region2.profraw");
+     __llvm_profile_dump();
+
+     // Since the profile has been dumped, no further profile data
+     // will be collected beyond the above __llvm_profile_dump().
+     cleanup();
+     return 0;
+   }
+
+ The names of these APIs can be introduced to user programs in two ways.
+ They can be declared as weak symbols on platforms that support
+ treating weak symbols as ``null`` during linking. For example, the user can
+ have
+
+ .. code-block:: c
+
+   __attribute__((weak)) int __llvm_profile_dump(void);
+
+   // Then later in the same source file
+   if (__llvm_profile_dump)
+     if (__llvm_profile_dump() != 0) { ... }
+   // The first if condition tests if the symbol is actually defined.
+   // Profile dumping only happens if the symbol is defined. Hence,
+   // the user program works correctly during normal (not profile-generate)
+   // executions.
+
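The weak-symbol check above can be wrapped in a small helper so that callers do not repeat the null test. The following is a minimal sketch, not part of the PGO runtime: the helper name `try_dump_profile` and its return convention are assumptions made for illustration, and it relies on the platform resolving an undefined weak function to a null address (as ELF targets such as Linux typically do).

```c
/* Weak declaration of the runtime function. When the profile runtime is not
   linked in, the address is expected to be null on supporting platforms. */
__attribute__((weak)) int __llvm_profile_dump(void);

/* Hypothetical helper (name and return values are assumptions):
     1  the runtime is present and the dump returned 0,
     0  the runtime is absent, so nothing was done,
    -1  the runtime is present but the dump returned non-zero. */
int try_dump_profile(void) {
  if (!__llvm_profile_dump)
    return 0; /* normal (not profile-generate) build */
  return __llvm_profile_dump() == 0 ? 1 : -1;
}
```

In a normal build the helper is expected to return 0 without touching the file system; in a profile-generate build it forwards to the runtime.
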
+ Alternatively, the user program can include the header
+ ``profile/instr_prof_interface.h``, which contains the API names. For example,
+
+ .. code-block:: c
+
+   #include "profile/instr_prof_interface.h"
+
+   // Then later in the same source file
+   if (__llvm_profile_dump() != 0) { ... }
+
+ The user code does not need to check whether the API names are defined, because
+ these names are automatically replaced by ``(0)`` or an equivalent no-op
+ if ``clang`` is not compiling for profile generation.
+
+ This replacement is possible because ``clang`` defines one of two macros,
+ depending on the ``-fprofile-generate`` and ``-fprofile-use`` flags.
+
+ * ``__LLVM_INSTR_PROFILE_GENERATE``: defined when one of
+   ``-fprofile[-instr]-generate``/``-fcs-profile-generate`` is in effect.
+ * ``__LLVM_INSTR_PROFILE_USE``: defined when one of
+   ``-fprofile-use``/``-fprofile-instr-use`` is in effect.
+
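To illustrate how a macro can turn an API name into a no-op, the same pattern can be imitated in user code. This is only a sketch of the idea and does not reproduce the contents of ``profile/instr_prof_interface.h``; the names `PROFILE_DUMP`, `PROFILE_RESET`, and `finish_hot_region` are hypothetical and exist only in this example.

```c
/* Hypothetical user-side shim. PROFILE_DUMP and PROFILE_RESET are invented
   for this sketch; they are not provided by clang or its headers. */
#ifdef __LLVM_INSTR_PROFILE_GENERATE
int __llvm_profile_dump(void);
void __llvm_profile_reset_counters(void);
#define PROFILE_DUMP() __llvm_profile_dump()
#define PROFILE_RESET() __llvm_profile_reset_counters()
#else
/* Not compiling for profile generation: expand to harmless expressions. */
#define PROFILE_DUMP() (0)
#define PROFILE_RESET() ((void)0)
#endif

/* Dumps the profile for the region that just finished and resets the
   counters. Returns the dump status, which is 0 when there is nothing
   to dump. */
int finish_hot_region(void) {
  int status = PROFILE_DUMP();
  PROFILE_RESET();
  return status;
}
```

Because the macros expand to constant expressions in a normal build, the compiler can remove the calls entirely and no runtime symbol is referenced.
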
+ The two macros provide more flexibility, so a user program can execute code
+ specifically intended for profile generation or profile use.
+ For example, a user program can have special logging during profile generation:
+
+ .. code-block:: c
+
+   #if __LLVM_INSTR_PROFILE_GENERATE
+   expensive_logging_of_full_program_state();
+   #endif
+
+ The logging is automatically excluded from a normal build of the program,
+ so it does not affect performance during normal execution.
+
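Both macros can also be tested together, for instance to record which kind of build is running. The sketch below assumes nothing beyond the two macros described above; the enum and function names are hypothetical.

```c
/* Hypothetical names, used only for this sketch. */
enum pgo_build_mode { PGO_NONE, PGO_GENERATE, PGO_USE };

/* Reports how this translation unit was compiled. The generate macro is
   checked first, so it wins if both macros happen to be defined. */
enum pgo_build_mode current_pgo_build_mode(void) {
#if defined(__LLVM_INSTR_PROFILE_GENERATE)
  return PGO_GENERATE;
#elif defined(__LLVM_INSTR_PROFILE_USE)
  return PGO_USE;
#else
  return PGO_NONE;
#endif
}

/* Maps a mode to a short label, e.g. for a startup log line. */
const char *pgo_build_mode_name(enum pgo_build_mode mode) {
  switch (mode) {
  case PGO_GENERATE:
    return "profile-generate";
  case PGO_USE:
    return "profile-use";
  default:
    return "normal";
  }
}
```

A program could print `pgo_build_mode_name(current_pgo_build_mode())` once at startup, which keeps the macro-dependent code out of hot paths.
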
+ It is advised to use such fine tuning only in a program's cold regions. The weak
+ symbols can introduce extra control flow (the ``if`` checks), while the macros
+ (and hence the declarations they guard in ``profile/instr_prof_interface.h``)
+ can change the control flow of the functions that use them between profile
+ generation and profile use (which can lead to discarded counters in such
+ functions). Using these APIs in the program's cold regions introduces less
+ overhead and leads to more optimized code.
+
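One possible way to follow this advice is to confine every profile-control call to a single small helper that is marked cold and never inlined, so that hot functions contain only a plain call. This is a sketch under stated assumptions: the helper name `dump_region_profile` is hypothetical, and the `cold` and `noinline` attributes are a suggestion of this example rather than something the text above prescribes.

```c
#ifdef __LLVM_INSTR_PROFILE_GENERATE
/* Runtime functions, declared only when compiling for profile generation. */
void __llvm_profile_set_filename(const char *Name);
void __llvm_profile_reset_counters(void);
int __llvm_profile_dump(void);
#endif

/* Hypothetical helper: any difference between profile-generate and
   profile-use builds stays inside this cold function instead of leaking
   into its (possibly hot) callers. Returns the dump status, or 0 when
   the build is not collecting a profile. */
__attribute__((cold, noinline))
int dump_region_profile(const char *filename) {
#ifdef __LLVM_INSTR_PROFILE_GENERATE
  __llvm_profile_set_filename(filename);
  int status = __llvm_profile_dump();
  __llvm_profile_reset_counters();
  return status;
#else
  (void)filename; /* normal build: nothing to dump */
  return 0;
#endif
}
```

A caller would invoke, for example, `dump_region_profile("region1.profraw")` after a hot region finishes, leaving the hot code itself free of macros and weak-symbol checks.
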
Disabling Instrumentation
^^^^^^^^^^^^^^^^^^^^^^^^^
