Skip to content

Commit 4859cdf

Browse files
authored
unix: BOLT fixes (#463)
As part of investigating failures with BOLT when upgrading to LLVM 19, I found and fixed a few issues with BOLT. First, `test_embed` had been segfaulting on BOLT instrumented binaries. Why I'm not entirely sure. But the segfault only seems to occur in instrumentation mode. These tests are doing low-level things with the interpreter. So I suspect some kind of global mutable state issue or something. I found the exact tests triggering the segfaults and added annotations to skip them. The CPython build system treats the segfault as fatal on 3.13 but not 3.12. This means that on 3.12 we were only running a subset of tests and not collecting BOLT instrumentation nor applying optimizations for all tests after `test_embed`. The removal of the segfault enables us to enable BOLT on 3.13+. Second, LLVM 19.x has a hard error when handling PIC compiled functions containing computed gotos. It appears prior versions of LLVM could silently have buggy behavior in this scenario. We need to skip functions with computed gotos to allow LLVM 19.x to work with BOLT. It makes sense to apply this patch before LLVM 19.x upgrade to prevent bugs with computed gotos. Third, I noticed BOLT was complaining about the lack of `-update-debug-sections` during instrumentation. The 2nd and 3rd issues require common arguments to both BOLT instrumentation and application invocations. The patch fixing both introduces a new configure variable to hold common BOLT arguments. This patch is a good candidate for upstreaming.
1 parent 4c4d347 commit 4859cdf

4 files changed

+125
-6
lines changed

cpython-unix/build-cpython.sh

Lines changed: 19 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -259,6 +259,11 @@ if [ -n "${PYTHON_MEETS_MINIMUM_VERSION_3_11}" ]; then
259259
patch -p1 -i ${ROOT}/patch-pwd-remove-conditional.patch
260260
fi
261261

262+
# Adjust BOLT flags to yield better behavior. See inline details in patch.
263+
if [ -n "${PYTHON_MEETS_MINIMUM_VERSION_3_12}" ]; then
264+
patch -p1 -i ${ROOT}/patch-configure-bolt-flags.patch
265+
fi
266+
262267
# The optimization make targets are both phony and non-phony. This leads
263268
# to PGO targets getting reevaluated after a build when you use multiple
264269
# make invocations. e.g. `make install` like we do below. Fix that.
@@ -293,6 +298,19 @@ if [ -n "${CROSS_COMPILING}" ]; then
293298
patch -p1 -i ${ROOT}/patch-force-cross-compile.patch
294299
fi
295300

301+
# BOLT instrumented binaries segfault in some test_embed tests for unknown reasons.
302+
# On 3.12 (minimum BOLT version), the segfault causes the test harness to
303+
# abort and BOLT optimization uses the partial test results. On 3.13, the segfault
304+
# is a fatal error.
305+
if [ -n "${PYTHON_MEETS_MINIMUM_VERSION_3_10}" ]; then
306+
patch -p1 -i ${ROOT}/patch-test-embed-prevent-segfault.patch
307+
fi
308+
309+
# Same as above but for an additional set of tests introduced in 3.14.
310+
if [ -n "${PYTHON_MEETS_MINIMUM_VERSION_3_14}" ]; then
311+
patch -p1 -i ${ROOT}/patch-test-embed-prevent-segfault-3.14.patch
312+
fi
313+
296314
# Most bits look at CFLAGS. But setup.py only looks at CPPFLAGS.
297315
# So we need to set both.
298316
CFLAGS="${EXTRA_TARGET_CFLAGS} -fPIC -I${TOOLS_PATH}/deps/include -I${TOOLS_PATH}/deps/include/ncursesw"
@@ -393,12 +411,7 @@ fi
393411

394412
if [ -n "${CPYTHON_OPTIMIZED}" ]; then
395413
CONFIGURE_FLAGS="${CONFIGURE_FLAGS} --enable-optimizations"
396-
if [[ -n "${PYTHON_MEETS_MINIMUM_VERSION_3_13}" && -n "${BOLT_CAPABLE}" ]]; then
397-
# Due to a SEGFAULT when running `test_embed` with BOLT instrumented binaries, we can't use
398-
# BOLT on Python 3.13+.
399-
# TODO: Find a fix for this or consider skipping these tests specifically
400-
echo "BOLT is disabled on Python 3.13+"
401-
elif [[ -n "${PYTHON_MEETS_MINIMUM_VERSION_3_12}" && -n "${BOLT_CAPABLE}" ]]; then
414+
if [[ -n "${PYTHON_MEETS_MINIMUM_VERSION_3_12}" && -n "${BOLT_CAPABLE}" ]]; then
402415
CONFIGURE_FLAGS="${CONFIGURE_FLAGS} --enable-bolt"
403416
fi
404417
fi
Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,50 @@
1+
diff --git a/configure.ac b/configure.ac
2+
index bc8c357e996..eef55d4839a 100644
3+
--- a/configure.ac
4+
+++ b/configure.ac
5+
@@ -2104,6 +2104,27 @@ AS_VAR_IF([enable_shared], [yes], [
6+
BOLT_BINARIES="${BOLT_BINARIES} \$(INSTSONAME)"
7+
])
8+
9+
+AC_ARG_VAR(
10+
+ [BOLT_COMMON_FLAGS],
11+
+ [Common arguments to llvm-bolt when instrumenting and applying]
12+
+)
13+
+
14+
+AC_MSG_CHECKING([BOLT_COMMON_FLAGS])
15+
+if test -z "${BOLT_COMMON_FLAGS}"
16+
+then
17+
+ AS_VAR_SET(
18+
+ [BOLT_COMMON_FLAGS],
19+
+ [m4_normalize("
20+
+ [-update-debug-sections]
21+
+
22+
+ dnl At least LLVM 19.x doesn't support computed gotos in PIC compiled code.
23+
+ dnl Exclude functions containing computed gotos.
24+
+ dnl TODO this may be fixed in LLVM 20.x via https://github.com/llvm/llvm-project/pull/120267.
25+
+ [-skip-funcs=_PyEval_EvalFrameDefault,sre_ucs1_match/1,sre_ucs2_match/1,sre_ucs4_match/1]
26+
+ ")]
27+
+ )
28+
+fi
29+
+
30+
AC_ARG_VAR(
31+
[BOLT_INSTRUMENT_FLAGS],
32+
[Arguments to llvm-bolt when instrumenting binaries]
33+
@@ -2111,7 +2132,7 @@ AC_ARG_VAR(
34+
AC_MSG_CHECKING([BOLT_INSTRUMENT_FLAGS])
35+
if test -z "${BOLT_INSTRUMENT_FLAGS}"
36+
then
37+
- BOLT_INSTRUMENT_FLAGS=
38+
+ BOLT_INSTRUMENT_FLAGS="${BOLT_COMMON_FLAGS}"
39+
fi
40+
AC_MSG_RESULT([$BOLT_INSTRUMENT_FLAGS])
41+
42+
@@ -2125,7 +2146,7 @@ then
43+
AS_VAR_SET(
44+
[BOLT_APPLY_FLAGS],
45+
[m4_normalize("
46+
- -update-debug-sections
47+
+ ${BOLT_COMMON_FLAGS}
48+
-reorder-blocks=ext-tsp
49+
-reorder-functions=hfsort+
50+
-split-functions
Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,36 @@
1+
diff --git a/Lib/test/test_embed.py b/Lib/test/test_embed.py
2+
index 7110fb889f3..61e4d0f6179 100644
3+
--- a/Lib/test/test_embed.py
4+
+++ b/Lib/test/test_embed.py
5+
@@ -940,6 +940,7 @@ def check_all_configs(self, testname, expected_config=None,
6+
self.check_global_config(configs)
7+
return configs
8+
9+
+ @unittest.skip("segfaults on BOLT instrumented binaries")
10+
def test_init_default_config(self):
11+
self.check_all_configs("test_init_initialize_config", api=API_COMPAT)
12+
13+
@@ -1039,6 +1040,7 @@ def test_init_from_config(self):
14+
self.check_all_configs("test_init_from_config", config, preconfig,
15+
api=API_COMPAT)
16+
17+
+ @unittest.skip("segfaults on BOLT instrumented binaries")
18+
def test_init_compat_env(self):
19+
preconfig = {
20+
'allocator': ALLOCATOR_FOR_CONFIG,
21+
@@ -1074,6 +1076,7 @@ def test_init_compat_env(self):
22+
self.check_all_configs("test_init_compat_env", config, preconfig,
23+
api=API_COMPAT)
24+
25+
+ @unittest.skip("segfaults on BOLT instrumented binaries")
26+
def test_init_python_env(self):
27+
preconfig = {
28+
'allocator': ALLOCATOR_FOR_CONFIG,
29+
@@ -1772,6 +1775,7 @@ def test_init_set_config(self):
30+
self.check_all_configs("test_init_set_config", config,
31+
api=API_ISOLATED)
32+
33+
+ @unittest.skip("segfaults on BOLT instrumented binaries")
34+
def test_initconfig_api(self):
35+
preconfig = {
36+
'configure_locale': True,
Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
diff --git a/Lib/test/test_embed.py b/Lib/test/test_embed.py
2+
index 13713cf37b8..ba23880b15f 100644
3+
--- a/Lib/test/test_embed.py
4+
+++ b/Lib/test/test_embed.py
5+
@@ -1615,6 +1615,7 @@ def test_getpath_abspath_win32(self):
6+
for (_, expected), result in zip(CASES, results):
7+
self.assertEqual(result, expected)
8+
9+
+ @unittest.skip("segfaults on BOLT instrumented binaries")
10+
def test_global_pathconfig(self):
11+
# Test C API functions getting the path configuration:
12+
#
13+
@@ -1866,6 +1867,7 @@ def test_no_memleak(self):
14+
self.assertEqual(blocks, 0, out)
15+
16+
17+
+@unittest.skip("segfaults on BOLT instrumented binaries")
18+
class StdPrinterTests(EmbeddingTestsMixin, unittest.TestCase):
19+
# Test PyStdPrinter_Type which is used by _PySys_SetPreliminaryStderr():
20+
# "Set up a preliminary stderr printer until we have enough

0 commit comments

Comments
 (0)