Skip to content

GH-131556: calculate PYBUILDDIR in the makefile instead of sysconfig #131761

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

FFY00
Copy link
Member

@FFY00 FFY00 commented Mar 26, 2025

Right now, this value is being calculated during the build process when calling python -m sysconfig --generate-posix-vars. This command, among other things, calculates the PYBUILDDIR value and writes it to pybuilddir.txt, making it available to the Makefile this way.

This is problematic because it makes it impossible to write Makefile rules with targets under PYBUILDDIR — the target and prerequisites of rules are always expanded immediately as the Makefile is read, only the rule recipe is deferred [1], meaning that all targets must be known before any rule is executed. Since PYBUILDDIR is only known in the middle of the build process — after the target list has been built — we cannot write rules for files under it.

We have had to work around this limitation in a couple ways:

  • Extension modules, which need to be present in PYBUILDDIR to execute the interpreter in-tree, are built in the source tree instead, and afterwards symlinked to PYBUILDDIR once its value is known.
  • Instead of the sysconfigdata module and the sysconfig vars JSON file having their own target, they are currently are generated by the pybuilddir.txt target.

Additionally, this limitation is also prone to cause issues such as GH-131556.

That said, on top of the Makefile issues, PYBUILDDIR being calculated by sysconfig also creates its own additional complications, necessitating more workarounds and introducing unecessary complexity to the sysconfig data generation code — there's a chicken-and-egg problem in certain systems, where we need to know PYBUILDDIR to install the sysconfigdata module, but the sysconfigdata module is needed to calculate the value PYBUILDDIR [2]. We currently handle this by manually building a module object for sysconfigdata, injecting it into sys.modules, and then calculating the value of PYBUILDDIR.

By defining PYBUILDDIR directly in Makefile.pre.in, we solve all these problems. The current build system design is the result of a succession of small fixes adapting the original sysconfig code from over 15 years ago to work with the rest of the codebase as it evolved. That is to say, I don't think there's any technical reasoning behind this particular design, I believe it is simply the result of technical debt. Therefore, I don't see any reason not to move the PYBUILDDIR definition over to Makefile.pre.in, which to me seems to make more sense.

[1] https://www.gnu.org/software/make/manual/html_node/Reading-Makefiles.html
[2]

# There's a chicken-and-egg situation on OS X with regards to the
# _sysconfigdata module after the changes introduced by #15298:
# get_config_vars() is called by get_platform() as part of the
# `make pybuilddir.txt` target -- which is a precursor to the
# _sysconfigdata.py module being constructed. Unfortunately,
# get_config_vars() eventually calls _init_posix(), which attempts
# to import _sysconfigdata, which we won't have built yet. In order
# for _init_posix() to work, if we're on Darwin, just mock up the
# _sysconfigdata module manually and populate it with the build vars.
# This is more than sufficient for ensuring the subsequent call to
# get_platform() succeeds.
# GH-127178: Since we started generating a .json file, we also need this to
# be able to run sysconfig.get_config_vars().
module = types.ModuleType(name)
module.build_time_vars = vars
sys.modules[name] = module

Right now, this value is being calculated during the build process when calling
`python -m sysconfig --generate-posix-vars`. This command, ammong other things,
calculates the PYBUILDDIR value and writes it to a pybuilddir.txt, making it
available to the Makefile this way.

This is problematic because it makes it impossible to write Makefile rules with
targets under PYBUILDDIR — the target and prerequisites of rule are always
expanded immediatily as the Makefile is read, only the rule recipe is deferred
[1], meaning that all targets must be known before any rule is executed. Since
PYBUILDDIR is only known in the middle of the build process — after the target
list has been built — we cannot write rules for files under it.

We have had to worked around this limitation in a couple ways:
- Extension modules, which need to be present in PYBUILDDIR to execute the
  interpreter in-tree, are built in the source tree, and afterwards
  symlinked to PYBUILDDIR once its value is known.
- Instead of the sysconfigdata module and the sysconfig vars JSON file, instead
  of having theirn own target, are generated by the pybuilddir.txt target.

Additionally, this limitation is also prone to cause issues such as pythonGH-131556.

That said, on top of the Makefile issues, PYBUILDDIR being calculated by
sysconfig also creates its own additional complications, necessitating more
workarounds and introducing unecessary complexity to the sysconfig data
generation code — there's a chicken-and-egg problem in certain systems, where we
need to know PYBUILDDIR to install the sysconfigdata module, but the
sysconfigdata module is necessary to be avaible to calculate the value
PYBUILDDIR [2]. We currently handle this by manually building a module object for
sysconfigdata and inject it to sys.modules.

By defining PYBUILDDIR directly in Makefile.pre.in, we solve all these problems.
The current build system design is the result of a sucession small fixes
adapting the original sysconfig code from over 15 years ago to work with the rest
of codebase as it evolved. That is to say — I don't think there's any technical
resoning behind this particular design, I believe it is simply the result of
technical debt. Therefore, I don't see any reason why not to move the PYBUILDDIR
definition over to Makefile.pre.in, which to me seems to make more sense

[1] https://www.gnu.org/software/make/manual/html_node/Reading-Makefiles.html
[2] https://github.com/python/cpython/blob/898e6b395e63ad7f8bbe421adf0af8947d429925/Lib/sysconfig/__main__.py#L206-L221

Signed-off-by: Filipe Laíns <[email protected]>
@FFY00 FFY00 added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 26, 2025
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @FFY00 for commit 6d86d82 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F131761%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 26, 2025
@FFY00 FFY00 changed the title GH-131556: calculate PYBUILDDIR in configure instead of sysconfig GH-131556: calculate PYBUILDDIR in the makefile instead of sysconfig Mar 26, 2025
@FFY00
Copy link
Member Author

FFY00 commented Mar 26, 2025

The current CI failures are due to GH-131770, which went unnoticed until now due to GH-131769, which I unintentionally fixed in this PR 😅.

@FFY00 FFY00 requested a review from colesbury March 26, 2025 20:02
Copy link
Member

@brettcannon brettcannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the wasi.py changes.

@FFY00
Copy link
Member Author

FFY00 commented Mar 26, 2025

@ned-deily I'd appreciate if you could take a look at this and test it locally, since it removes the need for a workaround that was originally introduced for macOS. Currently, that workaround is also necessary on Linux, which I have already tested. While I don't expect any different behavior on macOS, which the CI seems to agree, if you have some time to double-check it, it would be appreciated. Thanks!

Copy link
Contributor

@freakboy3742 freakboy3742 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the macOS changes makes sense, but I'll defer to @ned-deily's experience with the historical context on the workaround that has been removed.

A couple of minor issues flagged with the Emscripten pieces.

sysconfig_data = (
f"{emscripten_build_dir}/build/lib.emscripten-wasm32-{python_version}"
f"{emscripten_build_dir}/build/"
f"lib.emscripten-wasm32-emscripten-{python_version}-{abiflags}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't quite work - pydebug is used in a couple of places (e.g., L203) that isn't covered by this change. I agree using abiflags is a better way to handle it - but we should probably use abiflags to set pydebug.

Copy link
Member Author

@FFY00 FFY00 Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, I missed that line, I meant to do the exact same change I did to the WASI script. Since I changed the format of pybuilddir.txt, which is where sysconfig_data points to, we shouldn't be adding -pydebug. My change to the sysconfig_data variable already sets it to the correct value, I just forgot to remove the line below in the commit 😅.

I don't remember seeing pydebug being used elsewhere, but I will check when I am back at the computer.

That said, the script is broken as-is, meaning it must not be covered by the CI. We should probably fix that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, the script is broken as-is, meaning must not be covered by the CI. We should probably fix that.

Correct - is isn't, and yes, we should.

The machine that will be used for the Emscripten buildbot has been commissioned. Right now, Emscripten hard-crashes on a couple of tests when -uall is enabled, but once those are resolved, we should be able to add the Emscripten buildbot to the pool.

if flag == 'd':
configure.append("--with-pydebug")
elif flag == 't':
configure.append("--disable-gil")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the state of Emscripten on free threading will be... I'm torn on whether it's better to leave the option enabled, and just be aware that it will quite possible break, or to error out if a free threaded build is used. Maybe the former with a warning message?

Either way, we should probably handle it similar to pydebug - set a flag (freethreaded?) earlier in the process and then use that flag.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering the build-python installation built by this script is specifically meant to be used in the cross-compilation process for the Emscripten build, I don't think we need remove it. Having it here means the user specifically asked for it.

Regarding the warning, I think it makes sense, but I think the build system is probably a more suitable place for it. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - I think the build system is a better place for that warning - assuming it's even needed. Its possible free threaded builds might work... especially given that there's no thread support in Emscripten.

@bedevere-app
Copy link

bedevere-app bot commented Mar 27, 2025

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

Copy link
Member

@ned-deily ned-deily left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My approach to reviewing this PR was to do a few before and after builds on macOS. My primary focus was differences in the generated Makefiles and the contents of the generated pybuilddir.txt file.

There appear to be three differences in the Makefile:

  1. The addition of the new PYBUILDDIR variable:

> PYBUILDDIR = "build/lib.darwin-darwin-3.14-t"

  1. The paths for the shared module .so's in the SHAREDMODS variable now point to the build/lib.<platform> directory rather than the previous symlinked path in the Modules directory:
< 	SHAREDMODS = "Modules/array.cpython-314t-darwin.so Modules/_asyncio.cpython-314t-darwin.so [...]

> 	SHAREDMODS = "build/lib.darwin-darwin-3.14-t/array.cpython-314t-darwin.so build/lib.darwin-darwin-3.14-t/_asyncio.cpython-314t-darwin.so [...]
  1. The addition of the new SYSCONFIGDATA* variables:
> 	SYSCONFIGDATA_NAME = "_sysconfigdata_t_darwin_darwin"
> 	SYSCONFIGVARS_DEPS = "\"
> 	SYSCONFIGVARS_JSON_NAME = "_sysconfigvars_t_darwin_darwin.json"
> 	SYSCONFIG_SRC = "\"

In addition, the contents of the generated pybuilddir.txt file (and, thus, the name of the directory where the .so files are built) has changed for macOS. For example, in this case building with ./configure --disable-gil --with-pydebug with a deployment target of macOS 15.4:

< build/lib.macosx-15.4-arm64-3.14

> build/lib.darwin-darwin-3.14-td

A lot of the differences here (and the "technical debt") can be attributed to the support in the Makefile prior to 3.11 for two different mechanisms for building standard library extension modules: the (older) explicit method using the Modules/Setup* files and the (slightly newer) Distutils-based setup.py method. Both co-existed and explain why there were the .so symlinks, i.e. because Distutils built .so's in the build/lib.<platform> directory while the Modules/Setup* files built them in the Modules directory and the installed build was based on what was in Modules. After the major changes in 3.11 and subsequent releases that removed setup.py and the build dependency on Distutils, some artifacts remained, including (AFAIK) continued support for Modules/Setup*; this area has never been particularly well-documented.

I (and, I expect, most other core devs) have very little personal experience with using Modules/Setup.local to tailor a build (or to what extent, if any, it all still works the same in 3.14 as it did prior to 3.11). So before making any changes that might affect builds that use Modules/Setup.local, I think these changes need to be reviewed by people who actually use it. @Yhg1s might be a good starting point for this. And also @erlend-aasland as having been involved in making the changes for 3.11 and beyond.

AFAICT, the difference in the build/lib.* names for macOS builds should not be a big deal for most builders since the use of the path was essentially isolated to the Makefile outside of pybuilddir.txt and it didn't have any influence on an installed Python. I would guess that few people running Python from a build directory (rather than from an installed location) would be affected by the change in directory name, particularly if they are using pybuilddir.txt. I'm not sure why Distutils included the deployment target (i.e. the 15.4) and/or the Mac architecture (arm64) in the build directory name but it might have allowed using one top-level build directory for multiple invocations of ./configure. I would be surprised if anyone were actually doing that (if it ever worked); using separate top-level build directories for each build configuration would be a simple fix and safer approach if so. (And, of course, it was the use of these variables in the directory name that lead to the "chicken-and-egg" circular dependency in the first place so that's a plus point.)

So, in summary, I don't see any significant macOS-specific problems with the PR as it stands. But the PR should be reviewed by someone with hands-on experience with using Modules/Setup.local in a post-3.11 environment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants