
Commit 68c3048

internal: repos to create a toolchain from a locally installed Python (#2000)
This adds the primitives for defining a toolchain based on a locally installed Python. Doing this consists of two parts:

* A repo rule to define a Python runtime pointing to a local Python installation.
* A repo rule to define toolchains for those runtimes.

The runtime repos create platform runtimes, i.e., they set `py_runtime.interpreter_path`. This means the runtime isn't included in the runfiles.

Note that these repo rules are largely implementation details, and are definitely not stable API-wise. Creating public APIs to use them through WORKSPACE or bzlmod will be done in a separate change (there are a few design and behavior questions to discuss).

This is definitely experimental quality. In particular, the code that tries to figure out the C headers/libraries is very finicky. I couldn't find solid docs about how to do this, and there are a lot of undocumented settings, so what's there is what I was able to piece together from my laptop's behavior.

Misc other changes:

* Also fixes a bug if a pyenv-backed interpreter path is used for precompiling: pyenv uses `$0` to determine what to re-exec. The `:current_interpreter_executable` target used its own name, which pyenv didn't understand.
* The repo logger now also accepts a string. This should help prevent accidentally passing a string causing an error. It's also just a bit more convenient when doing development.
* Repo loggers will automatically include their rule name and repo name. This makes following logging output easier.
* Makes `repo_utils.execute()` report progress.
* Adds `repo_utils.getenv`, `repo_utils.watch`, and `repo_utils.watch_tree`: backwards compatibility functions for their `rctx` equivalents.
* Adds `repo_utils.which_unchecked`: calls `which`, but allows for failure.
* Adds `repo_utils.get_platforms_os_name()`: Returns the name used in `@platforms` for the OS reported by `rctx`.
* Makes several repo util functions call `watch()` or `getenv()`, if available. This makes repository rules better respect environmental changes.
* Adds more detail to the definition of an in-build vs platform runtime.
* Adds a README for the integration tests directory. Setting up and using one is a bit more involved than other tests, so some docs help.
* Allows integration tests to specify bazel versions to use.
1 parent 68f752e commit 68c3048

21 files changed: +924 -38 lines changed

.bazelrc

+2-2
@@ -4,8 +4,8 @@
 # (Note, we cannot use `common --deleted_packages` because the bazel version command doesn't support it)
 # To update these lines, execute
 # `bazel run @rules_bazel_integration_test//tools:update_deleted_packages`
-build --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/patches,examples/bzlmod/py_proto_library,examples/bzlmod/py_proto_library/example.com/another_proto,examples/bzlmod/py_proto_library/example.com/proto,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/py_proto_library,examples/py_proto_library/example.com/another_proto,examples/py_proto_library/example.com/proto,tests/integration/compile_pip_requirements,tests/integration/compile_pip_requirements_test_from_external_repo,tests/integration/custom_commands,tests/integration/ignore_root_user_error,tests/integration/ignore_root_user_error/submodule,tests/integration/pip_parse,tests/integration/pip_parse/empty,tests/integration/py_cc_toolchain_registered
-query --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/patches,examples/bzlmod/py_proto_library,examples/bzlmod/py_proto_library/example.com/another_proto,examples/bzlmod/py_proto_library/example.com/proto,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/py_proto_library,examples/py_proto_library/example.com/another_proto,examples/py_proto_library/example.com/proto,tests/integration/compile_pip_requirements,tests/integration/compile_pip_requirements_test_from_external_repo,tests/integration/custom_commands,tests/integration/ignore_root_user_error,tests/integration/ignore_root_user_error/submodule,tests/integration/pip_parse,tests/integration/pip_parse/empty,tests/integration/py_cc_toolchain_registered
+build --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/patches,examples/bzlmod/py_proto_library,examples/bzlmod/py_proto_library/example.com/another_proto,examples/bzlmod/py_proto_library/example.com/proto,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/py_proto_library,examples/py_proto_library/example.com/another_proto,examples/py_proto_library/example.com/proto,tests/integration/compile_pip_requirements,tests/integration/compile_pip_requirements_test_from_external_repo,tests/integration/custom_commands,tests/integration/ignore_root_user_error,tests/integration/ignore_root_user_error/submodule,tests/integration/local_toolchains,tests/integration/pip_parse,tests/integration/pip_parse/empty,tests/integration/py_cc_toolchain_registered
+query --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/patches,examples/bzlmod/py_proto_library,examples/bzlmod/py_proto_library/example.com/another_proto,examples/bzlmod/py_proto_library/example.com/proto,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/py_proto_library,examples/py_proto_library/example.com/another_proto,examples/py_proto_library/example.com/proto,tests/integration/compile_pip_requirements,tests/integration/compile_pip_requirements_test_from_external_repo,tests/integration/custom_commands,tests/integration/ignore_root_user_error,tests/integration/ignore_root_user_error/submodule,tests/integration/local_toolchains,tests/integration/pip_parse,tests/integration/pip_parse/empty,tests/integration/py_cc_toolchain_registered
 
 test --test_output=errors
 

docs/sphinx/glossary.md

+25
@@ -7,6 +7,30 @@ common attributes
 [Common attributes](https://bazel.build/reference/be/common-definitions#common-attributes)
 for a complete listing
 
+in-build runtime
+: An in-build runtime is one where the Python runtime, and all its files, are
+known to the build system and a Python binary includes all the necessary parts
+of the runtime in its runfiles. Such runtimes may be remotely downloaded, part
+of your source control, or mapped in from local files by repositories.
+
+  The main advantage of in-build runtimes is they ensure you know what Python
+  runtime will be used, since it's part of the build itself and included in
+  the resulting binary. The main disadvantage is the additional work it adds to
+  building. The whole Python runtime is included in a Python binary's runfiles,
+  which can be a significant number of files.
+
+platform runtime
+: A platform runtime is a Python runtime that is assumed to be installed on the
+system where a Python binary runs, wherever that may be. For example, using `/usr/bin/python3`
+as the interpreter is a platform runtime -- it assumes that, wherever the binary
+runs (your local machine, a remote worker, within a container, etc.), that path
+is available. Such runtimes are _not_ part of a binary's runfiles.
+
+  The main advantage of platform runtimes is they are lightweight insofar as
+  building the binary is concerned. All Bazel has to do is pass along a string
+  path to the interpreter. The disadvantage is, if you don't control the systems
+  being run on, you may get different Python installations than expected.
+
 rule callable
 : A function that behaves like a rule. This includes, but is not is not
 limited to:
@@ -26,3 +50,4 @@ simple label
 nonconfigurable
 : A nonconfigurable value cannot use `select`. See Bazel's
 [configurable attributes](https://bazel.build/reference/be/common-definitions#configurable-attributes) documentation.
+

python/private/full_version.bzl

+1-1
@@ -40,4 +40,4 @@ def full_version(version):
             ),
         )
     else:
-        fail("Unknown version format: {}".format(version))
+        fail("Unknown version format: '{}'".format(version))

python/private/get_local_runtime_info.py

+49
@@ -0,0 +1,49 @@
+# Copyright 2024 The Bazel Authors. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import json
+import sys
+import sysconfig
+
+data = {
+    "major": sys.version_info.major,
+    "minor": sys.version_info.minor,
+    "micro": sys.version_info.micro,
+    "include": sysconfig.get_path("include"),
+    "implementation_name": sys.implementation.name,
+}
+
+config_vars = [
+    # The libpythonX.Y.so file. Usually?
+    # It might be a static archive (.a) file instead.
+    "LDLIBRARY",
+    # The directory with library files. Supposedly.
+    # It's not entirely clear how to get the directory with libraries.
+    # There's several types of libraries with different names and a plethora
+    # of settings.
+    # https://stackoverflow.com/questions/47423246/get-pythons-lib-path
+    # For now, it seems LIBDIR has what is needed, so just use that.
+    "LIBDIR",
+    # The versioned libpythonX.Y.so.N file. Usually?
+    # It might be a static archive (.a) file instead.
+    "INSTSONAME",
+    # The libpythonX.so file. Usually?
+    # It might be a static archive (.a) file instead.
+    "PY3LIBRARY",
+    # The platform-specific filename suffix for library files.
+    # Includes the dot, e.g. `.so`
+    "SHLIB_SUFFIX",
+]
+data.update(zip(config_vars, sysconfig.get_config_vars(*config_vars)))
+print(json.dumps(data))
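
For readers who want to experiment with the values this script reports, the following is a minimal Python sketch, not part of the commit. It runs an equivalent inline snippet under a chosen interpreter, decodes the JSON, and then keeps only the library files that exist, dropping empty and duplicate names. The function names `get_runtime_info` and `candidate_libraries` and the inline `_INFO_SNIPPET` are hypothetical and exist only for illustration.

```python
import json
import os
import subprocess
import sys

# Inline stand-in for the helper script (an assumption for this sketch):
# prints a JSON object with the same sysconfig variables.
_INFO_SNIPPET = """
import json, sys, sysconfig
names = ["LDLIBRARY", "LIBDIR", "INSTSONAME", "PY3LIBRARY", "SHLIB_SUFFIX"]
data = {"major": sys.version_info.major, "minor": sys.version_info.minor}
data.update(zip(names, sysconfig.get_config_vars(*names)))
print(json.dumps(data))
"""


def get_runtime_info(interpreter):
    """Run the snippet under `interpreter` and decode its JSON output."""
    result = subprocess.run(
        [interpreter, "-c", _INFO_SNIPPET],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def candidate_libraries(info):
    """Return existing library paths, skipping empty and duplicate names."""
    names = [info.get("PY3LIBRARY"), info.get("LDLIBRARY"), info.get("INSTSONAME")]
    # dict.fromkeys removes duplicates while preserving first-seen order.
    unique = dict.fromkeys(name for name in names if name)
    libdir = info.get("LIBDIR") or ""
    paths = [os.path.join(libdir, name) for name in unique]
    # The reported names don't always exist on disk, so filter them.
    return [path for path in paths if os.path.exists(path)]


if __name__ == "__main__":
    info = get_runtime_info(sys.executable)
    print(info["major"], info["minor"], candidate_libraries(info))
```

Which variables are populated depends on the installation; some may be empty or missing entirely, which is why the sketch filters before joining paths.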

python/private/local_runtime_repo.bzl

+252
@@ -0,0 +1,252 @@
+# Copyright 2024 The Bazel Authors. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Create a repository for a locally installed Python runtime."""
+
+load("//python/private:enum.bzl", "enum")
+load(":repo_utils.bzl", "REPO_DEBUG_ENV_VAR", "repo_utils")
+
+# buildifier: disable=name-conventions
+_OnFailure = enum(
+    SKIP = "skip",
+    WARN = "warn",
+    FAIL = "fail",
+)
+
+_TOOLCHAIN_IMPL_TEMPLATE = """\
+# Generated by python/private/local_runtime_repo.bzl
+
+load("@rules_python//python/private:local_runtime_repo_setup.bzl", "define_local_runtime_toolchain_impl")
+
+define_local_runtime_toolchain_impl(
+    name = "local_runtime",
+    lib_ext = "{lib_ext}",
+    major = "{major}",
+    minor = "{minor}",
+    micro = "{micro}",
+    interpreter_path = "{interpreter_path}",
+    implementation_name = "{implementation_name}",
+    os = "{os}",
+)
+"""
+
+def _local_runtime_repo_impl(rctx):
+    logger = repo_utils.logger(rctx)
+    on_failure = rctx.attr.on_failure
+
+    platforms_os_name = repo_utils.get_platforms_os_name(rctx)
+    if not platforms_os_name:
+        if on_failure == "fail":
+            fail("Unrecognized host platform '{}': cannot determine OS constraint".format(
+                rctx.os.name,
+            ))
+
+        if on_failure == "warn":
+            logger.warn(lambda: "Unrecognized host platform '{}': cannot determine OS constraint".format(
+                rctx.os.name,
+            ))
+
+        # else, on_failure must be skip
+        rctx.file("BUILD.bazel", _expand_incompatible_template())
+        return
+
+    result = _resolve_interpreter_path(rctx)
+    if not result.resolved_path:
+        if on_failure == "fail":
+            fail("interpreter not found: {}".format(result.describe_failure()))
+
+        if on_failure == "warn":
+            logger.warn(lambda: "interpreter not found: {}".format(result.describe_failure()))
+
+        # else, on_failure must be skip
+        rctx.file("BUILD.bazel", _expand_incompatible_template())
+        return
+    else:
+        interpreter_path = result.resolved_path
+
+    logger.info(lambda: "resolved interpreter {} to {}".format(rctx.attr.interpreter_path, interpreter_path))
+
+    exec_result = repo_utils.execute_unchecked(
+        rctx,
+        op = "local_runtime_repo.GetPythonInfo({})".format(rctx.name),
+        arguments = [
+            interpreter_path,
+            rctx.path(rctx.attr._get_local_runtime_info),
+        ],
+        quiet = True,
+    )
+    if exec_result.return_code != 0:
+        if on_failure == "fail":
+            fail("GetPythonInfo failed: {}".format(exec_result.describe_failure()))
+        if on_failure == "warn":
+            logger.warn(lambda: "GetPythonInfo failed: {}".format(exec_result.describe_failure()))
+
+        # else, on_failure must be skip
+        rctx.file("BUILD.bazel", _expand_incompatible_template())
+        return
+
+    info = json.decode(exec_result.stdout)
+    logger.info(lambda: _format_get_info_result(info))
+
+    # NOTE: Keep in sync with recursive glob in define_local_runtime_toolchain_impl
+    repo_utils.watch_tree(rctx, rctx.path(info["include"]))
+
+    # The cc_library.includes values have to be non-absolute paths, otherwise
+    # the toolchain will give an error. Work around this error by making them
+    # appear as part of this repo.
+    rctx.symlink(info["include"], "include")
+
+    shared_lib_names = [
+        info["PY3LIBRARY"],
+        info["LDLIBRARY"],
+        info["INSTSONAME"],
+    ]
+
+    # In some cases, the value may be empty. Not clear why.
+    shared_lib_names = [v for v in shared_lib_names if v]
+
+    # In some cases, the same value is returned for multiple keys. Not clear why.
+    shared_lib_names = {v: None for v in shared_lib_names}.keys()
+    shared_lib_dir = info["LIBDIR"]
+
+    # The specific files are symlinked instead of the whole directory
+    # because it can point to a directory that has more than just
+    # the Python runtime shared libraries, e.g. /usr/lib, or a Python
+    # specific directory with pip-installed shared libraries.
+    rctx.report_progress("Symlinking external Python shared libraries")
+    for name in shared_lib_names:
+        origin = rctx.path("{}/{}".format(shared_lib_dir, name))
+
+        # The reported names don't always exist; it depends on the particulars
+        # of the runtime installation.
+        if origin.exists:
+            repo_utils.watch(rctx, origin)
+            rctx.symlink(origin, "lib/" + name)
+
+    rctx.file("WORKSPACE", "")
+    rctx.file("MODULE.bazel", "")
+    rctx.file("REPO.bazel", "")
+    rctx.file("BUILD.bazel", _TOOLCHAIN_IMPL_TEMPLATE.format(
+        major = info["major"],
+        minor = info["minor"],
+        micro = info["micro"],
+        interpreter_path = interpreter_path,
+        lib_ext = info["SHLIB_SUFFIX"],
+        implementation_name = info["implementation_name"],
+        os = "@platforms//os:{}".format(repo_utils.get_platforms_os_name(rctx)),
+    ))
+
+local_runtime_repo = repository_rule(
+    implementation = _local_runtime_repo_impl,
+    doc = """
+Use a locally installed Python runtime as a toolchain implementation.
+
+Note this uses the runtime as a *platform runtime*. A platform runtime
+means targets don't include the runtime itself as part of their runfiles or
+inputs. Instead, users must ensure that the runtime is pre-installed or
+otherwise available wherever the targets run.
+
+This results in lighter weight binaries (in particular, Bazel doesn't have to
+create thousands of files for every `py_test`), at the risk of having to rely on
+a system having the necessary Python installed.
+""",
+    attrs = {
+        "interpreter_path": attr.string(
+            doc = """
+An absolute path or program name on the `PATH` env var.
+
+Values with slashes are assumed to be the path to a program. Otherwise, it is
+treated as something to search for on `PATH`.
+
+Note that, when a plain program name is used, the path to the interpreter is
+resolved at repository evaluation time, not runtime of any resulting binaries.
+""",
+            default = "python3",
+        ),
+        "on_failure": attr.string(
+            default = _OnFailure.SKIP,
+            values = sorted(_OnFailure.__members__.values()),
+            doc = """
+How to handle errors when trying to automatically determine settings.
+
+* `skip` will silently skip creating a runtime. Instead, a non-functional
+  runtime will be generated and marked as incompatible so it cannot be used.
+  This is best if a local runtime is known not to work or be available
+  in certain cases and that's OK. e.g., one uses Windows paths when there
+  are people running on Linux.
+* `warn` will print a warning message. This is useful when you expect
+  a runtime to be available, but are OK with it missing and falling back
+  to some other runtime.
+* `fail` will result in a failure. This is only recommended if you must
+  ensure the runtime is available.
+""",
+        ),
+        "_get_local_runtime_info": attr.label(
+            allow_single_file = True,
+            default = "//python/private:get_local_runtime_info.py",
+        ),
+        "_rule_name": attr.string(default = "local_runtime_repo"),
+    },
+    environ = ["PATH", REPO_DEBUG_ENV_VAR],
+)
+
+def _expand_incompatible_template():
+    return _TOOLCHAIN_IMPL_TEMPLATE.format(
+        interpreter_path = "/incompatible",
+        implementation_name = "incompatible",
+        lib_ext = "incompatible",
+        major = "0",
+        minor = "0",
+        micro = "0",
+        os = "@platforms//:incompatible",
+    )
+
+def _resolve_interpreter_path(rctx):
+    """Find the absolute path for an interpreter.
+
+    Args:
+        rctx: A repository_ctx object
+
+    Returns:
+        `struct` with the following fields:
+          * `resolved_path`: `path` object, or `None` if lookup on `PATH` failed.
+            An explicitly given path is returned even if it does not exist.
+          * `describe_failure`: `Callable | None`. If the path could not be
+            resolved, returns a description of why.
+    """
+    if "/" not in rctx.attr.interpreter_path and "\\" not in rctx.attr.interpreter_path:
+        # Provide a bit nicer integration with pyenv: recalculate the runtime if the
+        # user changes the python version using e.g. `pyenv shell`
+        repo_utils.getenv(rctx, "PYENV_VERSION")
+        result = repo_utils.which_unchecked(rctx, rctx.attr.interpreter_path)
+        resolved_path = result.binary
+        describe_failure = result.describe_failure
+    else:
+        repo_utils.watch(rctx, rctx.attr.interpreter_path)
+        resolved_path = rctx.path(rctx.attr.interpreter_path)
+        if not resolved_path.exists:
+            describe_failure = lambda: "Path not found: {}".format(repr(rctx.attr.interpreter_path))
+        else:
+            describe_failure = None
+
+    return struct(
+        resolved_path = resolved_path,
+        describe_failure = describe_failure,
+    )
+
+def _format_get_info_result(info):
+    lines = ["GetPythonInfo result:"]
+    for key, value in sorted(info.items()):
+        lines.append(" {}: {}".format(key, value if value != "" else "<empty string>"))
+    return "\n".join(lines)
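
The interpreter lookup and the `on_failure` policy documented above can be tried outside Bazel. The following is a minimal Python sketch of the documented behavior, not a port of the Starlark code; `resolve_interpreter` and `handle_failure` are hypothetical names. Unlike the Starlark helper, this sketch returns no path when an explicitly given path does not exist.

```python
import os
import shutil
import sys
import warnings


def resolve_interpreter(value):
    """Return (path, failure_description); exactly one of them is None."""
    if "/" in value or "\\" in value:
        # Values with slashes are treated as a path to a program.
        if os.path.exists(value):
            return value, None
        return None, "Path not found: {!r}".format(value)
    # Otherwise, search PATH for a program with that name.
    found = shutil.which(value)
    if found:
        return found, None
    return None, "{!r} not found on PATH".format(value)


def handle_failure(on_failure, message):
    """Apply a skip/warn/fail policy; returns normally for skip and warn."""
    if on_failure == "fail":
        raise RuntimeError(message)
    if on_failure == "warn":
        warnings.warn(message)
    elif on_failure != "skip":
        raise ValueError("unknown on_failure value: {!r}".format(on_failure))


if __name__ == "__main__":
    path, why = resolve_interpreter("python3")
    if path is None:
        # With "skip" or "warn" the caller continues and would emit an
        # incompatible placeholder instead of a working runtime.
        handle_failure("warn", "interpreter not found: " + why)
    else:
        print("resolved python3 to", path)
```

As in the repository rule, a plain program name is resolved once, at the time of the lookup, so a later change to `PATH` is not reflected until the lookup runs again.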
