Commit 15ce2bf

Parallelize bytecode compilation ✨
Bytecode compilation is slow. It's often one of the biggest contributors to the install step's sluggishness. For better or worse, we can't really enable --no-compile by default as it has the potential to render certain workflows permanently slower in a subtle way.[^1]

To improve the situation, bytecode compilation can be parallelized across a pool of processes (or sub-interpreters on Python 3.14). I've observed a 1.1x to 3x improvement in install step times locally.[^2]

This patch has been written to be relatively comprehensible, but for posterity, these are the high-level implementation notes:

- We can't use compileall.compile_dir() because it spins up a new worker pool on every invocation. If it's used as a "drop-in" replacement for compileall.compile_file(), then the pool creation overhead will be paid for every package installed. This is bad and kills most of the gains. Redesigning the installation logic to compile everything at the end was rejected for being too invasive (a key goal was to avoid affecting the package installation order).

- A bytecode compiler is created right before package installation starts and reused for all packages. Depending on platform and workload, either a serial (in-process) compiler or a parallel compiler will be used. They both have the same interface, accepting a batch of Python filepaths to compile.

- This patch was designed to be as low-risk as reasonably possible. pip does not contain any parallelized code, thus introducing any sort of parallelization poses a nontrivial risk. To minimize this risk, the only code parallelized is the bytecode compilation code itself (~10 LOC). In addition, the package install order is unaffected and pip will fall back to serial compilation if parallelization is unsupported.

The criteria for parallelization are:

1. There are at least 2 CPUs available. The process CPU count is used if available, otherwise the system CPU count. If there is only one CPU, serial compilation will always be used because even a parallel compiler with one worker will add extra overhead.

2. The maximum number of workers is at least 2. This is controlled by the --install-jobs option.[^3] It defaults to "auto" which uses the process/system CPU count.[^4]

3. There is "enough" code for parallelization to be "worth it". This criterion exists so pip won't waste (say) 100ms on spinning up a parallel compiler when compiling serially would only take 20ms.[^5] The limit is set to 1 MB of Python code. This is admittedly rather crude, but it seems to work well enough in testing on a variety of systems.

[^1]: Basically, if the Python files are installed to a read-only directory, then importing those files will be permanently slower as the .pyc files will never be cached. This is quite subtle, enough so that we can't really expect newbies to recognise and know how to address this (there is the PYTHONPYCACHEPREFIX envvar, but if you're advanced enough to use it, then you are also advanced enough to know when to use uv or pip's --no-compile).

[^2]: The 1.1x was on a painfully slow dual-core/HDD-equipped Windows install installing simply setuptools. The 3x was observed on my main 8-core Ryzen 5800HS Windows machine while installing pip's own test dependencies.

[^3]: Yes, this is probably not the best name, but adding an option for just bytecode compilation seems silly. Anyway, this will give us room if we ever parallelize more parts of the install step.

[^4]: Up to a hard-coded limit of 8 to avoid resource exhaustion. This number was chosen arbitrarily, but is definitely high enough to net a major improvement.

[^5]: This is important because I don't want to slow down tiny installs (e.g., pip install six ... or our own test suite). Creating a new process is prohibitively expensive on Windows (and to a lesser degree on macOS) for various reasons, so parallelization can't simply be used all of the time.
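The three criteria in the commit message above can be sketched roughly as follows. This is an illustration only, not the code in this patch: `available_cpus`, `should_parallelize`, and both constants are hypothetical names, and treating "1 MB" as 1,000,000 bytes is an assumption. The cap of 8 comes from footnote 4.

```python
import os
from typing import Callable, Union

WorkerSetting = Union[int, str]  # a positive int, or the literal "auto"
MAX_AUTO_WORKERS = 8             # hard-coded cap described in footnote 4
CODE_SIZE_THRESHOLD = 1_000_000  # "1 MB of Python code"; exact value is an assumption


def available_cpus() -> int:
    """Prefer the CPUs usable by this process, else the system CPU count."""
    if hasattr(os, "process_cpu_count"):  # Python 3.13+
        count = os.process_cpu_count()
    elif hasattr(os, "sched_getaffinity"):  # some Unix platforms
        count = len(os.sched_getaffinity(0))
    else:
        count = os.cpu_count()
    return count or 1


def should_parallelize(
    workers: WorkerSetting,
    cpus: int,
    is_enough_code: Callable[[int], bool],
) -> bool:
    """Apply the three criteria in order, cheapest check first."""
    # 1. At least two CPUs must be available.
    if cpus < 2:
        return False
    # 2. The worker limit ("auto" means CPU count, capped) must be at least 2.
    max_workers = min(cpus, MAX_AUTO_WORKERS) if workers == "auto" else int(workers)
    if max_workers < 2:
        return False
    # 3. There must be enough Python code to amortize the pool start-up cost.
    return is_enough_code(CODE_SIZE_THRESHOLD)
```

The size check is passed in as a callable so that the (comparatively expensive) wheel inspection only runs when the two cheap checks have already passed.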
1 parent 331400c commit 15ce2bf

File tree

8 files changed

+439
-31
lines changed

news/13247.feature.rst

+4
@@ -0,0 +1,4 @@
+Bytecode compilation is parallelized to significantly speed up installation of
+large/many packages. By default, the number of workers matches the available CPUs
+(up to a hard-coded limit), but can be adjusted using the ``--install-jobs``
+option. To disable parallelization, pass ``--install-jobs 1``.

src/pip/_internal/cli/cmdoptions.py

+33
@@ -1070,6 +1070,39 @@ def check_list_path_option(options: Values) -> None:
 )
 
 
+def _handle_jobs(
+    option: Option, opt_str: str, value: str, parser: OptionParser
+) -> None:
+    if value == "auto":
+        setattr(parser.values, option.dest, "auto")
+        return
+
+    try:
+        if (count := int(value)) > 0:
+            setattr(parser.values, option.dest, count)
+            return
+    except ValueError:
+        pass
+
+    msg = "should be a positive integer or 'auto'"
+    raise_option_error(parser, option=option, msg=msg)
+
+
+install_jobs: Callable[..., Option] = partial(
+    Option,
+    "--install-jobs",
+    dest="install_jobs",
+    default="auto",
+    type=str,
+    action="callback",
+    callback=_handle_jobs,
+    help=(
+        "Maximum number of workers to use while installing packages. "
+        "To disable parallelization, pass 1. (default: %default)"
+    ),
+)
+
+
 ##########
 # groups #
 ##########
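To see how a callback option of this shape behaves, here is a standalone sketch using only the standard library's `optparse`. It is not pip's code: `handle_jobs` and `parse_install_jobs` are illustrative names, and `OptionValueError` stands in for pip's internal `raise_option_error` helper.

```python
from optparse import Option, OptionParser, OptionValueError
from typing import List, Union


def handle_jobs(option: Option, opt_str: str, value: str, parser: OptionParser) -> None:
    """Accept 'auto' or a positive integer; reject everything else."""
    if value == "auto":
        setattr(parser.values, option.dest, "auto")
        return
    try:
        count = int(value)
    except ValueError:
        count = 0  # fall through to the error below
    if count > 0:
        setattr(parser.values, option.dest, count)
        return
    # optparse turns this into a usage error and exits with status 2.
    raise OptionValueError(f"{opt_str} should be a positive integer or 'auto'")


def parse_install_jobs(argv: List[str]) -> Union[int, str]:
    """Parse argv and return the resolved --install-jobs value."""
    parser = OptionParser()
    parser.add_option(
        "--install-jobs",
        dest="install_jobs",
        default="auto",
        type=str,
        action="callback",
        callback=handle_jobs,
    )
    options, _ = parser.parse_args(argv)
    return options.install_jobs
```

With this sketch, `parse_install_jobs([])` yields `"auto"`, `parse_install_jobs(["--install-jobs", "4"])` yields the integer `4`, and a value such as `0` or `many` ends in `SystemExit`.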

src/pip/_internal/commands/install.py

+7
@@ -270,6 +270,8 @@ def add_options(self) -> None:
             ),
         )
 
+        self.cmd_opts.add_option(cmdoptions.install_jobs())
+
     @with_cleanup
     def run(self, options: Values, args: List[str]) -> int:
         if options.use_user_site and options.target_dir is not None:
@@ -416,6 +418,10 @@ def run(self, options: Values, args: List[str]) -> int:
                 # we're not modifying it.
                 modifying_pip = pip_req.satisfied_by is None
                 protect_pip_from_modification_on_windows(modifying_pip=modifying_pip)
+                if modifying_pip:
+                    # Parallelization will re-import pip when starting new workers
+                    # during installation which is unsafe if pip is being modified.
+                    options.install_jobs = 1
 
             reqs_to_build = [
                 r
@@ -465,6 +471,7 @@ def run(self, options: Values, args: List[str]) -> int:
                 use_user_site=options.use_user_site,
                 pycompile=options.compile,
                 progress_bar=options.progress_bar,
+                workers=options.install_jobs,
             )
 
             lib_locations = get_lib_location_guesses(

src/pip/_internal/operations/install/wheel.py

+12-26
@@ -1,19 +1,15 @@
 """Support for installing and building the "wheel" binary package format."""
 
 import collections
-import compileall
 import contextlib
 import csv
-import importlib
 import logging
 import os.path
 import re
 import shutil
 import sys
-import warnings
 from base64 import urlsafe_b64encode
 from email.message import Message
-from io import StringIO
 from itertools import chain, filterfalse, starmap
 from typing import (
     IO,
@@ -51,6 +47,7 @@
 from pip._internal.models.scheme import SCHEME_KEYS, Scheme
 from pip._internal.utils.filesystem import adjacent_tmp_file, replace
 from pip._internal.utils.misc import ensure_dir, hash_file, partition
+from pip._internal.utils.pyc_compile import BytecodeCompiler
 from pip._internal.utils.unpacking import (
     current_umask,
     is_within_directory,
@@ -417,12 +414,12 @@ def make(
         return super().make(specification, options)
 
 
-def _install_wheel(  # noqa: C901, PLR0915 function is too long
+def _install_wheel(  # noqa: C901 function is too long
     name: str,
     wheel_zip: ZipFile,
     wheel_path: str,
     scheme: Scheme,
-    pycompile: bool = True,
+    pycompiler: Optional[BytecodeCompiler],
     warn_script_location: bool = True,
     direct_url: Optional[DirectUrl] = None,
     requested: bool = False,
@@ -601,25 +598,14 @@ def pyc_source_file_paths() -> Generator[str, None, None]:
                 continue
             yield full_installed_path
 
-    def pyc_output_path(path: str) -> str:
-        """Return the path the pyc file would have been written to."""
-        return importlib.util.cache_from_source(path)
-
     # Compile all of the pyc files for the installed files
-    if pycompile:
-        with contextlib.redirect_stdout(StringIO()) as stdout:
-            with warnings.catch_warnings():
-                warnings.filterwarnings("ignore")
-                for path in pyc_source_file_paths():
-                    success = compileall.compile_file(path, force=True, quiet=True)
-                    if success:
-                        pyc_path = pyc_output_path(path)
-                        assert os.path.exists(pyc_path)
-                        pyc_record_path = cast(
-                            "RecordPath", pyc_path.replace(os.path.sep, "/")
-                        )
-                        record_installed(pyc_record_path, pyc_path)
-        logger.debug(stdout.getvalue())
+    if pycompiler is not None:
+        for module in pycompiler(pyc_source_file_paths()):
+            if module.is_success:
+                pyc_record_path = module.pyc_path.replace(os.path.sep, "/")
+                record_installed(RecordPath(pyc_record_path), module.pyc_path)
+            if output := module.compile_output:
+                logger.debug(output)
 
     maker = PipScriptMaker(None, scheme.scripts)
 
@@ -718,7 +704,7 @@ def install_wheel(
     wheel_path: str,
     scheme: Scheme,
     req_description: str,
-    pycompile: bool = True,
+    pycompiler: Optional[BytecodeCompiler] = None,
     warn_script_location: bool = True,
     direct_url: Optional[DirectUrl] = None,
     requested: bool = False,
@@ -730,7 +716,7 @@ def install_wheel(
             wheel_zip=z,
             wheel_path=wheel_path,
             scheme=scheme,
-            pycompile=pycompile,
+            pycompiler=pycompiler,
             warn_script_location=warn_script_location,
             direct_url=direct_url,
             requested=requested,
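The hunks above consume a compiler object that is called with a batch of paths and yields per-file results exposing `is_success`, `pyc_path`, and `compile_output`. The `pyc_compile` module itself is not shown on this page, so the following is only a guess at the general shape, not the patch's implementation: `CompileResult`, `compile_single`, `SerialCompiler`, and `ParallelCompiler` are hypothetical names, and `py_compile.compile()` is swapped in for the `compileall.compile_file()` call that the old code used.

```python
import importlib.util
import py_compile
from concurrent.futures import ProcessPoolExecutor
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class CompileResult:
    """Hypothetical result record; field names mirror the diff's usage."""

    py_path: str
    pyc_path: str
    is_success: bool
    compile_output: str


def compile_single(py_path: str) -> CompileResult:
    """Compile one file. This is the only work that would run in a worker."""
    pyc_path = importlib.util.cache_from_source(py_path)
    try:
        py_compile.compile(py_path, cfile=pyc_path, doraise=True)
    except (py_compile.PyCompileError, OSError) as exc:
        return CompileResult(py_path, pyc_path, False, str(exc))
    return CompileResult(py_path, pyc_path, True, "")


class SerialCompiler:
    """In-process compiler; a no-op context manager for interface parity."""

    def __enter__(self) -> "SerialCompiler":
        return self

    def __exit__(self, *exc_info: object) -> None:
        return None

    def __call__(self, paths: Iterable[str]) -> Iterator[CompileResult]:
        for path in paths:
            yield compile_single(path)


class ParallelCompiler:
    """Owns one process pool that is reused for every batch (every package)."""

    def __init__(self, workers: int) -> None:
        self._pool = ProcessPoolExecutor(max_workers=workers)

    def __enter__(self) -> "ParallelCompiler":
        return self

    def __exit__(self, *exc_info: object) -> None:
        self._pool.shutdown()

    def __call__(self, paths: Iterable[str]) -> Iterator[CompileResult]:
        # map() preserves input order, so results line up with the paths.
        yield from self._pool.map(compile_single, paths)
```

Because both classes share the same call signature and context-manager protocol, the caller can pick one up front and never branch on which kind it holds, which matches the `pycompiler(pyc_source_file_paths())` call in the diff.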

src/pip/_internal/req/__init__.py

+38-3
@@ -1,10 +1,14 @@
 import collections
 import logging
+from contextlib import nullcontext
 from dataclasses import dataclass
-from typing import Generator, List, Optional, Sequence, Tuple
+from functools import partial
+from typing import Generator, Iterable, List, Optional, Sequence, Tuple
+from zipfile import ZipFile
 
 from pip._internal.cli.progress_bars import get_install_progress_renderer
 from pip._internal.utils.logging import indent_log
+from pip._internal.utils.pyc_compile import WorkerSetting, create_bytecode_compiler
 
 from .req_file import parse_requirements
 from .req_install import InstallRequirement
@@ -33,6 +37,28 @@ def _validate_requirements(
         yield req.name, req
 
 
+def _does_python_size_surpass_threshold(
+    requirements: Iterable[InstallRequirement], threshold: int
+) -> bool:
+    """Inspect wheels to check whether there is enough .py code to
+    enable bytecode parallelization.
+    """
+    py_size = 0
+    for req in requirements:
+        if not req.local_file_path or not req.is_wheel:
+            # No wheel to inspect as this is a legacy editable.
+            continue
+
+        with ZipFile(req.local_file_path, allowZip64=True) as wheel_file:
+            for entry in wheel_file.infolist():
+                if entry.filename.endswith(".py"):
+                    py_size += entry.file_size
+                    if py_size > threshold:
+                        return True
+
+    return False
+
+
 def install_given_reqs(
     requirements: List[InstallRequirement],
     global_options: Sequence[str],
@@ -43,6 +69,7 @@ def install_given_reqs(
     use_user_site: bool,
     pycompile: bool,
     progress_bar: str,
+    workers: WorkerSetting,
 ) -> List[InstallationResult]:
     """
     Install everything in the given list.
@@ -68,7 +95,15 @@ def install_given_reqs(
         )
         items = renderer(items)
 
-    with indent_log():
+    if pycompile:
+        code_size_check = partial(
+            _does_python_size_surpass_threshold, to_install.values()
+        )
+        pycompiler = create_bytecode_compiler(workers, code_size_check)
+    else:
+        pycompiler = None
+
+    with indent_log(), pycompiler or nullcontext():
         for requirement in items:
             req_name = requirement.name
             assert req_name is not None
@@ -87,7 +122,7 @@ def install_given_reqs(
                     prefix=prefix,
                     warn_script_location=warn_script_location,
                     use_user_site=use_user_site,
-                    pycompile=pycompile,
+                    pycompiler=pycompiler,
                 )
             except Exception:
                 # if install did not succeed, rollback previous uninstall
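The size check in this file can be tried in isolation. The sketch below reproduces the same idea against in-memory zip archives instead of `InstallRequirement` objects: `python_size_surpasses` and `make_archive` are hypothetical helpers written for this illustration, and the archives are plain zips standing in for wheels.

```python
import io
from typing import Dict, Iterable
from zipfile import ZipFile


def python_size_surpasses(archives: Iterable[io.BytesIO], threshold: int) -> bool:
    """Sum uncompressed .py sizes across archives, stopping at the threshold."""
    py_size = 0
    for archive in archives:
        with ZipFile(archive) as zip_file:
            for entry in zip_file.infolist():
                if entry.filename.endswith(".py"):
                    # file_size is the uncompressed size from the zip metadata,
                    # so no file contents need to be read or decompressed.
                    py_size += entry.file_size
                    if py_size > threshold:
                        return True
    return False


def make_archive(files: Dict[str, str]) -> io.BytesIO:
    """Build a small in-memory zip standing in for a wheel."""
    buffer = io.BytesIO()
    with ZipFile(buffer, "w") as zip_file:
        for name, content in files.items():
            zip_file.writestr(name, content)
    buffer.seek(0)
    return buffer
```

Only entries ending in `.py` count toward the total, so a wheel made up mostly of data files or compiled extensions will not trigger the parallel compiler on its own. The early return keeps the scan short once the threshold is crossed.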

src/pip/_internal/req/req_install.py

+3-2
@@ -53,6 +53,7 @@
     redact_auth_from_url,
 )
 from pip._internal.utils.packaging import get_requirement
+from pip._internal.utils.pyc_compile import BytecodeCompiler
 from pip._internal.utils.subprocess import runner_with_spinner_message
 from pip._internal.utils.temp_dir import TempDirectory, tempdir_kinds
 from pip._internal.utils.unpacking import unpack_file
@@ -812,7 +813,7 @@ def install(
         prefix: Optional[str] = None,
         warn_script_location: bool = True,
         use_user_site: bool = False,
-        pycompile: bool = True,
+        pycompiler: Optional[BytecodeCompiler] = None,
     ) -> None:
         assert self.req is not None
         scheme = get_scheme(
@@ -869,7 +870,7 @@ def install(
             self.local_file_path,
             scheme=scheme,
             req_description=str(self.req),
-            pycompile=pycompile,
+            pycompiler=pycompiler,
             warn_script_location=warn_script_location,
             direct_url=self.download_info if self.is_direct else None,
             requested=self.user_supplied,
