GH-112855: Speed up pathlib.PurePath pickling #112856
Conversation
The second item in the tuple returned from `__reduce__()` is a tuple of arguments to supply to the path constructor. Previously we returned the `parts` tuple here, which entailed joining, parsing and normalising the path object, and produced a compact pickle representation. With this patch, we instead return a tuple of the paths that were originally given to the path constructor. This makes pickling much faster (at the expense of compactness). By also no longer calling `sys.intern()` on the path parts, we slightly speed up path parsing/normalisation more generally.
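In code terms, the change is roughly the following (a sketch, not the exact patch; `_raw_paths` is an illustrative name for whatever private attribute stores the original constructor arguments):

```python
class PurePathBefore:
    def __reduce__(self):
        # `parts` is derived from the joined, parsed and normalised path,
        # so pickling forces all of that work to happen.
        return self.__class__, self.parts


class PurePathAfter:
    def __reduce__(self):
        # Return the original constructor arguments unchanged, so no
        # joining/parsing/normalisation is needed at pickling time.
        return self.__class__, tuple(self._raw_paths)
```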
Makes pickling ~3x faster, depending on what you include in the measurement:

```
$ ./python -m timeit -s "from pathlib import PurePath" "PurePath('foo').__reduce__()"
100000 loops, best of 5: 2.17 usec per loop   # before
500000 loops, best of 5: 617 nsec per loop    # after

$ ./python -m timeit -s "from pathlib import PurePath; p = PurePath('foo')" "p.__reduce__()"
2000000 loops, best of 5: 169 nsec per loop   # before
5000000 loops, best of 5: 78.1 nsec per loop  # after
```

It's hard to measure using only public APIs, but path parsing generally is a little faster due to dropping the `sys.intern()` calls.
@AlexWaygood asked whether this might make pathlib + multiprocessing faster, and indeed it does: `Pool.map()` pickles each `PurePath` argument to send it to a worker process, so `__reduce__()` is on the hot path. Simple benchmark script:

```python
from multiprocessing import Pool
from pathlib import PurePath

def f(pathobj):
    return str(pathobj)

if __name__ == '__main__':
    paths = [PurePath(str(i), str(j))
             for i in range(1000)
             for j in range(1000)]
    with Pool(5) as p:
        p.map(f, paths)
```

Before:

After:

--> ~1.85x faster in this example.
Simple compactness test:

```python
import pathlib
import pickle

paths = pathlib.Path().glob('**/*')
print(len(pickle.dumps(tuple(paths))))
```

In my CPython checkout, the size increased by ~20% with this PR. If I instead glob from

Seems like a good bargain to me.
This doesn't look like a good idea to me, especially as it may increase memory consumption when unpickling. Unless you know of a workload where this gives a benefit, I would tend to reject this PR.
Thanks @pitrou. Personally I see no reason to prioritise memory consumption over speed of processing here. As a user of pathlib, I'd expect the performance/space characteristics of serialising path objects to roughly match those of string paths. All workloads involving pickling are affected, such as the multiprocessing example above.
This example is not a workload; it's a completely unrealistic micro-benchmark that doesn't reflect actual usage. To rephrase my objection:
Could you provide a workload, real-world or otherwise, that demonstrates your concerns? Everything I try shows a speed-up.
I would say: glob a large filesystem tree (something like
Mega, thank you. Running a glob across my media collection (133003 files):

```
barney@acorn /media/sexy $ /usr/bin/time -v ~/projects/cpython/python -c 'from pathlib import Path; list(Path.cwd().glob("**/*"))'
Maximum resident set size (kbytes): 88284   # before
Maximum resident set size (kbytes): 84828   # after
```

... which is a little surprising! I'll dig in.
(make sure you compile CPython in non-debug mode, by the way :-))
Turns out pathlib doesn't intern strings while globbing or iterating over directories, and possibly never did? The private
IIRC it was meant to (globbing or walking directories is really the primary situation where you'd get multiple instances of the same path component), so I'm a bit surprised.
It surprises me too :o FWIW, I've restored the interning of path parts in this PR, so the only thing it changes is the `__reduce__()` implementation.
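For context, interning the path parts means passing each parsed component through `sys.intern()`, so that a component repeated across many paths (common when globbing a directory tree) is stored as a single string object. A rough sketch of the idea, using a hypothetical `parse_parts` helper rather than pathlib's real parsing code:

```python
import sys

def parse_parts(path, sep='/'):
    # Split a path into components, interning each one so that identical
    # components across many paths share the same str object.
    return [sys.intern(part) for part in path.split(sep) if part and part != '.']

a = parse_parts('photos/2023/holiday/img_001.jpg')
b = parse_parts('photos/2023/holiday/img_002.jpg')
assert a[0] is b[0]  # 'photos' is one shared object, not two copies
```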
I plan to merge this patch within a few days, as I'm reasonably sure it will provide a yummy speedup without having much impact on memory usage, given that pathlib doesn't intern parts in most cases where related paths are generated (directory children,
@barneygale do you think it's worth backporting this pickling optimization to 3.12? A 3x speedup for a one-line change is pretty nice.
@Hnasar we don't backport performance optimisations, I'm afraid, only bugfixes.
It's worth noting that, in the olden times, pathlib performed this parsing/normalisation up-front in every case, and so using `parts` for pickling was almost free. Nowadays pathlib only parses/normalises paths when it's necessary or advantageous to do so (e.g. computing a path parent, or iterating over a directory, respectively).
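An illustrative snippet (not from the PR) showing that lazy behaviour on a recent CPython, where the constructor merely records its arguments and parsing happens on demand:

```python
from pathlib import PurePath

p = PurePath('foo', 'bar/baz')  # constructor just stores the arguments
print(p.parts)                  # accessing .parts triggers parsing/normalisation
# ('foo', 'bar', 'baz')
print(p.parent)                 # computing the parent does too
# foo/bar  (on POSIX; backslash-separated on Windows)
```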