
GH-112855: Speed up pathlib.PurePath pickling #112856


Merged: 8 commits, Apr 20, 2024

Conversation

barneygale
Contributor

@barneygale commented Dec 7, 2023

The second item in the tuple returned from __reduce__() is a tuple of arguments to supply to the path constructor. Previously we returned the parts tuple here, which entailed joining, parsing and normalising the path object, and produced a compact pickle representation.

With this patch, we instead return a tuple of paths that were originally given to the path constructor. This makes pickling much faster (at the expense of compactness).

It's worth noting that, in the olden times, pathlib performed this parsing/normalization up-front in every case, and so using parts for pickling was almost free. Nowadays pathlib only parses/normalises paths when it's necessary or advantageous to do so (e.g. computing a path parent, or iterating over a directory, respectively).

By also omitting to `sys.intern()` the path parts, we slightly speed up path parsing/normalization more generally.
@barneygale
Contributor Author

Makes pickling ~3x faster, depending on what you include in the measurement:

$ ./python -m timeit -s "from pathlib import PurePath" "PurePath('foo').__reduce__()"
100000 loops, best of 5: 2.17 usec per loop  # before
500000 loops, best of 5: 617 nsec per loop   # after

$ ./python -m timeit -s "from pathlib import PurePath; p = PurePath('foo')" "p.__reduce__()"
2000000 loops, best of 5: 169 nsec per loop   # before
5000000 loops, best of 5: 78.1 nsec per loop  # after

It's hard to measure using only public APIs, but path parsing in general is a little faster because we drop the sys.intern(str(x)) call. I measure about a 3% improvement when generating a string representation of PurePath('foo'):

$ ./python -m timeit -s "from pathlib import PurePath" "str(PurePath('foo'))"
100000 loops, best of 5: 2.98 usec per loop  # before
100000 loops, best of 5: 2.88 usec per loop  # after

@barneygale
Contributor Author

barneygale commented Dec 7, 2023

@AlexWaygood asked whether this might make pathlib + multiprocessing faster, and indeed it does. Simple benchmark script:

from multiprocessing import Pool
from pathlib import PurePath

def f(pathobj):
    return str(pathobj)

if __name__ == '__main__':
    paths = [PurePath(str(i), str(j))
             for i in range(1000)
             for j in range(1000)]
    with Pool(5) as p:
        p.map(f, paths)

Before:

$ time ./python mp.py
real    0m7.730s
user    0m13.161s
sys     0m0.869s

After:

real    0m4.165s
user    0m10.614s
sys     0m0.976s

--> ~1.85x faster in this example.

@barneygale
Contributor Author

Simple compactness test:

import pathlib
import pickle

paths = pathlib.Path().glob('**/*')
print(len(pickle.dumps(tuple(paths))))

In my CPython checkout, the size increased by ~20% with this PR.

If I instead glob from Path.cwd(), which adds a repetitive /home/barney/projects/cpython prefix to every path, then the size increased by ~50% from baseline.

Seems like a good bargain to me.
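A related back-of-envelope check (a sketch, not part of the PR): a pickled path always costs more bytes than the equivalent pickled string, because the pickle must also name the path class:

```python
import pickle
from pathlib import PurePath

s = '/home/user/projects/cpython/Lib/pathlib.py'  # hypothetical example path
str_size = len(pickle.dumps(s))
path_size = len(pickle.dumps(PurePath(s)))
print(str_size, path_size)  # the path pickle is larger than the plain string
```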

@AlexWaygood self-requested a review December 9, 2023
@pitrou
Member

pitrou commented Dec 11, 2023

This doesn't look like a good idea to me, especially as it may increase memory consumption when unpickling. Unless you know of a workload where this gives a benefit, I would tend to reject this PR.

@barneygale
Contributor Author

Thanks @pitrou. Personally I see no reason to prioritise memory consumption over speed of processing here. As a user of pathlib, I'd expect the performance/space characteristics of serialising path objects to roughly match that of string paths. All workloads involving pickling are affected, such as the multiprocessing example in my previous comment, where the wall time is nearly halved by this PR.

@pitrou
Member

pitrou commented Dec 11, 2023

> All workloads involving pickling are affected, such as the multiprocessing example in my previous comment, where the wall time is nearly halved by this PR.

This example is not a workload, it's a completely unrealistic micro-benchmark which doesn't reflect actual usage.

To rephrase my objection:
"Unless you know of a (actual, real world) workload where this gives a benefit, I would tend to reject this PR".

@barneygale
Contributor Author

Could you provide a workload, real-world or otherwise, that shows your concerns? Everything I try shows a speed-up.

@pitrou
Member

pitrou commented Dec 11, 2023

I would say: glob a large filesystem tree (something like list(Path('/usr').glob('**'))) and measure memory consumption.

@barneygale
Contributor Author

barneygale commented Dec 11, 2023

Mega, thank you. Running a glob across my media collection (133003 files):

barney@acorn /media/sexy $ /usr/bin/time -v ~/projects/cpython/python -c 'from pathlib import Path; list(Path.cwd().glob("**/*"))'
	Maximum resident set size (kbytes): 88284  # before
	Maximum resident set size (kbytes): 84828  # after

... which is a little surprising! I'll dig in.

@pitrou
Member

pitrou commented Dec 11, 2023

(make sure you compile CPython in non-debug mode, by the way :-))

@barneygale
Contributor Author

Turns out pathlib doesn't intern strings while globbing or iterating over directories, and possibly never did? The private _make_child_relpath() constructor creates a fully-parsed path, bypassing the path parsing routine (_Flavour.parse_parts() in the past, _abc.PurePathBase.parse_path() currently), and therefore skipping the calls to sys.intern().

@pitrou
Member

pitrou commented Dec 17, 2023

> Turns out pathlib doesn't intern strings while globbing or iterating over directories, and possibly never did?

IIRC it was meant to (globbing or walking directories is really the primary situation where you'd get multiple instances of the same path component), so I'm a bit surprised.
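For readers unfamiliar with interning: `sys.intern()` maps equal strings onto one shared object, which is why interning repeated path components (e.g. during a recursive glob) can reduce memory use. A minimal illustration:

```python
import sys

# Build two equal strings at runtime; CPython does not automatically
# intern these, so they may be distinct objects holding the same text.
a = ''.join(['us', 'r'])
b = ''.join(['us', 'r'])
assert a == b

# Interning returns one canonical object per distinct value, so many
# paths sharing a component like 'usr' can share a single string.
a = sys.intern(a)
b = sys.intern(b)
assert a is b
```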

@barneygale
Contributor Author

It surprises me too :o

FWIW, I've restored the interning of path parts in this PR, so the only thing it changes is __reduce__().

@barneygale
Contributor Author

I plan to merge this patch within a few days, as I'm reasonably sure it will provide a yummy speedup without having much impact on memory usage, given that pathlib doesn't intern parts in most cases where related paths are generated (directory children, with_name(), etc). But I'll keep a close eye on bug reports / forums / etc to see if anyone reports increased memory/disk usage when pickling paths in 3.13.

@Hnasar

Hnasar commented Nov 8, 2024

@barneygale do you think it's worth backporting this pickling optimization to 3.12? 3x performance for a one-line change is pretty nice
(thanks for all your work on pathlib btw!!)

@AlexWaygood
Member

@Hnasar we don't backport performance optimisations, I'm afraid, only bugfixes.

Labels: performance (Performance or resource usage), topic-pathlib