
gh-118761: substitute re import in base64.b16decode for a more efficient alternative #128736


Merged
merged 8 commits into from
Jan 14, 2025

Conversation

picnixz
Member

@picnixz picnixz commented Jan 11, 2025

Benchmarks are on a RELEASE build (no PGO, no LTO).

See #128736 (comment) for the runtime performance improvements as well.

PR

$ ./python -I -X importtime -c 'import base64'
import time: self [us] | cumulative | imported package
...
import time:       179 |        179 | linecache
import time:       205 |        205 |     _struct
import time:       442 |        646 |   struct
import time:       227 |        227 |   binascii
import time:       298 |       1169 | base64
$ hyperfine --warmup 16 "./python -c 'import base64'"
Benchmark 1: ./python -c 'import base64'
  Time (mean ± σ):       5.7 ms ±   0.5 ms    [User: 4.8 ms, System: 1.1 ms]
  Range (min … max):     5.2 ms …  11.9 ms    455 runs

Main

$ ./python -I -X importtime -c 'import base64'
import time: self [us] | cumulative | imported package
...
import time:       180 |        180 | linecache
import time:       312 |        312 |       types
import time:      1609 |       1921 |     enum
import time:        87 |         87 |       _sre
import time:       250 |        250 |         re._constants
import time:       388 |        637 |       re._parser
import time:        89 |         89 |       re._casefix
import time:       339 |       1150 |     re._compiler
import time:       103 |        103 |         itertools
import time:        83 |         83 |         keyword
import time:        58 |         58 |           _operator
import time:       206 |        264 |         operator
import time:       129 |        129 |         reprlib
import time:        49 |         49 |         _collections
import time:       785 |       1411 |       collections
import time:        43 |         43 |       _functools
import time:       543 |       1997 |     functools
import time:       122 |        122 |     copyreg
import time:       826 |       6015 |   re
import time:       115 |        115 |     _struct
import time:        74 |        189 |   struct
import time:       122 |        122 |   binascii
import time:       279 |       6603 | base64
$ hyperfine --warmup 16 "./python -c 'import base64'"
Benchmark 1: ./python -c 'import base64'
  Time (mean ± σ):       9.5 ms ±   0.4 ms    [User: 8.2 ms, System: 1.3 ms]
  Range (min … max):     9.1 ms …  13.5 ms    295 runs

Importing `base64` is now up to six times faster.

The `re` module is now imported locally by `base64.b16decode`
and is no longer implicitly exposed as `base64.re`.
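
A minimal sketch of the deferred-import pattern described above, assuming bytes-only input. The name `b16decode_lazy` is hypothetical and used only for illustration; it is not the code from this PR. Moving `import re` into the function body means that importing the enclosing module no longer pays for `re`; the cost moves to the first call.

```python
import binascii


def b16decode_lazy(s: bytes, casefold: bool = False) -> bytes:
    """Hypothetical illustration: decode Base16 with `re` imported on demand."""
    # Deferred import: `re` is loaded only when this function is first called,
    # and it never becomes an attribute of the enclosing module.
    import re

    if casefold:
        s = s.upper()
    # Reject anything outside the uppercase Base16 alphabet.
    if re.search(b"[^0-9A-F]", s):
        raise binascii.Error("Non-base16 digit found")
    return binascii.unhexlify(s)
```

Because the import is function-local, a module-level lookup such as `module.re` would fail, which matches the note that `re` is no longer implicitly exposed.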
@picnixz picnixz added the performance Performance or resource usage label Jan 11, 2025
@picnixz picnixz requested a review from hugovk January 11, 2025 16:30
@picnixz
Member Author

picnixz commented Jan 11, 2025

@hugovk I'm requesting your review since you've commented on the issue. There are also #128732 and #128738 to review FYI.

This entirely removes the need for a regex.
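
One way to validate Base16 input without a regex, sketched under the assumption that `bytes.translate` is used to strip the valid digits and check whether anything is left over. The diff itself is not shown in this conversation, so `b16decode_noregex` is a hypothetical name and the body is an illustration rather than a copy of the merged change.

```python
import binascii

# The uppercase Base16 alphabet; bytes outside it are invalid.
_B16_ALPHABET = b"0123456789ABCDEF"


def b16decode_noregex(s: bytes, casefold: bool = False) -> bytes:
    """Hypothetical illustration: decode Base16 without importing `re`."""
    if casefold:
        s = s.upper()
    # Delete every valid digit; any bytes that remain are invalid.
    if s.translate(None, delete=_B16_ALPHABET):
        raise binascii.Error("Non-base16 digit found")
    return binascii.unhexlify(s)
```

With this approach nothing in the decoding path needs `re`, so the module does not have to be imported at all, not even lazily.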
Member

@hugovk hugovk left a comment

🚀

@picnixz picnixz changed the title from "gh-118761: improve import time for base64" to "gh-118761: substitute re import in base64.b16decode for a more efficient alternative" Jan 13, 2025
@picnixz picnixz requested a review from AA-Turner January 14, 2025 12:26
@AA-Turner AA-Turner merged commit bbd3300 into python:main Jan 14, 2025
38 checks passed
@AA-Turner
Member

Thanks!

A

@picnixz picnixz deleted the perf/import/base64-118761 branch January 14, 2025 13:49
@vstinner
Member

@AA-Turner merged commit bbd3300 into python:main Jan 14, 2025

FYI @picnixz was recently promoted to core dev and so can merge his own changes.

@picnixz
Member Author

picnixz commented Jan 14, 2025

(I don't mind others merging my PRs by the way)

@chris-eibl
Member

chris-eibl commented Jan 14, 2025

> and reduce the import time of :mod:`base64` by up to six times.

9.5 ms down to 5.7 ms is impressive but not six times - rather 60%?

@picnixz
Member Author

picnixz commented Jan 14, 2025

hyperfine benchmarks also take the interpreter's startup into account, whereas -X importtime only measures the time needed to import the module itself (hence the difference).
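
The two ratios can be checked with a little arithmetic on the figures reported in the PR description. This is only a sketch: the variable names are made up for illustration, and the numbers are the cumulative `base64` import times and the hyperfine means quoted above.

```python
# Cumulative import time of base64 from `-X importtime`, in microseconds.
importtime_main_us = 6603
importtime_pr_us = 1169

# Mean wall-clock time of `./python -c 'import base64'` from hyperfine, in ms.
hyperfine_main_ms = 9.5
hyperfine_pr_ms = 5.7

# Speedup of the import alone versus speedup of the whole process,
# which also includes interpreter startup.
import_speedup = importtime_main_us / importtime_pr_us
process_speedup = hyperfine_main_ms / hyperfine_pr_ms

print(f"import only:   {import_speedup:.2f}x faster")
print(f"whole process: {process_speedup:.2f}x faster")
```

The import itself comes out at roughly 5.6x faster (the "up to six times" figure), while the whole process is only about 1.7x faster because interpreter startup dominates the remaining time.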

@chris-eibl
Member

Sure, my bad. Importtime itself is of course six times faster.

Thank you all for the great work here in CPython!
