Improve speed of stdlib functions by replacing `re` uses #130167
Comments
Unless you find and list examples of useful replacements in an stdlib module, this is not properly an issue. If you have or do, make a PR. Or suggest a revision of the HOWTO section.
@terryjreedy thank you for your comment. I know about the counterarguments you brought up, but they don't change the increase in speed or the fact that we can get rid of unnecessary imports (sometimes or often, I don't know the statistics yet :) I've only submitted one PR so far, but I'm open to finding more examples since we use
I removed the paragraph for new contributions, as this issue is not necessarily something we want to address immediately. Before opening a PR, benchmarks must be given and be convincing. If we lose readability in exchange for an improved import time, I'm not sure it's worth it unless we're like 5 or 6 times faster and it's an important module that is usually imported and that
Usually, I think, the readability of code without regular expressions is better. And from a performance standpoint, I think the speed of the function is the main improvement, although import time is also important. You can find
If you can remove Note that it's also important to possibly plan for future extensions of the regex usage. Like getting the matched substring could be useful later. Now, I'm not against improving the various |
I've restored your guidelines but added the requirements of having benchmarks |
Considering I wrote #128983, I think we can treat it as a legit improvement. But let's only focus on changing re usages, not changing the imports or something else. |
Co-authored-by: Marius Juston <[email protected]> Co-authored-by: Pieter Eendebak <[email protected]> Co-authored-by: Bénédikt Tran <[email protected]>
We can often find the `re` module in standard library modules, but in some places it can be replaced. I don't suggest removing it everywhere: there are places where its use is appropriate, but there are also places where it is an unnecessary solution and leads to unpleasant consequences (listed below).

Cons of regular expressions and reasons to replace them with functions and methods:
- `import re`, which will affect import time

Important
For those who want to work on the issue, please:
- Use `pyperf`, `hyperfine`, and `tuna` together with `-X importtime` to compare import times and execution time.

gh-130167: Improve speed of `module.function` by replacing `re`
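For the import-time half of that comparison, a rough first number can be had with the standard library alone before reaching for the dedicated tools. The sketch below is only an illustration: the helper name `cumulative_import_time_us` is hypothetical, and it assumes the `-X importtime` report keeps its usual `import time: self | cumulative | package` layout on stderr.

```python
import subprocess
import sys


def cumulative_import_time_us(module_name):
    """Return the cumulative import time (microseconds) that
    ``-X importtime`` reports for ``module_name``, or None if the
    module does not show up in the report (for example because it
    was already imported during interpreter startup).
    """
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        check=True,
    )
    prefix = "import time:"
    # The report is written to stderr, one line per imported module.
    for line in proc.stderr.splitlines():
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split("|")
        # Expected fields: self time, cumulative time, module name.
        if len(fields) == 3 and fields[2].strip() == module_name:
            return int(fields[1])
    return None


if __name__ == "__main__":
    print("re:", cumulative_import_time_us("re"), "us (cumulative)")
```

A single run like this is noisy, so it is a sanity check rather than a benchmark; repeated measurements with the tools named above are still what a PR should report.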
Linked PRs
- `difflib.IS_LINE_JUNK` by replacing `re` #130170
- `inspect.formatannotation` by replacing `re` #130242
- `ftplib.parse150` by replacing `re` #130243
- `textwrap.dedent()` #131919
- `textwrap.dedent()` optimization #131925
- `textwrap.{de,in}dent` #131924
- `_pydecimal._all_zeros` and `_pydecimal._exact_half` by replacing `re` #132065
- `textwrap.dedent` #132666
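To give a sense of what such a replacement can look like, here is a minimal sketch of a "blank line or lone `#`" check written both ways, with an equivalence check and a `timeit` comparison. The pattern and both function names are assumptions made for illustration; this is not the code from any of the linked PRs.

```python
import re
import timeit

# Assumed regex baseline (hypothetical): a line is "junk" if it is
# blank or contains only a single '#', ignoring surrounding whitespace.
_JUNK_PATTERN = re.compile(r"\s*(?:#\s*)?$")


def is_line_junk_re(line):
    """Regex-based version of the check."""
    return _JUNK_PATTERN.match(line) is not None


def is_line_junk_str(line):
    """Same check using only string methods, no regex needed."""
    return line.strip() in ("", "#")


SAMPLES = ["", "\n", "  #  \n", "#", "hello\n", "##\n", " # x", "\t#\t", "# #"]


def check_equivalence(samples=SAMPLES):
    """Return True if both versions agree on every sample line."""
    return all(is_line_junk_re(s) == is_line_junk_str(s) for s in samples)


if __name__ == "__main__":
    assert check_equivalence()
    for func in (is_line_junk_re, is_line_junk_str):
        elapsed = timeit.timeit(lambda: [func(s) for s in SAMPLES], number=100_000)
        print(f"{func.__name__}: {elapsed:.3f}s")
```

Agreement on a handful of samples is not a proof of equivalence, and the timings depend on the machine and Python build, so a real PR would still need a broader test set and proper `pyperf` numbers.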