-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
gh-100792: Make email.message.Message.__contains__
twice as fast
#100793
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
If we're doing microoptimizations, I think |
No, in this case it is slower: def __contains__(self, name):
name_lower = name.lower()
return any(name_lower == k.lower() for k, v in self._headers) Results:
|
The code with |
@eendebakpt we don't want to make existing code slower just to use some style related thing, do we? :) I agree that
So, I think we can call this pattern native to this module :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for checking! (and I think I was just wrong about any
sometimes being faster than the loop, don't see why it would be)
Anyway, this looks fine to me! cc @JelleZijlstra
One quick note (you're probably aware of this, but in case other potential contributors are reading): CPython is often quite hesitant to accept micro-optimisations and I think this PR comes close to that line. For example, if counterfactually the existing code used any
and you changed it to a loop for the same speedup, I think that change would be rejected (without stronger evidence that this is something worth optimising).
(Why is CPython conservative here? First, reviewing changes itself costs maintainer time. Code churn risks bugs, obscures history, and invites more churn. Often micro-optimisations are not robust in the face of differing Python implementations or changes in the interpreter; we should avoid local minima. Such changes often affect readability, but readability is subjective, and this can lead to debate that further eats at maintainer time or leaves contributors feeling unwelcome)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I'd also prefer the cleaner, more idiomatic code using any()
-- the precise performance characteristics of any
vs a for
loop feel like they're subject to change in the future, and I disagree that style decisions made a decade and a half ago should determine the style of new additions to the code base.
But, this is precisely the kind of bikeshedding that @hauntsaninja was talking about. So, I don't want to block the PR based on my style preferences -- it is indeed a nice optimisation :)
Misc/NEWS.d/next/Library/2023-01-06-14-05-15.gh-issue-100792.CEOJth.rst
Outdated
Show resolved
Hide resolved
No strong opinion here but @hauntsaninja feel free to merge based on your best judgment. Also, in the future no need to ping me on all PRs any more, though of course you can if you want another opinion. |
…EOJth.rst Co-authored-by: Alex Waygood <[email protected]>
See also @pochmann's comments on the issue: #100792 (comment) |
Thanks for caring and for making Python faster! |
See my micro-benchmarks in the original issue.
email.message.Message.__contains__
faster #100792