Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
gh-120397: improve the speed of str.count, bytes.count et al. for single characters by about 2x. #120398
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-120397: improve the speed of str.count, bytes.count et al. for single characters by about 2x. #120398
Changes from 8 commits
b2f9fb5
cb56443
fff610b
1359366
5837e41
c6f4068
0eaccd5
a85202c
807706d
fb83c6a
05a1fc2
2fab99b
ce9ab9b
0cc9369
30a65a7
a16c7ec
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe make that function private (I don't think it should be exposed except in this module).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is private, as it is static. The STRINGLIB macro is to prevent name clobbering. This function will be generated for STRINGLIB_CHAR==Py_UCS1, Py_UCS2 and PyUCS4. I think keeping it this way is correct. But I may be wrong of course. How do you suggest making it private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh just by adding an underscore before its name (I should have been clearer when I said "private", I meant it in the naming but I think we don't care about underscores in C files).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we don't use underscore prefix in Python for static functions. Moreover, the macro adds a prefix such as
ucs1lib_
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maxcount is only used in the replace function, it is very unlikely that this condition will ever be triggered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, but there's no reason to check for
PY_SSIZE_T_MAX
specifically when this works as well.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you are correct. However, this function needs some refactoring, as this maxcount provision is only there for replace. Replace for single characters is special.cased elsewhere, so maxcount is actually always Pyssize_t_max I think. I want to revisit this at a later point.