gh-70278: Fix PyUnicode_FromFormat() with precision for %s and %V #120365
PyUnicode_FromFormat() no longer produces a trailing \ufffd character for a truncated C string when a precision is used with %s and %V. It now truncates the string before the start of a truncated multibyte sequence.
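To illustrate the truncation rule described above, here is a minimal standalone sketch in C. It is not the code from this PR: `utf8_truncated_length` is a hypothetical helper that only computes how many of the first `precision` bytes of a UTF-8 string can be kept without stopping in the middle of a multibyte sequence. It assumes that invalid bytes elsewhere in the string are left for the decoder to replace.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration only, not CPython's implementation.
   Returns the number of leading bytes of `str` (at most `precision`) that
   can be decoded without cutting a multibyte UTF-8 sequence in half. */
static size_t
utf8_truncated_length(const char *str, size_t precision)
{
    assert(str != NULL);

    /* Same scan as in the diff: stop at the precision or at the NUL. */
    size_t length = 0;
    while (length < precision && str[length]) {
        length++;
    }
    if (length < precision) {
        /* The string ended before the precision: nothing was cut off. */
        return length;
    }

    /* Step back over trailing continuation bytes (10xxxxxx). */
    size_t start = length;
    while (start > 0 && ((unsigned char)str[start - 1] & 0xC0) == 0x80) {
        start--;
    }
    if (start == 0) {
        /* Only continuation bytes: invalid input, left to the decoder. */
        return length;
    }

    /* str[start - 1] is the last non-continuation byte before the cut. */
    unsigned char lead = (unsigned char)str[start - 1];
    size_t need;
    if (lead < 0x80) {
        need = 1;
    }
    else if ((lead & 0xE0) == 0xC0) {
        need = 2;
    }
    else if ((lead & 0xF0) == 0xE0) {
        need = 3;
    }
    else if ((lead & 0xF8) == 0xF0) {
        need = 4;
    }
    else {
        /* Invalid lead byte: left to the decoder. */
        return length;
    }

    /* Drop the final sequence only if it is incomplete. */
    size_t have = length - (start - 1);
    return (have < need) ? start - 1 : length;
}
```

With the UTF-8 string "abc\xc3\xa9" ("abcé"), a precision of 4 would cut the two-byte "é" in half, so the sketch keeps only the 3 bytes of "abc"; a precision of 5 keeps all 5 bytes.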
Objects/unicodeobject.c
    }
    else {
        length = 0;
        while (length < precision && str[length]) {
            length++;
        }
        pconsumed = (length < precision) ? NULL : &consumed;
Can you add a comment explaining why you set pconsumed? Explain the expected behavior: an incomplete sequence at the end is truncated, but an invalid sequence in the middle is still replaced. Something like that :-) You can add a reference to the issue gh-70278.
…t-truncate' into PyUnicode_FromFormat-truncate
Can you try to add a unit test?
There are new tests in this PR. What other cases do you want to test?
LGTM.
For tests, GitHub didn't show me the whole diff when I clicked on a notification. I'm always confused by the UI, sorry.
…%V (pythonGH-120365) PyUnicode_FromFormat() no longer produces the ending \ufffd character for truncated C string when use precision with %s and %V. It now truncates the string before the start of truncated multibyte sequences.