Skip to content

Use-after-free in unicode_escape decoder with error handler #133767

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
sethmlarson opened this issue May 9, 2025 · 0 comments
Open

Use-after-free in unicode_escape decoder with error handler #133767

sethmlarson opened this issue May 9, 2025 · 0 comments
Assignees
Labels
3.9 only security fixes 3.10 only security fixes 3.11 only security fixes 3.12 only security fixes 3.13 bugs and security fixes 3.14 bugs and security fixes 3.15 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker topic-unicode type-crash A hard crash of the interpreter, possibly with a core dump type-security A security issue

Comments

@sethmlarson
Copy link
Contributor

sethmlarson commented May 9, 2025

Crash report

What happened?

When using .decode("unicode_escape") with an error handler there is a use-after-free segfault.

CPython versions tested on:

CPython main branch

Operating systems tested on:

No response

Output from running 'python -VV' on the command line:

No response

Linked PRs

@sethmlarson sethmlarson added the type-crash A hard crash of the interpreter, possibly with a core dump label May 9, 2025
@sethmlarson sethmlarson changed the title Use-after-free in unicode escape decoder with error handler Use-after-free in unicode_escape decoder with error handler May 9, 2025
@serhiy-storchaka serhiy-storchaka self-assigned this May 9, 2025
@picnixz picnixz added interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-unicode type-security A security issue 3.11 only security fixes 3.10 only security fixes 3.9 only security fixes 3.12 only security fixes 3.13 bugs and security fixes 3.14 bugs and security fixes 3.15 new features, bugs and security fixes labels May 10, 2025
serhiy-storchaka added a commit that referenced this issue May 12, 2025
…rror handler (GH-129648)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 12, 2025
…h an error handler (pythonGH-129648)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)

Co-authored-by: Serhiy Storchaka <[email protected]>
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue May 12, 2025
…der with an error handler (pythonGH-129648)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)

Co-authored-by: Serhiy Storchaka <[email protected]>
serhiy-storchaka added a commit that referenced this issue May 13, 2025
…th an error handler (GH-129648) (GH-133942)

If the error handler is used, a new bytes object is created to set as
the object attribute of UnicodeDecodeError, and that bytes object then
replaces the original data. A pointer to the decoded data will became invalid
after destroying that temporary bytes object. So we need other way to return
the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have such issue, because it does not
use the error handlers registry, but it should be changed for compatibility
with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)

Co-authored-by: Serhiy Storchaka <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.9 only security fixes 3.10 only security fixes 3.11 only security fixes 3.12 only security fixes 3.13 bugs and security fixes 3.14 bugs and security fixes 3.15 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker topic-unicode type-crash A hard crash of the interpreter, possibly with a core dump type-security A security issue
Projects
Development

No branches or pull requests

3 participants