Commit 7fca996

vstinner authored and hugovk committed
pythongh-111089: Revert PyUnicode_AsUTF8() changes (python#111833)
* Revert "pythongh-111089: Use PyUnicode_AsUTF8() in Argument Clinic (python#111585)"
  This reverts commit d9b606b.
* Revert "pythongh-111089: Use PyUnicode_AsUTF8() in getargs.c (python#111620)"
  This reverts commit cde1071.
* Revert "pythongh-111089: PyUnicode_AsUTF8() now raises on embedded NUL (python#111091)"
  This reverts commit d731579.
* Revert "pythongh-111089: Add PyUnicode_AsUTF8() to the limited C API (python#111121)"
  This reverts commit d8f32be.
* Revert "pythongh-111089: Use PyUnicode_AsUTF8() in sqlite3 (python#111122)"
  This reverts commit 37e4e20.
1 parent ec90934 commit 7fca996

50 files changed: 952 additions, 244 deletions

Doc/c-api/unicode.rst

Lines changed: 0 additions & 8 deletions
@@ -992,19 +992,11 @@ These are the UTF-8 codec APIs:

    As :c:func:`PyUnicode_AsUTF8AndSize`, but does not store the size.

-   Raise an exception if the *unicode* string contains embedded null
-   characters. To accept embedded null characters and truncate on purpose
-   at the first null byte, ``PyUnicode_AsUTF8AndSize(unicode, NULL)`` can be
-   used instead.
-
    .. versionadded:: 3.3

    .. versionchanged:: 3.7
       The return type is now ``const char *`` rather of ``char *``.

-   .. versionchanged:: 3.13
-      Raise an exception if the string contains embedded null characters.
-

 UTF-32 Codecs
 """""""""""""

Doc/data/stable_abi.dat

Lines changed: 0 additions & 1 deletion
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

Doc/whatsnew/3.13.rst

Lines changed: 0 additions & 9 deletions
@@ -1149,9 +1149,6 @@ New Features
   :c:func:`PyErr_WriteUnraisable`, but allow to customize the warning mesage.
   (Contributed by Serhiy Storchaka in :gh:`108082`.)

-* Add :c:func:`PyUnicode_AsUTF8` function to the limited C API.
-  (Contributed by Victor Stinner in :gh:`111089`.)
-

 Porting to Python 3.13
 ----------------------
@@ -1222,12 +1219,6 @@ Porting to Python 3.13
   Note that ``Py_TRASHCAN_BEGIN`` has a second argument which
   should be the deallocation function it is in.

-* The :c:func:`PyUnicode_AsUTF8` function now raises an exception if the string
-  contains embedded null characters. To accept embedded null characters and
-  truncate on purpose at the first null byte,
-  ``PyUnicode_AsUTF8AndSize(unicode, NULL)`` can be used instead.
-  (Contributed by Victor Stinner in :gh:`111089`.)
-
 * On Windows, ``Python.h`` no longer includes the ``<stddef.h>`` standard
   header file. If needed, it should now be included explicitly. For example, it
   provides ``offsetof()`` function, and ``size_t`` and ``ptrdiff_t`` types.

Include/cpython/unicodeobject.h

Lines changed: 16 additions & 0 deletions
@@ -440,6 +440,22 @@ PyAPI_FUNC(PyObject*) PyUnicode_FromKindAndData(
     const void *buffer,
     Py_ssize_t size);

+/* --- Manage the default encoding ---------------------------------------- */
+
+/* Returns a pointer to the default encoding (UTF-8) of the
+   Unicode object unicode.
+
+   Like PyUnicode_AsUTF8AndSize(), this also caches the UTF-8 representation
+   in the unicodeobject.
+
+   _PyUnicode_AsString is a #define for PyUnicode_AsUTF8 to
+   support the previous internal function with the same behaviour.
+
+   Use of this API is DEPRECATED since no size information can be
+   extracted from the returned data.
+*/
+
+PyAPI_FUNC(const char *) PyUnicode_AsUTF8(PyObject *unicode);

 /* === Characters Type APIs =============================================== */


Include/unicodeobject.h

Lines changed: 11 additions & 19 deletions
@@ -443,25 +443,17 @@ PyAPI_FUNC(PyObject*) PyUnicode_AsUTF8String(
     PyObject *unicode           /* Unicode object */
     );

-// Returns a pointer to the UTF-8 encoding of the Unicode object unicode.
-//
-// Raise an exception if the string contains embedded null characters.
-// Use PyUnicode_AsUTF8AndSize() to accept embedded null characters.
-//
-// This function caches the UTF-8 encoded string in the Unicode object
-// and subsequent calls will return the same string. The memory is released
-// when the Unicode object is deallocated.
-PyAPI_FUNC(const char *) PyUnicode_AsUTF8(PyObject *unicode);
-
-// Returns a pointer to the UTF-8 encoding of the
-// Unicode object unicode and the size of the encoded representation
-// in bytes stored in `*size` (if size is not NULL).
-//
-// On error, `*size` is set to 0 (if size is not NULL).
-//
-// This function caches the UTF-8 encoded string in the Unicode object
-// and subsequent calls will return the same string. The memory is released
-// when the Unicode object is deallocated.
+/* Returns a pointer to the default encoding (UTF-8) of the
+   Unicode object unicode and the size of the encoded representation
+   in bytes stored in *size.
+
+   In case of an error, no *size is set.
+
+   This function caches the UTF-8 encoded string in the unicodeobject
+   and subsequent calls will return the same string. The memory is released
+   when the unicodeobject is deallocated.
+*/
+
 #if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 >= 0x030A0000
 PyAPI_FUNC(const char *) PyUnicode_AsUTF8AndSize(
     PyObject *unicode,

Lib/test/test_capi/test_unicode.py

Lines changed: 1 addition & 4 deletions
@@ -914,10 +914,7 @@ def test_asutf8(self):
         self.assertEqual(unicode_asutf8('abc', 4), b'abc\0')
         self.assertEqual(unicode_asutf8('абв', 7), b'\xd0\xb0\xd0\xb1\xd0\xb2\0')
         self.assertEqual(unicode_asutf8('\U0001f600', 5), b'\xf0\x9f\x98\x80\0')
-
-        # disallow embedded null characters
-        self.assertRaises(ValueError, unicode_asutf8, 'abc\0', 0)
-        self.assertRaises(ValueError, unicode_asutf8, 'abc\0def', 0)
+        self.assertEqual(unicode_asutf8('abc\0def', 8), b'abc\0def\0')

         self.assertRaises(UnicodeEncodeError, unicode_asutf8, '\ud8ff', 0)
         self.assertRaises(TypeError, unicode_asutf8, b'abc', 0)
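
The restored assertion checks that the cached UTF-8 buffer keeps the bytes that follow an embedded null character. A minimal sketch of the same check from pure Python, assuming a CPython interpreter where `ctypes.pythonapi` exposes `PyUnicode_AsUTF8AndSize`; `utf8_buffer` is a hypothetical helper written for this illustration, not the `unicode_asutf8` test helper used in the diff:

```python
import ctypes

# Look up PyUnicode_AsUTF8AndSize in the running CPython (assumption: CPython).
_as_utf8_and_size = ctypes.pythonapi.PyUnicode_AsUTF8AndSize
_as_utf8_and_size.argtypes = [ctypes.py_object, ctypes.POINTER(ctypes.c_ssize_t)]
_as_utf8_and_size.restype = ctypes.c_void_p  # raw address, no automatic conversion


def utf8_buffer(text):
    """Copy the cached UTF-8 bytes of `text`, including the trailing NUL."""
    size = ctypes.c_ssize_t()
    ptr = _as_utf8_and_size(text, ctypes.byref(size))
    # Read size + 1 bytes so the terminating null byte is visible as well.
    return ctypes.string_at(ptr, size.value + 1)


print(utf8_buffer('abc\0def'))
```

The size reported by the API covers the whole string, so the bytes after the embedded null are still present in the buffer.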

Lib/test/test_stable_abi_ctypes.py

Lines changed: 0 additions & 1 deletion
Generated file; diff not rendered by default.

Misc/NEWS.d/next/C API/2023-10-20-01-42-43.gh-issue-111089.VIrd5q.rst

Lines changed: 0 additions & 2 deletions
This file was deleted.

Misc/NEWS.d/next/C API/2023-10-20-18-07-24.gh-issue-111089.RxkyrQ.rst

Lines changed: 0 additions & 2 deletions
This file was deleted.

Misc/stable_abi.toml

Lines changed: 0 additions & 2 deletions
@@ -2478,8 +2478,6 @@
     added = '3.13'
 [function.PySys_AuditTuple]
     added = '3.13'
-[function.PyUnicode_AsUTF8]
-    added = '3.13'
 [function._Py_SetRefcnt]
     added = '3.13'
     abi_only = true

Modules/_io/clinic/_iomodule.c.h

Lines changed: 25 additions & 5 deletions
Generated file; diff not rendered by default.

Modules/_io/clinic/fileio.c.h

Lines changed: 7 additions & 2 deletions
Generated file; diff not rendered by default.

Modules/_io/clinic/textio.c.h

Lines changed: 19 additions & 4 deletions
Generated file; diff not rendered by default.

Modules/_io/clinic/winconsoleio.c.h

Lines changed: 7 additions & 2 deletions
Generated file; diff not rendered by default.

Modules/_multiprocessing/clinic/multiprocessing.c.h

Lines changed: 7 additions & 2 deletions
Generated file; diff not rendered by default.

Modules/_multiprocessing/clinic/semaphore.c.h

Lines changed: 7 additions & 2 deletions
Generated file; diff not rendered by default.

Modules/_multiprocessing/posixshmem.c

Lines changed: 2 additions & 2 deletions
@@ -48,7 +48,7 @@ _posixshmem_shm_open_impl(PyObject *module, PyObject *path, int flags,
 {
     int fd;
     int async_err = 0;
-    const char *name = PyUnicode_AsUTF8(path);
+    const char *name = PyUnicode_AsUTF8AndSize(path, NULL);
     if (name == NULL) {
         return -1;
     }
@@ -87,7 +87,7 @@ _posixshmem_shm_unlink_impl(PyObject *module, PyObject *path)
 {
     int rv;
     int async_err = 0;
-    const char *name = PyUnicode_AsUTF8(path);
+    const char *name = PyUnicode_AsUTF8AndSize(path, NULL);
     if (name == NULL) {
         return NULL;
     }
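
With a `NULL` size argument the caller gets only a pointer, so any C function that reads `name` as a C string stops at the first null byte. The sketch below illustrates that truncation and one possible guard, via `ctypes` on CPython (an assumption); `c_string_view` and `check_no_embedded_nul` are hypothetical names, and the guard is not part of this commit:

```python
import ctypes

# Look up PyUnicode_AsUTF8AndSize in the running CPython (assumption: CPython).
_as_utf8_and_size = ctypes.pythonapi.PyUnicode_AsUTF8AndSize
_as_utf8_and_size.argtypes = [ctypes.py_object, ctypes.POINTER(ctypes.c_ssize_t)]
_as_utf8_and_size.restype = ctypes.c_void_p  # raw address, no automatic conversion


def c_string_view(text):
    """Return (full UTF-8 bytes, bytes a C string reader would see)."""
    size = ctypes.c_ssize_t()
    ptr = _as_utf8_and_size(text, ctypes.byref(size))
    full = ctypes.string_at(ptr, size.value)  # honours the reported size
    seen_by_c = ctypes.string_at(ptr)         # stops at the first null byte
    return full, seen_by_c


def check_no_embedded_nul(text):
    """Raise ValueError if a C string reader would see a truncated value."""
    full, seen_by_c = c_string_view(text)
    if len(seen_by_c) != len(full):
        raise ValueError("embedded null character")
    return seen_by_c


print(c_string_view('/psm_a\0b'))
```

Comparing the C string length with the reported size is the usual way to detect embedded nulls when silent truncation is not wanted.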
