Docs C API: Clarify what happens when null bytes are passed to PyUnicode_AsUTF8 (#127458)

ZeroIntensity · StanFromIreland · tomasr8 · web-flow · commit e792f4bc2e71 · 2025-01-20T16:54:29.000+01:00
Co-authored-by: Stan U. <89152624+StanFromIreland@users.noreply.github.com>
Co-authored-by: Tomas R. <tomas.roun8@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
@@ -1054,6 +1054,15 @@ These are the UTF-8 codec APIs:
 
    As :c:func:`PyUnicode_AsUTF8AndSize`, but does not store the size.
 
+   .. warning::
+
+      This function does not have any special behavior for
+      `null characters <https://en.wikipedia.org/wiki/Null_character>`_ embedded within
+      *unicode*. As a result, embedded null characters remain in the returned
+      string, and C functions that expect null-terminated strings may treat the first one
+      as the end of the string, leading to truncation. If truncation is an issue, use
+      :c:func:`PyUnicode_AsUTF8AndSize` instead.
+
    .. versionadded:: 3.3
 
    .. versionchanged:: 3.7
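
The truncation described in the warning can be illustrated with a minimal sketch in plain C. It does not call the Python C API; the buffer and size arguments stand in for what :c:func:`PyUnicode_AsUTF8AndSize` would hand back, and the helper name ``has_embedded_null`` is hypothetical, introduced only for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the Python C API).
 * Returns 1 if the first `size` bytes of `buf` contain a null byte,
 * 0 otherwise. `size` is the explicit length of the data, excluding
 * any trailing terminator. */
static int has_embedded_null(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Hypothetical helper: the length a null-terminated-string function
 * such as strlen() would report for the same buffer. For data with an
 * embedded null byte this is shorter than the real size. */
static size_t c_string_length(const char *buf)
{
    return strlen(buf);
}
```

For the 7-byte buffer ``"abc\0def"``, ``c_string_length`` reports 3 while the real size is 7, so any code that relies on the terminator sees only ``abc``. A caller holding the explicit size, as returned through the *size* argument of :c:func:`PyUnicode_AsUTF8AndSize`, could use a check like ``has_embedded_null`` to detect this case before passing the buffer to such functions.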