Crash when concurrently writing with print and concurrently modifying sys.stdout #130163
The issue could be with |
Here's the backtrace:
The immediate cause seems to be related to the printing itself, but I didn't dig into it too much |
Substituting
class FakeStringIO:
    def write(self, value):
        pass

    def getvalue(self):
        return "hello1\nhello2\n"

    def flush(self):
        pass
|
I suspect it has something to do with the |
Might be something to do with
from contextlib import redirect_stdout
from io import StringIO
from threading import Thread
import time

def test_redirect():
    text = StringIO()
    with redirect_stdout(text):
        time.sleep(0.1)
    print(text.getvalue())

for x in range(100):
    Thread(target=test_redirect, args=()).start()
edit: Both crashes seem to be caused by
|
Have you tried removing the usage to |
Yes, it is caused by
The easiest way to fix this is probably to protect The best solution might be to update the documentation and point out this potential issue with |
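Independently of any interpreter-level fix, user code can avoid the shared state entirely by never rebinding sys.stdout and instead passing a per-thread buffer through print's file argument. The following is a minimal sketch of that idea; the names worker and results are hypothetical and do not come from this thread.

```python
from io import StringIO
from threading import Thread

results = {}  # hypothetical: maps worker index -> text captured by that worker


def worker(index):
    # Each thread writes to its own private buffer, so the process-wide
    # sys.stdout is never rebound and threads cannot interfere.
    buffer = StringIO()
    print(f"hello from {index}", file=buffer)
    results[index] = buffer.getvalue()


threads = [Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

This only sidesteps the problem for code that controls its own print calls; it does not help when third-party code prints to sys.stdout implicitly.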
Using
But any program that does
from threading import Thread
import sys
import time

def test_redirect():
    text = open("/dev/null", "w")
    old_stdout = sys.stdout
    sys.stdout = text
    print("hello1")
    time.sleep(0.3)
    print("hello2")
    sys.stdout = old_stdout

for x in range(200):
    Thread(target=test_redirect, args=()).start()
Would making the presence of the critical section dependent on whether the build is freethreaded (and/or the GIL is disabled) avoid the performance impact for gilfull builds? Or would it be a bad idea to have this conditional difference in behavior?
ISTM that fixing the segfault is necessary, even if a note is added to docs, because users may hit the segfault without using |
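When trying the reproducers in this thread, it can help to confirm which kind of interpreter is running. The sketch below is a Python-level check only, not the C-level conditional discussed above; the two helper names are hypothetical, and sys._is_gil_enabled() is assumed to be available only on Python 3.13 and later.

```python
import sys
import sysconfig


def is_free_threaded_build():
    # Py_GIL_DISABLED is set in the build configuration of free-threaded builds.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def is_gil_enabled():
    # sys._is_gil_enabled() exists on Python 3.13+; older versions always have a GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else bool(check())


print(f"free-threaded build: {is_free_threaded_build()}, GIL enabled: {is_gil_enabled()}")
```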
Yes it's not redirect_stdout that is the culprit, it's the PyFile API. |
It's much harder to segfault in a non-debug build. There are many
Edit: of course, just after I posted this comment, a segfault in a non-debug build happened:
#0 __new_sem_post (sem=0x0) at ./nptl/sem_post.c:35
#1 0x00005555558758ed in PyThread_release_lock (
lock=<optimized out>)
at Python/thread_pthread.h:618
#2 0x00005555558c7921 in _io__Buffered_close_impl (
self=0x3365e1f00c0)
at ./Modules/_io/bufferedio.c:596
#3 _io__Buffered_close (
self=<_io.BufferedWriter at remote 0x3365e1f00c0>,
_unused_ignored=<optimized out>)
at ./Modules/_io/clinic/bufferedio.c.h:375
#4 0x0000555555654b83 in method_vectorcall_NOARGS (
func=<method_descriptor at remote 0x335fe3d6130>,
args=0x7ffe6fdff7e8, nargsf=<optimized out>,
kwnames=0x0) at Objects/descrobject.c:447
#5 0x000055555564184e in _PyObject_VectorcallTstate (
tstate=0x555555e53bb0,
callable=<method_descriptor at remote 0x335fe3d6130>, args=0x7ffe6fdff7e8, nargsf=<optimized out>,
kwnames=0x0)
at ./Include/internal/pycore_call.h:167
args=args@entry=0x7ffe6fdff7e8, nargsf=<optimized out>, nargsf@entry=9223372036854775809,
kwnames=kwnames@entry=0x0) at Objects/call.c:856
#7 0x00005555558cf1c0 in PyObject_CallMethodNoArgs (self=<optimized out>,
name=<optimized out>) at ./Include/cpython/abstract.h:65
#8 _io_TextIOWrapper_close_impl (self=0x3365e2000e0) at ./Modules/_io/textio.c:3167
#9 _io_TextIOWrapper_close (self=<_io.TextIOWrapper at remote 0x3365e2000e0>,
_unused_ignored=<optimized out>) at ./Modules/_io/clinic/textio.c.h:1138
#10 0x0000555555654b83 in method_vectorcall_NOARGS (
func=<method_descriptor at remote 0x335fe3d8890>, args=0x7ffe6fdff8d0,
nargsf=<optimized out>, kwnames=0x0) at Objects/descrobject.c:447
#11 0x000055555564184e in _PyObject_VectorcallTstate (tstate=0x555555e53bb0,
callable=<method_descriptor at remote 0x335fe3d8890>, args=0x7ffe6fdff8d0,
nargsf=<optimized out>, kwnames=0x0) at ./Include/internal/pycore_call.h:167
#12 PyObject_VectorcallMethod (name=<optimized out>, args=args@entry=0x7ffe6fdff8d0,
nargsf=<optimized out>, nargsf@entry=9223372036854775809, kwnames=kwnames@entry=0x0)
at Objects/call.c:856
#13 0x00005555558bebad in PyObject_CallMethodNoArgs (self=<optimized out>,
name=<optimized out>) at ./Include/cpython/abstract.h:65
#14 iobase_finalize (self=<_io.TextIOWrapper at remote 0x3365e2000e0>)
at ./Modules/_io/iobase.c:317
#15 0x00005555556bf3a0 in PyObject_CallFinalizer (
self=<_io.TextIOWrapper at remote 0x3365e2000e0>) at Objects/object.c:568
#16 0x00005555556c5dfc in PyObject_CallFinalizerFromDealloc (
self=self@entry=<_io.TextIOWrapper at remote 0x3365e2000e0>) at Objects/object.c:586
#17 0x00005555558bfd4e in _PyIOBase_finalize (
self=self@entry=<_io.TextIOWrapper at remote 0x3365e2000e0>) at ./Modules/_io/iobase.c:340
#18 0x00005555558ce6ca in textiowrapper_dealloc (
op=<_io.TextIOWrapper at remote 0x3365e2000e0>) at ./Modules/_io/textio.c:1468
#19 0x00005555556a8ebd in Py_XDECREF (op=<optimized out>) at ./Include/refcount.h:502
#20 insertdict (mp=0x335fe01a600, key='stdout', hash=-9074624609838634197,
value=<_io.TextIOWrapper(mode='w') at remote 0x3375a1f00e0>, interp=<optimized out>)
at Objects/dictobject.c:1864
#21 0x00005555556aae68 in _PyObjectDict_SetItem (tp=tp@entry=0x555555b023a0 <PyModule_Type>,
obj=obj@entry=<module at remote 0x335fe25e060>, dictptr=<optimized out>,
key=key@entry='stdout',
value=value@entry=<_io.TextIOWrapper(mode='w') at remote 0x3375a1f00e0>)
at Objects/dictobject.c:7502
#22 0x00005555556c3add in _PyObject_GenericSetAttrWithDict (
obj=<module at remote 0x335fe25e060>, name='stdout',
value=<_io.TextIOWrapper(mode='w') at remote 0x3375a1f00e0>, dict=0x0)
at Objects/object.c:1853
#23 0x00005555556c150c in PyObject_SetAttr (v=v@entry=<module at remote 0x335fe25e060>,
name=<optimized out>,
value=value@entry=<_io.TextIOWrapper(mode='w') at remote 0x3375a1f00e0>)
at Objects/object.c:1443
#24 0x00005555557d2701 in _PyEval_EvalFrameDefault (tstate=<optimized out>,
frame=<optimized out>, throwflag=<optimized out>)
at ./Include/internal/pycore_stackref.h:184
#25 0x00005555557dcf6e in _PyEval_EvalFrame (tstate=0x555555e53bb0, frame=<optimized out>,
throwflag=0) at ./Include/internal/pycore_ceval.h:116
#26 _PyEval_Vector (tstate=0x555555e53bb0, func=0x335fe4f48c0, locals=0x0,
args=<optimized out>, argcount=1, kwnames=<optimized out>) at Python/ceval.c:1745
#27 0x00005555556445c3 in _PyObject_VectorcallTstate (tstate=0x555555e53bb0,
callable=<function at remote 0x335fe4f48c0>, args=0x7ffe6fdffe18, nargsf=1, kwnames=0x0)
at ./Include/internal/pycore_call.h:167
#28 method_vectorcall (method=<optimized out>, args=0x555555b45450 <_PyRuntime+113424>,
nargsf=<optimized out>, kwnames=0x0) at Objects/classobject.c:72
#29 0x00005555558fa480 in thread_run (boot_raw=0x555555e53b70)
at ./Modules/_threadmodule.c:354
#30 0x000055555587508b in pythread_wrapper (arg=<optimized out>)
at Python/thread_pthread.h:242
#31 0x00007ffff7ca1e2e in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#32 0x00007ffff7d33a4c in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
Is it a different crash? |
Ok, it's perhaps not the PyFile API, but just the buffer API itself. To check this, can you try not using |
Ok, I couldn't manage to make it crash with a pure buffer API, so it's likely the |
Even with a critical section, I still get a segfault:
diff --git a/Objects/fileobject.c b/Objects/fileobject.c
index 7025b5bcffc..7b84f5962fe 100644
--- a/Objects/fileobject.c
+++ b/Objects/fileobject.c
@@ -3,6 +3,7 @@
 #include "Python.h"
 #include "pycore_call.h" // _PyObject_CallNoArgs()
 #include "pycore_runtime.h" // _PyRuntime
+#include "pycore_critical_section.h"
 #ifdef HAVE_UNISTD_H
 # include <unistd.h> // isatty()
@@ -101,34 +102,33 @@ PyFile_GetLine(PyObject *f, int n)
 /* Interfaces to write objects/strings to file-like objects */
+static inline int
+_PyFile_WriteObject(PyObject *f, PyObject *v)
+{
+    PyObject *writer = PyObject_GetAttr(f, &_Py_ID(write));
+    if (writer == NULL) return -1;
+    PyObject *res = PyObject_CallOneArg(writer, v);
+    Py_DECREF(writer);
+    if (res == NULL) return -1;
+    Py_DECREF(res);
+    return 0;
+}
+
 int
 PyFile_WriteObject(PyObject *v, PyObject *f, int flags)
 {
-    PyObject *writer, *value, *result;
     if (f == NULL) {
         PyErr_SetString(PyExc_TypeError, "writeobject with NULL file");
         return -1;
     }
-    writer = PyObject_GetAttr(f, &_Py_ID(write));
-    if (writer == NULL)
-        return -1;
-    if (flags & Py_PRINT_RAW) {
-        value = PyObject_Str(v);
-    }
-    else
-        value = PyObject_Repr(v);
-    if (value == NULL) {
-        Py_DECREF(writer);
-        return -1;
-    }
-    result = PyObject_CallOneArg(writer, value);
+    int rc = 0;
+    PyObject *value = (flags & Py_PRINT_RAW) ? PyObject_Str(v) : PyObject_Repr(v);
+    Py_BEGIN_CRITICAL_SECTION(f);
+    rc = _PyFile_WriteObject(f, value);
+    Py_END_CRITICAL_SECTION();
     Py_DECREF(value);
-    Py_DECREF(writer);
-    if (result == NULL)
-        return -1;
-    Py_DECREF(result);
-    return 0;
+    return rc;
 }
|
And adding a critical section to |
A funny observation. When using |
Not sure this helps, but I get a (non-deterministic) segfault under GDB with:
from contextlib import redirect_stdout
from threading import Thread
import sys
import time

def test_redirect():
    text = open("/dev/null", "w")
    with redirect_stdout(text):
        sys.stdout.buffer.write(b"hello1")
        time.sleep(0.3)
        sys.stdout.buffer.write(b"hello1")

for x in range(200):
    Thread(target=test_redirect, args=()).start()
The backtrace is:
#0 PyObject_CallFinalizerFromDealloc (self=0x200381b0110) at Objects/object.c:602
#1 0x000055555598aa30 in _PyIOBase_finalize (self=self@entry=0x200381b0110) at ./Modules/_io/iobase.c:340
#2 0x000055555599da85 in textiowrapper_dealloc (op=0x200381b0110) at ./Modules/_io/textio.c:1468
#3 0x000055555570d483 in _Py_Dealloc (op=op@entry=0x200381b0110) at Objects/object.c:3015
#4 0x000055555570d642 in _Py_DecRefSharedDebug (o=o@entry=0x200381b0110,
filename=filename@entry=0x5555559f662f "./Include/refcount.h", lineno=lineno@entry=502)
at Objects/object.c:418
#5 0x00005555556e55b4 in Py_DECREF (filename=0x5555559f662f "./Include/refcount.h", lineno=502,
op=0x200381b0110) at ./Include/refcount.h:347
#6 0x00005555556f38b7 in Py_XDECREF (op=<optimized out>) at ./Include/refcount.h:502
#7 insertdict (interp=<optimized out>, mp=mp@entry=0x2000001f850,
key=key@entry=0x555555c801d8 <_PyRuntime+106200>, hash=737787358347787621,
value=value@entry=0x2003a1b0110) at Objects/dictobject.c:1864
#8 0x00005555556f3b09 in setitem_take2_lock_held (mp=mp@entry=0x2000001f850,
key=key@entry=0x555555c801d8 <_PyRuntime+106200>, value=value@entry=0x2003a1b0110)
at Objects/dictobject.c:2594
#9 0x00005555556f3c5a in setitem_lock_held (mp=mp@entry=0x2000001f850,
key=key@entry=0x555555c801d8 <_PyRuntime+106200>, value=value@entry=0x2003a1b0110)
at ./Include/refcount.h:519
#10 0x00005555556f456e in _PyDict_SetItem_LockHeld (dict=dict@entry=0x2000001f850,
name=name@entry=0x555555c801d8 <_PyRuntime+106200>, value=value@entry=0x2003a1b0110)
at Objects/dictobject.c:6829
#11 0x00005555556f467d in _PyObjectDict_SetItem (tp=tp@entry=0x555555c3e3a0 <PyModule_Type>,
obj=obj@entry=0x20000259f50, dictptr=0x20000259f70, key=key@entry=0x555555c801d8 <_PyRuntime+106200>,
value=value@entry=0x2003a1b0110) at Objects/dictobject.c:7502
#12 0x000055555571118e in _PyObject_GenericSetAttrWithDict (obj=0x20000259f50,
name=0x555555c801d8 <_PyRuntime+106200>, value=<optimized out>, dict=dict@entry=0x0)
at Objects/object.c:1853
#13 0x00005555557113c4 in PyObject_GenericSetAttr (obj=<optimized out>, name=<optimized out>,
value=<optimized out>) at Objects/object.c:1881
#14 0x000055555570fbe8 in PyObject_SetAttr (v=0x20000259f50, name=<optimized out>, value=0x2003a1b0110)
at Objects/object.c:1443
#15 0x000055555582d976 in builtin_setattr_impl (module=module@entry=0x2000025c5d0, obj=<optimized out>,
name=<optimized out>, value=<optimized out>) at Python/bltinmodule.c:1695
#16 0x000055555582d9d1 in builtin_setattr (module=0x2000025c5d0, args=args@entry=0x7fffb67fbb48,
nargs=nargs@entry=3) at Python/clinic/bltinmodule.c.h:660
#17 0x000055555584b9ed in _PyEval_EvalFrameDefault (tstate=tstate@entry=0x555555d90f10,
frame=0x7ffff742b240, frame@entry=0x7ffff742b020, throwflag=throwflag@entry=0)
at Python/generated_cases.c.h:2011 |
Yes, but this assumes that Btw, I didn't manage to have a segfault with your latest reproducer so maybe it's also an OS/kernel issue? |
Here's one that sometimes segfaults for me under GDB without
from threading import Thread
import time

null = open("/dev/null", "w")

class Dummy:
    stdout = null

dummy = Dummy()

def test_redirect():
    old_stdout = dummy.stdout
    dummy.stdout = open("/dev/null", "w")
    dummy.stdout.buffer.write(b"hello1\n")
    time.sleep(0.3)
    dummy.stdout.buffer.write(b"hello2\n")
    dummy.stdout = old_stdout
    dummy.stdout.buffer.write(b"hello3\n")

for x in range(200):
    Thread(target=test_redirect, args=()).start()
Backtrace same as the one before. |
I don't know why I can't make it fail on my side :') I only get |
Ok, so let's track it down separately. We can use this issue for the lack of thread safety of |
I've updated the description in this issue and filed #130202 for the resurrection. If someone wants to take on the |
If @devdanzin wouldn't, I will try. |
There's an open C API WG discussion about the problem with |
Oh, I wouldn't know how. Feel free to take it forward. |
Thanks, I will try. |
I've looked at how @colesbury @ZeroIntensity, @serhiy-storchaka please take a look. Here's a copy from my PR:
Also, can you point me out what to do next? I can replace |
#130503 is an adaptation of #111035 or #129736 that does not introduce any new public API and does not change the current behavior (other than fixing crashes, mainly). It will differ only minimally from #111035 or #129736, whichever is accepted for a future Python version. I also added tests for Thank you for your tests, @sergey-miryanov, but sorry, I have not used them. They are high quality, but too large and expensive. It is not worth keeping them forever after this issue is fixed. |
@serhiy-storchaka I simplified it a bit yesterday, not sure if you saw the latest changes. Maybe it will be useful, but it is all up to you. Anyway, it was very interesting. |
The use of PySys_GetObject() and _PySys_GetAttr(), which return a borrowed reference, has been replaced by using one of the following functions, which return a strong reference and distinguish a missing attribute from an error: _PySys_GetOptionalAttr(), _PySys_GetOptionalAttrString(), _PySys_GetRequiredAttr(), and _PySys_GetRequiredAttrString().
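The "optional" versus "required" distinction described in this commit message can be mirrored in pure Python. This is only an analogy under stated assumptions: the helper names get_optional_attr and get_required_attr are hypothetical, the real C functions and their signatures are not shown in this thread, and the RuntimeError text is illustrative.

```python
import sys

_MISSING = object()  # sentinel so that an attribute set to None still counts as present


def get_optional_attr(name):
    """Return (found, value). The returned value is an ordinary (strong) reference,
    so it stays alive even if another thread rebinds sys.<name> afterwards."""
    value = getattr(sys, name, _MISSING)
    if value is _MISSING:
        return False, None
    return True, value


def get_required_attr(name):
    """Like get_optional_attr, but a missing attribute is reported as an error."""
    found, value = get_optional_attr(name)
    if not found:
        raise RuntimeError(f"lost sys.{name}")
    return value
```

A caller would fetch the stream once, for example out = get_required_attr("stdout"), and use out for the whole write instead of looking up sys.stdout repeatedly.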
I cannot reproduce the crash using #130163 (comment) reproducer. Can we close the issue? |
It has already been fixed. |
Crash report
What happened?
Playing with the code from #130148, I stumbled on a segfault in a free-threaded debug build with the following (seems to trigger faster/more consistently when pasted in the REPL):
I didn't have time to check whether JIT, debug or no-gil are strictly necessary for this to crash. JIT is not needed for this to crash.
Cause
The problem is that in the print() implementation, _PySys_GetAttr returns a borrowed reference.
cpython/Python/bltinmodule.c
Lines 2173 to 2175 in 655fc8a
So if sys.stdout changes concurrently with the print(), the program may crash because file will point to a deallocated Python object.
This affects the GIL-enabled build as well. See this reproducer: https://gist.github.com/colesbury/c48f50e95d5d68e24814a56e2664e587
Suggested fix
Introduce a _PySys_GetAttrRef that returns a new reference instead of a borrowed reference. Use that instead.
We should audit all the uses of _PySys_GetAttr and PySys_GetObject. I expect most of them will need to be replaced with functions that return new references, but it doesn't all have to be in a single PR.
CPython versions tested on:
CPython main branch
Operating systems tested on:
Linux
Output from running 'python -VV' on the command line:
Python 3.14.0a5+ experimental free-threading build (heads/main:359c7dde3bb, Feb 15 2025, 17:54:11) [GCC 11.4.0]