Crash when generator frame proxies outlive their generator #125723
Comments
@gaogaotiantian @markshannon Genuinely strange interaction here when frame proxies for generator objects outlive both their generator and their eval loop. Hard to hit in normal code, but pretty easy to hit at an interactive prompt (assuming you're messing about with locals proxies interactively for some reason). I assume it relates to something going wrong with the handoff of frame state from the eval loop to the generator object (given the garbage data being emitted, maybe the frame proxy pointer is still referencing the no longer valid loop frame?) (Note: it's literally been years since I looked at that code, so I don't actually know how it currently works. This assumption is only based on a vague memory of how it used to work around 3.10 or so) |
Definitely looks to be some kind of refcounting issue when the proxy is the only thing still referencing the underlying frame:
The initial ref creation in Line 338 in e924bb6
Py_NewRef , which seems fine.
But this ownership check in Line 64 in e924bb6
I initially thought that when
Assuming that analysis is correct, I think at the very least, both Leaving the data ownership alone for data owned by the C stack is based on Edit: I found the line I had missed that makes this work in the general case. Creating a Python frame object inherently grants ownership of the frame data to that linked frame object: Line 1863 in e924bb6
So it looks like mere code introspection isn't going to be enough to track this one down, it's going to take experimentation on a local build. The fact
|
@ncoghlan are you still investigating this, or do you need help? I don't have any clue at this point, so I'll need to dig into the code base as well. If this is too much for you, I can take over. However, if you can figure it out and fix it, I'd appreciate it :) |
I cannot reproduce it on the main branch, but I can reproduce it on the 3.13 branch. I think I can find more clues about the root cause. |
Fun fact: after bisecting, I found that #119209 may have fixed this issue in some circumstances. When I run the test code
import sys
def g():
    a = 1
    yield locals(), sys._getframe().f_locals
ns = {}
for i in range(10):
    exec("snapshot, live_locals = next(g())", locals=ns)
    print(ns)
the output is correct on e9875ec and crashes on 73ab83b.
BTW, I compile the code in
When I try to compile the CPython
I think I may need some time to dive deeper.
|
@gaogaotiantian I'm not currently investigating this one (my initial investigation was mostly to work out how the test suite had missed it, which turned out to be the "frame proxy must outlive its creating evaluation loop" criterion for hitting the crash). It would have been nice if the problem had been obvious merely on reading the code, but alas, the cause (whatever it turns out to be) isn't that simple. @Zheaoli That's definitely an interesting find. It suggests we might be inadvertently creating a reference cycle somewhere, and that commit happened to change exactly when the cycle gets collected. (The locals proxy isn't supposed to be getting involved in any cycles that aren't explicitly created in the Python code, but there's clearly something unexpected going on, so inadvertently created cycles can't be ruled out in advance) |
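The point about a reference cycle changing exactly when an object is collected can be illustrated in isolation. This is a generic sketch of CPython's behaviour, not code from the issue: the `Node` class and `make_cycle` helper are hypothetical names, and nothing here touches frames or generators. It only shows that objects in a cycle survive until the cyclic garbage collector runs, whereas acyclic objects are freed as soon as their last reference goes away.

```python
import gc
import weakref


class Node:
    """Hypothetical placeholder object used to build a two-element cycle."""


def make_cycle():
    # Two objects that reference each other: refcounting alone cannot free them.
    a = Node()
    b = Node()
    a.other = b
    b.other = a
    return weakref.ref(a)


was_enabled = gc.isenabled()
gc.disable()  # keep the automatic collector from running mid-demo
try:
    ref = make_cycle()
    alive_before = ref() is not None  # the cycle keeps both objects alive
    gc.collect()                      # an explicit collection breaks the cycle
    alive_after = ref() is not None
finally:
    if was_enabled:
        gc.enable()

print("alive before gc.collect():", alive_before)
print("alive after gc.collect():", alive_after)
```

If a frame proxy were caught up in a cycle like this, the moment its frame was released would depend on collector timing, which is consistent with an unrelated commit changing whether the crash shows up.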
Something that occurred to me is whether it might be something the specialising interpreter is doing with the refcount handling (or the generator frame state management), rather than being inherent in the C code itself. It's the fact that this works:
while this potentially crashes:
that got me thinking in that direction, as the only substantive difference I can see between the two versions is that in the second form, If the eval loop ends up clearing a generator frame state it doesn't actually own when the eval loop terminates in the second scenario, that would explain that state being gone when the frame proxy tries to access it. |
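The lifetime difference between holding a generator in a variable and passing a temporary straight to `next()` can be observed without touching frame proxies at all. The following is a minimal sketch under that assumption: `tracked` is a hypothetical helper that records a weak reference to each generator it creates, and the generator body is simplified so that no `f_locals` object is ever retained.

```python
import gc
import weakref


def g():
    a = 1
    yield a


def tracked(refs):
    """Create a generator and record a weak reference to it (hypothetical helper)."""
    gen = g()
    refs.append(weakref.ref(gen))
    return gen


refs = []

# Form 1: the generator is a temporary that only exists during the next() call.
next(tracked(refs))
gc.collect()
temporary_alive = refs[0]() is not None

# Form 2: the generator is bound to a name and therefore stays alive.
kept = tracked(refs)
next(kept)
gc.collect()
kept_alive = refs[1]() is not None

print("temporary generator still alive:", temporary_alive)
print("named generator still alive:", kept_alive)
```

On CPython the temporary generator is gone as soon as `next()` returns, so anything that still points into its frame state after that point is relying on something else having taken ownership of that state.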
Alas, the tip of 3.13 still crashes even with the initial set of #125038 changes merged:
acoghlan@TerminalMist:~/devel/cpython$ ./python ../_misc/gen_locals_crash.py
{'snapshot': {'a': 1}, 'live_locals': {'a': 1}}
{'snapshot': {'a': 1}, 'live_locals': {'a': '_statistics.cpython-313-x86_64-linux-gnu.so'}}
Segmentation fault (core dumped)
acoghlan@TerminalMist:~/devel/cpython$ ./python --version
Python 3.13.0+
acoghlan@TerminalMist:~/devel/cpython$ git describe
v3.13.0-249-g9ecaee6a551
Current crasher implementation:
import sys
def g():
    a = 1
    yield locals(), sys._getframe().f_locals
ns = {}
for i in range(10):
    exec("snapshot, live_locals = next(g())", locals=ns)
    print(ns)
|
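When checking a crasher like this against several builds, it can help to run it in a child process so that a segmentation fault does not take down the shell or test harness. The sketch below is one way to do that, under some assumptions: `describe_exit` is a hypothetical helper, the child is simply whichever interpreter runs the script (`sys.executable`), and the `exec` call passes its namespaces positionally so the script also runs on versions that lack the `locals=` keyword. It is an assumption, not something confirmed in the thread, that the positional form triggers the crash as readily as the original.

```python
import subprocess
import sys

# Crasher from the thread, with exec() namespaces passed positionally
# (an adaptation for portability, not the exact original script).
CRASHER = """
import sys
def g():
    a = 1
    yield locals(), sys._getframe().f_locals
ns = {}
for i in range(10):
    exec("snapshot, live_locals = next(g())", globals(), ns)
    print(ns)
"""


def describe_exit(returncode):
    """Turn a child exit status into a short description (hypothetical helper)."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative value means the child was killed by that signal.
        return f"killed by signal {-returncode}"
    return f"exit code {returncode}"


try:
    result = subprocess.run(
        [sys.executable, "-c", CRASHER],
        capture_output=True,
        text=True,
        timeout=60,
    )
    outcome = describe_exit(result.returncode)
except subprocess.TimeoutExpired:
    outcome = "timed out"

print("crasher outcome:", outcome)
```

To test a specific local build, `sys.executable` can be replaced with the path to that build's `python` binary.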
Actually, I can reproduce this crash on the main branch. |
I've tried some variants of repro.
import sys
g_f_locals = {}
def g():
    global g_f_locals
    g_f_locals = sys._getframe().f_locals
    a = 0
    yield 0
direct_call = True
if direct_call:
    next(g())
else:
    gen = g(); next(gen)
print(g_f_locals)
If |
… generator (#126956) Co-authored-by: Kirill Podoprigora Co-authored-by: Alyssa Coghlan
IMHO, this issue can be closed. |
Crash report
What happened?
Checking some frame proxy behaviour at the interactive prompt, I encountered the following crash:
(The crash was initially seen in the new REPL; the above reproduction in the basic REPL showed it wasn't specific to the new REPL.)
Subsequent investigation suggests that the problem relates to the frame proxy outliving the eval loop that created it, as the following script was able to reliably reproduce the crash (as shown below):
Changing the code to explicitly keep the generator object alive:
is sufficient to eliminate the crash:
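The workaround described here (keeping the generator object alive for as long as its frame locals are in use) can be sketched as follows. This is a minimal sketch, not the original script from the report: the generator `g` matches the crasher posted in the comments, while `read_locals_safely` is a hypothetical helper name, and copying the proxy with `dict()` before releasing the generator is an added precaution rather than something the report prescribes.

```python
import sys


def g():
    a = 1
    yield locals(), sys._getframe().f_locals


def read_locals_safely():
    """Read a generator's locals while a strong reference keeps it alive."""
    gen = g()  # named reference: the generator outlives the next() call
    snapshot, live_locals = next(gen)
    copied = dict(live_locals)  # plain dict copy, independent of the frame
    del live_locals             # drop the frame locals object before the generator
    gen.close()
    return snapshot, copied


snapshot, copied = read_locals_safely()
print(snapshot, copied)
```

Returning only plain dictionaries means nothing that refers back to the generator's frame escapes the function, so the ordering problem described in this report cannot arise for the caller.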
The crash is still eliminated, even when gen is created via exec rather than creating it in the main eval loop:
CPython versions tested on:
3.13, CPython main branch
Operating systems tested on:
Linux
Output from running 'python -VV' on the command line:
No response
Linked PRs