This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Move GC to a separate thread #58
Comments
I abandoned work on it. The implementation worked, but there was an obscure crash that I didn't manage to diagnose.
Note the challenge is to run the test suite with threaded GC enabled:

```diff
diff --git a/Python/pylifecycle.c b/Python/pylifecycle.c
index 1467527da3..d6e8129849 100644
--- a/Python/pylifecycle.c
+++ b/Python/pylifecycle.c
@@ -901,7 +901,7 @@ _Py_InitializeMainInterpreter(PyInterpreterState *interp,
     }
     /* XXX allow setting GC mode with an env var? */
-    if (_PyGC_SetThreaded(0)) {
+    if (_PyGC_SetThreaded(1)) {
         Py_FatalError("_PyGC_SetThreaded failed");
     }
```
Thanks. I see the crash in test__xxsubinterpreters. Also a few other tests are failing due to what looks like finalizers not being called. I'll try to understand what you did.
I actually ended up fixing some of the finalizer bugs, but I need to find where I stored that work 😅 In any case, it is interesting to note that my benchmarks showed a little slowdown in general (although not much). I suspect this is due to the extra context switching, but in any case I don't think this will actually end up giving us any speedups, although it will potentially help with the problems described in PEP 556.
Indeed, speedups were not the goal; the aim was to make the GC context more predictable (instead of GC-induced code such as finalizers running at potentially any point). Even latency would probably not be improved very much, because of the GIL (I don't know how workable it would be to release the GIL at some points of garbage collection).
Note that this GC execution context issue is actually extremely serious for real-world code (anything with complex finalizers with side effects). It also motivated the creation of the SimpleQueue object, which can help alleviate the problem, but at the cost of additional complexity (and potential bugs or performance issues as well).
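The hazard described here can be made concrete with a small demo (CPython-specific behaviour; the class name is invented for illustration): a side-effecting `__del__` on an object trapped in a reference cycle runs not where the object was dropped, but whenever the cycle collector happens to fire:

```python
import gc

gc.disable()  # make collection timing explicit for this demo
log = []

class Noisy:
    def __del__(self):
        # Side-effecting finalizer: with automatic GC enabled it could
        # run at almost any allocation point in unrelated code.
        log.append("finalized")

def make_cycle():
    a = Noisy()
    a.self = a  # reference cycle: refcounting alone cannot free this

make_cycle()
assert log == []          # the cycle still keeps the object alive
gc.collect()              # the finalizer runs "out of band", here
assert log == ["finalized"]
```

Draining such finalizers from a single, known point (a queue or a dedicated thread) is exactly what would make their execution context predictable.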
Yep, that's why I was interested in pushing it in the first place, but I got side-tracked by the parser work and by extinguishing the fires of the 3.9 release :(
OTOH in the context of "Faster CPython" the idea was to make things faster by letting a second core handle (most of) the GC work -- this is what some other languages do. But it would require something like the Gilectomy first.
Right, but those "other languages" are generally fully GC-based. In CPython, much of the cost of reclaiming memory is borne by regular reference counting.
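This split is observable from pure Python (a small CPython-specific demo): plain objects die immediately when their refcount hits zero, while only cyclic garbage waits for the cycle collector.

```python
import gc
import weakref

gc.disable()  # keep the automatic collector out of the demo

class Obj:
    pass

# Plain reference: reclaimed immediately by reference counting.
o = Obj()
r = weakref.ref(o)
del o
assert r() is None  # dead the moment the refcount hit zero

# Cyclic garbage: the refcount never reaches zero, so only the
# cycle-finding GC can reclaim it.
o = Obj()
o.self = o
r = weakref.ref(o)
del o
assert r() is not None  # still alive, kept by the self-cycle
gc.collect()
assert r() is None
```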
Regardless of where we run the "GC", i.e. the cycle finder and breaker, it would make sense to improve the general decref/dealloc code. We want to:
To keep dealloc efficient we want to free memory for simple objects without having to chase function pointers, and not to have to worry about finalizers and weakrefs when clearing internal references. Something like this (in C-flavoured Python):

```python
@inline
def Py_DECREF(obj):
    # Refcount never reaches 0 unless object is genuinely unreachable.
    if obj.refcount == 1:
        dealloc(obj)
    else:
        obj.refcount -= 1

def dealloc(obj):
    if has_weakrefs(obj) or needs_finalizing(obj):
        finalizer_list.append(obj)
    elif has_internal_references(obj):
        recursive_clear_and_free(obj)
    else:
        # Objects that cannot refer to other objects, e.g. ints, strs.
        # Implicitly reduce refcount to zero.
        free(obj)

def recursive_clear_and_free(obj):
    # obj cannot be "resurrected" as there are no references to it,
    # strong or weak.
    if tstate.recursive_dealloc > TRASHCAN_LIMIT:  # Currently limit is 50?
        trashcan_list.append(obj)
        return
    tstate.recursive_dealloc += 1
    for offset in obj.__class__.layout:
        item = obj[offset]
        obj[offset] = NULL
        Py_DECREF(item)
    assert obj.refcount == 1
    # Implicitly reduce refcount to zero.
    free(obj)
    tstate.recursive_dealloc -= 1
    if tstate.recursive_dealloc == 0:
        while trashcan_list:
            obj = trashcan_list.pop()
            Py_DECREF(obj)
        if finalizer_list and not RUN_FINALIZERS_IN_THREAD:
            clear_finalizer_list()

def clear_finalizer_list():
    while finalizer_list:
        item = finalizer_list.pop()
        if has_weak_refs(item):
            do_weak_ref_callbacks(item)
        if has_finalizer(item):
            finalize(item)
        Py_DECREF(item)

if RUN_FINALIZERS_IN_THREAD:
    def thread_run_clear_finalizer_list():
        while True:
            clear_finalizer_list()
            # sleep until there is something to do
```

I have done absolutely no testing on the above, so it will be incorrect. Hopefully the general idea is clear though.
@markshannon Arranging the GC like that would make it more predictable but could slow it down, right? (I'm assuming that the trashcan only kicks in after 50 because it's slower, so not worth it for small chains.)
I think it should be quicker, as type-specific dealloc code doesn't need to worry about trashcans, finalizers or resurrection. Maintaining a stack of objects, and making deallocation iterative rather than recursive, should be even quicker. Pushing a single pointer is going to be faster than making a C call.
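The iterative idea can be sketched with a toy object model (all names here are invented for illustration, not CPython API): an explicit stack replaces C-level recursion, so stack depth is bounded by the amount of dead garbage rather than by C stack frames, and no trashcan depth limit is needed.

```python
class Node:
    """Toy stand-in for a container object with internal references."""
    def __init__(self, *children):
        self.children = list(children)
        self.refcount = 1   # one incoming reference per node in this demo
        self.freed = False

freed = []

def free(o):
    o.freed = True
    freed.append(o)

def dealloc_iterative(root):
    # Explicit stack instead of recursive dealloc calls: each dead
    # object is pushed exactly once when its refcount reaches zero.
    stack = [root]
    while stack:
        o = stack.pop()
        for child in o.children:   # clear internal references
            child.refcount -= 1
            if child.refcount == 0:
                stack.append(child)
        o.children.clear()
        free(o)

# A chain a -> b -> c; dropping the last reference to a frees all three.
c = Node()
b = Node(c)
a = Node(b)
a.refcount = 0   # simulate the final decref on a
dealloc_iterative(a)
assert all(n.freed for n in (a, b, c))
```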
I'm assuming you meant that the last line of
I think this can get complicated because we need to make sure that we don't add the same object twice to the
Another question: if we consume the finalizer_list on a separate thread, then is it on the interpreter state rather than the thread state? Or does the separate thread look at the other threads' lists? In either case, doesn't this require locking?
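One possible answer to the locking question, sketched in Python (the queue name and sentinel are assumptions for illustration, not a proposed API): a single per-interpreter queue with internal synchronization lets any thread hand off objects, while a dedicated thread sleeps until work arrives.

```python
import queue
import threading

finalizer_queue = queue.SimpleQueue()  # internally synchronized
_SENTINEL = object()                   # shutdown marker for the demo
done = []

def finalizer_thread():
    while True:
        item = finalizer_queue.get()   # blocks: "sleep until work"
        if item is _SENTINEL:
            break
        done.append(f"finalized {item}")

t = threading.Thread(target=finalizer_thread)
t.start()
# Any thread can enqueue without managing locks itself.
finalizer_queue.put("obj1")
finalizer_queue.put("obj2")
finalizer_queue.put(_SENTINEL)
t.join()
assert done == ["finalized obj1", "finalized obj2"]
```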
This is indeed an extremely delicate area of the GC where very obscure bugs have been found over the years. I would be very careful about introducing any changes to that part of the GC machinery without a very good reason, because causing obscure regressions is super easy, and it is even more dangerous when you involve weak references with types that have
Why is this fragile? It shouldn't be. I believe the reason it is so fragile is that we break the most important invariant of refcounting, that
The algorithm above (#58 (comment)) should maintain that invariant, so weakref callbacks and finalizers can do as they please.
Yes. The code should be:

```python
def dealloc(obj):
    if has_weakrefs(obj) or obj.needs_finalizing:
        obj.needs_finalizing = False
        finalizer_list.append(obj)
    ...
```
Do you mean (I don't see how this takes into account everything that can happen in the weakref callback.)
I agree.
Yes, that is an

```python
def dealloc(obj):
    if has_weakrefs(obj):
        weakref_list.append(obj)
    elif obj.needs_finalizing:
        obj.needs_finalizing = False
        finalizer_list.append(obj)
```

might be better. Weakref callback functions are just functions. There is nothing magical about them.
Sure, but with most functions we don't have resurrection concerns. You're suggesting replacing the GC algorithm with a state transition algorithm, which is currently not fully specified, so it's hard to see how we prevent objects from going into the finalizing stage more than once.
So let's not have resurrection. Once an object is dead, it is dead. The tricky part is getting there from where we are now 🙁
Oh, I see. But that's a change in the language.
Removing resurrection is not possible: there is a lot of code out there that relies on it. We have already explored that route many times when improving the GC, and the conclusion is that it will break many things in very obscure ways, so it is a very backwards-incompatible change.
We already prevent GC objects from being finalized multiple times as a consequence of PEP 442. That's what
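The PEP 442 guarantee can be observed directly (CPython-specific demo; the class name is invented): even if `__del__` resurrects the object, a later collection does not finalize it a second time.

```python
import gc

gc.disable()  # make collection timing explicit
calls = []
resurrected = []

class Phoenix:
    def __del__(self):
        calls.append("del")
        resurrected.append(self)  # resurrection: escape this collection

p = Phoenix()
p.cycle = p   # reference cycle, so only the GC can finalize it
del p
gc.collect()  # finalizer runs once; the object is rescued
assert calls == ["del"]

obj = resurrected.pop()
del obj       # still cyclic garbage, collected again below
gc.collect()
# PEP 442: already-finalized objects are not finalized again.
assert calls == ["del"]
```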
I am personally interested in making some headway towards realizing PEP 556. Other than @pitrou's progress, has anyone made any more changes? If so, is there a fork I can access? I have planned to work on this during the sprint, to also address other things related to the Python GC research I do. @pablogsal do you have any additional progress you made in addition to @pitrou's work?
(Sorry for the delay, I just returned from vacation) Yeah, I did some work on top of it (see #58 (comment)) (mainly fixing some obscure edge cases and bugs) but I need to search for it. Otherwise, I think I mostly remember where the problems were so I will be happy to collaborate in the sprint :)
Good, @pablogsal, when you are successful in the search, please let me know. @pitrou where is your fork? I can lurk at that too. I don't plan to work on it much before the sprint anyway, so let's plan to work on it then.
The initial work by @pitrou can be found here: https://github.com/pitrou/cpython/tree/threaded_gc
Noted, and thanks @pablogsal
Mark:
Pablo:
What if resurrection was not done in
@nanjekyejoannah - I'd like to join you in the sprint as well.
That is unfortunately still backwards incompatible, because current code that resurrects on purpose (or not) is doing it in
Notice that currently it is clear when an object is alive or dead, and also when it has been finalized or resurrected. The problem is that resurrections affect the GC and finalizers, and those semantics are guaranteed.
Sure, it would take several versions to deprecate resurrection in
IIUC the complication with resurrection is that you can't call finalise twice, so you can have an object which is still alive after being finalised.
Not by a lot. There is some complexity in resurrection, but it is not one of the biggest pain points of the GC, like for instance weak references and tp_clear. The code that handles resurrection is quite contained and, since some changes I worked on in Python 3.9, it doesn't even abort the full collection anymore.
Yeah, but those effects are mostly unproblematic. We just set a flag that says "this object has already been finalized", and for all purposes the resurrected object is alive like any other object.
Async generators are built-in objects whose semantics fundamentally rely on resurrection. Their
Antoine started working on this in PEP 556.
@pitrou, can you update us where you are with this idea?