Move GC to a separate thread #120
Replies: 37 comments 1 reply
-
I abandoned work on it. The implementation worked, but there was an obscure crash that I didn't manage to diagnose.
-
Note the challenge is to run the test suite with threaded GC enabled:

```diff
diff --git a/Python/pylifecycle.c b/Python/pylifecycle.c
index 1467527da3..d6e8129849 100644
--- a/Python/pylifecycle.c
+++ b/Python/pylifecycle.c
@@ -901,7 +901,7 @@ _Py_InitializeMainInterpreter(PyInterpreterState *interp,
     }
     /* XXX allow setting GC mode with an env var? */
-    if (_PyGC_SetThreaded(0)) {
+    if (_PyGC_SetThreaded(1)) {
         Py_FatalError("_PyGC_SetThreaded failed");
     }
```
-
Thanks. I see the crash in `test__xxsubinterpreters`. Also a few other tests are failing due to what looks like finalizers not being called. I'll try to understand what you did.
-
I actually ended up fixing some of the finalizer bugs, but I need to find where I stored that work 😅 In any case, it is interesting to note that my benchmarks showed a slight slowdown in general (although not much). I suspect this is due to the extra context switching. In any case, I don't think this will actually end up giving us any speedups, although it could help with the problems described in PEP 556.
-
Indeed, speedups were not the goal; rather, making the GC context more predictable (instead of GC-induced code such as finalizers running at potentially any point). Even latency would probably not improve very much, because of the GIL (I don't know how workable it would be to release the GIL at some points during garbage collection).
-
Note that this GC execution context issue is actually extremely serious for real-world code (anything with complex finalizers with side effects). It also motivated the creation of the SimpleQueue object, which can help alleviate the problem, but at the cost of additional complexity (and potential bugs or performance issues as well).
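For context on why `queue.SimpleQueue` helps here: its C-implemented `put()` is reentrant (a CPython implementation detail), so a finalizer that the GC runs at an arbitrary point can safely record work on it, unlike `queue.Queue`, whose pure-Python `put()` takes a non-reentrant lock. A minimal sketch (the `Tracked` class is invented for illustration):

```python
# queue.SimpleQueue (Python 3.7+) has a reentrant put(), so it is safe to
# call from __del__ methods and weakref callbacks, even if they fire at an
# awkward point (e.g. while another thread is in the middle of a put()).
import queue

events = queue.SimpleQueue()

class Tracked:
    def __del__(self):
        # Defer the real work: just enqueue a note for later processing.
        events.put("finalized")

obj = Tracked()
del obj                       # CPython: refcount hits zero, __del__ runs now
assert events.get() == "finalized"
```

The design point is the same one made above: the finalizer itself does almost nothing, and the side effects run later at a controlled point when the queue is drained.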
-
Yep, that's why I was interested in pushing it in the first place, but I got side-tracked by the parser work and by extinguishing the fires of the 3.9 release :(
-
OTOH in the context of "Faster CPython" the idea was to make things faster by letting a second core handle (most of) the GC work -- this is what some other languages do. But it would require something like the Gilectomy first.
-
Right, but those "other languages" are generally fully GC-based. In CPython, much of the cost of reclaiming memory is borne by regular reference counting.
-
Regardless of where we run the "GC", i.e. the cycle finder and breaker, it would make sense to improve the general decref/dealloc code. To keep dealloc efficient we want to free memory for simple objects without having to chase function pointers, and to avoid worrying about finalizers and weakrefs when clearing internal references. Something like this (in C-flavoured Python):

```python
@inline
def Py_DECREF(obj):
    # Refcount never reaches 0 unless the object is genuinely unreachable.
    if obj.refcount == 1:
        dealloc(obj)
    else:
        obj.refcount -= 1

def dealloc(obj):
    if has_weakrefs(obj) or needs_finalizing(obj):
        finalizer_list.append(obj)
    elif has_internal_references(obj):
        recursive_clear_and_free(obj)
    else:
        # Objects that cannot refer to other objects, e.g. ints, strs.
        # Implicitly reduces the refcount to zero.
        free(obj)

def recursive_clear_and_free(obj):
    # obj cannot be "resurrected" as there are no references to it, strong or weak.
    if tstate.recursive_dealloc > TRASHCAN_LIMIT:  # Currently the limit is 50?
        trashcan_list.append(obj)
        return
    tstate.recursive_dealloc += 1
    for offset in obj.__class__.layout:
        item = obj[offset]
        obj[offset] = NULL
        Py_DECREF(item)
    assert obj.refcount == 1
    # Implicitly reduces the refcount to zero.
    free(obj)
    tstate.recursive_dealloc -= 1
    if tstate.recursive_dealloc == 0:
        while trashcan_list:
            obj = trashcan_list.pop()
            Py_DECREF(obj)
        if finalizer_list and not RUN_FINALIZERS_IN_THREAD:
            clear_finalizer_list()

def clear_finalizer_list():
    while finalizer_list:
        item = finalizer_list.pop()
        if has_weakrefs(item):
            do_weakref_callbacks(item)
        if has_finalizer(item):
            finalize(item)
        Py_DECREF(item)

if RUN_FINALIZERS_IN_THREAD:
    def thread_run_clear_finalizer_list():
        while True:
            clear_finalizer_list()
            # Sleep until there is something to do.
```

I have done absolutely no testing on the above, so it will be incorrect. Hopefully the general idea is clear, though.
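To make the control flow above concrete, here is a tiny runnable model of the same idea in plain Python. All names here (`Obj`, `FINALIZER_LIST`, `decref`, ...) are inventions for illustration, not CPython internals. The property it demonstrates: dropping the last reference to an object with a finalizer only queues it; the finalizer and the real clear/free happen later, at one predictable point.

```python
FINALIZER_LIST = []   # objects awaiting finalization
FREED = []            # names of objects whose "memory" was released
log = []              # finalizer side effects

class Obj:
    def __init__(self, name, refs=(), finalizer=None):
        self.name = name
        self.refcount = 1
        self.refs = list(refs)          # strong internal references
        self.finalizer = finalizer
        for r in self.refs:
            r.refcount += 1

def decref(obj):
    if obj.refcount == 1:
        dealloc(obj)
    else:
        obj.refcount -= 1

def dealloc(obj):
    if obj.finalizer is not None:
        FINALIZER_LIST.append(obj)      # defer: do not finalize here
    else:
        for r in obj.refs:              # clear internal references
            decref(r)
        obj.refs = []
        FREED.append(obj.name)          # "free" the memory

def clear_finalizer_list():
    while FINALIZER_LIST:
        item = FINALIZER_LIST.pop()
        fin, item.finalizer = item.finalizer, None   # finalize at most once
        fin(item)
        dealloc(item)                   # finalizer is now None: clear & free

leaf = Obj("leaf")
root = Obj("root", refs=[leaf],
           finalizer=lambda o: log.append("finalize " + o.name))
decref(leaf)             # root still holds a reference; leaf stays alive
decref(root)             # root is only queued; nothing is finalized yet
assert log == [] and FREED == []
clear_finalizer_list()   # the single predictable point where finalizers run
```

After the drain, the finalizer has run exactly once, and freeing `root` transitively freed `leaf`.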
-
@markshannon Arranging the GC like that would make it more predictable but could slow it down, right? (I'm assuming that the trashcan only kicks in after 50 levels because it's slower, so not worth it for small chains.)
-
I think it should be quicker, as type-specific dealloc code doesn't need to worry about trashcans, finalizers or resurrection. Maintaining a stack of objects and making deallocation iterative rather than recursive should be quicker still: pushing a single pointer is going to be faster than making a C call.
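A sketch of that iterative approach (plain Python with an invented object-graph model, not CPython code): instead of a C call recursing into dealloc for each child, children are pushed onto an explicit stack and processed in a loop, so no recursion limit or trashcan is needed.

```python
# Iterative deallocation over a toy object graph. refcount/children are
# dicts modelling the heap; "freeing" an object pushes its children instead
# of recursing into them.
def iterative_free(root, refcount, children, freed):
    stack = [root]                        # pending decrefs
    while stack:
        obj = stack.pop()
        refcount[obj] -= 1
        if refcount[obj] == 0:
            stack.extend(children[obj])   # push, don't recurse
            freed.append(obj)             # "free" the object's memory

# a -> b, a -> c, b -> c; only a is externally referenced.
refcount = {"a": 1, "b": 1, "c": 2}
children = {"a": ["b", "c"], "b": ["c"], "c": []}
freed = []
iterative_free("a", refcount, children, freed)
```

However deep the reference chain, the C stack depth stays constant; only the Python-level list grows.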
-
I'm assuming you meant that the last line of `clear_finalizer_list` can trigger `dealloc` again. I think this can get complicated, because we need to make sure that we don't add the same object twice to the `finalizer_list`. Another question: if we consume the `finalizer_list` on a separate thread, does it live on the interpreter state rather than the thread state? Or does the separate thread look at the different threads' lists? In either case, doesn't this require locking?
-
This is indeed an extremely delicate area of the GC where very obscure bugs have been found over the years. I would be very careful about introducing any changes to that part of the GC machinery without a very good reason, because causing obscure regressions is super easy, and it is even more dangerous when you involve weak references with types that have `__del__` methods.
-
Why is this fragile? It shouldn't be. I believe the reason it is so fragile is that we break the most important invariant of refcounting: that the refcount never reaches zero unless the object is genuinely unreachable. The algorithm above (#58 (comment)) should maintain that invariant, so weakref callbacks and finalizers can do as they please.

Yes. The code should be:

```python
def dealloc(obj):
    if has_weakrefs(obj) or obj.needs_finalizing:
        obj.needs_finalizing = False
        finalizer_list.append(obj)
    ...
```
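A tiny runnable demonstration of why the flag must be cleared when the object is queued (all names invented for illustration): the decref at the end of the drain loop re-enters `dealloc`, and without the flag reset the object would simply be re-queued instead of freed.

```python
# Toy model: an object is queued for finalization exactly once, then the
# second pass through dealloc (triggered by the drain loop's decref) sees
# the cleared flag and actually frees it.
finalizer_list = []
events = []

class Obj:
    def __init__(self):
        self.refcount = 1
        self.needs_finalizing = True

def dealloc(obj):
    if obj.needs_finalizing:
        obj.needs_finalizing = False   # the fix: mark as already queued
        finalizer_list.append(obj)
    else:
        events.append("freed")

def decref(obj):
    if obj.refcount == 1:
        dealloc(obj)
    else:
        obj.refcount -= 1

def clear_finalizer_list():
    while finalizer_list:
        item = finalizer_list.pop()
        events.append("finalized")
        decref(item)   # re-enters dealloc; flag is clear, so no re-queue

decref(Obj())
clear_finalizer_list()
```

With the flag reset, the event sequence is finalize-then-free; without it, the drain loop would never terminate.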
-
I am personally interested in making some headway toward realizing PEP 556. Other than @pitrou's progress, has anyone made any more changes? If so, is there a fork I can access? I have planned to work on this during the sprint, alongside other things related to the Python GC research I do. @pablogsal, do you have any additional progress on top of @pitrou's work?
-
(Sorry for the delay, I just returned from vacation.) Yeah, I did some work on top of it (see #58 (comment)), mainly fixing some obscure edge cases and bugs, but I need to search for it. Otherwise, I mostly remember where the problems were, so I will be happy to collaborate at the sprint :)
-
Good, @pablogsal. When your search is successful, please let me know. @pitrou, where is your fork? I can lurk at that too. I don't plan to work on it much before the sprint anyway, so let's plan to work on it then.
-
The initial work by @pitrou can be found here: https://github.com/pitrou/cpython/tree/threaded_gc |
-
Noted and thanks @pablogsal |
-
Mark:
Pablo:
What if resurrection was not done in `tp_dealloc`? @nanjekyejoannah, I'd like to join you in the sprint as well.
-
That is unfortunately still backwards incompatible, because current code that resurrects on purpose (or not) is doing it in `tp_dealloc`. Notice that currently it is clear when an object is alive or dead, and also when it has been finalized or resurrected. The problem is that resurrections affect the GC and finalizers, and those semantics are guaranteed.
-
Sure, it would take several versions to deprecate resurrection in `tp_dealloc`. IIUC the complication with resurrection is that you can't call finalise twice, so you can have an object which is still alive after being finalised.
-
Not by a lot. There is some complexity in resurrection, but it is not one of the biggest pain points of the GC (those are things like weak references and `tp_clear`). The code that handles resurrection is quite contained and, since some changes I worked on in Python 3.9, it doesn't even abort the full collection anymore.

Yeah, but those effects are almost non-problematic. We just set a flag that says "this object has already been finalized", and for all purposes the resurrected object is alive like any other object.
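The "already finalized" flag is observable from pure Python today. Since PEP 442, `__del__` runs at most once per object, so a resurrected object is alive like any other but is not finalized again when it dies for real (the `Phoenix` class below is invented for illustration):

```python
# CPython behaviour since 3.4 (PEP 442): an object's finalizer is called at
# most once, even if the finalizer resurrects the object.
calls = []
resurrected = []

class Phoenix:
    def __del__(self):
        calls.append("del")
        resurrected.append(self)   # resurrection: create a new reference

obj = Phoenix()
del obj                  # __del__ runs and resurrects the object
assert calls == ["del"]
resurrected.clear()      # drop the last reference again
assert calls == ["del"]  # __del__ is NOT called a second time
```

This is exactly the "set a flag" behaviour described above: the object stays fully usable after resurrection, only its finalization is skipped the second time.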
-
If you move the GC to a separate thread, then please include an API to run functions in that thread, too. Language bindings like JCC or PythonNet must be able to register the thread with the Java JVM and the .NET CLR. Otherwise, GC of a Python object holding a Java reference can lead to a segfault.
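A hypothetical sketch of what such an API could look like (every name here — `GCThread`, `on_start`, `call_in_gc_thread` — is invented; this is a plain-Python model, not a proposed CPython API): bindings register start-up callbacks that run once in the GC thread, e.g. to attach it to a JVM or CLR, and can also submit work to it.

```python
# A worker thread with registration hooks, modelling "run functions in the
# GC thread". Real bindings would pass e.g. a JVM attach call as on_start.
import queue
import threading

class GCThread:
    def __init__(self, on_start=()):
        self._work = queue.Queue()
        self._thread = threading.Thread(
            target=self._run, args=(list(on_start),), daemon=True)
        self._thread.start()

    def _run(self, on_start):
        for cb in on_start:
            cb()                  # e.g. jvm.AttachCurrentThread()
        while True:
            fn = self._work.get()
            if fn is None:        # sentinel: shut down
                break
            fn()                  # GC work / user-submitted functions

    def call_in_gc_thread(self, fn):
        self._work.put(fn)

    def stop(self):
        self._work.put(None)
        self._thread.join()

seen = []
g = GCThread(on_start=[lambda: seen.append("attached")])
g.call_in_gc_thread(lambda: seen.append("work"))
g.stop()   # joins the thread, so all queued work has completed
```

Because the registration callback runs before any work items, a binding's thread-attachment is guaranteed to happen before any finalizer touches a foreign reference.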
-
Async generators are built-in objects whose semantics fundamentally rely on resurrection.
-
@njsmith Objects can do whatever they like in their `__del__` methods. It is up to the VM to prevent resurrection by calling the `__del__` method *before* the ref-count drops to zero. You can't resurrect an object if it isn't dead. The problem is C code that resurrects objects during deallocation.
-
One other thing we could do is to run finalizers in a more restricted environment.
-
I also agree, but I wonder how these restrictions could be enforced. Do we need a per-thread flag saying "a finalizer is running" that we check for specific operations? (It would be a counter that is incremented when a finalizer is called and decremented when it returns.)
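A minimal sketch of that per-thread counter, in plain Python (all names — `finalizer_depth`, `run_finalizer`, `restricted_operation` — are invented for illustration; CPython would do this in C on the thread state):

```python
# A thread-local depth counter incremented around each finalizer call.
# Operations that should be forbidden inside finalizers check the counter.
import threading

_state = threading.local()

def finalizer_depth():
    return getattr(_state, "finalizer_depth", 0)

def run_finalizer(fin, obj):
    _state.finalizer_depth = finalizer_depth() + 1
    try:
        fin(obj)
    finally:
        _state.finalizer_depth -= 1

def restricted_operation():
    if finalizer_depth() > 0:
        raise RuntimeError("operation not allowed inside a finalizer")
    return "ok"

blocked = []

def careless_finalizer(obj):
    try:
        restricted_operation()
    except RuntimeError:
        blocked.append(True)

run_finalizer(careless_finalizer, object())
```

Using a counter rather than a boolean handles nested finalizers (a finalizer whose decrefs trigger another finalizer) correctly.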
-
I think "resurrection" has usually been used to mean "a `__del__` method creates new reachable references to the object so that it is no longer a candidate for collection"? If you mean a more specific thing like "objects never transition from refcount 0 to 1", then that makes sense to me as a concept, but to avoid confusion you should spell that out instead of just saying "resurrection" and assuming everyone knows what you mean :-)

On Mon, Mar 7, 2022, Mark Shannon wrote:

> @njsmith Objects can do whatever they like in their `__del__` methods. It is up to the VM to prevent resurrection by calling the `__del__` method *before* the ref-count drops to zero. You can't resurrect an object if it isn't dead. The problem is C code that resurrects objects during deallocation.
-
Antoine started working on this in PEP 556.
@pitrou, can you update us where you are with this idea?