Optimize class creation #132042
Comments
I would try if @AA-Turner doesn't pick this up :)
I think the whole concept of slots (the We should view the
For pure Python objects, all slots can be filled in with a function that does the dynamic lookup, which should be very quick. For classes defined by [...]
Also, see faster-cpython/ideas#146 (comment)
The bytecode for creating classes is also a bit of a mess. We seem to be creating code objects just to create functions just to call them, to do things that could easily be done inline. There is also a fair bit of machinery for finding the metaclass and the base class tuple. We should compute those in the interpreter and pass them into the class creation machinery. E.g., given a
For
For multiple inheritance I'm missing the code for setting
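The class-creation bytecode discussed in the comment above can be inspected with the standard `dis` module. The following is a minimal sketch, not taken from the thread: the class `C` and the helper `class_statement_ops` are hypothetical names, and the exact opcodes vary between CPython versions. It only shows that a `class` statement compiles to a nested code object that is wrapped in a function and then called.

```python
import dis
import types

# Hypothetical example class; any class statement behaves the same way.
SOURCE = "class C(object):\n    x = 1\n"


def class_statement_ops(source=SOURCE):
    """Compile a class statement and report how the compiler lowers it."""
    code = compile(source, "<example>", "exec")
    # Opcode names used at module level to build the class.
    opnames = [ins.opname for ins in dis.get_instructions(code)]
    # The class body is stored as a separate code object in the constants.
    nested = [c for c in code.co_consts if isinstance(c, types.CodeType)]
    return opnames, nested


if __name__ == "__main__":
    opnames, nested = class_statement_ops()
    print(opnames)
    print([c.co_name for c in nested])
    # Full disassembly, including the nested class-body code object.
    dis.dis(compile(SOURCE, "<example>", "exec"))
```

The module-level opcode list contains a `MAKE_FUNCTION` step for the class body, which is the "code object, then function, then call" sequence the comment refers to.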
Previous attempt in 2017: #76527
I have added some test results - #132156 (comment)
Ok, I got rid of
Currently, creating an empty class is about 70x slower than creating an empty function in my profiling. Classes are much more complex and it makes sense that they're slower to create, but 70x feels excessive. (Related: #118761.)
I ran some profiling on my Mac with a sample script that just made empty classes in a loop:
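A minimal sketch of such a comparison, assuming only the standard `timeit` module. This is not the original profiling script; the helper names `make_class`, `make_function`, and `benchmark` are hypothetical, and the measured ratio will depend on the machine and Python version.

```python
import timeit


def make_class():
    # Creating an empty class runs the full type-creation machinery.
    class C:
        pass
    return C


def make_function():
    # Creating an empty function only builds a function object.
    def f():
        pass
    return f


def benchmark(number=100_000):
    """Return (class_seconds, function_seconds) for `number` creations each."""
    class_time = timeit.timeit(make_class, number=number)
    func_time = timeit.timeit(make_function, number=number)
    return class_time, func_time


if __name__ == "__main__":
    class_time, func_time = benchmark()
    print(f"class:    {class_time:.3f}s")
    print(f"function: {func_time:.3f}s")
    print(f"ratio:    {class_time / func_time:.1f}x")
```

Wall-clock timing like this shows only the overall gap; finding where the time goes inside type creation needs a native profiler.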
A few things stood out:
- [...] tp_*, nb_*, etc. functions in the C struct for the type. We do this by iterating over all the slots, then looking up the function name (e.g., __add__) in the MRO and placing it in the slot for this class.
- [...] __add__ is both nb_add and sq_concat), and does that by iterating over all the slotdefs and finding other slots with the same name. It does that using some scratch space in the interpreter state, which does not seem thread-safe. I feel we could precompute the data instead, so we don't have to figure it out at runtime. For example, the slotdef struct could grow a new member to indicate whether or not the name is unique.

Most types will define very few of these slots, so it makes sense to look for an approach that does less work for slots without changes. I think something like this should work:

- [...] __dict__. For those slots only, perform an update.

This should make it possible to make class creation something like 2x faster. I haven't started working on implementing this and I may not have time to do it; if you see this and are interested, feel free to pick it up!
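The idea of precomputing the name-to-slot mapping and updating only the slots named in the class `__dict__` can be modeled in pure Python. This is a toy sketch, not CPython's implementation: the abbreviated `SLOTDEFS` table, the helper `fill_slots`, and the step of copying every slot from the base class first are assumptions made for illustration.

```python
# Hypothetical, heavily abbreviated slot table: (slot name, dunder name).
SLOTDEFS = [
    ("nb_add", "__add__"),
    ("sq_concat", "__add__"),
    ("nb_subtract", "__sub__"),
    ("tp_repr", "__repr__"),
    ("tp_hash", "__hash__"),
]

# Precomputed once, instead of scanning all slotdefs per class.
SLOTS_BY_NAME = {}
for _slot, _name in SLOTDEFS:
    SLOTS_BY_NAME.setdefault(_name, []).append(_slot)

# Models the proposed extra slotdef member: is this dunder name unique?
NAME_IS_UNIQUE = {name: len(slots) == 1 for name, slots in SLOTS_BY_NAME.items()}


def fill_slots(base_slots, namespace):
    """Return (slots, updated) for a new class in this toy model.

    base_slots: dict mapping slot name -> implementation from the base class.
    namespace:  the new class's __dict__ contents.
    """
    # Assumed step: start by inheriting every slot from the base unchanged.
    slots = dict(base_slots)
    updated = []
    # Only names actually present in the namespace trigger slot updates.
    for name, value in namespace.items():
        for slot in SLOTS_BY_NAME.get(name, ()):
            slots[slot] = value
            updated.append(slot)
    return slots, updated


if __name__ == "__main__":
    base = {slot: f"base_{slot}" for slot, _ in SLOTDEFS}
    slots, updated = fill_slots(base, {"__add__": "my_add", "x": 1})
    print(updated)
    print(slots)
```

With this shape, the work per class scales with the number of dunder names the class defines rather than with the total number of slots.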