Skip to content

Specialize access to Enum attributes #95004

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
markshannon opened this issue Jul 19, 2022 · 25 comments
Open

Specialize access to Enum attributes #95004

markshannon opened this issue Jul 19, 2022 · 25 comments
Assignees
Labels
3.12 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@markshannon
Copy link
Member

Currently we don't specialize loading of enum attributes.

class Color(Enum):
     RED = "Red"

Color.RED   # This load is not specialized

We can take advantage of the fact that type(Color.RED) == Color when specializing.
Since we check the version number of Color in LOAD_ATTR_CLASS, we are implicitly checking the version of type(Color.RED).
If Color.RED is not a descriptor when specialized, it will not be a descriptor when executed.
Therefore, we can specialize it in the same way as immutable, non-descriptor class attributes using LOAD_ATTR_CLASS.

@Fidget-Spinner
@brandtbucher

@Fidget-Spinner
Copy link
Member

Are you working on this?

@markshannon
Copy link
Member Author

Are you working on this?

No. If you want to do this, just assign yourself.
It might be worth waiting for 3.11 to calm down a bit first.

@Fidget-Spinner Fidget-Spinner self-assigned this Jul 19, 2022
@brandtbucher
Copy link
Member

I imagine that this will piggyback off of #93988, since EnumType defines __getattr__?

@ethanfurman
Copy link
Member

__getattr__ is no longer necessary, and will be removed in 3.12 (since I forgot to do it in 3.11).

@ethanfurman
Copy link
Member

If Color.RED is not a descriptor when specialized, it will not be a descriptor when executed. Therefore, we can specialize it in the same way as immutable, non-descriptor class attributes using LOAD_ATTR_CLASS.

Even if Color.__dict__['RED'] is a descriptor, its result will still always be the member Color.RED when called on the class. Is that optimizable?

In case it matters, the exact descriptor would be enum.property -- the big difference between it and the standard property is the behavior when called on the class -- either the matching enum member, or an AttributeError.

@brandtbucher
Copy link
Member

brandtbucher commented Jul 19, 2022

Even if Color.__dict__['RED'] is a descriptor, its result will still always be the member Color.RED when called on the class. Is that optimizable?

Interesting. Looking at it now, I'm not sure why LOAD_ATTR_CLASS currently only handles methods and non-descriptors. As you note, it should be able to handle anything (at least anything without a descriptor in the metaclass, which we already guard against), including descriptors in the class __dict__.

@Fidget-Spinner?

@Fidget-Spinner
Copy link
Member

@brandtbucher it's probably an oversight in how I implemented it. I was being conservative at the time (also I admit just using the direct results from the analyze_descriptor function for simplicity).

@Fidget-Spinner
Copy link
Member

Some complications with specialising enums:

  • Inheriting from enums uses metaclasses. Which we (just) disabled.
  • The enum metaclass is a mutable class. Which we don't specialize (it may be dangerous to).
  • The enum property is a descriptor.

We probably need a specialisation for metaclasses, like what Brandt suggested in his other PR.

@markshannon
Copy link
Member Author

Inheriting from enums uses metaclasses. Which we (just) disabled.

All classes have a metaclass. The issue is whether the metaclass is mutable, has overrides, or shadows class attributes.
This is much the same as for normal instances and their classes, which is why we have analyze_descriptor

The enum metaclass is a mutable class.

This is the only problem here, AFAICT. The easy fix is to make EnumType immutable. It doesn't need to be mutable.
A more general, and much more complex fix is to update the version number of all classes that are instances of a metaclass, when that metaclass changes.

The enum property is a descriptor.

Properties are descriptors. Which property are you talking about?

@Fidget-Spinner
Copy link
Member

Properties are descriptors. Which property are you talking about?

This part is wrong. I initially assumed enum members defined __get__ and __set__. I don't think they do.

@brandtbucher
Copy link
Member

This part is wrong. I initially assumed enum members defined __get__ and __set__. I don't think they do.

No, I think it's just name and value that do that. Still something to consider, though.

@brandtbucher
Copy link
Member

I was being conservative at the time (also I admit just using the direct results from the analyze_descriptor function for simplicity).

We're probably going to want to revisit this soon. Looking at it with fresh eyes, things like the GETSET_OVERRIDDEN checks are incorrect in this context too.

This is the only problem here, AFAICT. The easy fix is to make EnumType immutable. It doesn't need to be mutable.
A more general, and much more complex fix is to update the version number of all classes that are instances of a metaclass, when that metaclass changes.

It seems to me that a much simpler solution would just be to version the metaclass too (main...brandtbucher:cpython:load-attr-metaclass).

@markshannon
Copy link
Member Author

It seems to me that a much simpler solution would just be to version the metaclass too

Simpler, but slower. It is probably the best thing to do for now, we can always do the more complex thing later.

@ethanfurman
Copy link
Member

The enum metaclass is a mutable class.

This is the only problem here, AFAICT. The easy fix is to make EnumType immutable. It doesn't need to be mutable. A more general, and much more complex fix is to update the version number of all classes that are instances of a metaclass, when that metaclass changes.

Is there anything I can do to make EnumType immutable?

@brandtbucher
Copy link
Member

As far as I know, the only way is to set the right flags on the type at the C level. So, short of reimplementing EnumType in C, there probably isn't a simple way to do it without hacks (like exposing a C function somewhere that you can decorate your type with).

@brandtbucher
Copy link
Member

brandtbucher commented Jul 20, 2022

So let's just version the metaclass for now. As soon as EnumType removes its __getattr__ function (and we iron out some of the quirks of class attribute specialization), we'll be able to specialize Enum (and many others, like sympy) attribute accesses without resorting to hacks.

I'll open a PR in a bit.

@ethanfurman
Copy link
Member

This part is wrong. I initially assumed enum members defined __get__ and __set__. I don't think they do.

No, I think it's just name and value that do that. Still something to consider, though.

In EnumClass.__dict__ the member names, along with value and name and any other enum.propertys assigned by the user, are descriptors. The members themselves are not.

@brandtbucher
Copy link
Member

Ah, thanks.

Also, I was wrong earlier: I totally forgot that descriptor __get__ methods still need to be called (without an instance) on class attribute accesses too. So it's not safe to cache any descriptors here, as I previously thought. :(

@ethanfurman
Copy link
Member

In EnumClass.__dict__ the member names ... are descriptors.

At least they will be once the performance is improved (which I believe it is even now in 3.12) -- in 3.11 they are only descriptors if the enum class defines a member which matches a property in the inheritance chain:

class ConfusingEnumBase(Enum):
    @enum.property
    def attribute(self):
        return 'example'

class ConfusingEnum(ConfusingEnumBase):
    cls = 1
    decorator = 2
    attribute = 3

In the above, the cls and decorator members live directly in ConfusingEnum.__dict__, while attribute is a descriptor in ConfusingEnum.__dict__.

@ethanfurman
Copy link
Member

ethanfurman commented Jul 20, 2022

Also, I was wrong earlier: I totally forgot that descriptor __get__ methods still need to be called (without an instance) on class attribute accesses too. So it's not safe to cache any descriptors here, as I previously thought. :(

Even if the result of calling __get__ on the class never changes? Is there a C hack to mark such things as immutable so it only needs to be called once and then the result cached?

@brandtbucher
Copy link
Member

brandtbucher commented Jul 20, 2022

Even if the result of calling __get__ on the class never changes? Is there a C hack to mark such things as immutable so it only needs to be called once and then the result cached?

We don't cache the results of function calls anywhere, since in general that's an extremely fragile semantic break. We do things like caching the descriptors themselves, but not the result of calling any Python code. It also puts us in a really nasty situation that we've so far avoided: storing strong references in the inline caches.

Realistically, the best we can probably do without adding caches is caching the descriptor and inlining the __get__ call. But I'm not sure if that will make enough difference to matter (and we still won't have __getattr__ handling until something like #93988 is in).

@warsaw
Copy link
Member

warsaw commented Jul 20, 2022

I'm not in an environment where I can research this so I'll just ask: is the enum specialization being discussed here limited only to stdlib enums? I ask because I've been thinking about resurrecting my flufl.enum package (a precursor to the stdlib library). I'd promote it as a limited, simplified enum library. It might not get much use and I'm not sure whether it would even benefit from enum-specific specialization, but I'd like to know if this is a general specification or something that only benefits the stdlib package (not that that's a bad thing).

@Fidget-Spinner
Copy link
Member

is the enum specialization being discussed here limited only to stdlib enums

Most likely not. The specializations work on things with the same "shape", "type" or belonging in the same "family" (behavior-wise). So anything specializing stdlib enums would also work for anything that has the same characteristics (ie, mutable metaclass, attribute belonging to a class).

If your enum implementation is simpler, we likely already specialize for it. If you want to check the specialization works, run your enums in a loopy function then pass that to dis.dis(f, adaptive=True) in 3.12. If you see weird LOAD_ATTR opcodes (other than LOAD_ATTR_ADAPTIVE. It means it worked.

@gryznar
Copy link

gryznar commented Oct 26, 2022

Is it currently possible in a satisfying way to make EnumType immutable? Unfortunately, in the current situation, implementing mixins which are inherited from mutable objects has no sense and implementing immutability on the user's side is very hard (if at all possible).

@ethanfurman
Copy link
Member

The above comment is based on this Stackoverflow question. As I mention there, the immutability of EnumType has no bearing on enum members being mutable (and it's the mutability of the members that is the problem).

@iritkatriel iritkatriel added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Nov 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
Projects
None yet
Development

No branches or pull requests

7 participants