Skip to content

gh-74690: Improve performance when testing against protocols with many attributes. #112157

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions Lib/typing.py
Original file line number Diff line number Diff line change
Expand Up @@ -1850,6 +1850,11 @@ def __instancecheck__(cls, instance):
if super().__instancecheck__(instance):
return True

# quick true negative check for protocols with many attributes.
if not hasattr(instance, '__getattr__'):
if not (cls.__protocol_attrs__ <= set(dir(instance))):
Copy link
Member

@AlexWaygood AlexWaygood Nov 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about classes that define custom __dir__() methods? What about classes that define custom __getattribute__ methods? I think this approach is very risky

Copy link
Contributor Author

@randolf-scholz randolf-scholz Nov 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, is there any C-implemented alternative to dir, that returns some iterable of the attribute names? I saw inspect.getmembers_static, but it is mostly plain python and way too slow. If custom __dir__ is a worry, one could maybe use object.__dir__(instance).

dir itself is even inefficient since it sorts the names, here we really only need something that iterates over the names.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In regard to the custom __getattribute__ stuff, I think the current code already suffers from that, no? The docs say about getattr_static:

Note: this function may not be able to retrieve all attributes that getattr can fetch (like dynamically created attributes) and may find attributes that getattr can’t (like descriptors that raise AttributeError).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any C-implemented alternative to dir, that returns some iterable of the attribute names?

Not that I know of

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In regard to the custom __getattribute__ stuff, I think the current code already suffers from that, no?

Ah indeed; good point

Copy link
Contributor Author

@randolf-scholz randolf-scholz Nov 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So really the issue here seems that

  1. getattr_static is just really slow (I re-ran the test, replacing it with plain getattr and got 11x speedup)
    • getattr can be slow if it triggers some computation.
  2. isinstance(x, proto) only checks statically known attributes and can give incorrect results if certain attributes are defined dynamically.
    • since Protocols are about static typing, this seems ok to do.
  3. There is nothing in the data-model for classes to provide something like a __hasattr__ to communicate which attributes are set dynamically.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your analysis is correct.

  • getattr_static is just really slow (I re-ran the test, replacing it with plain getattr and got 11x speedup)

    • getattr can be slow if it triggers some computation.
  • isinstance(x, proto) only checks statically known attributes and can give incorrect results if certain attributes are defined dynamically.

Note that we changed these two bullet points in Python 3.12 after long discussions in python/typing#1363 and #102433. Previously, isinstance() checks didn't use inspect.getattr_static, and did resolve dynamically defined attributes. But it was decided that that behaviour was bad, so we decided to accept the performance penalty in order to have more correct and less unexpected behaviour. (And through various other performance optimisations to compensate for that, isinstance() checks against runtime-checkable protocols are still overall much faster in Python 3.12 than in Python 3.11, despite the slowdown from switching to inspect.getattr_static.)

return False

getattr_static = _lazy_load_getattr_static()
for attr in cls.__protocol_attrs__:
try:
Expand Down