Skip to content

Kill __subclasscheck__ #283

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 28 commits into from
Sep 27, 2016
Merged

Kill __subclasscheck__ #283

merged 28 commits into from
Sep 27, 2016

Conversation

ilevkivskyi
Copy link
Member

@ilevkivskyi ilevkivskyi commented Sep 18, 2016

This PR:

The idea is to make most things not classes. Now _ForwardRef(), TypeVar(), Union[], Tuple[], Callable[] are not classes (i.e. new class objects are almost never created by typing).

Using isinstance() or issubclass() rises TypeError for almost everything. There are exceptions:

  • Unsubscripted generics are still OK, e.g. issubclass({}, typing.Mapping). This is done to (a) not break existing code by addition of type information; (b) to allow using typing classes as a replacement for collections.abc in class and instance checks. Finally, there is an agreement that a generic without parameters assumes Any, and Any means fallback to dynamic typing.
  • isinstance(lambda x: x, typing.Callable) is also OK. Although Callable is not a generic class, when unsubscribed, it could be also used as a replacement for collections.abc.Callable.
  • The first rule for generics makes isinstance([], typing.List) possible, for consistency I also allowed isinstance((), typing.Tuple).

Finally, generics should be classes, to allow subclassing, but now the outcome of __getitem__ on classes is cached. I use an extended version of functools.lru_cache that allows fallback to non-cached version for unhashable arguments.

This is still WIP, since I did this only for Python 3.

@gvanrossum @JukkaL Please take a look.

@ilevkivskyi
Copy link
Member Author

@gvanrossum @JukkaL
It looks like this PR also fixes http://bugs.python.org/issue26075, http://bugs.python.org/issue25830, and http://bugs.python.org/issue26477

I added this to the description.

@JukkaL
Copy link
Contributor

JukkaL commented Sep 20, 2016

I did a quick pass on the diff and tried it with mypy, and I like it. The changes seem to speed up a no-op mypy invocation (mypy) by roughly 20%, which is nice (though I didn't do very careful benchmarking). There still seems to be maybe a 5-10% slowdown in startup time due to type annotations, but this seems acceptable already.

The generic type object cache hit rate when running mypy -x was about 78% -- increasing the cache size to 50 improved that to 80%. A further increase only had a marginal effect. The improvement with a larger cache might be higher to larger codebases, but it's hard to tell.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Here are some nits. My main concern is over __slots__ but even that's minor. I haven't reviewed everything (in fact I only reviewed src/typing.py) but I think we should just go ahead and release this with 3.6b2 and hope for the best. Maybe we'll get some real-world feedback from that and we can tweak things before 3.5.3 is finalized (sorry, there's no date in PEP 478, but we'll get ample warning, and I expect it to be around the same time as 3.6.0 goes out, mid December according to PEP 494).

@@ -1,5 +1,6 @@
import abc
from abc import abstractmethod, abstractproperty
from functools import lru_cache, wraps
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather see import functools, like the following imports. (abstractmethod and -property are special because they are used a lot, and also well-known).

@@ -130,27 +131,43 @@ def __repr__(self):
return '%s.%s' % (self.__module__, _qualname(self))


class Final:
class TypingBase(metaclass=TypingMeta, _root=True):
Copy link
Member

@gvanrossum gvanrossum Sep 25, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this class name should start with _, and ditto for Final below (and probably also TypingMeta above).

class TypingBase(metaclass=TypingMeta, _root=True):
"""Indicator of special typing constructs."""

# things that are not classes also need these
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you capitalize sentences and add a period at the end?

"""Wrapper to hold a forward reference."""

def __new__(cls, arg):
def __init__(self, arg):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but I still think it should call super().

assert isinstance(impl_type, type), repr(impl_type)
assert not isinstance(impl_type, TypingMeta), repr(impl_type)
self.name = name
self.type_var = type_var
self.type_var = _type_check(type_var, "Type alias accepts only types")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Questionable, since this is not an end user API. If this error ever were to trigger that would mean the author of typing.py itself had made a mistake. That's why it was just an assert.

@@ -400,9 +394,9 @@ def longest(x: A, y: A) -> A:
A.__constraints__ == (str, bytes)
"""

def __new__(cls, name, *constraints, bound=None,
def __init__(self, name, *constraints, bound=None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I'd still like the super() method called.

return '{}.{}'.format(cls.__module__, cls.__name__[1:])


class Final(TypingBase, _root=True):
Copy link
Member

@gvanrossum gvanrossum Sep 25, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename to _FinalTypingBase?

class Final:
class TypingBase(metaclass=TypingMeta, _root=True):
"""Indicator of special typing constructs."""

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have a __slots__ here too, otherwise the __slots__ in the subclass is pointless. Also all the final classes (and probably also those inheriting from TypingBase) should have their own __slots__.

(I guess it's a little-known fact that __slots__ only suppresses the __dict__ when used for every class in the MRO...)

@@ -900,6 +823,17 @@ def _next_in_mro(cls):
return next_in_mro


def tp_cache(func):
cached = lru_cache(maxsize=50)(func)
Copy link
Member

@gvanrossum gvanrossum Sep 25, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this cache size way too small? I'd expect that a program that benefits from this cache would be pretty large so it would have way more than 50 different generic type instantiations (if I understand that that's what it's caching -- it looks like it's only used on __getitem__ in GenericMeta).

Though when the cache works it's amazing -- I measured 20x speedup on List[int]!

Also, I printed the cache info for running mypy itself, and there 50 seems fine -- with maxsize=50 there are 120 misses, while with maxsize=1000 there are 115 misses, and currsize is also 115, so it seems mypy itself contains 115 distinct generic type instantiations. (The hit count when running on itself was 1071, so that would point to roughly 10x speedup.)

The default maxsize for lru_cache() is 128 -- perhaps just leave it up to that?

@ilevkivskyi ilevkivskyi changed the title [WIP] Kill __subclasscheck__ Kill __subclasscheck__ Sep 26, 2016
@ilevkivskyi
Copy link
Member Author

@gvanrossum
Thank you for review! I agree with all comments and implemented them (also in /python2). Except for one: I added _ to _TypingBase and _FinalTypingBase as you requested, but not to TypingMeta, there are few reasons for this:

  • It was around for a long time (probably since beginning).
  • Adding underscore there would probably also require adding a lot of them to TupleMeta etc. in /python2.
  • I think we could keep TypingMeta as a part of "semi-public" API (just in case someone wants to make extensions to typing module or mypy).

@ilevkivskyi
Copy link
Member Author

@JukkaL
I agree that the cache size should be increased. The idea is that projects that will use typing and mypy are probably very large (this is also Guido's opinion). Now the cache size is 128.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to merge this as-is. While there are a few things left to do, it's easier to do them in a follow-up PR -- this one is just too big to review already.


__slots__ = ()


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one blank line is required.

isinstance(args[1], tuple)):
# Close enough.
raise TypeError("Cannot subclass %r" % cls)
return object.__new__(cls)
Copy link
Member

@gvanrossum gvanrossum Sep 26, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this use super()?


class Final:

class _FinalTypingBase(_TypingBase, _root=True):
"""Mix-in class to prevent instantiation."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd update the docstring to hint that it prevents instantiation unless _root=True is given.

assert isinstance(impl_type, type), repr(impl_type)
assert not isinstance(impl_type, TypingMeta), repr(impl_type)
assert isinstance(type_var, (type, _TypingBase))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add repr(type_var) as the message?

def __new__(cls, *args, **kwds):
"""Constructor.

This only exists to give a better error message in case
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe getting a better error isn't enough reason to insert an extra function call? Or does the cache make this moot?

Union[Manager, int, Employee] == Union[int, Employee]
Union[Employee, Manager] == Employee

- Corollary: if Any is present it is the sole survivor, e.g.::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Over at mypy we've convinced ourselves (with some difficulty!) that this is wrong. Union[int, Any] should not be simplified to Any. Intuitively, the reason is that if x has type Union[int, Any] then calling x.lower() would be a mistake because it's wrong to call .lower() on an int (even though it's okay to call it on an Any). More about this at python/mypy#1642.


Union[int, Any] == Any

- Similar for object::
Copy link
Member

@gvanrossum gvanrossum Sep 27, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this statement is not in question -- in fact this is the reason that we were initially misguided about Union[int, Any]. Technically it just falls under the subclass rule stated above. (You could edit the docstring to make this the corollary.)

@@ -499,22 +597,16 @@ def __new__(cls, name, bases, namespace, parameters=None, _root=False):
for t1 in params:
if t1 is Any:
Copy link
Member

@gvanrossum gvanrossum Sep 27, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess getting rid of this check should be sufficient to avoid simplifying away Any. (Also the comment above should be updated.)

try:
return cached(*args, **kwds)
except TypeError:
pass # Do not duplicate real errors.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is mysterious. I suppose you mean that we only catch TypeErrors so that other errors won't be masked by this. And I recall from a previous version that the issue here is that maybe an argument may not be hashable, and then we want the real error from the uncached function (I think). But TypeError would also be raised for various bugs in func (in general TypeError usually means there's a bug). In the end I think it's fine to just catch TypeError, because the uncached function should reproduce the error exactly. It would also be fine to catch Exception, but presumably unnecessary. So I'm just asking for a better comment. :-)


- You can use Optional[X] as a shorthand for Union[X, None].
Optional[X] is equivalent to Union[X, type(None)].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While type(None) is technically correct I'm not sure how much it adds to mention it here (especially since Union[x, None] and Union[x, type(None)] are always equivalent :-).

@gvanrossum
Copy link
Member

Feel free to iterate more on this! Especially the Union[Any, int] issue. But I figured we might as well take a checkpoint, so this is now merged (also into CPython 3.5, 3.6 and 3.7==default). I closed the three CPython bugs you mentioned.

@ilevkivskyi
Copy link
Member Author

@gvanrossum
Thank you for merging this!
I will submit a new PR for Union[int, Any] in typing.py and probably also PR for PEPs 484 and 483.

This is a good catch, I didn't notice this while editing PEP 483. The point is that type is characterized by two sets: set of values and set of functions (methods), as mentioned in PEP 483. With such definition neither int is not a subtype of Any (since int has less methods) nor vice versa (since Any has more values). The fact that Any is at the same time at the top of type hierarchy (it has all values) and at its bottom (it has all methods) requires us to introduce two different concepts: subtype and compatibility.

Actually, now I think that Any should be considered a subtype of object. Indeed, both have all values, but Any has more methods, therefore it is a subtype of object. This, in turn, means that Union[object, Any] should be simplified to just object. For example, if x has type Union[object, Any] then the only way to call x.lower() is when x is not object, but this is not possible:

x: Union[object, Any]
x.lower()  # this should be rejected
if not isinstance(x, object):
    x.lower()  # this should be allowed, but it is not reachable

What do you think?

(If it will help, the relation between object and Any is similar to relation between, e.g., Mapping and MutableMapping, they both have the same set of values, but MutableMapping has more methods, therefore it is a subtype of Mapping.)

@gvanrossum
Copy link
Member

gvanrossum commented Sep 28, 2016 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants