[in progress] TypedDict: Recognize creation of TypedDict instance. Define TypedDictType. #2342

Merged on Nov 24, 2016 (1 commit)

Conversation

davidfstr
Contributor

@davidfstr davidfstr commented Oct 27, 2016

This PR was created in the following steps:

  • Recognize creation of TypedDict instance. Define TypedDictType.
  • Implement all visitor methods for the new TypedDictType:
    • Easy visitors.
    • Interesting visitors: Subtype, Join, Meet, Constraint Solve
  • Write a zillion tests.

Remaining work:

[x] Figure out some way to trigger the TypeJoinVisitor.visit_typeddict_type() path.
    [x] Do it.
[x] Figure out some way to trigger the TypeMeetVisitor.visit_typeddict_type() path.
    [x] Do it.
[x] Figure out some way to trigger the ConstraintBuilderVisitor.visit_typeddict_type() path.
    [ ] Do it.
[ ] Extend mypy stubs to recognize that superclasses of Mapping include {Sized, Iterable, Container}

I am creating this PR now to solicit assistance on approach for the preceding items of remaining work.

Since I haven't been able to trigger the code paths for the Join, Meet, and Constraint Solve visitors, I am not as confident in their correctness.

Out of Scope for this PR:

  • Anything in the test suite sections: __getitem__, __setitem__, get, isinstance
  • Creating TypedDict types with keyword-based syntax or a dict(...) call.

@gvanrossum
Member

@JukkaL can probably help you most effectively. In the meantime could you rebase?

@davidfstr
Contributor Author

davidfstr commented Oct 30, 2016

Rebased.

The first question under consideration is how can I trigger the TypeJoinVisitor.visit_typeddict_type() code path? Thoughts @JukkaL ?

Prior investigation notes:

  • It looks like ConditionalTypeBinder.frame_context has logic that joins types when flows of execution merge as a stack frame is popped. However, I haven't been able to figure out how to trigger this joining logic. This scenario appears to be the most important one where joining is used.
  • There is also a tertiary scenario in analyze_iterable_item_type where an expression like (rect, circle) infers the type Tuple[Union[Circle, Rectangle], ...] via a join. However, this doesn't appear to be a primary scenario for join, so I did not attempt to trigger it.

@JukkaL
Collaborator

JukkaL commented Oct 31, 2016

Probably the most common scenario for joins is something like x = [expr1, expr2], where the inferred type of x will depend on the join of the expression types.
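As a minimal sketch of the list-literal scenario described above, the snippet below puts two TypedDict values of different types into one unannotated list, so a checker has to compute a single element type from both. It assumes `typing.TypedDict` (Python 3.8+) rather than `mypy_extensions.TypedDict`, and the `Point` / `LabeledPoint` names are made up for illustration. What mypy actually infers for `points` is not asserted here.

```python
from typing import TypedDict  # assumption: Python 3.8+; the PR itself used mypy_extensions

# Hypothetical TypedDict types used only for this illustration.
Point = TypedDict('Point', {'x': int, 'y': int})
LabeledPoint = TypedDict('LabeledPoint', {'x': int, 'y': int, 'label': str})

p: Point = {'x': 1, 'y': 2}
q: LabeledPoint = {'x': 3, 'y': 4, 'label': 'q'}

# Deliberately unannotated: the element type of this list has to be
# inferred from the types of both items, which is where a join comes in.
points = [p, q]
```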

For meets, it's a little trickier. Mypy calculates the meet of A and B when type checking this program and reveals the value of the meet:

from typing import TypeVar, Callable

class A: pass
class B(A): pass

T = TypeVar('T')

def f(x: Callable[[T, T], None]) -> T: pass
def g(x: A, y: B) -> None: pass
reveal_type(f(g))  # Reveals B, which is the meet of A and B

Also, there are some unit tests in mypy/test/testtypes.py that call join_types and meet_types directly. It may be worth adding some test cases there.

extra_item_names = [k for k in kwargs_item_names if k not in callee_item_names]

self.chk.fail(
'Expected items {} but found {}. Missing {}. Extra {}.'.format(
Collaborator

For consistency, the message logic would better live in mypy/messages.py, as a method.

Maybe omit empty Missing / Extra from the message.
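One way the second suggestion could look is sketched below. The function name `typeddict_item_mismatch_message` is hypothetical and is not an existing method of mypy/messages.py; the sketch only shows how the "Missing" and "Extra" parts can be dropped when they are empty.

```python
from typing import List


def typeddict_item_mismatch_message(expected: List[str], actual: List[str]) -> str:
    """Build the mismatch message, leaving out empty Missing / Extra parts.

    Hypothetical helper for illustration; not part of mypy.
    """
    missing = [k for k in expected if k not in actual]
    extra = [k for k in actual if k not in expected]
    parts = ['Expected items {} but found {}.'.format(expected, actual)]
    if missing:
        parts.append('Missing {}.'.format(missing))
    if extra:
        parts.append('Extra {}.'.format(extra))
    return ' '.join(parts)
```

In a real change, a method like this would presumably take the error context as well and report through the message builder instead of returning a string.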

item_actual_type = self.chk.check_simple_assignment(
lvalue_type=item_expected_type, rvalue=item_value, context=item_value,
msg=messages.INCOMPATIBLE_TYPES,
lvalue_name='TypedDict item',
Collaborator

Include item name in lvalue_name, as otherwise it can be hard to figure out which item has an invalid value type.

items[item_name] = item_actual_type

mapping_value_type = join.join_type_list(list(items.values()))
fallback = self.chk.named_generic_type('typing.Mapping',
Collaborator

The question about the fallback type is actually kind of tricky. It's not clear to me if Mapping is the most practical fallback definition, though I think that it's a type-safe one.

In existing code, people likely use Dict[str, Any] for things that would be suitable for TypedDict. Introducing TypedDict gradually to such a codebase would likely imply getting rid of many of the Dict[str, Any] types or replacing them with Mapping[str, Any] (or adding casts), so the introduction wouldn't be perfectly gradual.

Here are alternatives that may or may not be reasonable:

  • If a typed dict has uniform value types, make the fallback Dict[str, value_type]. This wouldn't be quite safe since dicts support deleting items (__delitem__) and adding new keys.
  • Have two fallbacks (the second one could be special cased somehow), Mapping[str, ...] (as currently) and Dict[str, Any]. Code dealing with Any types won't be safe anyway, so this would still be safe for fully typed code but would also perhaps provide a smoother gradual typing story.

We don't need to decide this now -- it may well take some practical experimentation with real code to determine the best approach.
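To make the safety concern with a Dict fallback concrete, here is a small runtime sketch. It assumes `typing.TypedDict` (Python 3.8+), and the `scrub` function is hypothetical. The `cast` stands in for a checker that would accept a `Point` wherever a `Dict[str, int]` is expected.

```python
from typing import Dict, TypedDict, cast  # assumption: Python 3.8+

Point = TypedDict('Point', {'x': int, 'y': int})


def scrub(d: Dict[str, int]) -> None:
    """Hypothetical function that is perfectly valid for a plain Dict[str, int]."""
    del d['x']   # removes a key that Point promises to have
    d['z'] = 0   # adds a key that Point does not declare


p: Point = {'x': 1, 'y': 2}
# The cast models a Dict[str, int] fallback letting this call type check.
scrub(cast(Dict[str, int], p))
# p is still annotated as Point, but it no longer has the declared shape.
```

With a Mapping fallback the same call would be rejected, since Mapping exposes neither item deletion nor item assignment.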

@@ -342,11 +342,26 @@ def visit_tuple_type(self, template: TupleType) -> List[Constraint]:
        else:
            return []

    def visit_typeddict_type(self, template: TypedDictType) -> List[Constraint]:
        actual = self.actual
        # TODO: Is it correct to require identical keys here?
Collaborator

Not quite. I think that it would be reasonable to consider the intersection of keys here, between actual and template. Compatibility will be checked elsewhere so it won't be unsafe.

(item_name, self.join(s_item_type, t_item_type))
for (item_name, s_item_type, t_item_type) in self.s.zip(t)
])
fallback = join_instances(self.s.fallback, t.fallback)
Collaborator

It's unclear whether taking the join of the fallbacks is the right thing to do. For example, a join may have fewer items, and the fallback could actually be more specific than operand fallbacks (if it's Mapping[str, ...]). Maybe at least add a TODO comment about this.
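The point that a join may have fewer items than its operands can be seen in a toy model. This is not mypy's implementation: it represents a typed dict as a plain mapping from item name to a Python class, and `join_classes` / `join_item_maps` are hypothetical helpers that mimic the shape of the fragment above (join the item types of the keys both operands share).

```python
from typing import Dict


def join_classes(a: type, b: type) -> type:
    """Toy join: the first class in a's MRO that b also derives from."""
    for base in a.__mro__:
        if issubclass(b, base):
            return base
    return object


def join_item_maps(s: Dict[str, type], t: Dict[str, type]) -> Dict[str, type]:
    """Toy typed-dict join: keep only shared keys and join their item types."""
    return {name: join_classes(s[name], t[name]) for name in s if name in t}


s_items = {'x': int, 'y': int, 'label': str}
t_items = {'x': bool, 'y': str}
joined = join_item_maps(s_items, t_items)
# 'label' is dropped because only one operand declares it.
```

Since the result can have fewer items than either operand, a fallback computed from the joined items need not equal the join of the operands' fallbacks, which is the concern raised in the comment.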

dictype = (self.named_type_or_none('builtins.dict', [strtype, AnyType()])
or self.object_type())
fallback = dictype
mapping_value_type = join.join_type_list(types)
Collaborator

I suspect that we can't take joins at this stage of semantic analysis, since some of the required semantic information might not have been processed yet. Actually, calculating joins may only be possible after all semantic analysis passes have been completed for a strongly-connected component of modules.

@@ -125,6 +125,10 @@ def visit_instance(self, left: Instance) -> bool:
self.right,
self.check_type_parameter):
return True
if left.type.typeddict_type is not None and is_subtype(left.type.typeddict_type,
Collaborator

Hmm this likely isn't right. We must assume that a typed dict type is always represented as TypedDictType here. The way to ensure this is probably by updating TypeAnalyser.visit_unbound_type to perform a check on typeddict_type and return one as needed. It already does this for tuple_type (towards the end of the method), and this should be similar.

def visit_typeddict_type(self, left: TypedDictType) -> bool:
    right = self.right
    if isinstance(right, Instance):
        if right.type.typeddict_type is not None:
Collaborator

Similar to above, we should assume that a typed dict type is represented as a TypedDictType. This should be handled in a similar fashion to TypeInfo.tuple_type / TupleType.

if not left.names_are_wider_than(right):
    return False
for (_, l, r) in left.zip(right):
    if not is_subtype(r, l, self.check_type_parameter):
Collaborator

Item types should be invariant, since they are mutable. Using is_subtype here would make them covariant; use is_equivalent instead. Also add a test case.
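A short runtime sketch of why covariant item types would be unsafe for a mutable dict follows. It assumes `typing.TypedDict` (Python 3.8+); `IntBox`, `ObjectBox`, and `overwrite` are hypothetical names. The `cast` stands in for a subtype rule that would treat `IntBox` as compatible with `ObjectBox` because `int` is a subtype of `object`.

```python
from typing import TypedDict, cast  # assumption: Python 3.8+

IntBox = TypedDict('IntBox', {'value': int})
ObjectBox = TypedDict('ObjectBox', {'value': object})


def overwrite(box: ObjectBox) -> None:
    """Valid for an ObjectBox: any object may be stored under 'value'."""
    box['value'] = 'not an int'


b: IntBox = {'value': 1}
# The cast models a covariant rule accepting IntBox where ObjectBox is expected.
overwrite(cast(ObjectBox, b))
# b is still annotated as IntBox, yet b['value'] now holds a str.
```

Requiring equivalent item types rules this call out, at the cost of rejecting some read-only uses that would have been safe.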

@@ -223,6 +224,13 @@ def visit_tuple_type(self, t: TupleType) -> Type:
fallback = t.fallback if t.fallback else self.builtin_type('builtins.tuple', [AnyType()])
return TupleType(self.anal_array(t.items), fallback, t.line)

def visit_typeddict_type(self, t: TypedDictType) -> Type:
Collaborator

This is not sufficient, I think. It seems that we hit the visit_unbound_type case instead (see above for a discussion of that), and it should be handled similarly to named tuples.

from mypy_extensions import TypedDict
from typing import Mapping
Point = TypedDict('Point', {'x': int, 'y': int})
def as_mapping(p: Point) -> Mapping[str, int]:
Collaborator

Also check what happens if we have an incompatible Mapping return type.

[builtins fixtures/dict.pyi]
[out]
main: note: In function "convert":
main:5: error: Incompatible return value type (got "Point", expected "ObjectPoint")
Collaborator

Use # E: for this and elsewhere (you can leave the main: note: ... in the [out] section, or modify the flags to omit context information in output).

@JukkaL
Collaborator

JukkaL commented Oct 31, 2016

Some more ideas for test cases:

  • Test calling fallback methods.
  • Test d[k] (__getitem__) result types and invalid key types.
  • Test d[k] = v (__setitem__) with valid/invalid item/key types.
  • Tests for serialization (incremental checking).

Also, we should reject a typed dict type as a base class.

@JukkaL
Collaborator

JukkaL commented Oct 31, 2016

Generally looks great -- thanks for working on this! This is going to be a useful feature.

@davidfstr
Contributor Author

Thanks for the solid feedback. This will take me some time to integrate. :-)

@gvanrossum
Member

@davidfstr Are you ready for another review? If so could you resolve the merge conflict (maybe using rebase) so the tests can run?

@davidfstr
Contributor Author

@gvanrossum Not ready for another review yet. I'll make another post when I am.

@davidfstr
Contributor Author

The second question under consideration is how can I trigger the ConstraintBuilderVisitor.visit_typeddict_type() code path? Thoughts @JukkaL ?

From the following call hierarchy analysis, it looks like an expression involving Callable[...] might do the trick:

infer_constraints @ constraints.py
  infer_type_arguments @ infer.py
    infer_function_type_arguments_using_context -- if callee.is_generic()
      check_call***
        check_call_expr_with_callee_type
          visit_call_expr***
        check_op_local
          ...
        visit_unary_expr
        check_lst_expr
          ...
        visit_dict_expr
        ... (many more)
  unify_generic_callable @ subtypes.py
    is_callable_subtype
      visit_callable_type*** @ subtypes.py
        ...

…Type.

Notable visitor implementations added:
* Subtype
* Join
* Meet
* Constraint Solve

Also:
* Fix support for using dict(...) in TypedDict instance constructor.
* Allow instantiation of empty TypedDict.
* Disallow underscore prefix on TypedDict item names.
* TypeAnalyser: Resolve an unbound reference to a typeddict as a named TypedDictType rather than as an Instance.
@davidfstr
Contributor Author

I have squashed and rebased all changes to the tip of master. I think most of the major items of feedback have been integrated. There are some lingering issues, but it would probably be more efficient to address those with followup PRs rather than adding to the already 1,500+ line diff in this PR.

Thanks for taking the time to review all these changes.


Summary of immediate remaining issues:

  1. I suspect that we can't take joins at this stage of semantic analysis, since some of the required semantic information might not have been processed yet. Actually, calculating joins may only be possible after all semantic analysis passes have been completed for a strongly-connected component of modules.

  2. Also, we should reject a typed dict type as a base class.

  3. TODO: Figure out some way to trigger the ConstraintBuilderVisitor.visit_typeddict_type() path.

  4. TODO: Fix mypy stubs so that the following tests pass in the test suite:

  • testCanConvertTypedDictToAnySuperclassOfMapping
  • testJoinOfTypedDictWithCompatibleMappingSupertypeIsSupertype

@gvanrossum
Member

Thanks. I need to take a break from all work, hopefully it can wait or someone else will be able to review it.

@JukkaL
Collaborator

JukkaL commented Nov 24, 2016

Thanks for the updates! I'm going to merge this and create follow-up issues for further work.

Now that we have the basic infrastructure in for typed dicts, it should be possible to add the remaining functionality through pretty small, specific PRs, which are going to be easier and faster to review.

@JukkaL JukkaL merged commit 7fd6eba into python:master Nov 24, 2016
@JukkaL
Collaborator

JukkaL commented Nov 24, 2016

Here are some follow-up issues: https://github.com/python/mypy/issues?q=is%3Aopen+is%3Aissue+label%3ATypedDict

I think #2486 and #2487 are the most urgent, as they may be enough to start experimenting with real code.

@davidfstr
Contributor Author

davidfstr commented Nov 25, 2016

Thanks @JukkaL !

The next few items I'm planning to help with are:

  1. Shepherding the recent PR that improves TypedDict runtime support to allow instantiation with keyword arguments, prevent isinstance checking, etc.

  2. #2486 -- getting and setting items of a TypedDict instance

@davidfstr davidfstr deleted the typeddict_instance branch January 22, 2017 22:46