Skip to content

Require the static type of all expressions to be a subtype of their context; push coercions down to satisfy this. #3471

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
stereotype441 opened this issue Nov 16, 2023 · 5 comments
Assignees
Labels
feature Proposed language feature that solves one or more problems type-inference Type inference, issues or improvements

Comments

@stereotype441
Copy link
Member

Background

Usually, an expression's context type schema describes what the static type of the expression must be (or be coercible to) in order to avoid a compile-time error. For example, in the declaration int x = f();, if the static type of f() is not either int or a type coercible to int (i.e., dynamic), then a compile-time error will occur; therefore, f() is analyzed with a context of int.

However, there are two notable exceptions:

  • When an assignment is made to a local variable that's undergone type promotion, the promoted type of the variable is used as the context for analyzing the right hand side of the assignment. If the static type of the expression winds up not being a subtype of the promoted type, no compile-time error occurs; instead, the variable is simply demoted.
  • When an if-null expression, e1 ?? e2, has no context type schema, or its type schema is dynamic or _, the static type of e1 is used as the context for analyzing e2. If the static type of e2 winds up not being a subtype of its context, no compile-time error necessarily occurs; instead, LUB is used to compute the static type of the whole if-null expression.

As a result of these two exceptions, we've implemented coercions in a somewhat counter-intuitive way: instead of coercing the type of each expression as necessary to ensure that it's a subtype of its context (or reporting a compile time error if no suitable coercion exists), we've implemented some coercions so that they take effect only at the point of an assignability check. Notably, we do this for the dynamic downcast coercion, and this leads to an unfortunate footgun. Consider:

String intOrDynamic(bool b, int i, dynamic d) => b ? i : d;

This code is accepted without any static errors or warnings. The return type String is used as the context type for analyzing the conditional expression b ? i : d; this is pushed down into the two branches of the conditional, i and d. Although i has type int, which is not assignable to String, no error is reported, because there is no assignability check inside a conditional expression. By the same reasoning, analysis of d with context String does not (yet) produce a dynamic downcast. So the static type of i is int and the static type of d is dynamic. Now, the static type of the whole conditional expression is LUB(int, dynamic), which is dynamic. At this point, an assignability check is performed, to see if => b ? i : d can be used as the body of a function with return type String. Since a coercion exists from dynamic to String (namely, dynamic downcast), there is no static error.

But it would be really nice if this code had a static error. Because at runtime, it's guaranteed to throw an exception whenever b is true.

We have often talked about redefining the behavior of conditional expressions so that if either branch of the conditional expression isn't assignable to the context, a compile-time error would be issued. But if we did that, then things would get weird because of the exceptions noted above. For example, this is a valid and reasonable thing to do:

f(Object o, bool b, double d, String s) {
  if (o is int) { // promotes `o` to `int`
    ...
    if (b) {
      o = d; // demotes `o` back to `Object`
    } else {
      o = s; // demotes `o` back to `Object`
    }
  }
}

And it also seems reasonable that we could refactor the "if/else" into a conditional expression, like so:

f(Object o, bool b, double d, String ) {
  if (o is int) { // promotes `o` to `int`
    ....
    o = b ? d : s; // demotes `o` back to `Object`
  }
}

But if we changed the behavior of conditional expressions so that it was an error if either branch didn't satisfy the context, then this innocent refactor would lead to a surprising compile-time error, because the context for b ? d : s is int, and neither double nor String is assignable to int.

Similar problems occur with switch expressions.

Proposal

Change the treatment of assignments to promoted local variables, so that the context for the right hand side of the assignment is the the static (unpromoted) type of the local variable, not its promoted type.

Change the way we implement coercions so that for each expression in the program, a provisional static type is initally computed, using the rules for expression static types that we currently have, and then:

  • Let T be the provisional static type of the expression.
  • Let R be the greatest closure of the context type schema with respect to _. Or, if the expression has no type schema, let R be dynamic.
  • If T <: R, then T is the static type of the expression, and no coercion is applied.
  • Otherwise, if the expression is a conditional expression, switch expression, or if-null expression, then the static type of the expression is R (See the "soundness" section below for why this is ok to do).
  • Otherwise, if a coercion exists from type T to U, and U <: R, then that coercion is applied, and the static type of the expression is U.
  • Otherwise, a compile time error is reported.

(Note that these bullet points establish the invariant that the static type of any expression is a subtype of the greatest closure if its context type schema with respect to _.)

Consequences

Different type inference for the right hand side of an assignment to a promoted local variable

Today, when an assignment is made to a promoted local variable, the right hand side is analyzed using the promoted type of the variable as its context. So, for example:

f(Object o) {
  if (o is List<int>) {
    ...
    o = [];
  }
}

Since o is promoted to List<int> at the time of the assignment, [] is analyzed using a context of List<int>, and therefore [] is interpreted as <int>[].

With the change, [] will be analyzed using a context of Object (the unpromoted type of o). So [] will be interpreted as <dynamic>[].

I'm not sure whether this is better or worse. But I suspect that it will be very rare for it to make a difference in real-world code.

Better handling of dynamic in conditional and switch expressions

Recall the intOrDynamic function:

String intOrDynamic(bool b, int i, dynamic d) => b ? i : d;

With the change, the context type String is still pushed down into the subexpression i. A provisional static type of int is computed for i. But now, following the bullet points above, we see that:

  • T is int
  • R is String
  • T is not a subtype of R,
  • the expression i is not a conditional expression, switch expression, or if-null expression,
  • and no coercion exists from T to U, where U <: R.

Therefore a compile-time error is issued. The footgun has been fixed!

Better handling of conditional and switch expressions where LUB produces too big a type.

Consider this code from #1618:

PSW foo(bool bar) {
  final PSW result = bar ? CNB() : AB();
  return result;
}

class CNB extends SW implements PSW { }

class AB extends SW implements PSW { }

class PSW implements W { }

class SW extends W { }

class W { }

Today, the conditional expression bar ? CNB() : AB() had a static type of LUB(CNB, AB), which is W, leading to a compile-time error because W is not assignable to PSW.

With the change, the type W is merely the provisional type of the conditional expression. Then, following the bullet points above:

  • T is W
  • R is PSW
  • T is not a subtype of R
  • the expression bar ? CNB() : AB() is a conditional expression.

Therefore, the static type of bar ? CNB() : AB() is R (i.e., PSW).

This fully addresses #1618.

If-null becomes slightly more restrictive

Consider the following code (which is allowed today):

List<int>? maybeList = ...;
var somethingToIterate = maybeList ?? Iterable.empty();
for (var i in somethingToIterate) {
  ...
}

With the change, this becomes a compile-time error, because the context type for Iterable.empty() is List<int>?, and Iterable is not a subtype of List.

This could definitely be a source of breakage for users, and we should investigate how often it arises in practice.

However, even if it does come up a non-trivial number of times, there are at least three easy fixes:

  • Inline the variable somethingToIterate (i.e. for (var i in maybeList ?? Iterable.empty())). This works because the if-null expression now has a context of Iterable<_>, so this context is passed down to Iterable.empty() rather than using the static type of maybeList as the context.
  • Give an explicit type of Iterable<int> to somethingToIterate. This works for the same reason; the if-null expression now has a context, and that context is not _ or dynamic.
  • Add a cast to maybeList: var somethingToIterate = (maybeList as Iterable<int>?) ?? Iterable.empty();. This works because the static type of the left hand side of the if-null expression is now Iterable<int>, and this type is a suitable context for analyzing Iterable.empty().

Coercions are pushed down into if-null expressions, conditional expressions, and switch expressions.

Consider this code:

abstract class C {
  void call();
}
void Function() f(bool b, C c, void Function() g) => b ? c : g;

Today, this produces a compile-time error, because the static type of b ? c : g is LUB(C, void Function()), which is Object, and Object is not assignable to void Function().

With this change, the coercion is pushed down to the branches of the conditional. Since the provisional static type of c is C, which is not a subtype of void Function(), c is coerced using a .call tearoff, and the static type of c is void Function(). Which means the code is accepted, and interpreted as:

void Function() f(bool b, C c, void Function() g) => b ? c.call : g;

This should fully address #3363.

Coercions are pushed down into cascade targets.

This has advantages and disadvantages.

On the advantage side, consider this code:

int f(dynamic d) => d..foo();

Today, the implicit downcast is interpreted as happening outside the cascade, i.e.:

int f(dynamic d) => (d..foo()) as int;

This code is statically accepted, because it is permissible to call any method on a dynamic type. But it's guaranteed to throw an exception at runtime, because the only possible values for d which will survive the dynamic downcast are integers, and integers don't have a foo method.

With the change, the implicit downcast is interpreted as happening inside the cascade, i.e.:

int f(dynamic d) => (d as int)..foo();

And therefore there will be a compile-time error.

On the disadvantage side, consider this code:

abstract class C {
  void call();
  void m();
}
void Function() f(C c) => c..m();

Today this is permitted, because the call to m happens before the implicit tearoff of .call. With the change, it will become a compile-time error.

If the user wants to keep the old behavior, they can always do the .call tearoff explicitly:

void Function() f(C c) => (c..m()).call;

Soundness

As promised, here's why it's sound to treat the static type of a conditional expression, switch expression, or if-null expression as R (the greatest closure of the context type schema), even if the provisional static type is not a subtype of R.

Recall that the bullet points listed in the "proposal" section establish the invariant that the static type of every expression is a subtype of the greatest closure of its context type schema with respect to _.

For conditional expressions and switch expressions, the context type schema is passed down into each branch without alteration. Therefore, thanks to the invariant, the static type of each branch is a subtype of R. Since the runtime value of a conditional expression or switch expression is always obtained from one of its branches, if we assume soundness of the subexpressions, then the runtime type of the whole conditional exprssion or switch expression must also be a subtype of R. So it does not violate soundness to assign the whole conditional expression or switch expression a static type of R.

For an if-null expression e1 ?? e2, things are slightly more complicated. First, if the if-null expression has no context, or the context of the if-null expresison is either _ or dynamic, then R is a top type, and therefore it does not violate soundness to assign the whole if-null expression a static type of R, because every runtime type is a subtype of a top type.

If, on the other hand, the context of the if-null expression is not either _ or dynamic, then:

  • Let S denote the context type schema for the if-null expression as a whole.
  • So R is the greatest closure of S with respect to _.
  • e1 is analyzed using a context of S?. Therefore, by the invariant described above, the static type of e1 is a subtype of R?.
  • e2 is analyzed using a context of S. Therefore, by the invariant, the static type of e2 is a subtype of R.
  • At runtime, if e1 evaluates to a non-null value, then it will be used as the value for the whole if-null expression. Since e1 has a static type of R?, and we're considering the case where it is not null, it follows, from the assumption of soundness of e1, that the runtime type of e1 is a subtype of R, and therefore the runtime type of the whole if-null expression is a subtype of R.
  • If, on the other hand, e1 evaluates to null, then e2 will be used as the value for the whole if-null expression. Since e2 has a static type of R, it follows, from the assumption of soundness of e2, that the runtime type of e2 is a subtype of R, and therefore the runtime type of the whole if-null expression is a subtype of R.
  • Thus, in either case, the runtime type of the whole if-null expression is a subtype of R. So it does not violate soundness to assign the whole if-null expression a static type of R.
@stereotype441 stereotype441 added the feature Proposed language feature that solves one or more problems label Nov 16, 2023
@stereotype441
Copy link
Member Author

@dart-lang/language-team As discussed in today's language team meeting. Let me know what you think!

@lrhn
Copy link
Member

lrhn commented Nov 16, 2023

The one thing this will not allow, that I've wanted, is an aspirational context for the expression of an as cast.

It would be nice if 1 as double didn't throw, because it knows that you want a double. That can only work if we have aspirational contexts.
(But it might just be trying to mash two different functionalities into one operator. The "completely unlimited unrelated cast" that it does today, and a more static kind of "I want this type" signal.)

Otherwise I think it's a good simplification.

Both promoted variable assignment and ?? changes are likely to be breaking.

On the other hand, using the unpromoted variable type should get rid of intersection types as context types, which is an added bonus.
The difficulty is that the breakage of that won't be at the assignment itself, that'll likely still work, it'll just demote the variable and cause a later failure. That makes it harder to find and fix automatically.

The ?? issue is real, but I expect most occurrences of someList ?? Iterable.empty() to have a context type, likely occurring directly on a for/in with a context type of Iterable<_>.
(Which means that it becomes Iterable<dynamic> today, and nobody notices the implicit dynamic?)

I think we should try to implement this, to see how much it breaks.

@eernstg
Copy link
Member

eernstg commented Nov 16, 2023

This is amazing! I'd certainly support trying out how breaking this is, and doing it even in the case where there is some breakage.

It's a highly desirable property of this proposal that it simplifies so many things, and that it includes #1618 and #3363.

One thing came to mind, in bullet 2 of the second rule of the proposal:

Let R be the greatest closure of the context type schema with respect to _. Or, if the expression has no type schema, let R be dynamic.

IIUC then it makes no difference at all that we use dynamic here rather than any other top type: The next bullet says that "if T <: R" then the type of the expression is T, and that's always true for any top type including dynamic. So there's no need to worry about having a larger number of expressions of type dynamic if we adopt these rules. On the other hand, it seems likely that we could just as well say "..., let R be Object?", just to practice a good habit. ;-)

(OK, we'd probably need to say "_ or Object?" near the end in the section about ?? as well.)

The main reason to hesitate would be that this proposal will cause promoted variables to be demoted more often. You might even say that a promoted variable appears to wish to return to its unpromoted status because of the preference given to context types during type inference. That sounds like changing the user experience from the current level, and moving it down a notch. That could be a source of serious disappointment.

It would be nice if we could rely on the following migration scenario: We introduce the new rules, some promoted variables get demoted, and then there are errors about those variables later on, and developers will undo all the changes that matter because they'll change things like v = [] to v = <T>[] to avoid the demotion. This is a good scenario because the breakage occurs at compile time.

However, it's perfectly possible that there are no errors when v is used later on, if v is a List<dynamic> after the change. So we can't really hope for compile-time-only breakage.

Maybe we should link this change (at least in terms of migration advice) with some 'strict-...' options, to catch locations where the type dynamic suddenly pops up? Perhaps it would be useful to have a lint that flags variable demotions?

(As usual, it might be useful to have differential analysis, flagging locations where we have variable demotions that didn't occur with the pre-change rules, but that's probably much more work.)

@stereotype441
Copy link
Member Author

I'm starting work on this. I plan to do a prototype first, then see how much breakage it causes inside google3. Once I have that information we can re-assess whether we want to change course.

copybara-service bot pushed a commit to dart-lang/sdk that referenced this issue Dec 5, 2023
I plan to use this to guard my work on
dart-lang/language#3471.

Bug: dart-lang/language#3471
Change-Id: I49afae7260c14b8c6b646ccc0ee39efae3891c34
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/338649
Reviewed-by: Michael Thomsen <[email protected]>
Reviewed-by: Konstantin Shcheglov <[email protected]>
Commit-Queue: Paul Berry <[email protected]>
Reviewed-by: Johnni Winther <[email protected]>
copybara-service bot pushed a commit to dart-lang/sdk that referenced this issue Dec 7, 2023
… assignments.

This change implements one of the features of experimental feature
`inference-update-3`: with the experimental feature enabled,
assignments to local variables use the declared (or inferred) type of
the local variable as the context for evaluating the RHS of the
assignment, regardless of whether the local variable is promoted. With
the experimental feature disabled, assignments to local variables
continue to use the promoted type of the local variable as the context
for evaluating the RHS of the assignment.

This eliminates one of the scenarios in which the context type of an
assignment is "aspirational" (i.e., not required to be met in order to
prevent a compile time error). Once all aspirational context types
have been removed from the language, we will be able to re-work
coercions to be based on context type, which fixes a number of
usability footguns in the language. See
dart-lang/language#3471 for details.

Bug: dart-lang/language#3471
Change-Id: Ic07ac1810b641a9208c168846cd5fd912088d62b
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/338802
Reviewed-by: Bob Nystrom <[email protected]>
Reviewed-by: Konstantin Shcheglov <[email protected]>
Reviewed-by: Johnni Winther <[email protected]>
Commit-Queue: Paul Berry <[email protected]>
copybara-service bot pushed a commit to dart-lang/sdk that referenced this issue Jan 15, 2024
According to the patterns spec
(https://github.com/dart-lang/language/blob/main/accepted/3.0/patterns/feature-specification.md),
if `A` is the type of the formal parameter of the operator
declaration, then operand should be type checked with context type
`A?` when op is `==` or `!=`, and with context type `A` otherwise.

Prior to this commit, context type `A` was always used.

There should be no observable behavior difference, because the operand
of a relational operator is required to be a constant expression, and
there is no constant expression that behaves differently when its
context is `A?` vs `A`.

However, we are contemplating some possible future language changes
that *would* make the difference observable (e.g.,
dart-lang/language#3471), so it seems
prudent to fix the behavior to match the spec.

Change-Id: Ib3a1c82de45c65a851cbd613899ba1f72c215fbe
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/346420
Commit-Queue: Paul Berry <[email protected]>
Reviewed-by: Konstantin Shcheglov <[email protected]>
copybara-service bot pushed a commit to dart-lang/sdk that referenced this issue Jan 18, 2024
The front end and analyzer use the same representation for types and
type schemas, with the unknown type schema (`_`) treated as a
pseudo-type. This creates the risk of accidentally mixing types and
schemas, resulting in `_` accidentally "leaking" into the type system
and showing up in static analysis results or error messages.

As a step toward reducing this risk, this change adds a type parameter
to `TypeAnalyzer`, preventing the shared type analysis code from being
able to assume that types and schemas are represented the same. This
extra discipline makes it much easier to search through the code and
identify how types are manipulated vs. how type schemas are
manipulated, and makes it impossible for the shared type analysis to
accidentally leak `_` into the type system.

I believe this change will also make it easier to implement some type
inference improvements we've been contemplating, such as improved type
inference of `.map(...).toList()`, as well as
dart-lang/language#3471, because those
improvements may require introducing new kinds of type schemas.

Change-Id: Ifcd7e2c4e1172ee39719ce8c8b10d7f10f6a7b6f
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/345353
Reviewed-by: Chloe Stefantsova <[email protected]>
Reviewed-by: Johnni Winther <[email protected]>
Commit-Queue: Paul Berry <[email protected]>
Reviewed-by: Brian Wilkerson <[email protected]>
@leafpetersen leafpetersen added the type-inference Type inference, issues or improvements label Jan 19, 2024
@leafpetersen
Copy link
Member

I filed an issue here to explore a corner of this that I believe is not covered above.

copybara-service bot pushed a commit to dart-lang/sdk that referenced this issue Feb 7, 2024
After discussion with Leaf, I believe the primary use case that
`interface-update-3` was intended to address can be better addressed
through the proposal of language issue
dart-lang/language#1618 (instead of
dart-lang/language#3471, which I had been
working on previously).

Since I had not made a lot of progress on
dart-lang/language#3471, rather than set up
a brand new experiment flag, it seems better to simply back out the
work that I'd previously done, and repurpose the `inference-update-3`
flag to work on issue 1618.

Change-Id: I6ee1cb29f722f8e1f0710cbd0600cb87b8fd26a1
Bug: dart-lang/language#1618
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/350620
Commit-Queue: Paul Berry <[email protected]>
Reviewed-by: Nate Bosch <[email protected]>
Reviewed-by: Chloe Stefantsova <[email protected]>
Reviewed-by: Konstantin Shcheglov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Proposed language feature that solves one or more problems type-inference Type inference, issues or improvements
Projects
None yet
Development

No branches or pull requests

4 participants