Skip to content

Latest commit

 

History

History
116 lines (68 loc) · 11.9 KB

normalization.md

File metadata and controls

116 lines (68 loc) · 11.9 KB

Normalization and Aliases

What is normalization

In Rust there are a number of types that are considered equal to some "underlying" type, for example inherent associated types, trait associated types, free type aliases (type Foo = u32), and opaque types (-> impl RPIT). Alias types are represented by the TyKind::Alias variant, with the kind of aliases tracked by the AliasTyKind enum.

Normalization is the process of taking these alias types and determining the underlying type that they are equal to. For example given some type alias type Foo = u32, normalizing Foo would give u32.

Entry points to normalization

When interfacing with the type system it will often be the case that it's necessary to request a type be normalized. There are a number of different entry points to the underlying normalization logic and each entry point should only be used in specific parts of the compiler.

An additional complication is that the compiler is currently undergoing a transition from the old trait solver to the new trait solver. As part of this transition our approach to normalization in the compiler has changed somewhat significantly, resulting in some normalization entry points being "old solver only" slated for removal in the long-term once the new solver has stabilized.

Here is a rough overview of the different entry points to normalization in the compiler.

  • infcx.at.normalize
  • tcx.normalize_erasing_regions
  • infcx.query_normalize
  • traits::normalize_with_depth(_to)

infcx.at.normalize/deeply_normalize/structurally_normalize

normalize/deeply_normalize/structurally_normalize are the main normalization entry points for normalizing during various analysis' such as type checking, impl wellformedness checking, collecting the types of RPITITs, etc. It's able to handle inference variables during normalization and will return any nested goals required for the normalization to hold.

These normalization functions are often mirrored on other contexts that wrap an InferCtxt, such as FnCtxt or ObligationCtxt. They behave largely the same except that these wrappers can either handle providing some of the arguments to the normalize functions or handle the returned goals itself.

Due to the new normalization approach of the new solver the normalize method is a no-op under the new solver and is slated for removal once the new solver is stabilized. Under the new solver the intention is to delay normalization up until matching on the type is actually required, at which point structurally_normalize should be called. In some rare cases it is still desirable to eagerly normalize a whole value ahead of time and so deeply_normalize exists.

When matching on types during HIR typeck we would like to emit an error if the type is an inference variable as we do not know what type it will wind up being inferred to. The FnCtxt type (used during HIR typeck) has a method for this, fcx.structurally_resolve, when the new solver is enabled it will also attempt to normalize the type via structurally_normalize.

Due to this there is a pattern in HIR typeck where a type is first normalized via normalize (doing nothing in the new solver), and then structurally_resolve'd (normalizing in the new solver, but erroring on inference variables under both solvers). This pattern should be preferred over calling structurally_normalize during HIR typeck.

tcx.normalize_erasing_regions

normalize_erasing_regions is generally used by parts of the compiler that are not doing type system analysis' as this normalization entry point does not handle inference variables, lifetimes, or any diagnostics. Lints and codegen make heavy use of this entry point as they typically are working with fully inferred aliases that can be assumed to be well formed (or atleast, are not responsible for erroring on).

infcx.query_normalize

infcx.query_normalize is very rarely used, it has almost all the same restrictions as normalize_erasing_regions (cannot handle inference variables, no diagnostics support) with the main difference being that it retains lifetime information. For this reason normalize_erasing_regions is the better choice in almost all circumstances as it is more efficient due to caching lifetime-erased queries.

In practice query_normalize is used for normalization in the borrow checker, and elsewhere as a performance optimization over infcx.normalize. Once the new solver is stabilized it is expected that query_normalize can be removed from the compiler as the new solvers normalization implementation should be performant enough for it to not be a performance regression.

traits::normalize_with_depth(_to)

traits::normalize_with_depth(_to) is only used by the internals of the old trait solver. It is effectively calling into the internals of how normalization is implemented by the old solver. Other normalization entry points cannot be used from within the internals of the old trait solver as it would result in handling goal cycles and recursion depth incorrectly.

When the new solver is stabilized, the old solver and its implementation of normalization will be removed (of which this function is part of).

Alias handling

FIXME: This section is somewhat incomplete and could do with expansion

Ambiguous vs Rigid Aliases

Aliases can either be "ambiguous" or "rigid".

When an alias cannot yet be normalized due to containing inference variables, such as <_ as Iterator>::Item, we consider it to be an "ambiguous" alias where "ambiguous" refers to the fact that it is not clear what type implements Iterator yet.

On the other hand if we can determine the source for how the self type (_ in the previous example) implements the trait (e.g. Iterator) but the source does not determine the underlying type of the associated type (e.g. Item) then it is considered a "rigid" alias.

We generally consider types to be rigid if their "shape" isn't going to change, for example Box is rigid as no amount of normalization can turn a Box into a u32, whereas <vec::IntoIter<u32> as Iterator>::Item is not rigid as it can be normalized to u32.

If an alias is not ambiguous and also is not rigid then it is either not well formed (the self type does not implement the trait), or it can simply be normalized to its underlying type.

Alias Equality

In the old solver equating two aliases simply equates the generic arguments of the aliases. This is incorrect as two ambiguous aliases may wind up having their generic arguments inferred differently but still normalizing to the same rigid type.

To work around this the old solver eagerly normalizes all types to ensure the unnormalized types are never encountered. We also (in the new solver too) normalize ambiguous aliases to some new inference variable ?r to represent its normalized (rigid) form, then emit a goal to defer proving that the alias normalizes to ?r

This has a few advantages:

  • Matching on a Ty after normalization will only encounter rigid aliases or inference variables, not ambiguous aliases
  • When we are able to normalize the ambiguous alias we can wait until it normalized to something rigid instead of a different ambiguous alias before inferring ?r.
  • In cases where multiple aliases wind up being required to be equal to ?r inference can be stronger as the first alias to be normalized to a rigid type can infer ?r.
  • In the old solver we never encounter ambiguous aliases and so cannot wind up accidentally equating two ambiguous aliases generic arguments

There are some shortcomings with this around higher ranked types containing ambiguous aliases that make use of the bound variables, e.g. for<'a> fn(<?x as Trait<'a>>::Assoc). Creating an inference variable ?r to represent the normalized form of <?x as Trait<'a>>::Assoc is problematic as ?r would be unable to name the lifetime 'a due to being in a lower universe even though there could exist some type to infer ?x to that would implement for<'a> Trait<'a, Assoc = &'a u32>.

In both the old and new solver we do not normalize aliases to inference variables if they make use of bound vars from a higher ranked type. In the old solver this is unsound in coherence due to equality of aliases simply equating the generic arguments regardless of whether the alias is rigid.

In the new solver this is not a soundness bug as we do not equate the arguments of aliases unless they are known to be rigid.

Normalization as a side effect of equality

Under the new solvers approach to normalization and equality of aliases we check equality of aliases with a PredicateKind::AliasRelate goal that can be deferred until furthur inference progress has been made, if necessary.

AliasRelate(lhs, rhs) is implemented by first structurally normalizing both the lhs and the rhs and then relating the resulting rigid types (or inference variables). Importantly, if lhs or rhs ends up as an alias, this alias can now be treated as rigid and gets unified without emitting a nested AliasRelate goal: source.

This means that AliasRelate with an unconstrained rhs ends up acting as a function which fully normalizes lhs before assigning the resulting rigid type to an inference variable. This is used by fn structurally_normalize_ty both inside and outside of the trait solver.