Commit 6d4302e

Create 2022-07-25-keyword-generics.md
1 parent 51629ac commit 6d4302e

1 file changed

+343
-0
lines changed

Diff for: posts/2022-07-25-keyword-generics.md

---
layout: post
title: "Announcing the Keyword Generics Initiative"
author: Yoshua Wuyts, on behalf of the Keyword Generics Initiative
release: false
---

We ([Oli], [Niko], and [Yosh]) are excited to announce the start of the [Keyword
Generics Initiative][kwi], a new initiative [^initiative] under the purview of
the language team. We're officially just a few weeks old now, and in this post
we want to briefly share why we've started this initiative, and share some
insight on what we're about.

[Oli]: https://github.com/oli-obk
[Niko]: https://github.com/nikomatsakis
[Yosh]: https://github.com/yoshuawuyts
[kwi]: https://github.com/rust-lang/keyword-generics-initiative

[^initiative]: Rust governance terminology can sometimes get confusing. An
"initiative" in Rust parlance is different from a "working group" or "team".
Initiatives are intentionally limited: they exist to explore, design, and
implement specific pieces of work - and once that work comes to a close, the
initiative will wind back down. This is different from, say, the lang team -
which essentially carries a `'static` lifetime - and whose work (ominously) does
not have a clearly defined end.

## A missing kind of generic

One of Rust's defining features is the ability to write functions which are
_generic_ over their input types. That allows us to write a function once,
leaving it up to the compiler to generate the right implementations for us.

But while we're able to be generic over _types_ in a function, we're not able to
be generic over the qualifier keywords of functions and traits. The post ["What
color is your function"][color] [^color] describes what happens when you're not
able to be generic over the `async` qualifier. But we see this problem apply to
practically all qualifier keywords. We're looking to fill in that gap in our
capabilities by introducing "keyword generics" [^name]: the ability to be
generic over keywords such as `const` and `async`.

[color]: https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/
[^color]: R. Nystrom, “What Color is Your Function?,” Feb. 01, 2015.
https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/
(accessed Apr. 06, 2022).

[^name]: The longer, more specific name would be: "keyword modifier generics".
We've tried calling it that, but it's a bit of a mouthful. So we're just
sticking with "keyword generics" for now, even if this feature may end up being
called something more specific in the reference and documentation.

To give you a quick taste of what we're working on, this is how we imagine you
may want to write a function which is generic over "asyncness" sometime in the
future:

```rust
// Hypothetical syntax: this does not compile today.
async<A> trait Read {
    async<A> fn read(&mut self, buf: &mut [u8]) -> Result<usize>;
    async<A> fn read_to_string(&mut self, buf: &mut String) -> Result<usize> { ... }
}

/// Read from a reader into a string.
async<A> fn read_to_string(reader: &mut impl Read * A) -> std::io::Result<String> {
    let mut string = String::new();
    reader.read_to_string(&mut string).await?;
    Ok(string)
}
```

This function introduces a "keyword generic" parameter `A` into the function.
You can think of this as a flag which indicates whether the function is being
compiled in an async context or not. The parameter `A` is forwarded to the `impl
Read`, making that conditional on "asyncness" as well.

In the function body you can see a `.await` call. Because [the `.await` keyword
marks cancellation sites][cancel] we unfortunately can't just infer them
[^cancellation]. Instead we require them to be written for when the code is
compiled in async mode; in non-async mode they are essentially reduced to a
no-op.

[cancel]: https://blog.yoshuawuyts.com/async-cancellation-1/
[^cancellation]: No really, we can't just infer them - and it may not be as
simple as omitting all `.await` calls either. The Async WG is working through
the full spectrum of cancellation sites, async drop, and more. But for now we're
working under the assumption that `.await` will remain relevant going forward.
And even in the off chance that it isn't, fallibility has similar requirements
at the call site as async does.

We still have lots of details left to figure out, but we hope this at least
shows the general *feel* of what we're going for. We want to make it
significantly easier to write code which works in both async and non-async Rust.

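To make the role of `A` more concrete, here is a rough sketch of what has to be
written by hand today to get both behaviours: a sync function and an `async`
function whose bodies differ only in `.await`. The `AsyncRead` trait and the
`Bytes` reader are hypothetical stand-ins made up for this illustration (neither
is a std or ecosystem API), and `async fn` in traits assumes Rust 1.75 or newer.

```rust
use std::io::{self, Read};

/// Hypothetical async mirror of `std::io::Read` (illustration only).
trait AsyncRead {
    async fn read_to_string(&mut self, buf: &mut String) -> io::Result<usize>;
}

/// Toy in-memory reader used to exercise the async variant.
struct Bytes<'a>(&'a [u8]);

impl AsyncRead for Bytes<'_> {
    async fn read_to_string(&mut self, buf: &mut String) -> io::Result<usize> {
        // Delegates to the blocking impl for `&[u8]`; acceptable for in-memory data.
        Read::read_to_string(&mut self.0, buf)
    }
}

/// Sync variant: roughly what the generic function would mean when not async.
fn read_to_string(reader: &mut impl Read) -> io::Result<String> {
    let mut string = String::new();
    reader.read_to_string(&mut string)?;
    Ok(string)
}

/// Async variant: the same body, plus `.await`.
async fn read_to_string_async(reader: &mut impl AsyncRead) -> io::Result<String> {
    let mut string = String::new();
    reader.read_to_string(&mut string).await?;
    Ok(string)
}
```

With keyword generics these two functions would presumably collapse into the
single `async<A> fn read_to_string` shown earlier; how exactly that would be
lowered is one of the details still being worked out.
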
## A peek at the past: horrors before const

Rust didn't always have `const fn` as part of the language. A long long long long
long time ago (2018) we had to write a regular function for runtime computations
and associated const logic on generic types for compile-time computations. As an
example, to add the number `1` to a constant provided to you, you had to write
([playground]):

[playground]: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=50e818b79b8af322ed4384d3c33e9773

```rust
trait Const<T> {
    const VAL: T;
}

/// `42` as a "const" (type) generic:
struct FourtyTwo;
impl Const<i32> for FourtyTwo {
    const VAL: i32 = 42;
}

/// `C` -> `C + 1` operation:
struct AddOne<C: Const<i32>>(C);
impl<C: Const<i32>> Const<i32> for AddOne<C> {
    const VAL: i32 = C::VAL + 1;
}

fn main() {
    // The addition happens at compile time, through the type system.
    assert_eq!(AddOne::<FourtyTwo>::VAL, 43);
}
```

Today this is as easy as writing a `const fn`:

```rust
const fn add_one(i: i32) -> i32 {
    i + 1
}

// Used in a const context, so this call is evaluated at compile time.
const RESULT: i32 = add_one(42);
```

The interesting part here is that you can also just call this function in
runtime code, which means the implementation is shared between both `const`
(CTFE[^ctfe]) and non-`const` (runtime) contexts.

[^ctfe]: CTFE stands for "Compile Time Function Execution": `const` functions
can be evaluated during compilation, which is implemented using a Rust
interpreter (miri).

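As a small illustration of that sharing, the sketch below calls one `const fn`
from two const contexts and from ordinary runtime code. The names
`COMPILE_TIME`, `BUFFER`, and `runtime_add_one` are made up for this example.

```rust
const fn add_one(i: i32) -> i32 {
    i + 1
}

// Const context: the call is evaluated by the compiler (CTFE).
const COMPILE_TIME: i32 = add_one(42);

// Array lengths are const contexts too, so the same function works here.
const BUFFER: [u8; add_one(3) as usize] = [0; add_one(3) as usize];

// Runtime context: the same function applied to a value that is only
// known once the program is running.
fn runtime_add_one(input: &str) -> Option<i32> {
    input.trim().parse::<i32>().ok().map(add_one)
}
```

There is only one implementation of `add_one`; the calling context decides
whether it runs during compilation or at runtime.
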
## Memories of the present: async today

People write duplicate code for async/non-async with the only difference being
the `async` keyword. A good example of that code today is [`async-std`], which
duplicates and translates a large part of the stdlib's API surface to be async
[^async-std]. And because the Async WG has made it an explicit goal to [bring
async Rust up to par with non-async Rust][async-vision], the issue of code
duplication is particularly relevant for the Async WG as well. Nobody on the
Async WG seems particularly keen on proposing we add a second instance of just
about every API currently in the stdlib.

[`async-std`]: https://docs.rs/async-std/latest/async_std/
[async-vision]: https://rust-lang.github.io/wg-async/vision/how_it_feels.html
[^async-std]: Some limitations in `async-std` apply: async Rust is missing async
`Drop`, async traits, and async closures. So not all APIs could be duplicated.
Also we explicitly didn't reimplement any of the collection APIs to be
async-aware, which means users are subject to the "sandwich problem". The
purpose of `async-std` has been to be a proving ground to test whether creating
an async mirror of the stdlib would be possible: and we've proven (modulo
missing language features) that it is.

With `async` today we're in a situation similar to the one `const` was in prior
to 2018. Duplicating entire interfaces and wrapping them in inefficient
`block_on` calls is the approach taken by e.g. the `mongodb`
[[async](https://docs.rs/mongodb/latest/mongodb/index.html),
[non-async](https://docs.rs/mongodb/latest/mongodb/sync/index.html)], `postgres`
[[async](https://docs.rs/tokio-postgres/latest/tokio_postgres/index.html),
[non-async](https://docs.rs/postgres/latest/postgres/)], and `reqwest`
[[async](https://docs.rs/reqwest/latest/reqwest/),
[non-async](https://docs.rs/reqwest/latest/reqwest/blocking/index.html)] crates:
```rust
171+
// "crate_name"
172+
async fn foo() -> Bar { ... }
173+
174+
// "blocking_crate_name" or "crate_name::blocking"
175+
// take the `async fn foo` and block the thread until
176+
// it finishes executing.
177+
fn foo() -> Bar {
178+
futures::executor::block_on(crate_name::foo())
179+
}
180+
```
181+
182+
This requires effort on the user's side to find and use the right crates for
183+
their code. And it requires effort by the crate authors to keep the sync and
184+
async APIs in sync with each other.
185+
186+
In the ecosystem some solutions exist to work around these issues in automated ways.
187+
188+
An example of such a solution is the [`maybe-async`
189+
crate](https://docs.rs/maybe-async/0.2.6/maybe_async/) which relies on proc
190+
macros. Instead of writing two separate copies of `foo`, it generates a sync and
191+
async variant for you:
192+
```rust
#[maybe_async]
async fn foo() -> Bar { ... }
```

This macro, however, is limited, and has clear issues with respect to diagnostics
and ergonomics. That is because it is in effect implementing a way to be generic
over the `async` keyword entirely using macros, which is the type of
transformation a compiler / type system is better equipped to deal with.

## A taste of trouble: the sandwich problem

A pervasive issue in existing Rust is the _sandwich_ problem. It occurs when a
type passed into an operation wants to perform control flow not supported by the
type it's passed into, thus creating a _sandwich_ [^dilemma]. The classic example
is a `map` operation:

[^dilemma]: Not to be confused with the higher-order _sandwich dilemma_ which is
when you look at the sandwich problem and attempt to determine whether the
sandwich is two slices of bread with a topping in between, or two toppings with
a slice of bread in between. Imo the operation part of the problem feels more
_bready_, but that would make for a weird-looking sandwich. Ergo: sandwich
dilemma. (yes, you can ignore all of this.)

```rust
enum Option<T> {
    Some(T),
    None,
}

impl<T> Option<T> {
    fn map<J>(self, f: impl FnOnce(T) -> J) -> Option<J> { ... }
}
```

```rust
my_option.map(|x| x.await)
```

This will produce a compiler error: the closure `f` is not an async context, so
`.await` cannot be used within it. And we can't just convert the closure to be
`async` either, since `fn map` doesn't know how to call async functions. In
order to solve this issue, we could provide a new `async_map` method which
_does_ accept an async closure. But we may want to repeat that for more
effects, and that would result in a combinatorial explosion of methods. Take for
example "can fail" and "can be async":

|                | not async    | async              |
| -------------- | ------------ | ------------------ |
| __infallible__ | `fn map`     | `fn async_map`     |
| __fallible__   | `fn try_map` | `fn async_try_map` |

That's a lot of API surface for just a single method, and __that problem
multiplies across the entire API surface in the stdlib__. We expect that once we
start applying "keyword generics" to traits, we will be able to solve the
sandwich problem. The type of `f` would be marked generic over a set of effects,
and the compiler would choose the right variant during compilation.

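To show what one extra cell of that table costs today, here is a sketch of the
fallible variant written by hand as a free function. `try_map` and `parse_port`
are hypothetical names for this illustration; std does not provide an
`Option::try_map` method.

```rust
use std::num::ParseIntError;

/// Fallible counterpart of `Option::map` (the `fn try_map` cell in the table).
fn try_map<T, J, E>(
    opt: Option<T>,
    f: impl FnOnce(T) -> Result<J, E>,
) -> Result<Option<J>, E> {
    match opt {
        // `?` works here because `try_map` itself returns a `Result`.
        Some(value) => Ok(Some(f(value)?)),
        None => Ok(None),
    }
}

/// Parses an optional port number, surfacing parse errors to the caller.
fn parse_port(raw: Option<&str>) -> Result<Option<u16>, ParseIntError> {
    // With plain `map` the error stays nested inside the option:
    // `raw.map(|s| s.parse::<u16>())` has type `Option<Result<u16, _>>`.
    try_map(raw, |s| s.parse::<u16>())
}
```

For this particular combination std happens to offer `Option::transpose` as a
workaround, but the async column of the table has no such shortcut, and each
further effect doubles the number of variants again.
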
## Affecting all effects

Both `const` and `async` share a very similar issue, and we expect that other
"effects" will face the same issue. Fallibility is particularly on our mind here,
but it isn't the only effect. In order for the language to feel consistent we
need consistent solutions.

## FAQ

### Q: Will this make the language more complicated?

The goal of keyword generics is not to minimize the complexity of the Rust
programming language, but to _minimize the complexity of programming in Rust._
These two might sound similar, but they're not. Our reasoning here is that by
_adding_ a feature, we will actually be able to significantly reduce the surface
area of the stdlib, crates.io libraries, and user code - leading to a more
streamlined user experience.

Choosing between sync or async code is a fundamental choice which needs to be
made. This is complexity which cannot be avoided, and which needs to exist
somewhere. Currently in Rust that complexity is thrust entirely on users of
Rust, making them responsible for choosing whether their code should support
async Rust or not. But other languages have made different choices. For example
Go doesn't distinguish between "sync" and "async" code, and has a runtime which
is able to remove that distinction.

The work we're doing would make it so that the complexity of choosing between
sync and async can be moved from user code into the type system, and from there
on out can be handled by the compiler instead.

### Q: Are you building an effect system?

The short answer is: kind of, but not really. "Effect systems" or "algebraic
effect systems" generally have a lot of surface area. A common example of what
effects allow you to do is implement your own `try/catch` mechanism. What we're
working on is intentionally limited to built-in keywords only, and wouldn't
allow you to implement anything like that at all.

What we do share with effect systems is that we're integrating modifier keywords
more directly into the type system. Modifier keywords like `async` are often
referred to as "effects", so being able to be conditional over them in
composable ways effectively gives us an "effect algebra". But that's very
different from "generalized effect systems" in other languages.

### Q: Are you looking at other keywords beyond `async` and `const`?

For a while we were referring to the initiative as "modifier generics" or
"modifier keyword generics", but it never really stuck. We're only really
interested in keywords which modify how types work. Right now this is `const`
and `async` because that's what's most relevant for the const-generics WG and
async WG. But we're designing the feature with other keywords in mind as well.

The one most at the top of our mind is a future keyword for fallibility. There
is talk about introducing `try fn() {}` or `fn () -> throws` syntax. This could
make it so methods such as `Iterator::filter` would be able to use `?` to break
out of the closure and short-circuit iteration.

Our main motivation for this feature is that without it, it's easy for Rust to
start to feel _disjointed_. We sometimes joke that Rust is actually 3-5
languages in a trenchcoat. Between const Rust, fallible Rust, async Rust, unsafe
Rust - it can be easy for common APIs to only be available in one variant of the
language, but not in others. We hope that with this feature we can start to
systematically close those gaps, leading to a more consistent Rust experience
for _all_ Rust users.

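As a point of comparison for the `Iterator::filter` case mentioned above, this
is roughly what a fallible filter looks like when written by hand today.
`try_filter` and `even_inputs` are hypothetical helpers invented for this
sketch, not std APIs, and the sketch collects eagerly into a `Vec` rather than
returning a lazy iterator.

```rust
/// Hypothetical helper: `filter` with a predicate that can fail.
fn try_filter<T, E>(
    items: impl IntoIterator<Item = T>,
    mut predicate: impl FnMut(&T) -> Result<bool, E>,
) -> Result<Vec<T>, E> {
    let mut kept = Vec::new();
    for item in items {
        // `?` returns the first error and stops the iteration.
        if predicate(&item)? {
            kept.push(item);
        }
    }
    Ok(kept)
}

/// Keeps the inputs that parse as even numbers; fails on the first non-number.
fn even_inputs<'a>(inputs: &[&'a str]) -> Result<Vec<&'a str>, std::num::ParseIntError> {
    try_filter(inputs.iter().copied(), |s| Ok(s.parse::<i64>()? % 2 == 0))
}
```

The `?` inside the closure only works because `try_filter` was written to
expect a `Result`-returning predicate; the regular `Iterator::filter` has no way
to propagate that error.
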
### Q: What will the backwards compatibility story be like?

Rust has pretty strict backwards-compatibility guarantees, and any feature we
implement needs to adhere to them. Luckily we have some wiggle room because of
the edition mechanism, but our goal is to shoot for maximal backwards compat. We
have some ideas of how we're going to make this work though, and we're
cautiously optimistic we might actually be able to pull this off.

But to be frank: this is by far one of the hardest aspects of this feature, and
we're lucky that we're not designing any of this just by ourselves, but have the
support of the language team as well.

## Conclusion

In this post we've introduced the new keyword generics initiative, explained why
it exists, and shown a brief example of what it might look like in the future.

The initiative is active on the Rust Zulip under
[`t-lang/keyword-generics`][zulip] - if this seems interesting to you, please
pop by!

_Thanks to everyone who's helped review this post, but in particular:
[fee1-dead][fd], [Daniel Henry-Mantilla][dhm], and [Ryan Levick][rylev]_

[zulip]: https://rust-lang.zulipchat.com/#narrow/stream/328082-t-lang.2Fkeyword-generics
[fd]: https://github.com/fee1-dead
[dhm]: https://github.com/danielhenrymantilla
[rylev]: https://github.com/rylev
