-
Notifications
You must be signed in to change notification settings - Fork 88
Update stream rfc #15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 21 commits
7ede7b7
3ccfcb0
b5f7031
f6cd705
a6a69a0
81a2d51
7c95855
9f6aa05
1750343
0050a3e
9d48ced
8d19261
06205c4
d646a7e
c0bbd43
8cf7b37
df39f12
a13dd48
e398679
d41c12b
ddec9e0
4bcc194
6d06d3f
35efa81
642935a
b9c8bff
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -63,14 +63,20 @@ the current task to be re-awoken when the data is ready. | |
```rust | ||
// Defined in std::stream module | ||
pub trait Stream { | ||
// Core items: | ||
type Item; | ||
|
||
fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>; | ||
|
||
// Optional optimization hint, just like with iterators: | ||
#[inline] | ||
fn size_hint(&self) -> (usize, Option<usize>) { | ||
(0, None) | ||
} | ||
|
||
// Convenience methods (covered later on in the RFC): | ||
fn next(&mut self) -> Next<'_, Self> | ||
where | ||
Self: Unpin; | ||
} | ||
``` | ||
|
||
|
@@ -124,6 +130,56 @@ where | |
} | ||
``` | ||
|
||
## Next method/struct | ||
|
||
We should also implement a next method, similar to [the implementation in the futures crate](https://docs.rs/futures-util/0.3.5/src/futures_util/stream/stream/next.rs.html#10-12). | ||
nellshamrell marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
In general, we have purposefully kept the core trait definition minimal. There are a number of useful extension methods that are available, for example, in the futures-stream crate, but we have not included them because they involve closure arguments, and we have not yet finalized the design of async closures. | ||
nellshamrell marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
However, the core methods alone are extremely unergonomic. You can't even iterate over the items coming out of the stream. Therefore, we include a few minimal convenience methods that are not dependent on any unstable features. Most notably, next | ||
|
||
```rust | ||
/// A future that advances the stream and returns the next value. | ||
/// | ||
/// This `struct` is created by the [`next`] method on [`Stream`]. See its | ||
/// documentation for more. | ||
/// | ||
/// [`next`]: trait.Stream.html#method.next | ||
/// [`Stream`]: trait.Stream.html | ||
#[derive(Debug)] | ||
#[must_use = "futures do nothing unless you `.await` or poll them"] | ||
pub struct Next<'a, S: ?Sized> { | ||
stream: &'a mut S, | ||
} | ||
|
||
impl<St: ?Sized + Unpin> Unpin for Next<'_, St> {} | ||
|
||
impl<'a, St: ?Sized + Stream + Unpin> Next<'a, St> { | ||
pub(super) fn new(stream: &'a mut St) -> Self { | ||
Next { stream } | ||
} | ||
} | ||
|
||
impl<St: ?Sized + Stream + Unpin> Future for Next<'_, St> { | ||
type Output = Option<St::Item>; | ||
|
||
fn poll( | ||
mut self: Pin<&mut Self>, | ||
cx: &mut Context<'_>, | ||
) -> Poll<Self::Output> { | ||
Pin::new(&mut *self.stream).poll_next(cx) | ||
} | ||
} | ||
``` | ||
|
||
This would allow a user to await on a future: | ||
|
||
```rust | ||
while let Some(v) = stream.next().await { | ||
|
||
} | ||
``` | ||
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
|
@@ -235,27 +291,156 @@ Designing such a migration feature is out of scope for this RFC. | |
|
||
### IntoStream | ||
|
||
**Iterators** | ||
|
||
Iterators have an `IntoIterator` that is used with `for` loops to convert items of other types to an iterator. | ||
|
||
```rust | ||
pub trait IntoIterator where | ||
<Self::IntoIter as Iterator>::Item == Self::Item, | ||
{ | ||
type Item; | ||
|
||
type IntoIter: Iterator; | ||
|
||
fn into_iter(self) -> Self::IntoIter; | ||
} | ||
``` | ||
|
||
Examples taken from the Rust docs on [for loops and into_iter](https://doc.rust-lang.org/std/iter/index.html#for-loops-and-intoiterator) | ||
|
||
* `for x in iter` uses `impl IntoIterator for T` | ||
|
||
```rust | ||
let values = vec![1, 2, 3, 4, 5]; | ||
|
||
for x in values { | ||
println!("{}", x); | ||
} | ||
``` | ||
|
||
Desugars to: | ||
|
||
```rust | ||
let values = vec![1, 2, 3, 4, 5]; | ||
{ | ||
let result = match IntoIterator::into_iter(values) { | ||
mut iter => loop { | ||
let next; | ||
match iter.next() { | ||
Some(val) => next = val, | ||
None => break, | ||
}; | ||
let x = next; | ||
let () = { println!("{}", x); }; | ||
}, | ||
}; | ||
result | ||
} | ||
``` | ||
* `for x in &iter` uses `impl IntoIterator for &T` | ||
* `for x in &mut iter` uses `impl IntoIterator for &mut T` | ||
|
||
**Streams** | ||
|
||
We may want a trait similar to this for `Stream`. The `IntoStream` trait would provide a way to convert something into a `Stream`. | ||
|
||
This trait could look like this: | ||
|
||
[TO BE ADDED] | ||
```rust | ||
pub trait IntoStream where | ||
<Self::IntoStream as Stream>::Item == Self::Item, | ||
nellshamrell marked this conversation as resolved.
Show resolved
Hide resolved
|
||
{ | ||
type Item; | ||
|
||
type IntoStream: Stream; | ||
|
||
fn into_stream(self) -> Self::IntoStream; | ||
} | ||
``` | ||
nellshamrell marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
This trait (as expressed by @taiki-e in [a comment on a draft of this RFC](https://github.com/rust-lang/wg-async-foundations/pull/15/files#r449880986)) makes it easy to write streams in combination with [async stream](https://github.com/taiki-e/futures-async-stream). For example: | ||
|
||
```rust | ||
type S(usize); | ||
|
||
impl IntoStream for S { | ||
type Item = usize; | ||
type IntoStream: impl Stream<Item = Self::Item>; | ||
|
||
fn into_stream(self) -> Self::IntoStream { | ||
#[stream] | ||
async move { | ||
for i in 0..self.0 { | ||
yield i; | ||
} | ||
} | ||
} | ||
} | ||
``` | ||
|
||
### FromStream | ||
|
||
**Iterators** | ||
|
||
Iterators have an `FromIterator` that is used to convert iterators into another type. | ||
|
||
```rust | ||
pub trait FromIterator<A> { | ||
|
||
fn from_iter<T>(iter: T) -> Self | ||
where | ||
T: IntoIterator<Item = A>; | ||
} | ||
``` | ||
|
||
It should be noted that this trait is rarely used directly, instead used through Iterator's collect method ([source](https://doc.rust-lang.org/std/iter/trait.FromIterator.html)). | ||
|
||
```rust | ||
pub trait Interator { | ||
fn collect<B>(self) -> B | ||
where | ||
B: FromIterator<Self::Item>, | ||
{ ... } | ||
} | ||
``` | ||
|
||
Examples taken from the Rust docs on [iter and collect](https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.collect) | ||
|
||
|
||
```rust | ||
let a = [1, 2, 3]; | ||
|
||
let doubled: Vec<i32> = a.iter() | ||
.map(|&x| x * 2) | ||
.collect(); | ||
|
||
``` | ||
|
||
**Streams** | ||
|
||
We may want a trait similar to this for `Stream`. The `FromStream` trait would provide way to convert a `Stream` into another type. | ||
|
||
This trait could look like this: | ||
|
||
[TO BE ADDED] | ||
```rust | ||
pub trait FromStream<A> { | ||
async fn from_stream<T>(stream: T) -> Self | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. One very open-ended question I haven't thought too hard about yet: Would it ever be possible to automatically implement this for all T where There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I definitely think that's possible. To clarify, are you thinking of implementing this same trait for both Streams and Iterators, or are you thinking of having separate traits with very similar functionality? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm saying that Thinking more about it, This is a somewhat similar problem to the This might be another possible bullet point for the lang team discussion, @nikomatsakis? (cc rust-lang/lang-team#34) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The way we solved this in
To help with the conversion from There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think the way currently used in As far as I know, this is why If we want the collection to extend asynchronously, you can use
This seems copy of futures::stream::iter? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't think that's a compelling outcome; this means that patterns from sync Rust wouldn't be transferable to async Rust. Where possible I think it's incredibly valuable to aim for parity between the two. Though this kind of gets into "design philosophy", and at least in the context of this PR we don't need to find consensus.
That's accurate, and I think that's the right outcome. In some cases *Admittedly it's a bit of a contrived example, but hopefully it still serves to illustrate the point. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Agreed, it actually requires real async trait method to add this to std, and afaik all of the existing approaches have some problems.
Well, this problem also exists in Aside from it's preferable or not, even a crate that is obviously unrelated to async can't be used as a return type for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
One solution I was thinking about for this is, leave |
||
where | ||
T: IntoStream<Item = A>; | ||
} | ||
``` | ||
|
||
We could potentially include a collect method for Stream as well. | ||
|
||
```rust | ||
pub trait Stream { | ||
async fn collect<B>(self) -> B | ||
where | ||
B: FromStream<Self::Item>, | ||
{ ... } | ||
} | ||
``` | ||
|
||
## Other Traits | ||
|
||
|
@@ -289,25 +474,29 @@ for elem in stream { ... } | |
|
||
Designing this extension is out of scope for this RFC. However, it could be prototyped using procedural macros today. | ||
|
||
## "Attached" streams | ||
## "Lending" streams | ||
|
||
There has been much discussion around attached/detached streams. | ||
There has been much discussion around lending streams (also referred to as attached streams). | ||
|
||
### Definitions | ||
|
||
[Source](https://smallcultfollowing.com/babysteps/blog/2019/12/10/async-interview-2-cramertj-part-2/#the-need-for-streaming-streams-and-iterators) | ||
|
||
In a **detached** stream, the `Item` that gets returned by `Stream` is "detached" from self. This means it can be stored and moved about independently from `self`. | ||
|
||
In an **attached** stream, the `Item` that gets returned by `Stream` may be borrowed from `self`. It can only be used as long as the `self` reference remains live. | ||
In a **lending** stream (also known as an "attached" stream), the `Item` that gets returned by `Stream` may be borrowed from `self`. It can only be used as long as the `self` reference remains live. | ||
|
||
In a **non-lending** stream (also known as a "detached" stream), the `Item` that gets returned by `Stream` is "detached" from self. This means it can be stored and moved about independently from `self`. | ||
|
||
This RFC does not cover the addition of lending streams (streams as implemented through this RFC are all non-lending streams). | ||
|
||
This RFC does not cover the addition of attached/detached owned/borrowed streams. | ||
We can add the `Stream` trait to the standard library now and delay | ||
adding in this distinction between two types of streams. The advantage of this | ||
is it would allow us to copy the `Stream` trait from `futures` largely 'as is'. | ||
The disadvantage of this is functions that consume streams would first be written | ||
to work with `Stream`, and then potentially have to be rewritten later to work with | ||
`AttachedStream`s. | ||
adding in this distinction between the two types of streams - lending and | ||
non-lending. The advantage of this is it would allow us to copy the `Stream` | ||
trait from `futures` largely 'as is'. | ||
|
||
The disadvantage of this is functions that consume streams would | ||
first be written to work with `Stream`, and then potentially have | ||
to be rewritten later to work with `LendingStream`s. | ||
|
||
### Current Stream Trait | ||
|
||
|
@@ -327,10 +516,10 @@ pub trait Stream { | |
This trait, like `Iterator`, always gives ownership of each item back to its caller. This offers flexibility - | ||
such as the ability to spawn off futures processing each item in parallel. | ||
|
||
### Potential Attached Stream Trait | ||
### Potential Lending Stream Trait | ||
|
||
```rust | ||
impl<S> AttachedStream for S | ||
impl<S> LendingStream for S | ||
where | ||
S: Stream, | ||
{ | ||
|
@@ -346,19 +535,44 @@ where | |
``` | ||
|
||
This is a "conversion" trait such that anything which implements `Stream` can also implement | ||
`Attached Stream`. | ||
`Lending Stream`. | ||
|
||
This trait captures the case we re-use internal buffers. This would be less flexible for | ||
consumers, but potentially more efficient. Types could implement the `AttachedStream` | ||
consumers, but potentially more efficient. Types could implement the `LendingStream` | ||
where they need to re-use an internal buffer and `Stream` if they do not. There is room for both. | ||
|
||
We would also need to pursue the same design for iterators - whether through adding two traits | ||
or one new trait with a "conversion" from the old trait. | ||
|
||
This also brings up the question of whether we should allow conversion in the opposite way - if | ||
every "Detached" stream can become an attached one, should _some_ detached streams be able to | ||
become attached ones? These use cases need more thought, which is part of the reason | ||
it is out of the scope of this particular RFC. | ||
every non-lending stream can become a lending one, should _some_ lending streams be able to | ||
become non-lending ones? | ||
|
||
**Coherence** | ||
|
||
The impl above has a problem. As the Rust language stands today, we cannot cleanly convert impl Stream to impl LendingStream due to a coherence conflict. | ||
|
||
If you have other impls like: | ||
|
||
```rust | ||
impl<T> Stream for Box<T> where T: Stream | ||
``` | ||
|
||
and | ||
|
||
```rust | ||
impl<T> LendingStream for Box<T> where T: LendingStream | ||
``` | ||
|
||
There is a coherence conflict for `Box<impl Stream>`, so presumably it will fail the coherence rules. | ||
|
||
[More examples are available here](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=a667a7560f8dc97ab82a780e27dfc9eb). | ||
nellshamrell marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
Resolving this would require either an explicit “wrapper” step or else some form of language extension. | ||
|
||
It should be noted that the same applies to Iterator, it is not unique to Stream. | ||
|
||
These use cases for lending/non-lending streams need more thought, which is part of the reason it is out of the scope of this particular RFC. | ||
|
||
## Generator syntax | ||
[generator syntax]: #generator-syntax | ||
|
@@ -372,10 +586,34 @@ yield could return references to local variables. Given a "detached" | |
or "owned" stream, the generator yield could return things | ||
that you own or things that were borrowed from your caller. | ||
|
||
### In Iterators | ||
|
||
```rust | ||
gen async fn foo() -> X { | ||
gen async fn foo() -> Value { | ||
nellshamrell marked this conversation as resolved.
Show resolved
Hide resolved
|
||
yield value; | ||
} | ||
``` | ||
|
||
Designing generator functions is out of the scope of this RFC. | ||
After desugaring, this would result in a function like: | ||
|
||
```rust | ||
fn foo() -> impl Iterator<Item = Value> | ||
``` | ||
|
||
### In Async Code | ||
|
||
```rust | ||
async gen fn foo() -> Value | ||
``` | ||
|
||
After desugaring would result in a function like: | ||
|
||
```rust | ||
fn foo() -> impl Stream<Item = Value> | ||
``` | ||
|
||
If we introduce `-> Stream` first, we will have to permit `LendingStream` in the future. | ||
Additionally, if we introduce `LendingStream` later, we'll have to figure out how | ||
to convert a `LendingStream` into a `Stream` seamlessly. | ||
|
||
Further designing generator functions is out of the scope of this RFC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can somebody (maybe @withoutboats or @cramertj) explain to me why
Self: Unpin
is here and why that's not a crushing limitation? I feel like I'm missing something super basic. I guess the answer is maybe that we expect to be operating withBox<S>
whereS: Stream
, andBox<S>
is alwaysUnpin
?And in terms of why it is needed, I guess it is because: you can invoke
next
and that creates a&mut
references to self, but once you've consumed the item, you want to be able to moveself
somewhere else .. I don't know. I can see why it requiresUnpin
(we are invokingPin::new
in thenext
method onNext
), but I'm having a hard time with the "intuition" here.I guess the alternative would be the
next
takesPin<&mut Self>
and not&mut self
? Why would that be bad?Anyway, I think we should have some text justifying it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, the reason why its needed is pretty clear if you look at the definition of Next:
Since
Stream::poll_next
takes a pinned reference, the next future pretty clearly needsS
to beUnpin
to safely construct aPin<&mut S>
from&mut S
. The alternative would be for next to takePin<&mut S>
.You're basically right about operating on
Box<S>
, but remember that we can also operate on a stack-pinnedS
a lot of the time (as long as you don't need to return the stream out of this scope), so it doesn't necessarily require an allocation.I think the current approach is strictly better than requiring the method to be pinned, because that would just require pinning it even when the type is unpin, whereas the current situation requires pinning it only when the type is not unpin.
Right now, no one cares because we have no
!Unpin
streams in practice. But once we have async generators, that's going to be a big annoyance with using them. The problem is even worse with generator/Iterator, because generators won't implement Iterator if they have to be pinned to run, instead only pinned references to them will (in essence, the same problem applies to Iterator, except that iterator's definition doesn't even allow for !Unpin iterators). I'm not sure what we're going to do about this when generators exist, this was one of the big papercuts we hadn't looked into yet.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, so the problem will be that if
async gen
just returns animpl Stream
(and not, crucially, animpl Stream + Unpin
), you'll have to explicitly do some pinning operation to turn them into a "usable" stream (i.e., one you can callnext
one?).The same seems to apply to a
gen fn
, but as you say, the situation is almost worse.It does make me wonder if
Stream
should have apoll_next(&mut self)
method instead ofpoll_next(Pin<&mut self>)
, and we have the expectation that agen fn
should require some kind of "explicit operation" to make aStream
(which would basically be pinning it). The main benefit of this would be thatIterator
kind of requires that. Basically, the argument to me for "let's just stabilizeStream
" has relied in part on the idea that all the problems we face for Stream, we also face for Iterator, but actually in this situation the two designs diverge slightly (with Iterator potentially having "another problem", depending on your POV).I guess that the idea would be that a
gen fn
actually doesn't return anIterator
but rather animpl IntoIterator
, and anasync gen fn
returns animpl IntoStream
, where the "into" part of it does the pinning required to make&mut self
reasonable.This seems like an important detail, I'm glad we're digging into it a bit more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make sure I'm following, this is because people are hand-writing their streams, and basically here the
Pin
-machinery is just kind of boilerplate.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is an example when removing the
Unpin
bounds and using unsafe: playground(this is unsound because you can move the underlying stream after
Next
dropped, and example is crashed as 'moved between poll calls')There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I'm trying to push a bit on this and make sure I understand why we think that is the correct approach, I guess. =)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So: I don't think that this conversation has to block this PR from landing, and it probably shouldn't even block the RFC itself, but I stll feel like there is some missing text around the
next
method that at minimum explains whyUnpin
is needed and perhaps tries to cover some of the reasoning above about whypoll_next
should take a pinned argument. Maybe the right place for some of this text is in the coverage of generator syntax, since we can discuss some of the difficulties that will have to be resolved in making generator syntax (and async generator syntax) work.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See unless you write
gen move
, this is not a self-referential type at all, because it capturesmyvec
by reference. :) Users writing futures combinators immediately ran into the problem of them being best expressed by self-referential types (and ended up having toArc<Mutex>
all kinds of stuff to share ownership across yield points), but users writing iterator combinators never have that problem. I think this is worth considering more seriously.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, ok, that's a good point. There are some differences in the setting, given that iterators tend to carry a lifetime. It may indeed be true that iterator combinators are less likely to require borrows across yields. Interesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key to the difference is that futures are ultimately passed to some executor API like
spawn
, which expects a static future; to achieve that the futures contain all the state they need, and references are internal to that state. Iterators are almost never required to be'static
by the APIs that consume them.However, it's still the case that it will be far more natural to end up in self-referential situations with a generator than it is with a manually implemented iterator, because you will be able to implicitly build that state using the member variables of a function. So the restriction might still limit users from creating code in a natural way. And it's a bit confusing for the restriction to exist only in non-async generators.
So we'll have to get more experience to figure it out.