Skip to content

Fetch the entire payload within the pipeline #2073

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
heaths opened this issue Feb 5, 2025 · 4 comments
Open

Fetch the entire payload within the pipeline #2073

heaths opened this issue Feb 5, 2025 · 4 comments
Assignees
Labels
Azure.Core The azure_core crate Client This issue points to a problem in the data-plane of the library.
Milestone

Comments

@heaths
Copy link
Member

heaths commented Feb 5, 2025

Based on our discussion (see OP history for context), we decided on a pattern that would allow most client methods to work like so:

let secret = client.get_secret("name", "version", None).await?.into_body()?;

The entire response is buffered into memory by default. Deserialization happens on that buffer and is not, therefore, async. This also should provide opportunity to attach the raw response to the ErrorKind::HttpResponse, on which we could provide deserialization helpers but would not deserialize by default.

To support downloading large payloads - or for any case where a customer might otherwise want to stream the response - all Response<T> would support something like:

let response = client.download_blob("blob", None).await?; // get at least the headers
let mut stream = response.into_stream();
while let Some(buf) = stream.next().await? {
    // e.g., write buf to file
}

While we'll still have helpers to deserialize into custom model types attached to Response<T> (see #1831 (comment)), this would still allow customers to do something like this if, say, a blob were a structured model or for any model response:

let content: Vec<u8> = stream.try_collect().await?;
let m: Model = serde_json::from_slice(&content);

This does mean that into_body() et. al. are implemented only for something like Response<T> where T: Deserialize, so pure streaming methods need to return a type that would never implement Deserialize but can stream, like our own ResponseBody or something. Or maybe we return a ResponseBody in lieu of Response<T>.

To clarify, the into_stream method does not change anything in the pipeline itself (because, in fact, it's called too late for it to do so), so if you called it on get_secret's response, you'd end up with a stream that yields all of its bytes synchronously. Separately to adding that into_stream method we'll be changing the pipeline to eagerly read the entire body unless a special flag is provided in the Context.

Note: if the HTTP status code is not an acceptable success code (see #1733), we should always buffer the entire error response in the first await call so it's available on ErrorKind::HttpResponse (see #2495).

@heaths heaths added Azure.Core The azure_core crate Client This issue points to a problem in the data-plane of the library. design-discussion An area of design currently under discussion and open to team and community feedback. labels Feb 5, 2025
@heaths heaths self-assigned this Feb 5, 2025
@github-project-automation github-project-automation bot moved this to Untriaged in Azure SDK Rust Feb 5, 2025
@heaths
Copy link
Member Author

heaths commented Feb 5, 2025

One question that arises is whether we should advertise a way for customers to opt into this same behavior of not buffering the entire payload e.g., a field on azure_core::ClientMethodOptions or a type to pass through the Context (the former is idiomatic). However we do it would have to be pub anyway.

That said, I should clarify that we shouldn't try to deserialize in the pipeline. Deserialization is meant to be late but does not need to be async. This lets customers still grab the buffered response and do whatever - save it, deserialize their own types, whatever - without awaiting again. Besides, if deserialization fails with the entire response buffered already, it's not going to be something a retry will fix.

I'll update the OP.

@RickWinter RickWinter moved this from Untriaged to Not Started in Azure SDK Rust Apr 15, 2025
@RickWinter RickWinter added this to the 2025-07 milestone Apr 15, 2025
@heaths
Copy link
Member Author

heaths commented Apr 17, 2025

We discussed this today and decided on a pattern that would allow most client methods to work like so:

let secret = client.get_secret("name", "version", None).await?.into_body()?;

The entire response is buffered into memory by default. Deserialization happens on that buffer and is not, therefore, async. This also should provide opportunity to attach the raw response to the ErrorKind::HttpResponse, on which we could provide deserialization helpers but would not deserialize by default.

To support downloading large payloads - or for any case where a customer might otherwise want to stream the response - all Response<T> would support something like:

let response = client.download_blob("blob", None).await?; // get at least the headers
let mut stream = response.into_stream();
while let Some(buf) = stream.next().await? {
    // e.g., write buf to file
}

While we'll still have helpers to deserialize into custom model types attached to Response<T> (see #1831 (comment)), this would still allow customers to do something like this if, say, a blob were a structured model or for any model response:

let content: Vec<u8> = stream.try_collect().await?;
let m: Model = serde_json::from_slice(&content);

This does mean that into_body() et. al. are implemented only for something like Response<T> where T: Deserialize, so pure streaming methods need to return a type that would never implement Deserialize but can stream, like our own ResponseBody or something. Or maybe we return a ResponseBody in lieu of Response<T>. @analogrelay relay, since you did some refactor with those types, any initial thoughts?

Note: if the HTTP status code is not an acceptable success code (see #1733), we should always buffer the entire error response in the first await call so it's available on ErrorKind::HttpResponse (see #2495).

@heaths heaths moved this from Not Started to In Progress in Azure SDK Rust Apr 17, 2025
@analogrelay
Copy link
Member

analogrelay commented Apr 17, 2025

To support downloading large payloads - or for any case where a customer might otherwise want to stream the response - all Response would support something like:

To clarify, the into_stream method does not change anything in the pipeline itself (because, in fact, it's called too late for it to do so), so if you called it on get_secret's response, you'd end up with a stream that yields all of its bytes synchronously. Separately to adding that into_stream method we'll be changing the pipeline to eagerly read the entire body unless a special flag is provided in the Context.


Aside from that clarification (which we did cover in the meeting, just wanted to get it here in writing), this sounds good to me!

@heaths
Copy link
Member Author

heaths commented Apr 17, 2025

Thank you, @analogrelay! Good catch on that. Yes, and we already have at least partial support with our ResponseBody type.

@heaths heaths removed the design-discussion An area of design currently under discussion and open to team and community feedback. label May 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Azure.Core The azure_core crate Client This issue points to a problem in the data-plane of the library.
Projects
Status: In Progress
Development

No branches or pull requests

3 participants