Skip to content

[RFC] Ergonomic improvements to payload deserialization for HTTP use cases #917

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mlaota opened this issue Aug 18, 2024 · 5 comments
Closed

Comments

@mlaota
Copy link
Contributor

mlaota commented Aug 18, 2024

Summary

This RFC proposes ergonomic improvements to payload deserialization for HTTP usecases.

Motivation

The RequestPayloadExt trait is used in lambda_http to simplify extracting a Request's body into
a user-provided struct. The current version of the RequestPayloadExt::payload has the following docstring:

Return the result of a payload parsed into a type that implements serde::Deserialize

Currently only application/x-www-form-urlencoded and application/json flavors of content type are supported

A PayloadError will be returned for undeserializable payloads. If no body is provided, Ok(None) will be returned.

The interpretation of how payload should behave when the content-type header is missing/unsupported is somewhat ambiguous. For example, there's no mention of what happens when the content-type header is missing. According to RFC 9110: HTTP Semantics,

If a Content-Type header field is not present, the recipient MAY [...] examine the data to determine its type.

so, it's not unreasonable to assume that a generic payload function on an HTTP Request struct would also do a best-effort estimation of the content-type. The signature of payload asks to handle multiple PayloadError variants for different content types which may reinforce that assumption.

The actual behavior is that when the content-type header is either missing or unsupported, the implementation assumes there is no content and returns Ok(None); however, the docstring only calls out Ok(None) being returned "if no body is provided" which can lead to frustration when trying to figure out why the payload is not being recognized.

The documentation needs an update to disambiguate the scenario where the content-type is missing or unsupported. Additionally, I've written a proposal for API ergonomics improvements when handling different content types below.

Proposal: Intent-Driven API Additions

We can use this as an opportunity to introduce a fix via intent-driven additions. For example, we can add json() and x_www_form_urlencoded() methods to the Request that make it clear what the expected format should be:

// Before
let input: OperationInput = match event.payload() {
  Ok(Some(input)) => input,
  Ok(None) => 
    todo!(
      "is the data is missing or the content-type header is missing/incorrect? how 
      do I handle a default content-type for my app.. can i still use `payload` at all?
      i need to inspect the headers to figure out what to do next, and manually deserialize
      the JSON if the content-type header is missing"
    ),
  Err(PayloadError::Json(err)) => todo!("handle deserialization error"),
  Err(PayloadError(other_err)) => 
    todo!("still need to handle this")
}

// After
let input: OperationInput = match event.json() {
  Ok(Some(input)) => input,
  Ok(None) => todo!("handle missing data"),
  Err(JsonPayloadError(err)) => todo!("handle invalid json/schema"),
}

Option 1. (Preferred) Introduce a new extension trait, RequestPayloadIntentExt, with initial implementations

The extension trait would have the following interface:

pub type JsonPayloadError = serde_json::Error;
pub type XWwwFormUrlEncodedPayloadError = SerdeError;

pub trait RequestPayloadIntentExt {
  fn json<D>(&self) -> Result<Option<D>, JsonPayloadError> 
    where for<'de> D: Deserialize<'de>;

  fn x_www_form_urlencoded<D>(&self) -> Result<Option<D>, XWwwFormUrlEncodedPayloadError>
    where for<'de> D: Deserialize<'de>;
}

And initial implementations would go here:

impl RequestPayloadIntentExt for http::Request<Body> { TODO }

Advantages

  • Starting fresh. No risk for backwards compatability issues.
  • The currently available content types are documented via the API in the function names mapped to the content-type they examine for. No surprises.
  • Contributing is straightforward as users request additional content types; contributors should be able to recognize the pattern for contribution from the initial implementations of json() and x_www_form_urlencoded().

Drawbacks

  • Overlapping use case with RequestPayloadExt: users may be confused why the two implementations exist, and surprised when they behave differently
    • Mitigation: Mark RequestPayloadExt as deprecated in documentation with a pointer to RequestPayloadIntentExt

Option 2: Extend RequestPayloadExt trait with methods json() and x_www_form_urlencoded()

The trait could be modified as follows:

pub type JsonPayloadError = serde_json::Error;
pub type XWwwFormUrlEncodedPayloadError = SerdeError;

pub trait RequestPayloadExt {
  ...
  
  fn json<D>(&self) -> Result<Option<D>, JsonPayloadError> 
    where for<'de> D: Deserialize<'de> 
  {
      todo!("delegate to self.payload for backwards compatability")
  }

  fn x_www_form_urlencoded<D>(&self) -> Result<Option<D>, XWwwFormUrlEncodedPayloadError>
    where for<'de> D: Deserialize<'de> 
  {
      todo!("delegate to self.payload for backwards compatability")
  }
}

impl RequestPayloadExt for http::Request<Body> {
  fn json<D>(&self) -> Result<Option<D>, JsonPayloadError> 
    where for<'de> D: Deserialize<'de> {
      todo!("new implementation")
    }

  fn x_www_form_urlencoded<D>(&self) -> Result<Option<D>, XWwwFormUrlEncodedPayloadError>
    where for<'de> D: Deserialize<'de> {
      todo!("new implementation")
    }
}

Advantages

  • Users will find the new functionality where they expect it (from prior examples and discussions in the repo)
  • Maintaining one trait for payload deserialization utils.
  • payload can reuse the implementations for json and xx_www_form_urlencoded and share in improvements.

Drawbacks

  • Backwards compatibility: If a user created an implementation of RequestPayloadExt, they would need to then implement json and x_www_form_urlencoded.
    • Mitigation 1: we can provide default implementations on the trait delegating to self.payload to mimic the previous behavior.
  • Confusing namespace: we would now have variants such as PayloadError::Json and JsonPayloadError coexisting in the same module.
    • Mitigation: Better names? Open to suggestions.
  • Inconsistent behavior/assumptions between payload and new methods

Alternative: Try all content types deserializers before returning None

Since both current (and I'm assuming future) content-type deserialization implementations leverage Serde's Deserialize trait as inputs, we can try both deserializers on the input when the content-type header is missing. For example, we can start with JSON, catch errors, then try x-www-form-urlencoded, etc...

Advantages

  • Should be fairly simple to implement, with no changes to the trait's API.

Drawbacks

  • If users rely on the function returning None when the "content-type" header is empty, they may start handling requests that they expected to reject when the header is missing; consequently, they may be surprised at the new behavior and/or be forced to implement a different mechanism for denying requests without a content-type header.
  • Performance may not be great due to trying every content-type.
@calavera
Copy link
Contributor

I would vote for option 2. I'd rather not introduce new extensions. I believe not many people know about them and that you can implement your own.

Regarding error types, why do we need new error types for that PayloadError doesn't cover? The errors that you're creating are pretty much the same as the variants in that enum.

@mlaota
Copy link
Contributor Author

mlaota commented Aug 20, 2024

I would vote for option 2. I'd rather not introduce new extensions. I believe not many people know about them and that you can implement your own.

Sounds good.

Regarding error types, why do we need new error types for that PayloadError doesn't cover?

Good question -- the goal is that users should only be forced to handle errors that are thrown by their intended content-type deserializer. PayloadError works great with the current behavior in the case where the deserializer is selected at run-time, since it's really the client who chooses it based on the content-type header.

If we flip that decision to the server with this new API, we can strongly guarantee that only related error types will be thrown by using a more specific error type; so, when I'm calling .json(), I no longer need to consider the possibility of a PayloadError::WwwFormUrlEncoded error, because the compiler guarantees that won't happen; whereas, if we reuse PayloadError, I still have to write a handler for other content types even if I'm explicitly using the json() method, and the addition of any other content type would be a breaking change (e.g. if we add PayloadError::TextHtml, any user who fully matches the PayloadError type now has to add one more arm).

The errors that you're creating are pretty much the same as the variants in that enum.

I'm open to feedback on this. This was basically my thought process:

  1. the extension already has some coupling with serde, which is okay imo given its popularity
  2. it seems like serde_json::error::Error provides a lot of rich information so we shouldnt' need additional error scenarios layered on top
    • if this is a bad assumption, maybe we should use an enum to be safe?

@calavera
Copy link
Contributor

That makes sense. I have no naming preference for the error types, tbh. Feel free to open a PR with these changes and we can discuss it further in code review.

@mlaota
Copy link
Contributor Author

mlaota commented Sep 3, 2024

PR above is ready for review.

(Sorry for the mention spam there, that was because of git amends + pushes to work on different laptops lol)

calavera pushed a commit that referenced this issue Sep 4, 2024
BREAKING CHANGE: adds additional methods to the `RequestPayloadExt` trait.

implements RFC #917
@calavera calavera closed this as completed Sep 4, 2024
Copy link

github-actions bot commented Sep 4, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants