Skip to content

Querystrings that have percent encodings get double encoded #99

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
heathprovost opened this issue Oct 8, 2022 · 11 comments
Closed

Querystrings that have percent encodings get double encoded #99

heathprovost opened this issue Oct 8, 2022 · 11 comments

Comments

@heathprovost
Copy link

Before I begin I do not have much experience with Rust or the details of its various packages, but I did try to research this before posting it.

Problem: Whenever a url that contains a querystring with percent encoded values is passed through the adapter, it gets double encoded because (I think) the public function Url.query(&self) -> Option<&str> automatically applies percent encoding to the Url's existing querystring. The now double-percent-encoded querystring is applied to the new Url that is passed to .set_query() here: https://github.com/awslabs/aws-lambda-web-adapter/blob/main/src/lib.rs#L236

Since this adapter operates on values as they are passed to a server, generally any percent encoded that is required should already have been applied, so doing any encoding here (imo) is an error. I believe the correct behavior would be to always pass the querystring through unmolested. Alternately the value returned by .query() could be urldecoded before passing it through to set_query()

Here is an example of a url that triggers this issue:

https://test.com/?originalQueryString=%3FshowAll%3Dtrue

and this is the same url after passing through the adapter:

https://test.com/?originalQueryString=%25253FshowAll%25253Dtrue

I did a very amateurish fix for our own temporary use that looks like this, but I have no idea what I am doing with Rust so I expect you will have a better way to do this. That said, this does fix the problem for our purposes.

// ORG CODE
// app_url.set_query(parts.uri.query());

// NEW CODE
let querystring = parts.uri.query().unwrap_or_default();
let decoded_querystring = urlencoding::decode(querystring).unwrap_or_default();
app_url.set_query(Some(&decoded_querystring));

Thanks in advance.

@bnusunny
Copy link
Contributor

bnusunny commented Oct 9, 2022

Thanks for raising this issue. I couldn't reproduce it in my test. Below is a go-httpbin server running on Lambda with the adapter. The echoed query parameter seems correct to me.

https://httpbin-http-zip.dev.adapter.awsguru.dev/get?originalQueryString=%3FshowAll%3Dtrue

Do you have a test repo/environment to reproduce it?

@heathprovost
Copy link
Author

heathprovost commented Oct 9, 2022

Wow... Ok, that is very strange. Ill see if I can create some kind of way to repro. However, one thing we did do before changing anything at all is just turn on debugging and looked at the logs and we saw this:

The original request url was http://ourdomain.com/utility/serverutils?sectionOptions.url=%3FshowAll%3Dtrue. The code that currently logs this when the log level is debug is here: https://github.com/awslabs/aws-lambda-web-adapter/blob/main/src/lib.rs#L237

DEBUG lambda_web_adapter: sending request to app server app_url=http://127.0.0.1:8080/utility/serverutils?sectionOptions.url=%253FshowAll%253Dtrue req_headers=
{
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
    "accept-encoding": "gzip, deflate, br",
    "accept-language": "en-US,en;q=0.9",
    "cache-control": "max-age=0",
    "cookie": "REDACTED",
    "host": "REDACTED",
    "sec-ch-ua": "\"Chromium\";v=\"106\", \"Google Chrome\";v=\"106\", \"Not;A=Brand\";v=\"99\"",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "\"Windows\"",
    "sec-fetch-dest": "document",
    "sec-fetch-mode": "navigate",
    "sec-fetch-site": "none",
    "sec-fetch-user": "?1",
    "upgrade-insecure-requests": "1",
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36",
    "x-amzn-trace-id": "REDACTED",
    "x-forwarded-for": "REDACTED",
    "x-forwarded-port": "443",
    "x-forwarded-proto": "https"
}

The main issue being that app_url http://127.0.0.1:8080/utility/serverutils?sectionOptions.url=%253FshowAll%253Dtrue

Maybe I am mistaken or am not understanding how the logging behaves or something, but that sure looked like the adapter was double encoding things internally before it even sent the request. Shouldn't it just be the same querystring as the original request? Do you see this same behavior if you turn on debugging?

Also, in our environment we do have some unique things going on:

  1. We use an ALB to hit the lambda, not an API gateway
  2. Our lambda actually implements a proxy server, so internally it is proxying (sometimes) to another server.

I don't know for sure if any of this matters or not, but the proxy part is how this came to our attention -because urls that worked fine when we hit the proxied site directly were failing when they went through the lambda adapter (but they worked fine if we passed them through the same proxy running on the same docker image in ECS on fargate). The adapter doesn't run in fargate, so it seems to us the problem must be behavior that happens inside the adapter.

@bnusunny
Copy link
Contributor

bnusunny commented Oct 9, 2022

Which adapter version are you using?

@heathprovost
Copy link
Author

heathprovost commented Oct 9, 2022

Ok... I think I found the problem. Its actually due to the fact that we use an ALB instead of an API Gateway. I found this page and it documents something I did not realize about halfway down the page...

https://serverless-training.com/articles/api-gateway-vs-application-load-balancer-technical-details/

Note that the keys and values of evt.queryStringParameters will be:
  *  already URL decoded for you on API Gateway
      e.g. ?foo%5Ba%5D=foo%5B will be { 'foo[a]': [ 'foo[' ] } 
  *  but not URL decoded on ALB
      e.g. ?foo%5Ba%5D=foo%5B will be { 'foo%5Ba%5D': [ 'foo%5B' ] }

So I think the fix would be to only change this behavior if the request comes from an ALB (which you can detect by looking at the lambda request itself). I.e. the current behavior is correct for API Gateways, but not for ALBs.

edit: We were using 0.4.1 to answer your question

@bnusunny
Copy link
Contributor

bnusunny commented Oct 9, 2022

Oh, thanks for this info. I need to check the event data handling in Lambda Rust Runtime.

@heathprovost
Copy link
Author

Like I said, I dont know Rust very well... But the Lambda Rust Runtime seems to be doing this when it builds the request:

https://github.com/awslabs/aws-lambda-rust-runtime/blob/fd2ea235f4adb166422ca5b612c4a1a42a626daa/lambda-http/src/request.rs#L409

if let Some((mv, sv)) = queries {
  if !mv.is_empty() {
    url.push('?');
    url.push_str(&mv.to_query_string());
  } else if !sv.is_empty() {
    url.push('?');
    url.push_str(&sv.to_query_string());
  }
}

That seems wrong. I would expect to_query_string() is applying URL encoding here, and it is doing it regardless of the event source. It should probably be checking if its an ALB event first and not re-encoding it when it is.

@bnusunny
Copy link
Contributor

Do you mind opening an issue in Lambda Rust Runtime repo? We will take it from there.

@bnusunny
Copy link
Contributor

The PR is merged in Lambda Rust Runtime. I will cut a new release once Rust Runtime release is out.

@heathprovost
Copy link
Author

Thanks!

@bnusunny
Copy link
Contributor

Release v0.5.0 is out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants