Skip to content

Optimize HTTP headers parsing #1977

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 11, 2018
Merged

Conversation

bclozel
Copy link
Member

@bclozel bclozel commented Oct 3, 2018

Several benchmarks underlined a few hotspots for CPU and GC pressure in
the Spring Framework codebase:

  • org.springframework.util.MimeType.<init>(String, String, Map)
  • org.springframework.util.LinkedCaseInsensitiveMap.convertKey(String)

Both are linked with HTTP request headers parsing and manipulation
during the request processing phase.

The first one is linked to repeated calls to
HttpHeaders.getContentType within a single request handling. The media
type parsing operation is expensive and the result doesn't change
between calls, since the request headers are immutable at that point.

This commit improves this by caching the parsed MediaType for the
"Content-Type" request header in the ReadOnlyHttpHeaders class. This change
is available for both Spring MVC and Spring WebFlux.

The second one is linked to insertions/lookups in the
LinkedCaseInsensitiveMap, which is the data structure behind
HttpHeaders. Those operations are creating a lot of garbage (including
a lot of String created by toLowerCase). We could choose a more
efficient data structure for storing HTTP headers data.

As a first step, this commit is focusing on Spring WebFlux and
introduces MultiValueMap implementations mapped by native HTTP headers
for the following servers: Tomcat, Jetty, Reactor Netty and Undertow.
Such implementations avoid unnecessary copying of the request headers
and leverages as much as possible optimized operations provided by the
native implementations.

This change has a few consequences:

  • HttpHeaders can now wrap a MultiValueMap directly
  • The default constructor of HttpHeaders is still backed by a
    LinkedCaseInsensitiveMap
  • The HTTP request headers for the websocket HTTP handshake now need to
    be cloned, because native headers are likely to be pooled/recycled by
    the server implementation, hence gone when the initial HTTP exchange is
    done

Issue: SPR-17250

@@ -153,6 +153,9 @@ public String getFirst(String headerName) {
Assert.isInstanceOf(String.class, key, "Key must be a String-based header name");

Collection<String> values1 = servletResponse.getHeaders((String) key);
if (headersWritten) {
return new ArrayList<>(values1);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without that change, I'm seeing duplicate response headers because both the native servlet response ones and the ones held by Spring's servlet response are added. Once the response is committed and headers written, reading headers from both yields duplicates. I think this is a bug uncovered by other changes in this PR.

@bclozel
Copy link
Member Author

bclozel commented Oct 3, 2018

Hey @jhoeller , @rstoyanchev

As promised, here's a first review cycle for this PR.

Caching the parsed content-type is done for both Spring WebFlux and Spring MVC, but the native headers optimization is not applied to Spring MVC because 1) this might not be the main performance issue in Spring MVC right now and 2) there's only so much we can do with the Servlet API as a common bassis. We can revisit that later if we need to.

For now, this PR is trying to stick as much as possible to the previous behavior. I've added MultiValueMap implementations that wrap the native headers. Those are often then wrapped by an immutable MultiValueMap implementation when the request is declared as read only.

I'm currently running local benchmarks to measure the difference in CPU + allocation overhead created by Spring. I'll get back to you on that soon.

Possible improvements

We could go one step further and make those wrappers immutable from the start, but this clashes with the general architecture and might not work in certain places.

The content-type optimization only kicks in for HTTP PUT/POST requests where we actually look at the request content-type. The second optimization only applies to Spring WebFlux. We could improve the Spring MVC situation a bit by using a different implementation than LinkedCaseInsensitiveMap. My benchmarks show that a simple List + String.equalsIgnoreCase lookups or even a TreeMap + case insensitive hashcode yield better results, since toLowerCase is the expensive part here (creating String and the complex logic with Locale).

Let me know what you think!

@bclozel bclozel force-pushed the SPR-17250 branch 3 times, most recently from 623b151 to 627bc96 Compare October 9, 2018 21:08
@bclozel
Copy link
Member Author

bclozel commented Oct 9, 2018

I've applied @poutsma 's comments (and we fixed some more things as well).
I've just applied the same reasoning to the HTTP response headers and made sure to avoid HTTP headers copying in DefaultServerHttpRequestBuilder - those changes were applied in two new commits.

public static HttpHeaders writableHttpHeaders(HttpHeaders headers) {
Assert.notNull(headers, "HttpHeaders must not be null");
if (headers instanceof ReadOnlyHttpHeaders) {
return new HttpHeaders(headers.headers);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ReadOnlyHttpHeaders is a wrapper around HttpHeaders, couldn't we check here if headers.headers is an instance of HttpHeaders and if so return it directly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, headers.headers is itself a MultiValueMap. @poutsma and I changed ReadOnlyHttpHeaders so that it extends HttpHeaders (and calls super(MultiValueMap)) to avoid issues on the equals contract.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I see.

* @author Brian Clozel
* @since 5.1
*/
class ReactorServerHttpHeaders implements MultiValueMap<String, String> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few questions on the class name.

  1. HttpHeaders is from Netty so probably shouldn't be called Reactor? It could be used with other Netty based runtimes too besides Reactor.
  2. The Netty HttpHeaders are for client and server use, so "Server" is a misnomer.
  3. Both Netty and Spring have "HttpHeaders", and this class is an instance of neither, so calling it HttpHeaders is not ideal. It is a MultiValueMap but that would be long. It is an Adapter so perhaps NettyHeadersAdapter?

For Jetty the adapted HttpFields is also both client and server. Perhaps JettyHeadersAdapter.

For Tomcat and Undertow they're server specific but no strong reason to use "Server" (what else could they be?) so TomcatHeadersAdapter and UndertowHeadersAdapter for consistency.

@rstoyanchev
Copy link
Contributor

Sorry but it looks I left some of my comments on the original commit bclozel@320433b#diff-9b3604016549e725bd1a952499ba5c6c and they do not show up here..

Overall this looks alright.

One general concern I have is with Tomcat and Jetty where basic operations like get, put, remove, create a new List per invocation. I get why Tomcat and Jetty don't use a Map internally but us having to adapt to List<String> makes things worse. Looking at our HttpHeaders, a lot of the typed get methods actually returns lists of things. Perhaps we should consider adding more caching in the read-only wrapper in order to offset this side effect.

The second comment is the possibility of using the Netty and Jetty header adapters on the client response side seems a straight-forward fit, if it wasn't for the fact there is no convenient package for sharing things between client and server (aside from the top-level http package).

@rstoyanchev
Copy link
Contributor

One more comment. In AbstractServerHttpResponse the getHeaders method returns a new read-only wrapper on every call after committed. Perhaps we can save the read-only wrapper and create it once?

@bclozel bclozel force-pushed the SPR-17250 branch 2 times, most recently from a1df2d2 to b988357 Compare October 10, 2018 20:18
Several benchmarks underlined a few hotspots for CPU and GC pressure in
the Spring Framework codebase:

1. `org.springframework.util.MimeType.<init>(String, String, Map)`
2. `org.springframework.util.LinkedCaseInsensitiveMap.convertKey(String)`

Both are linked with HTTP request headers parsing and response headers
writin during the exchange processing phase.

1) is linked to repeated calls to `HttpHeaders.getContentType`
within a single request handling. The media type parsing operation
is expensive and the result doesn't change between calls, since
the request headers are immutable at that point.

This commit improves this by caching the parsed `MediaType` for the
`"Content-Type"` request header in the `ReadOnlyHttpHeaders` class.
This change is available for both Spring MVC and Spring WebFlux.

2) is linked to insertions/lookups in the `LinkedCaseInsensitiveMap`,
which is the data structure behind `HttpHeaders`.
Those operations are creating a lot of garbage (including a lot of
`String` created by `toLowerCase`). We could choose a more efficient
data structure for storing HTTP headers data.

As a first step, this commit is focusing on Spring WebFlux and
introduces `MultiValueMap` implementations mapped by native HTTP headers
for the following servers: Tomcat, Jetty, Netty and Undertow.
Such implementations avoid unnecessary copying of the headers
and leverages as much as possible optimized operations provided by the
native implementations.

This change has a few consequences:

* `HttpHeaders` can now wrap a `MultiValueMap` directly
* The default constructor of `HttpHeaders` is still backed by a
`LinkedCaseInsensitiveMap`
* The HTTP request headers for the websocket HTTP handshake now need to
be cloned, because native headers are likely to be pooled/recycled by
the server implementation, hence gone when the initial HTTP exchange is
done

Issue: SPR-17250
This commit avoids copying HTTP headers when mutating an HTTP request.
Instead, we're now unwrapping the `ReadOnlyHttpHeaders` (which is most
likely backed by the native request headers).

Issue: SPR-17250
@bclozel bclozel merged commit f12c28e into spring-projects:master Oct 11, 2018
@bclozel bclozel deleted the SPR-17250 branch January 31, 2022 20:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants