Add a connection pool #105


Merged
merged 1 commit into swift-server:master on Feb 25, 2020

Conversation

PopFlamingo
Contributor

@PopFlamingo PopFlamingo commented Sep 13, 2019

This commit adds a basic connection pool implementation. It is still very much a WIP, but it includes breaking changes that probably need to be taken care of before the 1.0.0 tag (see the comment below).

I'll add as many comments as possible tonight to explain the parts that need it.

Let me know what you think! 🙂

There are many things to fix before the PR is truly ready for review/merge:

Making HTTPBin ready for pooling tests:

  • Add delayed responses (enables testing the creation of new connections by the connection providers when existing ones are in use)
  • Add connections that close themselves after they have served a request
  • Channels that close themselves after a certain delay (to test how the pool reacts when a stored channel closes)
  • Simulate connection delays (useful for performance testing of pooled vs non-pooled)

Misc issues/improvements

  • Fix an issue where an existing ConnectionProvider does not add the proper SSL handler for HTTPS connections
  • Avoid holding locks for as long as is currently the case

Testing

  • Stress tests for the pool
  • Check that the maximum number of concurrent connections is respected for HTTP/1.1

Is this needed?

  • Add a performance comparison between the pooled and non-pooled client

Not in this commit:
HTTP/2:

  • Add HTTP/2 support to HTTPBin
  • Finish and test the implementation of the HTTP/2 connection provider
  • Correct detection of HTTP/2 when it's available

Collaborator

@Lukasa Lukasa left a comment

The general shape of this seems good, but I think we need a lot more commenting explaining what the strategy is here. There's a decent amount of code duplication too that could be cleaned up. However, it's definitely a great start, well done!

@PopFlamingo
Contributor Author

@Lukasa

The general shape of this seems good, but I think we need a lot more commenting explaining what the strategy is here. There's a decent amount of code duplication too that could be cleaned up. However, it's definitely a great start, well done!

Thank you! I'll add comments to explain everything.

@Lukasa @t089 I will fix the parts where you noticed issues, thank you for your feedback! There is also quite a lot of code that needs to be cleaned up.

The main goal of doing this draft PR right now is to ensure breaking changes can be added before the 1.0.0 tag, even if actual pooling is postponed until after the release. From what I remember of previous discussions, being able to add pooling entirely transparently in a SemVer minor version was something that would be nice to have; on this subject, I think the breaking changes might need to be merged before the release tag.

I think #102 already addresses what's needed for thread-safety issues and for the pool to have enough information to make better decisions for performance, but there is another part to consider: retried requests.
Since a connection pool increases the chances of failure due to connections being kept alive for longer, an automatic retry strategy is needed. Note that connection "failure" for non-leased connections already has basic, minimal support in the current implementation: if a connection is closed, it is immediately removed from the pool.
As the implementation matures, the TaskHandler will probably want to detect when an in-flight task fails and retry it automatically if possible, as we had discussed with @Lukasa and @weissi. AFAICT this mechanism might need breaking changes to the HTTPClientResponseDelegate: when a request is retried, some delegates will need to reset their internal state, so a delegate method such as willRetryRequest(...) -> EventLoopFuture<Void> might be needed.

About this I have two ideas:

  • Make a breaking change before the 1.0.0 tag by requiring all delegate implementations to explicitly handle retries; this way a connection pool with retries can be added entirely transparently in the future.
  • Make no change to the current requirements of the delegate and add an opt-in change in the future (by making a new type of delegate for retries or adding a default implementation): in this case, a connection pool could still be added, but no SemVer minor change could automatically add the retry mechanism for everyone unless the library user explicitly starts handling retries.

I'm not really sure what the best approach would be, so I would appreciate your opinion on this; if needed, I can also open it as a separate issue.
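For illustration only, here is a rough sketch of what such a retry hook could look like. The protocol name, the parameter, and the default implementation below are hypothetical and not part of this PR:

import NIO

// Hypothetical sketch (not part of this PR): a retry hook a response delegate
// could implement so it can reset any accumulated state (e.g. a partially
// buffered body) before the request is executed again on another connection.
protocol RetryAwareResponseDelegate {
    func willRetryRequest(on eventLoop: EventLoop) -> EventLoopFuture<Void>
}

extension RetryAwareResponseDelegate {
    // A possible default implementation: nothing to reset, allow the retry immediately.
    func willRetryRequest(on eventLoop: EventLoop) -> EventLoopFuture<Void> {
        return eventLoop.makeSucceededFuture(())
    }
}

A default implementation like this would keep existing delegates compiling, which is essentially the trade-off between the two options above.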

@Lukasa
Collaborator

Lukasa commented Sep 16, 2019

Make a breaking change before the 1.0.0 tag by requiring all delegate implementations to explicitly handle retries; this way a connection pool with retries can be added entirely transparently in the future.

I'm quite tempted by this approach, as it allows us to potentially shim a solution in by having an automatic-retry delegate that can be composed with a user-provided delegate in some way. In general, trying to do too much is likely to go badly for us. I think it may be a good idea to explicitly say that retries must be handled by delegates (and to add some API surface for doing so), as it allows us to punt on making the decision any further.

However, @weissi may be opposed to adding more API for this.

@PopFlamingo
Contributor Author

PopFlamingo commented Sep 16, 2019

@Lukasa

I think it may be a good idea to explicitly say that retries must be handled by delegates (and to add some API surface for doing so)

Just to be sure of what you mean exactly by "handled by delegates": do you mean something similar to willRetryRequest(...)? Or do you mean an even more "manual" handling?

@PopFlamingo
Contributor Author

PopFlamingo commented Sep 18, 2019

Commit 0d33412 adds new options, passed as Connection: close and X-internal-delay: [delayInMilliseconds] HTTP headers, to control how requests are served (respectively: closing the connection after the request is served, and adding a delay to the HTTP response). There is also a new HTTPBin init option to control the initial connection delay of a client to the server, and a new option to make connections (i.e. Channels opened by the clients on the server) close after a certain delay.
Some new tests already take advantage of this, for instance testStressGetHttps and testResponseConnectionCloseGet.
The commit also fixes miscellaneous code and documentation issues.
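For illustration, a test exercising these options might look roughly like the sketch below. Only the header names come from the commit description above; the test name, the httpBin.port fixture, and the assertion are assumptions:

import AsyncHTTPClient
import NIOHTTP1
import XCTest

// Rough sketch only: exercises the delay and connection-close options described above.
// `httpBin.port` stands in for whatever test server fixture the test suite provides.
func testDelayedResponseThenClose() throws {
    let client = HTTPClient(eventLoopGroupProvider: .createNew)
    defer { XCTAssertNoThrow(try client.syncShutdown()) }

    var request = try HTTPClient.Request(url: "http://localhost:\(httpBin.port)/get")
    request.headers.add(name: "X-internal-delay", value: "100") // delay the response by 100 ms
    request.headers.add(name: "Connection", value: "close")     // close the connection once the request is served

    let response = try client.execute(request: request).wait()
    XCTAssertEqual(response.status, .ok)
}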

@weissi
Contributor

weissi commented Sep 22, 2019

I'm quite tempted by this approach, as it allows us to potentially shim a solution in by having an automatic-retry delegate that can be composed with a user-provided delegate in some way. In general, trying to do too much is likely to go badly for us. I think it may be a good idea to explicitly say that retries must be handled by delegates (and to add some API surface for doing so), as it allows us to punt on making the decision any further.

However, @weissi may be opposed to adding more API for this.

I think I'm down with that. I think we should make automatic retrying very easy for somebody to implement in their own delegate. That could work either with composition as @Lukasa proposed or possibly with a default implementation?

@kylebrowning

kylebrowning commented Nov 15, 2019

I'm trying to bring this back to light because it speeds up AsyncHTTPClient considerably.

Without

SwiftNIO run time: 0.3538249731063843
SwiftNIO run time: 0.3536069393157959
SwiftNIO run time: 0.33445894718170166
SwiftNIO run time: 0.30274498462677
SwiftNIO run time: 0.3761049509048462
SwiftNIO run time: 0.5195209980010986
SwiftNIO run time: 0.36058497428894043
SwiftNIO run time: 0.29895198345184326
SwiftNIO run time: 0.3660769462585449
SwiftNIO run time: 0.29245102405548096
URLSession run time: 0.35767805576324463
URLSession run time: 0.3387099504470825
URLSession run time: 0.20786607265472412
URLSession run time: 0.33779799938201904
URLSession run time: 0.21909499168395996
URLSession run time: 0.210394024848938
URLSession run time: 0.2118690013885498
URLSession run time: 0.20524907112121582
URLSession run time: 0.20901107788085938
URLSession run time: 0.21499598026275635

With Connection pooling

SwiftNIO run time: 0.33787500858306885
SwiftNIO run time: 0.1476989984512329
SwiftNIO run time: 0.20796406269073486
SwiftNIO run time: 0.2085040807723999
SwiftNIO run time: 0.20637094974517822
SwiftNIO run time: 0.15462100505828857
SwiftNIO run time: 0.13620400428771973
SwiftNIO run time: 0.14024698734283447
SwiftNIO run time: 0.13931310176849365
SwiftNIO run time: 0.2430809736251831
URLSession run time: 0.35739099979400635
URLSession run time: 0.26072800159454346
URLSession run time: 0.26291704177856445
URLSession run time: 0.26088404655456543
URLSession run time: 0.20262694358825684
URLSession run time: 0.19789695739746094
URLSession run time: 0.33044493198394775
URLSession run time: 0.2090669870376587
URLSession run time: 0.20859694480895996
URLSession run time: 0.21186602115631104

I also ran it on an iOS device with 10 requests; this is the total combined time in release mode.

---no-pooling---
SwiftNIO run time: 1.8052639961242676
URLSession run time: 1.2042559385299683
---pooling---
SwiftNIO run time: 1.191763997077942
URLSession run time: 1.1281760931015015

I worked on the PR a bit, but can't seem to get the decompression test to pass.
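For context, a timing loop of the kind behind the numbers above could look roughly like this; it is a sketch only, and the endpoint URL and the count of 10 sequential requests are assumptions:

import AsyncHTTPClient
import Foundation

// Rough sketch of a pooled-vs-unpooled timing run; not the exact benchmark used above.
func timeRun(label: String, url: String) throws {
    let client = HTTPClient(eventLoopGroupProvider: .createNew)
    defer { try? client.syncShutdown() }

    let start = Date()
    for _ in 0..<10 {
        _ = try client.get(url: url).wait() // sequential GETs; pooling saves the per-request connection setup
    }
    print("\(label) run time: \(Date().timeIntervalSince(start))")
}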

@PopFlamingo PopFlamingo reopened this Nov 15, 2019
@PopFlamingo PopFlamingo marked this pull request as ready for review November 16, 2019 00:39
@PopFlamingo PopFlamingo changed the title WIP: Prototype connection pool implementation Add a connection pool Nov 16, 2019
@PopFlamingo
Contributor Author

PopFlamingo commented Nov 16, 2019

The PR is now ready for review! 🙂

cc @weissi @Lukasa @t089 @tomerd @artemredkin

Thanks to the precious help of @kylebrowning, this PR now includes the latest changes from master and all tests are passing. The benchmarks are really insightful too and let us confirm that these changes are beneficial to performance.

What do you think about the connection pool implementation? What can be improved / made clearer?

Thank you!

A few things to check in all cases before merging:

  • Check there is no problematic reference cycle between connections and the pool
  • Remove HTTP/2-related pooling; this PR focuses only on HTTP/1.1

@weissi weissi requested review from artemredkin and Lukasa November 18, 2019 15:34
/// throw the appropriate error if needed. For instance, if its internal connection pool has any non-released connections,
/// this indicates that shutdown was called too early, before tasks were completed or explicitly canceled.
/// In general, setting this parameter to `true` should make it easier and faster to catch related programming errors.
public func syncShutdown(requiresCleanClose: Bool) throws {
Contributor

do we need this as public API? I think this can remain internal, no?

Contributor Author

@weissi I think it might need to be public because syncShutdown() calls syncShutdown(requiresCleanClose: false) for backwards compatibility, but since users should preferably do a clean close, they need to be able to call syncShutdown(requiresCleanClose: true). WDYT?
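A simplified sketch of the relationship described here (illustrative only):

// The parameterless shutdown keeps its old, lenient behaviour for backwards compatibility,
// while the new parameter lets callers opt into a strict, clean close.
public func syncShutdown() throws {
    try self.syncShutdown(requiresCleanClose: false)
}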

Contributor

but if you were a user of this library, wouldn't you be super confused about which shutdown to use and what parameter to pass? I'd just not expose syncShutdown(requiresCleanClose:) (it can of course be used in tests). If we feel that it's super important, we can always make it public later.

Contributor Author

I was worried about the user shutting down the client too early and having difficulties understanding why all their tasks are getting cancelled, but I recognise requiresCleanClose is a bit confusing and the docs I added are not as clear as they could be. I'll remove it for now then; it's never too late to add it later.

Collaborator

Were you planning to remove this?

Contributor Author

@PopFlamingo PopFlamingo left a comment

Comments from looking at the test coverage

// if the `throw` wasn't delayed until everything
// is executed
if closeError == nil {
closeError = error
Contributor Author

Is there any possible error other than ChannelError.alreadyClosed on close? If there is a reproducible one, it might be good to add a test that makes this function throw.

Contributor

depends on the channel handlers in the pipeline to be precise. I think ignoring alreadyClosed and surfacing everything else makes sense
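A minimal sketch of that suggestion, assuming a plain NIO Channel close (the helper name is illustrative):

import NIO

// Swallow `alreadyClosed`, surface any other close error.
func closeIgnoringAlreadyClosed(_ channel: Channel) -> EventLoopFuture<Void> {
    return channel.close().flatMapErrorThrowing { error in
        if (error as? ChannelError) == .alreadyClosed {
            return // closing an already-closed channel is fine
        }
        throw error // anything else is a real problem and should be surfaced
    }
}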

Contributor

@weissi weissi left a comment

this is looking really good! I think there's a race left though (and a few nits)

@PopFlamingo
Contributor Author

Thanks for the new test in 92aedbd, @weissi!

Collaborator

@artemredkin artemredkin left a comment

Looks very good! I have one question to discuss: right now we have a fixed maximum number of concurrent connections; should we make it a configuration option, or skip that for now? What are your thoughts?

@weissi
Contributor

weissi commented Feb 9, 2020

@artemredkin why not start with fixed and add config later in a separate PR?

@PopFlamingo
Contributor Author

@artemredkin Thanks!

Latest commit fixes the bugs found by @weissi

@weissi weissi modified the milestones: 1.1.0, 1.2.0 Feb 12, 2020
@weissi weissi requested a review from Lukasa February 12, 2020 18:08
}

/// Get a new connection from this provider
func getConnection(preference: HTTPClient.EventLoopPreference) -> EventLoopFuture<Connection> {
Collaborator

In some sense it doesn't matter which event loop this future returns on, so long as it is always consistent about that choice. Otherwise it becomes impossible to write thread-safe programs. Given that we allow users to express an event loop preference here, it seems sensible to follow the idea that the future should be returned on the event loop they prefer.

There are two good choices of which event loop the future callback should run on:

  1. The event loop of the returned connection.
  2. The event loop of the caller.

Both of these are defensible, depending on what the user is exactly trying to do. It doesn't matter too much which of those we choose, but we should be attempting to guarantee that this is the case. Here we lose track of what the event loop of the future we're returning is, so it'd be good to keep track of it.
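One way to keep that guarantee, sketched with a hypothetical internal helper (leaseOrCreateConnection and the exact signature are illustrative, not the PR's actual code):

import NIO

// Illustrative sketch: whichever loop the pooled connection happens to live on,
// hop the returned future onto one well-defined loop so the caller always gets
// callbacks in a predictable place.
func getConnection(preferredEventLoop: EventLoop) -> EventLoopFuture<Connection> {
    let lease: EventLoopFuture<Connection> = self.leaseOrCreateConnection() // hypothetical helper
    // `hop(to:)` returns an equivalent future whose callbacks run on the target loop.
    return lease.hop(to: preferredEventLoop)
}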


@PopFlamingo
Contributor Author

@Lukasa Thanks for the nits, I'll check and address your other review comments soon.

@Lukasa
Collaborator

Lukasa commented Feb 18, 2020

Great, please @ me when you're ready for a re-review.

@PopFlamingo PopFlamingo requested a review from Lukasa February 20, 2020 18:09
@PopFlamingo
Contributor Author

This PR is ready for a new round of reviews! 🙂
All FIXMEs and change requests have now been addressed, and the latest commits contain fixes for all the unhandled errors and synchronisation issues I knew of.

If you see no additional issues, I think the PR is ready to be merged. I'll squash the commits and write a proper commit message once the PR is approved.

Thanks

Contributor

@weissi weissi left a comment

Thank you so much! I am happy to merge this as is. This is a very major milestone for AsyncHTTPClient; thanks to everybody involved, most importantly @adtrevor.

Collaborator

@artemredkin artemredkin left a comment

Same here, thank you @adtrevor!

motivation: Better performance thanks to connection reuse

changes:
- Added a connection pool for HTTP/1.1
- All requests automatically use the connection pool
- Up to 8 parallel connections per (scheme, host, port)
- Multiple additional unit tests
Collaborator

@Lukasa Lukasa left a comment

Let's do it. Thanks so much for your work @adtrevor!

@Lukasa Lukasa merged commit 19e2ea7 into swift-server:master Feb 25, 2020
@kylebrowning

YAY!

@PopFlamingo
Contributor Author

Thank you all for your reviews and advice, I learned a lot doing this! 🙂

Special thanks to you too @kylebrowning for bringing this back up a few months ago!

Labels
🆕 semver/minor Adds new public API.

8 participants