net/http: http2 requests blocked after timeout #57478
Comments
CC @neild. |
Thanks for splitting this out. Can you clarify what specifically the problem is? It sounds like the number of goroutines is the bug for you ("I expected all go routines that the std library creates to handle the HTTP request to be cleaned up"), but looking at the goroutine stacks I wonder if the larger problem is that the Transport continues to schedule new requests onto the stuck http2 connection even after many requests on that connection have all timed out.
I see goroutines doing the following outbound http/2 work:
16 outbound http/2 connections that exist in the app (rooted at
11 requests the app is trying to send over http/2 right now (in
1258 + 12 trying to clean up requests that were scheduled onto the stuck connection (in
9+2 in
1 more in
1 goroutine trying to write a frame to an http/2 client connection (
The following outbound http/1.1 work (I think unrelated, but helpful to classify):
The following inbound http/2 work (I think unrelated, but helpful to classify):
As I understand it:
|
Hi @rhysh , well, the bug isn't necessarily the go routines themselves, but rather the potential for these resources, e.g. go routines, memory, etc. to keep growing because of the problem you stated. |
I think I'm seeing a manifestation of the same problem here. HTTP/2 requests where the context deadline has been reached, but the requests remain outstanding, with each goroutine blocked trying to acquire a mutex in abortStream. In this case, the upstream server is not responding. Here are the active goroutines:
|
How to fix😂 |
@stampy88 Do you have any ideas for how to fix it? |
Hi @stampy88, I use net/http less these days than I did in early 2023. I don't have a fix, and I don't have plans to make one. Maintainer time for http/2 is scarce (see https://dev.golang.org/owners for each package's list). We'll do best if we can make a little of it go a long way. Here's what I think would help:
@stampy88, you'd initially said you were working on a reproducer. I assume you would have said if you'd finished ... but otherwise, saying a bit about what you tried that didn't work (or which worked only partially) might help.
@90wukai (and the now-deleted Jan 21 poster), it sounds like you're affected too. Even if you don't have a lot of time for digging, it would help to say which Go version you use that has the problem ... and if you're using x/net/http2 directly then the version of that as well. If you have more time, it could help to say a bit about the impact this has on your programs (on a scale of "kinda annoying every couple months" to "multi-hour system outage almost daily"), and some background on what your systems are "like". Especially if there are ways you suspect that your use of Go is "unusual", which might make this issue appear more frequently than in the general population of Go users.
@adg, thanks for the goroutine profile. I assume you run an up-to-date Go version, and the line numbers match go1.20. But I don't see any singleton goroutines with .../h2_bundle.go code on their stacks which might be holding the lock that the other 7+7+7=21 goroutines are trying to acquire. And the panic in |
Sorry @rhysh, I was unable to reproduce it consistently, and I have disabled HTTP/2 in the legacy app that was having the issue. It was so long ago that I don't recall what I did. I'll try to dig up the code that was the basis for my reproducer, but I have a bad feeling I don't have it anymore. |
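For readers who want to apply the same workaround, here is a minimal sketch of one way to keep a client on HTTP/1.1. It relies on the behavior documented in net/http: a Transport whose TLSNextProto is a non-nil, empty map does not enable HTTP/2 (setting GODEBUG=http2client=0 is the documented alternative that needs no code change). The helper names `newHTTP1OnlyClient` and `protoAgainstHTTP2Server` are hypothetical and not taken from the app discussed in this issue, and the local test server exists only to show which protocol gets negotiated.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// newHTTP1OnlyClient (hypothetical helper) builds a client that never
// upgrades to HTTP/2: a non-nil, empty TLSNextProto map is the documented
// way to disable HTTP/2 on a Transport.
func newHTTP1OnlyClient(timeout time.Duration, tlsConf *tls.Config) *http.Client {
	return &http.Client{
		Timeout: timeout,
		Transport: &http.Transport{
			Proxy:           http.ProxyFromEnvironment,
			TLSClientConfig: tlsConf,
			TLSNextProto:    map[string]func(string, *tls.Conn) http.RoundTripper{},
		},
	}
}

// protoAgainstHTTP2Server starts a local TLS server that offers HTTP/2 and
// reports which protocol the HTTP/1-only client ended up using.
func protoAgainstHTTP2Server() (string, error) {
	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Proto)
	}))
	srv.EnableHTTP2 = true
	srv.StartTLS()
	defer srv.Close()

	// Trust the test server's self-signed certificate.
	pool := x509.NewCertPool()
	pool.AddCert(srv.Certificate())

	client := newHTTP1OnlyClient(5*time.Second, &tls.Config{RootCAs: pool})
	resp, err := client.Get(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return resp.Proto, nil
}

func main() {
	proto, err := protoAgainstHTTP2Server()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("negotiated protocol:", proto)
}
```

This avoids the stuck HTTP/2 connection by never creating one. It does not address the underlying behavior reported here, and it gives up HTTP/2 multiplexing for that client.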
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
I have not attempted this yet, as it is hard to reproduce. I am trying to write a reproducer.
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
The application does HTTP POSTs to configured client endpoints whenever an event is received. The HTTP client is configured with a 5 second timeout so that long-running requests don't block an event consumer for too long.
What did you expect to see?
When a timeout occurs, I expected all go routines that the std library creates to handle the HTTP request to be cleaned up.
What did you see instead?
The server the app is connecting to supports HTTP/2. During periods when timeouts are occurring, we can see the number of go routines steadily increase until the server it is trying to communicate with starts responding again. See stack traces below: