net/http: http2 clientConnPool may cache a dead connection forever (write tcp use of closed network connection) #39750
Comments
cc @fraenkel
This covers multiple issues already reported, some awaiting fixes, others awaiting responses. There is no guarantee that the http side and the http2 side are actually in agreement. There are many more cases where the http side believes that http2 connections are still alive and valid; the detection only kicks in when those connections are actually used. The only safe way out of many of these situations currently is setting write timeouts.
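A minimal sketch of that write-timeout idea, assuming one wraps the dialed connection (deadlineConn and newClient are illustrative names, not part of this report): every Write first arms a deadline on the underlying net.Conn, so a stuck write surfaces as an error instead of hanging.

```go
package main

import (
	"context"
	"net"
	"net/http"
	"time"
)

// deadlineConn wraps a net.Conn so that every Write first arms a write
// deadline on the underlying connection. (Illustrative sketch only.)
type deadlineConn struct {
	net.Conn
	writeTimeout time.Duration
}

func (c *deadlineConn) Write(p []byte) (int, error) {
	if err := c.Conn.SetWriteDeadline(time.Now().Add(c.writeTimeout)); err != nil {
		return 0, err
	}
	return c.Conn.Write(p)
}

// newClient returns an *http.Client whose TCP writes fail after d instead
// of blocking forever. (Hypothetical helper name.)
func newClient(d time.Duration) *http.Client {
	dialer := &net.Dialer{Timeout: 10 * time.Second, KeepAlive: 30 * time.Second}
	tr := &http.Transport{
		// Supplying a custom dialer disables automatic HTTP/2, so opt back in.
		ForceAttemptHTTP2: true,
		DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
			conn, err := dialer.DialContext(ctx, network, addr)
			if err != nil {
				return nil, err
			}
			return &deadlineConn{Conn: conn, writeTimeout: d}, nil
		},
	}
	return &http.Client{Transport: tr}
}

func main() {
	client := newClient(10 * time.Second)
	_ = client // use client.Get(...) as usual
}
```

As the next comment points out, this only bounds how long a write can block; a failed write still leaves the dead connection in the pool.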
@fraenkel I just went through the cases you provided above. It looks quite similar to my problem. For now, the underlying TCP connections mostly rely on the readLoop for dead-connection detection.
I agree with this detection strategy. But in my case, the readLoop is blocked on an idle connection, so nothing is ever detected.
Setting a write timeout only fixes the application-hanging problem you mentioned above; it has no impact on dead connections. The code shows that we currently do not perform any detection on the write path: when a write fails, the error is returned to the caller and nothing else is done.
HTTP/2 support in Go has many problematic corner cases where dead connections would be kept: golang/go#32388, golang/go#39337, golang/go#39750. I suggest we disable HTTP/2 for now and enable it manually on the blackbox exporter. Signed-off-by: Julien Pivotto <[email protected]>
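For reference, the way to disable HTTP/2 on a Go client documented in net/http is to set Transport.TLSNextProto to a non-nil, empty map. A minimal sketch:

```go
package main

import (
	"crypto/tls"
	"net/http"
)

func main() {
	// Per the net/http docs, a non-nil, empty TLSNextProto map disables
	// HTTP/2; requests then use HTTP/1.1 even over TLS.
	tr := &http.Transport{
		TLSNextProto: map[string]func(string, *tls.Conn) http.RoundTripper{},
	}
	client := &http.Client{Transport: tr}
	_ = client // use client for plain HTTP/1.1 requests
}
```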
Ping @fraenkel. Any updates, or a plan to enable it by default?
Is this a duplicate of #39337?
@roidelapluie Thanks a lot. I will close this issue. |
What version of Go are you using (go version)?

Does this issue reproduce with the latest release?

Yes. It happens after kubelet was re-compiled with Go 1.14.4.
But I saw a commit "connection health check" on golang/net@0ba52f6. It should partly help avoid this problem, yet it is not enabled by default.
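For readers looking for that mitigation, a minimal sketch of turning the health check on explicitly (assuming a golang.org/x/net version that includes the health-check change; the timeout values are illustrative):

```go
package main

import (
	"log"
	"net/http"
	"time"

	"golang.org/x/net/http2"
)

func main() {
	t1 := &http.Transport{} // existing transport settings go here

	// ConfigureTransports wires HTTP/2 into t1 and returns the underlying
	// *http2.Transport so its fields can be tuned.
	t2, err := http2.ConfigureTransports(t1)
	if err != nil {
		log.Fatal(err)
	}
	// If no frame is received for ReadIdleTimeout, send a PING as a health
	// check; close the connection if it is not answered within PingTimeout.
	t2.ReadIdleTimeout = 30 * time.Second
	t2.PingTimeout = 15 * time.Second

	client := &http.Client{Transport: t1}
	_ = client
}
```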
What operating system and processor architecture are you using (go env)?

What did you do?
If we close the connection via conn.Close, the readLoop will immediately receive a "read tcp ... use of closed connection" error. Then a cleanup will be performed: notifying every stream on this connection, removing the broken connection from the clientConnPool, etc. Everything seems fine, right?
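To make that read-side detection concrete, here is a standalone toy (not the actual http2 code; the setup is only illustrative) showing that a Read blocked on a connection returns immediately with a "use of closed network connection" error once the connection is closed locally:

```go
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	// Loopback listener standing in for the remote side.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer ln.Close()
	go func() {
		c, _ := ln.Accept()
		if c != nil {
			defer c.Close()
		}
		time.Sleep(time.Second) // the peer stays silent
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		panic(err)
	}

	// Stand-in for the http2 readLoop: a goroutine blocked in Read.
	readErr := make(chan error, 1)
	go func() {
		buf := make([]byte, 1)
		_, err := conn.Read(buf)
		readErr <- err
	}()

	conn.Close() // closing locally wakes the blocked Read immediately
	fmt.Println("read returned:", <-readErr) // "... use of closed network connection"
}
```

The trouble described below is the opposite case: nothing ever arrives on the read side, so this detection never fires.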
But sometimes the readLoop of a clientConn may block on an idle connection. And the writeHeaders method of clientConn in roundTrip never closes a broken connection (it just reports errors to its caller). This is where the problem begins.
Imagine this: an incoming request is about to use this idle connection. Right after it is taken from the pool, the connection dies, so the writeHeaders method fails with a "write tcp ... use of closed connection" error and leaves the broken connection untouched in the pool.
https://github.com/golang/net/blob/627f9648deb96c27737b83199d44bb5c1010cbcf/http2/transport.go#L1083-L1100
Since there is no traffic on the connection, the readLoop will block, so no cleanup will be triggered either. Plus, incoming requests keep calling roundTrip -> newStream, which bumps the stream count on the conn and makes it look "busy", so closeIfIdle will not clean up the dead connection either.

An actual case
kubernetes/kubernetes#87615
We encountered this problem several times on highly loaded VMs. Heavy load may lead to some VMs being temporarily suspended by the master machine, which is why a connection can die right after it is taken from the pool. But this scenario is hard to reproduce.

kubelet keeps using dead connections.