Assertion Failure in 1.16.0 for code that worked in 1.15.0 #1598
Comments
Interesting, we must not be correctly counting streams somehow. We appear to be over-returning streams; we'll need to look into this to understand it better.
If it matters, that log prints:
I wasn't able to reproduce this. @edelabar would you be able to run this with a connection pool delegate which logs all the relevant info (something like the snippet below)? This should hopefully provide enough context so we can understand how we reach the assertion failure.

import GRPC

// Logs every connection pool event together with the connection ID and stream
// counts, so the sequence of events leading up to the assertion can be reconstructed.
struct LoggingPoolDelegate: GRPCConnectionPoolDelegate {
    func connectionAdded(id: GRPCConnectionID) {
        print(#function, id)
    }

    func connectionRemoved(id: GRPCConnectionID) {
        print(#function, id)
    }

    func startedConnecting(id: GRPCConnectionID) {
        print(#function, id)
    }

    func connectFailed(id: GRPCConnectionID, error: Error) {
        print(#function, id, error)
    }

    func connectSucceeded(id: GRPCConnectionID, streamCapacity: Int) {
        print(#function, id, streamCapacity)
    }

    func connectionUtilizationChanged(id: GRPCConnectionID, streamsUsed: Int, streamCapacity: Int) {
        print(#function, id, streamsUsed, streamCapacity)
    }

    func connectionQuiescing(id: GRPCConnectionID) {
        print(#function, id)
    }

    func connectionClosed(id: GRPCConnectionID, error: Error?) {
        print(#function, String(describing: error))
    }
}
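For completeness, a minimal sketch of how such a delegate could be attached when creating a pooled channel. The host, port, and transport security are placeholders, and the delegate property on the pool configuration is assumed; check GRPCChannelPool.Configuration in your version for the exact name.

import GRPC
import NIOPosix

let group = MultiThreadedEventLoopGroup(numberOfThreads: 1)
let channel = try GRPCChannelPool.with(
    target: .host("example.com", port: 443),  // placeholder target
    transportSecurity: .plaintext,            // placeholder; use TLS in practice
    eventLoopGroup: group
) { configuration in
    // Assumed configuration hook for the connection pool delegate.
    configuration.delegate = LoggingPoolDelegate()
}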
Here are the logs I collected:
I added the delegate and collected the logs above. A bit more context: we're running multiple GRPC connections to multiple servers in the app, and they each have their own connection pool. Anything else I can try or provide?
Thanks for providing that. I think I've tracked down the bug and have a fix in #1603.
…#1603)

Motivation:

The connection pool manager manages a pool of connections per event loop. It spreads load across these pools by tracking how many streams each pool has capacity for and how many streams are in use. To facilitate this, each pool reports back to the pool manager when streams have been reserved and when they have been returned. If connections are closed unexpectedly (due to an error, for example) then the pool reports this in bulk. However, when the streams are closed they are also reported back to the pool manager. This means the manager can end up thinking a pool has a negative number of reserved streams, which results in an assertion failure.

Modifications:

- Check whether the connection a stream is being returned to is still available before reporting stream closures to the pool manager.

Result:

- Better stream accounting.
- Resolved #1598
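A simplified, stand-alone illustration of the accounting problem described in that commit message; this is not the library's actual implementation, just a model of the double counting and the guard that avoids it.

// Simplified model of the pool manager's per-pool stream accounting.
struct PoolStreamStats {
    private(set) var reservedStreams = 0

    mutating func streamReserved() {
        reservedStreams += 1
    }

    // Bulk return used when a connection closes unexpectedly.
    mutating func connectionClosedUnexpectedly(returningStreams count: Int) {
        reservedStreams -= count
    }

    // Per-stream return reported as each stream closes. Without checking that
    // the connection is still available (the #1603 fix), this also ran for
    // streams already returned in bulk above, driving the count negative and
    // tripping the assertion.
    mutating func streamReturned(connectionStillAvailable: Bool) {
        guard connectionStillAvailable else { return }
        reservedStreams -= 1
        assert(reservedStreams >= 0, "Reserved stream count should never be negative")
    }
}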
@glbrntt Sorry for the delay, but I just confirmed against the main branch and it looks like it's working for me. Thanks for the help! ETA on a versioned release?
Awesome, thanks for confirming. Likely to be end of week / early next week.
Thanks!
What are you trying to achieve?
We have a GRPC channel in our iOS app whose connection our load balancer times out after 2 minutes of inactivity. The following code worked until 1.16.0:
The channel is relatively inactive, so when the load balancer disconnects we just reestablish the connection. This worked fine from roughly grpc-swift 1.8.2 through 1.15.0, but as of 1.16.0, when the exception is caught, the code that closes the connection also fails an assertion in the GRPC package target, which crashes the app.
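The original snippet isn't included above; the following is a rough sketch of the reconnect pattern described, assuming a pooled channel created with GRPCChannelPool. The host, port, transport security, and error handling are placeholders, not the reporter's actual code.

import GRPC
import NIOPosix

final class ReconnectingChannelHolder {
    private let group = MultiThreadedEventLoopGroup(numberOfThreads: 1)
    private var channel: GRPCChannel?

    // Returns the existing channel, or creates a new pooled channel if the
    // previous one was torn down after the load balancer dropped it.
    func currentChannel() throws -> GRPCChannel {
        if let existing = channel {
            return existing
        }
        let newChannel = try GRPCChannelPool.with(
            target: .host("example.com", port: 443),  // placeholder target
            transportSecurity: .plaintext,            // placeholder; use TLS in practice
            eventLoopGroup: group
        )
        channel = newChannel
        return newChannel
    }

    // Called when an RPC fails because the connection was dropped: close the
    // old channel and let the next call create a fresh one. Closing the
    // channel at this point is the step that tripped the assertion under 1.16.0.
    func resetChannel() {
        let oldChannel = channel
        channel = nil
        _ = oldChannel?.close()
    }
}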
The assertion failure itself doesn't hit our code (it all happens on a background thread), but I noticed this change in 1.16.0 that the assertion failure passes through, which led me to revert to 1.15.0. The full commit is here: 75b390e. Unfortunately I don't have enough time or understanding to dig into the guts of all this; any help is greatly appreciated.
What have you tried so far?
I've looked through the docs to confirm I'm doing things correctly, and as best I can tell I am. Reverting to 1.15.0 solves the problem for now, but I was hoping to figure out whether I was doing something wrong or whether a regression was introduced in 1.16.0.