Client Node Request/Fetch Circuit Breaker #11401
Currently, it's possible to overload a client node during the fetch process by requesting extreme amounts of data. In particular, this can be observed by repeating the problem in #11070 (e.g. fetching too many things in an aggregation).
Comments
This is at least partially solved by:
It would still be possible to overwhelm a client node by requesting too many enormous documents as hits, but the chances are greatly reduced.
@metacret Please open a pull request instead, where people can review it properly.
@clintongormley It was edited on the v1.7.5 branch, which is the version we're using. This is an urgent hot fix to prevent our tribe nodes from being crashed by long GC pauses or OOM errors. I just wanted to ask you to verify how feasible my idea is before setting up the official PR, because making a PR will need a huge amount of work, including writing all the unit tests. My summarized idea: TransportResponseHandler should keep recording how many bytes it has processed, and if that number is over the threshold, MessageChannelHandler should throw an exception before response deserialization to prevent the memory allocation. It was quite tricky to work out how to record this data, because a TransportResponseHandler is allocated on the fly for every shard request. So I made SearchServiceListener keep track of how many bytes have been processed, and that information is propagated up to TransportResponseHandler.
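For illustration, here is a minimal sketch of the byte-counting idea described above. This is not actual Elasticsearch code: the class name ResponseBytesBreaker and its method are hypothetical stand-ins for the bookkeeping described in the comment.

```java
// Hypothetical sketch of the described circuit breaker, not real
// Elasticsearch code. One instance would be shared by all shard-level
// response handlers of a single search request.
import java.util.concurrent.atomic.AtomicLong;

public class ResponseBytesBreaker {

    private final long limitBytes;
    private final AtomicLong bytesProcessed = new AtomicLong();

    public ResponseBytesBreaker(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    // Called by the channel handler with the wire size of an incoming
    // response BEFORE deserialization, so the breaker can trip without
    // first allocating memory for the decoded response objects.
    public void addBytesAndMaybeBreak(long responseSizeBytes) {
        long total = bytesProcessed.addAndGet(responseSizeBytes);
        if (total > limitBytes) {
            // Roll back the reservation and refuse to deserialize.
            bytesProcessed.addAndGet(-responseSizeBytes);
            throw new IllegalStateException("response of [" + responseSizeBytes
                    + "] bytes would push the total to [" + total
                    + "], over the limit of [" + limitBytes + "]");
        }
    }
}
```

In the design described above, one such counter would live at the search-request scope (the role played by SearchServiceListener) and be propagated to each per-shard TransportResponseHandler, so that all shard responses for one request draw from a single byte budget.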
I couldn't find an easy way to install the circuit breaker in handleResponse, because it was very hard for me to figure out how to release everything.
Closing as a duplicate of #9310.