Request-level circuit breaker support on coordinating nodes #62884

Merged
merged 2 commits into from
Sep 24, 2020

Conversation

jimczi
Contributor

@jimczi jimczi commented Sep 24, 2020

Backport of #62223

…62223)

This commit allows the coordinating node to account for the memory used to perform partial and final reduces of
aggregations in the request circuit breaker. The search coordinator adds the memory that it uses to save
and reduce the results of shard aggregations to the request circuit breaker. Before any partial or final
reduce, the memory needed to reduce the aggregations is estimated, and a `CircuitBreakingException` is thrown
if it exceeds the maximum memory allowed in this breaker.
This size is estimated as roughly 1.5 times the size of the serialized aggregations that need to be reduced.
This estimation can be completely off for some aggregations but it is corrected with the real size after
the reduce completes.
If the reduce is successful, we update the circuit breaker to remove the size of the source aggregations
and replace the estimation with the serialized size of the newly reduced result.
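
The accounting described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SimpleBreaker`, `ReduceAccounting`, and their methods are hypothetical stand-ins, `IllegalStateException` stands in for `CircuitBreakingException`, and the sketch assumes that shard results are added to the breaker without a strict check and that the full 1.5x estimate is reserved on top of them before each reduce.

```java
import java.util.ArrayList;
import java.util.List;

public class ReduceBreakerSketch {

    /** Hypothetical stand-in for the request circuit breaker; not the Elasticsearch API. */
    static final class SimpleBreaker {
        private final long limitBytes;
        private long usedBytes;

        SimpleBreaker(long limitBytes) {
            this.limitBytes = limitBytes;
        }

        /** Reserves bytes, or throws without reserving if the limit would be exceeded. */
        void addEstimateAndMaybeBreak(long bytes) {
            long newUsed = usedBytes + bytes;
            if (newUsed > limitBytes) {
                throw new IllegalStateException("would use " + newUsed + " bytes, limit is " + limitBytes);
            }
            usedBytes = newUsed;
        }

        /** Adjusts usage without checking the limit; a negative value releases memory. */
        void addWithoutBreaking(long bytes) {
            usedBytes += bytes;
        }

        long used() {
            return usedBytes;
        }
    }

    /** Hypothetical coordinator-side bookkeeping for buffered shard aggregations. */
    static final class ReduceAccounting {
        private final SimpleBreaker breaker;
        private final List<Long> bufferedSizes = new ArrayList<>();

        ReduceAccounting(SimpleBreaker breaker) {
            this.breaker = breaker;
        }

        /** Accounts for the serialized size of one buffered shard result (assumed non-strict). */
        void onShardResult(long serializedSize) {
            breaker.addWithoutBreaking(serializedSize);
            bufferedSizes.add(serializedSize);
        }

        /** Runs one partial or final reduce; reducedSize is the serialized size of its result. */
        void reduce(long reducedSize) {
            long sourceSize = 0;
            for (long size : bufferedSizes) {
                sourceSize += size;
            }
            // Estimate the reduce cost as 1.5 times the serialized source size; may throw.
            long estimate = Math.round(1.5 * sourceSize);
            breaker.addEstimateAndMaybeBreak(estimate);
            // (The real reduce would happen here.)
            // On success, drop the source sizes and swap the estimate for the real size.
            breaker.addWithoutBreaking(reducedSize - sourceSize - estimate);
            bufferedSizes.clear();
            bufferedSizes.add(reducedSize);
        }
    }

    public static void main(String[] args) {
        SimpleBreaker breaker = new SimpleBreaker(1000);
        ReduceAccounting accounting = new ReduceAccounting(breaker);
        accounting.onShardResult(100);
        accounting.onShardResult(100);
        accounting.reduce(50);
        System.out.println(breaker.used()); // prints 50
    }
}
```

In this sketch a failed reservation leaves the breaker untouched, so the buffered shard results stay accounted for until the request is released.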

As a follow-up, we could trigger partial reduces based on the memory accounted in the circuit breaker instead
of relying on a static number of shard responses. A simpler follow-up that could be done in the meantime is
to [reduce the default batch reduce size](elastic#51857) of blocking
search requests to a more reasonable number.

Closes elastic#37182
Ensures that the test always runs with a memory circuit breaker.

Relates elastic#62223
@jimczi jimczi merged commit 78a93dc into elastic:7.x Sep 24, 2020
@jimczi jimczi deleted the reduce_aggs_circuit_breaker_bwc branch September 24, 2020 16:59
jimczi added a commit that referenced this pull request Jan 15, 2021
…#67431)

In #62884 we added support for the request circuit breaker in search coordinating nodes.
Today the circuit breaker is strictly checked only when a partial or final reduce occurs.
With this commit, we also check the circuit breaker strictly when a shard response
is received and we cancel the request early if an exception is thrown at this point.
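
A rough sketch of the stricter check described in this follow-up commit is shown below. All names (`StrictShardCheckSketch`, `SimpleBreaker`, `SearchTask`, `onShardResponse`) are hypothetical, `IllegalStateException` stands in for the breaker exception, and cancellation is modelled as a simple flag; the real cancellation path is not shown in this PR.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class StrictShardCheckSketch {

    /** Hypothetical stand-in for the request circuit breaker; not the Elasticsearch API. */
    static final class SimpleBreaker {
        private final long limitBytes;
        private long usedBytes;

        SimpleBreaker(long limitBytes) {
            this.limitBytes = limitBytes;
        }

        /** Reserves bytes, or throws without reserving if the limit would be exceeded. */
        void addEstimateAndMaybeBreak(long bytes) {
            long newUsed = usedBytes + bytes;
            if (newUsed > limitBytes) {
                throw new IllegalStateException("would use " + newUsed + " bytes, limit is " + limitBytes);
            }
            usedBytes = newUsed;
        }

        long used() {
            return usedBytes;
        }
    }

    /** Hypothetical search task that checks the breaker strictly on every shard response. */
    static final class SearchTask {
        private final SimpleBreaker breaker;
        private final AtomicBoolean cancelled = new AtomicBoolean(false);
        private int acceptedResponses;

        SearchTask(SimpleBreaker breaker) {
            this.breaker = breaker;
        }

        /** Returns true if the shard response was accepted, false if the task is cancelled. */
        boolean onShardResponse(long serializedSize) {
            if (cancelled.get()) {
                return false; // the request was already cancelled; ignore late responses
            }
            try {
                breaker.addEstimateAndMaybeBreak(serializedSize);
            } catch (IllegalStateException e) {
                cancelled.set(true); // cancel early instead of waiting for the next reduce
                return false;
            }
            acceptedResponses++;
            return true;
        }

        boolean isCancelled() {
            return cancelled.get();
        }

        int acceptedResponses() {
            return acceptedResponses;
        }
    }
}
```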