Create a circuit breaker to prevent searches from bringing down a node #2929
Comments
This is exactly what I'm looking for as well! One of my requirements is to provide open API access to our ElasticSearch data so developers can run ad hoc queries. There is a very real possibility that one of them will execute a bad query that brings down a single node or, much worse, multiple nodes in my cluster. What would make this feature even better is additional monitoring: which queries are running at any given time, which queries have been run, and performance metrics for them. |
+1 |
Hey folks, I want to jump in here and tell you that this is pretty high on our wish list as well. With the foundations 0.90 will bring, we can approach things like this much more easily and, maybe more importantly, more reliably. I might jump in here and have a first cut at this pretty soon. |
+1 |
👍 |
+1 |
+1 |
@s1monw any update on this? We have some really large indices, and big searches over terabytes of data can bring down the cluster right now because the searches just keep going forever :-( |
@avleen we are actively developing this, so hopefully soon! |
Related: #4261 |
Interesting! BTW, is there any impact on bulk operations, like bulk update? Meaning, once the circuit breaks, will the bulk operation still go on while all remaining updates targeting a particular shard fail? |
Closing this issue since #4261 landed. |
I'd love to see ES automatically detect when a query is going to use more than a certain percentage of the heap, and automatically use temporary files to do its sorting, merging and so on. That would give it the ability to run arbitrary queries (like MySQL) without bringing down the node. The query would just take a long time to run. And in many cases, that's absolutely fine -- especially when doing aggregations and similar analytic queries. |
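To make the spill-to-disk idea above concrete, here is a minimal Python sketch of an external sort. It is not how ES or Lucene work internally; the function name `spill_sort` and the `max_in_memory` parameter are hypothetical, and a simple item count stands in for "a certain percentage of the heap".

```python
import heapq
import tempfile


def spill_sort(values, max_in_memory=1000):
    """Yield the integers in `values` in sorted order.

    Hypothetical sketch: at most `max_in_memory` values are buffered at
    once; each full buffer is sorted and written to a temporary file.
    """
    runs = []    # open temp files, each holding one sorted run
    buffer = []  # values currently held in memory

    def flush():
        # Sort the in-memory chunk and spill it to disk as one run.
        buffer.sort()
        run = tempfile.TemporaryFile(mode="w+")
        run.writelines(f"{v}\n" for v in buffer)
        run.seek(0)
        runs.append(run)
        buffer.clear()

    for v in values:
        buffer.append(v)
        if len(buffer) >= max_in_memory:
            flush()
    if buffer:
        flush()

    try:
        # heapq.merge reads the runs lazily, so only one pending value
        # per run needs to be in memory during the merge.
        yield from heapq.merge(*(map(int, run) for run in runs))
    finally:
        for run in runs:
            run.close()
```

For example, `list(spill_sort([5, 3, 9, 1], max_in_memory=2))` returns `[1, 3, 5, 9]` after writing two sorted runs to disk. The trade-off is the one described above: memory use stays bounded, and the query simply takes longer because of the extra I/O.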
I wouldn't say many cases. Maybe in some cases :-)
|
Just putting a note here that, though not "on demand", doc values are an option (using on-disk storage for certain fields that are expensive memory-wise and are used for aggs and/or sorting). A lot of progress has been made in both Lucene and ES to make them faster: 1.4 will be a huge step forward, and the following ES version, which will work with Lucene 5, will be even better. We are investing heavily in both Lucene and ES to make this a performant and viable option. |
Shay, I think we'd noticed a significant I/O impact (probably caused by
|
+1 |
+1 |
Any update on this? |
Hiya, following up too on ^^ |
One of the fears that I have when using ElasticSearch is that expensive queries can bring down nodes in my cluster.
It would be really nice if ElasticSearch could detect this type of node-killing event by adding logic that would trigger a circuit breaker and kill the offending query, leaving my node intact. For example, if a search takes X% of the heap, the query would be killed by ElasticSearch. It would be useful to expose the X% of heap_size as a configurable value since the level of concurrency of the system would vary by ES installation.
Another helpful feature: when the circuit breaker is tripped, ElasticSearch would return a response saying that the query was killed for using excess memory.
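As a rough illustration of the request, the behavior could be sketched as follows in Python. This is not ElasticSearch's implementation: the names `MemoryCircuitBreaker`, `CircuitBreakingError`, `run_query`, and the `limit_fraction` parameter (the configurable "X% of heap") are all hypothetical, and the per-query memory estimate is assumed to be supplied by the caller.

```python
class CircuitBreakingError(Exception):
    """Raised when a query's estimated memory use would exceed the limit."""


class MemoryCircuitBreaker:
    """Hypothetical breaker that tracks estimated bytes against a heap limit."""

    def __init__(self, heap_bytes, limit_fraction):
        if not 0 < limit_fraction <= 1:
            raise ValueError("limit_fraction must be in (0, 1]")
        # The configurable "X% of heap" expressed as a byte budget.
        self.limit_bytes = int(heap_bytes * limit_fraction)
        self.used_bytes = 0

    def reserve(self, query_id, estimated_bytes):
        # Trip the breaker before the memory is actually allocated.
        new_total = self.used_bytes + estimated_bytes
        if new_total > self.limit_bytes:
            raise CircuitBreakingError(
                f"query {query_id} aborted: would use {new_total} bytes, "
                f"limit is {self.limit_bytes} bytes"
            )
        self.used_bytes = new_total

    def release(self, estimated_bytes):
        # Return the reservation once the query finishes.
        self.used_bytes = max(0, self.used_bytes - estimated_bytes)


def run_query(breaker, query_id, estimated_bytes):
    """Run a (stubbed) query, reporting a clear error if the breaker trips."""
    try:
        breaker.reserve(query_id, estimated_bytes)
    except CircuitBreakingError as err:
        return {"status": "rejected", "error": str(err)}
    try:
        return {"status": "ok"}  # a real search would execute here
    finally:
        breaker.release(estimated_bytes)
```

With `MemoryCircuitBreaker(heap_bytes=1000, limit_fraction=0.5)`, a query estimated at 400 bytes runs, while one estimated at 600 bytes is rejected with an error message naming the query and the limit, and the node's accounting is left untouched.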