Deadlock on aggregation in 2.x #22952
Comments
@maxcom could you provide us with the aggregations that you are running please?
I can't find the exact query that causes the problem. I'll add more logging to our application and wait for the next reproduction.
The state of the thread is runnable, so I suspect this is not really a deadlock, but rather a memory pressure issue: so much time is spent doing garbage collection that the application does not seem to make any progress. Do you have some monitoring data of garbage collection activity / memory usage of the JVM?
I do not think that it is a memory issue. CPU is not busy at all; we see no GC activity in our logs. And no progress is made until we restart Elasticsearch (for ~30 minutes).
I dug into the Elasticsearch sources and I think this could be some kind of class initialization deadlock. A similar problem is described here: https://ternarysearch.blogspot.ru/2013/07/static-initialization-deadlock.html
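For readers unfamiliar with the term, here is a minimal sketch of a class initialization deadlock in plain Java. It is not taken from the Elasticsearch sources; the class names (`ClassInitDeadlockDemo`, `A`, `B`) and the latch are hypothetical and exist only to make the race deterministic. Two threads each start initializing one class, and each static initializer then touches the other class, so both wait forever on the other's initialization. The threads are daemons and the joins are bounded, so the demo itself terminates.

```java
import java.util.concurrent.CountDownLatch;

public class ClassInitDeadlockDemo {
    // Hypothetical helper: makes sure both static initializers have started
    // before either one touches the other class.
    static final CountDownLatch BOTH_STARTED = new CountDownLatch(2);

    static class A {
        static {
            awaitBothStarted();
            B.touch(); // B is being initialized by the other thread, so this waits
        }
        static void touch() {}
    }

    static class B {
        static {
            awaitBothStarted();
            A.touch(); // A is being initialized by the other thread, so this waits
        }
        static void touch() {}
    }

    static void awaitBothStarted() {
        BOTH_STARTED.countDown();
        try {
            BOTH_STARTED.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns true if both threads are still stuck after the timeout.
    static boolean reproduce(long timeoutMillis) {
        Thread t1 = new Thread(() -> A.touch(), "init-A");
        Thread t2 = new Thread(() -> B.touch(), "init-B");
        // Daemon threads, so the stuck threads do not keep the JVM alive.
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(timeoutMillis);
            t2.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return t1.isAlive() && t2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("stuck in class init: " + reproduce(1000));
    }
}
```

No `synchronized` block or explicit lock is involved here; the waiting happens inside the JVM's class initialization procedure. As far as I know, that is why such threads can show up as runnable in a thread dump while using no CPU, but treat that as an assumption to check against your own jstack output.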
One more stack trace (from Elasticsearch 2.4.3):
So this might be a class initialization deadlock actually. Can you provide us with the entire jstack output? EDIT: I had not seen the two messages above, where you already mentioned that it could be a class init deadlock.
I think moving the
Sure, here are the full thread dumps from 2.4.3: https://gist.github.com/maxcom/69d54d58284a7b5eea42db363bac5f6a
Closing as this bug does not exist in 5.x and will be fixed in the upcoming 2.4 release (for which we have no ETA at the moment). Thanks @maxcom for the detailed bug report and for being so responsive in helping us understand what was happening.
Elasticsearch version: 2.3.4 and 2.4.3
Plugins installed: None
JVM version: OpenJDK 1.8.0.101-3.b13
OS version: CentOS 6.8
Description of the problem including expected versus actual behavior:
Search threads sometimes lock up completely. All search threads have the same stack trace. No progress is made until we restart Elasticsearch.
We have seen this problem on 2.3.4 and 2.4.3.
Steps to reproduce: We do not know how to reproduce it.
Provide logs (if relevant):
Here is the stack trace from 2.3.4: