-
Notifications
You must be signed in to change notification settings - Fork 25.2k
[CI] org.elasticsearch.client.TasksIT testGetInvalidTask fails sporadically on master #36001
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Pinging @elastic/es-core-features |
This is failing again on 6.x in https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+multijob-unix-compatibility/os=centos/96/consoleFull with (see stack trace for more details):
Reproduction line:
Stack trace07:44:59 > Throwable #1: ElasticsearchStatusException[Elasticsearch exception [type=no_shard_available_action_exception, reason=No shard available for [get [.tasks][task][doesNotExistNodeName:123]: routing [null]]]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_index_shard_state_exception, reason=CurrentState[RECOVERING] operations only allowed when shard state is one of [POST_RECOVERY, STARTED]]]; 07:44:59 > at __randomizedtesting.SeedInfo.seed([EAA6597678B5DE87:5476CCFF2E32D344]:0) 07:44:59 > at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177) 07:44:59 > at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:2042) 07:44:59 > at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:2018) 07:44:59 > at org.elasticsearch.client.RestHighLevelClient$2.onFailure(RestHighLevelClient.java:1993) 07:44:59 > at org.elasticsearch.client.RestClient$FailureTrackingResponseListener.onDefinitiveFailure(RestClient.java:850) 07:44:59 > at org.elasticsearch.client.RestClient$1.retryIfPossible(RestClient.java:588) 07:44:59 > at org.elasticsearch.client.RestClient$1.completed(RestClient.java:550) 07:44:59 > at org.elasticsearch.client.RestClient$1.completed(RestClient.java:531) 07:44:59 > at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:119) 07:44:59 > at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:177) 07:44:59 > at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:436) 07:44:59 > at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:326) 07:44:59 > at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265) 07:44:59 > at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81) 07:44:59 > at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39) 07:44:59 > at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114) 07:44:59 > at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162) 07:44:59 > at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337) 07:44:59 > at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315) 07:44:59 > at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276) 07:44:59 > at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) 07:44:59 > at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588) 07:44:59 > at java.lang.Thread.run(Thread.java:748) 07:44:59 > Suppressed: org.elasticsearch.client.ResponseException: method [GET], host [http://[::1]:39814], URI [/_tasks/doesNotExistNodeName:123?wait_for_completion=false], status line [HTTP/1.1 503 Service Unavailable] 07:44:59 > {"error":{"root_cause":[{"type":"illegal_index_shard_state_exception","reason":"CurrentState[RECOVERING] operations only allowed when shard state is one of [POST_RECOVERY, STARTED]","index_uuid":"gWoZj4bWR12DZMhpSR2vWw","shard":"0","index":".tasks"}],"type":"no_shard_available_action_exception","reason":"No shard available for [get [.tasks][task][doesNotExistNodeName:123]: routing [null]]","caused_by":{"type":"illegal_index_shard_state_exception","reason":"CurrentState[RECOVERING] operations only allowed when shard state is one of [POST_RECOVERY, STARTED]","index_uuid":"gWoZj4bWR12DZMhpSR2vWw","shard":"0","index":".tasks"}},"status":503} 07:44:59 > at org.elasticsearch.client.RestClient$1.completed(RestClient.java:546) 07:44:59 > ... 16 more 07:44:59 > Caused by: [.tasks/gWoZj4bWR12DZMhpSR2vWw][[.tasks][0]] ElasticsearchException[Elasticsearch exception [type=illegal_index_shard_state_exception, reason=CurrentState[RECOVERING] operations only allowed when shard state is one of [POST_RECOVERY, STARTED]]] 07:44:59 > at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:509) 07:44:59 > at org.elasticsearch.ElasticsearchException.fromXContent(ElasticsearchException.java:420) 07:44:59 > at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:450) 07:44:59 > at org.elasticsearch.ElasticsearchException.failureFromXContent(ElasticsearchException.java:616) 07:44:59 > at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:169) 07:44:59 > ... 22 more This does not reproduce locally. First of all, I think the reason is that we need to remove the restriction on
As there are warnings about running tasks in @markharwood as you have recently added these tests, can you please have a closer look? |
Ran this test class in a loop a couple hundred times and wasn't able to reproduce. It looks like there was a about a month-long period where a test in this class failed with that stacktrace since it was added in May 2018. There hasn't been a failure of this kind in this class since 17 December 2018 |
Ran this test in a loop on 6.7 and could not reproduce the error or the warnings about still-running tasks. |
I went thru a bunch of old build emails and cannot find any failures in recent history so I am closing this. Feel free to open if these start occurring again. |
Failed in a build today
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=opensuse/86/console
and in an (unrelated) PR build yesterday
#35648 (comment)
It didn't reproduce locally in 4096 iterations.
Stacktrace
ElasticsearchStatusException[Elasticsearch exception [type=no_shard_available_action_exception, reason=No shard available for [get [.tasks][task][doesNotExistNodeName:123]: routing [null]]]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_index_shard_state_exception, reason=CurrentState[RECOVERING] operations only allowed when shard state is one of [POST_RECOVERY, STARTED]]]; at __randomizedtesting.SeedInfo.seed([3F5A7BB7634DF43F:818AEE3E35CAF9FC]:0) at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177) at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:1703) at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:1679) at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseOptionalEntity(RestHighLevelClient.java:1476) at org.elasticsearch.client.TasksClient.get(TasksClient.java:84) at org.elasticsearch.client.ESRestHighLevelClientTestCase.execute(ESRestHighLevelClientTestCase.java:79) at org.elasticsearch.client.TasksIT.testGetInvalidTask(TasksIT.java:121) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) at java.lang.Thread.run(Thread.java:748) Suppressed: org.elasticsearch.client.ResponseException: method [GET], host [http://[::1]:38982], URI [/_tasks/doesNotExistNodeName:123?wait_for_completion=false], status line [HTTP/1.1 503 Service Unavailable] {"error":{"root_cause":[{"type":"illegal_index_shard_state_exception","reason":"CurrentState[RECOVERING] operations only allowed when shard state is one of [POST_RECOVERY, STARTED]","index_uuid":"YZxZgHqqSu-94s8M8mh9qA","shard":"0","index":".tasks"}],"type":"no_shard_available_action_exception","reason":"No shard available for [get [.tasks][task][doesNotExistNodeName:123]: routing [null]]","caused_by":{"type":"illegal_index_shard_state_exception","reason":"CurrentState[RECOVERING] operations only allowed when shard state is one of [POST_RECOVERY, STARTED]","index_uuid":"YZxZgHqqSu-94s8M8mh9qA","shard":"0","index":".tasks"}},"status":503} at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:687) at org.elasticsearch.client.RestClient.performRequest(RestClient.java:218) at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseOptionalEntity(RestHighLevelClient.java:1471) ... 40 more Caused by: org.elasticsearch.client.ResponseException: method [GET], host [http://[::1]:38982], URI [/_tasks/doesNotExistNodeName:123?wait_for_completion=false], status line [HTTP/1.1 503 Service Unavailable] {"error":{"root_cause":[{"type":"illegal_index_shard_state_exception","reason":"CurrentState[RECOVERING] operations only allowed when shard state is one of [POST_RECOVERY, STARTED]","index_uuid":"YZxZgHqqSu-94s8M8mh9qA","shard":"0","index":".tasks"}],"type":"no_shard_available_action_exception","reason":"No shard available for [get [.tasks][task][doesNotExistNodeName:123]: routing [null]]","caused_by":{"type":"illegal_index_shard_state_exception","reason":"CurrentState[RECOVERING] operations only allowed when shard state is one of [POST_RECOVERY, STARTED]","index_uuid":"YZxZgHqqSu-94s8M8mh9qA","shard":"0","index":".tasks"}},"status":503} at org.elasticsearch.client.RestClient$1.completed(RestClient.java:307) at org.elasticsearch.client.RestClient$1.completed(RestClient.java:292) at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:119) at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:177) at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:436) at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:326) at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265) at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81) at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39) at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114) at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162) at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337) at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315) at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276) at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588) ... 1 more Caused by: [.tasks/YZxZgHqqSu-94s8M8mh9qA][[.tasks][0]] ElasticsearchException[Elasticsearch exception [type=illegal_index_shard_state_exception, reason=CurrentState[RECOVERING] operations only allowed when shard state is one of [POST_RECOVERY, STARTED]]] at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:491) at org.elasticsearch.ElasticsearchException.fromXContent(ElasticsearchException.java:402) at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:432) at org.elasticsearch.ElasticsearchException.failureFromXContent(ElasticsearchException.java:598) at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:169) ... 43 moreThe text was updated successfully, but these errors were encountered: