[CI] TransformUsageIT testUsage failing #96000

Closed
danielmitterdorfer opened this issue May 10, 2023 · 3 comments · Fixed by #96015
Labels
:ml/Transform Transform Team:ML Meta label for the ML team >test-failure Triaged test failures from CI

Comments

@danielmitterdorfer
Member

The test failed with

org.elasticsearch.client.ResponseException: method [GET], host [http://[::1]:42877], URI [_xpack/usage], status line [HTTP/1.1 500 Internal Server Error]	
{"error":{"root_cause":[{"type":"null_pointer_exception","reason":"Cannot invoke \"org.elasticsearch.cluster.routing.ShardRouting.state()\" because the return value of \"org.elasticsearch.cluster.routing.RoutingNode.getByShardId(org.elasticsearch.index.shard.ShardId)\" is null"}],"type":"null_pointer_exception","reason":"Cannot invoke \"org.elasticsearch.cluster.routing.ShardRouting.state()\" because the return value of \"org.elasticsearch.cluster.routing.RoutingNode.getByShardId(org.elasticsearch.index.shard.ShardId)\" is null"},"status":500}	
	at __randomizedtesting.SeedInfo.seed([28FB9E766A1CE048:945C02A2D66038C1]:0)	
	at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:347)	
	at org.elasticsearch.client.RestClient.performRequest(RestClient.java:313)	
	at org.elasticsearch.client.RestClient.performRequest(RestClient.java:288)	
	at org.elasticsearch.xpack.transform.integration.TransformUsageIT.testUsage(TransformUsageIT.java:37)	
[...]

Build scan:
https://gradle-enterprise.elastic.co/s/tcqj3z4gyxjgc/tests/:x-pack:plugin:transform:qa:single-node-tests:javaRestTest/org.elasticsearch.xpack.transform.integration.TransformUsageIT/testUsage

Reproduction line:

./gradlew ':x-pack:plugin:transform:qa:single-node-tests:javaRestTest' --tests "org.elasticsearch.xpack.transform.integration.TransformUsageIT.testUsage" -Dtests.seed=28FB9E766A1CE048 -Dtests.locale=ja -Dtests.timezone=Europe/Rome -Druntime.java=20

Applicable branches:
main

Reproduces locally?:
No

Failure history:
https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.xpack.transform.integration.TransformUsageIT&tests.test=testUsage

Failure excerpt:

org.elasticsearch.client.ResponseException: method [GET], host [http://[::1]:42877], URI [_xpack/usage], status line [HTTP/1.1 500 Internal Server Error]
{"error":{"root_cause":[{"type":"null_pointer_exception","reason":"Cannot invoke \"org.elasticsearch.cluster.routing.ShardRouting.state()\" because the return value of \"org.elasticsearch.cluster.routing.RoutingNode.getByShardId(org.elasticsearch.index.shard.ShardId)\" is null"}],"type":"null_pointer_exception","reason":"Cannot invoke \"org.elasticsearch.cluster.routing.ShardRouting.state()\" because the return value of \"org.elasticsearch.cluster.routing.RoutingNode.getByShardId(org.elasticsearch.index.shard.ShardId)\" is null"},"status":500}

  at __randomizedtesting.SeedInfo.seed([28FB9E766A1CE048:945C02A2D66038C1]:0)
  at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:347)
  at org.elasticsearch.client.RestClient.performRequest(RestClient.java:313)
  at org.elasticsearch.client.RestClient.performRequest(RestClient.java:288)
  at org.elasticsearch.xpack.transform.integration.TransformUsageIT.testUsage(TransformUsageIT.java:37)
  at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
  at java.lang.reflect.Method.invoke(Method.java:578)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
  at java.lang.Thread.run(Thread.java:1623)

@danielmitterdorfer danielmitterdorfer added :ml/Transform Transform >test-failure Triaged test failures from CI labels May 10, 2023
@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label May 10, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@danielmitterdorfer
Member Author

This might be related to #87001.

danielmitterdorfer added a commit to danielmitterdorfer/elasticsearch that referenced this issue May 11, 2023
With this commit we check whether there is an available shard routing
before we test whether it has been started. This makes
`DataTiersUsageTransportAction` more resilient to potential temporary
inconsistencies between cluster state and node stats due to concurrent
shard movement.

Closes elastic#87001
Closes elastic#96000
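
The check described in the commit message above can be illustrated with a minimal, self-contained sketch. This is not the code from #96015: `Node`, `Routing`, and `State` are hypothetical stand-ins for `RoutingNode`, `ShardRouting`, and its state, and shard IDs are plain strings. The only behavior taken from the failure is that `getByShardId` can return null when the shard is no longer on the node.

```java
import java.util.HashMap;
import java.util.Map;

public class ShardRoutingNullCheckSketch {

    // Hypothetical stand-in for the shard routing state.
    enum State { INITIALIZING, STARTED, RELOCATING }

    // Hypothetical stand-in for ShardRouting; only state() is modelled.
    static final class Routing {
        private final State state;
        Routing(State state) { this.state = state; }
        State state() { return state; }
    }

    // Hypothetical stand-in for RoutingNode. As in the reported failure,
    // getByShardId returns null when the node does not hold the shard.
    static final class Node {
        private final Map<String, Routing> shards = new HashMap<>();
        void put(String shardId, Routing routing) { shards.put(shardId, routing); }
        Routing getByShardId(String shardId) { return shards.get(shardId); }
    }

    // Pattern that fails: dereferences the lookup result without a null check.
    static boolean isStartedUnchecked(Node node, String shardId) {
        return node.getByShardId(shardId).state() == State.STARTED;
    }

    // Pattern described in the commit message: first check that a routing
    // is available, then test whether it has been started.
    static boolean isStartedChecked(Node node, String shardId) {
        Routing routing = node.getByShardId(shardId);
        return routing != null && routing.state() == State.STARTED;
    }

    public static void main(String[] args) {
        Node node = new Node();
        node.put("index[0]", new Routing(State.STARTED));

        // Shard is present and started.
        System.out.println(isStartedChecked(node, "index[0]"));
        // Shard has moved away: the checked variant treats it as not started.
        System.out.println(isStartedChecked(node, "index[1]"));

        try {
            isStartedUnchecked(node, "index[1]");
        } catch (NullPointerException e) {
            System.out.println("unchecked lookup threw NullPointerException");
        }
    }
}
```

Treating a missing routing as "not started" is one plausible way to tolerate a shard that moved between reading node stats and reading cluster state; the actual fix may handle that case differently.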
@danielmitterdorfer
Member Author

It is indeed an instance of #87001 (see also attached cluster logs from the test failure). I've pushed a fix that addresses this in #96015.

danielmitterdorfer added a commit that referenced this issue May 11, 2023
elasticsearchmachine pushed a commit that referenced this issue May 11, 2023