Skip to content

FollowStatsIT#testFollowStatsApiIncludeShardFollowStatsWithRemovedFollowerIndex failure #38779

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
martijnvg opened this issue Feb 12, 2019 · 6 comments · Fixed by #39053
Closed
Assignees
Labels
:Distributed Indexing/CCR Issues around the Cross Cluster State Replication features >test-failure Triaged test failures from CI

Comments

@martijnvg
Copy link
Member

[2019-02-11T23:27:48,513][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s_0] [leader1/CauqUnDjRpivCn8-gN51hQ] deleting index
FAILURE 0.49s J6 | FollowStatsIT.testFollowStatsApiIncludeShardFollowStatsWithRemovedFollowerIndex <<< FAILURES!
   > Throwable #1: java.lang.AssertionError: 
   > Expected: <1>
   >      but: was <0>
   >    at __randomizedtesting.SeedInfo.seed([A9F2E2FB2B3232F2:982741CACE61265E]:0)
   >    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
   >    at org.elasticsearch.xpack.ccr.FollowStatsIT.testFollowStatsApiIncludeShardFollowStatsWithRemovedFollowerIndex(FollowStatsIT.java:182)
   >    at java.lang.Thread.run(Thread.java:748)

Build url: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+g1gc/355/console

@martijnvg martijnvg added >test-failure Triaged test failures from CI :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features labels Feb 12, 2019
@martijnvg martijnvg self-assigned this Feb 12, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@martijnvg
Copy link
Member Author

I found the following NPE in the log, that I think is causing this test failure:

[2019-02-11T23:27:48,452][WARN ][o.e.p.PersistentTasksNodeService] [node_s_0] task kD8YzUhHTK6uKNBNQI-1ZQ-0 failed with an exception
  1> java.lang.NullPointerException: null
  1>    at org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor.lambda$fetchFollowerShardInfo$7(ShardFollowTasksExecutor.java:305) ~[main/:?]
  1>    at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:68) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:64) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.onCompletion(TransportBroadcastByNodeAction.java:383) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.onNodeResponse(TransportBroadcastByNodeAction.java:352) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction$1.handleResponse(TransportBroadcastByNodeAction.java:324) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction$1.handleResponse(TransportBroadcastByNodeAction.java:314) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1108) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$DirectResponseChannel.processResponse(TransportService.java:1189) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1169) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:54) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:417) [elasticsearch-8.0.0-SNAP
SHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:391) [elasticsearch-8.0.0-SNAP
SHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:687) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
  1>    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]

Index stats response has no commit stats and the ShardFollowTasksExecutor#fetchFollowerShardInfo() method expects that to be there, which causes the NPE. The other consequence of this NPE is that the shard follow persistent task does not get assigned. The test expects shard follow stats, but because no shard follow task got assigned, that assertion fails.

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Feb 12, 2019
…o stats

The should fix the following NPE:

```
[2019-02-11T23:27:48,452][WARN ][o.e.p.PersistentTasksNodeService] [node_s_0] task kD8YzUhHTK6uKNBNQI-1ZQ-0 failed with an exception
  1> java.lang.NullPointerException: null
  1>    at org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor.lambda$fetchFollowerShardInfo$7(ShardFollowTasksExecutor.java:305) ~[main/:?]
  1>    at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:68) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:64) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.onCompletion(TransportBroadcastByNodeAction.java:383) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.onNodeResponse(TransportBroadcastByNodeAction.java:352) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction$1.handleResponse(TransportBroadcastByNodeAction.java:324) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction$1.handleResponse(TransportBroadcastByNodeAction.java:314) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1108) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$DirectResponseChannel.processResponse(TransportService.java:1189) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1169) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:54) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:417) [elasticsearch-8.0.0-SNAP
SHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:391) [elasticsearch-8.0.0-SNAP
SHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:687) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
  1>    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
```

Relates to elastic#38779
@martijnvg
Copy link
Member Author

Another instance of this failure with the same NPE in the logs: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.0+release-tests/25/console

martijnvg added a commit that referenced this issue Feb 12, 2019
Relates to #38779
martijnvg added a commit that referenced this issue Feb 12, 2019
Relates to #38779
martijnvg added a commit that referenced this issue Feb 12, 2019
Relates to #38779
martijnvg added a commit that referenced this issue Feb 12, 2019
Relates to #38779
martijnvg added a commit that referenced this issue Feb 13, 2019
…o stats (#38782)

The should fix the following NPE:

```
[2019-02-11T23:27:48,452][WARN ][o.e.p.PersistentTasksNodeService] [node_s_0] task kD8YzUhHTK6uKNBNQI-1ZQ-0 failed with an exception
  1> java.lang.NullPointerException: null
  1>    at org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor.lambda$fetchFollowerShardInfo$7(ShardFollowTasksExecutor.java:305) ~[main/:?]
  1>    at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:68) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:64) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.onCompletion(TransportBroadcastByNodeAction.java:383) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.onNodeResponse(TransportBroadcastByNodeAction.java:352) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction$1.handleResponse(TransportBroadcastByNodeAction.java:324) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction$1.handleResponse(TransportBroadcastByNodeAction.java:314) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1108) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$DirectResponseChannel.processResponse(TransportService.java:1189) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1169) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:54) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:417) [elasticsearch-8.0.0-SNAP
SHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:391) [elasticsearch-8.0.0-SNAP
SHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:687) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
  1>    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
```

Relates to #38779
martijnvg added a commit that referenced this issue Feb 14, 2019
…o stats (#38782)

The should fix the following NPE:

```
[2019-02-11T23:27:48,452][WARN ][o.e.p.PersistentTasksNodeService] [node_s_0] task kD8YzUhHTK6uKNBNQI-1ZQ-0 failed with an exception
  1> java.lang.NullPointerException: null
  1>    at org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor.lambda$fetchFollowerShardInfo$7(ShardFollowTasksExecutor.java:305) ~[main/:?]
  1>    at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:68) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:64) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.onCompletion(TransportBroadcastByNodeAction.java:383) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.onNodeResponse(TransportBroadcastByNodeAction.java:352) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction$1.handleResponse(TransportBroadcastByNodeAction.java:324) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction$1.handleResponse(TransportBroadcastByNodeAction.java:314) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1108) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$DirectResponseChannel.processResponse(TransportService.java:1189) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1169) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:54) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:417) [elasticsearch-8.0.0-SNAP
SHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:391) [elasticsearch-8.0.0-SNAP
SHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:687) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
  1>    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
```

Relates to #38779
martijnvg added a commit that referenced this issue Feb 14, 2019
…o stats (#38782)

The should fix the following NPE:

```
[2019-02-11T23:27:48,452][WARN ][o.e.p.PersistentTasksNodeService] [node_s_0] task kD8YzUhHTK6uKNBNQI-1ZQ-0 failed with an exception
  1> java.lang.NullPointerException: null
  1>    at org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor.lambda$fetchFollowerShardInfo$7(ShardFollowTasksExecutor.java:305) ~[main/:?]
  1>    at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:68) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:64) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.onCompletion(TransportBroadcastByNodeAction.java:383) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.onNodeResponse(TransportBroadcastByNodeAction.java:352) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction$1.handleResponse(TransportBroadcastByNodeAction.java:324) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction$1.handleResponse(TransportBroadcastByNodeAction.java:314) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1108) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$DirectResponseChannel.processResponse(TransportService.java:1189) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1169) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:54) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:417) [elasticsearch-8.0.0-SNAP
SHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:391) [elasticsearch-8.0.0-SNAP
SHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:687) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
  1>    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
```

Relates to #38779
martijnvg added a commit that referenced this issue Feb 14, 2019
…o stats (#38782)

The should fix the following NPE:

```
[2019-02-11T23:27:48,452][WARN ][o.e.p.PersistentTasksNodeService] [node_s_0] task kD8YzUhHTK6uKNBNQI-1ZQ-0 failed with an exception
  1> java.lang.NullPointerException: null
  1>    at org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor.lambda$fetchFollowerShardInfo$7(ShardFollowTasksExecutor.java:305) ~[main/:?]
  1>    at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:68) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:64) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.onCompletion(TransportBroadcastByNodeAction.java:383) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.onNodeResponse(TransportBroadcastByNodeAction.java:352) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction$1.handleResponse(TransportBroadcastByNodeAction.java:324) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction$1.handleResponse(TransportBroadcastByNodeAction.java:314) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1108) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$DirectResponseChannel.processResponse(TransportService.java:1189) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1169) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:54) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:417) [elasticsearch-8.0.0-SNAP
SHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:391) [elasticsearch-8.0.0-SNAP
SHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:687) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
  1>    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
```

Relates to #38779
@martijnvg
Copy link
Member Author

Closing, because a fix has been pushed via #38782.

@bizybot
Copy link
Contributor

bizybot commented Feb 18, 2019

This was reproduced on 6.7, the change has been backported (8d4d009) but still is failing.

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.7+multijob-unix-compatibility/os=oraclelinux-6/23

Reopening the issue.

@bizybot bizybot reopened this Feb 18, 2019
@martijnvg
Copy link
Member Author

The persistent task is suppose to fail with IndexNotFoundException; this happens, only in a place were this kind of exception is not handled correctly. If the shard follow task fails with a fatal exception (which IndexNotFoundException is) then index replication should stop and the fatal exception field should be set, so that later via ccr stats api the fatal exception can be inspected.

At shard follow task startup, the shard follow task is removed instead of the behaviour described above and then the shard follow task can no longer be inspected via the ccr stats api, which is why this test fails.

The fix here, is to change error handling at shard follow task startup, instead of removing the shard follow task, mark the shard follow task as failed (by setting the fatalException field).

[2019-02-17T21:39:30,849][WARN ][o.e.p.PersistentTasksNodeService] [node_s_0] task bnp2HKNFSvyX-4c3KJUDdA-0 failed with an exception
  1> org.elasticsearch.index.IndexNotFoundException: no such index
  1>    at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:728) ~[ela
sticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1>    at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:680) ~[elasticsearch
-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1>    at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:636) ~[elasticsearch-6.7.
0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1>    at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:163) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.
0-SNAPSHOT]
  1>    at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:142) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6
.7.0-SNAPSHOT]
  1>    at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:75) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.
7.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.<init>(TransportBroadcastByNodeAction.java:263) ~[elasticsearch-6
.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction.doExecute(TransportBroadcastByNodeAction.java:236) ~[elasticsearch-6.7.0-SNAP
SHOT.jar:6.7.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction.doExecute(TransportBroadcastByNodeAction.java:78) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:167) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:139) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:81) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1>    at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:87) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1>    at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:76) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1>    at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:403) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1>    at org.elasticsearch.client.support.AbstractClient$IndicesAdmin.execute(AbstractClient.java:1269) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1>    at org.elasticsearch.client.support.AbstractClient$IndicesAdmin.stats(AbstractClient.java:1591) ~[elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1>    at org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor.fetchFollowerShardInfo(ShardFollowTasksExecutor.java:297) ~[main/:?]
  1>    at org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor.nodeOperation(ShardFollowTasksExecutor.java:289) ~[main/:?]
  1>    at org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor.lambda$nodeOperation$3(ShardFollowTasksExecutor.java:283) ~[main/:?]
  1>    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) [elasticsearch-6.7.0-SNAPSHOT.jar:6.7.0-SNAPSHOT]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
  1>    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]

martijnvg added a commit that referenced this issue Feb 18, 2019
Relates to #38779
martijnvg added a commit that referenced this issue Feb 18, 2019
Relates to #38779
martijnvg added a commit that referenced this issue Feb 18, 2019
Relates to #38779
martijnvg added a commit that referenced this issue Feb 18, 2019
Relates to #38779
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Feb 18, 2019
Prior to this commit, if during fetch leader / follower GCP
a fatal error occurred, then the shard follow task was removed.

This is unexpected, because if such an error occurs during the lifetime of shard follow task then replication is stopped and the fatal error flag is set. This allows the ccr stats api to report the fatal exception that has occurred (instead of the user grepping through the elasticsearch logs).

This issue was found by a rare failure of the  `FollowStatsIT#testFollowStatsApiIncludeShardFollowStatsWithRemovedFollowerIndex` test.

Closes elastic#38779
martijnvg added a commit that referenced this issue Feb 19, 2019
Prior to this commit, if during fetch leader / follower GCP
a fatal error occurred, then the shard follow task was removed.

This is unexpected, because if such an error occurs during the lifetime of shard follow task then replication is stopped and the fatal error flag is set. This allows the ccr stats api to report the fatal exception that has occurred (instead of the user grepping through the elasticsearch logs).

This issue was found by a rare failure of the  `FollowStatsIT#testFollowStatsApiIncludeShardFollowStatsWithRemovedFollowerIndex` test.

Closes #38779
martijnvg added a commit that referenced this issue Feb 19, 2019
Prior to this commit, if during fetch leader / follower GCP
a fatal error occurred, then the shard follow task was removed.

This is unexpected, because if such an error occurs during the lifetime of shard follow task then replication is stopped and the fatal error flag is set. This allows the ccr stats api to report the fatal exception that has occurred (instead of the user grepping through the elasticsearch logs).

This issue was found by a rare failure of the  `FollowStatsIT#testFollowStatsApiIncludeShardFollowStatsWithRemovedFollowerIndex` test.

Closes #38779
martijnvg added a commit that referenced this issue Feb 19, 2019
Prior to this commit, if during fetch leader / follower GCP
a fatal error occurred, then the shard follow task was removed.

This is unexpected, because if such an error occurs during the lifetime of shard follow task then replication is stopped and the fatal error flag is set. This allows the ccr stats api to report the fatal exception that has occurred (instead of the user grepping through the elasticsearch logs).

This issue was found by a rare failure of the  `FollowStatsIT#testFollowStatsApiIncludeShardFollowStatsWithRemovedFollowerIndex` test.

Closes #38779
martijnvg added a commit that referenced this issue Feb 19, 2019
Prior to this commit, if during fetch leader / follower GCP
a fatal error occurred, then the shard follow task was removed.

This is unexpected, because if such an error occurs during the lifetime of shard follow task then replication is stopped and the fatal error flag is set. This allows the ccr stats api to report the fatal exception that has occurred (instead of the user grepping through the elasticsearch logs).

This issue was found by a rare failure of the  `FollowStatsIT#testFollowStatsApiIncludeShardFollowStatsWithRemovedFollowerIndex` test.

Closes #38779
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/CCR Issues around the Cross Cluster State Replication features >test-failure Triaged test failures from CI
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants