Commit 180eff1

Fix issue when relocation source and target routings are failed in same batch update
PR #19715 made AllocationService less lenient: the ShardRouting instances passed to its applyStartedShards and applyFailedShards methods must now exist in the routing table. Because failing a primary shard also fails its initializing replica shards, a concurrent replica shard failure handled in the same cluster state update might no longer reference an existing replica entry in the routing table. To solve this, PR #19715 ordered the failures so that replica failures are handled before primary failures.

Other failures also affect more than one routing entry, however. When a batch contains failed shard entries for both a relocation source and its target, then, depending on the order, one or the other might point to an outdated shard entry. Because finding a good order is more difficult than re-resolving the entries as the failures are applied, this commit re-adds parts of the ShardRouting re-resolve logic so that the applyFailedShards method can properly handle batches of shard failures.
1 parent 97dfa2b commit 180eff1

core/src/main/java/org/elasticsearch/cluster/routing/allocation/AllocationService.java

Lines changed: 18 additions & 12 deletions
@@ -235,18 +235,24 @@ public Result applyFailedShards(ClusterState clusterState, List<FailedRerouteAll
         FailedRerouteAllocation allocation = new FailedRerouteAllocation(allocationDeciders, routingNodes, clusterState, failedShards,
                 clusterInfoService.getClusterInfo(), currentNanoTime);
 
-        // as failing primaries also fail associated replicas, we fail replicas first here to avoid re-resolving replica ShardRouting
-        List<FailedRerouteAllocation.FailedShard> orderedFailedShards = new ArrayList<>(failedShards);
-        orderedFailedShards.sort(Comparator.comparing(failedShard -> failedShard.routingEntry.primary()));
-
-        for (FailedRerouteAllocation.FailedShard failedShardEntry : orderedFailedShards) {
-            ShardRouting failedShard = failedShardEntry.routingEntry;
-            final int failedAllocations = failedShard.unassignedInfo() != null ? failedShard.unassignedInfo().getNumFailedAllocations() : 0;
-            UnassignedInfo unassignedInfo = new UnassignedInfo(UnassignedInfo.Reason.ALLOCATION_FAILED, failedShardEntry.message,
-                    failedShardEntry.failure, failedAllocations + 1, currentNanoTime, System.currentTimeMillis(), false,
-                    AllocationStatus.NO_ATTEMPT);
-            allocation.addIgnoreShardForNode(failedShard.shardId(), failedShard.currentNodeId());
-            applyFailedShard(allocation, failedShard, unassignedInfo);
+        for (FailedRerouteAllocation.FailedShard failedShardEntry : failedShards) {
+            ShardRouting shardToFail = failedShardEntry.routingEntry;
+            allocation.addIgnoreShardForNode(shardToFail.shardId(), shardToFail.currentNodeId());
+            // failing a primary also fails initializing replica shards, re-resolve ShardRouting
+            ShardRouting failedShard = routingNodes.getByAllocationId(shardToFail.shardId(), shardToFail.allocationId().getId());
+            if (failedShard != null) {
+                if (failedShard != shardToFail) {
+                    logger.trace("{} shard routing modified in an earlier iteration (previous: {}, current: {})",
+                            shardToFail.shardId(), shardToFail, failedShard);
+                }
+                int failedAllocations = failedShard.unassignedInfo() != null ? failedShard.unassignedInfo().getNumFailedAllocations() : 0;
+                UnassignedInfo unassignedInfo = new UnassignedInfo(UnassignedInfo.Reason.ALLOCATION_FAILED, failedShardEntry.message,
+                        failedShardEntry.failure, failedAllocations + 1, currentNanoTime, System.currentTimeMillis(), false,
+                        AllocationStatus.NO_ATTEMPT);
+                applyFailedShard(allocation, failedShard, unassignedInfo);
+            } else {
+                logger.trace("{} shard routing failed in an earlier iteration (routing: {})", shardToFail.shardId(), shardToFail);
+            }
         }
         gatewayAllocator.applyFailedShards(allocation);
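
To illustrate why re-resolving by allocation ID makes the batch order irrelevant, here is a minimal, self-contained sketch. It does not use any Elasticsearch classes: `Routing`, `Table`, `relocatingTable` and the string outcomes are hypothetical stand-ins for ShardRouting, the routing nodes, and the trace logging in the diff, and the relocation rules are simplified assumptions (failing a source also removes its target; failing a target replaces the source entry with a new instance).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReResolveSketch {

    /** Hypothetical stand-in for ShardRouting: immutable, so every change creates a new instance. */
    static final class Routing {
        final String allocationId;
        final String state;

        Routing(String allocationId, String state) {
            this.allocationId = allocationId;
            this.state = state;
        }
    }

    /** Hypothetical stand-in for the routing table, indexed by allocation ID. */
    static final class Table {
        private final Map<String, Routing> byAllocationId = new HashMap<>();
        private final Map<String, String> targetOfSource = new HashMap<>();
        private final Map<String, String> sourceOfTarget = new HashMap<>();

        void add(Routing routing) {
            byAllocationId.put(routing.allocationId, routing);
        }

        void linkRelocation(String sourceId, String targetId) {
            targetOfSource.put(sourceId, targetId);
            sourceOfTarget.put(targetId, sourceId);
        }

        Routing getByAllocationId(String allocationId) {
            return byAllocationId.get(allocationId);
        }

        /** Simplified assumption: one failure can touch two entries. */
        void fail(Routing current) {
            String id = current.allocationId;
            byAllocationId.remove(id);
            String target = targetOfSource.remove(id);
            if (target != null) {
                // current was a relocation source: its target is removed as well
                byAllocationId.remove(target);
                sourceOfTarget.remove(target);
                return;
            }
            String source = sourceOfTarget.remove(id);
            if (source != null) {
                // current was a relocation target: the source entry is replaced by a new instance
                targetOfSource.remove(source);
                byAllocationId.put(source, new Routing(source, "STARTED"));
            }
        }
    }

    /** Applies a batch in the given order, re-resolving each entry before failing it. */
    static List<String> applyFailedShards(Table table, List<Routing> failedShards) {
        List<String> outcome = new ArrayList<>();
        for (Routing shardToFail : failedShards) {
            Routing current = table.getByAllocationId(shardToFail.allocationId);
            if (current == null) {
                // entry was already removed by an earlier failure in this batch
                outcome.add(shardToFail.allocationId + ": already failed");
            } else {
                // identity check mirrors the "modified in an earlier iteration" case
                outcome.add(shardToFail.allocationId + (current != shardToFail ? ": failed (re-resolved)" : ": failed"));
                table.fail(current);
            }
        }
        return outcome;
    }

    /** Builds a table with one relocation in flight: source "src", target "tgt". */
    static Table relocatingTable() {
        Table table = new Table();
        table.add(new Routing("src", "RELOCATING"));
        table.add(new Routing("tgt", "INITIALIZING"));
        table.linkRelocation("src", "tgt");
        return table;
    }

    public static void main(String[] args) {
        Table a = relocatingTable();
        System.out.println(applyFailedShards(a, Arrays.asList(a.getByAllocationId("src"), a.getByAllocationId("tgt"))));
        Table b = relocatingTable();
        System.out.println(applyFailedShards(b, Arrays.asList(b.getByAllocationId("tgt"), b.getByAllocationId("src"))));
    }
}
```

In both orders the second entry of the batch is stale by the time it is processed: either it is gone from the table or it has been replaced by a new instance. Looking it up by allocation ID handles both cases without having to sort the batch first.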
