Unassign persistent tasks when the executor node dies #37609
Closed
PersistentTasksClusterService decides whether a task should be reassigned by checking that the cluster contains a node with the same Id, rather than by comparing the ephemeral Id or the actual DiscoveryNode.
https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/persistent/PersistentTasksClusterService.java#L389
PersistentTasksClusterService runs on the master node. If a node exits and re-enters the cluster while there is no master node, PersistentTasksClusterService is not active to witness the change and unassign the task. Once the node has returned and PersistentTasksClusterService is running again, the assignment appears valid because a node with that Id is in the cluster, but the node's departure was missed. This happens during the rolling upgrade tests, where there are 3 nodes and minimum master nodes is set to 3.
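The difference between the two checks can be sketched as follows. This is a minimal, standalone illustration and not the Elasticsearch code: the `Node` class, the `validByNodeId` and `validByEphemeralId` helpers, and the id strings are all hypothetical names chosen for the example. It assumes only that a node keeps its Id across a restart while its ephemeral Id changes.

```java
import java.util.HashMap;
import java.util.Map;

public class AssignmentCheckSketch {

    /** Hypothetical stand-in for a cluster node: a stable id plus an id that changes on each restart. */
    static final class Node {
        final String id;
        final String ephemeralId;

        Node(String id, String ephemeralId) {
            this.id = id;
            this.ephemeralId = ephemeralId;
        }
    }

    /** Id-only check: valid if any node with the executor's id is currently in the cluster. */
    static boolean validByNodeId(Map<String, Node> clusterNodes, Node executor) {
        return clusterNodes.containsKey(executor.id);
    }

    /** Stricter check: the node in the cluster must also have the same ephemeral id. */
    static boolean validByEphemeralId(Map<String, Node> clusterNodes, Node executor) {
        Node current = clusterNodes.get(executor.id);
        return current != null && current.ephemeralId.equals(executor.ephemeralId);
    }

    public static void main(String[] args) {
        // The node instance the task was originally assigned to.
        Node assigned = new Node("node-1", "ephemeral-a");
        // The same node after leaving and rejoining: same id, new ephemeral id.
        Node restarted = new Node("node-1", "ephemeral-b");

        Map<String, Node> clusterNodes = new HashMap<>();
        clusterNodes.put(restarted.id, restarted);

        System.out.println(validByNodeId(clusterNodes, assigned));      // prints "true"
        System.out.println(validByEphemeralId(clusterNodes, assigned)); // prints "false"
    }
}
```

With the id-only check the stale assignment still looks valid after the restart, which matches the missed-departure case described above. Comparing the ephemeral id flags the assignment as stale.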
This change is much larger than it should be because of a rename. "Deassociate" isn't a word in British or American English dictionaries; the options are "disassociate" or "dissociate".
This is for 6.7 only because of Zen2, although the rename could be forward-ported.