Drop node if asymmetrically partitioned from master #39598
Conversation
When a node is joining the cluster we ensure that it can send requests to the master _at that time_. If it joins the cluster and _then_ loses the ability to send requests to the master then it should be removed from the cluster. Today this is not the case: the master can still receive responses to its follower checks, and receives acknowledgements to cluster state publications, so has no reason to remove the node. This commit changes the handling of follower checks so that they fail if they come from a master that the other node was following but which it now believes to have failed.
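To make the described rule concrete, here is a minimal sketch of the decision it introduces. This is a hypothetical distillation, not the actual Elasticsearch `FollowersChecker` code; all names are illustrative.

```java
// Sketch (hypothetical names) of the rule this commit introduces: a follower
// check fails when it comes from a master the node was following but which it
// now believes to have failed.
final class FollowerCheckRule {
    enum Mode { FOLLOWER, CANDIDATE }

    // lastFailedLeader is the master this node was following before it decided
    // that master had failed and became a CANDIDATE again.
    static boolean shouldRejectCheck(Mode mode, String lastFailedLeader, String checker) {
        return mode == Mode.CANDIDATE && checker.equals(lastFailedLeader);
    }

    public static void main(String[] args) {
        // The node followed "master-1", then lost the ability to send it
        // requests and marked it failed: a later check from "master-1" is
        // rejected, so "master-1" in turn removes this node from the cluster.
        System.out.println(shouldRejectCheck(Mode.CANDIDATE, "master-1", "master-1")); // true
        // A check from some other node is handled by the normal follow logic.
        System.out.println(shouldRejectCheck(Mode.CANDIDATE, "master-1", "master-2")); // false
    }
}
```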
Pinging @elastic/es-distributed
@elasticmachine run elasticsearch-ci/bwc
…ub.com:DaveCTurner/elasticsearch into 2019-03-02-remove-node-on-asymmetric-partition
```java
    });

    boolean isJoinPending() {
        // cannot use pendingOutgoingJoins.isEmpty() because it's not properly synchronized.
        return pendingOutgoingJoins.iterator().hasNext();
    }
```
I don't understand how using the iterator gives you any stronger guarantees.
The `ConcurrentHashMap` javadocs say:
* Bear in mind that the results of aggregate status methods including
* {@code size}, {@code isEmpty}, and {@code containsValue} are typically
* useful only when a map is not undergoing concurrent updates in other threads.
* Otherwise the results of these methods reflect transient states
* that may be adequate for monitoring or estimation purposes, but not
* for program control.
I believe the weakly-consistent-only guarantee given by `ConcurrentHashMap` iteration means that the following could happen:

- `pendingOutgoingJoins` has one entry, `e`.
- A thread `T1` calls `isJoinPending` and gets the iterator (strictly speaking, we halt inside iterator construction before `advance` is called).
- A thread `T2` adds another entry `f` and removes `e` (in that order). So `pendingOutgoingJoins` was never empty.
- `T1` continues and `hasNext()` can now return `false`.

Mutex protection is done in `Coordinator`, but is not applied upon receiving requests and responses. It is difficult to see whether the above scenario can lead to issues, but a simpler solution could be to make the `pendingOutgoingJoins` set synchronized instead (or maybe synchronize on `Coordinator.mutex` when manipulating it)?
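For reference, a minimal sketch of the synchronized-set alternative suggested here (names are illustrative; the real field holds join objects rather than strings):

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Sketch of the suggestion above: back pendingOutgoingJoins with a
// synchronized wrapper so that isEmpty() holds the set's lock and can never
// observe a transient mid-update state.
final class PendingJoins {
    private final Set<String> pendingOutgoingJoins =
        Collections.synchronizedSet(new HashSet<>());

    void addJoin(String join) {
        pendingOutgoingJoins.add(join);
    }

    void removeJoin(String join) {
        pendingOutgoingJoins.remove(join);
    }

    boolean isJoinPending() {
        // Safe here: the wrapper synchronizes isEmpty(), unlike the weakly
        // consistent aggregate methods of ConcurrentHashMap.
        return pendingOutgoingJoins.isEmpty() == false;
    }
}
```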
Also from the Javadocs:
* Iterators, Spliterators and Enumerations return elements reflecting the
* state of the hash table at some point at or since the creation of the
* iterator/enumeration.
The _at some point_ in that sentence indicates snapshot-like semantics, which would forbid this situation; also, the consequences are benign as far as I can see, but I'm all for reducing unnecessary mental load. I opened #39900.
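A small single-threaded demonstration of that Javadoc guarantee (my own example, not from the PR): a `ConcurrentHashMap` iterator tolerates concurrent modification and reflects the state of the map at some point at or since its creation.

```java
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class IteratorSnapshotDemo {
    public static void main(String[] args) {
        Set<String> joins = ConcurrentHashMap.newKeySet();
        joins.add("e");

        Iterator<String> it = joins.iterator();
        joins.remove("e"); // mutate after the iterator was created

        // No ConcurrentModificationException here. In practice this prints
        // true, because the iterator located "e" during construction; the
        // spec only promises it reflects the state at some point at or since
        // the iterator's creation.
        System.out.println("hasNext = " + it.hasNext());
    }
}
```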
```diff
@@ -1560,6 +1596,14 @@ void setEmptySeedHostsList() {
         seedHostsList = emptyList();
     }

+    void dropRequestsFrom(ClusterNode sender, ClusterNode destination) {
```
perhaps call this `blackHoleRequestsFrom`?
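For context, a rough sketch of what such a rule models in a test harness (hypothetical classes, not Elasticsearch's deterministic test framework): requests from one node to another vanish without any response, while traffic in the opposite direction still flows, i.e. an asymmetric partition.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical test-transport sketch: a black-holed request is neither
// delivered nor failed, which is how an asymmetric partition looks to the
// sender, and why "black hole" is a more accurate name than "drop".
final class FakeTransport {
    // Ordered pairs "sender->destination" whose requests are swallowed.
    private final Set<String> blackHoledLinks = ConcurrentHashMap.newKeySet();

    void blackHoleRequestsFrom(String sender, String destination) {
        blackHoledLinks.add(sender + "->" + destination);
    }

    // Returns true if the request should be delivered. Note the asymmetry:
    // black-holing A->B says nothing about B->A.
    boolean deliver(String sender, String destination) {
        return blackHoledLinks.contains(sender + "->" + destination) == false;
    }
}
```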