Master should wait on cluster state publication when failing a shard #15468

jasontedor · 2015-12-16T03:05:45Z

When a client sends a request to fail a shard to the master, the current
behavior is that the master will submit the cluster state update task
and then immediately send a successful response back to the client;
additionally, if there are any failures while processing the cluster
state update task to fail the shard, then the client will never be
notified of these failures.

This commit modifies the master behavior when handling requests to fail
a shard. In particular, the master will now wait until successful
publication of the cluster state update before notifying the requesting
client that the shard is marked as failed; additionally, the client is
now notified of any failures during the execution of the cluster state
update task.

Relates #14252
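
For readers skimming the description, here is a minimal sketch of the behavior change, not the code in this PR. It assumes hypothetical stand-ins (`ToyMaster`, `ResponseChannel`, `UpdateListener`, `RecordingChannel`) rather than the real Elasticsearch classes, and the listener signatures are simplified. The point it illustrates: the handler sends nothing at submission time, and the response (success or failure) is only sent from the task's callbacks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the Elasticsearch classes.
final class ShardFailedSketch {

    /** Channel the master uses to answer the requesting client. */
    interface ResponseChannel {
        void sendSuccess();
        void sendFailure(Exception e);
    }

    /** Simplified callbacks for a cluster state update task. */
    interface UpdateListener {
        void onFailure(String source, Exception e);
        void clusterStateProcessed(String source);
    }

    /** Toy master: submitted tasks are held until publish() is called. */
    static final class ToyMaster {
        private final List<Runnable> pending = new ArrayList<>();

        void submitShardFailed(String source, boolean taskFails, UpdateListener listener) {
            pending.add(() -> {
                if (taskFails) {
                    listener.onFailure(source, new IllegalStateException("update failed"));
                } else {
                    listener.clusterStateProcessed(source);
                }
            });
        }

        /** Stands in for executing the batch and publishing the new cluster state. */
        void publish() {
            List<Runnable> batch = new ArrayList<>(pending);
            pending.clear();
            batch.forEach(Runnable::run);
        }
    }

    /** Handler in the new style: no response is sent at submission time. */
    static void handleShardFailed(ToyMaster master, boolean taskFails, ResponseChannel channel) {
        master.submitShardFailed("shard-failed", taskFails, new UpdateListener() {
            @Override
            public void onFailure(String source, Exception e) {
                channel.sendFailure(e); // the client now learns about task failures
            }

            @Override
            public void clusterStateProcessed(String source) {
                channel.sendSuccess(); // only after the update has been processed
            }
        });
    }

    /** Records what was sent so the ordering can be inspected. */
    static final class RecordingChannel implements ResponseChannel {
        final List<String> events = new ArrayList<>();

        @Override
        public void sendSuccess() {
            events.add("success");
        }

        @Override
        public void sendFailure(Exception e) {
            events.add("failure: " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        ToyMaster master = new ToyMaster();
        RecordingChannel channel = new RecordingChannel();
        handleShardFailed(master, false, channel);
        System.out.println("before publish: " + channel.events);
        master.publish();
        System.out.println("after publish: " + channel.events);
    }
}
```

Under the old behavior the equivalent of `sendSuccess()` would have been called right after `submitShardFailed(...)`, so a failing task would have gone unreported.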

bleskes · 2015-12-16T12:44:22Z

core/src/main/java/org/elasticsearch/cluster/action/shard/ShardStateAction.java

+                    }
+
+                    @Override
+                    public void clusterStateProcessed(String source, ClusterState oldState, ClusterState newState) {


this keeps bugging me :) we should do something on the executor as well....

@bleskes What keeps bugging you?

We discussed this through another channel; the issue is that possible reroutes are being issued for every task rather than once per batch. Changing this behavior will require a little extra machinery in the cluster state task execution framework. I opened #15482 to address this.
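
To make the per-task versus per-batch distinction concrete, a rough sketch follows. It is not the change proposed in #15482, and every name in it (`BatchRerouteSketch`, `Rerouter`, `applyPerTask`, `applyPerBatch`) is hypothetical; marking a shard as failed is elided and only the number of reroute calls is tracked.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the Elasticsearch task execution framework.
final class BatchRerouteSketch {

    /** Counts how often the reroute step runs. */
    static final class Rerouter {
        int calls = 0;

        void reroute() {
            calls++;
        }
    }

    /** Per-task style: every shard-failed task triggers its own reroute. */
    static int applyPerTask(List<String> failedShards, Rerouter rerouter) {
        int applied = 0;
        for (String shard : failedShards) {
            applied++; // marking `shard` as failed is elided
            rerouter.reroute();
        }
        return applied;
    }

    /** Per-batch style: apply all tasks first, then reroute once. */
    static int applyPerBatch(List<String> failedShards, Rerouter rerouter) {
        int applied = 0;
        for (String shard : failedShards) {
            applied++; // marking `shard` as failed is elided
        }
        if (applied > 0) {
            rerouter.reroute(); // a single reroute for the whole batch
        }
        return applied;
    }

    public static void main(String[] args) {
        List<String> shards = Arrays.asList("shard-0", "shard-1", "shard-2");
        Rerouter perTask = new Rerouter();
        Rerouter perBatch = new Rerouter();
        applyPerTask(shards, perTask);
        applyPerBatch(shards, perBatch);
        System.out.println("per-task reroutes: " + perTask.calls);
        System.out.println("per-batch reroutes: " + perBatch.calls);
    }
}
```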

bleskes · 2015-12-16T12:45:17Z

LGTM


jasontedor added >enhancement review resiliency v5.0.0-alpha1 labels Dec 16, 2015

jasontedor assigned bleskes Dec 16, 2015

jasontedor mentioned this pull request Dec 16, 2015

Wait on shard failures #14252

Closed

9 tasks

bleskes reviewed Dec 16, 2015

jasontedor added a commit that referenced this pull request Dec 16, 2015

Merge pull request #15468 from jasontedor/master-side-of-wait-on-shar…

3e8768f

…d-failures Master should wait on cluster state publication when failing a shard

jasontedor merged commit 3e8768f into elastic:master Dec 16, 2015

jasontedor deleted the master-side-of-wait-on-shard-failures branch December 16, 2015 15:39

clintongormley added :Distributed Indexing/Distributed and removed :Cluster labels Feb 13, 2018
