Electing a master node and changing the cluster state are the two fundamental
tasks that master-eligible nodes must work together to perform. It is important
- that these activities work robustly even if some nodes have failed, and
- Elasticsearch achieves this robustness by only considering each action to have
- succeeded on receipt of responses from a _quorum_, a subset of the
+ that these activities work robustly even if some nodes have failed.
+ Elasticsearch achieves this robustness by considering each action to have
+ succeeded on receipt of responses from a _quorum_, which is a subset of the
master-eligible nodes in the cluster. The advantage of requiring only a subset
- of the nodes to respond is that it allows for some of the nodes to fail without
- preventing the cluster from making progress, and the quorums are carefully
- chosen so as not to allow the cluster to "split brain", i.e. to be partitioned
- into two pieces each of which may make decisions that are inconsistent with
+ of the nodes to respond is that it means some of the nodes can fail without
+ preventing the cluster from making progress. The quorums are carefully
+ chosen so the cluster does not have a "split brain" scenario where it's
+ partitioned into two pieces, each of which may make decisions that are
+ inconsistent with
those of the other piece.

Elasticsearch allows you to add and remove master-eligible nodes to a running
cluster. In many cases you can do this simply by starting or stopping the nodes
- as required, as described in more detail in the
- <<modules-discovery-adding-removing-nodes,section on adding and removing
- nodes>>.
+ as required. See
+ <<modules-discovery-adding-removing-nodes>>.

As nodes are added or removed Elasticsearch maintains an optimal level of fault
tolerance by updating the cluster's _voting configuration_, which is the set of
master-eligible nodes whose responses are counted when making decisions such as
- electing a new master or committing a new cluster state. A decision is only made
- once more than half of the nodes in the voting configuration have responded.
+ electing a new master or committing a new cluster state. A decision is made
+ only after more than half of the nodes in the voting configuration have responded.
Usually the voting configuration is the same as the set of all the
- master-eligible nodes that are currently in the cluster, but there are some
+ master-eligible nodes that are currently in the cluster. However, there are some
situations in which they may be different.

To be sure that the cluster remains available you **must not stop half or more
of the nodes in the voting configuration at the same time**. As long as more
than half of the voting nodes are available the cluster can still work normally.
- This means that if there are three or four master-eligible nodes then the
- cluster can tolerate one of them being unavailable; if there are two or fewer
- master-eligible nodes then they must all remain available.
+ This means that if there are three or four master-eligible nodes, the
+ cluster can tolerate one of them being unavailable. If there are two or fewer
+ master-eligible nodes, they must all remain available.
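For example, one way to check which nodes in a cluster are master-eligible is
the `_cat/nodes` API. This is a sketch, and the column list shown is just one
possible choice:

[source,js]
--------------------------------------------------
GET /_cat/nodes?v&h=name,node.role,master
--------------------------------------------------
// CONSOLE

Nodes whose `node.role` value contains `m` are master-eligible, and the elected
master is marked with `*` in the `master` column.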

After a node has joined or left the cluster the elected master must issue a
cluster-state update that adjusts the voting configuration to match, and this
@@ -43,43 +42,43 @@ to complete before removing more nodes from the cluster.
[float]
==== Setting the initial quorum

- When a brand-new cluster starts up for the first time, one of the tasks it must
- perform is to elect its first master node, for which it needs to know the set
- of master-eligible nodes whose votes should count in this first election . This
+ When a brand-new cluster starts up for the first time, it must
+ elect its first master node. For this election, it needs to know the set
+ of master-eligible nodes whose votes should count. This
initial voting configuration is known as the _bootstrap configuration_ and is
set in the <<modules-discovery-bootstrap-cluster,cluster bootstrapping
process>>.

It is important that the bootstrap configuration identifies exactly which nodes
54
- should vote in the first election, and it is not sufficient to configure each
53
+ should vote in the first election. It is not sufficient to configure each
55
54
node with an expectation of how many nodes there should be in the cluster. It
56
55
is also important to note that the bootstrap configuration must come from
57
56
outside the cluster: there is no safe way for the cluster to determine the
58
57
bootstrap configuration correctly on its own.
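
For example, a minimal sketch for a cluster whose three master-eligible nodes
are named `master-a`, `master-b` and `master-c` (hypothetical names) would set
the same bootstrap configuration on each of those nodes in `elasticsearch.yml`,
using the `cluster.initial_master_nodes` setting described in the
bootstrapping process linked above:

[source,yaml]
--------------------------------------------------
cluster.initial_master_nodes:
  - master-a
  - master-b
  - master-c
--------------------------------------------------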

- If the bootstrap configuration is not set correctly then there is a risk when
- starting up a brand-new cluster is that you accidentally form two separate
- clusters instead of one. This could lead to data loss: you might start using
- both clusters before noticing that anything had gone wrong, and it will then be
+ If the bootstrap configuration is not set correctly, when
+ you start a brand-new cluster there is a risk that you will accidentally form two separate
+ clusters instead of one. This situation can lead to data loss: you might start
+ using both clusters before you notice that anything has gone wrong, and it is then
impossible to merge them together later.

NOTE: To illustrate the problem with configuring each node to expect a certain
cluster size, imagine starting up a three-node cluster in which each node knows
that it is going to be part of a three-node cluster. A majority of three nodes
- is two, so normally the first two nodes to discover each other will form a
- cluster and the third node will join them a short time later. However, imagine
- that four nodes were erroneously started instead of three: in this case there
+ is two, so normally the first two nodes to discover each other form a
+ cluster and the third node joins them a short time later. However, imagine
+ that four nodes were erroneously started instead of three. In this case, there
are enough nodes to form two separate clusters. Of course if each node is
- started manually then it's unlikely that too many nodes are started, but it's
- certainly possible to get into this situation if using a more automated
- orchestrator, particularly if the orchestrator is not resilient to failures
+ started manually then it's unlikely that too many nodes are started. If you're
+ using an automated orchestrator, however, it's certainly possible to get into
+ this situation, particularly if the orchestrator is not resilient to failures
such as network partitions.

The initial quorum is only required the very first time a whole cluster starts
- up: new nodes joining an established cluster can safely obtain all the
- information they need from the elected master, and nodes that have previously
- been part of a cluster will have stored to disk all the information required
- when restarting .
+ up. New nodes joining an established cluster can safely obtain all the
+ information they need from the elected master. Nodes that have previously
+ been part of a cluster will have stored to disk all the information that is
+ required when they restart.

[float]
==== Cluster maintenance, rolling restarts and migrations
@@ -99,7 +98,7 @@ nodes is not changing permanently.
Nodes may join or leave the cluster, and Elasticsearch reacts by making
corresponding changes to the voting configuration in order to ensure that the
cluster is as resilient as possible. The default auto-reconfiguration behaviour
- is expected to give the best results in most situation . The current voting
+ is expected to give the best results in most situations. The current voting
configuration is stored in the cluster state so you can inspect its current
contents as follows:

@@ -111,24 +110,24 @@ GET /_cluster/state?filter_path=metadata.cluster_coordination.last_committed_config

NOTE: The current voting configuration is not necessarily the same as the set of
all available master-eligible nodes in the cluster. Altering the voting
- configuration itself involves taking a vote, so it takes some time to adjust the
+ configuration involves taking a vote, so it takes some time to adjust the
configuration as nodes join or leave the cluster. Also, there are situations
where the most resilient configuration includes unavailable nodes, or does not
include some available nodes, and in these situations the voting configuration
- will differ from the set of available master-eligible nodes in the cluster.
+ differs from the set of available master-eligible nodes in the cluster.

- Larger voting configurations are usually more resilient, so Elasticsearch will
- normally prefer to add master-eligible nodes to the voting configuration once
- they have joined the cluster. Similarly, if a node in the voting configuration
+ Larger voting configurations are usually more resilient, so Elasticsearch
+ normally prefers to add master-eligible nodes to the voting configuration after
+ they join the cluster. Similarly, if a node in the voting configuration
leaves the cluster and there is another master-eligible node in the cluster that
is not in the voting configuration then it is preferable to swap these two nodes
- over, leaving the size of the voting configuration unchanged but increasing its
- resilience.
+ over. The size of the voting configuration is thus unchanged but its
+ resilience increases.

It is not so straightforward to automatically remove nodes from the voting
- configuration after they have left the cluster, and different strategies have
+ configuration after they have left the cluster. Different strategies have
different benefits and drawbacks, so the right choice depends on how the cluster
- will be used and is controlled by the following setting.
+ will be used. You can control whether the voting configuration automatically
+ shrinks by using the following setting:

`cluster.auto_shrink_voting_configuration`::

@@ -151,30 +150,30 @@ configuration manually, using the
<<modules-discovery-adding-removing-nodes,voting exclusions API>>, to achieve
the desired level of resilience.
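
For example, a minimal sketch of manually excluding a departing node, where
`node_name` is a placeholder for the name of the node to exclude:

[source,js]
--------------------------------------------------
POST /_cluster/voting_config_exclusions/node_name
--------------------------------------------------
// CONSOLE

Once the excluded node has been taken offline and removed from the cluster, the
exclusion list can be cleared again with
`DELETE /_cluster/voting_config_exclusions`.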

- Note that Elasticsearch will not suffer from a "split-brain" inconsistency
- however it is configured. This setting only affects its availability in the
+ No matter how it is configured, Elasticsearch will not suffer from a
+ "split-brain" inconsistency. The `cluster.auto_shrink_voting_configuration`
+ setting affects only its availability in the
event of the failure of some of its nodes, and the administrative tasks that
must be performed as nodes join and leave the cluster.
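
As a sketch, this setting is dynamic, so it can be adjusted at runtime through
the cluster settings API, for instance to keep every master-eligible node in
the voting configuration even after it leaves the cluster:

[source,js]
--------------------------------------------------
PUT /_cluster/settings
{
  "persistent": {
    "cluster.auto_shrink_voting_configuration": false
  }
}
--------------------------------------------------
// CONSOLE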

[float]
==== Even numbers of master-eligible nodes

There should normally be an odd number of master-eligible nodes in a cluster.
- If there is an even number then Elasticsearch will leave one of them out of the
- voting configuration to ensure that it has an odd size. This does not decrease
- the failure-tolerance of the cluster, and in fact improves it slightly: if the
+ If there is an even number, Elasticsearch leaves one of them out of the
+ voting configuration to ensure that it has an odd size. This omission does not
+ decrease the failure-tolerance of the cluster. In fact, it improves it
+ slightly: if the
cluster is partitioned into two even halves then one of the halves will contain
- a majority of the voting configuration and will be able to keep operating,
- whereas if all of the master-eligible nodes' votes were counted then neither
+ a majority of the voting configuration and will be able to keep operating.
+ If all of the master-eligible nodes' votes were counted, neither
side could make any progress in this situation.

For instance if there are four master-eligible nodes in the cluster and the
- voting configuration contained all of them then any quorum-based decision would
- require votes from at least three of them, which means that the cluster can only
- tolerate the loss of a single master-eligible node. If this cluster were split
- into two equal halves then neither half would contain three master-eligible
- nodes so would not be able to make any progress. However if the voting
- configuration contains only three of the four master-eligible nodes then the
+ voting configuration contained all of them, any quorum-based decision would
+ require votes from at least three of them. This means that the cluster can
+ tolerate the loss of only a single master-eligible node. If this cluster were
+ split into two equal halves, neither half would contain three master-eligible
+ nodes and the cluster would not be able to make any progress. However, if the
+ voting configuration contains only three of the four master-eligible nodes, the
cluster is still only fully tolerant to the loss of one node, but quorum-based
decisions require votes from two of the three voting nodes. In the event of an
even split, one half will contain two of the three voting nodes so will remain