Commit 2368b16

[DOCS] Adds overview and API ref for voting configurations

1 parent 6781a29 commit 2368b16

File tree

5 files changed

+171 -8 lines changed

docs/reference/cluster.asciidoc

Lines changed: 2 additions & 0 deletions
@@ -104,3 +104,5 @@ include::cluster/tasks.asciidoc[]
 include::cluster/nodes-hot-threads.asciidoc[]
 
 include::cluster/allocation-explain.asciidoc[]
+
+include::cluster/voting-exclusions.asciidoc[]
docs/reference/cluster/voting-exclusions.asciidoc

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
[[voting-config-exclusions]]
== Voting configuration exclusions API
++++
<titleabbrev>Voting configuration exclusions</titleabbrev>
++++

Adds or removes nodes from the voting configuration exclusion list.

[float]
=== Request

[source,js]
--------------------------------------------------
# Add a node to the voting configuration exclusions list
POST /_cluster/voting_config_exclusions/<node_name>

# Remove all exclusions from the list
DELETE /_cluster/voting_config_exclusions
--------------------------------------------------
// CONSOLE

[float]
=== Path parameters

`node_name`::
A <<cluster-nodes,node filter>> that identifies {es} nodes.
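
For example, as a sketch (the node names `node-1` and `node-2` are
hypothetical), a node filter can name several departing master-eligible nodes
in a single request:

[source,js]
--------------------------------------------------
# Exclude two hypothetical nodes, node-1 and node-2, in one call
POST /_cluster/voting_config_exclusions/node-1,node-2
--------------------------------------------------
// CONSOLE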

[float]
=== Description

If the <<modules-discovery-settings,`cluster.auto_shrink_voting_configuration` setting>>
is `true`, the <<modules-discovery-voting,voting configuration>> automatically
shrinks when you remove master-eligible nodes from the cluster.

If the `cluster.auto_shrink_voting_configuration` setting is `false`, you must
use this API to remove departed nodes from the voting configuration manually.
It adds an entry for each of those nodes to the voting configuration exclusions
list. The cluster then tries to reconfigure the voting configuration to remove
those nodes and to prevent them from returning.

If the API fails, you can safely retry it. Only a successful response
guarantees that the node has been removed from the voting configuration and will
not be reinstated.
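
Because only a successful response guarantees that the exclusion took effect,
you might follow a call to this API by inspecting the committed voting
configuration, as in this sketch (the node name `node-1` is hypothetical; the
filtered cluster state request is the same one shown in
<<modules-discovery-voting>>):

[source,js]
--------------------------------------------------
# Exclude a departing master-eligible node
POST /_cluster/voting_config_exclusions/node-1

# Inspect the node IDs in the committed voting configuration
GET /_cluster/state?filter_path=metadata.cluster_coordination.last_committed_config
--------------------------------------------------
// CONSOLE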

NOTE: Voting exclusions are required only when you remove at least half of the
master-eligible nodes from a cluster in a short time period. They are not
required when removing master-ineligible nodes or fewer than half of the
master-eligible nodes.

The
<<modules-discovery-settings,`cluster.max_voting_config_exclusions` setting>>
limits the size of the voting configuration exclusion list. The default value is
`10`. Since voting configuration exclusions are persistent and limited in number,
you must clean up the list.
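
Once the excluded nodes have permanently left the cluster, you can clear the
list with the `DELETE` request shown in the Request section above:

[source,js]
--------------------------------------------------
# Clear the voting configuration exclusions list
DELETE /_cluster/voting_config_exclusions
--------------------------------------------------
// CONSOLE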

For more information, see <<modules-discovery-removing-nodes>>.

docs/reference/modules/discovery.asciidoc

Lines changed: 9 additions & 7 deletions
@@ -13,6 +13,11 @@ module. This module is divided into the following sections:
 unknown, such as when a node has just started up or when the previous
 master has failed.
 
+<<modules-discovery-quorums>>::
+
+This section describes the detailed design behind the master election and
+auto-reconfiguration logic.
+
 <<modules-discovery-bootstrap-cluster>>::
 
 Bootstrapping a cluster is required when an Elasticsearch cluster starts up

@@ -39,11 +44,6 @@ module. This module is divided into the following sections:
 
 Cluster state publishing is the process by which the elected master node
 updates the cluster state on all the other nodes in the cluster.
-
-<<modules-discovery-quorums>>::
-
-This section describes the detailed design behind the master election and
-auto-reconfiguration logic.
 
 <<modules-discovery-settings,Settings>>::
 

@@ -52,14 +52,16 @@ module. This module is divided into the following sections:
 
 include::discovery/discovery.asciidoc[]
 
+include::discovery/quorums.asciidoc[]
+
+include::discovery/voting.asciidoc[]
+
 include::discovery/bootstrapping.asciidoc[]
 
 include::discovery/adding-removing-nodes.asciidoc[]
 
 include::discovery/publishing.asciidoc[]
 
-include::discovery/quorums.asciidoc[]
-
 include::discovery/fault-detection.asciidoc[]
 
 include::discovery/discovery-settings.asciidoc[]
docs/reference/modules/discovery/adding-removing-nodes.asciidoc

Lines changed: 3 additions & 1 deletion
@@ -12,6 +12,7 @@ cluster, and to scale the cluster up and down by adding and removing
 master-ineligible nodes only. However there are situations in which it may be
 desirable to add or remove some master-eligible nodes to or from a cluster.
 
+[[modules-discovery-adding-nodes]]
 ==== Adding master-eligible nodes
 
 If you wish to add some nodes to your cluster, simply configure the new nodes

@@ -24,6 +25,7 @@ cluster. You can use the `cluster.join.timeout` setting to configure how long a
 node waits after sending a request to join a cluster. Its default value is `30s`.
 See <<modules-discovery-settings>>.
 
+[[modules-discovery-removing-nodes]]
 ==== Removing master-eligible nodes
 
 When removing master-eligible nodes, it is important not to remove too many all

@@ -50,7 +52,7 @@ will never automatically move a node on the voting exclusions list back into the
 voting configuration. Once an excluded node has been successfully
 auto-reconfigured out of the voting configuration, it is safe to shut it down
 without affecting the cluster's master-level availability. A node can be added
-to the voting configuration exclusion list using the following API:
+to the voting configuration exclusion list using the <<voting-config-exclusions>> API. For example:
 
 [source,js]
 --------------------------------------------------
docs/reference/modules/discovery/voting.asciidoc

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
[[modules-discovery-voting]]
=== Voting configurations

Each {es} cluster has a _voting configuration_, which is the set of
<<master-node,master-eligible nodes>> whose responses are counted when making
decisions such as electing a new master or committing a new cluster
state. Decisions are made only after a _quorum_ (more than half) of the nodes in
the voting configuration respond.

Usually the voting configuration is the same as the set of all the
master-eligible nodes that are currently in the cluster. However, there are some
situations in which they may be different.

IMPORTANT: To ensure the cluster remains available, you **must not stop half or
more of the nodes in the voting configuration at the same time**. As long as more
than half of the voting nodes are available, the cluster can work normally. For
example, if there are three or four master-eligible nodes, the cluster
can tolerate one unavailable node. If there are two or fewer master-eligible
nodes, they must all remain available.

After a node joins or leaves the cluster, {es} reacts by automatically making
corresponding changes to the voting configuration in order to ensure that the
cluster is as resilient as possible. It is important to wait for this adjustment
to complete before you remove more nodes from the cluster. For more information,
see <<modules-discovery-adding-removing-nodes>>.

The current voting configuration is stored in the cluster state so you can
inspect its current contents as follows:

[source,js]
--------------------------------------------------
GET /_cluster/state?filter_path=metadata.cluster_coordination.last_committed_config
--------------------------------------------------
// CONSOLE
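
The response lists the node IDs of the nodes in the committed voting
configuration. As a sketch (the node IDs below are placeholders), the filtered
cluster state looks roughly like this:

[source,js]
--------------------------------------------------
{
  "metadata": {
    "cluster_coordination": {
      "last_committed_config": [
        "node_id_1",
        "node_id_2",
        "node_id_3"
      ]
    }
  }
}
--------------------------------------------------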

NOTE: The current voting configuration is not necessarily the same as the set of
all available master-eligible nodes in the cluster. Altering the voting
configuration involves taking a vote, so it takes some time to adjust the
configuration as nodes join or leave the cluster. Also, there are situations
where the most resilient configuration includes unavailable nodes, or does not
include some available nodes, and in these situations the voting configuration
differs from the set of available master-eligible nodes in the cluster.

Larger voting configurations are usually more resilient, so Elasticsearch
normally prefers to add master-eligible nodes to the voting configuration after
they join the cluster. Similarly, if a node in the voting configuration
leaves the cluster and there is another master-eligible node in the cluster that
is not in the voting configuration then it is preferable to swap these two nodes
over. The size of the voting configuration is thus unchanged but its
resilience increases.

It is not so straightforward to automatically remove nodes from the voting
configuration after they have left the cluster. Different strategies have
different benefits and drawbacks, so the right choice depends on how the cluster
will be used. You can control whether the voting configuration automatically
shrinks by using the
<<modules-discovery-settings,`cluster.auto_shrink_voting_configuration` setting>>.
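
For example, a minimal sketch of disabling automatic shrinking, assuming the
setting can be updated dynamically through the cluster settings API:

[source,js]
--------------------------------------------------
PUT /_cluster/settings
{
  "persistent": {
    "cluster.auto_shrink_voting_configuration": false
  }
}
--------------------------------------------------
// CONSOLE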

NOTE: If `cluster.auto_shrink_voting_configuration` is set to `true`, the
recommended and default setting, and there are at least three master-eligible
nodes in the cluster, Elasticsearch remains capable of processing cluster state
updates as long as all but one of its master-eligible nodes are healthy.

There are situations in which Elasticsearch might tolerate the loss of multiple
nodes, but this is not guaranteed under all sequences of failures. If the
`cluster.auto_shrink_voting_configuration` setting is `false`, you must remove
departed nodes from the voting configuration manually. Use the
<<voting-config-exclusions,voting exclusions API>> to achieve the desired level
of resilience.

No matter how it is configured, Elasticsearch will not suffer from a
"split-brain" inconsistency. The `cluster.auto_shrink_voting_configuration`
setting affects only its availability in the event of the failure of some of its
nodes, and the administrative tasks that must be performed as nodes join and
leave the cluster.

[float]
==== Even numbers of master-eligible nodes

There should normally be an odd number of master-eligible nodes in a cluster.
If there is an even number, Elasticsearch leaves one of them out of the voting
configuration to ensure that it has an odd size. This omission does not decrease
the failure-tolerance of the cluster. In fact, it improves it slightly: if the
cluster suffers from a network partition that divides it into two equally-sized
halves then one of the halves will contain a majority of the voting
configuration and will be able to keep operating. If all of the votes from
master-eligible nodes were counted, neither side would contain a strict majority
of the nodes and so the cluster would not be able to make any progress.

For instance, if there are four master-eligible nodes in the cluster and the
voting configuration contains all of them, any quorum-based decision would
require votes from at least three of them. This situation means that the cluster
can tolerate the loss of only a single master-eligible node. If this cluster
were split into two equal halves, neither half would contain three
master-eligible nodes and the cluster would not be able to make any progress.
If the voting configuration contains only three of the four master-eligible
nodes, however, the cluster is still only fully tolerant to the loss of one
node, but quorum-based decisions require votes from two of the three voting
nodes. In the event of an even split, one half will contain two of the three
voting nodes so that half will remain available.
