Commit 5b608a9

Some suggested changes
- add more side-conditions on when the voting configuration shrinks
- add TOC entries for voting (and omitted fault-detection) sections
- add link to voting section from quorum section to define 'voting configuration'
- move 'Setting the initial quorum' into voting config section
- minor rewording/whitespace fixes
1 parent 419b48e commit 5b608a9

6 files changed: +101 -82 lines changed


docs/reference/cluster/voting-exclusions.asciidoc

Lines changed: 22 additions & 17 deletions
@@ -1,10 +1,11 @@
 [[voting-config-exclusions]]
 == Voting configuration exclusions API
 ++++
-<titleabbrev>Voting configuration exclusions</titleabbrev>
+<titleabbrev>Voting Configuration Exclusions</titleabbrev>
 ++++

-Adds or removes nodes from the voting configuration exclusion list.
+Adds or removes master-eligible nodes from the
+<<modules-discovery-voting,voting configuration exclusion list>>.

 [float]
 === Request
@@ -28,16 +29,20 @@ DELETE /_cluster/voting_config_exclusions
 [float]
 === Description

-If the <<modules-discovery-settings,`cluster.auto_shrink_voting_configuration` setting>>
-is `true`, the <<modules-discovery-voting,voting configuration>> automatically
-shrinks when you remove master-eligible nodes from the cluster.
-
-If the `cluster.auto_shrink_voting_configuration` setting is `false`, you must
-use this API to remove departed nodes from the voting configuration manually.
-It adds an entry for that node in the voting configuration exclusions list. The
-cluster then tries to reconfigure the voting configuration to remove that node
-and to prevent it from returning.
-
+If the <<modules-discovery-settings,`cluster.auto_shrink_voting_configuration`
+setting>> is `true`, and there are more than three master-eligible nodes in the
+cluster, and you remove fewer than half of the master-eligible nodes in the
+cluster at once, then the <<modules-discovery-voting,voting configuration>>
+automatically shrinks when you remove master-eligible nodes from the cluster.
+
+If the `cluster.auto_shrink_voting_configuration` setting is `false`, or you
+wish to shrink the voting configuration to contain fewer than three nodes, or
+you wish to remove half or more of the master-eligible nodes in the cluster at
+once, you must use this API to remove departed nodes from the voting
+configuration manually. It adds an entry for that node in the voting
+configuration exclusions list. The cluster then tries to reconfigure the voting
+configuration to remove that node and to prevent it from returning.
+
 If the API fails, you can safely retry it. Only a successful response
 guarantees that the node has been removed from the voting configuration and will
 not be reinstated.
@@ -47,11 +52,11 @@ master-eligible nodes from a cluster in a short time period. They are not
 required when removing master-ineligible nodes or fewer than half of the
 master-eligible nodes.

-The
-<<modules-discovery-settings,`cluster.max_voting_config_exclusions` setting>>
-limits the size of the voting configuration exclusion list. The default value is
-`10`. Since voting configuration exclusions are persistent and limited in number,
-you must clean up the list.
+The <<modules-discovery-settings,`cluster.max_voting_config_exclusions`
+setting>> limits the size of the voting configuration exclusion list. The
+default value is `10`. Since voting configuration exclusions are persistent and
+limited in number, you must clear the voting config exclusions list once the
+exclusions are no longer required.

 For more information, see <<modules-discovery-removing-nodes>>.
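
For context, the workflow this description covers boils down to two requests: add an exclusion for the departing node, then clear the list once the exclusion is no longer needed. A sketch only, not part of the commit; the node name is illustrative, and the `POST` path-parameter form is an assumption that may differ between versions (only the `DELETE` endpoint appears in this file's examples).

[source,console]
----
# Exclude a master-eligible node from the voting configuration before
# taking it out of the cluster ("node_to_remove" is an illustrative name).
POST /_cluster/voting_config_exclusions/node_to_remove

# After the node has left and the exclusion is no longer required, clear the
# list so it stays under the cluster.max_voting_config_exclusions limit.
DELETE /_cluster/voting_config_exclusions
----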

docs/reference/modules/discovery.asciidoc

Lines changed: 13 additions & 4 deletions
@@ -15,8 +15,13 @@ module. This module is divided into the following sections:

 <<modules-discovery-quorums>>::

-This section describes the detailed design behind the master election and
-auto-reconfiguration logic.
+This section describes how {es} uses a quorum-based voting mechanism to
+make decisions even if some nodes are unavailable.
+
+<<modules-discovery-voting>>::
+
+This section describes the concept of voting configurations, which {es}
+automatically updates as nodes leave and join the cluster.

 <<modules-discovery-bootstrap-cluster>>::

@@ -44,7 +49,11 @@ module. This module is divided into the following sections:

 Cluster state publishing is the process by which the elected master node
 updates the cluster state on all the other nodes in the cluster.
-
+
+<<cluster-fault-detection>>::
+
+{es} performs health checks to detect and remove faulty nodes.
+
 <<modules-discovery-settings,Settings>>::

 There are settings that enable users to influence the discovery, cluster
@@ -64,4 +73,4 @@ include::discovery/publishing.asciidoc[]

 include::discovery/fault-detection.asciidoc[]

-include::discovery/discovery-settings.asciidoc[]
+include::discovery/discovery-settings.asciidoc[]

docs/reference/modules/discovery/discovery-settings.asciidoc

Lines changed: 10 additions & 7 deletions
@@ -5,11 +5,12 @@ Discovery and cluster formation are affected by the following settings:

 `cluster.auto_shrink_voting_configuration`::

-Controls whether the <<modules-discovery-voting,voting configuration>> sheds
-departed nodes automatically, as long as it still contains at least 3 nodes.
-The default value is `true`. If set to `false`, the voting configuration
-never shrinks automatically; you must remove departed nodes manually with
-the <<voting-config-exclusions,voting configuration exclusions API>>.
+Controls whether the <<modules-discovery-voting,voting configuration>>
+sheds departed nodes automatically, as long as it still contains at least 3
+nodes. The default value is `true`. If set to `false`, the voting
+configuration never shrinks automatically and you must remove departed
+nodes manually with the <<voting-config-exclusions,voting configuration
+exclusions API>>.

 [[master-election-settings]]`cluster.election.back_off_time`::

@@ -160,9 +161,11 @@ APIs are not be blocked and can run on any available node.

 Provides a list of master-eligible nodes in the cluster. The list contains
 either an array of hosts or a comma-delimited string. Each value has the
-format `host:port` or `host`, where `port` defaults to the setting `transport.profiles.default.port`. Note that IPv6 hosts must be bracketed.
+format `host:port` or `host`, where `port` defaults to the setting
+`transport.profiles.default.port`. Note that IPv6 hosts must be bracketed.
 The default value is `127.0.0.1, [::1]`. See <<unicast.hosts>>.

 `discovery.zen.ping.unicast.hosts.resolve_timeout`::

-Sets the amount of time to wait for DNS lookups on each round of discovery. This is specified as a <<time-units, time unit>> and defaults to `5s`.
+Sets the amount of time to wait for DNS lookups on each round of discovery.
+This is specified as a <<time-units, time unit>> and defaults to `5s`.
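
For context, an `elasticsearch.yml` sketch combining the settings touched in this file; not part of the commit, and the hosts and values are illustrative rather than recommendations:

[source,yaml]
----
# Sketch only; hosts and values are illustrative.
cluster.auto_shrink_voting_configuration: true
discovery.zen.ping.unicast.hosts:
  - 192.168.1.10:9300      # host:port form
  - 192.168.1.11           # port defaults to transport.profiles.default.port
  - "[::1]:9300"           # IPv6 hosts must be bracketed
discovery.zen.ping.unicast.hosts.resolve_timeout: 5s
----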

docs/reference/modules/discovery/fault-detection.asciidoc

Lines changed: 4 additions & 3 deletions
@@ -2,8 +2,9 @@
 === Cluster fault detection

 The elected master periodically checks each of the nodes in the cluster to
-ensure that they are still connected and healthy. Each node in the cluster also periodically checks the health of the elected master. These checks
-are known respectively as _follower checks_ and _leader checks_.
+ensure that they are still connected and healthy. Each node in the cluster also
+periodically checks the health of the elected master. These checks are known
+respectively as _follower checks_ and _leader checks_.

 Elasticsearch allows these checks to occasionally fail or timeout without
 taking any action. It considers a node to be faulty only after a number of
@@ -16,4 +17,4 @@ and retry setting values and attempts to remove the node from the cluster.
 Similarly, if a node detects that the elected master has disconnected, this
 situation is treated as an immediate failure. The node bypasses the timeout and
 retry settings and restarts its discovery phase to try and find or elect a new
-master.
+master.
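
The timeout and retry settings that this text refers to are the fault-detection settings; a sketch only, assuming the `cluster.fault_detection.*` setting names (which do not appear in this file) and showing illustrative values rather than recommendations:

[source,yaml]
----
# Sketch only; setting names are assumed, values are illustrative.
# Leader checks: each node periodically checks the elected master.
cluster.fault_detection.leader_check.interval: 1s
cluster.fault_detection.leader_check.timeout: 10s
cluster.fault_detection.leader_check.retry_count: 3
# Follower checks: the elected master periodically checks every other node.
cluster.fault_detection.follower_check.interval: 1s
cluster.fault_detection.follower_check.timeout: 10s
cluster.fault_detection.follower_check.retry_count: 3
----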

docs/reference/modules/discovery/quorums.asciidoc

Lines changed: 8 additions & 47 deletions
@@ -18,13 +18,13 @@ cluster. In many cases you can do this simply by starting or stopping the nodes
 as required. See <<modules-discovery-adding-removing-nodes>>.

 As nodes are added or removed Elasticsearch maintains an optimal level of fault
-tolerance by updating the cluster's _voting configuration_, which is the set of
-master-eligible nodes whose responses are counted when making decisions such as
-electing a new master or committing a new cluster state. A decision is made only
-after more than half of the nodes in the voting configuration have responded.
-Usually the voting configuration is the same as the set of all the
-master-eligible nodes that are currently in the cluster. However, there are some
-situations in which they may be different.
+tolerance by updating the cluster's <<modules-discovery-voting,voting
+configuration>>, which is the set of master-eligible nodes whose responses are
+counted when making decisions such as electing a new master or committing a new
+cluster state. A decision is made only after more than half of the nodes in the
+voting configuration have responded. Usually the voting configuration is the
+same as the set of all the master-eligible nodes that are currently in the
+cluster. However, there are some situations in which they may be different.

 To be sure that the cluster remains available you **must not stop half or more
 of the nodes in the voting configuration at the same time**. As long as more
@@ -38,46 +38,6 @@ cluster-state update that adjusts the voting configuration to match, and this
 can take a short time to complete. It is important to wait for this adjustment
 to complete before removing more nodes from the cluster.

-[float]
-==== Setting the initial quorum
-
-When a brand-new cluster starts up for the first time, it must elect its first
-master node. To do this election, it needs to know the set of master-eligible
-nodes whose votes should count. This initial voting configuration is known as
-the _bootstrap configuration_ and is set in the
-<<modules-discovery-bootstrap-cluster,cluster bootstrapping process>>.
-
-It is important that the bootstrap configuration identifies exactly which nodes
-should vote in the first election. It is not sufficient to configure each node
-with an expectation of how many nodes there should be in the cluster. It is also
-important to note that the bootstrap configuration must come from outside the
-cluster: there is no safe way for the cluster to determine the bootstrap
-configuration correctly on its own.
-
-If the bootstrap configuration is not set correctly, when you start a brand-new
-cluster there is a risk that you will accidentally form two separate clusters
-instead of one. This situation can lead to data loss: you might start using both
-clusters before you notice that anything has gone wrong and it is impossible to
-merge them together later.
-
-NOTE: To illustrate the problem with configuring each node to expect a certain
-cluster size, imagine starting up a three-node cluster in which each node knows
-that it is going to be part of a three-node cluster. A majority of three nodes
-is two, so normally the first two nodes to discover each other form a cluster
-and the third node joins them a short time later. However, imagine that four
-nodes were erroneously started instead of three. In this case, there are enough
-nodes to form two separate clusters. Of course if each node is started manually
-then it's unlikely that too many nodes are started. If you're using an automated
-orchestrator, however, it's certainly possible to get into this situation--
-particularly if the orchestrator is not resilient to failures such as network
-partitions.
-
-The initial quorum is only required the very first time a whole cluster starts
-up. New nodes joining an established cluster can safely obtain all the
-information they need from the elected master. Nodes that have previously been
-part of a cluster will have stored to disk all the information that is required
-when they restart.
-
 [float]
 ==== Master elections

@@ -103,3 +63,4 @@ and then started again then it will automatically recover, such as during a
 <<restart-upgrade,full cluster restart>>. There is no need to take any further
 action with the APIs described here in these cases, because the set of master
 nodes is not changing permanently.
+

docs/reference/modules/discovery/voting.asciidoc

Lines changed: 44 additions & 4 deletions
@@ -1,11 +1,11 @@
 [[modules-discovery-voting]]
 === Voting configurations

-Each {es} cluster has a _voting configuration_, which is the set of 
+Each {es} cluster has a _voting configuration_, which is the set of
 <<master-node,master-eligible nodes>> whose responses are counted when making
-decisions such as electing a new master or committing a new cluster
-state. Decisions are made only after a _quorum_ (more than half) of the nodes in
-the voting configuration respond.
+decisions such as electing a new master or committing a new cluster state.
+Decisions are made only after a majority (more than half) of the nodes in the
+voting configuration respond.

 Usually the voting configuration is the same as the set of all the
 master-eligible nodes that are currently in the cluster. However, there are some
@@ -98,3 +98,43 @@ nodes, however, the cluster is still only fully tolerant to the loss of one
 node, but quorum-based decisions require votes from two of the three voting
 nodes. In the event of an even split, one half will contain two of the three
 voting nodes so that half will remain available.
+
+[float]
+==== Setting the initial voting configuration
+
+When a brand-new cluster starts up for the first time, it must elect its first
+master node. To do this election, it needs to know the set of master-eligible
+nodes whose votes should count. This initial voting configuration is known as
+the _bootstrap configuration_ and is set in the
+<<modules-discovery-bootstrap-cluster,cluster bootstrapping process>>.
+
+It is important that the bootstrap configuration identifies exactly which nodes
+should vote in the first election. It is not sufficient to configure each node
+with an expectation of how many nodes there should be in the cluster. It is also
+important to note that the bootstrap configuration must come from outside the
+cluster: there is no safe way for the cluster to determine the bootstrap
+configuration correctly on its own.
+
+If the bootstrap configuration is not set correctly, when you start a brand-new
+cluster there is a risk that you will accidentally form two separate clusters
+instead of one. This situation can lead to data loss: you might start using both
+clusters before you notice that anything has gone wrong and it is impossible to
+merge them together later.
+
+NOTE: To illustrate the problem with configuring each node to expect a certain
+cluster size, imagine starting up a three-node cluster in which each node knows
+that it is going to be part of a three-node cluster. A majority of three nodes
+is two, so normally the first two nodes to discover each other form a cluster
+and the third node joins them a short time later. However, imagine that four
+nodes were erroneously started instead of three. In this case, there are enough
+nodes to form two separate clusters. Of course if each node is started manually
+then it's unlikely that too many nodes are started. If you're using an automated
+orchestrator, however, it's certainly possible to get into this situation--
+particularly if the orchestrator is not resilient to failures such as network
+partitions.
+
+The initial quorum is only required the very first time a whole cluster starts
+up. New nodes joining an established cluster can safely obtain all the
+information they need from the elected master. Nodes that have previously been
+part of a cluster will have stored to disk all the information that is required
+when they restart.
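
For context, the bootstrap configuration described in this moved section is supplied through the cluster bootstrapping process it links to; a minimal sketch, assuming the `cluster.initial_master_nodes` setting from those bootstrapping docs, with illustrative node names:

[source,yaml]
----
# elasticsearch.yml on each of the initial master-eligible nodes (sketch).
# Node names are illustrative; list exactly the nodes that should vote in the
# first election, and remove the setting once the cluster has formed.
cluster.initial_master_nodes:
  - master-a
  - master-b
  - master-c
----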
