@@ -18,13 +18,13 @@ cluster. In many cases you can do this simply by starting or stopping the nodes
as required. See <<modules-discovery-adding-removing-nodes>>.
As nodes are added or removed Elasticsearch maintains an optimal level of fault
- tolerance by updating the cluster's _voting configuration_, which is the set of
- master-eligible nodes whose responses are counted when making decisions such as
- electing a new master or committing a new cluster state. A decision is made only
- after more than half of the nodes in the voting configuration have responded.
- Usually the voting configuration is the same as the set of all the
- master-eligible nodes that are currently in the cluster. However, there are some
- situations in which they may be different.
+ tolerance by updating the cluster's <<modules-discovery-voting,voting
+ configuration>>, which is the set of master-eligible nodes whose responses are
+ counted when making decisions such as electing a new master or committing a new
+ cluster state. A decision is made only after more than half of the nodes in the
+ voting configuration have responded. Usually the voting configuration is the
+ same as the set of all the master-eligible nodes that are currently in the
+ cluster. However, there are some situations in which they may be different.
To be sure that the cluster remains available you **must not stop half or more
of the nodes in the voting configuration at the same time**. As long as more
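
To make the quorum arithmetic above concrete: with three nodes in the voting
configuration, a decision needs responses from at least two of them, so one
voting node can safely be stopped; with five voting nodes, two can be stopped.
As an illustrative sketch, assuming the `metadata.cluster_coordination` section
that recent versions include in the cluster state, you can inspect the current
voting configuration like this:

[source,console]
----
GET /_cluster/state?filter_path=metadata.cluster_coordination.last_committed_config
----

The response lists the node IDs whose votes currently count, which is the set
you must keep a majority of online.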
@@ -38,46 +38,6 @@ cluster-state update that adjusts the voting configuration to match, and this
can take a short time to complete. It is important to wait for this adjustment
to complete before removing more nodes from the cluster.
- [float]
- ==== Setting the initial quorum
-
- When a brand-new cluster starts up for the first time, it must elect its first
- master node. To do this election, it needs to know the set of master-eligible
- nodes whose votes should count. This initial voting configuration is known as
- the _bootstrap configuration_ and is set in the
- <<modules-discovery-bootstrap-cluster,cluster bootstrapping process>>.
-
- It is important that the bootstrap configuration identifies exactly which nodes
- should vote in the first election. It is not sufficient to configure each node
- with an expectation of how many nodes there should be in the cluster. It is also
- important to note that the bootstrap configuration must come from outside the
- cluster: there is no safe way for the cluster to determine the bootstrap
- configuration correctly on its own.
-
- If the bootstrap configuration is not set correctly, when you start a brand-new
- cluster there is a risk that you will accidentally form two separate clusters
- instead of one. This situation can lead to data loss: you might start using both
- clusters before you notice that anything has gone wrong and it is impossible to
- merge them together later.
-
- NOTE: To illustrate the problem with configuring each node to expect a certain
- cluster size, imagine starting up a three-node cluster in which each node knows
- that it is going to be part of a three-node cluster. A majority of three nodes
- is two, so normally the first two nodes to discover each other form a cluster
- and the third node joins them a short time later. However, imagine that four
- nodes were erroneously started instead of three. In this case, there are enough
- nodes to form two separate clusters. Of course if each node is started manually
- then it's unlikely that too many nodes are started. If you're using an automated
- orchestrator, however, it's certainly possible to get into this situation--
- particularly if the orchestrator is not resilient to failures such as network
- partitions.
-
- The initial quorum is only required the very first time a whole cluster starts
- up. New nodes joining an established cluster can safely obtain all the
- information they need from the elected master. Nodes that have previously been
- part of a cluster will have stored to disk all the information that is required
- when they restart.
-
[float]
==== Master elections
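The "Setting the initial quorum" text removed above concerns the bootstrap
configuration, which must be supplied from outside the cluster. As a minimal
sketch, assuming hypothetical node names `master-a`, `master-b`, and `master-c`
that match each node's `node.name`, the bootstrap configuration is typically
provided via `cluster.initial_master_nodes` in `elasticsearch.yml` on each
master-eligible node:

[source,yaml]
----
# Hypothetical example: list exactly the master-eligible nodes that should
# vote in the first election; remove this setting once the cluster has formed.
cluster.initial_master_nodes:
  - master-a
  - master-b
  - master-c
----

This setting matters only for the very first election; nodes that have already
been part of a cluster recover the information they need from disk.
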
@@ -103,3 +63,4 @@ and then started again then it will automatically recover, such as during a
<<restart-upgrade,full cluster restart>>. There is no need to take any further
action with the APIs described here in these cases, because the set of master
nodes is not changing permanently.
+