[[modules-discovery-adding-removing-nodes]]
=== Adding and removing nodes

As nodes are added or removed, Elasticsearch maintains an optimal level of fault
tolerance by automatically updating the cluster's _voting configuration_, which
is the set of <<master-node,master-eligible nodes>> whose responses are counted
when making decisions such as electing a new master or committing a new cluster
state.

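The current voting configuration can be inspected via the cluster state. The
request below is a sketch which assumes the committed voting configuration is
exposed in the cluster coordination metadata as `last_committed_config`:

[source,js]
--------------------------------------------------
# Retrieve the node IDs that make up the committed voting configuration
GET /_cluster/state?filter_path=metadata.cluster_coordination.last_committed_config
--------------------------------------------------
// CONSOLE
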
It is recommended to have a small and fixed number of master-eligible nodes in a
cluster, and to scale the cluster up and down by adding and removing
master-ineligible nodes only. However, there are situations in which it may be
desirable to add or remove some master-eligible nodes to or from a cluster.

==== Adding master-eligible nodes

If you wish to add some master-eligible nodes to your cluster, simply configure
the new nodes to find the existing cluster and start them up. Elasticsearch will
add the new nodes to the voting configuration if it is appropriate to do so.

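After a new master-eligible node has started, you can verify that it has joined
the cluster and, if appropriate, been added to the voting configuration. The
requests below are a sketch using the standard `_cat/nodes` and cluster state
APIs:

[source,js]
--------------------------------------------------
# List the nodes in the cluster, including which node is the elected master
GET /_cat/nodes?v

# Check whether the new node's ID appears in the committed voting configuration
GET /_cluster/state?filter_path=metadata.cluster_coordination.last_committed_config
--------------------------------------------------
// CONSOLE
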
==== Removing master-eligible nodes

When removing master-eligible nodes, it is important not to remove too many all
at the same time. For instance, if there are currently seven master-eligible
nodes and you wish to reduce this to three, it is not possible simply to stop
four of the nodes at once: to do so would leave only three nodes remaining,
which is fewer than half of the seven-node voting configuration. A quorum
requires more than half of the voting configuration, so the cluster would be
unable to take any further actions.

As long as there are at least three master-eligible nodes in the cluster, as a
general rule it is best to remove nodes one at a time, allowing enough time for
the cluster to <<modules-discovery-quorums,automatically adjust>> the voting
configuration and adapt the fault tolerance level to the new set of nodes.

If there are only two master-eligible nodes remaining then neither node can be
safely removed since both are required to reliably make progress. You must first
inform Elasticsearch that one of the nodes should not be part of the voting
configuration, and that the voting power should instead be given to the other
node. You can then take the excluded node offline without preventing the other
node from making progress. A node which is added to a voting configuration
exclusion list still works normally, but Elasticsearch tries to remove it from
the voting configuration so its vote is no longer required. Importantly,
Elasticsearch will never automatically move a node on the voting exclusions list
back into the voting configuration. Once an excluded node has been successfully
auto-reconfigured out of the voting configuration, it is safe to shut it down
without affecting the cluster's master-level availability. A node can be added
to the voting configuration exclusion list using the following API:

[source,js]
--------------------------------------------------
# Add node to voting configuration exclusions list and wait for the system to
# auto-reconfigure the node out of the voting configuration up to the default
# timeout of 30 seconds
POST /_cluster/voting_config_exclusions/node_name

# Add node to voting configuration exclusions list and wait for
# auto-reconfiguration up to one minute
POST /_cluster/voting_config_exclusions/node_name?timeout=1m
--------------------------------------------------
// CONSOLE
// TEST[skip:this would break the test cluster if executed]

The node that should be added to the exclusions list is specified using
<<cluster-nodes,node filters>> in place of `node_name` here. If a call to the
voting configuration exclusions API fails, you can safely retry it. Only a
successful response guarantees that the node has actually been removed from the
voting configuration and will not be reinstated.

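Because node filters are accepted in place of a literal node name, a single
wildcard pattern can exclude several nodes at once. The naming scheme below is
purely illustrative:

[source,js]
--------------------------------------------------
# Exclude every master-eligible node whose name matches the pattern
POST /_cluster/voting_config_exclusions/old-master-*
--------------------------------------------------
// CONSOLE
// TEST[skip:illustrative node names]
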
Although the voting configuration exclusions API is most useful for down-scaling
a two-node to a one-node cluster, it is also possible to use it to remove
multiple master-eligible nodes all at the same time. Adding multiple nodes to
the exclusions list causes the system to try to auto-reconfigure all of these
nodes out of the voting configuration, allowing them to be safely shut down
while keeping the cluster available. In the example described above, shrinking a
seven-master-node cluster down to only three master-eligible nodes, you could
add four nodes to the exclusions list, wait for confirmation, and then shut them
down simultaneously.

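As a sketch of that seven-to-three example, assuming the four nodes to be
removed are named `master-4` to `master-7` (illustrative names), all four can be
excluded with a single call:

[source,js]
--------------------------------------------------
# Exclude the four nodes that are to be removed; the node names are
# illustrative and should be replaced with your own
POST /_cluster/voting_config_exclusions/master-4,master-5,master-6,master-7
--------------------------------------------------
// CONSOLE
// TEST[skip:illustrative node names]

Once this call returns successfully, the four excluded nodes are no longer part
of the voting configuration and can be shut down together.
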
NOTE: Voting exclusions are only required when removing at least half of the
master-eligible nodes from a cluster in a short time period. They are not
required when removing master-ineligible nodes, nor are they required when
removing fewer than half of the master-eligible nodes.

Adding an exclusion for a node creates an entry for that node in the voting
configuration exclusions list, which causes the system to automatically try to
reconfigure the voting configuration to remove that node and prevents it from
returning to the voting configuration once it has been removed. The current list
of exclusions is stored in the cluster state and can be inspected as follows:

[source,js]
--------------------------------------------------
GET /_cluster/state?filter_path=metadata.cluster_coordination.voting_config_exclusions
--------------------------------------------------
// CONSOLE

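If any exclusions exist, the response contains one entry per excluded node. The
snippet below is a sketch of the expected shape, assuming each entry carries
`node_id` and `node_name` fields; the ID and name shown are placeholders:

[source,js]
--------------------------------------------------
{
  "metadata": {
    "cluster_coordination": {
      "voting_config_exclusions": [
        {
          "node_id": "example-node-id",
          "node_name": "node_name"
        }
      ]
    }
  }
}
--------------------------------------------------
// NOTCONSOLE
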
This list is limited in size by the following setting:

`cluster.max_voting_config_exclusions`::

    Sets a limit on the number of voting configuration exclusions at any one
    time. Defaults to `10`.

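This is a cluster-wide setting. Assuming it can be updated dynamically through
the cluster settings API, the limit could be raised as follows; the value `20`
is just an example:

[source,js]
--------------------------------------------------
PUT /_cluster/settings
{
  "persistent": {
    "cluster.max_voting_config_exclusions": 20
  }
}
--------------------------------------------------
// CONSOLE
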
Since voting configuration exclusions are persistent and limited in number, they
must be cleaned up. Normally an exclusion is added when performing some
maintenance on the cluster, and the exclusions should be cleaned up when the
maintenance is complete. Clusters should have no voting configuration exclusions
in normal operation.

If a node is excluded from the voting configuration because it is to be shut
down permanently, its exclusion can be removed after it is shut down and removed
from the cluster. Exclusions can also be cleared if they were created in error
or were only required temporarily:

[source,js]
--------------------------------------------------
# Wait for all the nodes with voting configuration exclusions to be removed from
# the cluster and then remove all the exclusions, allowing any node to return to
# the voting configuration in the future.
DELETE /_cluster/voting_config_exclusions

# Immediately remove all the voting configuration exclusions, allowing any node
# to return to the voting configuration in the future.
DELETE /_cluster/voting_config_exclusions?wait_for_removal=false
--------------------------------------------------
// CONSOLE