1
1
[[modules-discovery-zen]]
2
2
=== Zen Discovery
3
3
4
- The zen discovery is the built in discovery module for Elasticsearch and
5
- the default. It provides unicast discovery, but can be extended to
6
- support cloud environments and other forms of discovery.
4
+ Zen discovery is the built-in, default, discovery module for Elasticsearch. It
5
+ provides unicast and file-based discovery, and can be extended to support cloud
6
+ environments and other forms of discovery via plugins.
7
7
8
- The zen discovery is integrated with other modules, for example, all
9
- communication between nodes is done using the
10
- <<modules-transport,transport>> module.
8
+ Zen discovery is integrated with other modules; for example, all communication
9
+ between nodes is done using the <<modules-transport,transport>> module.
11
10
12
11
It is separated into several sub modules, which are explained below:
13
12
14
13
[float]
15
14
[[ping]]
16
15
==== Ping
17
16
18
- This is the process where a node uses the discovery mechanisms to find
19
- other nodes.
17
+ This is the process where a node uses the discovery mechanisms to find other
18
+ nodes.
19
+
20
+ [float]
21
+ [[discovery-seed-nodes]]
22
+ ==== Seed nodes
23
+
24
+ Zen discovery uses a list of _seed_ nodes in order to start off the discovery
25
+ process. At startup, or when electing a new master, Elasticsearch tries to
26
+ connect to each seed node in its list, and holds a gossip-like conversation with
27
+ them to find other nodes and to build a complete picture of the cluster. By
28
+ default there are two methods for configuring the list of seed nodes: _unicast_
29
+ and _file-based_. It is recommended that the list of seed nodes comprises the
30
+ list of master-eligible nodes in the cluster.
20
31
21
32
[float]
22
33
[[unicast]]
23
34
===== Unicast
24
35
25
- Unicast discovery requires a list of hosts to use that will act as gossip
26
- routers. These hosts can be specified as hostnames or IP addresses; hosts
27
- specified as hostnames are resolved to IP addresses during each round of
28
- pinging. Note that if you are in an environment where DNS resolutions vary with
29
- time, you might need to adjust your <<networkaddress-cache-ttl,JVM security
30
- settings>>.
36
+ Unicast discovery configures a static list of hosts for use as seed nodes.
37
+ These hosts can be specified as hostnames or IP addresses; hosts specified as
38
+ hostnames are resolved to IP addresses during each round of pinging. Note that
39
+ if you are in an environment where DNS resolutions vary with time, you might
40
+ need to adjust your <<networkaddress-cache-ttl,JVM security settings>>.
31
41
32
- It is recommended that the unicast hosts list be maintained as the list of
33
- master-eligible nodes in the cluster.
42
+ The list of hosts is set using the `discovery.zen.ping.unicast.hosts` static
43
+ setting. This is either an array of hosts or a comma-delimited string. Each
44
+ value should be in the form of `host:port` or `host` (where `port` defaults to
45
+ the setting `transport.profiles.default.port` falling back to
46
+ `transport.tcp.port` if not set). Note that IPv6 hosts must be bracketed. The
47
+ default for this setting is `127.0.0.1, [::1]`.
34
48
35
- Unicast discovery provides the following settings with the `discovery.zen.ping.unicast` prefix:
49
+ Additionally, the `discovery.zen.ping.unicast.resolve_timeout` configures the
50
+ amount of time to wait for DNS lookups on each round of pinging. This is
51
+ specified as a <<time-units, time unit>> and defaults to 5s.
36
52
37
- [cols="<,<",options="header",]
38
- |=======================================================================
39
- |Setting |Description
40
- |`hosts` |Either an array setting or a comma delimited setting. Each
41
- value should be in the form of `host:port` or `host` (where `port` defaults to the setting `transport.profiles.default.port`
42
- falling back to `transport.tcp.port` if not set). Note that IPv6 hosts must be bracketed. Defaults to `127.0.0.1, [::1]`
43
- |`hosts.resolve_timeout` |The amount of time to wait for DNS lookups on each round of pinging. Specified as
44
- <<time-units, time units>>. Defaults to 5s.
45
- |=======================================================================
53
+ Unicast discovery uses the <<modules-transport,transport>> module to perform the
54
+ discovery.
46
55
47
- The unicast discovery uses the <<modules-transport,transport>> module to perform the discovery.
56
+ [float]
57
+ [[file-based-hosts-provider]]
58
+ ===== File-based
59
+
60
+ In addition to hosts provided by the static `discovery.zen.ping.unicast.hosts`
61
+ setting, it is possible to provide a list of hosts via an external file.
62
+ Elasticsearch reloads this file when it changes, so that the list of seed nodes
63
+ can change dynamically without needing to restart each node. For example, this
64
+ gives a convenient mechanism for an Elasticsearch instance that is run in a
65
+ Docker container to be dynamically supplied with a list of IP addresses to
66
+ connect to for Zen discovery when those IP addresses may not be known at node
67
+ startup.
68
+
69
+ To enable file-based discovery, configure the `file` hosts provider as follows:
70
+
71
+ ```
72
+ discovery.zen.hosts_provider: file
73
+ ```
74
+
75
+ Then create a file at `$ES_PATH_CONF/unicast_hosts.txt` in
76
+ <<discovery-file-format,the format described below>>. Any time a change is made
77
+ to the `unicast_hosts.txt` file the new changes will be picked up by
78
+ Elasticsearch and the new hosts list will be used.
79
+
80
+ Note that the file-based discovery plugin augments the unicast hosts list in
81
+ `elasticsearch.yml`: if there are valid unicast host entries in
82
+ `discovery.zen.ping.unicast.hosts` then they will be used in addition to those
83
+ supplied in `unicast_hosts.txt`.
84
+
85
+ The `discovery.zen.ping.unicast.resolve_timeout` setting also applies to DNS
86
+ lookups for nodes specified by address via file-based discovery. This is
87
+ specified as a <<time-units, time unit>> and defaults to 5s.
88
+
89
+ [[discovery-file-format]]
90
+ [float]
91
+ ====== unicast_hosts.txt file format
92
+
93
+ The format of the file is to specify one node entry per line. Each node entry
94
+ consists of the host (host name or IP address) and an optional transport port
95
+ number. If the port number is specified, it must come immediately after the
96
+ host (on the same line) separated by a `:`. If the port number is not
97
+ specified, a default value of 9300 is used.
98
+
99
+ The following is an example of `unicast_hosts.txt` for a cluster with four
100
+ nodes that participate in unicast discovery, some of which are not running on
101
+ the default port:
102
+
103
+ [source,txt]
104
+ ----------------------------------------------------------------
105
+ 10.10.10.5
106
+ 10.10.10.6:9305
107
+ 10.10.10.5:10005
108
+ # an IPv6 address
109
+ [2001:0db8:85a3:0000:0000:8a2e:0370:7334]:9301
110
+ ----------------------------------------------------------------
111
+
112
+ Host names are allowed instead of IP addresses (similar to
113
+ `discovery.zen.ping.unicast.hosts`), and IPv6 addresses must be specified in
114
+ brackets with the port coming after the brackets.
115
+
116
+ It is also possible to add comments to this file. All comments must appear on
117
+ their own lines starting with `#` (i.e. comments cannot start in the middle of a
118
+ line).
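
The file format described above can be illustrated with a small parser. This is a minimal sketch and not Elasticsearch's own implementation: the function name `parse_unicast_hosts` and the error handling are assumptions made for illustration. Only the rules stated in the text (one entry per line, optional `:port`, default port 9300, bracketed IPv6, whole-line `#` comments) are taken from the documentation.

```python
DEFAULT_PORT = 9300  # default transport port stated in the text above


def parse_unicast_hosts(text, default_port=DEFAULT_PORT):
    """Parse unicast_hosts.txt content into a list of (host, port) tuples.

    Hypothetical helper for illustration only.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and whole-line comments.
        if not line or line.startswith("#"):
            continue
        if line.startswith("["):
            # Bracketed IPv6 address; an optional port follows the brackets.
            host, closing, rest = line[1:].partition("]")
            if not closing:
                raise ValueError(f"missing closing bracket: {line!r}")
            if rest == "":
                port = default_port
            elif rest.startswith(":"):
                port = int(rest[1:])
            else:
                raise ValueError(f"unexpected text after bracket: {line!r}")
        else:
            host, sep, port_text = line.partition(":")
            if ":" in port_text:
                raise ValueError(f"IPv6 addresses must be bracketed: {line!r}")
            port = int(port_text) if sep else default_port
        entries.append((host, port))
    return entries


SAMPLE = """\
10.10.10.5
10.10.10.6:9305
10.10.10.5:10005
# an IPv6 address
[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:9301
"""

hosts = parse_unicast_hosts(SAMPLE)
```

Running the sketch on the sample file from the text yields four entries, with the first one falling back to the default port.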
48
119
49
120
[float]
50
121
[[master-election]]
51
122
==== Master Election
52
123
53
- As part of the ping process a master of the cluster is either
54
- elected or joined to. This is done automatically. The
55
- `discovery.zen.ping_timeout` (which defaults to `3s`) determines how long the node
56
- will wait before deciding on starting an election or joining an existing cluster.
57
- Three pings will be sent over this timeout interval. In case where no decision can be
58
- reached after the timeout, the pinging process restarts.
59
- In slow or congested networks, three seconds might not be enough for a node to become
60
- aware of the other nodes in its environment before making an election decision.
61
- Increasing the timeout should be done with care in that case, as it will slow down the
62
- election process.
63
- Once a node decides to join an existing formed cluster, it
64
- will send a join request to the master (`discovery.zen.join_timeout`)
65
- with a timeout defaulting at 20 times the ping timeout.
66
-
67
- When the master node stops or has encountered a problem, the cluster nodes
68
- start pinging again and will elect a new master. This pinging round also
69
- serves as a protection against (partial) network failures where a node may unjustly
70
- think that the master has failed. In this case the node will simply hear from
71
- other nodes about the currently active master.
72
-
73
- If `discovery.zen.master_election.ignore_non_master_pings` is `true`, pings from nodes that are not master
74
- eligible (nodes where `node.master` is `false`) are ignored during master election; the default value is
124
+ As part of the ping process a master of the cluster is either elected or joined
125
+ to. This is done automatically. The `discovery.zen.ping_timeout` (which defaults
126
+ to `3s`) determines how long the node will wait before deciding on starting an
127
+ election or joining an existing cluster. Three pings will be sent over this
128
+ timeout interval. If no decision can be reached after the timeout,
129
+ the pinging process restarts. In slow or congested networks, three seconds
130
+ might not be enough for a node to become aware of the other nodes in its
131
+ environment before making an election decision. Increasing the timeout should
132
+ be done with care in that case, as it will slow down the election process. Once
133
+ a node decides to join an existing formed cluster, it will send a join request
134
+ to the master (`discovery.zen.join_timeout`) with a timeout defaulting to 20
135
+ times the ping timeout.
136
+
137
+ When the master node stops or has encountered a problem, the cluster nodes start
138
+ pinging again and will elect a new master. This pinging round also serves as a
139
+ protection against (partial) network failures where a node may unjustly think
140
+ that the master has failed. In this case the node will simply hear from other
141
+ nodes about the currently active master.
142
+
143
+ If `discovery.zen.master_election.ignore_non_master_pings` is `true`, pings from
144
+ nodes that are not master eligible (nodes where `node.master` is `false`) are
145
+ ignored during master election; the default value is `false`.
146
+
147
+ Nodes can be excluded from becoming a master by setting `node.master` to
75
148
`false`.
76
149
77
- Nodes can be excluded from becoming a master by setting `node.master` to `false`.
78
-
79
- The `discovery.zen.minimum_master_nodes` sets the minimum
80
- number of master eligible nodes that need to join a newly elected master in order for an election to
81
- complete and for the elected node to accept its mastership. The same setting controls the minimum number of
82
- active master eligible nodes that should be a part of any active cluster. If this requirement is not met the
83
- active master node will step down and a new master election will begin.
150
+ The `discovery.zen.minimum_master_nodes` sets the minimum number of master
151
+ eligible nodes that need to join a newly elected master in order for an election
152
+ to complete and for the elected node to accept its mastership. The same setting
153
+ controls the minimum number of active master eligible nodes that should be a
154
+ part of any active cluster. If this requirement is not met the active master
155
+ node will step down and a new master election will begin.
84
156
85
157
This setting must be set to a <<minimum_master_nodes,quorum>> of your master
86
158
eligible nodes. It is recommended to avoid having only two master eligible
87
- nodes, since a quorum of two is two. Therefore, a loss of either master
88
- eligible node will result in an inoperable cluster.
159
+ nodes, since a quorum of two is two. Therefore, a loss of either master eligible
160
+ node will result in an inoperable cluster.
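
The quorum arithmetic behind this recommendation can be sketched as follows. This assumes the usual majority definition of a quorum (half the master-eligible nodes, rounded down, plus one); the helper names are hypothetical and exist only for this illustration.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Majority of the master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def tolerated_losses(master_eligible_nodes: int) -> int:
    """How many master-eligible nodes can be lost while a quorum remains."""
    return master_eligible_nodes - quorum(master_eligible_nodes)


# Quorum and tolerated losses for clusters with 1 to 5 master-eligible nodes.
table = {n: (quorum(n), tolerated_losses(n)) for n in range(1, 6)}
```

With two master-eligible nodes the quorum is two and no loss is tolerated, whereas three master-eligible nodes also need a quorum of two but survive the loss of one.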
89
161
90
162
[float]
91
163
[[fault-detection]]
92
164
==== Fault Detection
93
165
94
- There are two fault detection processes running. The first is by the
95
- master, to ping all the other nodes in the cluster and verify that they
96
- are alive. And on the other end, each node pings to master to verify if
97
- its still alive or an election process needs to be initiated.
166
+ There are two fault detection processes running. The first is by the master, to
167
+ ping all the other nodes in the cluster and verify that they are alive. And on
168
+ the other end, each node pings the master to verify if it's still alive or an
169
+ election process needs to be initiated.
98
170
99
171
The following settings control the fault detection process using the
100
172
`discovery.zen.fd` prefix:
@@ -116,19 +188,21 @@ considered failed. Defaults to `3`.
116
188
117
189
The master node is the only node in a cluster that can make changes to the
118
190
cluster state. The master node processes one cluster state update at a time,
119
- applies the required changes and publishes the updated cluster state to all
120
- the other nodes in the cluster. Each node receives the publish message, acknowledges
121
- it, but does *not* yet apply it. If the master does not receive acknowledgement from
122
- at least `discovery.zen.minimum_master_nodes` nodes within a certain time (controlled by
123
- the `discovery.zen.commit_timeout` setting and defaults to 30 seconds) the cluster state
124
- change is rejected.
125
-
126
- Once enough nodes have responded, the cluster state is committed and a message will
127
- be sent to all the nodes. The nodes then proceed to apply the new cluster state to their
128
- internal state. The master node waits for all nodes to respond, up to a timeout, before
129
- going ahead processing the next updates in the queue. The `discovery.zen.publish_timeout` is
130
- set by default to 30 seconds and is measured from the moment the publishing started. Both
131
- timeout settings can be changed dynamically through the <<cluster-update-settings,cluster update settings api>>
191
+ applies the required changes and publishes the updated cluster state to all the
192
+ other nodes in the cluster. Each node receives the publish message, acknowledges
193
+ it, but does *not* yet apply it. If the master does not receive acknowledgement
194
+ from at least `discovery.zen.minimum_master_nodes` nodes within a certain time
195
+ (controlled by the `discovery.zen.commit_timeout` setting, which defaults to 30
196
+ seconds) the cluster state change is rejected.
197
+
198
+ Once enough nodes have responded, the cluster state is committed and a message
199
+ will be sent to all the nodes. The nodes then proceed to apply the new cluster
200
+ state to their internal state. The master node waits for all nodes to respond,
201
+ up to a timeout, before going ahead processing the next updates in the queue.
202
+ The `discovery.zen.publish_timeout` is set by default to 30 seconds and is
203
+ measured from the moment the publishing started. Both timeout settings can be
204
+ changed dynamically through the <<cluster-update-settings,cluster update
205
+ settings api>>.
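
The commit decision in the first phase can be modelled with a few lines of code. This is a simplified sketch under stated assumptions, not the actual publishing logic: the function `publish_outcome` and its inputs are hypothetical, and whether the master's own acknowledgement counts toward the threshold is left to the caller.

```python
def publish_outcome(ack_delays, minimum_master_nodes, commit_timeout=30.0):
    """Decide whether a published cluster state is committed or rejected.

    ack_delays: seconds after publishing at which each node acknowledged,
    or None for a node that never acknowledged (hypothetical input format).
    """
    acks_in_time = sum(
        1 for delay in ack_delays if delay is not None and delay <= commit_timeout
    )
    # The state is committed only if enough acknowledgements arrived in time.
    return "committed" if acks_in_time >= minimum_master_nodes else "rejected"


fast_cluster = publish_outcome([0.1, 0.2, None], minimum_master_nodes=2)
slow_cluster = publish_outcome([0.1, 45.0, None], minimum_master_nodes=2)
```

In the first call two acknowledgements arrive within the 30 second commit timeout, so the change is committed; in the second call only one does, so the change is rejected.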
132
206
133
207
[float]
134
208
[[no-master-block]]
@@ -143,10 +217,14 @@ rejected when there is no active master.
143
217
The `discovery.zen.no_master_block` setting has two valid options:
144
218
145
219
[horizontal]
146
- `all`:: All operations on the node--i.e. both read & writes--will be rejected. This also applies for api cluster state
147
- read or write operations, like the get index settings, put mapping and cluster state api.
148
- `write`:: (default) Write operations will be rejected. Read operations will succeed, based on the last known cluster configuration.
149
- This may result in partial reads of stale data as this node may be isolated from the rest of the cluster.
150
-
151
- The `discovery.zen.no_master_block` setting doesn't apply to nodes-based apis (for example cluster stats, node info and
152
- node stats apis). Requests to these apis will not be blocked and can run on any available node.
220
+ `all`:: All operations on the node--i.e. both reads and writes--will be rejected.
221
+ This also applies for api cluster state read or write operations, like the get
222
+ index settings, put mapping and cluster state api.
223
+ `write`:: (default) Write operations will be rejected. Read operations will
224
+ succeed, based on the last known cluster configuration. This may result in
225
+ partial reads of stale data as this node may be isolated from the rest of the
226
+ cluster.
227
+
228
+ The `discovery.zen.no_master_block` setting doesn't apply to nodes-based apis
229
+ (for example cluster stats, node info and node stats apis). Requests to these
230
+ apis will not be blocked and can run on any available node.
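
The effect of the two options can be summarised as a small decision function. This is an illustrative sketch only: the function `is_rejected` and the operation labels `"read"`, `"write"` and `"nodes_api"` are assumptions for this example and do not correspond to any Elasticsearch API.

```python
def is_rejected(operation: str, no_master_block: str = "write") -> bool:
    """Return True if an operation is rejected while there is no active master.

    operation is one of "read", "write" or "nodes_api" (hypothetical labels).
    """
    if no_master_block not in ("all", "write"):
        raise ValueError(f"invalid no_master_block value: {no_master_block!r}")
    if operation == "nodes_api":
        # Nodes-based APIs (cluster stats, node info, node stats) are never blocked.
        return False
    if operation == "write":
        # Writes are rejected under both options.
        return True
    if operation == "read":
        # Reads are rejected only when the block is set to "all".
        return no_master_block == "all"
    raise ValueError(f"unknown operation kind: {operation!r}")
```

Under the default `write` option a read is still served from the last known cluster configuration, which is why it may return stale data.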