[[cluster-reroute]]
== Cluster Reroute

The reroute command allows for manual changes to the allocation of individual
shards in the cluster. For example, a shard can be moved from one node to
another explicitly, an allocation can be cancelled, and an unassigned shard can
be explicitly allocated to a specific node.

Here is a short example of a simple reroute API call:

[source,js]
--------------------------------------------------
POST /_cluster/reroute
{
    "commands" : [
        {
            "move" : {
                "index" : "test", "shard" : 0,
                "from_node" : "node1", "to_node" : "node2"
            }
        },
        {
            "allocate_replica" : {
                "index" : "test", "shard" : 1,
                "node" : "node3"
            }
        }
    ]
}
--------------------------------------------------
// CONSOLE
// TEST[skip:doc tests run with only a single node]

It is important to note that after processing any reroute commands,
Elasticsearch will perform rebalancing as normal (respecting the values of
settings such as `cluster.routing.rebalance.enable`) in order to remain in a
balanced state. For example, if the requested allocation includes moving a
shard from `node1` to `node2` then this may cause a shard to be moved from
`node2` back to `node1` to even things out.

The cluster can be set to disable allocations using the
`cluster.routing.allocation.enable` setting. If allocations are disabled then
the only allocations that will be performed are explicit ones given using the
`reroute` command, and consequent allocations due to rebalancing.
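
For example, allocation could be disabled cluster-wide with a transient
settings update along the following lines (a sketch; the `"none"` value
disables all shard allocations):

[source,js]
--------------------------------------------------
PUT /_cluster/settings
{
    "transient" : {
        "cluster.routing.allocation.enable" : "none"
    }
}
--------------------------------------------------
// CONSOLE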

It is possible to run `reroute` commands in "dry run" mode by using the
`?dry_run` URI query parameter, or by passing `"dry_run": true` in the request
body. This will calculate the result of applying the commands to the current
cluster state, and return the resulting cluster state after the commands (and
re-balancing) have been applied, but will not actually perform the requested
changes.
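
For example, the `move` command from the example above could be tried out
without changing anything, along these lines:

[source,js]
--------------------------------------------------
POST /_cluster/reroute?dry_run
{
    "commands" : [
        {
            "move" : {
                "index" : "test", "shard" : 0,
                "from_node" : "node1", "to_node" : "node2"
            }
        }
    ]
}
--------------------------------------------------
// CONSOLE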

If the `?explain` URI query parameter is included then a detailed explanation
of why the commands could or could not be executed is included in the response.
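
The `explain` and `dry_run` parameters can be combined, which gives a safe way
to find out why a command would or would not be executed. For instance (a
sketch using the same placeholder names as above):

[source,js]
--------------------------------------------------
POST /_cluster/reroute?dry_run&explain
{
    "commands" : [
        {
            "allocate_replica" : {
                "index" : "test", "shard" : 0,
                "node" : "node3"
            }
        }
    ]
}
--------------------------------------------------
// CONSOLE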

The commands supported are:

`move`::
Move a started shard from one node to another node. Accepts
`index` and `shard` for index name and shard number, `from_node` for the
node to move the shard from, and `to_node` for the node to move the
shard to.

`cancel`::
Cancel allocation of a shard (or recovery). Accepts `index` and `shard` for
index name and shard number, and `node` for the node to cancel the shard
allocation on. This can be used to force resynchronization of existing
replicas from the primary shard by cancelling them and allowing them to be
reinitialized through the standard recovery process. By default only
replica shard allocations can be cancelled. If it is necessary to cancel
the allocation of a primary shard then the `allow_primary` flag must also
be included in the request.

`allocate_replica`::
Allocate an unassigned replica shard to a node. Accepts `index` and `shard`
for index name and shard number, and `node` to allocate the shard to. Takes
<<modules-cluster,allocation deciders>> into account.
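
As an illustration of the commands above, the following sketch (again using
placeholder index and node names) cancels the allocation of a replica shard
copy so that it is reinitialized through the normal recovery process:

[source,js]
--------------------------------------------------
POST /_cluster/reroute
{
    "commands" : [
        {
            "cancel" : {
                "index" : "test", "shard" : 0,
                "node" : "node2"
            }
        }
    ]
}
--------------------------------------------------
// CONSOLE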

[float]
=== Retrying failed allocations

The cluster will attempt to allocate a shard a maximum of
`index.allocation.max_retries` times in a row (defaults to `5`), before giving
up and leaving the shard unallocated. This scenario can be caused by
structural problems such as having an analyzer which refers to a stopwords
file which doesn't exist on all nodes.

Once the problem has been corrected, allocation can be manually retried by
calling the <<cluster-reroute,`reroute`>> API with the `?retry_failed` URI
query parameter, which will attempt a single retry round for these shards.
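
No commands need to be supplied for this, so a minimal call looks like:

[source,js]
--------------------------------------------------
POST /_cluster/reroute?retry_failed
--------------------------------------------------
// CONSOLE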

[float]
=== Forced allocation on unrecoverable errors

Two more commands are available that allow the allocation of a primary shard to
a node. These commands should however be used with extreme care, as primary
shard allocation is usually fully automatically handled by Elasticsearch.
Reasons why a primary shard cannot be automatically allocated include the
following:

- A new index was created but there is no node which satisfies the allocation
  deciders.
- An up-to-date shard copy of the data cannot be found on the current data
  nodes in the cluster. To prevent data loss, the system does not automatically
  promote a stale shard copy to primary.

The following two commands are dangerous and may result in data loss. They are
meant to be used in cases where the original data cannot be recovered and the
cluster administrator accepts the loss. If you have suffered a temporary issue
that can be fixed, please see the `retry_failed` flag described above. To
emphasise: if these commands are performed and then a node joins the cluster
that holds a copy of the affected shard then the copy on the newly-joined node
will be deleted or overwritten.

`allocate_stale_primary`::
Allocate a primary shard to a node that holds a stale copy. Accepts the
`index` and `shard` for index name and shard number, and `node` to allocate
the shard to. Using this command may lead to data loss for the provided
shard id. If a node which has the good copy of the data rejoins the cluster
later on, that data will be deleted or overwritten with the data of the
stale copy that was forcefully allocated with this command. To ensure that
these implications are well-understood, this command requires the flag
`accept_data_loss` to be explicitly set to `true`.

`allocate_empty_primary`::
Allocate an empty primary shard to a node. Accepts the `index` and `shard`
for index name and shard number, and `node` to allocate the shard to. Using
this command leads to a complete loss of all data that was indexed into
this shard, if it was previously started. If a node which has a copy of the
data rejoins the cluster later on, that data will be deleted. To ensure
that these implications are well-understood, this command requires the flag
`accept_data_loss` to be explicitly set to `true`.
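
For example (a sketch only, with placeholder index and node names), forcing an
empty primary into existence requires the `accept_data_loss` flag:

[source,js]
--------------------------------------------------
POST /_cluster/reroute
{
    "commands" : [
        {
            "allocate_empty_primary" : {
                "index" : "test", "shard" : 0,
                "node" : "node3", "accept_data_loss" : true
            }
        }
    ]
}
--------------------------------------------------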