[[cat-health]]
=== cat health

- `health` is a terse, one-line representation of the same information
- from `/_cluster/health`.
+ Returns the health status of a cluster, similar to the <<cluster-health,cluster
+ health>> API.
+
+
+ [[cat-health-api-request]]
+ ==== {api-request-title}
+
+ `GET /_cat/health`
+
+
+ [[cat-health-api-desc]]
+ ==== {api-description-title}
+
+ You can use the cat health API to get the health status of a cluster.
+
+ [[timestamp]]
+ This API is often used to check malfunctioning clusters. To help you
+ track cluster health alongside log files and alerting systems, the API returns
+ timestamps in two formats:
+
+ * `HH:MM:SS`, which is human-readable but includes no date information.
+ * https://en.wikipedia.org/wiki/Unix_time[Unix `epoch` time], which is
+ machine-sortable and includes date information. This is useful for cluster
+ recoveries that take multiple days.
+
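To illustrate why the `epoch` column matters for multi-day recoveries, the following minimal Python sketch converts an epoch value into a full UTC date and time. The helper name `epoch_to_utc` is hypothetical and not part of any Elasticsearch client; the sample value is the epoch used in this page's test response.

```python
from datetime import datetime, timezone

def epoch_to_utc(epoch_seconds):
    """Convert a Unix epoch value (string or int) to a UTC date-time string."""
    moment = datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")

# Sample epoch taken from the example response on this page.
print(epoch_to_utc("1475871424"))  # 2016-10-07 20:17:04

# Epoch values sort correctly across midnight, unlike bare HH:MM:SS strings.
# These two samples are hypothetical and one day apart.
samples = [1475957824, 1475871424]
print(sorted(samples))  # [1475871424, 1475957824]
```

Because the result carries the date, two samples with the same clock time on different days stay distinguishable.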
+ You can use the cat health API to verify cluster health across multiple nodes.
+ See <<cat-health-api-example-across-nodes>>.
+
+ You also can use the API to track the recovery of a large cluster
+ over a longer period of time. See <<cat-health-api-example-large-cluster>>.
+
+
+ [[cat-health-api-query-params]]
+ ==== {api-query-parms-title}
+
+ include::{docdir}/rest-api/common-parms.asciidoc[tag=http-format]
+
+ include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-h]
+
+ include::{docdir}/rest-api/common-parms.asciidoc[tag=help]
+
+ include::{docdir}/rest-api/common-parms.asciidoc[tag=local]
+
+ include::{docdir}/rest-api/common-parms.asciidoc[tag=master-timeout]
+
+ include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-s]
+
+ `ts` (timestamps)::
+ (Optional, boolean) If `true`, returns `HH:MM:SS` and
+ https://en.wikipedia.org/wiki/Unix_time[Unix `epoch`] timestamps. Defaults to
+ `true`.
+
+ include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-v]
+
+
+ [[cat-health-api-example]]
+ ==== {api-examples-title}
+
+ [[cat-health-api-example-timestamp]]
+ ===== Example with a timestamp
+ By default, the cat health API returns `HH:MM:SS` and
+ https://en.wikipedia.org/wiki/Unix_time[Unix `epoch`] timestamps. For example:

[source,js]
--------------------------------------------------
@@ -11,6 +71,8 @@ GET /_cat/health?v
// CONSOLE
// TEST[s/^/PUT twitter\n{"settings":{"number_of_replicas": 0}}\n/]

+ The API returns the following response:
+
[source,txt]
--------------------------------------------------
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
@@ -19,7 +81,9 @@ epoch timestamp cluster status node.total node.data shards pri relo i
// TESTRESPONSE[s/1475871424 16:17:04/\\d+ \\d+:\\d+:\\d+/]
// TESTRESPONSE[s/elasticsearch/[^ ]+/ s/0 -/\\d+ (-|\\d+(\\.\\d+)?[ms]+)/ non_json]

- It has one option `ts` to disable the timestamping:
+ [[cat-health-api-example-no-timestamp]]
+ ===== Example without a timestamp
+ You can use the `ts` (timestamps) parameter to disable timestamps. For example:

[source,js]
--------------------------------------------------
@@ -28,7 +92,7 @@ GET /_cat/health?v&ts=false
// CONSOLE
// TEST[s/^/PUT twitter\n{"settings":{"number_of_replicas": 0}}\n/]

- which looks like :
+ The API returns the following response:

[source,txt]
--------------------------------------------------
@@ -37,8 +101,10 @@ elasticsearch green 1 1 1 1 0 0 0
--------------------------------------------------
// TESTRESPONSE[s/elasticsearch/[^ ]+/ s/0 -/\\d+ (-|\\d+(\\.\\d+)?[ms]+)/ non_json]
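Because the `v` flag adds a header row, a client can pair column names with values instead of relying on fixed positions, so the same parsing works whether or not `ts=false` drops the `epoch` and `timestamp` columns. The following Python sketch illustrates the idea. The function name `parse_cat_health` is hypothetical, and the sample rows are illustrative values rather than output captured from a live cluster.

```python
# Column names copied from the `GET /_cat/health?v` header shown on this page.
HEADER = ("epoch timestamp cluster status node.total node.data shards pri "
          "relo init unassign pending_tasks max_task_wait_time "
          "active_shards_percent")

def parse_cat_health(text):
    """Pair each header column with its value for every data row."""
    lines = [line for line in text.splitlines() if line.strip()]
    columns = lines[0].split()
    rows = []
    for line in lines[1:]:
        values = line.split()
        if len(values) != len(columns):
            raise ValueError(
                f"expected {len(columns)} columns, got {len(values)}")
        rows.append(dict(zip(columns, values)))
    return rows

# Illustrative responses (assumed values): one with timestamps, one without.
with_ts = HEADER + "\n1475871424 16:17:04 elasticsearch green 1 1 1 1 0 0 0 0 - 100.0%\n"
without_ts = (" ".join(HEADER.split()[2:])
              + "\nelasticsearch green 1 1 1 1 0 0 0 0 - 100.0%\n")

print(parse_cat_health(with_ts)[0]["status"])       # green
print("epoch" in parse_cat_health(without_ts)[0])   # False
```

The parser raises an error when a row does not match the header, which guards against silently misreading a response whose columns were changed with the `h` parameter.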

- A common use of this command is to verify the health is consistent
- across nodes:
+ [[cat-health-api-example-across-nodes]]
+ ===== Example across nodes
+ You can use the cat health API to verify the health of a cluster across nodes.
+ For example:

[source,sh]
--------------------------------------------------
@@ -52,10 +118,11 @@ across nodes:
--------------------------------------------------
// NOTCONSOLE

- A less obvious use is to track recovery of a large cluster over
- time. With enough shards, starting a cluster, or even recovering after
- losing a node, can take time (depending on your network & disk). A way
- to track its progress is by using this command in a delayed loop:
+ [[cat-health-api-example-large-cluster]]
+ ===== Example with a large cluster
+ You can use the cat health API to track the recovery of a large cluster over a
+ longer period of time. You can do this by including the cat health API request
+ in a delayed loop. For example:

[source,sh]
--------------------------------------------------
@@ -68,19 +135,7 @@ to track its progress is by using this command in a delayed loop:
--------------------------------------------------
// NOTCONSOLE

- In this scenario, we can tell that recovery took roughly four minutes.
- If this were going on for hours, we would be able to watch the
- `UNASSIGNED` shards drop precipitously. If that number remained
- static, we would have an idea that there is a problem.
-
- [float]
- [[timestamp]]
- ==== Why the timestamp?
-
- You typically are using the `health` command when a cluster is
- malfunctioning. During this period, it's extremely important to
- correlate activities across log files, alerting systems, etc.
-
- There are two outputs. The `HH:MM:SS` output is simply for quick
- human consumption. The epoch time retains more information, including
- date, and is machine sortable if your recovery spans days.
+ In this example, the recovery took roughly six minutes, from `18:24:06` to
+ `18:30:06`. If this recovery took hours, you could continue to monitor the
+ number of `UNASSIGNED` shards, which should drop. If the number of `UNASSIGNED`
+ shards remains static, it would indicate an issue with the cluster recovery.
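
The "remains static" check described above can be automated once the `unassign` values from successive polls are collected. The following Python sketch is one possible approach. The function name `unassigned_stalled` and the `window` threshold are assumptions made for illustration; the documentation does not define how many unchanged samples indicate a problem.

```python
def unassigned_stalled(samples, window=3):
    """Return True if the last `window` samples share one nonzero count.

    `samples` holds the `unassign` column from successive polls, oldest first.
    `window` is an assumed threshold, not a value defined by Elasticsearch.
    """
    recent = [int(value) for value in samples[-window:]]
    if len(recent) < window:
        return False  # not enough history to judge
    return recent[0] > 0 and len(set(recent)) == 1

# Hypothetical poll results.
print(unassigned_stalled([20, 15, 9, 4]))    # False: count keeps dropping
print(unassigned_stalled([20, 12, 12, 12]))  # True: stuck at 12
print(unassigned_stalled([3, 0, 0, 0]))      # False: recovery finished
```

A larger `window` reduces false alarms on slow but progressing recoveries, at the cost of detecting a stall later.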