Commit 1c96e8a

[DOCS] Reformat cat health API (#45218)
1 parent 2489508 commit 1c96e8a

1 file changed: +81 -26 lines changed


docs/reference/cat/health.asciidoc

Lines changed: 81 additions & 26 deletions
@@ -1,8 +1,68 @@
 [[cat-health]]
 === cat health

-`health` is a terse, one-line representation of the same information
-from `/_cluster/health`.
+Returns the health status of a cluster, similar to the <<cluster-health,cluster
+health>> API.
+
+
+[[cat-health-api-request]]
+==== {api-request-title}
+
+`GET /_cat/health`
+
+
+[[cat-health-api-desc]]
+==== {api-description-title}
+
+You can use the cat health API to get the health status of a cluster.
+
+[[timestamp]]
+This API is often used to check malfunctioning clusters. To help you
+track cluster health alongside log files and alerting systems, the API returns
+timestamps in two formats:
+
+* `HH:MM:SS`, which is human-readable but includes no date information.
+* https://en.wikipedia.org/wiki/Unix_time[Unix `epoch` time], which is
+machine-sortable and includes date information. This is useful for cluster
+recoveries that take multiple days.
+
+You can use the cat health API to verify cluster health across multiple nodes.
+See <<cat-health-api-example-across-nodes>>.
+
+You also can use the API to track the recovery of a large cluster
+over a longer period of time. See <<cat-health-api-example-large-cluster>>.
+
+
+[[cat-health-api-query-params]]
+==== {api-query-parms-title}
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=http-format]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-h]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=help]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=local]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=master-timeout]
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-s]
+
+`ts` (timestamps)::
+(Optional, boolean) If `true`, returns `HH:MM:SS` and
+https://en.wikipedia.org/wiki/Unix_time[Unix `epoch`] timestamps. Defaults to
+`true`.
+
+include::{docdir}/rest-api/common-parms.asciidoc[tag=cat-v]
+
+
+[[cat-health-api-example]]
+==== {api-examples-title}
+
+[[cat-health-api-example-timestamp]]
+===== Example with a timestamp
+By default, the cat health API returns `HH:MM:SS` and
+https://en.wikipedia.org/wiki/Unix_time[Unix `epoch`] timestamps. For example:

 [source,js]
 --------------------------------------------------
@@ -11,6 +71,8 @@ GET /_cat/health?v
 // CONSOLE
 // TEST[s/^/PUT twitter\n{"settings":{"number_of_replicas": 0}}\n/]

+The API returns the following response:
+
 [source,txt]
 --------------------------------------------------
 epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
@@ -19,7 +81,9 @@ epoch timestamp cluster status node.total node.data shards pri relo i
 // TESTRESPONSE[s/1475871424 16:17:04/\\d+ \\d+:\\d+:\\d+/]
 // TESTRESPONSE[s/elasticsearch/[^ ]+/ s/0 -/\\d+ (-|\\d+(\\.\\d+)?[ms]+)/ non_json]

-It has one option `ts` to disable the timestamping:
+[[cat-health-api-example-no-timestamp]]
+===== Example without a timestamp
+You can use the `ts` (timestamps) parameter to disable timestamps. For example:

 [source,js]
 --------------------------------------------------
@@ -28,7 +92,7 @@ GET /_cat/health?v&ts=false
 // CONSOLE
 // TEST[s/^/PUT twitter\n{"settings":{"number_of_replicas": 0}}\n/]

-which looks like:
+The API returns the following response:

 [source,txt]
 --------------------------------------------------
@@ -37,8 +101,10 @@ elasticsearch green 1 1 1 1 0 0 0
 --------------------------------------------------
 // TESTRESPONSE[s/elasticsearch/[^ ]+/ s/0 -/\\d+ (-|\\d+(\\.\\d+)?[ms]+)/ non_json]

-A common use of this command is to verify the health is consistent
-across nodes:
+[[cat-health-api-example-across-nodes]]
+===== Example across nodes
+You can use the cat health API to verify the health of a cluster across nodes.
+For example:

 [source,sh]
 --------------------------------------------------
@@ -52,10 +118,11 @@ across nodes:
 --------------------------------------------------
 // NOTCONSOLE

-A less obvious use is to track recovery of a large cluster over
-time. With enough shards, starting a cluster, or even recovering after
-losing a node, can take time (depending on your network & disk). A way
-to track its progress is by using this command in a delayed loop:
+[[cat-health-api-example-large-cluster]]
+===== Example with a large cluster
+You can use the cat health API to track the recovery of a large cluster over a
+longer period of time. You can do this by including the cat health API request
+in a delayed loop. For example:

 [source,sh]
 --------------------------------------------------
@@ -68,19 +135,7 @@ to track its progress is by using this command in a delayed loop:
 --------------------------------------------------
 // NOTCONSOLE

-In this scenario, we can tell that recovery took roughly four minutes.
-If this were going on for hours, we would be able to watch the
-`UNASSIGNED` shards drop precipitously. If that number remained
-static, we would have an idea that there is a problem.
-
-[float]
-[[timestamp]]
-==== Why the timestamp?
-
-You typically are using the `health` command when a cluster is
-malfunctioning. During this period, it's extremely important to
-correlate activities across log files, alerting systems, etc.
-
-There are two outputs. The `HH:MM:SS` output is simply for quick
-human consumption. The epoch time retains more information, including
-date, and is machine sortable if your recovery spans days.
+In this example, the recovery took roughly six minutes, from `18:24:06` to
+`18:30:06`. If this recovery took hours, you could continue to monitor the
+number of `UNASSIGNED` shards, which should drop. If the number of `UNASSIGNED`
+shards remains static, it would indicate an issue with the cluster recovery.
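
The reformatted page describes two uses of the timestamp columns: correlating health samples with other logs, and measuring how long a recovery took. The following is a minimal Python sketch of both, assuming the verbose (`?v`) text output has already been captured as a string. The helper names (`parse_cat_health`, `epoch_to_utc`, `recovery_seconds`) and the two sample rows are hypothetical and only illustrate the column layout shown in the diff. They are not part of Elasticsearch or of this commit.

```python
from datetime import datetime, timezone


def parse_cat_health(text):
    """Parse verbose (`?v`) cat health text into one dict per data row.

    The first non-empty line is treated as the column header; every
    following non-empty line is split on whitespace and zipped with it.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def epoch_to_utc(epoch):
    """Convert the `epoch` column (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(int(epoch), tz=timezone.utc)


def recovery_seconds(rows):
    """Seconds elapsed between the first and last sample, using `epoch`.

    Unlike `HH:MM:SS`, the epoch value carries the date, so the result
    stays correct when a recovery spans midnight or several days.
    """
    return int(rows[-1]["epoch"]) - int(rows[0]["epoch"])


# Hypothetical samples (values invented for illustration), taken six minutes apart.
SAMPLE = """\
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1475871424 16:17:04  elasticsearch yellow 2 2 1 1 0 0 1 0 - 50.0%
1475871784 16:23:04  elasticsearch green  2 2 2 1 0 0 0 0 - 100.0%
"""

if __name__ == "__main__":
    rows = parse_cat_health(SAMPLE)
    for row in rows:
        print(epoch_to_utc(row["epoch"]).isoformat(), row["status"], row["unassign"])
    print("recovery took", recovery_seconds(rows), "seconds")
```

Subtracting `epoch` values rather than `HH:MM:SS` strings is a deliberate choice here: it avoids wrap-around at midnight, which matters for the multi-day recoveries the description mentions.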
