@@ -24,66 +24,59 @@ avoid many unnecessary calls.
|=======================
| Collector | Data Types | Description
| Cluster Stats | `cluster_stats`
- | Gathers details about the cluster state, including parts of
- the actual cluster state (for example `GET /_cluster/state`) and statistics
- about it (for example, `GET /_cluster/stats`). This produces a single document
- type. In versions prior to X-Pack 5.5, this was actually three separate collectors
- that resulted in three separate types: `cluster_stats`, `cluster_state`, and
- `cluster_info`. In 5.5 and later, all three are combined into `cluster_stats`.
- +
- This only runs on the _elected_ master node and the data collected
- (`cluster_stats`) largely controls the UI. When this data is not present, it
- indicates either a misconfiguration on the elected master node, timeouts related
- to the collection of the data, or issues with storing the data. Only a single
- document is produced per collection.
+ | Gathers details about the cluster state, including parts of the actual cluster
+ state (for example `GET /_cluster/state`) and statistics about it (for example,
+ `GET /_cluster/stats`). This produces a single document type. In versions prior
+ to X-Pack 5.5, this was actually three separate collectors that resulted in
+ three separate types: `cluster_stats`, `cluster_state`, and `cluster_info`. In
+ 5.5 and later, all three are combined into `cluster_stats`. This only runs on
+ the _elected_ master node and the data collected (`cluster_stats`) largely
+ controls the UI. When this data is not present, it indicates either a
+ misconfiguration on the elected master node, timeouts related to the collection
+ of the data, or issues with storing the data. Only a single document is produced
+ per collection.
| Index Stats | `indices_stats`, `index_stats`
| Gathers details about the indices in the cluster, both in summary and
individually. This creates many documents that represent parts of the index
- statistics output (for example, `GET /_stats`).
- +
- This information only needs to be collected once, so it is collected on the
- _elected_ master node. The most common failure for this collector relates to an
- extreme number of indices -- and therefore time to gather them -- resulting in
- timeouts. One summary `indices_stats` document is produced per collection and one
- `index_stats` document is produced per index, per collection.
+ statistics output (for example, `GET /_stats`). This information only needs to
+ be collected once, so it is collected on the _elected_ master node. The most
+ common failure for this collector relates to an extreme number of indices -- and
+ therefore time to gather them -- resulting in timeouts. One summary
+ `indices_stats` document is produced per collection and one `index_stats`
+ document is produced per index, per collection.
| Index Recovery | `index_recovery`
| Gathers details about index recovery in the cluster. Index recovery represents
the assignment of _shards_ at the cluster level. If an index is not recovered,
- it is not usable. This also corresponds to shard restoration via snapshots.
- +
- This information only needs to be collected once, so it is collected on the
- _elected_ master node. The most common failure for this collector relates to an
- extreme number of shards -- and therefore time to gather them -- resulting in
- timeouts. This creates a single document that contains all recoveries by default,
- which can be quite large, but it gives the most accurate picture of recovery in
- the production cluster.
+ it is not usable. This also corresponds to shard restoration via snapshots. This
+ information only needs to be collected once, so it is collected on the _elected_
+ master node. The most common failure for this collector relates to an extreme
+ number of shards -- and therefore time to gather them -- resulting in timeouts.
+ This creates a single document that contains all recoveries by default, which
+ can be quite large, but it gives the most accurate picture of recovery in the
+ production cluster.
| Shards | `shards`
| Gathers details about all _allocated_ shards for all indices, particularly
- including what node the shard is allocated to.
- +
- This information only needs to be collected once, so it is collected on the
- _elected_ master node. The collector uses the local cluster state to get the
- routing table without any network timeout issues unlike most other collectors.
- Each shard is represented by a separate monitoring document.
+ including what node the shard is allocated to. This information only needs to be
+ collected once, so it is collected on the _elected_ master node. The collector
+ uses the local cluster state to get the routing table without any network
+ timeout issues unlike most other collectors. Each shard is represented by a
+ separate monitoring document.
| Jobs | `job_stats`
- | Gathers details about all machine learning job statistics (for example,
- `GET /_xpack/ml/anomaly_detectors/_stats`).
- +
- This information only needs to be collected once, so it is collected on the
- _elected_ master node. However, for the master node to be able to perform the
- collection, the master node must have `xpack.ml.enabled` set to true (default)
- and a license level that supports {ml}.
+ | Gathers details about all machine learning job statistics (for example, `GET
+ /_xpack/ml/anomaly_detectors/_stats`). This information only needs to be
+ collected once, so it is collected on the _elected_ master node. However, for
+ the master node to be able to perform the collection, the master node must have
+ `xpack.ml.enabled` set to true (default) and a license level that supports {ml}.
| Node Stats | `node_stats`
| Gathers details about the running node, such as memory utilization and CPU
- usage (for example, `GET /_nodes/_local/stats`).
- +
- This runs on _every_ node with {monitoring} enabled. One common failure
- results in the timeout of the node stats request due to too many segment files.
- As a result, the collector spends too much time waiting for the file system
- stats to be calculated until it finally times out. A single `node_stats`
- document is created per collection. This is collected per node to help to
- discover issues with nodes communicating with each other, but not with the
- monitoring cluster (for example, intermittent network issues or memory pressure).
+ usage (for example, `GET /_nodes/_local/stats`). This runs on _every_ node with
+ {monitoring} enabled. One common failure results in the timeout of the node
+ stats request due to too many segment files. As a result, the collector spends
+ too much time waiting for the file system stats to be calculated until it
+ finally times out. A single `node_stats` document is created per collection.
+ This is collected per node to help to discover issues with nodes communicating
+ with each other, but not with the monitoring cluster (for example, intermittent
+ network issues or memory pressure).
|=======================

{monitoring} uses a single threaded scheduler to run the collection of {es}