---
title: Container metrics
owner:
---
This topic describes the metrics emitted by all containers managed by <%= vars.app_runtime_full %> (<%= vars.app_runtime_abbr %>) and its scheduling
system, Diego.
App metrics include these container metrics and any custom app metrics that developers create.
## <a id='container-metrics'></a> Diego container metrics
Diego containers emit resource usage metrics for the app instance. Diego averages and emits each metric every 15 seconds.
The following table describes all Diego container metrics:
<table id='container-metrics' class="nice" >
<tr>
<th width=27%>Metric</th>
<th width=58%>Description</th>
<th width=15%>Type</th>
</tr><tr>
<td><code>cpu</code></td>
<td>The CPU time that an app instance has used, as a percentage of a single CPU core.
<br>
<br>
This value is usually no greater than <code>100% * the number of vCPUs on the host Diego Cell</code>. However, discrepancies in measurement timing can
cause the value to be greater.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>absolute_entitlement</code></td>
<td>The amount of CPU time that a Diego Cell has allocated to an app instance, in nanoseconds.
<br>
<br>
At minimum, the CPU time that a Diego Cell allocates to an app instance is
<code>min(app memory, 8 GB) * (Diego cell vCPUs/Diego cell memory) * 100%</code>. The operator of your <%= vars.app_runtime_abbr %> deployment can
provide the <code>vCPUs/memory</code> ratio of the Diego Cell to developers.
<br>
If a Diego Cell is not already working at capacity, or if other workloads on the Diego Cell are idle, the Diego Cell can allocate more than the minimum
amount of CPU time to an app instance.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>absolute_usage</code></td>
<td>The CPU time that an app instance has used, in nanoseconds.
<br>
<br>
Dividing <code>absolute_usage</code> by <code>absolute_entitlement</code> gives the app instance's CPU usage as a proportion of its entitlement, which is typically in the 0-100% range.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>memory</code></td>
<td>The amount of memory in bytes that an app instance has used.</td>
<td><code>uint64</code></td>
</tr><tr>
<td><code>memory_quota</code></td>
<td>The amount of memory in bytes that is available for an app instance to use.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>disk</code></td>
<td>The amount of disk space in bytes that an app instance has used.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>disk_quota</code></td>
<td>The amount of disk space in bytes that is available for an app instance to use.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>container_age</code></td>
<td>The age in nanoseconds of the Diego container.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>log_rate</code></td>
<td>The current log rate in bytes per second for an app instance.</td>
<td><code>float64</code></td>
</tr><tr>
<td><code>log_rate_limit</code></td>
<td>The log rate limit in bytes per second for an app instance.</td>
<td><code>float64</code></td>
</tr>
</table>
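The minimum entitlement formula in the table can be tried out with a short calculation. The following Python sketch is illustrative only: the function names, the 4 vCPU / 32 GB Diego Cell, and the use of binary gigabytes are assumptions, not values from any real deployment. It reads the result as a percentage of one CPU core, which follows from the units in the formula.

```python
GB = 1024 ** 3  # assumption: sizes are expressed in binary gigabytes


def min_cpu_entitlement_percent(app_memory_bytes, cell_vcpus, cell_memory_bytes):
    """Apply min(app memory, 8 GB) * (cell vCPUs / cell memory) * 100%."""
    capped_memory = min(app_memory_bytes, 8 * GB)
    return capped_memory * (cell_vcpus / cell_memory_bytes) * 100.0


def usual_cpu_ceiling_percent(cell_vcpus):
    """Usual upper bound of the cpu metric: 100% * number of vCPUs on the cell."""
    return 100.0 * cell_vcpus


if __name__ == "__main__":
    # Hypothetical Diego Cell with 4 vCPUs and 32 GB of memory.
    print(min_cpu_entitlement_percent(1 * GB, 4, 32 * GB))   # 1 GB app
    print(min_cpu_entitlement_percent(16 * GB, 4, 32 * GB))  # capped at 8 GB
    print(usual_cpu_ceiling_percent(4))
```

Because app memory is capped at 8 GB in the formula, an app with 16 GB of memory receives the same minimum entitlement as an app with 8 GB.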
The way that Diego emits container metrics differs depending on the version of Loggregator used in your <%= vars.app_runtime_abbr %> deployment:
* **Loggregator v1:** Diego emits most container metrics in a `ContainerMetric` envelope. Diego emits the `absolute_entitlement`, `absolute_usage`,
  `container_age`, `log_rate`, and `log_rate_limit` container metrics in `ValueMetric` envelopes.
* **Loggregator v2:** Diego emits all container metrics in gauge envelopes, split across three envelopes:
    * One envelope containing `cpu`, `disk`, `disk_quota`, `memory`, and `memory_quota`
    * One envelope containing `absolute_entitlement`, `absolute_usage`, and `container_age`
    * One envelope containing `log_rate` and `log_rate_limit`
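When writing a consumer for these metrics, it can help to encode the grouping described above as a lookup table. The following Python sketch is one possible way to do that. The group labels (`resources`, `cpu_entitlement`, `log_rate`) and function names are hypothetical and are not part of Loggregator; only the metric names and envelope types come from this topic.

```python
# Group labels are hypothetical names chosen for illustration.
# The metric names in each group come from the list above.
V2_GAUGE_ENVELOPE_GROUPS = {
    "resources": ("cpu", "disk", "disk_quota", "memory", "memory_quota"),
    "cpu_entitlement": ("absolute_entitlement", "absolute_usage", "container_age"),
    "log_rate": ("log_rate", "log_rate_limit"),
}


def v2_envelope_group(metric_name):
    """Return the label of the v2 gauge envelope group that carries the metric."""
    for group, names in V2_GAUGE_ENVELOPE_GROUPS.items():
        if metric_name in names:
            return group
    raise KeyError(f"unknown container metric: {metric_name}")


def v1_envelope_type(metric_name):
    """Return the v1 envelope type described above for the metric."""
    group = v2_envelope_group(metric_name)  # raises KeyError for unknown names
    return "ContainerMetric" if group == "resources" else "ValueMetric"


if __name__ == "__main__":
    for name in ("cpu", "absolute_usage", "log_rate"):
        print(name, v1_envelope_type(name), v2_envelope_group(name))
```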
## <a id='cf-cli'></a> Retrieving container metrics from the cf CLI
You can retrieve container metrics using the Cloud Foundry Command Line Interface (cf CLI).
* To retrieve CPU, memory, and disk metrics for all instances of an app, see [Retrieving CPU, memory, and disk metrics](#cf-mem-disk).
* To retrieve CPU entitlement metrics for all instances of an app, see [Retrieving CPU Entitlement metrics](#cf-entitlement).
* To determine when an app has exceeded its CPU entitlement, see [Determining when apps exceed their CPU Entitlement](#cpu-entitlement).
### <a id='cf-mem-disk'></a> Retrieving CPU, memory, and disk metrics
To retrieve CPU, memory, and disk metrics for all instances of an app:
1. In a terminal window, run:
```
cf app APP-NAME
```
Where `APP-NAME` is the name of the app.
<br>
<br>
This command returns CPU, memory, and disk metrics for all instances of the app, similar to the following example:
<pre class="terminal">
Showing health and status for app dora-example in org o / space s as admin...
name: dora-example
requested state: started
routes: dora-example.example.com
last uploaded: Fri 16 Sep 01:38:32 UTC 2022
stack: cflinuxfs3
buildpacks:
  name             version   detect output   buildpack name
  ruby_buildpack   1.8.58    ruby            ruby
type: web
sidecars:
instances: 1/1
memory usage: 1024M
     state     since                  cpu    memory        disk          logging            details
#0   running   2022-09-16T01:38:46Z   0.2%   36.3M of 1G   90.3M of 1G   0/s of unlimited
</pre>
<p>The <code>cpu</code> column in the preceding output shows the percentage of CPU that each app instance is currently using, relative to the total CPU on the host
machine.</p>
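The `memory` and `disk` columns report usage in the form `USED of QUOTA`. If you want to turn such a value into a percentage, the following Python sketch shows one way to do it. It assumes that the unit suffixes are binary multiples (for example, `M` is 1024 * 1024 bytes); that assumption and the function names are illustrative and not taken from the cf CLI documentation.

```python
import re

# Assumption: suffixes are binary multiples of a byte.
_UNIT_BYTES = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_size(text):
    """Convert a value such as '36.3M' or '1G' to a number of bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([BKMGT])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * _UNIT_BYTES[match.group(2)]


def usage_percent(column_value):
    """Convert a 'USED of QUOTA' column value to a percentage of the quota."""
    used, quota = column_value.split(" of ")
    return 100.0 * parse_size(used) / parse_size(quota)


if __name__ == "__main__":
    # Values copied from the example output above.
    print(round(usage_percent("36.3M of 1G"), 1))
    print(round(usage_percent("90.3M of 1G"), 1))
```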
### <a id='cf-entitlement'></a> Retrieving CPU Entitlement metrics
The Cloud Foundry CPU Entitlement Plug-in shows the amount of CPU that an app is using relative to its CPU entitlement, which the `absolute_entitlement` metric reports.
To retrieve CPU usage relative to entitlement for all instances of an app:
1. Install the Cloud Foundry CPU Entitlement Plug-in from the [Cloud Foundry CPU Entitlement Plugin](https://github.com/cloudfoundry/cpu-entitlement-plugin)
repository on GitHub.
1. In a terminal window, run:
```
cf cpu-entitlement APP-NAME
```
Where `APP-NAME` is the name of the app.
<br>
<br>
This command returns CPU usage as a percentage of CPU entitlement for each instance of the app, similar to the following example:
<pre class="terminal">
Showing CPU usage against entitlement for app dora-example in org example-org / space example-org-staging as [email protected] ...
     avg usage   curr usage
#0   1.62%       1.66%
#1   2.93%       3.09%
#2   2.51%       2.62%
</pre>
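To make the relationship between these percentages and the underlying metrics concrete, the following Python sketch derives an average and a current usage figure from two samples of `absolute_usage` and `absolute_entitlement`. This is a plausible reading of the two columns, not a description of how the plug-in is implemented: it assumes that both metrics are cumulative counters in nanoseconds, that the average is the ratio of the latest totals, and that the current value is the ratio of the changes between two samples. The `Sample` type and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One reading of the two CPU metrics, both in nanoseconds (hypothetical type)."""
    absolute_usage: float
    absolute_entitlement: float


def average_usage_percent(latest):
    """Usage as a percentage of entitlement over the whole sampled lifetime."""
    return 100.0 * latest.absolute_usage / latest.absolute_entitlement


def current_usage_percent(previous, latest):
    """Usage as a percentage of entitlement between two samples."""
    delta_entitlement = latest.absolute_entitlement - previous.absolute_entitlement
    if delta_entitlement <= 0:
        raise ValueError("entitlement did not increase between samples")
    delta_usage = latest.absolute_usage - previous.absolute_usage
    return 100.0 * delta_usage / delta_entitlement


if __name__ == "__main__":
    # Made-up sample values for illustration.
    previous = Sample(absolute_usage=1_000_000_000, absolute_entitlement=50_000_000_000)
    latest = Sample(absolute_usage=1_450_000_000, absolute_entitlement=65_000_000_000)
    print(f"avg usage:  {average_usage_percent(latest):.2f}%")
    print(f"curr usage: {current_usage_percent(previous, latest):.2f}%")
```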
### <a id='cpu-entitlement'></a> Determining when apps exceed their CPU Entitlement
You can use the Cloud Foundry CPU Overentitlement Plug-in to determine when an app has exceeded its CPU entitlement and might need to be scaled up.
When you allow CPU throttling in your <%= vars.app_runtime_abbr %> deployment, apps are split into two groups:
* Apps with an average CPU usage under 100% of their CPU entitlement
* Apps with an average CPU usage that exceeds 100% of their entitlements
To determine which apps in your org have exceeded their CPU entitlements:
1. Install the Cloud Foundry CPU Overentitlement Plug-in, `cpu-overentitlement-instances-plugin`, from the [Cloud Foundry CPU Entitlement
Plug-in](https://github.com/cloudfoundry/cpu-entitlement-plugin/releases) repository on GitHub.
1. In a terminal window, run:
```
cf over-entitlement-instances
```
This command returns output similar to the following example:
<pre class="terminal">
Note, this feature is experimental.
Showing over-entitlement apps in org example-org / space example-org-staging as [email protected] ...
     space                 app
#0   example-org-staging   dora-example-2
#1   example-org-staging   dora-example-3
#2   example-org-staging   dora-example-4
</pre>
The preceding example output shows that the three listed apps have an average CPU usage that exceeds 100% of their entitlements. When an app's average CPU usage
exceeds its CPU entitlement, consider increasing its CPU entitlement so that it is not throttled.
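The two groups described in this section can be expressed as a simple partition over average usage values. The following Python sketch is illustrative only: the function name and the input values are hypothetical, and it assumes that an app at exactly 100% counts as within its entitlement, which this topic does not specify.

```python
def split_by_entitlement(avg_usage_percent_by_app):
    """Partition apps into (within entitlement, over entitlement).

    Assumption: an app at exactly 100% is treated as within its entitlement.
    """
    within, over = [], []
    for app, usage in sorted(avg_usage_percent_by_app.items()):
        (over if usage > 100.0 else within).append(app)
    return within, over


if __name__ == "__main__":
    # Made-up average usage values, as percentages of each app's entitlement.
    usage = {
        "dora-example": 2.51,
        "dora-example-2": 143.0,
        "dora-example-3": 101.2,
    }
    within, over = split_by_entitlement(usage)
    print("within entitlement:", within)
    print("over entitlement:  ", over)
```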