Commit 372eaa3

Merge pull request #1803 from JoaoBraveCoding/mon-2727
MON-2727: Adds telemeter alert TelemeterClientFailures
2 parents aa61e2a + f7a2ab1 commit 372eaa3

3 files changed: +21 -2 lines changed

Diff for: CHANGELOG.md

+1
@@ -3,6 +3,7 @@
 ## 4.12
 - [#1624](https://github.com/openshift/cluster-monitoring-operator/pull/1624) Add option to specify TopologySpreadConstraints for Prometheus, Alertmanager, and ThanosRuler.
 - [#1752](https://github.com/openshift/cluster-monitoring-operator/pull/1752) Add option to improve consistency of prometheus-adapter CPU and RAM time series.
+- [#1803](https://github.com/openshift/cluster-monitoring-operator/pull/1803) Add alert TelemeterClientFailures
 
 ## 4.11
 - [#1652](https://github.com/openshift/cluster-monitoring-operator/pull/1652) Double scrape interval for all CMO controlled ServiceMonitors on single node deployments

Diff for: assets/telemeter-client/prometheus-rule.yaml

+18
@@ -9,3 +9,21 @@ spec:
     rules:
     - expr: max(federate_samples - federate_filtered_samples)
       record: cluster:telemetry_selected_series:count
+    - alert: TelemeterClientFailures
+      annotations:
+        description: |-
+          The telemeter client in namespace {{ $labels.namespace }} fails {{ $value | humanize }} of the requests to the telemeter service.
+          Check the logs of the telemeter-client pod with the following command:
+          oc logs -n openshift-monitoring deployment.apps/telemeter-client -c telemeter-client
+          If the telemeter client fails to authenticate with the telemeter service, make sure that the global pull secret is up to date, see https://docs.openshift.com/container-platform/latest/openshift_images/managing_images/using-image-pull-secrets.html#images-update-global-pull-secret_using-image-pull-secrets for more details.
+        summary: Telemeter client fails to send metrics
+      expr: |
+        sum by (namespace) (
+          rate(federate_requests_failed_total{job="telemeter-client"}[15m])
+        ) /
+        sum by (namespace) (
+          rate(federate_requests_total{job="telemeter-client"}[15m])
+        ) > 0.2
+      for: 1h
+      labels:
+        severity: warning
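
For readers less familiar with PromQL, the firing condition of the new alert can be approximated in a few lines of Python. This is only an illustrative sketch, not code from the commit: the function names `failure_ratio` and `should_fire` and the `for_evals` parameter are hypothetical, and the `for: 1h` clause is modeled loosely as "the last 60 one-minute evaluations all exceeded the threshold", which assumes a one-minute evaluation interval.

```python
from typing import Optional, Sequence


def failure_ratio(failed_rate: float, total_rate: float) -> Optional[float]:
    """Failed-request rate divided by total-request rate for one namespace.

    Returns None when there is no traffic. In PromQL the division would
    yield NaN, and the comparison NaN > 0.2 does not match.
    """
    if total_rate == 0:
        return None
    return failed_rate / total_rate


def should_fire(
    ratios: Sequence[Optional[float]],
    threshold: float = 0.2,   # mirrors "> 0.2" in the rule expression
    for_evals: int = 60,      # assumption: 1h at one evaluation per minute
) -> bool:
    """True if the last `for_evals` evaluations all exceeded the threshold."""
    if len(ratios) < for_evals:
        return False
    recent = ratios[-for_evals:]
    return all(r is not None and r > threshold for r in recent)


if __name__ == "__main__":
    # 1 failed request/s out of 4 requests/s is a 25% failure ratio.
    ratio = failure_ratio(1.0, 4.0)
    print(ratio, should_fire([ratio] * 60))
```

A single evaluation below the threshold resets the window in this sketch, which matches the intent of `for: 1h`: short bursts of failed requests do not raise the warning.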

Diff for: jsonnet/jsonnetfile.lock.json

+2 -2
@@ -120,8 +120,8 @@
           "subdir": "jsonnet/telemeter"
         }
       },
-      "version": "320b9a967574c0a57690dea1987e1f294dbc22e5",
-      "sum": "jPX3JQZndZSVPDmkW2HZEib7/oeuVpxGOB/rXSgyOcI=",
+      "version": "4d304019274307c21afefa108493c8af89a2429d",
+      "sum": "079UoqPnQJWKoVi2qMsVUANGD0cBkx25D+S7guvrcGc=",
       "name": "telemeter-client"
     },
     {
