
Commit a27d56b

Committed Aug 28, 2018
update suggested build metric query rules
1 parent 55f97bf commit a27d56b

1 file changed (+10, -2 lines changed)


examples/prometheus/README.md

@@ -165,11 +165,19 @@ builds where the fact they have not started could be cited as resulting from use
 
 NOTE: OpenShift Online monitors builds in a fashion similar to this today.
 
-> sum(rate(openshift_build_total{phase="Error"}[10m])) / sum((rate(openshift_build_total{phase="Complete"}[10m]) + rate(openshift_build_total{phase="Error"}[10m]))) * 100
+> sum(openshift_build_total{job="kubernetes-apiservers",phase="Error"})/(sum(openshift_build_total{job="kubernetes-apiservers",phase=~"Complete|Error"})) * 100
 
-Calculates the error rate for builds over the last 10 minutes, where the error might indicate issues with the cluster or namespace. Note, it ignores build in the "Failed" and "Cancelled" phases, as builds typically end up in
+Calculates the error rate for builds, where the error might indicate issues with the cluster or namespace. Note, it ignores builds in the "Failed" and "Cancelled" phases, as builds typically end up in
 one of those phases as the result of a user choice or error. Administrators after some experience with their cluster could decide what is an acceptable error rate and monitor when it is exceeded.
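
As a rough illustration of the arithmetic behind the updated error-rate query, the sketch below computes the same ratio from per-phase totals. The function name and the sample counts are hypothetical stand-ins for the values `sum(openshift_build_total{phase=...})` would return; returning `None` for an empty denominator is a choice made here, not Prometheus behavior.

```python
def build_error_percentage(totals):
    """Return Error / (Complete + Error) * 100 from per-phase build totals.

    `totals` maps a phase name to a build count (hypothetical sample data).
    Builds in the Failed and Cancelled phases are deliberately left out,
    mirroring the phase=~"Complete|Error" selector in the query.
    """
    errors = totals.get("Error", 0)
    considered = errors + totals.get("Complete", 0)
    if considered == 0:
        return None  # no Complete or Error builds, so the ratio is undefined
    return errors * 100 / considered


# Hypothetical snapshot of build counts by phase.
sample = {"Complete": 45, "Error": 5, "Failed": 7, "Cancelled": 3}
print(build_error_percentage(sample))  # 5 of 50 considered builds -> 10.0
```

An administrator could compare the resulting percentage against whatever threshold they have settled on for their cluster.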
 
+> ((sum(openshift_build_total{job="kubernetes-apiservers",phase="Complete"})-
+> sum(openshift_build_total{job="kubernetes-apiservers",phase="Complete"} offset 1h)) /
+> (sum(openshift_build_total{job="kubernetes-apiservers",phase=\~"Failed|Complete|Error"}) -
+> (sum(openshift_build_total{job="kubernetes-apiservers",phase=\~"Failed|Complete|Error"} offset 1h)))) * 100
+
+Calculates the percentage of builds that were successful in the last hour. Note that this value is only accurate if no pruning of builds
+is performed, otherwise it is impossible to determine how many builds ran (successfully or otherwise) in the last hour.
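
The success-percentage query subtracts the totals from one hour ago (`offset 1h`) from the current totals. A minimal sketch of that calculation follows, using two hypothetical snapshots of per-phase counts. It also shows why pruning skews the result, assuming (as the note above implies) that pruned builds drop out of the totals.

```python
PHASES = ("Failed", "Complete", "Error")


def success_percentage(now, hour_ago):
    """Percentage of builds finishing in the last hour that completed.

    `now` and `hour_ago` map phase names to totals (hypothetical data),
    standing in for the query's current and `offset 1h` sums.
    """
    completed = now.get("Complete", 0) - hour_ago.get("Complete", 0)
    finished = (sum(now.get(p, 0) for p in PHASES)
                - sum(hour_ago.get(p, 0) for p in PHASES))
    if finished == 0:
        return None  # nothing finished in the window; the ratio is undefined
    return completed * 100 / finished


hour_ago = {"Complete": 100, "Failed": 20, "Error": 5}
now = {"Complete": 118, "Failed": 21, "Error": 6}
print(success_percentage(now, hour_ago))  # 18 of 20 builds -> 90.0

# If 10 older Complete builds were pruned during the hour, both deltas
# shrink and the reported percentage no longer reflects what actually ran.
pruned = dict(now, Complete=now["Complete"] - 10)
print(success_percentage(pruned, hour_ago))  # 8 of 10 -> 80.0
```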
+
 > predict_linear(openshift_build_total{phase="Error"}[1h],3600)
 
 Predicts what the error count will be in 1 hour, using the last hour's data.
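
Prometheus documents `predict_linear` as a simple linear regression over the samples in the range. The sketch below illustrates that idea only: the function name and sample points are hypothetical, and it extrapolates from the last sample's timestamp, a simplification of what Prometheus does (it extrapolates from the evaluation time).

```python
def predict_linear_sketch(samples, seconds_ahead):
    """Fit a least-squares line to (timestamp, value) pairs and extrapolate.

    `samples` is a list of (timestamp_seconds, value) tuples, standing in
    for the points in openshift_build_total{phase="Error"}[1h].
    """
    n = len(samples)
    if n < 2:
        return None  # a line cannot be fitted to fewer than two points
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    if var == 0:
        return None  # all samples share one timestamp
    slope = cov / var
    intercept = mean_v - slope * mean_t
    last_t = samples[-1][0]
    return intercept + slope * (last_t + seconds_ahead)


# Hypothetical error counts sampled every 20 minutes over the last hour,
# growing by 2 per sample.
samples = [(0, 2), (1200, 4), (2400, 6), (3600, 8)]
print(predict_linear_sketch(samples, 3600))  # approximately 14
```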
