Commit dbcbab3

Merge pull request #5699 from pecameron/metrics
Router - document published metrics
2 parents 595797d + 2c23c51 commit dbcbab3

3 files changed

+171
-81
lines changed

architecture/topics/haproxy_template_router.adoc

Lines changed: 99 additions & 1 deletion
@@ -27,4 +27,102 @@ The following diagram illustrates how data flows from the master through the
2727
plug-in and finally into an HAProxy configuration:
2828

2929
.HAProxy Router Data Flow
30-
image::router_model.png[HAProxy Router Data Flow]
30+
image::router_model.png[HAProxy Router Data Flow]
31+
32+
[[haproxy-metrics]]
33+
=== HAProxy Template Router Metrics
34+
35+
The HAProxy router exposes or publishes metrics in
36+
link:https://prometheus.io/docs/concepts/data_model/[Prometheus format]
37+
for consumption by external metrics collection and aggregation systems (e.g. Prometheus, statsd).
38+
The router can be
39+
xref:../../install_config/router/default_haproxy_router.adoc#exposing-the-router-metrics[configured]
40+
to provide
41+
link:https://cbonte.github.io/haproxy-dconv/1.5/configuration.html#9[HAProxy CSV format] metrics, or
42+
provide no router metrics at all.
43+
44+
The metrics are collected from both the router controller and from HAProxy every 5 seconds.
45+
The router metrics counters start at zero when the router is deployed and increase over time.
46+
The HAProxy metrics counters are reset to zero every time HAProxy is reloaded. The router
47+
collects HAProxy statistics for each frontend, backend and server. To reduce resource usage
48+
when there are more than 500 servers, the backends are reported instead of the servers since
49+
a backend can have multiple servers.
50+
51+
The statistics are a subset of the available HAProxy
52+
link:https://cbonte.github.io/haproxy-dconv/1.5/configuration.html#9.1[Statistics].
53+
54+
The following HAProxy metrics are collected on a periodic basis and converted to Prometheus
55+
format. For every frontend the "F" counters are collected. When the total number of servers
56+
is less than the Server Threshold (default 500), the "b" counters are collected for each
57+
backend and the "S" server counters are collected for each server. Otherwise, the "B"
58+
counters are collected for each backend and no server counters are collected.
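The threshold rule described above can be sketched as a small function. This is only an illustration of the stated rule, not the router's actual code; the function name `reported_levels` and the level names are hypothetical, and the default of 500 is the Server Threshold default quoted in the text.

```python
def reported_levels(server_count, server_threshold=500):
    """Return the HAProxy object types whose counters are collected.

    Hypothetical helper: frontends and backends are always reported,
    servers only while the total server count is below the threshold.
    """
    levels = {"frontend", "backend"}
    # The text says "less than the Server Threshold", so a count equal
    # to the threshold already drops the per-server counters.
    if server_count < server_threshold:
        levels.add("server")
    return levels


if __name__ == "__main__":
    print(sorted(reported_levels(11)))
    print(sorted(reported_levels(1200)))
```

With few servers the per-server counters are included; past the threshold only frontend and backend counters remain, which is what keeps resource usage bounded.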
59+
60+
In the following table:
61+
62+
Column 1 - Index from HAProxy CSV statistics
63+
64+
Column 2
65+
|===
66+
|F|Frontend metrics
67+
|b|Backend metrics when not showing Server metrics due to the Server Threshold
68+
|B|Backend metrics when showing Server metrics
69+
|S|Server metrics
70+
|===
71+
72+
Column 3 - The counter
73+
74+
Column 4 - Counter description
75+
// defaultSelectedMetrics = []int{2, 4, 5, 7, 8, 9, 13, 14, 17, 21, 24, 33, 35, 40, 43, 60}
76+
// reducedBackendExports: map[int]struct{}{2: {}, 3: {}, 7: {}, 17: {}},
77+
|===
78+
|Index|Usage|Counter|Description
79+
|2|bBS|current_queue|Current number of queued requests not assigned to any server.
80+
//|3|bBS|max_queue|Maximum observed number of queued requests not assigned to any server.
81+
|4|FbS|current_sessions|Current number of active sessions.
82+
|5|FbS|max_sessions|Maximum observed number of active sessions.
83+
//|6|FbS|limit_sessions|Configured session limit.
84+
|7|FbBS|connections_total|Total number of connections.
85+
|8|FbS|bytes_in_total|Current total of incoming bytes.
86+
|9|FbS|bytes_out_total|Current total of outgoing bytes.
87+
//|10|F|requests_denied_total|Total of requests denied for security.
88+
//|12|F|request_errors_total|Total of request errors.
89+
|13|bS|connection_errors_total|Total of connection errors.
90+
|14|bS|response_errors_total|Total of response errors.
91+
//|15|bS|retry_warnings_total|Total of retry warnings.
92+
//|16|bS|redispatch_warnings_total|Total of redispatch warnings.
93+
|17|bBS|up|Current health status of the backend (1 = UP, 0 = DOWN).
94+
//|18|b.S|weight|Total weight of the servers in the backend.
95+
|21|S|check_failures_total|Total number of failed health checks.
96+
|24|S|downtime_seconds_total|Total downtime in seconds.
97+
|33|FbS|current_session_rate|Current number of sessions per second over last elapsed second.
98+
//|34|F|limit_session_rate|Configured limit on new sessions per second.
99+
|35|FbS|max_session_rate|Maximum observed number of sessions per second.
100+
//|38|S|check_duration_milliseconds|Previously run health check duration, in milliseconds.
101+
//|39|FbS|http_responses_total|Total of HTTP responses, code 1xx
102+
|40|FbS|http_responses_total|Total of HTTP responses, code 2xx
103+
//|41|FbS|http_responses_total|Total of HTTP responses, code 3xx
104+
//|42|FbS|http_responses_total|Total of HTTP responses, code 4xx
105+
|43|FbS|http_responses_total|Total of HTTP responses, code 5xx
106+
//|44|FbS|http_responses_total|Total of HTTP responses, code other
107+
//|48|F|http_requests_total|Total HTTP requests.
108+
|60|FbS|http_average_response_latency_milliseconds|Average response latency of the last 1024 requests in milliseconds.
109+
|===
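To make the Index column concrete, the sketch below picks the listed fields out of one line of HAProxy CSV statistics. It assumes the index is the zero-based column position in the CSV row; the `SELECTED` mapping is copied from the uncommented rows of the table, while the function name `select_counters` and the sample row are hypothetical. The router's real exporter also converts values (for example the status column into 1 or 0), which this sketch does not attempt.

```python
import csv
import io

# Index -> counter name, copied from the uncommented table rows above.
SELECTED = {
    2: "current_queue",
    4: "current_sessions",
    5: "max_sessions",
    7: "connections_total",
    8: "bytes_in_total",
    9: "bytes_out_total",
    13: "connection_errors_total",
    14: "response_errors_total",
    17: "up",
    21: "check_failures_total",
    24: "downtime_seconds_total",
    33: "current_session_rate",
    35: "max_session_rate",
    40: "http_responses_total (2xx)",
    43: "http_responses_total (5xx)",
    60: "http_average_response_latency_milliseconds",
}


def select_counters(csv_line):
    """Return {counter name: raw string value} for the selected columns.

    Columns that are missing or empty in the row are skipped.
    """
    row = next(csv.reader(io.StringIO(csv_line)))
    picked = {}
    for index, name in SELECTED.items():
        if index < len(row) and row[index] != "":
            picked[name] = row[index]
    return picked


if __name__ == "__main__":
    # Made-up frontend row: only a few columns are filled in.
    fields = [""] * 62
    fields[0], fields[1] = "public", "FRONTEND"
    fields[4], fields[8], fields[17] = "3", "119070", "OPEN"
    print(select_counters(",".join(fields)))
```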
110+
111+
112+
The router controller scrapes the following items. These are only available with Prometheus format metrics.
113+
|===
114+
|Name|Description
115+
|template_router_reload_seconds|Measures the time spent reloading the router in seconds.
116+
|template_router_write_config_seconds|Measures the time spent writing out the router configuration to disk in seconds.
117+
|haproxy_exporter_up|Was the last scrape of haproxy successful.
118+
|haproxy_exporter_csv_parse_failures|Number of errors while parsing CSV.
119+
|haproxy_exporter_scrape_interval|The time in seconds before another scrape is allowed, proportional to size of data.
120+
|haproxy_exporter_server_threshold|Number of servers tracked and the current threshold value.
121+
|haproxy_exporter_total_scrapes|Current total HAProxy scrapes.
122+
|http_request_duration_microseconds|The HTTP request latencies in microseconds.
123+
|http_request_size_bytes|The HTTP request sizes in bytes.
124+
|http_response_size_bytes|The HTTP response sizes in bytes.
125+
|openshift_build_info|A metric with a constant '1' value labeled by major, minor, git commit & git version from which OpenShift was built.
126+
|ssh_tunnel_open_count|Counter of SSH tunnel total open attempts
127+
|ssh_tunnel_open_fail_count|Counter of SSH tunnel failed open attempts
128+
|===

architecture/topics/router_environment_variables.adoc

Lines changed: 2 additions & 0 deletions
@@ -45,8 +45,10 @@ connections (and any time HAProxy is reloaded), the old HAProxy processes
4545
will "linger" around for that period. xref:time-units[(TimeUnits)]
4646
|`ROUTER_DENIED_DOMAINS` | | A comma-separated list of domains that the host name in a route can not be part of. No subdomain in the domain can be used either. Overrides option `ROUTER_ALLOWED_DOMAINS`.
4747
|`ROUTER_ENABLE_COMPRESSION`| | If `true` or `TRUE`, compress responses when possible.
48+
|`ROUTER_LISTEN_ADDR`| 0.0.0.0:1936 | Sets the listening address for xref:../../install_config/router/default_haproxy_router.adoc#exposing-the-router-metrics[router metrics].
4849
|`ROUTER_LOG_LEVEL` | warning | The log level to send to the syslog server.
4950
|`ROUTER_MAX_CONNECTIONS`| 20000 | Maximum number of concurrent connections.
51+
|`ROUTER_METRICS_TYPE`| haproxy | Generate metrics for xref:../../install_config/router/default_haproxy_router.adoc#exposing-the-router-metrics[the HAProxy router]. (haproxy is the only supported value)
5052
|`ROUTER_OVERRIDE_HOSTNAME`| | If set `true`, override the spec.host value for a route with the template in `ROUTER_SUBDOMAIN`.
5153
|`ROUTER_SERVICE_HTTPS_PORT` | 443 | Port to listen for HTTPS requests.
5254
|`ROUTER_SERVICE_HTTP_PORT` | 80 | Port to listen for HTTP requests.

install_config/router/default_haproxy_router.adoc

Lines changed: 70 additions & 80 deletions
@@ -1500,109 +1500,99 @@ add routes across the namespaces.
15001500
[[exposing-the-router-metrics]]
15011501
== Exposing Router Metrics
15021502

1503-
Using the `--metrics-image` and `--expose-metrics` options, you can configure
1504-
the {product-title} router to run a sidecar container that exposes or publishes
1505-
router metrics for consumption by external metrics collection and aggregation
1506-
systems (e.g. Prometheus, statsd).
1507-
1508-
Depending on your router implementation, the image is appropriately set up and
1509-
the metrics sidecar container is started when the router is deployed. For
1510-
example, the HAProxy-based router implementation defaults to using the
1511-
`prom/haproxy-exporter` image to run as a sidecar container, which can then be
1512-
used as a metrics datasource by the Prometheus server.
1513-
1514-
[NOTE]
1515-
====
1516-
The `--metrics-image` option overrides the defaults for HAProxy-based router
1517-
implementations and, in the case of custom implementations, enables the image to
1518-
use for a custom metrics exporter or publisher.
1519-
====
1520-
1521-
ifdef::openshift-enterprise[]
1522-
. Grab the HAProxy Prometheus exporter image from the Docker registry:
1523-
+
1524-
====
1525-
----
1526-
$ sudo docker pull prom/haproxy-exporter
1527-
----
1528-
====
1503+
The
1504+
xref:../../architecture/networking/haproxy-router.adoc#haproxy-metrics[HAProxy router metrics]
1505+
are, by default, exposed or published in
1506+
link:https://prometheus.io/docs/concepts/data_model/[Prometheus format]
1507+
for consumption by external metrics collection and aggregation systems (e.g. Prometheus, statsd).
1508+
Metrics are also available directly from the
1509+
link:https://cbonte.github.io/haproxy-dconv/1.5/configuration.html#9[HAProxy router] in its own CSV format.
15291510

1530-
. Create the {product-title} router:
1531-
+
1511+
When you create a router, as below,
15321512
====
15331513
----
1534-
$ oc adm router --service-account=router --expose-metrics
1514+
$ oc adm router --service-account=router
15351515
----
15361516
====
1537-
+
1538-
Or, optionally, use the `--metrics-image` option to override the HAProxy
1539-
defaults:
1540-
+
1517+
metrics are automatically available in Prometheus format on the stats port (default 1936). To suppress metrics collection, set the stats port to 0:
15411518
====
15421519
----
1543-
$ oc adm router --service-account=router --expose-metrics \
1544-
--metrics-image=prom/haproxy-exporter
1520+
$ oc adm router --service-account=router --stats-port=0
15451521
----
15461522
====
1547-
endif::[]
1548-
ifdef::openshift-origin[]
1549-
. Grab the HAProxy Prometheus exporter image from the Docker registry:
1550-
+
1523+
1524+
To switch to the HAProxy CSV format metrics, edit the
1525+
architecture/networking/routes.adoc
1526+
xref:../../architecture/networking/routes.adoc#env-variables[environment variables]
1527+
for the router dc and delete the following lines:
1528+
15511529
====
15521530
----
1553-
$ sudo docker pull prom/haproxy-exporter
1531+
- name: ROUTER_LISTEN_ADDR
1532+
value: 0.0.0.0:1936
1533+
- name: ROUTER_METRICS_TYPE
1534+
value: haproxy
15541535
----
15551536
====
1537+
Where 1936 is the STATS_PORT value.
15561538

1557-
. Create the {product-title} router:
1558-
+
1539+
[NOTE]
15591540
====
1560-
----
1561-
$ oc adm router --service-account=router --expose-metrics
1562-
----
1541+
The `--expose-metrics` and `--metrics-image` options are deprecated. The haproxy-exporter
1542+
sidecar is now integrated into the router controller, so you can delete the sidecar container from existing
1543+
router deployment configs. You can continue to use the sidecar in existing routers. New routers use the integrated metrics.
15631544
====
1564-
+
1565-
Or, optionally, use the `--metrics-image` option to override the HAProxy
1566-
defaults:
1567-
+
1545+
1546+
1547+
You can extract the raw statistics in Prometheus format by using the following.
1548+
1549+
Information needed to access the metrics is found in the router service annotations:
1550+
15681551
====
15691552
----
1570-
$ oc adm router --service-account=router --expose-metrics \
1571-
--metrics-image=prom/haproxy-exporter
1553+
metadata:
1554+
annotations:
1555+
prometheus.io/port: "1936"
1556+
prometheus.io/scrape: "true"
1557+
prometheus.openshift.io/password: IImoDqON02
1558+
prometheus.openshift.io/username: admin
15721559
----
15731560
====
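As a rough illustration of how these annotations might be turned into a scrape target, the sketch below builds a URL and credentials from an annotations mapping. The function name `scrape_target`, the `http` scheme, and the example address `192.0.2.10` are assumptions for illustration only; the annotation keys and the `/metrics` path come from this section.

```python
def scrape_target(annotations, router_ip):
    """Build a metrics URL and credentials from service annotations.

    Hypothetical helper: returns None when scraping is not enabled.
    """
    if annotations.get("prometheus.io/scrape") != "true":
        return None
    port = int(annotations["prometheus.io/port"])
    return {
        # Plain http is an assumption; adjust if your router differs.
        "url": "http://{}:{}/metrics".format(router_ip, port),
        "username": annotations["prometheus.openshift.io/username"],
        "password": annotations["prometheus.openshift.io/password"],
    }


if __name__ == "__main__":
    example = {
        "prometheus.io/port": "1936",
        "prometheus.io/scrape": "true",
        "prometheus.openshift.io/password": "IImoDqON02",
        "prometheus.openshift.io/username": "admin",
    }
    # 192.0.2.10 is a documentation-only placeholder address.
    print(scrape_target(example, "192.0.2.10"))
```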
1574-
endif::[]
15751561

1576-
. Once the haproxy-exporter containers (and your HAProxy router) have started,
1577-
point Prometheus to the sidecar container on port 9101 on the node where the
1578-
haproxy-exporter container is running:
1579-
+
1580-
====
1581-
----
1582-
$ haproxy_exporter_ip="<enter-ip-address-or-hostname>"
1583-
$ cat > haproxy-scraper.yml <<CFGEOF
1584-
---
1585-
global:
1586-
scrape_interval: "60s"
1587-
scrape_timeout: "10s"
1588-
# external_labels:
1589-
# source: openshift-router
1590-
1591-
scrape_configs:
1592-
- job_name: "haproxy"
1593-
target_groups:
1594-
- targets:
1595-
- "${haproxy_exporter_ip}:9101"
1596-
CFGEOF
1597-
1598-
$ # And start prometheus as you would normally using the above config file.
1599-
$ echo " - Example: prometheus -config.file=haproxy-scraper.yml "
1600-
$ echo " or you can start it as a container on {product-title}!!
1562+
The metrics port is set from the STATS_PORT, default 1936. You may need to configure your firewall to permit access.
1563+
Use the above username and password to access the metrics. The path is "/metrics".
16011564

1602-
$ echo " - Once the prometheus server is up, view the {product-title} HAProxy "
1603-
$ echo " router metrics at: http://<ip>:9090/consoles/haproxy.html "
16041565
----
1605-
====
1566+
$ curl <user>:<password>@<router_IP>:<STATS_PORT>/metrics
1567+
for example:
1568+
$ curl admin:[email protected]:1936/metrics
1569+
...
1570+
# HELP haproxy_backend_connections_total Total number of connections.
1571+
# TYPE haproxy_backend_connections_total gauge
1572+
haproxy_backend_connections_total{backend="http",namespace="default",route="hello-route"} 0
1573+
haproxy_backend_connections_total{backend="http",namespace="default",route="hello-route-alt"} 0
1574+
haproxy_backend_connections_total{backend="http",namespace="default",route="hello-route01"} 0
1575+
...
1576+
# HELP haproxy_exporter_server_threshold Number of servers tracked and the current threshold value.
1577+
# TYPE haproxy_exporter_server_threshold gauge
1578+
haproxy_exporter_server_threshold{type="current"} 11
1579+
haproxy_exporter_server_threshold{type="limit"} 500
1580+
...
1581+
# HELP haproxy_frontend_bytes_in_total Current total of incoming bytes.
1582+
# TYPE haproxy_frontend_bytes_in_total gauge
1583+
haproxy_frontend_bytes_in_total{frontend="fe_no_sni"} 0
1584+
haproxy_frontend_bytes_in_total{frontend="fe_sni"} 0
1585+
haproxy_frontend_bytes_in_total{frontend="public"} 119070
1586+
...
1587+
# HELP haproxy_server_bytes_in_total Current total of incoming bytes.
1588+
# TYPE haproxy_server_bytes_in_total gauge
1589+
haproxy_server_bytes_in_total{namespace="",pod="",route="",server="fe_no_sni",service=""} 0
1590+
haproxy_server_bytes_in_total{namespace="",pod="",route="",server="fe_sni",service=""} 0
1591+
haproxy_server_bytes_in_total{namespace="default",pod="docker-registry-5-nk5fz",route="docker-registry",server="10.130.0.89:5000",service="docker-registry"} 0
1592+
haproxy_server_bytes_in_total{namespace="default",pod="hello-rc-vkjqx",route="hello-route",server="10.130.0.90:8080",service="hello-svc-1"} 0
1593+
...
1594+
----
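Output like the sample above can be post-processed with a few lines of code. The following is a minimal sketch of a parser for the sample lines, not a complete implementation of the Prometheus text format: it skips comment lines and the `...` elisions, ignores lines carrying an optional timestamp, and does not unescape label values. The name `parse_samples` is hypothetical.

```python
import re

# metric_name{label="value",...} value
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (name, labels, value) tuples from metrics text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, "# HELP"/"# TYPE" comments, and "..." elisions.
        if not line or line.startswith("#") or line == "...":
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # unsupported line shape, e.g. with a timestamp
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    text = "\n".join([
        "# TYPE haproxy_frontend_bytes_in_total gauge",
        'haproxy_frontend_bytes_in_total{frontend="public"} 119070',
        "...",
        'haproxy_exporter_server_threshold{type="limit"} 500',
    ])
    for sample in parse_samples(text):
        print(sample)
```

For anything beyond quick inspection, an established Prometheus client library is likely a better choice than hand-rolled parsing.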
1595+
16061596

16071597
[[preventing-connection-failures-during-restarts]]
16081598
== Preventing Connection Failures During Restarts
