[enterprise-4.11] RHDEVDOCS-3919 - document size-based configuration for Prometheus metrics retention #45700

Merged
@@ -0,0 +1,133 @@
// Module included in the following assemblies:
//
// * monitoring/configuring-the-monitoring-stack.adoc

:_content-type: PROCEDURE
[id="modifying-retention-time-and-size-for-prometheus-metrics-data_{context}"]
= Modifying the retention time and size for Prometheus metrics data

By default, Prometheus automatically retains metrics data for 15 days.
You can modify the retention time to change how soon data is deleted by specifying a time value in the `retention` field.
You can also configure the maximum amount of disk space the retained metrics data uses by specifying a size value in the `retentionSize` field.
If the data reaches this size limit, Prometheus deletes the oldest data first until the disk space used is again below the limit.

Note the following behaviors of these data retention settings:

* The size-based retention policy applies to all data block directories in the `/prometheus` directory, including persistent blocks, write-ahead log (WAL) data, and m-mapped chunks.
* Data in the `/wal` and `/head_chunks` directories counts toward the retention size limit, but Prometheus never purges data from these directories based on size- or time-based retention policies.
Thus, if you set a retention size limit lower than the maximum size set for the `/wal` and `/head_chunks` directories, you have configured the system not to retain any data blocks in the `/prometheus` data directories.
* The size-based retention policy is applied only when Prometheus cuts a new data block, which occurs every two hours after the WAL contains at least three hours of data.
* If you do not explicitly define values for either `retention` or `retentionSize`, retention time defaults to 15 days, and retention size is not set.
* If you define values for both `retention` and `retentionSize`, both values apply.
If any data blocks exceed the defined retention time or the defined size limit, Prometheus purges these data blocks.
* If you define a value for `retentionSize` and do not define `retention`, only the `retentionSize` value applies.
* If you do not define a value for `retentionSize` and only define a value for `retention`, only the `retention` value applies.
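
The interaction of the two limits described above can be illustrated with a small simulation. This is a simplified sketch rather than the Prometheus implementation: the `Block` type, the `blocks_to_purge` function, and the hour-based ages are hypothetical names and units chosen for the example. It assumes that blocks are examined from newest to oldest, that WAL and head chunk data is added to the running total but never purged, and that passing `None` for a limit means the limit is not set.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical stand-in for one persistent data block."""
    age_hours: float  # age of the block, in hours
    size_bytes: int   # disk space used by the block


def blocks_to_purge(blocks, retention_hours=None, retention_size=None,
                    wal_and_head_bytes=0):
    """Return the blocks that a simplified retention pass would delete."""
    # Examine the newest blocks first so that the running total crosses the
    # size limit at the oldest blocks, which are therefore deleted first.
    ordered = sorted(blocks, key=lambda block: block.age_hours)
    # WAL and head chunk data counts toward the limit but is never purged.
    total = wal_and_head_bytes
    purge = []
    for block in ordered:
        total += block.size_bytes
        too_old = retention_hours is not None and block.age_hours > retention_hours
        too_big = retention_size is not None and total > retention_size
        if too_old or too_big:
            purge.append(block)
    return purge


blocks = [Block(2, 4), Block(10, 4), Block(30, 4)]
# Both limits set: the 30-hour block is too old and the 10-hour block
# pushes the running total (3 + 4 + 4) over the size limit of 10.
print(blocks_to_purge(blocks, retention_hours=24, retention_size=10,
                      wal_and_head_bytes=3))
```

In this model, setting `retention_size` below `wal_and_head_bytes` purges every block, which mirrors the case where the size limit is lower than the space used by the `/wal` and `/head_chunks` directories.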

.Prerequisites

* *If you are configuring core {product-title} monitoring components*:
** You have access to the cluster as a user with the `cluster-admin` role.
** You have created the `cluster-monitoring-config` `ConfigMap` object.
* *If you are configuring components that monitor user-defined projects*:
** A cluster administrator has enabled monitoring for user-defined projects.
** You have access to the cluster as a user with the `cluster-admin` role, or as a user with the `user-workload-monitoring-config-edit` role in the `openshift-user-workload-monitoring` project.
** You have created the `user-workload-monitoring-config` `ConfigMap` object.
* You have installed the OpenShift CLI (`oc`).

[WARNING]
====
Saving changes to a monitoring config map might redeploy the pods and other resources in the related project.
The running monitoring processes in that project might also restart.
====

.Procedure

. Edit the `ConfigMap` object:
** *To modify the retention time and size for the Prometheus instance that monitors core {product-title} projects*:
.. Edit the `cluster-monitoring-config` `ConfigMap` object in the `openshift-monitoring` project:
+
[source,terminal]
----
$ oc -n openshift-monitoring edit configmap cluster-monitoring-config
----

.. Add the retention time and size configuration under `data/config.yaml`:
+
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      retention: <time_specification> <1>
      retentionSize: <size_specification> <2>
----
+
<1> The retention time: a number directly followed by `ms` (milliseconds), `s` (seconds), `m` (minutes), `h` (hours), `d` (days), `w` (weeks), or `y` (years). You can also combine time values for specific times, such as `1h30m15s`.
<2> The retention size: a number directly followed by `B` (bytes), `KB` (kilobytes), `MB` (megabytes), `GB` (gigabytes), `TB` (terabytes), `PB` (petabytes), or `EB` (exabytes).
+
The following example sets the retention time to 24 hours and the retention size to 10 gigabytes for the Prometheus instance that monitors core {product-title} components:
+
[source,yaml,subs=quotes]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      retention: *24h*
      retentionSize: *10GB*
----

** *To modify the retention time and size for the Prometheus instance that monitors user-defined projects*:
.. Edit the `user-workload-monitoring-config` `ConfigMap` object in the `openshift-user-workload-monitoring` project:
+
[source,terminal]
----
$ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config
----

.. Add the retention time and size configuration under `data/config.yaml`:
+
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    prometheus:
      retention: <time_specification> <1>
      retentionSize: <size_specification> <2>
----
+
<1> The retention time: a number directly followed by `ms` (milliseconds), `s` (seconds), `m` (minutes), `h` (hours), `d` (days), `w` (weeks), or `y` (years).
You can also combine time values for specific times, such as `1h30m15s`.
<2> The retention size: a number directly followed by `B` (bytes), `KB` (kilobytes), `MB` (megabytes), `GB` (gigabytes), `TB` (terabytes), `PB` (petabytes), or `EB` (exabytes).
+
The following example sets the retention time to 24 hours and the retention size to 10 gigabytes for the Prometheus instance that monitors user-defined projects:
+
[source,yaml,subs=quotes]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    prometheus:
      retention: *24h*
      retentionSize: *10GB*
----

. Save the file to apply the changes. The pods affected by the new configuration restart automatically.
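
Before saving the config map, it can help to check that the `retention` and `retentionSize` values match the formats described in the callouts above. The following is a minimal sketch of such a check, not part of {product-title} or the `oc` CLI: the function names `parse_retention_time` and `parse_retention_size` are hypothetical. It assumes that time units must appear from largest to smallest, that a year is 365 days and a week is 7 days, and that sizes are whole numbers with case-sensitive unit suffixes.

```python
import re

# Units in descending order; each may appear at most once (assumption).
DURATION_RE = re.compile(
    r"(?:(\d+)y)?(?:(\d+)w)?(?:(\d+)d)?(?:(\d+)h)?"
    r"(?:(\d+)m)?(?:(\d+)s)?(?:(\d+)ms)?"
)
# Milliseconds per unit, in the same order as the regex groups.
MS_PER_UNIT = (31_536_000_000, 604_800_000, 86_400_000,
               3_600_000, 60_000, 1_000, 1)

SIZE_RE = re.compile(r"(\d+)(B|KB|MB|GB|TB|PB|EB)")


def parse_retention_time(spec):
    """Return the duration in milliseconds, or raise ValueError."""
    match = DURATION_RE.fullmatch(spec)
    # Every group is optional, so an empty string must be rejected explicitly.
    if not spec or match is None:
        raise ValueError(f"invalid retention time: {spec!r}")
    return sum(int(value) * ms
               for value, ms in zip(match.groups(), MS_PER_UNIT)
               if value is not None)


def parse_retention_size(spec):
    """Return a (number, unit) tuple, or raise ValueError."""
    match = SIZE_RE.fullmatch(spec)
    if match is None:
        raise ValueError(f"invalid retention size: {spec!r}")
    return int(match.group(1)), match.group(2)


print(parse_retention_time("1h30m15s"))
print(parse_retention_size("10GB"))
```

The size check deliberately returns the number and unit without converting to bytes, because the text above does not state whether the units are decimal or binary multiples.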

This file was deleted.

2 changes: 1 addition & 1 deletion monitoring/configuring-the-monitoring-stack.adoc
@@ -96,7 +96,7 @@ If you use a local volume for persistent storage, do not use a raw block volume,
====

include::modules/monitoring-configuring-a-local-persistent-volume-claim.adoc[leveloffset=+2]
-include::modules/monitoring-modifying-retention-time-for-prometheus-metrics-data.adoc[leveloffset=+2]
+include::modules/monitoring-modifying-retention-time-and-size-for-prometheus-metrics-data.adoc[leveloffset=+2]

[role="_additional-resources"]
.Additional resources