OBSDOCS-1751 incident detection ui plugin #90113

Merged
merged 1 commit into from
Mar 31, 2025
2 changes: 2 additions & 0 deletions _topic_maps/_topic_map.yml
@@ -2911,6 +2911,8 @@ Topics:
Topics:
- Name: Observability UI plugins overview
File: observability-ui-plugins-overview
- Name: Monitoring UI plugin
File: monitoring-ui-plugin
- Name: Logging UI plugin
File: logging-ui-plugin
- Name: Distributed tracing UI plugin
Binary file added images/coo-incident-alerts-components.png
Binary file added images/coo-incident-alerts-timeline.png
Binary file added images/coo-incidents-timeline-daily.png
Binary file added images/coo-incidents-timeline-weekly.png
Binary file added images/coo-incidents-timeline.png
2 changes: 1 addition & 1 deletion modules/coo-distributed-tracing-ui-plugin-install.adoc
@@ -3,7 +3,7 @@
// * observability/cluster_observability_operator/ui_plugins/distributed-tracing-ui-plugin.adoc

:_mod-docs-content-type: PROCEDURE
-[id="coo-distributed-tracing-ui-plugin-install-_{context}"]
+[id="coo-distributed-tracing-ui-plugin-install_{context}"]
= Installing the {coo-full} distributed tracing UI plugin


16 changes: 16 additions & 0 deletions modules/coo-incident-detection-overview.adoc
@@ -0,0 +1,16 @@
// Module included in the following assemblies:

// * observability/cluster_observability_operator/ui_plugins/incident-detection-ui-plugin.adoc

:_mod-docs-content-type: CONCEPT
[id="coo-incident-detection-overview_{context}"]
= {coo-full} incident detection overview

Clusters can generate significant volumes of monitoring data, making it hard for you to distinguish critical signals from noise.
A single incident can trigger a cascade of alerts, which extends the time needed to detect and resolve issues.

The {coo-full} incident detection feature groups related alerts into *incidents*. These incidents are then visualized as timelines that are color-coded by severity.
Alerts are mapped to specific components and grouped by severity, which helps you identify root causes by focusing on high-impact components first.
You can then drill down from the incident timelines to individual alerts to determine how to fix the underlying issue.

{coo-full} incident detection transforms the alert storm into clear steps for faster understanding and resolution of the incidents that occur on your clusters.
59 changes: 59 additions & 0 deletions modules/coo-incident-detection-using.adoc
@@ -0,0 +1,59 @@
// Module included in the following assemblies:

// * observability/cluster_observability_operator/ui_plugins/incident-detection-ui-plugin.adoc

:_mod-docs-content-type: PROCEDURE
[id="coo-incident-detection-using_{context}"]
= Using {coo-full} incident detection

.Prerequisites

* You have access to the cluster as a user with the `cluster-admin` cluster role.
* You have logged in to the {product-title} web console.
* You have installed the {coo-full}.
* You have installed the {coo-full} monitoring UI plugin with incident detection enabled.


.Procedure

. In the Administrator perspective of the web console, click *Observe* -> *Incidents*.

. The Incidents Timeline UI shows the grouping of alerts into *incidents*. The color coding of the lines in the graph corresponds to the severity of the incident. By default, a seven-day timeline is presented.
+
image::coo-incidents-timeline-weekly.png[Weekly incidents timeline]
+
[NOTE]
====
After you enable incident detection, it takes at least 10 minutes to process the correlations and display the timeline.

Alerts are analyzed and grouped into incidents only if they are firing after you enable this feature. Alerts that were resolved before you enabled the feature are not included.
====

. Zoom in to a 1-day view by selecting the duration from the drop-down list.
+
image::coo-incidents-timeline-daily.png[Daily incidents timeline]

. Click an incident to see the timeline of alerts that are part of that incident in the Alerts Timeline UI.
+
image::coo-incident-alerts-timeline.png[Incidents alerts timeline]

. In the list of alerts that follows, alerts are mapped to specific components, which are grouped by severity.
+
image::coo-incident-alerts-components.png[Incidents alerts components]

. Click to expand a compute component in the list. The underlying alerts related to that component are displayed.
+
image::coo-incident-alerts-components-expanded.png[Incidents expanded components]

. Click the link for a firing alert to see detailed information about that alert.



[NOTE]
====
**Known issues**

* Depending on the order of the timeline bars, the tooltip might overlap and hide the underlying bar. You can still click the bar and select the incident or alert.

* The Silence Alert button in the **Incidents** -> **Component** section does not pre-populate the fields and is not usable. As a workaround, you can use the same menu and the Silence Alert button in the **Alerting** section instead.
====
42 changes: 42 additions & 0 deletions modules/coo-monitoring-ui-plugin-install.adoc
@@ -0,0 +1,42 @@
// Module included in the following assemblies:

// * observability/cluster_observability_operator/ui_plugins/monitoring-ui-plugin.adoc

:_mod-docs-content-type: PROCEDURE
[id="coo-monitoring-ui-plugin-install_{context}"]
= Installing the {coo-full} monitoring UI plugin

The monitoring UI plugin adds monitoring-related UI features to the OpenShift web console, for the Advanced Cluster Management (ACM) perspective and for incident detection.

.Prerequisites

* You have access to the cluster as a user with the `cluster-admin` cluster role.
* You have logged in to the {product-title} web console.
* You have installed the {coo-full}.

.Procedure

. In the {product-title} web console, click *Operators* -> *Installed Operators* and select {coo-full}.
. Choose the *UI Plugin* tab (at the far right of the tab list) and click *Create UIPlugin*.
. Select *YAML view*, enter the following content, and then click *Create*:
+
[source,yaml]
----
apiVersion: observability.openshift.io/v1alpha1
kind: UIPlugin
metadata:
  name: monitoring
spec:
  type: Monitoring
  monitoring:
    acm: # <1>
      enabled: true
      alertmanager:
        url: 'https://alertmanager.open-cluster-management-observability.svc:9095'
      thanosQuerier:
        url: 'https://rbac-query-proxy.open-cluster-management-observability.svc:8443'
    incidents: # <2>
      enabled: true
<1> Enable {rh-rhacm} features. You must configure the Alertmanager and ThanosQuerier Service endpoints.
<2> Enable incident detection features.
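
If you do not use {rh-rhacm}, a `UIPlugin` that enables only incident detection might look like the following minimal sketch. It assumes that the `acm` stanza can be omitted, which the example above does not confirm. Check the `UIPlugin` custom resource definition installed on your cluster before relying on it.

[source,yaml]
----
# Sketch only: assumes that spec.monitoring.acm is optional.
apiVersion: observability.openshift.io/v1alpha1
kind: UIPlugin
metadata:
  name: monitoring
spec:
  type: Monitoring
  monitoring:
    incidents:
      enabled: true # same field as callout <2> in the full example
----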
@@ -0,0 +1,24 @@
:_mod-docs-content-type: ASSEMBLY
[id="monitoring-ui-plugin"]
= Monitoring UI plugin
include::_attributes/common-attributes.adoc[]
:context: monitoring-ui-plugin

toc::[]

:FeatureName: The {coo-full} monitoring UI plugin
include::snippets/technology-preview.adoc[leveloffset=+2]

The monitoring UI plugin adds monitoring features to the Administrator perspective of the OpenShift web console.

* **{rh-rhacm}:** The monitoring plugin in {coo-first} allows it to function in {rh-rhacm-first} environments, providing {rh-rhacm} with the same alerting capabilities as {product-title}. You can configure the plugin to fetch alerts from the {rh-rhacm} Alertmanager backend. This enables seamless integration and user experience by aligning {rh-rhacm} and {product-title} monitoring workflows.

* **Incident detection:** The incident detection feature groups related alerts into incidents, to help you identify the root causes of alert bursts, instead of being overwhelmed by individual alerts. It presents a timeline of incidents, color-coded by severity, and you can drill down into the individual alerts within an incident. The system also categorizes alerts by affected component, grouped by severity. This helps you focus on the most critical areas first.
+
The incident detection feature is available in the Administrator perspective of the OpenShift web console at **Observe** → **Incidents**.

include::modules/coo-monitoring-ui-plugin-install.adoc[leveloffset=+1]

include::modules/coo-incident-detection-overview.adoc[leveloffset=+1]

include::modules/coo-incident-detection-using.adoc[leveloffset=+1]
@@ -9,7 +9,16 @@ toc::[]
You can use the {coo-first} to install and manage UI plugins to enhance the observability capabilities of the {product-title} web console.
The plugins extend the default functionality, providing new UI features for troubleshooting, distributed tracing, and cluster logging.

[id="monitoring_{context}"]
== Monitoring

The monitoring UI plugin adds monitoring-related UI features to the OpenShift web console, for the Advanced Cluster Management (ACM) perspective and for incident detection.

* **ACM:** The monitoring plugin in {coo-first} allows it to function in {rh-rhacm-first} environments, providing ACM with the same monitoring capabilities as {product-title}.

* **Incident Detection:** The incident detection feature groups alerts into incidents to help you identify the root causes of alert bursts instead of being overwhelmed by individual alerts. It presents a timeline of incidents, color-coded by severity, and you can drill down into the individual alerts within an incident. The system also categorizes alerts by affected component to help you focus on the most critical areas first.

For more information, see the xref:../../../observability/cluster_observability_operator/ui_plugins/monitoring-ui-plugin.adoc#monitoring-ui-plugin[monitoring UI plugin] page.

[id="cluster-logging_{context}"]
== Cluster logging
@@ -19,7 +28,6 @@ You can specify filters, queries, time ranges and refresh rates. The results dis

For more information, see the xref:../../../observability/cluster_observability_operator/ui_plugins/logging-ui-plugin.adoc#logging-ui-plugin[logging UI plugin] page.


[id="troubleshooting_{context}"]
== Troubleshooting

@@ -46,7 +54,6 @@ You can select a supported `TempoStack` or `TempoMonolithic` multi-tenant instan

For more information, see the xref:../../../observability/cluster_observability_operator/ui_plugins/distributed-tracing-ui-plugin.adoc#distributed-tracing-ui-plugin[distributed tracing UI plugin] page.


////
[id="dashboards_{context}"]
== Dashboards
@@ -57,4 +64,4 @@ This results in a unified observability experience across different data sources

For more information, see the xref :../../../observability/cluster_observability_operator/ui_plugins/dashboard-ui-plugin.adoc#dashboard-ui-plugin[dashboard UI plugin] page.

////
////