Commit 9a02e92

Merge pull request #90113 from gabriel-rh/OBSDOCS-1751
OBSDOCS-1751 incident detection ui plugin
2 parents 630446a + b9244bd commit 9a02e92

13 files changed

+154
-4
lines changed

_topic_maps/_topic_map.yml

+2
Original file line numberDiff line numberDiff line change
@@ -2919,6 +2919,8 @@ Topics:
29192919
Topics:
29202920
- Name: Observability UI plugins overview
29212921
File: observability-ui-plugins-overview
2922+
- Name: Monitoring UI plugin
2923+
File: monitoring-ui-plugin
29222924
- Name: Logging UI plugin
29232925
File: logging-ui-plugin
29242926
- Name: Distributed tracing UI plugin
Binary image files added (file names not captured; previews omitted): 16.8 KB, 33.1 KB, 7.14 KB, 9.97 KB

images/coo-incidents-timeline.png

17.5 KB

modules/coo-distributed-tracing-ui-plugin-install.adoc

+1-1
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@
33
// * observability/cluster_observability_operator/ui_plugins/distributed-tracing-ui-plugin.adoc
44

55
:_mod-docs-content-type: PROCEDURE
6-
[id="coo-distributed-tracing-ui-plugin-install-_{context}"]
6+
[id="coo-distributed-tracing-ui-plugin-install_{context}"]
77
= Installing the {coo-full} distributed tracing UI plugin
88

99

Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
// Module included in the following assemblies:
2+
3+
// * observability/cluster_observability_operator/ui_plugins/incident-detection-ui-plugin.adoc
4+
5+
:_mod-docs-content-type: CONCEPT
6+
[id="coo-incident-detection-overview_{context}"]
7+
= {coo-full} incident detection overview
8+
9+
Clusters can generate significant volumes of monitoring data, making it hard for you to distinguish critical signals from noise.
10+
A single incident can trigger a cascade of alerts, which extends the time needed to detect and resolve issues.
11+
12+
The {coo-full} incident detection feature groups related alerts into *incidents*. These incidents are then visualized as timelines that are color-coded by severity.
13+
Alerts are mapped to specific components and grouped by severity, which helps you identify root causes by focusing on high-impact components first.
14+
You can then drill down from the incident timelines to individual alerts to determine how to fix the underlying issue.
15+
16+
{coo-full} incident detection transforms the alert storm into clear steps for faster understanding and resolution of the incidents that occur on your clusters.
+59
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,59 @@
1+
// Module included in the following assemblies:
2+
3+
// * observability/cluster_observability_operator/ui_plugins/incident-detection-ui-plugin.adoc
4+
5+
:_mod-docs-content-type: PROCEDURE
6+
[id="coo-incident-detection-using_{context}"]
7+
= Using {coo-full} incident detection
8+
9+
.Prerequisites
10+
11+
* You have access to the cluster as a user with the `cluster-admin` cluster role.
12+
* You have logged in to the {product-title} web console.
13+
* You have installed the {coo-full}.
14+
* You have installed the {coo-full} monitoring UI plugin with incident detection enabled.
15+
16+
17+
.Procedure
18+
19+
. In the Administrator perspective of the web console, click *Observe* -> *Incidents*.
20+
21+
. The Incidents Timeline UI shows the grouping of alerts into *incidents*. The color coding of the lines in the graph corresponds to the severity of the incident. By default, a seven-day timeline is presented.
22+
+
23+
image::coo-incidents-timeline-weekly.png[Weekly incidents timeline]
24+
+
25+
[NOTE]
26+
====
27+
After you enable incident detection, it takes at least 10 minutes to process the correlations and display the timeline.
28+
29+
The analysis and grouping into incidents is performed only for alerts that are firing after you enable this feature. Alerts that were resolved before you enabled the feature are not included.
30+
====
31+
32+
. To zoom in to a 1-day view, select the duration from the drop-down list.
33+
+
34+
image::coo-incidents-timeline-daily.png[Daily incidents timeline]
35+
36+
. Click an incident to see the timeline of alerts that are part of that incident in the Alerts Timeline UI.
37+
+
38+
image::coo-incident-alerts-timeline.png[Incidents alerts timeline]
39+
40+
. In the list of alerts that follows, alerts are mapped to specific components, which are grouped by severity.
41+
+
42+
image::coo-incident-alerts-components.png[Incidents alerts components]
43+
44+
. Click to expand a compute component in the list. The underlying alerts related to that component are displayed.
45+
+
46+
image::coo-incident-alerts-components-expanded.png[Incidents expanded components]
47+
48+
. Click the link for a firing alert to see detailed information about that alert.
49+
50+
51+
52+
[NOTE]
53+
====
54+
**Known issues**
55+
56+
* Depending on the order of the timeline bars, the tooltip might overlap and hide the underlying bar. You can still click the bar and select the incident or alert.
57+
58+
* The Silence Alert button in the **Incidents** -> **Component** section does not pre-populate the fields and is not usable. As a workaround, you can use the same menu and the Silence Alert button in the **Alerting** section instead.
59+
====
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,42 @@
1+
// Module included in the following assemblies:
2+
3+
// * observability/cluster_observability_operator/ui_plugins/monitoring-ui-plugin.adoc
4+
5+
:_mod-docs-content-type: PROCEDURE
6+
[id="coo-monitoring-ui-plugin-install_{context}"]
7+
= Installing the {coo-full} monitoring UI plugin
8+
9+
The monitoring UI plugin adds monitoring-related UI features to the OpenShift web console, for the Advanced Cluster Management (ACM) perspective and for incident detection.
10+
11+
.Prerequisites
12+
13+
* You have access to the cluster as a user with the `cluster-admin` cluster role.
14+
* You have logged in to the {product-title} web console.
15+
* You have installed the {coo-full}.
16+
17+
.Procedure
18+
19+
. In the {product-title} web console, click *Operators* -> *Installed Operators* and select {coo-full}.
20+
. Choose the *UI Plugin* tab (at the far right of the tab list) and press *Create UIPlugin*.
21+
. Select *YAML view*, enter the following content, and then press *Create*:
22+
+
23+
[source,yaml]
24+
----
25+
apiVersion: observability.openshift.io/v1alpha1
26+
kind: UIPlugin
27+
metadata:
28+
  name: monitoring
29+
spec:
30+
  type: Monitoring
31+
  monitoring:
32+
    acm: # <1>
33+
      enabled: true
34+
      alertmanager:
35+
        url: 'https://alertmanager.open-cluster-management-observability.svc:9095'
36+
      thanosQuerier:
37+
        url: 'https://rbac-query-proxy.open-cluster-management-observability.svc:8443'
38+
    incidents: # <2>
39+
      enabled: true
40+
----
41+
<1> Enable {rh-rhacm} features. You must configure the Alertmanager and ThanosQuerier Service endpoints.
42+
<2> Enable incident detection features.
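
If the cluster is not managed through {rh-rhacm}, a smaller resource that enables only incident detection may be enough. The following is a minimal sketch, not a verified configuration: it assumes that the `acm` block is optional and can be omitted. Check the `UIPlugin` custom resource definition for your installed {coo-full} version before relying on it.

[source,yaml]
----
apiVersion: observability.openshift.io/v1alpha1
kind: UIPlugin
metadata:
  name: monitoring
spec:
  type: Monitoring
  monitoring:
    # Assumption: the acm block is optional, so it is left out here.
    incidents:
      enabled: true
----

As noted in the procedure for using incident detection, the timeline takes at least 10 minutes to appear after the feature is enabled.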
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,24 @@
1+
:_mod-docs-content-type: ASSEMBLY
2+
[id="monitoring-ui-plugin"]
3+
= Monitoring UI plugin
4+
include::_attributes/common-attributes.adoc[]
5+
:context: monitoring-ui-plugin
6+
7+
toc::[]
8+
9+
:FeatureName: The {coo-full} monitoring UI plugin
10+
include::snippets/technology-preview.adoc[leveloffset=+2]
11+
12+
The monitoring UI plugin adds monitoring features to the Administrator perspective of the OpenShift web console.
13+
14+
* **{rh-rhacm}:** The monitoring plugin in {coo-first} allows it to function in {rh-rhacm-first} environments, providing {rh-rhacm} with the same alerting capabilities as {product-title}. You can configure the plugin to fetch alerts from the {rh-rhacm} Alertmanager backend. This enables seamless integration and user experience by aligning {rh-rhacm} and {product-title} monitoring workflows.
15+
16+
* **Incident detection:** The incident detection feature groups related alerts into incidents, to help you identify the root causes of alert bursts, instead of being overwhelmed by individual alerts. It presents a timeline of incidents, color-coded by severity, and you can drill down into the individual alerts within an incident. The system also categorizes alerts by affected component, grouped by severity. This helps you focus on the most critical areas first.
17+
+
18+
The incident detection feature is available in the Administrator perspective of the OpenShift web console at **Observe** -> **Incidents**.
19+
20+
include::modules/coo-monitoring-ui-plugin-install.adoc[leveloffset=+1]
21+
22+
include::modules/coo-incident-detection-overview.adoc[leveloffset=+1]
23+
24+
include::modules/coo-incident-detection-using.adoc[leveloffset=+1]

observability/cluster_observability_operator/ui_plugins/observability-ui-plugins-overview.adoc

+10-3
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,16 @@ toc::[]
99
You can use the {coo-first} to install and manage UI plugins to enhance the observability capabilities of the {product-title} web console.
1010
The plugins extend the default functionality, providing new UI features for troubleshooting, distributed tracing, and cluster logging.
1111

12+
[id="monitoring_{context}"]
13+
== Monitoring
1214

15+
The monitoring UI plugin adds monitoring-related UI features to the OpenShift web console, for the Advanced Cluster Management (ACM) perspective and for incident detection.
16+
17+
* **ACM:** The monitoring plugin in {coo-first} allows it to function in {rh-rhacm-first} environments, providing ACM with the same monitoring capabilities as {product-title}.
18+
19+
* **Incident Detection:** The incident detection feature groups alerts into incidents to help you identify the root causes of alert bursts instead of being overwhelmed by individual alerts. It presents a timeline of incidents, color-coded by severity, and you can drill down into the individual alerts within an incident. The system also categorizes alerts by affected component to help you focus on the most critical areas first.
20+
21+
For more information, see the xref:../../../observability/cluster_observability_operator/ui_plugins/monitoring-ui-plugin.adoc#monitoring-ui-plugin[monitoring UI plugin] page.
1322

1423
[id="cluster-logging_{context}"]
1524
== Cluster logging
@@ -19,7 +28,6 @@ You can specify filters, queries, time ranges and refresh rates. The results dis
1928

2029
For more information, see the xref:../../../observability/cluster_observability_operator/ui_plugins/logging-ui-plugin.adoc#logging-ui-plugin[logging UI plugin] page.
2130

22-
2331
[id="troubleshooting_{context}"]
2432
== Troubleshooting
2533

@@ -46,7 +54,6 @@ You can select a supported `TempoStack` or `TempoMonolithic` multi-tenant instan
4654

4755
For more information, see the xref:../../../observability/cluster_observability_operator/ui_plugins/distributed-tracing-ui-plugin.adoc#distributed-tracing-ui-plugin[distributed tracing UI plugin] page.
4856

49-
5057
////
5158
[id="dashboards_{context}"]
5259
== Dashboards
@@ -57,4 +64,4 @@ This results in a unified observability experience across different data sources
5764
5865
For more information, see the xref :../../../observability/cluster_observability_operator/ui_plugins/dashboard-ui-plugin.adoc#dashboard-ui-plugin[dashboard UI plugin] page.
5966
60-
////
67+
////
