Skip to content

OCPBUGS-12903: Add new web console usage metrics #1910

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
## 4.14

- [#1937](https://github.com/openshift/cluster-monitoring-operator/pull/1937) Disables btrfs collector
- [#1910](https://github.com/openshift/cluster-monitoring-operator/pull/1910) Add new web console usage metrics

## 4.13

Expand All @@ -15,7 +16,7 @@
- [#1882](https://github.com/openshift/cluster-monitoring-operator/issues/1882) Allow configuring secrets in alertmanager component (platform)
- [#1876](https://github.com/openshift/cluster-monitoring-operator/pull/1876) Add nodeExporter.collectors.tcpstat settings.
- [#1888](https://github.com/openshift/cluster-monitoring-operator/pull/1888) Add nodeExporter.collectors.netdev settings.
- [#1884](https://github.com/openshift/cluster-monitoring-operator/issues/1882) Allow configuring secrets in alertmanager component (UWM)
- [#1884](https://github.com/openshift/cluster-monitoring-operator/issues/1884) Allow configuring secrets in alertmanager component (UWM)
- [#1893](https://github.com/openshift/cluster-monitoring-operator/pull/1893) Add nodeExporter.collectors.netclass settings.
- [#1894](https://github.com/openshift/cluster-monitoring-operator/pull/1894) Add toggle netlink implementation of netclass collector in Node Exporter.
- [#1891](https://github.com/openshift/cluster-monitoring-operator/pull/1891) Add nodeExporter.collectors.buddyinfo settings.
Expand Down
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -112,7 +112,7 @@ This targets needs `docker` to be installed on host and was not tested with othe

## Testing

Supposing $KUBECONFIOG is set to the config file of a Kubernetes cluster running the codes to test. We can run all tests by using this command `make test`.
Supposing $KUBECONFIG is set to the config file of a Kubernetes cluster running the codes to test. We can run all tests by using this command `make test`.

The testing consist of 3 aspects:
- unit tests, can be run seperately by `make test-unit`.
Expand Down
50 changes: 50 additions & 0 deletions Documentation/data-collection.md
Original file line number Diff line number Diff line change
Expand Up @@ -417,6 +417,56 @@ data:
# consumers: (@openshift/openshift-team-cluster-manager)
- '{__name__="console_url"}'
#
# owners: (@openshift/hybrid-application-console-maintainers)
# cluster:console_auth_login_requests_total:sum gives the total number of login requests initiated from the web console.
#
- '{__name__="cluster:console_auth_login_requests_total:sum"}'
#
# owners: (@openshift/hybrid-application-console-maintainers)
# cluster:console_auth_login_successes_total:sum gives the total number of successful logins initiated from the web console.
# Labels:
# * `role`, one of `kubeadmin`, `cluster-admin` or `developer`. The value is based whether or not the logged-in user can list all namespaces.
#
- '{__name__="cluster:console_auth_login_successes_total:sum"}'
#
# owners: (@openshift/hybrid-application-console-maintainers)
# cluster:console_auth_login_failures_total:sum gives the total number of login failures initiated from the web console.
# This might include canceled OAuth logins depenending on the user OAuth provider/configuration.
# Labels:
# * `reason`, currently always `unknown`
#
- '{__name__="cluster:console_auth_login_failures_total:sum"}'
#
# owners: (@openshift/hybrid-application-console-maintainers)
# cluster:console_auth_logout_requests_total:sum gives the total number of logout requests send from the web console.
# Labels:
# * `reason`, currently always `unknown`
#
- '{__name__="cluster:console_auth_logout_requests_total:sum"}'
#
# owners: (@openshift/hybrid-application-console-maintainers)
# cluster:console_usage_users:max contains the number of web console users splitten into the roles.
# Labels:
# * `role`: `kubeadmin`, `cluster-admin` or `developer`. The value is based whether or not the user can list all namespaces.
#
- '{__name__="cluster:console_usage_users:max"}'
#
# owners: (@openshift/hybrid-application-console-maintainers)
# cluster:console_plugins_info:max reports information about the web console plugins and their state.
# Labels:
# * `name`: `redhat`, `demo` or `other`.
# * `state`: `enabled`, `disabled` or `notfound`
#
- '{__name__="cluster:console_plugins_info:max"}'
#
# owners: (@openshift/hybrid-application-console-maintainers)
# cluster:console_customization_perspectives_info:max reports information about customized web console perspectives.
# Labels:
# * `name`, one of `admin`, `dev`, `acm` or `other`
# * `state`, one of `enabled`, `disabled`, `only-for-cluster-admins`, `only-for-developers` or `custom-permissions`
#
- '{__name__="cluster:console_customization_perspectives_info:max"}'
#
# owners: (@openshift/networking)
#
# ovnkube_master_egress_routing_via_host" informs if the OVN-K cluster's gateway mode is
Expand Down
2 changes: 1 addition & 1 deletion Documentation/sample-metrics.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ return the full set of metrics that the Telemeter client captures:

[embedmd]:# (telemetry/telemeter_query txt)
```txt
{__name__=~"cluster:usage:.*|count:up0|count:up1|cluster_version|cluster_version_available_updates|cluster_version_capability|cluster_operator_up|cluster_operator_conditions|cluster_version_payload|cluster_installer|cluster_infrastructure_provider|cluster_feature_set|instance:etcd_object_counts:sum|ALERTS|code:apiserver_request_total:rate:sum|cluster:capacity_cpu_cores:sum|cluster:capacity_memory_bytes:sum|cluster:cpu_usage_cores:sum|cluster:memory_usage_bytes:sum|openshift:cpu_usage_cores:sum|openshift:memory_usage_bytes:sum|workload:cpu_usage_cores:sum|workload:memory_usage_bytes:sum|cluster:virt_platform_nodes:sum|cluster:node_instance_type_count:sum|cnv:vmi_status_running:count|cluster:vmi_request_cpu_cores:sum|node_role_os_version_machine:cpu_capacity_cores:sum|node_role_os_version_machine:cpu_capacity_sockets:sum|subscription_sync_total|olm_resolution_duration_seconds|csv_succeeded|csv_abnormal|cluster:kube_persistentvolumeclaim_resource_requests_storage_bytes:provisioner:sum|cluster:kubelet_volume_stats_used_bytes:provisioner:sum|ceph_cluster_total_bytes|ceph_cluster_total_used_raw_bytes|ceph_health_status|odf_system_raw_capacity_total_bytes|odf_system_raw_capacity_used_bytes|odf_system_health_status|job:ceph_osd_metadata:count|job:kube_pv:count|job:odf_system_pvs:count|job:ceph_pools_iops:total|job:ceph_pools_iops_bytes:total|job:ceph_versions_running:count|job:noobaa_total_unhealthy_buckets:sum|job:noobaa_bucket_count:sum|job:noobaa_total_object_count:sum|odf_system_bucket_count|odf_system_objects_total|noobaa_accounts_num|noobaa_total_usage|console_url|cluster:ovnkube_master_egress_routing_via_host:max|cluster:network_attachment_definition_instances:max|cluster:network_attachment_definition_enabled_instance_up:max|cluster:ingress_controller_aws_nlb_active:sum|cluster:route_metrics_controller_routes_per_shard:min|cluster:route_metrics_controller_routes_per_shard:max|cluster:route_metrics_controller_routes_per_shard:avg|cluster:route_metrics_controller_routes_per_shard:median|cluster:openshift_route_info:tls_termination:sum|insightsclient_request_send_total|cam_app_workload_migrations|cluster:apiserver_current_inflight_requests:sum:max_over_time:2m|cluster:alertmanager_integrations:max|cluster:telemetry_selected_series:count|openshift:prometheus_tsdb_head_series:sum|openshift:prometheus_tsdb_head_samples_appended_total:sum|monitoring:container_memory_working_set_bytes:sum|namespace_job:scrape_series_added:topk3_sum1h|namespace_job:scrape_samples_post_metric_relabeling:topk3|monitoring:haproxy_server_http_responses_total:sum|rhmi_status|status:upgrading:version:rhoam_state:max|state:rhoam_critical_alerts:max|state:rhoam_warning_alerts:max|rhoam_7d_slo_percentile:max|rhoam_7d_slo_remaining_error_budget:max|cluster_legacy_scheduler_policy|cluster_master_schedulable|che_workspace_status|che_workspace_started_total|che_workspace_failure_total|che_workspace_start_time_seconds_sum|che_workspace_start_time_seconds_count|cco_credentials_mode|cluster:kube_persistentvolume_plugin_type_counts:sum|visual_web_terminal_sessions_total|acm_managed_cluster_info|cluster:vsphere_vcenter_info:sum|cluster:vsphere_esxi_version_total:sum|cluster:vsphere_node_hw_version_total:sum|openshift:build_by_strategy:sum|rhods_aggregate_availability|rhods_total_users|instance:etcd_disk_wal_fsync_duration_seconds:histogram_quantile|instance:etcd_mvcc_db_total_size_in_bytes:sum|instance:etcd_network_peer_round_trip_time_seconds:histogram_quantile|instance:etcd_mvcc_db_total_size_in_use_in_bytes:sum|instance:etcd_disk_backend_commit_duration_seconds:histogram_quantile|jaeger_operator_instances_storage_types|jaeger_operator_instances_strategies|jaeger_operator_instances_agent_strategies|appsvcs:cores_by_product:sum|nto_custom_profiles:count|openshift_csi_share_configmap|openshift_csi_share_secret|openshift_csi_share_mount_failures_total|openshift_csi_share_mount_requests_total|cluster:velero_backup_total:max|cluster:velero_restore_total:max|eo_es_storage_info|eo_es_redundancy_policy_info|eo_es_defined_delete_namespaces_total|eo_es_misconfigured_memory_resources_info|cluster:eo_es_data_nodes_total:max|cluster:eo_es_documents_created_total:sum|cluster:eo_es_documents_deleted_total:sum|pod:eo_es_shards_total:max|eo_es_cluster_management_state_info|imageregistry:imagestreamtags_count:sum|imageregistry:operations_count:sum|log_logging_info|log_collector_error_count_total|log_forwarder_pipeline_info|log_forwarder_input_info|log_forwarder_output_info|cluster:log_collected_bytes_total:sum|cluster:log_logged_bytes_total:sum|cluster:kata_monitor_running_shim_count:sum|platform:hypershift_hostedclusters:max|platform:hypershift_nodepools:max|namespace:noobaa_unhealthy_bucket_claims:max|namespace:noobaa_buckets_claims:max|namespace:noobaa_unhealthy_namespace_resources:max|namespace:noobaa_namespace_resources:max|namespace:noobaa_unhealthy_namespace_buckets:max|namespace:noobaa_namespace_buckets:max|namespace:noobaa_accounts:max|namespace:noobaa_usage:max|namespace:noobaa_system_health_status:max|ocs_advanced_feature_usage|os_image_url_override:sum|cluster:vsphere_topology_tags:max|cluster:vsphere_infrastructure_failure_domains:max|cluster:vsphere_csi_migration:max",alertstate=~"firing|",quantile=~"0.99|0.99|0.99|",system_type=~"OCS|OCS|",system_vendor=~"Red Hat|Red Hat|"}
{__name__=~"cluster:usage:.*|count:up0|count:up1|cluster_version|cluster_version_available_updates|cluster_version_capability|cluster_operator_up|cluster_operator_conditions|cluster_version_payload|cluster_installer|cluster_infrastructure_provider|cluster_feature_set|instance:etcd_object_counts:sum|ALERTS|code:apiserver_request_total:rate:sum|cluster:capacity_cpu_cores:sum|cluster:capacity_memory_bytes:sum|cluster:cpu_usage_cores:sum|cluster:memory_usage_bytes:sum|openshift:cpu_usage_cores:sum|openshift:memory_usage_bytes:sum|workload:cpu_usage_cores:sum|workload:memory_usage_bytes:sum|cluster:virt_platform_nodes:sum|cluster:node_instance_type_count:sum|cnv:vmi_status_running:count|cluster:vmi_request_cpu_cores:sum|node_role_os_version_machine:cpu_capacity_cores:sum|node_role_os_version_machine:cpu_capacity_sockets:sum|subscription_sync_total|olm_resolution_duration_seconds|csv_succeeded|csv_abnormal|cluster:kube_persistentvolumeclaim_resource_requests_storage_bytes:provisioner:sum|cluster:kubelet_volume_stats_used_bytes:provisioner:sum|ceph_cluster_total_bytes|ceph_cluster_total_used_raw_bytes|ceph_health_status|odf_system_raw_capacity_total_bytes|odf_system_raw_capacity_used_bytes|odf_system_health_status|job:ceph_osd_metadata:count|job:kube_pv:count|job:odf_system_pvs:count|job:ceph_pools_iops:total|job:ceph_pools_iops_bytes:total|job:ceph_versions_running:count|job:noobaa_total_unhealthy_buckets:sum|job:noobaa_bucket_count:sum|job:noobaa_total_object_count:sum|odf_system_bucket_count|odf_system_objects_total|noobaa_accounts_num|noobaa_total_usage|console_url|cluster:console_auth_login_requests_total:sum|cluster:console_auth_login_successes_total:sum|cluster:console_auth_login_failures_total:sum|cluster:console_auth_logout_requests_total:sum|cluster:console_usage_users:max|cluster:console_plugins_info:max|cluster:console_customization_perspectives_info:max|cluster:ovnkube_master_egress_routing_via_host:max|cluster:network_attachment_definition_instances:max|cluster:network_attachment_definition_enabled_instance_up:max|cluster:ingress_controller_aws_nlb_active:sum|cluster:route_metrics_controller_routes_per_shard:min|cluster:route_metrics_controller_routes_per_shard:max|cluster:route_metrics_controller_routes_per_shard:avg|cluster:route_metrics_controller_routes_per_shard:median|cluster:openshift_route_info:tls_termination:sum|insightsclient_request_send_total|cam_app_workload_migrations|cluster:apiserver_current_inflight_requests:sum:max_over_time:2m|cluster:alertmanager_integrations:max|cluster:telemetry_selected_series:count|openshift:prometheus_tsdb_head_series:sum|openshift:prometheus_tsdb_head_samples_appended_total:sum|monitoring:container_memory_working_set_bytes:sum|namespace_job:scrape_series_added:topk3_sum1h|namespace_job:scrape_samples_post_metric_relabeling:topk3|monitoring:haproxy_server_http_responses_total:sum|rhmi_status|status:upgrading:version:rhoam_state:max|state:rhoam_critical_alerts:max|state:rhoam_warning_alerts:max|rhoam_7d_slo_percentile:max|rhoam_7d_slo_remaining_error_budget:max|cluster_legacy_scheduler_policy|cluster_master_schedulable|che_workspace_status|che_workspace_started_total|che_workspace_failure_total|che_workspace_start_time_seconds_sum|che_workspace_start_time_seconds_count|cco_credentials_mode|cluster:kube_persistentvolume_plugin_type_counts:sum|visual_web_terminal_sessions_total|acm_managed_cluster_info|cluster:vsphere_vcenter_info:sum|cluster:vsphere_esxi_version_total:sum|cluster:vsphere_node_hw_version_total:sum|openshift:build_by_strategy:sum|rhods_aggregate_availability|rhods_total_users|instance:etcd_disk_wal_fsync_duration_seconds:histogram_quantile|instance:etcd_mvcc_db_total_size_in_bytes:sum|instance:etcd_network_peer_round_trip_time_seconds:histogram_quantile|instance:etcd_mvcc_db_total_size_in_use_in_bytes:sum|instance:etcd_disk_backend_commit_duration_seconds:histogram_quantile|jaeger_operator_instances_storage_types|jaeger_operator_instances_strategies|jaeger_operator_instances_agent_strategies|appsvcs:cores_by_product:sum|nto_custom_profiles:count|openshift_csi_share_configmap|openshift_csi_share_secret|openshift_csi_share_mount_failures_total|openshift_csi_share_mount_requests_total|cluster:velero_backup_total:max|cluster:velero_restore_total:max|eo_es_storage_info|eo_es_redundancy_policy_info|eo_es_defined_delete_namespaces_total|eo_es_misconfigured_memory_resources_info|cluster:eo_es_data_nodes_total:max|cluster:eo_es_documents_created_total:sum|cluster:eo_es_documents_deleted_total:sum|pod:eo_es_shards_total:max|eo_es_cluster_management_state_info|imageregistry:imagestreamtags_count:sum|imageregistry:operations_count:sum|log_logging_info|log_collector_error_count_total|log_forwarder_pipeline_info|log_forwarder_input_info|log_forwarder_output_info|cluster:log_collected_bytes_total:sum|cluster:log_logged_bytes_total:sum|cluster:kata_monitor_running_shim_count:sum|platform:hypershift_hostedclusters:max|platform:hypershift_nodepools:max|namespace:noobaa_unhealthy_bucket_claims:max|namespace:noobaa_buckets_claims:max|namespace:noobaa_unhealthy_namespace_resources:max|namespace:noobaa_namespace_resources:max|namespace:noobaa_unhealthy_namespace_buckets:max|namespace:noobaa_namespace_buckets:max|namespace:noobaa_accounts:max|namespace:noobaa_usage:max|namespace:noobaa_system_health_status:max|ocs_advanced_feature_usage|os_image_url_override:sum|cluster:vsphere_topology_tags:max|cluster:vsphere_infrastructure_failure_domains:max|cluster:vsphere_csi_migration:max",alertstate=~"firing|",quantile=~"0.99|0.99|0.99|",system_type=~"OCS|OCS|",system_vendor=~"Red Hat|Red Hat|"}
```

For reference, here is an example response produced by a running OpenShift cluster:
Expand Down
Loading