Bug 2037513: Revert dropping container label and update diskDeviceSelector #1685

Closed
wants to merge 2 commits into from

Conversation

slashpai
Member

@slashpai slashpai commented Jun 3, 2022

Why this change?

container_fs_.* metrics don't have the "container" label after
#1402, which caused a regression
for dashboards that use the container label: every query that filters
container_fs_.* metrics on "container" returns empty datapoints.
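
For anyone less familiar with matcher semantics, here is a minimal Python sketch (a toy model, not Prometheus code) of why such queries come back empty. It assumes the dashboards filter with something like container!="". The series and label values are made up for illustration. Prometheus treats a missing label as an empty value, so once the label is dropped, no series can satisfy a non-empty matcher.

```python
# Toy model of PromQL equality matchers; series and label values are hypothetical.

def label_value(series, name):
    # In Prometheus, an absent label behaves like a label whose value is "".
    return series.get(name, "")

def select(series_list, name, value, negate=False):
    """Keep series where label `name` == value (or != value when negate=True)."""
    return [s for s in series_list if (label_value(s, name) == value) != negate]

# A container_fs_* series before and after the "container" label is dropped.
before = [{"pod": "web-0", "container": "nginx", "device": "/dev/sda"}]
after = [{"pod": "web-0", "device": "/dev/sda"}]

with_label = select(before, "container", "", negate=True)    # container!=""
without_label = select(after, "container", "", negate=True)  # container!=""
only_empty = select(after, "container", "")                  # container=""
```

In this model the container!="" selection is empty once the label is gone, while a container="" selection still finds the series.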

diskDeviceSelector was added to the container_fs* queries as part of
kubernetes-monitoring/kubernetes-mixin#737. Since diskDeviceSelector
includes only node_exporter label values, this updates diskDeviceSelector
to include cAdvisor devices too (/dev).
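
A rough Python sketch of the device mismatch, under stated assumptions: both patterns below are hypothetical stand-ins, not the values from the mixin or from this PR, and the device names are examples. It only illustrates that a fully anchored regex written for bare node_exporter device names will not match a /dev-prefixed cAdvisor device unless the prefix is allowed.

```python
import re

# Hypothetical selector patterns; the real values live in the mixin configuration.
NODE_EXPORTER_ONLY = r"mmcblk.p.+|nvme.+|sd.+|vd.+|xvd.+|dm-.+"
WITH_CADVISOR = r"(/dev/)?(mmcblk.p.+|nvme.+|sd.+|vd.+|xvd.+|dm-.+)"

def matches(pattern, device):
    # Prometheus regex matchers are fully anchored, so fullmatch is the closest analogue.
    return re.fullmatch(pattern, device) is not None

node_exporter_device = "sda"   # example of a bare device name
cadvisor_device = "/dev/sda"   # example of a /dev-prefixed device name

old_hits_cadvisor = matches(NODE_EXPORTER_ONLY, cadvisor_device)
new_hits_cadvisor = matches(WITH_CADVISOR, cadvisor_device)
new_hits_node_exporter = matches(WITH_CADVISOR, node_exporter_device)
```

Making the /dev/ prefix optional keeps the existing node_exporter matches intact while also admitting the cAdvisor form.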

Tried an alternative solution, but it didn't work

We tried to fix this by introducing a new selector in kubernetes-mixin and overriding it in CMO in #1554, but that won't solve the problem, since many queries use the container label in sum() queries.

  • I added CHANGELOG entry for this change.
  • No user facing changes, so no entry in CHANGELOG was needed.

cc: @simonpasquier @jan--f

/hold

slashpai added 2 commits June 3, 2022 13:12
container_fs_.* metrics don't have the "container" label after
openshift/pull/1402, which caused a regression
for dashboards that use the "container" label: every query that filters
container_fs_.* metrics on "container" returns empty datapoints.

Signed-off-by: Jayapriya Pai <[email protected]>
diskDeviceSelector was added to the container_fs* queries as part of
kubernetes-monitoring/kubernetes-mixin#737. Since diskDeviceSelector
includes only node_exporter label values, this updates diskDeviceSelector
to include cAdvisor devices too (/dev).

Signed-off-by: Jayapriya Pai <[email protected]>
@openshift-ci openshift-ci bot added bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Jun 3, 2022
@openshift-ci
Contributor

openshift-ci bot commented Jun 3, 2022

@slashpai: This pull request references Bugzilla bug 2037513, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.11.0) matches configured target release for branch (4.11.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @juzhao

In response to this:

Bug 2037513: Revert dropping container label and update diskDeviceSelector

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci
Contributor

openshift-ci bot commented Jun 3, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: slashpai

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 3, 2022
@simonpasquier
Contributor

The goal of #1402 was to contain the label cardinality of metrics collected from kubelet. Reverting it would increase the load on Prometheus again...

@openshift-ci
Contributor

openshift-ci bot commented Jun 3, 2022

@slashpai: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/versions 4e83a62 link false /test versions
ci/prow/e2e-aws-single-node 4e83a62 link false /test e2e-aws-single-node

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci
Contributor

openshift-ci bot commented Jun 3, 2022

@slashpai: This pull request references Bugzilla bug 2037513, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.11.0) matches configured target release for branch (4.11.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @juzhao

In response to this:

Bug 2037513: Revert dropping container label and update diskDeviceSelector


@slashpai
Member Author

slashpai commented Jun 3, 2022

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 3, 2022
@simonpasquier
Contributor

one solution might be to pass containerfsSelector: 'container=""' to the mixins configuration?
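
A small Python sketch of how such a selector override could flow into a query string. This is an illustration only: the template, the metric choice, and the assumed default value are hypothetical, not copied from the mixin. It relies on jsonnet's % string formatting behaving much like Python's.

```python
# Hypothetical query template and config; names mirror the discussion, values are assumed.
template = "sum(rate(container_fs_reads_total{%(containerfsSelector)s}[5m]))"

default_config = {"containerfsSelector": 'container!=""'}  # assumed default
override_config = {"containerfsSelector": 'container=""'}  # the suggested override

# The selector is substituted into the template the same way in both cases.
default_query = template % default_config
override_query = template % override_config
```

With the override, the rendered query selects series whose container label is empty or absent instead of excluding them.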

@slashpai
Member Author

slashpai commented Jun 3, 2022

one solution might be to pass containerfsSelector: 'container=""' to the mixins configuration?

I tried overriding containerfsSelector in #1554, replacing it with id!="". That fixed the dashboards except "Kubernetes / Compute Resources / Pod", since it uses container in a sum() query.
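
To make the remaining problem concrete, here is a minimal Python sketch of a sum grouped by container. It assumes the Pod dashboard aggregates with something like sum by (container); the series, label values, and sample values are made up. When the label is absent, every series falls into the same empty group, so the per-container breakdown is lost even though the selector now matches.

```python
from collections import defaultdict

def sum_by(series_list, label):
    """Sum sample values grouped by one label; a missing label groups under ""."""
    totals = defaultdict(float)
    for labels, value in series_list:
        totals[labels.get(label, "")] += value
    return dict(totals)

# Hypothetical samples: (labels, value) pairs with and without the container label.
with_label = [
    ({"pod": "web-0", "container": "nginx"}, 3.0),
    ({"pod": "web-0", "container": "sidecar"}, 1.0),
]
without_label = [
    ({"pod": "web-0", "id": "/kubepods/a"}, 3.0),
    ({"pod": "web-0", "id": "/kubepods/b"}, 1.0),
]

per_container = sum_by(with_label, "container")
collapsed = sum_by(without_label, "container")
```

Here per_container keeps one total per container, while collapsed holds a single total under the empty label value.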

@simonpasquier
Contributor

Since Prometheus doesn't collect the per-container fs metrics, I would be in favor of dropping the per-container storage dashboards. @jan--f WDYT?

@jan--f
Contributor

jan--f commented Jun 8, 2022

Since Prometheus doesn't collect the per-container fs metrics, I would be in favor of dropping the per-container storage dashboards. @jan--f WDYT?

Indeed yes, the dashboard panels just show No datapoints found, we might as well get rid of them. cc @kyoto @jgbernalp for visibility.

@slashpai
Member Author

slashpai commented Jun 8, 2022

Since Prometheus doesn't collect the per-container fs metrics, I would be in favor of dropping the per-container storage dashboards. @jan--f WDYT?

Indeed yes, the dashboard panels just show No datapoints found, we might as well get rid of them. cc @kyoto @jgbernalp for visibility.

Thank you. I can address that in #1554

@slashpai
Member Author

slashpai commented Jun 8, 2022

/close in favour of latest update in #1554

@jan--f
Contributor

jan--f commented Jun 10, 2022

/close
The bot doesn't like additional text after commands ;)

@openshift-ci openshift-ci bot closed this Jun 10, 2022
@openshift-ci
Contributor

openshift-ci bot commented Jun 10, 2022

@jan--f: Closed this PR.

In response to this:

/close
The bot doesn't like additional text after commands ;)


@openshift-ci
Contributor

openshift-ci bot commented Jun 10, 2022

@slashpai: This pull request references Bugzilla bug 2037513. The bug has been updated to no longer refer to the pull request using the external bug tracker.

In response to this:

Bug 2037513: Revert dropping container label and update diskDeviceSelector


@slashpai slashpai deleted the container_drop_revert branch October 13, 2022 17:02
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command.
3 participants