Surface Pressure Stall Information (PSI) metrics #130701

Merged (7 commits) on Mar 24, 2025

roycaihw
Member

What type of PR is this?

/kind feature

What this PR does / why we need it:

Surface PSI metrics from cAdvisor in the node Summary API. Manual test result.

This implements the first phase of this KEP. Ref kubernetes/enhancements#4205
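
As a rough illustration of what surfacing a metric from one API into another can look like, here is a minimal sketch that copies PSI values from a source type into a mirrored summary type. All type and function names below (`sourcePSIStats`, `summaryPSIStats`, `toSummaryPSI`) are hypothetical stand-ins written for this example, not the identifiers used in cAdvisor or the kubelet, and the field set is an assumption based on the `some`/`full` and `avg10`/`avg60`/`avg300`/`total` values that Linux PSI reports.

```go
package main

import "fmt"

// sourcePSIData is a hypothetical stand-in for the per-resource PSI values
// reported by the metrics source.
type sourcePSIData struct {
	Total  uint64  // cumulative stall time, in whatever unit the source reports
	Avg10  float64 // stall percentage averaged over the last 10 seconds
	Avg60  float64 // stall percentage averaged over the last 60 seconds
	Avg300 float64 // stall percentage averaged over the last 300 seconds
}

// sourcePSIStats groups the "some" and "full" pressure lines.
type sourcePSIStats struct {
	Some sourcePSIData
	Full sourcePSIData
}

// summaryPSIData and summaryPSIStats are hypothetical mirrors of the source
// types, as a summary-style API might expose them.
type summaryPSIData struct {
	Total  uint64  `json:"total"`
	Avg10  float64 `json:"avg10"`
	Avg60  float64 `json:"avg60"`
	Avg300 float64 `json:"avg300"`
}

type summaryPSIStats struct {
	Some summaryPSIData `json:"some"`
	Full summaryPSIData `json:"full"`
}

// toSummaryPSIData copies one set of PSI values field by field.
func toSummaryPSIData(in sourcePSIData) summaryPSIData {
	return summaryPSIData{
		Total:  in.Total,
		Avg10:  in.Avg10,
		Avg60:  in.Avg60,
		Avg300: in.Avg300,
	}
}

// toSummaryPSI converts source PSI stats into the mirrored type.
// It returns nil when the source has no PSI data, so callers can omit the field.
func toSummaryPSI(in *sourcePSIStats) *summaryPSIStats {
	if in == nil {
		return nil
	}
	return &summaryPSIStats{
		Some: toSummaryPSIData(in.Some),
		Full: toSummaryPSIData(in.Full),
	}
}

func main() {
	src := &sourcePSIStats{
		Some: sourcePSIData{Total: 1200, Avg10: 0.5},
		Full: sourcePSIData{Total: 300},
	}
	fmt.Printf("%+v\n", *toSummaryPSI(src))
}
```

Returning `nil` for missing input is one possible design choice: it lets a pointer field with `omitempty` drop out of the serialized output on nodes where PSI is unavailable.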

Does this PR introduce a user-facing change?

Add Pressure Stall Information (PSI) metrics to node metrics.

/sig node
/cc @haircommander @ndixita

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. sig/node Categorizes an issue or PR as relevant to SIG Node. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 10, 2025
@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/kubelet area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Mar 10, 2025
@roycaihw
Member Author

/priority important-soon

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 10, 2025
@kannon92
Contributor

Just a quick glance. For alpha features I’d expect to see any feature-related code guarded by a feature gate, so when the gate is off we should not emit PSI metrics.
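
A minimal sketch of the guard pattern described in this comment, assuming a simple map-backed gate lookup. The gate name `ExamplePSIGate`, the `featureGates` type, and the stats structs are placeholders invented for illustration; they are not the kubelet's actual feature gate machinery or types.

```go
package main

import "fmt"

// featureGates is a hypothetical, map-backed stand-in for a feature gate registry.
type featureGates map[string]bool

// Enabled reports whether the named gate is on; unknown gates count as off.
func (g featureGates) Enabled(name string) bool {
	return g[name]
}

// psiGate is a placeholder gate name used only in this example.
const psiGate = "ExamplePSIGate"

// psiStats is a trimmed-down, hypothetical PSI payload.
type psiStats struct {
	SomeTotal uint64
	FullTotal uint64
}

// cpuStats is a hypothetical stats struct with an optional PSI field.
type cpuStats struct {
	UsageNanoCores uint64
	PSI            *psiStats // left nil when the gate is off
}

// buildCPUStats attaches PSI data only when the gate is enabled,
// so nothing PSI-related is emitted while the feature is off.
func buildCPUStats(gates featureGates, usage uint64, psi *psiStats) cpuStats {
	out := cpuStats{UsageNanoCores: usage}
	if gates.Enabled(psiGate) {
		out.PSI = psi
	}
	return out
}

func main() {
	psi := &psiStats{SomeTotal: 1200, FullTotal: 300}

	off := buildCPUStats(featureGates{}, 500, psi)
	on := buildCPUStats(featureGates{psiGate: true}, 500, psi)

	fmt.Println("gate off, PSI present:", off.PSI != nil)
	fmt.Println("gate on, PSI present:", on.PSI != nil)
}
```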

@haircommander haircommander moved this from Triage to Archive-it in SIG Node CI/Test Board Mar 12, 2025
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 17, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 17, 2025
@roycaihw roycaihw force-pushed the psi-metrics branch 2 times, most recently from 7309f40 to ed7165c Compare March 17, 2025 06:20
@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Mar 21, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 16c0f04fd8dcc2244db51c87f80dd0ed75e97a1a

@haircommander
Contributor

/retest

@roycaihw
Member Author

/retest

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 21, 2025
@roycaihw
Member Author

/test pull-kubernetes-node-e2e-containerd-kubelet-psi

@roycaihw
Member Author

pull-kubernetes-e2e-capz-windows-master

Ginkgo detected a panic while constructing the spec tree

The Windows test failure is unrelated to this PR. It's reported in #130960, and a fix (#130962) is in progress.

@roycaihw
Member Author

All tests passed except the known Windows failure. This is ready for another round of review @liggitt @haircommander

@roycaihw
Member Author

roycaihw commented Mar 24, 2025

I also confirmed that the tests passed with the above-mentioned Windows test fix.

/retest

@roycaihw
Member Author

/test pull-kubernetes-node-e2e-containerd-alpha-features

@roycaihw
Member Author

/retest

@liggitt
Member

liggitt commented Mar 24, 2025

mirroring of the cadvisor API type lgtm

@haircommander
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 24, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 5d81abcaf98ea203e5d78746e1d106eb3d411f07

@mbianchidev
Member

/milestone v1.33

@k8s-ci-robot k8s-ci-robot added this to the v1.33 milestone Mar 24, 2025
@roycaihw
Member Author

/retest

@roycaihw
Member Author

/retest

@k8s-ci-robot
Contributor

k8s-ci-robot commented Mar 24, 2025

@roycaihw: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubernetes-e2e-capz-windows-master | 914a4ba | link | false | `/test pull-kubernetes-e2e-capz-windows-master` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot merged commit 62555ca into kubernetes:master Mar 24, 2025
19 of 20 checks passed
@github-project-automation github-project-automation bot moved this from Needs Reviewer to Done in SIG Node: code and documentation PRs Mar 24, 2025
@roycaihw roycaihw deleted the psi-metrics branch March 24, 2025 17:42
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/dependency Issues or PRs related to dependency changes area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on. wg/device-management Categorizes an issue or PR as relevant to WG Device Management.
9 participants