Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates Kubelet Plugin Registration process #114136

Merged
merged 2 commits into from
Sep 11, 2024

Conversation

claudiubelu
Copy link
Contributor

@claudiubelu claudiubelu commented Nov 25, 2022

What type of PR is this?

/kind failing-test
/kind bug

/sig windows
/sig testing

/priority important-soon
/milestone v1.27

What this PR does / why we need it:

Currently, when a Kubelet Plugin is being added in the DesiredStateOfWorld, a timestamp is saved in the PluginInfo. This timestamp is then updated on subsequent plugin reregistrations.

The Reconciler, when it detects different timestamps for a Plugin in its DesiredStateOfWorld and ActualStateOfWorld, it will then trigger a Plugin unregister and then a new Plugin registration.

Basically, the timestamp is being used to detect whether or not a Plugin needs to be reregistered or not. However, this can be an issue on Windows, where the time measurements are not as fine-grained. time.Now() calls within the same ~1-15ms window will have the same timestamp. This can mean that Plugin Reregistration events can be missed on Windows [1]. Because of this, some of the Plugin registration unit tests fail on Windows.

This commit updates the behaviour, instead of relying on different timestamps, the Reconciler will check the set PluginInfo UUID to detect a Plugin Reregistration. With this change, the unit tests mentioned above will also pass on Windows.

[1] golang/go#8687

Which issue(s) this PR fixes:

Related: #51540

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Kubelet Plugins are now re-registered properly  on Windows if the re-registration period is < 15ms.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. labels Nov 25, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.27 milestone Nov 25, 2022
@k8s-ci-robot
Copy link
Contributor

Please note that we're already in Test Freeze for the release-1.26 branch. This means every merged PR will be automatically fast-forwarded via the periodic ci-fast-forward job to the release branch of the upcoming v1.26.0 release.

Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Fri Nov 25 09:34:34 UTC 2022.

@k8s-ci-robot k8s-ci-robot added sig/windows Categorizes an issue or PR as relevant to SIG Windows. sig/testing Categorizes an issue or PR as relevant to SIG Testing. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 25, 2022
@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. labels Nov 25, 2022
@jsturtevant
Copy link
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 6, 2022
@SergeyKanzhelev
Copy link
Member

/assign

@claudiubelu
Copy link
Contributor Author

/test pull-ci-kubernetes-unit-windows

@claudiubelu
Copy link
Contributor Author

@SergeyKanzhelev, have you had a change to review this?

@bart0sh
Copy link
Contributor

bart0sh commented Jan 14, 2023

/sig-storage

@bart0sh
Copy link
Contributor

bart0sh commented Jan 14, 2023

/sig storage

@k8s-ci-robot k8s-ci-robot added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label Jan 14, 2023
@claudiubelu
Copy link
Contributor Author

/assign @saad-ali

@claudiubelu
Copy link
Contributor Author

@SergeyKanzhelev, did you have a chance to look over this?

@pacoxu
Copy link
Member

pacoxu commented Mar 25, 2024

clear the milestone as it has no update for week.
/milestone v1.31
Hope we can make it in next release cycle.
As this is a bugfix, we may backport it later.

@k8s-ci-robot k8s-ci-robot added this to the v1.31 milestone Mar 25, 2024
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 25, 2024
@claudiubelu claudiubelu force-pushed the kubelet-plugins branch 3 times, most recently from 7e2c1b2 to e215d29 Compare March 25, 2024 21:26
@claudiubelu
Copy link
Contributor Author

/test pull-ci-kubernetes-unit-windows

claudiubelu and others added 2 commits May 13, 2024 15:47
Currently, when a Kubelet Plugin is being added in the DesiredStateOfWorld,
a timestamp is saved in the PluginInfo. This timestamp is then updated on
subsequent plugin reregistrations.

The Reconciler, when it detects different timestamps for a Plugin in its
DesiredStateOfWorld and ActualStateOfWorld, it will then trigger a Plugin unregister
and then a new Plugin registration.

Basically, the timestamp is being used to detect whether or not a Plugin needs to
be reregistered or not. However, this can be an issue on Windows, where the time
measurements are not as fine-grained. time.Now() calls within the same ~1-15ms
window will have the same timestamp. This can mean that Plugin Reregistration events
can be missed on Windows [1]. Because of this, some of the Plugin registration unit
tests fail on Windows.

This commit updates the behaviour, instead of relying on different timestamps,
the Reconciler will check the set PluginInfo UUID to detect a Plugin Reregistration.
With this change, the unit tests mentioned above will also pass on Windows.

[1] golang/go#8687
@claudiubelu
Copy link
Contributor Author

/test pull-ci-kubernetes-unit-windows

1 similar comment
@claudiubelu
Copy link
Contributor Author

/test pull-ci-kubernetes-unit-windows

@Vyom-Yadav
Copy link
Member

Hello @claudiubelu ! This PR has not been updated for a long time, so I'd like to check what's the status. The code freeze is starting 02:00 UTC Wednesday 10th July 2024 (about 4 weeks from now), and while there is still plenty of time, we want to ensure that each PR has a chance to be merged on time.

As the PR is tagged for 1.31, is it still planned for this release?

@claudiubelu
Copy link
Contributor Author

Hello @claudiubelu ! This PR has not been updated for a long time, so I'd like to check what's the status. The code freeze is starting 02:00 UTC Wednesday 10th July 2024 (about 4 weeks from now), and while there is still plenty of time, we want to ensure that each PR has a chance to be merged on time.

As the PR is tagged for 1.31, is it still planned for this release?

Hasn't been updated because I got no reviews. Waiting for those.

FWIW, it seems the unit tests are passing on Windows as well.

@Vyom-Yadav
Copy link
Member

Hello @claudiubelu,

This PR hasn't been updated in a while. What's the current status? Do you want us to reach out to the folks who are supposed to review this?

Reminder: Code freeze starts 02:00 UTC Wednesday 24th July 2024 / 19:00 PDT Tuesday 23rd July 2024 (about 1 week from now). Please make sure the PR has both lgtm and approved labels before the code freeze.

@Vyom-Yadav
Copy link
Member

Vyom-Yadav commented Aug 5, 2024

/milestone clear
We are way past the code freeze. Please add the milestone for the next release if you'd like to target that.

@k8s-ci-robot k8s-ci-robot removed this from the v1.31 milestone Aug 5, 2024
Copy link
Member

@SergeyKanzhelev SergeyKanzhelev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 11, 2024
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 2d53f15313565741c0b29f20e6b17d6202bfe688

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: claudiubelu, SergeyKanzhelev

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 11, 2024
@k8s-ci-robot
Copy link
Contributor

@claudiubelu: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubernetes-verify-lint 2604d7d link unknown /test pull-kubernetes-verify-lint

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot merged commit d14b0b0 into kubernetes:master Sep 11, 2024
13 of 15 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.32 milestone Sep 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/testing Categorizes an issue or PR as relevant to SIG Testing. sig/windows Categorizes an issue or PR as relevant to SIG Windows. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Archived in project
Archived in project
Status: Done
Development

Successfully merging this pull request may close these issues.