DRA 1.32 API: promotion to beta #127511

Merged on Nov 6, 2024 (20 commits)

@pohly
Contributor

pohly commented Sep 20, 2024

What type of PR is this?

/kind feature

What this PR does / why we need it:

Beta APIs for DRA in 1.32.

Fixes: #123687, #127386

Does this PR introduce a user-facing change?

The core functionality of Dynamic Resource Allocation (DRA) got promoted to beta. No action is required when *upgrading*: the previous v1alpha3 API is still supported, so existing deployments and DRA drivers based on v1alpha3 continue to work. *Downgrading* from 1.32 to 1.31 with DRA resources in the cluster (resourceclaims, resourceclaimtemplates, deviceclasses, resourceslices) is *not* supported because the new v1beta1 is used as the storage version and is not readable by 1.31.

@k8s-ci-robot added labels on Sep 20, 2024: do-not-merge/work-in-progress, kind/feature, size/XXL, do-not-merge/release-note-label-needed, cncf-cla: yes, do-not-merge/needs-sig, needs-rebase, needs-triage, needs-priority
@k8s-ci-robot added labels on Sep 20, 2024: area/code-generation, area/kubelet, area/test, kind/api-change, sig/api-machinery, sig/node, sig/scheduling, sig/testing; removed: do-not-merge/needs-sig
@k8s-ci-robot added labels on Sep 26, 2024: area/apiserver, sig/etcd
@k8s-ci-robot removed the needs-rebase label on Sep 26, 2024
@pohly force-pushed the dra-1.32-api branch 2 times, most recently from 858689f to e45a53c, on September 27, 2024 07:51
@k8s-ci-robot added labels on Sep 27, 2024: sig/apps, sig/auth
pohly added 7 commits November 6, 2024 13:03

Using 1.0 was a workaround to grant Kubernetes 1.31 access to things introduced
in that same release. In Kubernetes 1.32 we don't need that workaround anymore
because everything is still available after a downgrade and thus usable.
The version bump is an opportunity to pick a name that is a bit more
descriptive. It matches the "DevicePlugin" service name.
Listing supported gRPC services (e.g. drav1alpha3.Node, drav1beta1.DRAPlugin)
during registration enables the kubelet to determine in advance which methods
it can call.

Versioning by Kubernetes release makes less sense because it doesn't say
anything about which gRPC service is supported. New ones might get added and
obsolete ones removed. Some services might be optional.

In the past, this versioning support wasn't really used. At least one version
had to be provided and kubelet tried to use the plugin with the highest
version. This version comparison gets dropped. In the unlikely situation
that different plugins register under the same name, the most recent one is
used.

Because advertising gRPC services is a new convention, plugins only reporting
some version are treated as providing the old alpha gRPC service.
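
The selection logic described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the kubelet's actual code: the function names `chooseService` and `looksLikeVersion`, the exact service-name strings, and the rule used to recognize a version number are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"slices"
)

// Hypothetical service names, modeled on the ones mentioned in the
// commit message. The real constants may be spelled differently.
const (
	betaService  = "v1beta1.DRAPlugin"
	alphaService = "v1alpha3.Node"
)

// looksLikeVersion reports whether s starts with a digit and contains
// only digits and dots, e.g. "1.0.0". This is a simplification chosen
// for the sketch.
func looksLikeVersion(s string) bool {
	if s == "" || s[0] < '0' || s[0] > '9' {
		return false
	}
	for _, r := range s {
		if r != '.' && (r < '0' || r > '9') {
			return false
		}
	}
	return true
}

// chooseService picks the gRPC service to use for a plugin, based on
// what the plugin advertised during registration.
func chooseService(supported []string) (string, error) {
	// Prefer the most recent known service that the plugin lists.
	for _, svc := range []string{betaService, alphaService} {
		if slices.Contains(supported, svc) {
			return svc, nil
		}
	}
	// Plugins that only report some version number are treated as
	// providing the old alpha gRPC service.
	for _, v := range supported {
		if looksLikeVersion(v) {
			return alphaService, nil
		}
	}
	return "", errors.New("no supported gRPC service advertised")
}

func main() {
	for _, in := range [][]string{
		{"v1beta1.DRAPlugin", "v1alpha3.Node"},
		{"1.0.0"},
		{"something-else"},
	} {
		svc, err := chooseService(in)
		fmt.Printf("%v -> %q, err=%v\n", in, svc, err)
	}
}
```

Note that no version comparison takes place: the order of preference is fixed on the kubelet side, and the advertised list is only checked for membership.
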
Supporting the alpha gRPC interface isn't enough anymore to be compatible
with kubelet 1.31: the "supported versions" must contain version numbers,
otherwise the older kubelet refuses to register the driver.

With this change, a DRA driver can decide to support both kubelet 1.31 and
kubelet 1.32 by registering *only* the alpha gRPC interface (NodeV1alpha4(true)
and NodeV1beta1(false) as options for Start).

The default is to provide both interfaces and to use the registration mechanism
for 1.32, which makes DRA drivers compatible only with Kubernetes >= 1.32.
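
On the driver side, the effect of the two options might look like the sketch below. The option names `NodeV1alpha4` and `NodeV1beta1` come from the commit message, but everything else here is a self-contained stand-in rather than the real helper package: the `config` type, the `advertised` function, the service-name strings, and the placeholder version `"1.0.0"` are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// config and Option are hypothetical stand-ins for the helper package's
// internal state and its functional options.
type config struct {
	nodeV1alpha4 bool
	nodeV1beta1  bool
}

type Option func(*config)

// NodeV1alpha4 enables or disables the alpha gRPC interface.
func NodeV1alpha4(enabled bool) Option {
	return func(c *config) { c.nodeV1alpha4 = enabled }
}

// NodeV1beta1 enables or disables the beta gRPC interface.
func NodeV1beta1(enabled bool) Option {
	return func(c *config) { c.nodeV1beta1 = enabled }
}

// advertised returns what a driver would report as "supported versions"
// during registration under the assumptions of this sketch.
func advertised(opts ...Option) ([]string, error) {
	// Default: provide both interfaces.
	c := config{nodeV1alpha4: true, nodeV1beta1: true}
	for _, o := range opts {
		o(&c)
	}
	switch {
	case c.nodeV1beta1:
		// New convention: list gRPC service names (needs kubelet >= 1.32).
		services := []string{"v1beta1.DRAPlugin"}
		if c.nodeV1alpha4 {
			services = append(services, "v1alpha3.Node")
		}
		return services, nil
	case c.nodeV1alpha4:
		// Alpha only: report a version number so that kubelet 1.31
		// also accepts the registration ("1.0.0" is a placeholder).
		return []string{"1.0.0"}, nil
	default:
		return nil, errors.New("at least one gRPC interface must be enabled")
	}
}

func main() {
	both, _ := advertised()
	alphaOnly, _ := advertised(NodeV1alpha4(true), NodeV1beta1(false))
	fmt.Println("default:", both)
	fmt.Println("alpha only:", alphaOnly)
}
```

The trade-off is the one stated above: the alpha-only configuration works with both kubelet releases, while the default trades 1.31 compatibility for the new registration convention.
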

Deleting slices was not covered to begin with and the recent registration
changes also could have been covered better. Now coverage is at 91%.
…location

This makes a configuration with --feature-gates=AllAlpha=true valid
again. Without this change, that flag enabled DRAAdminAccess without
DynamicResourceAllocation being enabled (default off!) and the kube-apiserver
refused to start.

While DRAAdminAccess isn't usable without DynamicResourceAllocation, it's also
not really wrong to allow it - it simply won't matter.
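
The difference between the two behaviors can be illustrated with a small sketch. It does not reproduce the kube-apiserver's feature gate code; the `gates` type and both function names are hypothetical, and only the two gate names are taken from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// gates maps feature gate names to their enabled state.
type gates map[string]bool

// validateStrict models the earlier behavior: enabling the dependent
// gate without its prerequisite is treated as a startup error.
func validateStrict(g gates) error {
	if g["DRAAdminAccess"] && !g["DynamicResourceAllocation"] {
		return errors.New("DRAAdminAccess requires DynamicResourceAllocation")
	}
	return nil
}

// adminAccessActive models the relaxed behavior: the combination is
// accepted, and the dependent feature only has an effect when both
// gates are enabled.
func adminAccessActive(g gates) bool {
	return g["DRAAdminAccess"] && g["DynamicResourceAllocation"]
}

func main() {
	// Roughly what AllAlpha=true produces when DynamicResourceAllocation
	// is beta and off by default.
	allAlpha := gates{"DRAAdminAccess": true, "DynamicResourceAllocation": false}
	fmt.Println("strict validation error:", validateStrict(allAlpha))
	fmt.Println("admin access active:", adminAccessActive(allAlpha))
}
```
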
@pohly
Contributor Author

pohly commented Nov 6, 2024

Some additional updates were needed due to behavioral changes in master:

  • add cbor entries in generated openapi spec
  • replace TooLongMaxLength with TooMany in new test cases

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 6, 2024
@k8s-ci-robot k8s-ci-robot requested a review from towca November 6, 2024 12:05
@aojea
Member

aojea commented Nov 6, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 6, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: d7c5e70d4fc11d1de9598998e05d7bdd32dc406a

@k8s-ci-robot
Contributor

k8s-ci-robot commented Nov 6, 2024

@pohly: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
pull-kubernetes-kind-classic-dra | 318f32e | link | false | /test pull-kubernetes-kind-classic-dra
pull-kubernetes-e2e-gce-cos-alpha-features | d6bad27 | link | false | /test pull-kubernetes-e2e-gce-cos-alpha-features

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@aojea
Member

aojea commented Nov 6, 2024

/test pull-kubernetes-e2e-gce-cos-alpha-features

This job was already broken before this PR. We need to investigate what broke it, but the failure is not related to this PR.

https://prow.k8s.io/job-history/gs/kubernetes-ci-logs/pr-logs/directory/pull-kubernetes-e2e-gce-cos-alpha-features?buildId=1854133104784969728

@k8s-ci-robot k8s-ci-robot merged commit e273349 into kubernetes:master Nov 6, 2024
22 of 23 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.32 milestone Nov 6, 2024
@johnbelamaric
Member

Amazing work @pohly !

k8s-publishing-bot pushed a commit to kubernetes/api that referenced this pull request Nov 6, 2024

Based on review
feedback (kubernetes/kubernetes#127511 (comment)).

Kubernetes-commit: 30f52826560129839922e1756730b02f184f0ef9
k8s-publishing-bot pushed a commit to kubernetes/client-go that referenced this pull request Nov 6, 2024

Based on review
feedback (kubernetes/kubernetes#127511 (comment)).

Kubernetes-commit: 30f52826560129839922e1756730b02f184f0ef9
k8s-publishing-bot pushed a commit to kubernetes/dynamic-resource-allocation that referenced this pull request Nov 6, 2024
Based on review
feedback (kubernetes/kubernetes#127511 (comment)).

Kubernetes-commit: 30f52826560129839922e1756730b02f184f0ef9
@thockin
Member

thockin commented Nov 6, 2024

This was an exemplary PR for "easy to review". Thank you for the extra work.

@fedebongio
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 7, 2024
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/code-generation area/e2e-test-framework Issues or PRs related to refactoring the kubernetes e2e test framework area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/etcd Categorizes an issue or PR as relevant to SIG Etcd. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on. wg/device-management Categorizes an issue or PR as relevant to WG Device Management.
Development

Successfully merging this pull request may close these issues.

DRA: beta: CEL validation