-
Notifications
You must be signed in to change notification settings - Fork 534
Adding GEP 922: Gateway API Versioning Plan #923
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: robscott The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking pretty good overall to me, but there's a couple of things that could be clearer.
``` | ||
gateway.networking.k8s.io/bundle-version: v0.4.0 | ||
gateway.networking.k8s.io/channel: stable|experimental | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, at the end of the day, a cluster can either have a stable or an experimental CRD. It is exclusive because you can only have one CRD at a time for a GK.
Why have the distinction in that case?
What if we just ship the experimental fields as part of the stable channel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do that, then the only way for YAML consumers (and let's be real, most people will use YAML, not read the full spec) to know what fields are experimental is for us to prefix them somehow - that is, we'd have to call all the experimental fields experimentalFieldName
or something.
Other things I can think of that only supplying on CRD spec would imply:
- Implementations would need a switch of some sort to turn on experimental field processing - effectively, we're moving the feature gate to the implementing controller, and trusting them to figure it out.
- We'd also need to supply a method for implementations to determine what the experimental fields are, so that they can tell people they're using an experimental field but haven't enabled it.
Having two separate CRDs solves all of those problems - the implementation can switch experimental field processing on or off based on the annotation on the CRD object itself (this also means you can't enable experimental field processing without having modify CRDs access, which I think is fine).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @youngnick's points above. I think it's important to have users explicitly opt in to experimental functionality, and I think separate CRDs are likely the best mechanism we have for this. Although I'd really like something better like kubectl support/gating for alpha fields, I don't think that's going to be practical anytime soon.
cd82f80
to
eab8df7
Compare
* Making required fields optional | ||
* Removal of experimental fields | ||
* Removal of experimental resources | ||
* Graduation of fields or resources from experimental to stable track |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I totally see the need for doing this within the same API version but I also think it is our responsibility to help out implementers here. Is there any way a controller can check the version bundle version in the cluster?
Like an annotation on the CRD (I'm not sure if those annotations are readable programmatically from a controller or not).
Since Gateways are very sensitive to security, most operators will prefer a hard failure compared to an implementation having buggy behavior (or a vulnerable behavior). We don't need to instruct what controller authors should do when the bungle version is higher than the controller is programmed against.
Can we make this part of this GEP?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point - we do have annotations proposed further down that would enable this. Annotations should be able to be read programmatically, as long as the controller requests read access on CRDs which seems reasonable and likely necessary.
Note that each new bundle version, no matter how small, may include updated | ||
CRDs, webhook, or both. | ||
|
||
### Implementers: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if an implementation doesn't want to work in a cluster unless the installed version matches a support matrix that the it is programmed against? Is that invalid behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's valid, but likely unnecessarily constrained. If all the guidelines here are followed, newer additions to the API would simply not be supported by an older implementation of the API. A warning or set of warnings is likely preferable to stopping altogether.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Can we explicitly mention that controllers can rely on the annotation to decide their behavior?
6f26f5e
to
2e138a1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This lgtm with a few non-blocking nits.
the following: | ||
|
||
* API Types/CRDs | ||
* Validating Webhook |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we mention other validation code here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should definitely add that to our bundle versions if/when we have it. I have a note in #926 to come back to this KEP and update it if/when we add that logic.
site-src/geps/gep-922.md
Outdated
responsibilities to ensure we're providing a consistent experience. | ||
|
||
### API Authors: | ||
* MUST provide conversion between stable API versions, starting with v1alpha2. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* MUST provide conversion between stable API versions, starting with v1alpha2. | |
* MUST provide conversion between API versions, starting with v1alpha2. |
Feels weird to call an alpha API "stable".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the terminology here is confusing. I've updated this to "excluding experimental fields", which I think is a clearer representation of what I was aiming for here.
crashing | ||
* MUST NOT rely on webhook or CRD validation as a security mechanism. If field | ||
values need to be escaped to secure an implementation, both webhook and CRD | ||
validation can be bypassed and cannot be relied on. Instead, implementations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wording is fine for now, but if we do implement the validation module, feels like it should be a MUST to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I added a note to #926 so we remember to come back to this GEP and update it.
/lgtm |
Here is to hoping we have thought this through. |
I think there's broad enough consensus on this to merge it in, we can revisit through smaller tweaks if needed. /hold cancel |
What type of PR is this?
/kind gep
What this PR does / why we need it:
This proposes a set of versioning guidelines for future Gateway API releases. This is a follow up to the related doc that has been discussed in gateway-api and sig-apimachinery meetings.
Which issue(s) this PR fixes:
Related to #922, will not be fixed until this GEP moves to "Implemented"
Does this PR introduce a user-facing change?: