Skip to content

Summary of out-of-tree problems #629

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
hbagdi opened this issue Apr 23, 2021 · 15 comments
Closed

Summary of out-of-tree problems #629

hbagdi opened this issue Apr 23, 2021 · 15 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/contributor-experience Categorizes an issue or PR as relevant to SIG Contributor Experience.

Comments

@hbagdi
Copy link
Contributor

hbagdi commented Apr 23, 2021

This is an umbrella issue to track several problems that this project is running into because of it being out-of-tree.

@hbagdi hbagdi added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 23, 2021
@hbagdi
Copy link
Contributor Author

hbagdi commented Apr 23, 2021

This was discussed a while ago in one of the community meetings but I only got to this action item today.
There are several other issues that we have run into with CRDs but they are not logged and I can't recall others from the top of my head.

@robscott
Copy link
Member

Thanks for creating this! I think another key one is the lack of built-in CRD validation for many of our use cases which has required us to create and maintain a validating webhook.

@robscott robscott added kind/design Categorizes issue or PR as related to design. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/contributor-experience Categorizes an issue or PR as relevant to SIG Contributor Experience. and removed kind/feature Categorizes issue or PR as related to a new feature. labels Apr 23, 2021
@bowei
Copy link
Contributor

bowei commented Apr 26, 2021

We need to feed a lot of these requirements to api machinery and CRD.

Maybe we should create x-ref issues on their repo(s)?

@youngnick
Copy link
Contributor

I think that all of the things we've mentioned here are definitely problems for other CRD authors, so being able to get some solutions to them would be really helpful for everyone. Thanks for making sure this stuff gets tracked @hbagdi!

@hbagdi
Copy link
Contributor Author

hbagdi commented Apr 27, 2021

Maybe we should create x-ref issues on their repo(s)?

That's the plan but I was thinking of getting a fair representation of the problems we have in this project before doing that.
Are there others that come to mind?

@hbagdi hbagdi added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label May 3, 2021
@robscott
Copy link
Member

I've created a doc to track these issues and ran it by @thockin, @lavalamp, @apelisse and some others. Here are some highlights of those conversations (feel free to correct me if I got anything wrong):

  1. Some things are being worked on to make CRD validation more powerful such as support for enforcing immutable fields.
  2. There are no plans to support cross field validation, and if that's something we need (I think it is) we'll need a validating webhook.
  3. Until our API and validation stabilizes, implementations will need to be flexible with the API versions they support and fall back to older versions if a newer one isn't installed in the cluster. Although this will be messy in alpha, this should be less of an issue with a more stable beta API.
  4. The kubectl UX is not great for the custom use cases we've described, even the upstream fix I've been working on would only help for a relatively small set of use cases. We need to loop in sig-cli to this conversation, but one initial suggestion was to implement a mutating webhook that populated output for kubectl specific columns. Of course it's also looking increasingly like a plugin would be helpful here to go beyond that.
  5. There's likely not a great path for new APIs like this to be in-tree. We can explore with sig-arch, but wherever possible, we should be trying to work with CRDs and make the experience as good as possible.

Still need to loop in more people, but wanted to provide an update on what I've found so far. If you've run into any other frustrating points with CRDs, feel free to add them here or in the linked doc. I'll try to circulate this to other sigs soon to see what's possible.

@howardjohn
Copy link
Contributor

Until our API and validation stabilizes, implementations will need to be flexible with the API versions they support and fall back to older versions if a newer one isn't installed in the cluster. Although this will be messy in alpha, this should be less of an issue with a more stable beta API.

Is it correct to read this as "vendors will need to duplicate their entire gateway-api codebase or only use the oldest api version they care about"? I don't know of any other way, but would be very happy to. This was a big pain point for Ingress, and other deprecation, for us.

@lavalamp
Copy link

Is it correct to read this as "vendors will need to duplicate their entire gateway-api codebase or only use the oldest api version they care about"?

CRDs can keep a limited number of different versions of the schema if they are mutually convertible.

Typically, the point of doing alpha(s) is to relieve one's self of that burden, because it's not easy to work under that constraint.

If you want to install multiple non-convertible alpha versions at the same time, you could do so by putting them under different groups, although that'd be a little weird.

Instead, I'd recommend

  • Change the X in v1alphaX every time a non-backwards-compatible change is made.
  • Implementations read the discovery doc (or the CRD object) and verify that the version there is the version they understand.

@howardjohn
Copy link
Contributor

Even if they are convertible, doesn't the same issue persist? Say we have v1alpha1 and v1alpha2. I am a controller that is up to date, so I want to support v1alpha2 (to get some new fields, etc), but I also need to support clusters with only v1alpha1 for backwards compatibility.

  • If I read v1alpha1, then I don't get the new fields
  • If I read v1alpha2, then it will fail on the clusters without v1alpha1
    I could in theory read v1alpha1 and run the conversion logic locally to v1alpha2, but that may be challenging. For core types this logic is hidden in k/k which is frowned upon to use

@lavalamp
Copy link

If they are convertible, then it is safe to install the variation of the CRD with both versions in all clusters.

@apelisse
Copy link

cc @jpbetz

  1. Some things are being worked on to make CRD validation more powerful such as support for enforcing immutable fields.
  2. There are no plans to support cross field validation, and if that's something we need (I think it is) we'll need a validating webhook.
  3. Until our API and validation stabilizes, implementations will need to be flexible with the API versions they support and fall back to older versions if a newer one isn't installed in the cluster. Although this will be messy in alpha, this should be less of an issue with a more stable beta API.
  4. The kubectl UX is not great for the custom use cases we've described, even the upstream fix I've been working on would only help for a relatively small set of use cases. We need to loop in sig-cli to this conversation, but one initial suggestion was to implement a mutating webhook that populated output for kubectl specific columns. Of course it's also looking increasingly like a plugin would be helpful here to go beyond that.
  5. There's likely not a great path for new APIs like this to be in-tree. We can explore with sig-arch, but wherever possible, we should be trying to work with CRDs and make the experience as good as possible.

I'm a big +1 on 5. that we should use this as a motivation to improve CRDs rather than fall-back on in-tree. Sorry if you're paying part of the price :-(.

In the medium/long term, we're thinking about possible ways to address 1, 2, 3, and 4. So hopefully it will be much easier for the next projects or so that go through the same journey you did.

@howardjohn
Copy link
Contributor

If they are convertible, then it is safe to install the variation of the CRD with both versions in all clusters.

This is great if the controller is also in charge of the CRDs. As a concrete example of where this won't work, GKE will automatically provision these CRDs. There will always be some delay before GKE shipping the new CRD; in that time, it is impossible to install the new version into the cluster. That is just one example, I am sure there are other cases where we may not be able to, such as when the user running the controller doesn't have admin privs, etc.

@lavalamp
Copy link

when the user running the controller doesn't have admin privs

You should need admin privileges to run a gateway implementation, so someone who can do that can also update / install the CRD.

I agree that if the platform providing the cluster has an opinion about the version of the CRD that is installed, you basically have to wait for them to upgrade, or risk fights with whatever controller maintains the CRD version. Platforms should therefore think very carefully before shipping a CRD like this.

@MadhavJivrajani
Copy link

/remove-kind design
/kind feature
kind/design is migrated to kind/feature, see kubernetes/community#6144 for more details

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. and removed kind/design Categorizes issue or PR as related to design. labels Oct 11, 2021
@shaneutt
Copy link
Member

shaneutt commented Aug 5, 2022

Given that all sub-issues and PRs listed are now resolved, it would seem we can consider this one resolved as well. If anyone has reason that this needs to be re-opened, no big deal please just say so and we'll be happy to reconsider and re-open as necessary.

@shaneutt shaneutt closed this as completed Aug 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/contributor-experience Categorizes an issue or PR as relevant to SIG Contributor Experience.
Projects
None yet
Development

No branches or pull requests

10 participants