Managed Kubernetes in CAPI #7494
@richardcase: This issue is currently awaiting triage. If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
cc @joekr @shyamradhakrishnan, CAPOCI team
/triage accepted
Reporting a discussion from KubeCon related to this topic: there could be cases where an infra cluster is used to provide something on top of "vanilla managed Kubernetes"; people quoted some examples, such as private clusters or additional security groups, if I remember well. People also agreed that what could make sense is to make the infra cluster optional, whereas reusing the same CR in two points seems less intuitive (please, other folks present at the meeting feel free to chime in).
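To make the "optional infra cluster" idea concrete, here is a rough sketch of that shape. This is purely illustrative: whether a Cluster can actually work without an infra cluster reference is exactly the open question, and FooManagedControlPlane is a made-up kind.

```yaml
# Hypothetical shape only: the infra cluster reference is omitted and the
# managed control plane object carries everything the provider needs.
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: managed-example
spec:
  # no infrastructureRef: the provider's managed service owns the infrastructure
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: FooManagedControlPlane   # invented kind, for illustration only
    name: managed-example-control-plane
```

The alternative shape mentioned above, reusing the same CR as both the infrastructure and the control plane reference, is sketched later in this thread.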
@richardcase thank you for this, very succinctly stated and with all the important items. 🙏 I would like to call out how this proposal affects our "consistency across cloud providers" objective. You mention in the description several real-world issues that make consistency a challenge in the current landscape (in practice it would mean all cloud providers committing to either option 2 or option 3 as defined in the currently proposed set of recommendations). I'd like to get some consensus from folks around the viability of pinning the "consistency across cloud providers" objective to the landing and eventual implementation of this new proposal instead of the already landed proposal. This would suggest a few obvious outcomes:
From the capz community: @mtougeron @zmalik @luthermonson @LochanRn @NovemberZulu (there are others as well), do you have any opinions on a willingness to adopt an eventual, Cluster API-specific Managed Kubernetes standard that would be consistent across cloud providers at a future date, according to a new API? Does such a strategy have downsides for current platform adoption?
Great write-up, @richardcase. I also want to add what I heard from managed Kubernetes customers in the community. As pointed out in @jackfrancis's CAPZ Managed Kubernetes evolution proposal, the typical customer persona is, first of all, a Cluster API, multi-cloud, Managed Kubernetes customer, and what they care about is consistency across Managed Kubernetes offerings. They said they don't really expect consistency between managed and unmanaged Kubernetes in CAPI. Considering that managed Kubernetes can utilize many value-added services offered by cloud providers (e.g. managed add-ons with easy installation mechanisms, built-in health checks, and scaling capabilities), I think the managed Kubernetes proposal should focus on making it easy to bring in managed services' ever-increasing capabilities instead of trying to fit into current CAPI contracts that were designed for unmanaged Kubernetes.
Frankly speaking, I think we should be really careful in taking this path. So, in my humble opinion, this task should be about how to embrace managed Kubernetes while preserving a common abstraction (which, of course, can be improved from the current state), not about inventing a new abstraction.
I agree @fabriziopandini 👍 The desire to keep managed and unmanaged as close as possible in the long term will be carefully considered. I think there are ways to do that whilst still being more implementation-friendly to managed provider implementers.
The challenge here is that we have a trade-off in terms of what to prioritize for "close"-ness.
I'm not sure what the right answer is; looking forward to input from the community!
@jackfrancis we need to understand the flaws of the current managed cluster CAPI proposal. It tied nicely to CAPI API concepts such as control plane, machine pool, etc. and also to Kubernetes constructs. We in OCI did not find any huge problems implementing against the proposal. Of course, there is a minor naming problem in that the control plane translates to the Cluster object in the cloud provider, but technically the cluster (for example OKE/AKS) really is running a managed control plane. Answering some of the questions above: do we need an infra cluster and a control plane? That's how current CAPI works, and we can question that as well, right? For example, in the current implementation the infra cluster creates all the network infrastructure and the kubeadm control plane creates the control plane nodes. We are doing the same thing with our implementation: the infra cluster creates the network and any other infra, and the control plane creates the managed control plane (which is called a cluster by most cloud providers). We can go through our implementation and API in a meeting if required. I apologise if I did not understand the problem statement here or did not explain it properly.
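To illustrate that split in CAPOCI terms, a trimmed sketch (the kind names follow CAPOCI's managed cluster support, but the API group/version strings and naming here are indicative rather than exact, so please check the CAPOCI documentation for the real schema):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: oke-example
spec:
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1   # version is illustrative
    kind: OCIManagedCluster           # creates the VCN, subnets, gateways, and other infra
    name: oke-example
  controlPlaneRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1   # group/version is illustrative
    kind: OCIManagedControlPlane      # creates the OKE control plane ("cluster" in OCI terms)
    name: oke-example-control-plane
```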
Hi @shyamradhakrishnan, thanks so much for adding your thoughts. I think that @alexeldeib would agree (though he can speak for himself :) as the original implementer of capz + AKS) that there were not huge problems (after all, capz and, it seems, capoci have built robust user communities around the current implementations), but there is some friction, and I'd say @richardcase states the key points very succinctly, so I'll simply copy them:
If we were starting a brand new managed cluster implementation right now, for the near term we would 💯 follow option #3 as specified in kubernetes-sigs/cluster-api-provider-azure#2739. I was one of the reviewers of that proposal and I stand by my lgtm. The concerns stated in this particular issue are scoped to solving this at an even higher fidelity for the longer term. My observation is that the very existence of this issue suggests the community desires a better solution to achieve managed k8s standardization longer term; thus the current proposal does not address these longer-term concerns (indeed, the proposal was not scoped to include changes to Cluster API itself).
I agree. We should continue to discuss the trade-off of "forcing the managed provider to adhere to Cluster API" vs "forcing Cluster API to adhere to managed providers" (I'm exaggerating a bit, but that's the simplest way of putting it). My summary of the key theme from the folks starting this thread (@richardcase, @pydctw, and myself) is:
Based on those 2 points (these are, of course, the views of a minority at present; consensus has not yet been achieved), it seems likely that the community will indeed eventually want to solve this problem from first principles, and thus we would anticipate breaking our customers (capa and capz, at least) twice if we took on the work of migrating to option #3 from kubernetes-sigs/cluster-api-provider-azure#2739. (The very explicit language of "you don't have to do this if you have an existing managed k8s implementation" is purposeful, as the authors of that proposal anticipated this possibility.) Hope that helps; I definitely agree that we will want to discuss this fluidly in at least one meeting to clarify everyone's desires/needs/expectations/concerns.
Sounds good @jackfrancis. A small change is that capa needs to be modified to support ClusterClass, as the current implementation does not support it. But MachinePool also does not support ClusterClass, so...
I mentioned something along those lines at KubeCon. My main point being that the configurability options present in the managed Kubernetes solutions out there should also be possible with our CAPI solution for them, for two reasons:
@puja108 What part of a hypothetical capi
Sorry Jack, this issue somehow got lost in my backlog. I did not mean the comment with regard to any current or planned spec. The worries I mentioned at KCSNA came more from talking to end users at KubeCon and in our virtual end user meetups, where I was seeing a curious rise in end users looking at Cluster API, maybe even trying it out, but then going for a self-built Crossplane-based solution in combination with managed control planes like EKS and AKS. And the three end user companies I asked for the main reason behind this choice mentioned that they had special requirements for configurability or features that they would have had to implement around CAPI, e.g. private network support. If this is irrelevant wrt the current proposal, then I might have gone off-topic for this thread.
@puja108 - are these requirements captured anywhere? Would be great to see if we can cover these in CAPI.
Sadly I did not do any structured requirement gathering. I was mainly poking people at events and getting very short feedback. I think it would be worth it to gather requirements as a group in a more structured way.
Thanks for weighing in @puja108. At first glance it seems like these configuration requirements are provider-specific, but we'd love for you to join our regular feature group discussions around managed k8s in Cluster API, your time permitting. Meeting details are in the markdown file in the PR that is en route to the capi main branch:
Just a quick status update: this CAEP is close to reaching definitional consensus. Following that, an implementation effort will commence to carry out the work proposed above.
Follow-up status update: a last-minute scope enhancement is under consideration! Specifically, @vincepri would like us to consider a new CRD definition for the control plane endpoint, to aid any new behavioral flexibility we add to the existing assumptions about an infra provider's cluster and control plane resources. This may also help unblock forward progress on a long-standing request to support multiple control plane endpoints: #5295. See: https://kubernetes.slack.com/archives/C8TSNPY4T/p1686781200032249
@jackfrancis this will benefit OKE and other managed providers as well, I think. For example, OKE supports private and public endpoints (https://docs.oracle.com/en-us/iaas/api/#/en/containerengine/20180222/datatypes/ClusterEndpoints), and I am pretty sure other providers will also have this. Even in non-managed cases, a load balancer can have public and private endpoints. So, theoretically, this will have a lot of benefits.
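As a purely hypothetical sketch of that endpoint CRD idea (nothing like this exists today; the API group/version, kind, and fields below are invented just to illustrate surfacing both endpoints):

```yaml
# Invented example only: a dedicated endpoint resource that a managed (or even
# unmanaged) control plane could publish. No such API exists in Cluster API today.
apiVersion: cluster.x-k8s.io/v1alpha1   # hypothetical group/version
kind: ControlPlaneEndpoint              # hypothetical kind
metadata:
  name: oke-example-endpoints
spec:
  endpoints:
    - name: public
      host: example-public.oke.example.oraclecloud.com   # placeholder hostname
      port: 6443
    - name: private
      host: 10.0.0.10                                     # placeholder private address
      port: 6443
```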
@jackfrancis Q: is this issue done now that #8500 is merged? I assume we would want to track the implementation, so maybe let's create a new umbrella issue for the implementation? (which would include the tasks from the PR description: #8500 (comment))
/kind design
/kind api-change
/area api
User Story
As a cluster service consumer, I want to use Cluster API to provision and manage Kubernetes clusters that utilize my service provider's managed Kubernetes service (e.g. EKS, AKS, GKE), so that I don't have to worry about the management/provisioning of control plane nodes, and so I can take advantage of any value-add services offered by the service provider.
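To illustrate the user story, this is roughly what such a consumer authors today with CAPA's EKS support: no control plane Machine objects at all, just a managed control plane and a managed node group. The kind names come from CAPA, but the apiVersions and fields are abbreviated and illustrative, so treat this as a sketch rather than a copy-paste manifest.

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: eks-example
spec:
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta2   # version varies by CAPA release
    kind: AWSManagedCluster
    name: eks-example
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta2     # version varies by CAPA release
    kind: AWSManagedControlPlane       # the EKS control plane, managed by AWS
    name: eks-example-control-plane
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta2
kind: AWSManagedControlPlane
metadata:
  name: eks-example-control-plane
spec:
  region: us-west-2   # illustrative values only
  version: v1.25.0
---
# Worker nodes come from a managed node group rather than individual Machines.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: eks-example-pool-0
spec:
  clusterName: eks-example
  replicas: 2
  template:
    spec:
      clusterName: eks-example
      bootstrap:
        dataSecretName: ""             # managed node groups need no bootstrap data
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        kind: AWSManagedMachinePool    # maps to an EKS managed node group
        name: eks-example-pool-0
```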
Detailed Description
Let's take another look at how managed Kubernetes services can be represented in CAPI and its providers.
The original Managed Kubernetes in CAPI proposal, based on guidance from the community at the time, explicitly did not consider (i.e. ruled out of scope) changes to CAPI itself and focused purely on fitting managed Kubernetes into the existing API types. The existing proposal is useful in the short term to provide guidance to provider implementers.
However, it has become apparent that the recommendations made in the proposal, whilst helpful, are less than ideal when it comes to managed Kubernetes. Some areas where it's less than ideal (not an exhaustive list):
So we should revisit how we represent managed Kubernetes in CAPI, including the option of changes to CAPI itself, to represent managed Kubernetes as a first-class citizen.
It's expected that this will result in a "Managed Kubernetes in CAPI v2" proposal.
Anything else you would like to add:
We should still consider the recommendations made in the original proposal (i.e. option 3), along with any new options that are a result of looking at changes to capi itself. We can then consider the trade-offs and decide a longer-term strategy for managed Kubernetes in capi.
Short-term, the 3 main cloud providers could achieve consistency quickly by going with option 2 from the proposal. Although not the recommended approach, it was still called out as an option. However, the Oracle provider implemented their managed service according to the document and went with option 3, so we need to consider this as well.
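For reference, the options differ mainly in how many provider CRs back a managed cluster and which contract each implements. Below is a generic sketch of the two shapes (the Provider* kinds are placeholders, not real types, and this is not an exact restatement of the proposal's option definitions):

```yaml
# Shape A: a single provider CR plays both roles and is referenced twice.
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: single-cr-example
spec:
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: ProviderManagedControlPlane   # placeholder kind, referenced here...
    name: single-cr-example
  controlPlaneRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: ProviderManagedControlPlane   # ...and again here
    name: single-cr-example
---
# Shape B: two provider CRs, mirroring the unmanaged infra-cluster /
# control-plane split (the shape the Oracle provider went with).
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: split-cr-example
spec:
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: ProviderManagedCluster        # network and other supporting infra
    name: split-cr-example
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: ProviderManagedControlPlane   # the managed control plane itself
    name: split-cr-example
```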
Some related issues that we can also consider as part of this work:
/assign richardcase
/assign jackfrancis
/assign pydctw