# Cluster-API Provisioning Mechanism Consolidation Proposal #585
## Proposal
The upstream Cluster-API doesn’t provide any bootstrap scripts, nor any wrapper around kubeadm for handling cluster operations (initialization and join). This means implementers have to manually write the bootstrap scripts used by a specific Cluster-API provider and manually handle cluster initialization and node joining. For most cloud providers, both the bootstrap scripts and the steps for cluster initialization/joining are the same, except for provider specifics that could be provided via template variables.
@roberthbailey: What about a provider that decided not to build on top of kubeadm? The kops adoption of the machine types isn't going to use kubeadm for node joining, since kops already has node joining.
@alvaroaleman: The purpose of these scripts is to provide a working default that can be used; if people decide they want it different, they are free to build and use something else
@roberthbailey: Have we considered putting the common bootstrapping scripts into a bundle definition that can be reused instead of adding code into the upstream repo?
@xmudrii:
I don't think we have considered that at all.
I'm not really sure what a bundle definition is or what it looks like, so would you mind giving me an example or a short introduction?
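To illustrate the "provider specifics via template variables" idea from the proposal text above, here is a minimal, hypothetical Go sketch. The template contents, variable names, and paths are illustrative assumptions, not part of the proposal: a shared initialization script template in which only the provider-specific kubelet settings differ.

```go
// Hypothetical sketch: a shared cluster-initialization template that is
// identical for all providers except for the values injected via template
// variables (e.g. the kubelet cloud-provider settings).
package main

import (
	"os"
	"text/template"
)

const initTemplate = `#!/bin/bash
set -euo pipefail

# Provider-specific kubelet settings; everything else is the same everywhere.
{{- if .CloudProvider }}
echo 'KUBELET_EXTRA_ARGS="--cloud-provider={{ .CloudProvider }} --cloud-config={{ .CloudConfig }}"' > /etc/default/kubelet
{{- end }}

kubeadm init --kubernetes-version {{ .KubernetesVersion }}
`

// ProviderSpecifics is filled in by the individual Cluster-API provider.
type ProviderSpecifics struct {
	KubernetesVersion string
	CloudProvider     string // e.g. "aws"; empty when no cloud provider is used
	CloudConfig       string // path to the cloud config file, if any
}

func main() {
	t := template.Must(template.New("init").Parse(initTemplate))
	// An AWS-flavored example; other providers only change these values.
	if err := t.Execute(os.Stdout, ProviderSpecifics{
		KubernetesVersion: "v1.13.1",
		CloudProvider:     "aws",
		CloudConfig:       "/etc/kubernetes/cloud-config",
	}); err != nil {
		panic(err)
	}
}
```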
* Install dependencies required by Kubernetes
  * Includes dependencies such as socat and ebtables, which are the same for all providers.
* Install a container runtime
  * The steps for installing a container runtime such as Docker or containerd are the same for all providers.
@roberthbailey: For a specific runtime. But different providers may choose not to implement support for all runtimes. If this code is standardized, does that imply a burden on a provider to support all runtimes that are upstream?
@alvaroaleman: I think we will provide just one script for one runtime that works; if people want something else, they can choose not to use that script
Implementing the provisioning scripts in the upstream Cluster-API ensures:
* Cluster-API implementers can easily get started, as they don’t need to implement complex provisioning logic and scripts for the many different setups and operating systems
@roberthbailey: How many variations do you expect we will maintain in the cluster-api repo?
@xmudrii: I'm not expecting too many variations. For now we should go with a basic single-master setup with kubeadm. Regarding Kubernetes versions, it depends on how big the differences are and how much of a hassle it is to maintain them. Regarding operating systems, Ubuntu is the most popular one and should be supported; besides it, we can optionally choose one more.
### Example use case
Han wants to write a new Cluster-API provider for his employer’s cloud solution. Han knows the only difference is the cloud provider’s API, hence Han only wants to implement those API calls. Because the Cluster-API provides templates for the actual provisioning, Han can easily import and re-use them and safely assume they will work, as they are well tested.
@roberthbailey: How will we test that a change upstream to these scripts will work on all providers next time they vendor the latest upstream repo? And I'm not sure Han can assume they will "just work" because with required templating Han will need to pass values that are unique to the cloud environment.
@alvaroaleman: So IMHO what we can test on the cluster-api main repo is that these scripts work when `--cloud-provider` and `--cloud-config` are unset, e.g. by leveraging KIND and then just creating a node inside a Docker container via DIND and checking whether it will come up and join the cluster. Aside from those two settings, I think the node config is identical on all cloud providers, or am I missing something?
## Goals
* Cluster-API provides re-usable provisioning scripts for bootstrapping a cluster for the most popular operating systems (Ubuntu, CentOS and CoreOS)
@roberthbailey: It feels like the scripts have two jobs, that we could maybe separate. There is the "install stuff" step, which I think could be argued is mostly the same across providers, and then there is the "configure and run stuff" part, which either needs heavy templating or shouldn't be shared. If we focused on the first part, how much would that go towards solving the issues you are hoping to address?
@roberthbailey: This is still my biggest outstanding question about this proposal. Re-reading the doc, it seems like we are going to end up with a lot of templating for the second part, which may not make it particularly easy for "Han" to pick up the scripts and just use them in a new environment.
I have the same impression; it would be desirable to split this into two actions, to make it easier to reuse the install step and leave the configure part to each provider.
* [Cluster-API provider for AWS](https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/4553a80b6337b4adcc378c07db943772d30fbc78/pkg/cloud/aws/services/ec2/bastion.go)
* [Cluster-API provider for vSphere](https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/master/cloud/vsphere/provisioner/common/templates.go)
The implementation will be realized using interfaces, meaning that an interface will be implemented for each supported operating system. The interface has two functions: one for getting user data for master instances and one for workers.
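For illustration, a minimal Go sketch of such an interface; the package, type, and field names below are hypothetical assumptions, not taken from the actual cluster-api code:

```go
// A minimal sketch of the per-OS interface described above. All names here
// are illustrative only.
package userdata

// ProvisioningData carries the per-machine values that must be templated
// into the bootstrap scripts (join token, API server endpoint, etc.).
type ProvisioningData struct {
	KubernetesVersion string
	JoinToken         string
	APIServerEndpoint string
	CloudProvider     string // e.g. "aws"; empty when unset
}

// Provisioner is implemented once per supported operating system
// (e.g. Ubuntu, CentOS, CoreOS).
type Provisioner interface {
	// MasterUserData returns the user data used to initialize a
	// control-plane (master) machine.
	MasterUserData(data ProvisioningData) (string, error)
	// WorkerUserData returns the user data used to join a worker machine
	// to an existing cluster.
	WorkerUserData(data ProvisioningData) (string, error)
}
```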
@sidharthsurana: But would this mean that we are putting the go templates inside the go code?
If that is the case, then that would make it difficult to revise these scripts when needed for a specific case.
I suggest that if we do need these pre-created scripts, we store them in the form of Kubernetes objects, e.g. a ConfigMap, or even better, a dedicated CRD. For this dedicated CRD we can publish some "standard" CRs, one for each distro/version combination (like ubuntu/16.04, ubuntu/18.04, etc.). These CRs would represent the standard way of installation, but at the same time, if users want to make some modifications, they can create another CR with their changes. Inside the machine spec one can then refer to a particular CR of this new CRD type to specify which script to use to bootstrap the machine. This way, if I am bringing my own custom image, I can also upload my custom install script in the form of a CR that I know will work on my image.
> But would this mean that we are putting the go templates inside the go code?

Yes, that is something we will always have to do, because there are some fields that are unique per machine, e.g. a join token or certificates.
The idea to use a CRD to make adjusting them easier is probably a good one, but I think it would be a mechanism that would then work on top of what is proposed here.
> Yes, that is something we will always have to do because there are some fields that are unique per machine, e.g. a join token or certificates.

Well, it may not be necessary. The way I am thinking is that the CRs for the custom CRD above would be Go templates with placeholders like "join token", "certificate", etc. The machine controller would read the specified CR and mash that template together with the runtime information (via the machineSpec, clusterSpec, etc.) to render the final script to be used. The point I am trying to get to is that if we can avoid changing the Go code every time we want to customize that script, that would be great and would make this much more useful.
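As a rough illustration of the rendering step described above, here is a minimal Go sketch, assuming a hypothetical bootstrap template stored in a CR (or ConfigMap); the placeholder names and example values are illustrative only:

```go
// A minimal sketch of rendering a stored bootstrap template with per-machine
// runtime values, using Go's text/template.
package main

import (
	"fmt"
	"os"
	"text/template"
)

// bootstrapTemplate stands in for the script body that would be stored in a
// ConfigMap or a CR of a dedicated CRD.
const bootstrapTemplate = `#!/bin/bash
set -euo pipefail
kubeadm join {{ .APIServerEndpoint }} \
  --token {{ .JoinToken }} \
  --discovery-token-ca-cert-hash {{ .CACertHash }}
`

// machineValues holds the per-machine runtime information the machine
// controller would take from the Machine/Cluster specs.
type machineValues struct {
	APIServerEndpoint string
	JoinToken         string
	CACertHash        string
}

func main() {
	tmpl, err := template.New("bootstrap").Parse(bootstrapTemplate)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Render the final script for one machine (values are placeholders).
	if err := tmpl.Execute(os.Stdout, machineValues{
		APIServerEndpoint: "10.0.0.1:6443",
		JoinToken:         "abcdef.0123456789abcdef",
		CACertHash:        "sha256:...",
	}); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```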
@roberthbailey: The GCP implementation moved towards storing these scripts in a config map (to make them easier to change). There is also the option to embed them into a bundle, which might make it easier to share them across providers.
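A minimal sketch of what reading such a ConfigMap-stored script could look like with client-go (assuming a reasonably recent client-go; the namespace, ConfigMap name, and data key are hypothetical):

```go
// Hypothetical sketch: fetch a bootstrap script stored in a ConfigMap so it
// can be edited without rebuilding the controller.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func loadBootstrapScript(ctx context.Context) (string, error) {
	cfg, err := rest.InClusterConfig() // assumes the controller runs in-cluster
	if err != nil {
		return "", err
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return "", err
	}
	cm, err := client.CoreV1().ConfigMaps("cluster-api-system").
		Get(ctx, "bootstrap-scripts", metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	script, ok := cm.Data["worker-ubuntu-18.04.sh"]
	if !ok {
		return "", fmt.Errorf("bootstrap script not found in ConfigMap")
	}
	return script, nil
}

func main() {
	script, err := loadBootstrapScript(context.Background())
	if err != nil {
		panic(err)
	}
	fmt.Println(script)
}
```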
I find the idea of moving the scripts into ConfigMaps (or better, a CRD) more appealing, because it will offer a standard operational interface (how to set up/update configuration) without making any strong assumption about the specific OS or cloud platform to be supported. IMHO this should be the main goal: define the operational interface.
Another example of this approach, besides the previously mentioned GCP one, is Gardener's operating system controller used to set up CoreOS nodes.
Offering a reference implementation is great, but I don't see it as the biggest benefit. In my case, I plan to use a different distribution than the one suggested (Ubuntu), and it is unlikely I could reuse the scripts.
> Offering a reference implementation is great, but I don't see it as the biggest benefit.

I second this. Let's figure out the "standard operational interface (how to set up/update configuration)."
### Decisions to be made

* What Kubernetes versions should be supported?
  * Kubernetes supports 3 versions at the same time.
@roberthbailey: I just reviewed kubernetes-sigs/cluster-api-provider-gcp#54 where we updated bootstrap scripts to a different k8s version and the scripts had to be changed.
And if we look at the history of kubeadm, I expect constant changes in the flags / configuration that we need to pass into kubeadm itself as well as the system components. We are constantly adding new flags / behavior to k8s that needs to be configured as we move to new releases. That needs to be reflected somewhere in the configuration (at least for the control plane components).
* What container engine should be set up by the scripts?
  * Docker - very popular and very widely used in production.
  * containerd - a newer solution, seems to work well. Used by the AWS provider.
  * Opinion: we should choose one and stay with it.
@sflxn: Where would we keep scripts used to set up other container runtimes? I assume you're not suggesting we just choose one and only support that, right? Strongly isolated container runtimes are coming.
@roberthbailey: Also, even if we pick one today we will need a way to upgrade to a newer version or a different one in the future. For the GCP provider we will want to track what the node folks qualify on GCE/GKE even if that isn't what is being tested on other environments.
@alvaroaleman: The idea here would be to provide something that "just works" and can be used as a default; if people decide they want another runtime, they can choose not to use that script, or add another script template for that other runtime
Comments from Google Docs have been moved to GitHub, so it's easier to follow progress and discussion. /assign @roberthbailey
/retest

What's the status of this PR?
@vincepri I'm planning to rework it a little bit, but I've been busy for the past few weeks and will be for the next several weeks. If you want, we can close the PR and create a new one once I have some updates.
No worries, I'll leave it up to you. I was mostly going through the open PRs and checking their status.
@xmudrii I just made some comments on the proposal, based on my recent experience adapting the existing node bootstrap controller from the Gardener project to different operating systems (from CoreOS to OpenSuse and MicroOS). When you are ready to come back to this PR, I'll be happy to collaborate.
I would be interested in discussing the following use cases, all in Cluster API common and not per provider. Note, this is broader than just common bootstrapping scripts, and starts to tease out a bit where we have defaults and optional replacements/extensions for them.

If we are going to allow users to define their own mechanism for gRPC/webhook, should the default implementations also use the same mechanism? This would avoid differences in implementation or features between the two approaches, as well as provide a reference implementation.
If we use gRPC/webhook, yes, I would do that. Part of it will depend on what the machine definition evolves to. It could, for example, have a spec field that is the exact
@ncdc @detiber If this still sounds like an interesting topic, I'd definitely like to continue working on this proposal. I can probably make it to the next meeting or the one in two weeks, and we can talk more about this. From some recent experience, Point 5 is also interesting, but it may need some discussion and work to get it working. :)
I definitely want to pursue this, thanks!
+1 to optional, common, shared components for cluster bootstrapping on top of what's really provider specific - getting a Machine provisioned, up and running with a particular image. I hadn't seen this before these recent comments. This seems related to #721 and also https://docs.google.com/document/d/1pzXtwYWRsOzq5Ftu03O5FcFAlQE26nD3bjYBPenbhjg/edit. I'm very interested in this. We've just started building a managed bare metal cluster-api provider (github.com/metalkube). So far we've just been focused on automating the machine provisioning under Kubernetes custom resources. I was really hoping to figure out how to help make progress on shared, optional components for bootstrapping instead of writing something else specific to our provider.
@russellb yes, all of these things are intertwined 😄
In the Gardener project, what we are doing is using a CRD for the operating system configuration and a controller that "renders" this configuration into an operating-system-specific format (for example, a script or a cloud-init config file). The CRD is modeled in OS-agnostic terms. The rendered representation is stored as a secret and retrieved by a small downloader script deployed on the machines at creation time by the machine controller. This model has been tested with CoreOS and has since been adapted to other operating systems, as mentioned above.
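For illustration only, a hypothetical Go sketch of what such an OS-configuration CRD type could look like; the type and field names are illustrative assumptions and may not match Gardener's actual API:

```go
// Hypothetical API types for an OS-agnostic configuration CRD that an
// OS-specific controller renders into a script or cloud-init file.
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// File describes a file to be written onto the node.
type File struct {
	Path    string `json:"path"`
	Content string `json:"content"`
}

// Unit describes a systemd unit to be installed and enabled on the node.
type Unit struct {
	Name    string `json:"name"`
	Content string `json:"content"`
	Enable  bool   `json:"enable"`
}

// OperatingSystemConfigSpec is the OS-agnostic description that gets
// rendered into an OS-specific format.
type OperatingSystemConfigSpec struct {
	// Type selects the target OS/renderer, e.g. "coreos" or "ubuntu".
	Type  string `json:"type"`
	Files []File `json:"files,omitempty"`
	Units []Unit `json:"units,omitempty"`
}

// OperatingSystemConfig is the top-level custom resource.
type OperatingSystemConfig struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              OperatingSystemConfigSpec `json:"spec,omitempty"`
}
```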
I'm also interested in defining some "software provisioning artifact" that can be used uniformly across all Cluster API providers. We haven't talked about node upgrades here. Something common across all Cluster API providers has to take upgrades into account. There are two kinds of upgrades: "replace" and "in-place".
For "replace" node upgrades, bootstrap scripts injected at machine create time are sufficient. But they do not work for "in-place" node upgrades. Of course, if the Cluster API provider cannot invoke commands on the machine, it's hard to conceive of an "in-place" upgrade. I recall that some providers want to avoid using ssh, although they may be open to using some custom agent. A Cluster API provider that wants to support "in-place" node upgrades should be able to implement them using the same common software provisioning artifact. P.S. I think "in-place" node upgrades are required only in corner cases. For example, if the Cluster API runs on the cluster, and there is only one node in the cluster. However, if an "in-place" upgrade takes less time, it can be a desirable optimization. |
@dlipovetsky I think at some point we discussed these "in-place" vs. replacement use cases and kind of agreed that this should be provider specific. That is, cluster-api shouldn't assume a mechanism for this configuration upgrade to be applied, but should provide the mechanism to make configuration and configuration changes available to the provider in a consistent way. If the provider wants to implement an "in-place" upgrade, it can deploy a small agent that monitors changes in the configuration and applies them when a change is detected. This agent can be injected via the "rendered" bootstrap script for the provider.
Hey folks, let's move to close this PR and start to rally around the discussion from this Wednesday. There are several things in this doc that were discussed as themes, but we need to break down execution into logical chunks that we can accomplish. Thoughts?
Sounds good to me!

Sounds good to me! Maybe we should create an issue for this feature so we can track the progress more easily.
@xmudrii we are planning to use the KEP process for Cluster API enhancements going forward, and I am working with the community to determine the best way to manage the roadmap items.

/close
@ncdc: Closed this PR.
What this PR does / why we need it:
This PR represents a proposal for adding a provisioning mechanism, which has been discussed in Cluster-API Breakout meetings and via Google Docs.
The original proposal, along with all comments, can be found on Google Docs.
This proposal has been discussed in the following meetings:
At this point there are still several outstanding comments, which we are going to discuss at the Cluster-API Breakout meeting on 11/14.
To make it easier to follow the comments on GitHub, I'll move all outstanding comments from Google Docs to GitHub.
Release note: