
Umbrella: Breaking apart clusterctl #1065


Closed
timothysc opened this issue Jun 24, 2019 · 23 comments
Assignees
Labels
area/clusterctl: Issues or PRs related to clusterctl
kind/feature: Categorizes issue or PR as related to a new feature.
priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Milestone

Comments

@timothysc
Member

timothysc commented Jun 24, 2019

Describe the solution you'd like
As an old-timer to kubernetes, I find clusterctl to be weird... It's performing operations which "could be preconditions", and also performing operations which IMO should be part of the .spec of the objects. This issue tries to break down some of the details, and feedback is solicited.

  1. Building a bootstrap cluster...

    • Why? We can / should have it as a precondition for folks to run kubectl apply
  2. Create / Delete (CRDs, Cluster, Machines, ... )

    • Why? That's just a kubectl apply OR delete... possibly with a kubectl plugin to make it feel more 1st classed.
  3. Pivot

    • IMO, this should be a field in the cluster.spec that is part of a state machine of the cluster object.
  4. Kubeconfig

    • There are many ways to drop an encrypted secret config for cluster access, and I think creating a workflow for this might better serve the community.
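Taken together, items 1 and 2 would reduce to something like the following; the kind-based bootstrap and the file names are illustrative assumptions, not anything prescribed here:

```shell
# Sketch of the kubectl-only workflow: a bootstrap cluster is a
# precondition (item 1), and create/delete is plain kubectl (item 2).
kind create cluster --name bootstrap            # precondition, one command
kubectl apply -f provider-components.yaml       # CRDs + controllers
kubectl apply -f cluster.yaml -f machines.yaml  # create the workload cluster
kubectl delete -f machines.yaml                 # teardown is just kubectl delete
```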

What I'm really struggling with is... do we really need this tool? There are portions of clusterctl that could move into a client library of aggregate utility functions for common operations that providers could leverage. I also think a kubectl plugin might be generally useful to treat cluster-api objects as first-class resources, but other than that? In a v1alpha2 world, what workflows that clusterctl provides would be missing?

/kind feature
/cc @ncdc @vincepri @detiber

@timothysc timothysc added the kind/feature Categorizes issue or PR as related to a new feature. label Jun 24, 2019
@timothysc timothysc added this to the v1alpha2 milestone Jun 24, 2019
@timothysc timothysc self-assigned this Jun 24, 2019
@moshloop
Contributor

Agreed on the clusters and machines; CRDs are a little different, though.

There are a few high-level features that I think clusterctl could serve:

  1. A generic install

e.g. capdctl crds | kubectl apply -f - from https://github.com/kubernetes-sigs/cluster-api-provider-docker.

Install could probably be solved by hosting a concatenated YAML if it were the only use case

  2. clusterctl logs for streaming logs from machines (not sure if kubectl logs would be extensible for this, but a kubectl machine-logs plugin would work as well)

  3. clusterctl status - provide more real-time status e.g. node readiness, conditions, top, node events

@timothysc
Member Author

timothysc commented Jun 25, 2019

1 & 3 could be solved with a kubectl plugin. Or possibly an AddOn operator pattern.

2 is a non-starter imo. If you want that, you should set up your own logging service.

@detiber
Member

detiber commented Jun 25, 2019

  3. Pivot
  • IMO, this should be a field in the cluster.spec that is part of a state machine of the cluster object.

Pivot is complicated for a few reasons, I'm not sure it could be simplified outside of an external tool.

  • cluster-api components need to be moved to a new cluster
  • the cluster-api controllers in the source cluster need to be scaled down before the target cluster's cluster-api controllers start running, to avoid multiple sets of controllers running at the same time
  • cluster and machine* objects need to be deleted out of the source cluster without removing the underlying resources (currently called "force delete" and done by removing the finalizers before deleting)
  • cluster and machine* objects need to be created in the right order on the target cluster
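The "force delete" step above might look like this with plain kubectl, assuming an illustrative Cluster named my-cluster in the default namespace; clearing the finalizers prevents the controllers from tearing down the underlying infrastructure before the object is removed:

```shell
# Strip finalizers so deletion skips the provider's infrastructure cleanup,
# then delete the now-unguarded object from the source cluster.
kubectl patch cluster my-cluster --type merge -p '{"metadata":{"finalizers":null}}'
kubectl delete cluster my-cluster --wait=false
```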

@detiber
Member

detiber commented Jun 25, 2019

  1. Building a bootstrap cluster...
  • Why? We can / should have it as a precondition for folks to run kubectl apply

The big reason for this was to simplify the user experience and avoid needing to run manual pre-steps.

@detiber
Member

detiber commented Jun 25, 2019

  4. Kubeconfig
  • There are many ways to drop an encrypted secret config for cluster access, and I think creating a workflow for this might better serve the community.

+1 to this approach going forward; it didn't exist for the first iteration.

@detiber
Member

detiber commented Jun 25, 2019

Overall, I think kubectl plugins are a great way for us to move forward; when clusterctl was started, kubectl plugins either did not exist yet or had just been implemented with little to no documentation.

@moshloop
Contributor

Perhaps we need to reframe the question:

a) Should CAPI and CAPI providers depend on/require clusterctl? I think not.
b) Would the community benefit from a tool that provides higher-order functionality for a better UX? I think yes, but it could be a sub-project or a third project entirely.

@moshloop
Contributor

  1. Building a bootstrap cluster...
  • Why? We can / should have it as a precondition for folks to run kubectl apply

The big reason for this was to simplify the user experience and avoid needing to run manual pre-steps.

I think the UX with kind is good enough, if not great

@detiber
Member

detiber commented Jun 25, 2019

a) Should CAPI and CAPI providers depend on/require clusterctl? I think not.

I agree here 100%; there should definitely be workflows that exist without clusterctl. That said, certain functionality, such as pivot, will require explicit documentation on how to do it without causing issues.

b) Would the community benefit from a tool that provides higher-order functionality for a better UX? I think yes, but it could be a sub-project or a third project entirely.

There may be multiple levels here. The project benefits from having a bootstrapping tool to go from 0 -> cluster-api, but how friendly a UX we can accomplish while keeping the relatively un-opinionated nature of cluster-api proper is debatable.

For a proper user-friendly installer UX, I definitely think that is a separate project.

@timothysc
Member Author

Pivot is complicated for a few reasons, I'm not sure it could be simplified outside of an external tool.

I'm thinking of a Cluster API Operator to manage the lifecycle across all providers. This component could install the CRDs and controllers and turn down components if needed. It could also be the one to do the final teardown of the source components. The operator also solves part of our distribution problem.

@detiber
Member

detiber commented Jun 26, 2019

I definitely like the idea of an operator, but we also need to remember that deploying the operator also presents a chicken/egg situation where an existing cluster needs to be present.

@timothysc
Member Author

I'm totally cool with having a bootstrap cluster be a precondition.

@ncdc
Contributor

ncdc commented Jun 26, 2019

Same - kind is so easy to set up these days that it should be sufficient as a minimum requirement.

@detiber
Member

detiber commented Jun 26, 2019

The problem with kind as the bootstrapping cluster is that you still need something to handle pivoting. Or would we expect the operator to be able to solve the pivoting problem for us?

@timothysc
Member Author

My thought is that the operator would do the pivot.

@pires
Contributor

pires commented Jun 27, 2019

Here's my POV:

  • Things will be easier on the user if there's a bootstrap cluster, e.g. kind. I understand the chicken/egg argument, but there's no such thing as a free lunch.
  • Self-hosted (pivot) stuff is complicated, particularly if you're moving from the bootstrap cluster to the target cluster and need to account for concurrency. I wouldn't make it a must-have but a good-to-have, and give the group time to mature the idea and implementation.
    • What happens if the pivot pieces fail, i.e. a pod is evicted or a node goes away and there's no capacity to run a new instance? If the bootstrapper isn't there, bye bye cluster.
  • If we don't have pivot, as @detiber put it, kind cannot be the solution. However, I'd argue that kind could be the reference implementation, and the user should be able to opt to use any Kubernetes cluster, e.g. k3s or GKE.
  • clusterctl, or any other CLI, is not something I appreciate since, as @timothysc put it, kubectl can do the same things and a plug-in for it would still make things feel first-class.

@pires
Contributor

pires commented Jun 27, 2019

@detiber I don't want to hijack the discussion here, but since you mention the concurrency problems of moving from bootstrapper to pivot: as long as one can reach the other (say, the bootstrapper can reach the pivot), the controllers can be smart and rely on shared locks to guarantee there are no concurrency issues. Something along the following lines:

  • bootstrapper controller (bc1) locks object in its API;
  • bc1 creates cluster X;
  • bc1 eventually knows the cluster is up by contacting target API (I believe this is true since I've learned about remote node refs);
  • bc1 can lock a resource in the target API until pcX is ready;
  • pcX is healthy and bc1 can stop working. pcX gets the lock and takes over;
  • eventually, we can have bc1 still run for the lock to replace pcX in case pcX is unhealthy or can't come back.

Does it make any sense? If yes, then let's take it out of this issue, maybe?
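The handover above could be sketched with a coordination.k8s.io Lease as the lock object; the contexts, names, and namespace below are illustrative assumptions, not anything settled in this thread:

```shell
# bc1 "locks" by creating a Lease in the target cluster; pcX takes over by
# updating holderIdentity once it is healthy. The apiserver's optimistic
# concurrency (resourceVersion) prevents both from holding it at once.
kubectl --context target-cluster apply -f - <<'EOF'
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: cluster-api-pivot-lock
  namespace: kube-system
spec:
  holderIdentity: bootstrapper-bc1
  leaseDurationSeconds: 30
EOF
```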

@detiber
Member

detiber commented Jun 27, 2019

One thing to keep in mind in this discussion is that kubectl does not currently allow setting foreground propagation during delete, which is needed to ensure that resources are cleaned up properly; see #985. Providing something to safely delete resources might be a good thing, whether that is a standalone tool ala clusterctl or a kubectl plugin.
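Until kubectl grows that option, one workaround is to set the propagation policy in DeleteOptions directly against the API; the group/version and resource names below are illustrative:

```shell
# Foreground deletion via the raw API: the apiserver deletes dependents
# before removing the owner object itself.
kubectl proxy --port=8001 &
curl -X DELETE \
  -H 'Content-Type: application/json' \
  -d '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Foreground"}' \
  http://127.0.0.1:8001/apis/cluster.k8s.io/v1alpha1/namespaces/default/clusters/my-cluster
```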

@alexeldeib
Contributor

I like the idea of a kubectl cluster bootstrap plugin. kubectl create cluster would be nice, but I don't think you can override create like that (?).
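For illustration, kubectl (>= 1.12) discovers any executable named kubectl-&lt;name&gt; on the PATH and runs it as `kubectl <name>`; built-in verbs like `kubectl create` cannot be shadowed by a plugin, so `kubectl cluster create` is about as close as it gets. A hypothetical skeleton (the command names and output are made up for the sketch):

```shell
# Hypothetical plugin skeleton: once this file is on the PATH, kubectl
# invokes it for `kubectl cluster <args>`.
mkdir -p /tmp/kplugins
cat > /tmp/kplugins/kubectl-cluster <<'EOF'
#!/usr/bin/env bash
case "$1" in
  create) echo "would apply Cluster/Machine manifests for: $2" ;;
  *)      echo "usage: kubectl cluster create <name>" >&2; exit 1 ;;
esac
EOF
chmod +x /tmp/kplugins/kubectl-cluster
/tmp/kplugins/kubectl-cluster create my-cluster
```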

@dlipovetsky
Contributor

I don't see a need for clusterctl. I've been using CAPA entirely without clusterctl. I create a management cluster with kind (one command) and then use kubectl to deploy the CAPI/CAPA bits, and the workload cluster(s).

I do use a CAPA-maintained helper shell script to generate the manifests, though. With the infra/bootstrap provider split, I can see each infra and bootstrap provider shipping its own tool to help generate manifests. There might be room for a CAPI-maintained tool that helps generate the now provider-agnostic Cluster and Machine manifests.

Haven't thought this through, but I suspect kubectl plugins using kustomize would do the job nicely.
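If that tool were kustomize-shaped, the provider-agnostic pieces could live in a base with per-environment overlays; a hypothetical layout (all names invented for the sketch):

```shell
# Hypothetical: provider-agnostic Cluster/Machine manifests composed and
# prefixed by kustomize, then applied with plain kubectl.
cat > kustomization.yaml <<'EOF'
resources:
  - cluster.yaml
  - machines.yaml
namePrefix: dev-
EOF
kustomize build . | kubectl apply -f -
```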

@ncdc ncdc added the area/clusterctl Issues or PRs related to clusterctl label Sep 4, 2019
@timothysc timothysc modified the milestones: v0.2.x (v1alpha2), Next Sep 6, 2019
@timothysc timothysc added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Sep 27, 2019
@timothysc timothysc modified the milestones: Next, v0.3.0 Sep 27, 2019
@fabriziopandini
Member

@timothysc @ncdc
There is a CAEP in flight defining the way forward for clusterctl, so IMO we can close this issue.
In the meantime, I'm moving the pivoting bits to a separate issue.

@liztio
Contributor

liztio commented Oct 28, 2019

/close

@k8s-ci-robot
Contributor

@liztio: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
