
Umbrella: Breaking apart clusterctl #1065


Closed
timothysc opened this issue Jun 24, 2019 · 23 comments
Assignees
Labels
area/clusterctl: Issues or PRs related to clusterctl
kind/feature: Categorizes issue or PR as related to a new feature.
priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Milestone

Comments

@timothysc
Member

timothysc commented Jun 24, 2019

Describe the solution you'd like
As an old-timer to kubernetes, I find clusterctl to be weird... It's performing operations which "could be preconditions", and also performing operations which IMO should be part of the .spec of the objects. This issue tries to break down some of the details, and feedback is solicited.

  1. Building a bootstrap cluster...

    • Why? We can / should have it as a precondition for folks to run kubectl apply
  2. Create / Delete (CRDs, Cluster, Machines, ... )

    • Why? That's just a kubectl apply OR delete... possibly with a kubectl plugin to make it feel more 1st classed.
  3. Pivot

    • IMO, this should be a field in the cluster.spec that is part of a state machine of the cluster object.
  4. Kubeconfig

    • There are many ways to drop an encrypted secret config for cluster access, and I think creating a workflow for this might better serve the community.
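Taken together, items 1 and 2 would reduce to something like the following; the kind-based bootstrap and the file names are illustrative assumptions, not anything prescribed here:

```shell
# Sketch of the kubectl-only workflow: a bootstrap cluster is a
# precondition (item 1), and create/delete is plain kubectl (item 2).
kind create cluster --name bootstrap            # precondition, one command
kubectl apply -f provider-components.yaml       # CRDs + controllers
kubectl apply -f cluster.yaml -f machines.yaml  # create the workload cluster
kubectl delete -f machines.yaml                 # teardown is just kubectl delete
```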

What I'm really struggling with is... do we really need this tool? There are portions of clusterctl that could move into a client library of aggregate utility functions for common operations that providers could leverage. I also think a kubectl plugin might be generally useful to treat cluster-api objects as first-class resources, but other than that? In a v1alpha2 world, what workflows that clusterctl provides would be missing?

/kind feature
/cc @ncdc @vincepri @detiber

@timothysc timothysc added the kind/feature Categorizes issue or PR as related to a new feature. label Jun 24, 2019
@timothysc timothysc added this to the v1alpha2 milestone Jun 24, 2019
@timothysc timothysc self-assigned this Jun 24, 2019
@moshloop
Contributor

Agreed on the clusters and machines; CRDs are a little different, though.

There are a few high-level features that I think clusterctl could serve:

  1. A generic install

e.g. capdctl crds | kubectl apply -f - from https://github.com/kubernetes-sigs/cluster-api-provider-docker.

Install could probably be solved by hosting a concatenated YAML if it were the only use case

  2. clusterctl logs for streaming logs from machines (not sure if kubectl logs would be extensible for this, but a kubectl machine-logs plugin would work as well)

  3. clusterctl status - provide more real-time status e.g. node readiness, conditions, top, node events

@timothysc
Member Author

timothysc commented Jun 25, 2019

1 & 3 could be solved with a kubectl plugin. Or possibly an AddOn operator pattern.

2 is a non-starter imo. If you want that, you should set up your own logging service.

@detiber
Member

detiber commented Jun 25, 2019

  3. Pivot
  • IMO, this should be a field in the cluster.spec that is part of a state machine of the cluster object.

Pivot is complicated for a few reasons, I'm not sure it could be simplified outside of an external tool.

  • cluster-api components need to be moved to a new cluster
  • the cluster-api controllers in the source cluster need to be scaled down before the target cluster's cluster-api controllers start running, to avoid multiple sets of controllers running at the same time
  • cluster and machine* objects need to be deleted out of the source cluster without removing the underlying resources (currently called "force delete" and done by removing the finalizers before deleting)
  • cluster and machine* objects need to be created in the right order on the target cluster
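The "force delete" step above might look like this with plain kubectl, assuming an illustrative Cluster named my-cluster in the default namespace; clearing the finalizers prevents the controllers from tearing down the underlying infrastructure before the object is removed:

```shell
# Strip finalizers so deletion skips the provider's infrastructure cleanup,
# then delete the now-unguarded object from the source cluster.
kubectl patch cluster my-cluster --type merge -p '{"metadata":{"finalizers":null}}'
kubectl delete cluster my-cluster --wait=false
```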

@detiber
Member

detiber commented Jun 25, 2019

  1. Building a bootstrap cluster...
  • Why? We can / should have it as a precondition for folks to run kubectl apply

The big reason for this was to simplify the user experience and avoid needing to run manual pre-steps.

@detiber
Member

detiber commented Jun 25, 2019

  4. Kubeconfig
  • There are many ways to drop an encrypted secret config for cluster access, and I think creating a workflow for this might better serve the community.

+1 to this approach going forward; it didn't exist for the first iteration.

@detiber
Member

detiber commented Jun 25, 2019

Overall, I think kubectl plugins are a great way for us to move forward; when clusterctl was started, kubectl plugins either did not exist yet or had just been implemented with little to no documentation.

@moshloop
Contributor

Perhaps we need to reframe the question:

a) Should CAPI and CAPI providers depend on/require clusterctl? I think not.
b) Would the community benefit from a tool that provides higher-order functionality for a better UX? I think yes, but it could be a sub-project or a third project entirely.

@moshloop
Contributor

  1. Building a bootstrap cluster...
  • Why? We can / should have it as a precondition for folks to run kubectl apply

The big reason for this was to simplify the user experience and avoid needing to run manual pre-steps.

I think the UX with kind is good enough, if not great

@detiber
Member

detiber commented Jun 25, 2019

a) Should CAPI and CAPI providers depend on/require clusterctl? I think not.

I agree here 100%; there should definitely be workflows that exist without clusterctl. That said, certain functionality, such as pivot, will require explicit documentation on how to do it without causing issues.

b) Would the community benefit from a tool that provides higher-order functionality for a better UX? I think yes, but it could be a sub-project or a third project entirely.

There may be multiple levels here. The project benefits from having a bootstrapping tool to go from 0 -> cluster-api, but how friendly a UX we can accomplish while keeping the relatively un-opinionated nature of cluster-api proper is debatable.

For a proper user-friendly installer UX, I definitely think that is a separate project.

@timothysc
Member Author

Pivot is complicated for a few reasons, I'm not sure it could be simplified outside of an external tool.

I'm thinking of a Cluster API Operator to manage the lifecycle across all providers. This component could install the CRDs and controllers and turn down components if needed. It could also be the one to do the final teardown of the source components. The operator also solves part of our distribution problem.

@detiber
Member

detiber commented Jun 26, 2019

I definitely like the idea of an operator, but we also need to remember that deploying the operator also presents a chicken/egg situation where an existing cluster needs to be present.

@timothysc
Member Author

I'm totally cool with having a bootstrap cluster be a precondition.

@ncdc
Contributor

ncdc commented Jun 26, 2019

Same - kind is so easy to set up these days that it should be sufficient as a minimum requirement.

@detiber
Member

detiber commented Jun 26, 2019

The problem with kind as the bootstrapping cluster is that you still need something to handle pivoting. Or would we expect the operator to be able to solve the pivoting problem for us?

@timothysc
Member Author

My thought is that the operator would do the pivot.

@pires
Contributor

pires commented Jun 27, 2019

Here's my POV:

  • Things will be easier on the user if there's a bootstrap cluster, e.g. kind. I understand the chicken/egg argument, but there's no such thing as a free lunch.
  • Self-hosted (pivot) stuff is complicated, particularly if you're moving from the bootstrap cluster to the target cluster and need to account for concurrency. I wouldn't make it a must-have but a good-to-have, and give the group time to mature the idea and implementation.
    • What happens if the pivot pieces fail, i.e. a pod is evicted or a node goes away and there's no capacity to run a new instance? If the bootstrapper isn't there, bye bye cluster.
  • If we don't have pivot, as @detiber put it, kind cannot be the solution. However, I'd argue that kind could be the reference implementation, and the user should be able to opt to use any Kubernetes cluster, e.g. k3s or GKE.
  • clusterctl, or any other CLI, is not something I appreciate since, as @timothysc put it, kubectl can do the same things and a plug-in for it would still make things feel first-class.

@pires
Contributor

pires commented Jun 27, 2019

@detiber I don't want to hijack the discussion here, but since you mention the concurrency problems of moving from bootstrapper to pivot: as long as one can reach the other (say, the bootstrapper can reach the pivot), the controllers can be smart and rely on shared locks to guarantee there are no concurrency issues. Something along the following lines:

  • bootstrapper controller (bc1) locks object in its API;
  • bc1 creates cluster X;
  • bc1 eventually knows the cluster is up by contacting target API (I believe this is true since I've learned about remote node refs);
  • bc1 can lock a resource in the target API until pcX is ready;
  • pcX is healthy and bc1 can stop working. pcX gets the lock and takes over;
  • eventually, we can have bc1 still run for the lock to replace pcX in case pcX is unhealthy or can't come back.

Does it make any sense? If yes, then let's take it out of this issue, maybe?
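The handover above could be sketched with a coordination.k8s.io Lease as the lock object; the contexts, names, and namespace below are illustrative assumptions, not anything settled in this thread:

```shell
# bc1 "locks" by creating a Lease in the target cluster; pcX takes over by
# updating holderIdentity once it is healthy. The apiserver's optimistic
# concurrency (resourceVersion) prevents both from holding it at once.
kubectl --context target-cluster apply -f - <<'EOF'
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: cluster-api-pivot-lock
  namespace: kube-system
spec:
  holderIdentity: bootstrapper-bc1
  leaseDurationSeconds: 30
EOF
```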

@detiber
Member

detiber commented Jun 27, 2019

One thing to keep in mind in this discussion is that kubectl does not currently allow setting foreground propagation during delete, which is needed to ensure that resources are cleaned up properly; see #985. Providing something to safely delete resources might be a good thing, whether that is a standalone tool ala clusterctl or a kubectl plugin.
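Until kubectl grows that option, one workaround is to set the propagation policy in DeleteOptions directly against the API; the group/version and resource names below are illustrative:

```shell
# Foreground deletion via the raw API: the apiserver deletes dependents
# before removing the owner object itself.
kubectl proxy --port=8001 &
curl -X DELETE \
  -H 'Content-Type: application/json' \
  -d '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Foreground"}' \
  http://127.0.0.1:8001/apis/cluster.k8s.io/v1alpha1/namespaces/default/clusters/my-cluster
```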

@alexeldeib
Contributor

I like the idea of a kubectl cluster bootstrap plugin. kubectl create cluster would be nice, but I don't think you can override create like that (?).
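For illustration, kubectl (>= 1.12) discovers any executable named kubectl-&lt;name&gt; on the PATH and runs it as `kubectl <name>`; built-in verbs like `kubectl create` cannot be shadowed by a plugin, so `kubectl cluster create` is about as close as it gets. A hypothetical skeleton (the command names and output are made up for the sketch):

```shell
# Hypothetical plugin skeleton: once this file is on the PATH, kubectl
# invokes it for `kubectl cluster <args>`.
mkdir -p /tmp/kplugins
cat > /tmp/kplugins/kubectl-cluster <<'EOF'
#!/usr/bin/env bash
case "$1" in
  create) echo "would apply Cluster/Machine manifests for: $2" ;;
  *)      echo "usage: kubectl cluster create <name>" >&2; exit 1 ;;
esac
EOF
chmod +x /tmp/kplugins/kubectl-cluster
/tmp/kplugins/kubectl-cluster create my-cluster
```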

@dlipovetsky
Contributor

I don't see a need for clusterctl. I've been using CAPA entirely without clusterctl. I create a management cluster with kind (one command) and then use kubectl to deploy the CAPI/CAPA bits, and the workload cluster(s).

I do use a CAPA-maintained helper shell script to generate the manifests, though. With the infra/bootstrap provider split, I can see each infra and bootstrap provider shipping its own tool to help generate manifests. There might be room for a CAPI-maintained tool that helps generate the now provider-agnostic Cluster and Machine manifests.

Haven't thought this through, but I suspect kubectl plugins using kustomize would do the job nicely.
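If that tool were kustomize-shaped, the provider-agnostic pieces could live in a base with per-environment overlays; a hypothetical layout (all names invented for the sketch):

```shell
# Hypothetical: provider-agnostic Cluster/Machine manifests composed and
# prefixed by kustomize, then applied with plain kubectl.
cat > kustomization.yaml <<'EOF'
resources:
  - cluster.yaml
  - machines.yaml
namePrefix: dev-
EOF
kustomize build . | kubectl apply -f -
```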

@ncdc ncdc added the area/clusterctl Issues or PRs related to clusterctl label Sep 4, 2019
@timothysc timothysc modified the milestones: v0.2.x (v1alpha2), Next Sep 6, 2019
@timothysc timothysc added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Sep 27, 2019
@timothysc timothysc modified the milestones: Next, v0.3.0 Sep 27, 2019
@fabriziopandini
Member

@timothysc @ncdc
There is a CAEP in flight defining the way forward for clusterctl, so IMO we can close this issue.
In the meantime, I'm moving the pivoting bits to a separate issue.

@liztio
Contributor

liztio commented Oct 28, 2019

/close

@k8s-ci-robot
Contributor

@liztio: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
