⚠️ Add watch label to allow multiple manager instances #4119

Merged
merged 1 commit into from
Feb 9, 2021

Conversation

MarcelMue
Contributor

What this PR does / why we need it:

This PR adds a flag that restricts which CRs the controllers process, based on a given label.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #4004

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jan 25, 2021
@k8s-ci-robot
Contributor

Hi @MarcelMue. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 25, 2021
@MarcelMue MarcelMue changed the title ✨ Add watch label to allow running multiple instances of the operator ✨ Add watch label to allow running multiple instances of the operator Jan 25, 2021
@MarcelMue MarcelMue marked this pull request as ready for review January 26, 2021 10:38
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 26, 2021
@@ -157,6 +177,12 @@ func ResourceNotPaused(logger logr.Logger) predicate.Funcs {
}
}

// ResourceNotPausedAndHasLabel returns a predicate that returns true only if the
// ResourceNotPaused and ResourceHasLabel predicates return true.
func ResourceNotPausedAndHasLabel(logger logr.Logger, labelValue string) predicate.Funcs {
Contributor Author

Something similar was implemented before but then removed. I tried to follow that structure and naming.

Member

If there are no performance concerns, I would prefer to keep the paused check and the label filter as separate predicates.
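
For illustration, here is a minimal, self-contained sketch of what keeping the two checks as separate predicates and composing them at the call site could look like. It deliberately does not use the controller-runtime types from the diff: `Object`, `Predicate`, `NotPaused`, `HasLabel`, `All`, and both keys are hypothetical names invented for this example. Treating an empty label value as "accept everything" is also an assumption, chosen so that the filter stays opt-in.

```go
package main

import "fmt"

// Object is a hypothetical stand-in for a Kubernetes object's metadata.
type Object struct {
	Name        string
	Labels      map[string]string
	Annotations map[string]string
}

// Predicate reports whether an object should be reconciled.
type Predicate func(Object) bool

// Hypothetical keys, chosen for illustration only.
const (
	pausedAnnotation = "example.com/paused"
	watchLabel       = "example.com/watch-filter"
)

// NotPaused accepts objects that do not carry the paused annotation.
func NotPaused() Predicate {
	return func(o Object) bool {
		_, paused := o.Annotations[pausedAnnotation]
		return !paused
	}
}

// HasLabel accepts objects whose watch label equals labelValue.
// Assumption: an empty labelValue accepts everything (filter disabled).
func HasLabel(labelValue string) Predicate {
	return func(o Object) bool {
		if labelValue == "" {
			return true
		}
		return o.Labels[watchLabel] == labelValue
	}
}

// All combines predicates; every one of them must accept the object.
func All(preds ...Predicate) Predicate {
	return func(o Object) bool {
		for _, p := range preds {
			if !p(o) {
				return false
			}
		}
		return true
	}
}

func main() {
	obj := Object{Name: "cluster-a", Labels: map[string]string{watchLabel: "v1"}}
	filter := All(NotPaused(), HasLabel("v1"))
	fmt.Println(filter(obj))
}
```

With this shape each predicate can be tested on its own, and a controller that only needs one of the two checks does not have to pull in the other.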

@JoelSpeed
Contributor

I think this might need some further discussion before we review. Personally I'd like to understand what the benefit of this approach is over using namespaces to separate the resources that should be reconciled by particular controllers

Member

@fabriziopandini fabriziopandini left a comment

/ok-to-test

If this behaviour applies to all the objects managed by Cluster API, we should get this documented in the contract, and implemented also in CAPBK, KCP & CAPD too

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 26, 2021
@MarcelMue
Contributor Author

I think this might need some further discussion before we review. Personally I'd like to understand what the benefit of this approach is over using namespaces to separate the resources that should be reconciled by particular controllers

There has been discussion on this topic in several different places so far - it was also acknowledged here:
#4074

Google doc which was shared with sig-cluster-lifecycle:
https://docs.google.com/document/d/1Icttu56PmbDm70g8GyXupRq3MX4NADuGDoVhjwQwHMo/

Discussions during sync (just two examples):
https://www.youtube.com/watch?t=649&v=FLUV7QB2QL8
https://www.youtube.com/watch?t=269&v=tV1yoeypfWk

To answer your question briefly:
Namespace separation is not well suited to multi-tenancy (in particular with multiple core controllers in different versions), as it implies moving CRs between namespaces.

Please let me know if there is a clear concern here. I am open to discussing it, but IMO there has already been a lot of discussion leading up to this PR.

@MarcelMue
Contributor Author

If this behaviour applies to all the objects managed by Cluster API, we should get this documented in the contract, and implemented also in CAPBK, KCP & CAPD too

Should I align the docs in this same PR, or would this be a follow-up?
I am totally fine with bringing this change to the other providers once this is merged.

Member

@vincepri vincepri left a comment

Only concern that came up during review is that now the predicate has to be added to each controller separately (which is also hard to test and could break any time).

Is Namespace based filtering not enough to run multiple controllers?

With namespace-based filters we also get the benefit that the informers and all internal caches only ever watch that namespace. With label-based predicates, by contrast, the cache is still populated in full, and watch notifications still arrive but are then discarded.
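
A toy model of that difference, for readers unfamiliar with the trade-off. This is only a sketch under stated assumptions: it uses no real informer or cache API, and `Object`, `Cache`, `NamespaceScopedCache`, and `LabelFilteredReconcile` are hypothetical names made up for the example.

```go
package main

import "fmt"

// Object is a hypothetical stand-in for a watched resource.
type Object struct {
	Namespace string
	Name      string
	Labels    map[string]string
}

// Cache is a toy stand-in for an informer cache.
type Cache struct{ Items []Object }

// NamespaceScopedCache mimics a watch restricted to one namespace:
// objects from other namespaces never enter the cache at all.
func NamespaceScopedCache(all []Object, ns string) Cache {
	var c Cache
	for _, o := range all {
		if o.Namespace == ns {
			c.Items = append(c.Items, o)
		}
	}
	return c
}

// LabelFilteredReconcile mimics a cluster-wide watch with a label predicate:
// every object is cached, and the predicate only decides what is reconciled.
func LabelFilteredReconcile(all []Object, key, value string) (Cache, []Object) {
	var c Cache
	var reconciled []Object
	for _, o := range all {
		c.Items = append(c.Items, o) // cached regardless of the label
		if o.Labels[key] == value {
			reconciled = append(reconciled, o)
		}
	}
	return c, reconciled
}

func main() {
	all := []Object{
		{Namespace: "team-a", Name: "c1", Labels: map[string]string{"filter": "v1"}},
		{Namespace: "team-b", Name: "c2", Labels: map[string]string{"filter": "v2"}},
		{Namespace: "team-b", Name: "c3", Labels: map[string]string{"filter": "v2"}},
	}
	scoped := NamespaceScopedCache(all, "team-a")
	full, reconciled := LabelFilteredReconcile(all, "filter", "v1")
	fmt.Println(len(scoped.Items), len(full.Items), len(reconciled))
}
```

In both cases only one object ends up being reconciled; the difference is how much sits in the cache along the way.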

@MarcelMue
Contributor Author

Is Namespace based filtering not enough to run multiple controllers?

To me this mostly boils down to usability. I agree that there are some performance downsides, but this is a completely optional flag, and in my experience with this approach the performance impact is negligible.

Usability is important to me because this enables switching the reconciling controller with a single command: changing the label. I feel like that is easy to understand, execute, and handle.

With a namespaced approach we would first have to shape our namespace structure to fit the controller versions we use, which is not very nice IMO.
Additionally, switching the version now becomes a rather complex action:

  1. Pause all CRs of the Cluster.
  2. Move all CRs of the Cluster to a different namespace.
  3. Unpause all CRs of the Cluster.

A basically atomic action in the workflow I propose here becomes a rather error-prone, multi-step process.
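
To make the comparison concrete, here is a small sketch of the two workflows on in-memory objects. It talks to no API server; `CR`, `SwitchByLabel`, `SwitchByNamespace`, and the label key are hypothetical names used only for illustration, and the namespace variant simply mirrors the three steps listed above.

```go
package main

import "fmt"

// watchLabel is a hypothetical label key used only in this sketch.
const watchLabel = "example.com/watch-filter"

// CR is a toy stand-in for a custom resource belonging to one Cluster.
type CR struct {
	Name      string
	Namespace string
	Paused    bool
	Labels    map[string]string
}

// SwitchByLabel hands the CRs to another manager by rewriting one label.
func SwitchByLabel(crs []*CR, value string) {
	for _, cr := range crs {
		if cr.Labels == nil {
			cr.Labels = map[string]string{}
		}
		cr.Labels[watchLabel] = value
	}
}

// SwitchByNamespace follows the three steps of the namespace workflow
// and returns them in the order they ran.
func SwitchByNamespace(crs []*CR, namespace string) []string {
	steps := []string{}
	for _, cr := range crs {
		cr.Paused = true
	}
	steps = append(steps, "pause")
	for _, cr := range crs {
		cr.Namespace = namespace
	}
	steps = append(steps, "move")
	for _, cr := range crs {
		cr.Paused = false
	}
	steps = append(steps, "unpause")
	return steps
}

func main() {
	crs := []*CR{
		{Name: "cluster-a", Namespace: "capi-v1"},
		{Name: "machine-a", Namespace: "capi-v1"},
	}
	SwitchByLabel(crs, "v2")
	fmt.Println(crs[0].Labels[watchLabel])
	fmt.Println(SwitchByNamespace(crs, "capi-v2"))
}
```

If the namespace workflow fails part-way through, the objects are left paused or split across namespaces, which is the error-prone part; the label change has no such intermediate state per object.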

I feel like we discussed these points before in a meeting. I am also more than happy to provide E2E tests for this functionality (I think the E2E walkthrough is not out yet) if this increases your confidence.

@MarcelMue
Contributor Author

@CecileRobertMichon You also had opinions on this IIRC.

@CecileRobertMichon
Contributor

Only concern that came up during review is that now the predicate has to be added to each controller separately (which is also hard to test and could break any time).

I'm +1 on this PR. There is precedent for this with the Pause annotation. In addition, the default behavior does not change and this is completely opt-in, so the risk is low. We should definitely consider adding an E2E test for preventing future regressions though.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 1, 2021
@vincepri
Member

vincepri commented Feb 1, 2021

I'm +1 on this PR. There is precedent for this with the Pause annotation

@CecileRobertMichon Did you mean the feature gates, or did I miss the use case you were referring to?

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 1, 2021
@CecileRobertMichon
Contributor

@vincepri no I meant the pause annotation predicate (which had to be added to every single controller, including in infra providers)

@vincepri
Member

vincepri commented Feb 1, 2021

@CecileRobertMichon Oh I see, the implementation bits, yes. The flag though also has to be kept in sync, which is what I was thinking w.r.t. the feature gates.

Thanks @MarcelMue for the explanation above, as long as we document these requirements and limitations +1

Member

@vincepri vincepri left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 5, 2021
@CecileRobertMichon
Contributor

I would suggest titling the PR with a warning, given we are changing the contract, and asking the providers' maintainers for a quick +1 as well.

Azure provider maintainer here - /lgtm

/assign @randomvariable @yastij @cpanato

@CecileRobertMichon
Contributor

@MarcelMue
Contributor Author

Actually, I just realized we should also update https://github.com/kubernetes-sigs/cluster-api/blob/master/docs/book/src/developer/architecture/controllers/support-multiple-instances.md @MarcelMue

This PR already contains a simple addition of the label in the same section where the namespace flag is mentioned. Would you like to add something additional there? I felt like the general statement in that file remained unchanged:
https://github.com/kubernetes-sigs/cluster-api/pull/4119/files#diff-e022c3c4584b8bb87c39431d43bdac3c6d62743735ede072b8a1a1bf9e1d64ce

@CecileRobertMichon
Contributor

This PR already contains a simple addition of the label in the same section where the namespace flag is mentioned. Would you like to add something additional there

Ahh I must have missed that on my re-read. Thanks, all good from my side then.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 8, 2021
Member

@yastij yastij left a comment

/lgtm

pending rebase, fyi @gab-satchi

@k8s-ci-robot k8s-ci-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Feb 9, 2021
@MarcelMue
Contributor Author

/lgtm

pending rebase, fyi @gab-satchi

Should be all rebased correctly now :)

Contributor

@CecileRobertMichon CecileRobertMichon left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 9, 2021
Member

@vincepri vincepri left a comment

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vincepri

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 9, 2021
@vincepri
Member

vincepri commented Feb 9, 2021

/retitle ⚠️ Add watch label to allow multiple instances of controllers

@k8s-ci-robot k8s-ci-robot changed the title ⚠️ Add watch label to allow running multiple instances of the operator ⚠️ Add watch label to allow multiple instances of controllers Feb 9, 2021
@vincepri
Member

vincepri commented Feb 9, 2021

/retitle ⚠️ Add watch label to allow multiple manager instances

@k8s-ci-robot k8s-ci-robot changed the title ⚠️ Add watch label to allow multiple instances of controllers ⚠️ Add watch label to allow multiple manager instances Feb 9, 2021
@k8s-ci-robot k8s-ci-robot merged commit 4a6cccd into kubernetes-sigs:master Feb 9, 2021
@MarcelMue MarcelMue deleted the add-watch-label branch March 16, 2021 09:06
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enable running multiple versions of CAPI controllers in paralell
9 participants