Skip to content

🐛 WIP: Stop cluster/topology reconciliation on Status changes #5956

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

killianmuldoon
Copy link
Contributor

Signed-off-by: killianmuldoon [email protected]

Add a predicate to the Cluster Topology controller to only trigger reconciliation when the GenerationNumber of the Cluster resource changes. This change will result in fewer reconciliation loops and fix an issue where a bad spec was causing multiple instant reconciliations.

Fixes #5945

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 19, 2022
@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Jan 19, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign fabriziopandini after the PR has been reviewed.
You can assign the PR to them by writing /assign @fabriziopandini in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@@ -77,6 +79,8 @@ func (r *Reconciler) SetupWithManager(ctx context.Context, mgr ctrl.Manager, opt
For(&clusterv1.Cluster{}, builder.WithPredicates(
// Only reconcile Cluster with topology.
predicates.ClusterHasTopology(ctrl.LoggerFrom(ctx)),
// Only reconcile on Cluster Generation change to avoid reconciling on Status updates.
ctrlpredicates.GenerationChangedPredicate{},
Copy link
Member

@sbueringer sbueringer Jan 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks reasonable to me. As far as I'm aware there is no case where we want to reconcile the topology when the status changes.

I think it would be the far more generic solution to the problem, as we then don't have to make sure that every error we propagate into the conditions (which is afaik all of them) is constant with each failed reconcile.

@killianmuldoon killianmuldoon force-pushed the fix/no-exponential-backoff branch from 7d9965b to c35aa0e Compare January 20, 2022 11:14
@@ -30,6 +30,7 @@ import (
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/controller"
"sigs.k8s.io/controller-runtime/pkg/handler"
ctrlpredicates "sigs.k8s.io/controller-runtime/pkg/predicate"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I wonder if we should use crpredicates instead. I saw cr as prefix in other providers as prefix for controller-runtime and I think it's more intuitive and reads better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whatever suits - I picked ctrl because it's used to import controller runtime itself in other parts of the code, and ctrl client is used in places. We should try to keep the usage consistent if we can.

Copy link
Member

@sbueringer sbueringer Jan 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup agree. If nobody else has an opinion on that, let's just leave it as is.
I think currently this would affect:

  • ctrl => sigs.k8s.io/controller-runtime
  • ctrlclient => "sigs.k8s.io/controller-runtime/pkg/client" (which is only very rarely aliased at all, but I don't see a problem there)
  • ctrlpredicates "sigs.k8s.io/controller-runtime/pkg/predicate"

@fabriziopandini
Copy link
Member

fabriziopandini commented Jan 21, 2022

/hold
I don't think we should do this, this might lead to inconsistent behavior with all the other controllers.
Going back to the issue that triggered this word, I still think we should fix error messages that bubble up into conditions by removing random names; the full error message can be preserved in the log.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 21, 2022
@killianmuldoon
Copy link
Contributor Author

Closing this for now in favor of #5971

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reconciliation error doesn't result in exponential backoff
4 participants