go.*,pkg,vendor: Bump controller-runtime to v0.10.1 #2368

timflannagan
Member

@timflannagan timflannagan commented Sep 21, 2021

Description of the change:
Bumps the controller-runtime dependency to v0.10.1, which contains the fix for the race condition around metadata-only watchers.

Motivation for the change:
Removes the replace pin in the root go.mod.

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Docs updated or added to /doc
  • Commit messages sensible and descriptive

Fixes #2353

@timflannagan timflannagan force-pushed the bump-controller-runtime branch from 0273ed2 to 1c3da5a Compare September 21, 2021 19:15
@timflannagan timflannagan reopened this Sep 23, 2021
@timflannagan timflannagan force-pushed the bump-controller-runtime branch 3 times, most recently from 181ed76 to bfe84d4 Compare December 3, 2021 16:41
@@ -30,15 +30,15 @@ var _ = Describe("Not found APIs", func() {
// each entry is an installplan with a deprecated resource
type payload struct {
name string
-ip *operatorsv1alpha1.InstallPlan
+IP *operatorsv1alpha1.InstallPlan
Member

Non-blocking: Why was this exported?

@awgreene
Member

awgreene commented Dec 6, 2021

/approve

@openshift-ci

openshift-ci bot commented Dec 6, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: awgreene, timflannagan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 6, 2021
@awgreene
Member

awgreene commented Dec 6, 2021

[BeforeEach] Operator API
  /home/runner/work/operator-lifecycle-manager/operator-lifecycle-manager/test/e2e/operator_test.go:41
[It] should surface components in its status
  /home/runner/work/operator-lifecycle-manager/operator-lifecycle-manager/test/e2e/operator_test.go:73
STEP: eventually having a status that contains its component label selector
STEP: eventually listing a single component reference
first patch error: Operation cannot be fulfilled on serviceaccounts "sa-a-gcg2h": the object has been modified; please apply your changes to the latest version and try again
STEP: eventually listing multiple component references

Is there potentially a race condition here?
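
For context on the "object has been modified" message above: it is the API server's optimistic-concurrency conflict, returned when an update carries a stale resourceVersion. Below is a minimal, stdlib-only sketch of the usual re-read-and-retry pattern. The `Store`, `Object`, and `UpdateWithRetry` names are hypothetical stand-ins for illustration, not the client-go or OLM code paths, and this does not claim to be the cause of the flake.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrConflict stands in for the API server's conflict response (assumption).
var ErrConflict = errors.New("the object has been modified")

// Object is a hypothetical stand-in for a Kubernetes object.
type Object struct {
	ResourceVersion int
	Labels          map[string]string
}

// Store is a hypothetical in-memory stand-in for the API server.
type Store struct{ current Object }

// Get returns a deep copy so callers can hold a (possibly stale) snapshot.
func (s *Store) Get() Object {
	cp := Object{ResourceVersion: s.current.ResourceVersion, Labels: map[string]string{}}
	for k, v := range s.current.Labels {
		cp.Labels[k] = v
	}
	return cp
}

// Update rejects writes whose resourceVersion does not match the stored one.
func (s *Store) Update(o Object) error {
	if o.ResourceVersion != s.current.ResourceVersion {
		return ErrConflict
	}
	o.ResourceVersion++
	s.current = o
	return nil
}

// UpdateWithRetry re-reads the object and re-applies mutate after each conflict,
// giving up after the given number of attempts.
func UpdateWithRetry(s *Store, attempts int, mutate func(*Object)) error {
	for i := 0; i < attempts; i++ {
		obj := s.Get()
		mutate(&obj)
		err := s.Update(obj)
		if err == nil {
			return nil
		}
		if !errors.Is(err, ErrConflict) {
			return err
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, ErrConflict)
}

func main() {
	s := &Store{current: Object{ResourceVersion: 1, Labels: map[string]string{}}}

	stale := s.Get() // snapshot taken before another writer updates the object

	// A second writer bumps the stored resourceVersion.
	_ = UpdateWithRetry(s, 3, func(o *Object) { o.Labels["owner"] = "other-controller" })

	// Writing the stale snapshot now fails with a conflict.
	stale.Labels["component"] = "example"
	fmt.Println("stale update:", s.Update(stale))

	// Re-reading before the write lets the same change succeed.
	err := UpdateWithRetry(s, 3, func(o *Object) { o.Labels["component"] = "example" })
	fmt.Println("retried update:", err, s.Get().Labels)
}
```

A conflict like this is expected whenever two writers touch the same object, so on its own it may only indicate that the test helper patched from a stale copy.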

@awgreene
Member

awgreene commented Dec 6, 2021

/retest

crds.OperatorGroup(),
crds.Operator(),
crds.OperatorCondition(),
CRDs: []apiextensionsv1.CustomResourceDefinition{
Contributor

@tylerslaton tylerslaton Dec 6, 2021

nb: Is this something we should move to across the board when using controller-runtime moving forward?

@tylerslaton
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 6, 2021
@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@timflannagan
Member Author

Holding so the bot doesn't go crazy.

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 6, 2021
Member Author

@timflannagan timflannagan left a comment

It seems like we're continuously running into the following errors:

Summarizing 2 Failures:

[Fail] Operator API [It] should surface components in its status 
/home/runner/work/operator-lifecycle-manager/operator-lifecycle-manager/test/e2e/util_test.go:838

[Fail] Operator API when a subscription to a package exists [It] should automatically adopt components 
/home/runner/work/operator-lifecycle-manager/operator-lifecycle-manager/test/e2e/operator_test.go:346

When poking around these errors locally, it seems like both test failures involve existing ServiceAccounts that aren't being propagated to the status.components.refs array. Maybe we're missing that cache event entirely in some cases?

Note: in the context of the second test case, it looks like the ServiceAccount in the kiali bundle was labeled with the operator component. Is the adoption controller, which also uses metadata-only watches for a subset of the resource watches, responsible for ensuring that resource gets labeled?

@timflannagan
Copy link
Member Author

Hmm, weird: it looks like we do get the mapping function event properly, but we're somehow not able to update the Operator resource:

2021-12-08T19:40:37.265148583Z stderr F 2021-12-08T19:40:37.265Z	INFO	controllers.adoption	mapToClusterServiceVersions: requeuing a csv	{"gvk": "&TypeMeta{Kind:,APIVersion:,}", "name": "kiali-operator", "namespace": "ns-4f8q4"}
2021-12-08T19:40:37.265357386Z stderr F 2021-12-08T19:40:37.265Z	INFO	controllers.operator	mapComponentRequests: requeuing operator resource	{"gvk": "&TypeMeta{Kind:,APIVersion:,}", "name": "kiali-operator", "namespace": "ns-4f8q4"}

I'm going to try and update to also collect the state of that Operator resource.

@timflannagan timflannagan force-pushed the bump-controller-runtime branch from e136f98 to 6cb7f2d Compare December 8, 2021 21:20
@timflannagan timflannagan force-pushed the bump-controller-runtime branch 10 times, most recently from ad0cf83 to 5e3ee03 Compare December 8, 2021 23:59
@estroz
Member

estroz commented Dec 10, 2021

@timflannagan hoping #2515 doesn't screw this PR up. @dinhxuanvu and I needed to get unblocked for constraint stuff added to api.

@timflannagan
Member Author

@estroz Let's definitely touch base tomorrow. We'll need to cut an o-f/api minor release in any case, but there's some complexity involved with certain controller-runtime versions and their compatibility with OLM.

cc @awgreene @njhale

Update the CRD field to use the apiextensionsv1.CustomResourceDefinition
type instead of the client.Object interface. It looks like this change
was introduced during the sweep for the k8s 1.22 bump:

kubernetes-sigs/controller-runtime@b5eeb71

Signed-off-by: timflannagan <[email protected]>
Update the o/api dependency and pull in a new version that contains v1 CRDs. Newer controller-runtime versions don't support v1beta1 CRDs.

Signed-off-by: timflannagan <[email protected]>
Update the adoption controller test subscription fixture and remove the
status sub-resource block that's shared between tests. When an object is
created, its status sub-resource is now removed, so changes are centered
around setting this resource before firing off any Update calls used
throughout this test package.

Update any test/e2e APIs that also follow a similar pattern.

Signed-off-by: timflannagan <[email protected]>
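
To illustrate the status sub-resource behavior that commit message describes, here is a minimal, stdlib-only sketch. `FakeClient` and `Subscription` are hypothetical stand-ins and not the controller-runtime client or the OLM types; the sketch assumes, as the commit message states, that status set on an object is dropped at creation, so the fixture has to set status again and write it separately.

```go
package main

import "fmt"

// Subscription is a hypothetical stand-in for a resource with a status sub-resource.
type Subscription struct {
	Name   string
	Spec   string
	Status string
}

// FakeClient is a hypothetical in-memory client used only for illustration.
type FakeClient struct{ objects map[string]Subscription }

// NewFakeClient returns an empty client.
func NewFakeClient() *FakeClient {
	return &FakeClient{objects: map[string]Subscription{}}
}

// Create stores the object without its status and, like a client that writes the
// server response back, clears the status on the caller's copy (assumption).
func (c *FakeClient) Create(s *Subscription) {
	s.Status = ""
	c.objects[s.Name] = *s
}

// UpdateStatus writes only the status field of an already-created object.
func (c *FakeClient) UpdateStatus(s *Subscription) error {
	stored, ok := c.objects[s.Name]
	if !ok {
		return fmt.Errorf("subscription %q not found", s.Name)
	}
	stored.Status = s.Status
	c.objects[s.Name] = stored
	return nil
}

// Get returns the stored object, if any.
func (c *FakeClient) Get(name string) (Subscription, bool) {
	s, ok := c.objects[name]
	return s, ok
}

func main() {
	c := NewFakeClient()
	sub := &Subscription{Name: "sub-a", Spec: "channel: stable", Status: "AtLatestKnown"}

	c.Create(sub)
	fmt.Printf("after create: status=%q\n", sub.Status)

	// The fixture has to set status again after creation, then write it separately.
	sub.Status = "AtLatestKnown"
	if err := c.UpdateStatus(sub); err != nil {
		fmt.Println("update status failed:", err)
		return
	}
	stored, _ := c.Get("sub-a")
	fmt.Printf("after status update: status=%q\n", stored.Status)
}
```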
@timflannagan timflannagan force-pushed the bump-controller-runtime branch from 05bd756 to 83c4cad Compare December 10, 2021 16:21
@estroz
Member

estroz commented Dec 10, 2021

/lgtm
/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 10, 2021
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 10, 2021
@openshift-merge-robot openshift-merge-robot merged commit 1f2472f into operator-framework:master Dec 10, 2021
@timflannagan timflannagan deleted the bump-controller-runtime branch December 10, 2021 19:41
// Patch for a race condition involving metadata-only
// informers until it can be resolved upstream:
-sigs.k8s.io/controller-runtime v0.9.2 => github.com/benluddy/controller-runtime v0.9.3-0.20210720171926-9bcb99bd9bd3
+sigs.k8s.io/controller-runtime v0.10.0 => github.com/timflannagan/controller-runtime v0.10.1-0.20211210161403-6756a4203e70
Member Author

Tracking the removal of this pin in #2353 (comment).

Successfully merging this pull request may close these issues.

Remove controller-runtime replace pin