CNTRLPLANE-78: Move Group informer configuration to RestrictSubjectBindings plugin initialization #2157
base: master
@everettraven: the contents of this pull request could not be automatically validated. The following commits could not be validated and must be approved by a top-level approver:
Comment |
Skipping CI for Draft Pull Request. |
Force-pushed from 80e26cb to e933af2.
Force-pushed from e933af2 to edf1675.
Force-pushed from 5e5552d to e25f667.
Force-pushed from e25f667 to b7800b2.
@everettraven: This pull request references CNTRLPLANE-78, which is a valid Jira issue. Warning: the referenced Jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.
/retest |
/retest-required |
/retest |
    if err := userInformers.User().V1().Groups().Informer().AddIndexers(cache.Indexers{
        usercache.ByUserIndexName: usercache.ByUserIndexKeys,
    }); err != nil {
        return
    }
Would something like this for error handling be appropriate here?
Suggested change:
     if err := userInformers.User().V1().Groups().Informer().AddIndexers(cache.Indexers{
         usercache.ByUserIndexName: usercache.ByUserIndexKeys,
     }); err != nil {
    +    utilruntime.HandleError(err)
         return
     }
Ah, good idea. I wasn't aware there was a utility for handling runtime errors. I'll update this.
Updated in the latest push
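For readers unfamiliar with the indexer registered in the snippet under review, here is a minimal, self-contained sketch of what a "by user" index over groups might look like. It does not use client-go: `Group`, `Index`, and `byUserKeys` are hypothetical stand-ins, and the idea that the index keys are a group's member user names is an assumption inferred from the name `ByUserIndexKeys`, not something confirmed in this thread.

```go
package main

import "sort"

// Group is a hypothetical stand-in for the OpenShift user Group type.
type Group struct {
	Name  string
	Users []string
}

// IndexFunc computes the index keys for one object, loosely mirroring the
// shape of an index function registered through AddIndexers.
type IndexFunc func(g Group) []string

// byUserKeys indexes a group under each of its member user names.
// Assumption: this is what a "by user" index does; it is not the real code.
func byUserKeys(g Group) []string { return g.Users }

// Index is a toy secondary index from key to group names.
type Index struct {
	fn      IndexFunc
	entries map[string][]string
}

// NewIndex builds an empty index that uses fn to compute keys.
func NewIndex(fn IndexFunc) *Index {
	return &Index{fn: fn, entries: map[string][]string{}}
}

// Add records the group under every key produced by the index function.
func (i *Index) Add(g Group) {
	for _, key := range i.fn(g) {
		i.entries[key] = append(i.entries[key], g.Name)
	}
}

// GroupsFor returns the sorted names of the groups indexed under user.
func (i *Index) GroupsFor(user string) []string {
	out := append([]string(nil), i.entries[user]...)
	sort.Strings(out)
	return out
}
```

With such an index, looking up the groups of a single user is a map lookup instead of a scan over every cached group, which is presumably why the group cache wants it registered before use.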
Force-pushed from 03e0545 to fd11785.
Force-pushed from fd11785 to 5592a48.
/retest-required |
/lgtm |
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: everettraven, liouk.
        usercache.ByUserIndexName: usercache.ByUserIndexKeys,
    }); err != nil {
        utilruntime.HandleError(err)
        return
This matches how errors are handled elsewhere in this file, and it seems correct: this is an admission plugin, so the error is not user-facing.
        usercache.ByUserIndexName: usercache.ByUserIndexKeys,
    }); err != nil {
        return nil, err
    }
In the enhancement, you have this:
- `Group` informer creation and configuration is moved into the `authorization.openshift.io/RestrictSubjectBindings` admission plugin initialization process
From what I understand of your PR, you are not moving the creation of the informer; only the registration of the indexer moves into the admission plugin's initialization. I assume that's what you meant by "configuration" (i.e., the indexer).
However, in the PR description, you said:
(...)
This is necessary to prevent the startup of an informer for the Group API when the plugin is disabled
(...)
So I'm wondering how this is preventing the startup of the informer. It seems to me that it's still being started as before. Am I missing anything?
The object passed to the admission plugin initializer is an informer factory, and it's the chained call to Informer() that both creates and registers a new informer if needed (and the informer is later started by SharedInformerFactory's Start). So moving that call to plugin initialization instead of run-always should actually work. This all assumes that the one call being moved is the only place a User Group informer is being requested.
Lines 162 to 184 in 2f2cf38
    // InformerFor returns the SharedIndexInformer for obj using an internal
    // client.
    func (f *sharedInformerFactory) InformerFor(obj runtime.Object, newFunc internalinterfaces.NewInformerFunc) cache.SharedIndexInformer {
        f.lock.Lock()
        defer f.lock.Unlock()

        informerType := reflect.TypeOf(obj)
        informer, exists := f.informers[informerType]
        if exists {
            return informer
        }

        resyncPeriod, exists := f.customResync[informerType]
        if !exists {
            resyncPeriod = f.defaultResync
        }

        informer = newFunc(f.client, resyncPeriod)
        informer.SetTransform(f.transform)
        f.informers[informerType] = informer

        return informer
    }
That was the only place where it was being called for that particular SharedInformerFactory.
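The lazy-creation behavior described in this thread can be sketched with a toy factory. This is a simplified model, not the generated OpenShift or client-go code: `factory`, `informer`, `plugin`, and `startWithPlugin` are hypothetical names, and resources are keyed by a plain string rather than by reflected type.

```go
package main

import "sync"

// informer is a toy stand-in for a shared informer; it only records
// whether it has been started.
type informer struct {
	resource string
	started  bool
}

// factory is a toy model of a shared informer factory: informers are
// created lazily on first request and started later in bulk.
type factory struct {
	mu        sync.Mutex
	informers map[string]*informer
}

func newFactory() *factory {
	return &factory{informers: map[string]*informer{}}
}

// InformerFor returns the informer for resource, creating and registering
// it only if no caller has requested it before.
func (f *factory) InformerFor(resource string) *informer {
	f.mu.Lock()
	defer f.mu.Unlock()
	if inf, ok := f.informers[resource]; ok {
		return inf
	}
	inf := &informer{resource: resource}
	f.informers[resource] = inf
	return inf
}

// Start starts every registered informer and returns the resources it
// started. Resources nobody requested are never watched.
func (f *factory) Start() []string {
	f.mu.Lock()
	defer f.mu.Unlock()
	var started []string
	for name, inf := range f.informers {
		if !inf.started {
			inf.started = true
			started = append(started, name)
		}
	}
	return started
}

// plugin is a hypothetical admission plugin that needs the groups informer.
type plugin struct{ groups *informer }

// SetUserInformer is the only place the groups informer is requested.
func (p *plugin) SetUserInformer(f *factory) { p.groups = f.InformerFor("groups") }

// startWithPlugin initializes the plugin only when it is enabled, then
// starts the factory unconditionally, as the run-always path would.
func startWithPlugin(enabled bool) []string {
	f := newFactory()
	if enabled {
		(&plugin{}).SetUserInformer(f)
	}
	return f.Start()
}
```

In this model, `startWithPlugin(false)` starts nothing even though `Start` still runs, which is the effect the PR is after. It only holds as long as plugin initialization remains the sole caller that requests the groups informer.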
Hm, I see. Thanks for the explanation. So the factory is created in the run-always path, and the informers that are requested are started there as well. However, the creation of the informer and its configuration need to be done in the plugin.
I'm not familiar with this, and maybe that's how it's typically done, but this "shared responsibility" over the informer sounds error-prone to me. Is there a way to keep it where it is, but run it conditionally?
Regardless, if we're going to do this, I'd suggest adding a comment in the plugin initialization stating that. Also, do we have a job proving this is working as intended?
> Is there a way to keep it where it is, but run it conditionally?
This is typical. Take a look at all of the implementations of this:
    SetExternalKubeInformerFactory(informers.SharedInformerFactory)
The part that is atypical is having only one thing that requires a Group informer, and that it happens to be an admission plugin that is not always enabled. I agree there should be some job demonstrating that this is doing what we expect (are we looking to see that there are group watch 404s from kube-apiserver in an E2E job that enables and disables external OIDC without this, and that those requests disappear with this patch?).
Is there something we could add to CI that would tell us when we have perma-unstarted informers? I know that it's one of the /readyz checks to wait for all the Kube shared informers to start (https://github.com/openshift/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/server/config.go#L960-L976). I guess we can't do the same for resources served by the aggregation layer without having a chicken-and-egg problem. Still, we should emit some indication that can be wired to a monitor test.
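The "typical" pattern mentioned in this comment, where an initializer hands shared dependencies to whichever plugins ask for them, might be sketched as follows. This is a simplified model under stated assumptions: `InformerFactory`, `Plugin`, `WantsUserInformer`, `Initialize`, and both plugin types are hypothetical names chosen for illustration, loosely modeled on the `Set...InformerFactory` style of interface rather than copied from the apiserver code.

```go
package main

// InformerFactory is a hypothetical stand-in for a shared informer factory.
// It only records which resources have been requested.
type InformerFactory struct{ requested []string }

// Request registers interest in a resource.
func (f *InformerFactory) Request(resource string) {
	f.requested = append(f.requested, resource)
}

// Plugin is the minimal admission plugin interface for this sketch.
type Plugin interface{ Name() string }

// WantsUserInformer is implemented by plugins that need the user informers.
type WantsUserInformer interface {
	SetUserInformer(f *InformerFactory)
}

// restrictSubjectBindings needs the groups informer.
type restrictSubjectBindings struct{}

func (restrictSubjectBindings) Name() string { return "RestrictSubjectBindings" }

func (restrictSubjectBindings) SetUserInformer(f *InformerFactory) { f.Request("groups") }

// alwaysAdmit needs no informers and does not implement WantsUserInformer.
type alwaysAdmit struct{}

func (alwaysAdmit) Name() string { return "AlwaysAdmit" }

// Initialize injects the factory into every enabled plugin that asks for it.
// Disabled plugins are simply absent from the slice, so they request nothing.
func Initialize(f *InformerFactory, enabled []Plugin) {
	for _, p := range enabled {
		if w, ok := p.(WantsUserInformer); ok {
			w.SetUserInformer(f)
		}
	}
}
```

The design choice here is that the decision to request an informer lives with the consumer. The atypical part called out in the comment is that the groups informer has exactly one consumer, and that consumer can be disabled.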
There is no existing CI job that tests this behavior, but one is intended to be created. If this PR is not mergeable until such a job exists, I'm happy to place this PR on hold until then.
I have manually tested this based on https://issues.redhat.com/browse/OCPBUGS-45460 and verified that disabling the admission plugin does not result in the same reflector errors being logged. I know manual testing isn't a good substitute for a repeatable CI job, but I thought it was worth mentioning that I at least stood up a cluster and manually tested this.
This isn't critical, and the BYO OIDC functionality is now slated for TP in 4.19, so this PR can wait until we've got more progress done on the testing front for that feature.
> This is typical. Take a look at all of the implementations of this:
Thank you, now I see.
Another question: don't we need to wait for the cache to warm up?
diff --git i/openshift-kube-apiserver/admission/authorization/restrictusers/restrictusers.go w/openshift-kube-apiserver/admission/authorization/restrictusers/restrictusers.go
index 4dea00e61a4..e37ef8c0ff0 100644
--- i/openshift-kube-apiserver/admission/authorization/restrictusers/restrictusers.go
+++ w/openshift-kube-apiserver/admission/authorization/restrictusers/restrictusers.go
@@ -88,13 +88,16 @@ func (q *restrictUsersAdmission) SetRESTClientConfig(restClientConfig rest.Confi
 }
 
 func (q *restrictUsersAdmission) SetUserInformer(userInformers userinformer.SharedInformerFactory) {
-    if err := userInformers.User().V1().Groups().Informer().AddIndexers(cache.Indexers{
+    groupInformer := userInformers.User().V1().Groups()
+    if err := groupInformer.Informer().AddIndexers(cache.Indexers{
         usercache.ByUserIndexName: usercache.ByUserIndexKeys,
     }); err != nil {
         utilruntime.HandleError(err)
         return
     }
-    q.groupCache = usercache.NewGroupCache(userInformers.User().V1().Groups())
+    q.groupCache = usercache.NewGroupCache(groupInformer)
+
+    q.SetReadyFunc(groupInformer.Informer().HasSynced)
 }
 
 // subjectsDelta returns the relative complement of elementsToIgnore in
@@ -129,6 +132,10 @@ func (q *restrictUsersAdmission) Validate(ctx context.Context, a admission.Attri
         return nil
     }
 
+    if !q.WaitForReady() {
+        return admission.NewForbidden(a, fmt.Errorf("not yet ready to handle request"))
+    }
+
     // Ignore all operations that correspond to subresource actions.
     if len(a.GetSubresource()) != 0 {
         return nil
It's super indirect, but that appears to be already happening via the GroupCache... via newRoleBindingRestrictionContext. I wouldn't expect this PR to affect whether or not that's working as intended.
@@ -22,7 +22,7 @@ func newOpenshiftAPIServiceReachabilityCheck(ipForKubernetesDefaultService net.I
     return newAggregatedAPIServiceReachabilityCheck(ipForKubernetesDefaultService, "openshift-apiserver", "api")
 }
 
-func newOAuthPIServiceReachabilityCheck(ipForKubernetesDefaultService net.IP) *aggregatedAPIServiceAvailabilityCheck {
+func newOAuthAPIServiceReachabilityCheck(ipForKubernetesDefaultService net.IP) *aggregatedAPIServiceAvailabilityCheck {
I'd like to see the cosmetic changes excluded to reduce the potential for conflicts as we maintain this patch over the years.
This one was my doing; the rest were go fmt related. I'm happy to revert cosmetic changes. Would you like both this and the go fmt changes reverted?
I don't see any output if I run gofmt -d on either of the other files. Is it me? Maybe your editor is set up to run a gofmt-alike? I'd remove all of the cosmetic changes.
What version of Go are you running? I've got 1.23.4
I ran GOTOOLCHAIN=go1.23.4 go fmt openshift-kube-apiserver/openshiftkubeapiserver/patch.go on master and it didn't make any changes.
Cosmetic changes reverted
Will rebase to roll up the commit that reverts it in a bit.
New changes are detected. LGTM label has been removed. |
Force-pushed from 3e7fc7b to e998ca8.
…jectBindings admission plugin initialization to prevent Group informers being configured when the plugin is disabled. This is necessary for when the OpenShift OAuth stack is not present and the plugin is disabled as part of that. Signed-off-by: Bryce Palmer <[email protected]>
Force-pushed from e998ca8 to e94a834.
@everettraven: all tests passed!
What type of PR is this?
/kind bug
What this PR does / why we need it:
This PR moves the configuration of the `Group` informer to the `authorization.openshift.io/RestrictSubjectBindings` admission plugin initialization process. This is necessary to prevent the startup of an informer for the `Group` API when the plugin is disabled, which will happen when the OpenShift OAuth stack is intentionally removed from the cluster based on the Authentication configuration.

See openshift/enhancements#1726 for additional information.