CNTRLPLANE-78: Move Group informer configuration to RestrictSubjectBindings plugin initialization #2157
base: master
@@ -27,9 +27,7 @@ import (
clientgoinformers "k8s.io/client-go/informers"
corev1informers "k8s.io/client-go/informers/core/v1"
"k8s.io/client-go/rest"
- "k8s.io/client-go/tools/cache"
"k8s.io/kubernetes/openshift-kube-apiserver/admission/authorization/restrictusers"
- "k8s.io/kubernetes/openshift-kube-apiserver/admission/authorization/restrictusers/usercache"
"k8s.io/kubernetes/openshift-kube-apiserver/admission/autoscaling/managednode"
"k8s.io/kubernetes/openshift-kube-apiserver/admission/autoscaling/managementcpusoverride"
"k8s.io/kubernetes/openshift-kube-apiserver/admission/scheduler/nodeenv"
@@ -176,11 +174,6 @@ func newInformers(loopbackClientConfig *rest.Config) (*kubeAPIServerInformers, e
OpenshiftUserInformers: userinformer.NewSharedInformerFactory(userClient, defaultInformerResyncPeriod),
OpenshiftConfigInformers: configv1informer.NewSharedInformerFactory(configClient, defaultInformerResyncPeriod),
}
- if err := ret.OpenshiftUserInformers.User().V1().Groups().Informer().AddIndexers(cache.Indexers{
- usercache.ByUserIndexName: usercache.ByUserIndexKeys,
- }); err != nil {
- return nil, err
- }
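For background on what the removed block was configuring, here is a minimal, editorial sketch of a "by user" index on Group objects; the index name and helper below are illustrative stand-ins, not the repo's usercache package. The index maps each user name to every Group that lists that user, which is what lets the admission plugin answer "which groups is this user in?" from the informer's local cache.

package main

import (
	"fmt"

	userv1 "github.com/openshift/api/user/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/cache"
)

// byUserIndex is an illustrative index name; the real constant lives in usercache.
const byUserIndex = "byUser"

// groupByUser returns one index key per member, so a user name can be used to
// look up every Group that lists that user.
func groupByUser(obj interface{}) ([]string, error) {
	group, ok := obj.(*userv1.Group)
	if !ok {
		return nil, fmt.Errorf("unexpected object type %T", obj)
	}
	return group.Users, nil
}

func main() {
	indexer := cache.NewIndexer(cache.MetaNamespaceKeyFunc, cache.Indexers{
		byUserIndex: groupByUser,
	})

	_ = indexer.Add(&userv1.Group{
		ObjectMeta: metav1.ObjectMeta{Name: "devs"},
		Users:      userv1.OptionalNames{"alice", "bob"},
	})

	groups, _ := indexer.ByIndex(byUserIndex, "alice")
	fmt.Printf("alice appears in %d group(s)\n", len(groups))
}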
In the enhancement, you have this:

From what I understand from your PR, you are not moving the creation of the informer; only the indexer is being added in the admission plugin's informer instead. I assume that's what you meant by "configuration" (i.e., the indexer). However, in the PR description, you said:

So I'm wondering how this is preventing the startup of the informer. It seems to me that it's still being started as before. Am I missing anything?

The object passed to the admission plugin initializer is an informer factory, and it's the chained call to

Lines 162 to 184 in 2f2cf38
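To make the mechanics of that chained call concrete, here is a small editorial sketch using the generic client-go factory and a fake clientset (not code from this PR): a shared informer factory constructs an informer only when some caller asks for it, and factory.Start() launches only the informers that have been constructed, so a resource nobody requests never gets a watch.

package main

import (
	"fmt"
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes/fake"
)

func main() {
	client := fake.NewSimpleClientset()
	factory := informers.NewSharedInformerFactory(client, 10*time.Minute)

	// The chained accessor call is what registers the informer with the
	// factory; without this line, Start() below would launch nothing.
	_ = factory.Core().V1().Pods().Informer()

	stopCh := make(chan struct{})
	defer close(stopCh)

	factory.Start(stopCh) // starts only the informers requested above
	synced := factory.WaitForCacheSync(stopCh)
	fmt.Printf("informers started and synced: %d\n", len(synced))
}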
It was the only place for that particular

Hm, I see. Thanks for the explanation. So the factory is created in the run-always path, and the informers that are requested are started there as well. However, the creation of the informer and its configuration need to be done in the plugin. I'm not familiar with this, and maybe that's how it's typically done, but this "shared responsibility" over the informer sounds error-prone to me. Is there a way to keep it where it is, but run it conditionally? Regardless, if we're going to do this, I'd suggest adding a comment in the plugin initialization stating that. Also, do we have a job proving this is working as intended?

This is typical. Take a look at all of the implementations of this:

The part that is atypical is having only one thing that requires a Group informer, and that it happens to be an admission plugin that is not always enabled. I agree there should be some job demonstrating that this is doing what we expect (are we looking to see that there are group watch 404s from kube-apiserver in an E2E job that enables and disables external OIDC without this, and that those requests disappear with this patch?). Is there something we could add to CI that would tell us when we have perma-unstarted informers? I know that it's one of the /readyz checks to wait for all the Kube shared informers to start (https://github.com/openshift/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/server/config.go#L960-L976). I guess we can't do the same for resources served by the aggregation layer without having a chicken-and-egg problem. Still, we should emit some indication that can be wired to a monitor test.

There is not an existing CI job to test this behavior, but there is one that is intended to be created. If this PR is not mergeable until such a job exists, I'm happy to place this PR on hold until then. I have manually tested this based on https://issues.redhat.com/browse/OCPBUGS-45460 and verified that disabling the admission plugin does not result in seeing the same reflector errors being logged. I know manual testing isn't a good substitute for a repeatable CI job, but I thought it was worth mentioning that I at least stood up a cluster and manually tested this.

This isn't critical, and the BYO OIDC functionality is now slated for TP in 4.19, so this PR can wait until we've got more progress done on the testing front for that feature.
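As a reference for the "this is typical" remark earlier in this thread, here is a hedged sketch of the admission plugin initializer pattern it describes. The interface and constructor names are illustrative assumptions, not necessarily the ones in this repository, and the import path for the user informer factory is assumed.

package admissioninit

import (
	"k8s.io/apiserver/pkg/admission"

	// Import path assumed for illustration; the PR only shows the alias "userinformer".
	userinformer "github.com/openshift/client-go/user/informers/externalversions"
)

// WantsUserInformer is a hypothetical opt-in interface: a plugin that needs the
// user informer factory implements it, and no other plugin ever sees the factory.
type WantsUserInformer interface {
	SetUserInformer(userinformer.SharedInformerFactory)
}

type userInformerInitializer struct {
	userInformers userinformer.SharedInformerFactory
}

// NewUserInformerInitializer returns an admission.PluginInitializer that hands
// the factory only to plugins that ask for it.
func NewUserInformerInitializer(f userinformer.SharedInformerFactory) admission.PluginInitializer {
	return &userInformerInitializer{userInformers: f}
}

// Initialize is called once per admission plugin. Only interested plugins
// receive the factory, so only the informers they request via the chained
// accessor calls ever get registered and started.
func (i *userInformerInitializer) Initialize(plugin admission.Interface) {
	if wants, ok := plugin.(WantsUserInformer); ok {
		wants.SetUserInformer(i.userInformers)
	}
}

Under a pattern like this, if the RestrictSubjectBindings plugin is disabled, SetUserInformer is never called, the Group informer is never registered with the factory, and the run-always Start() has nothing to launch for it, which is the behavior this PR is after.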
Thank you, now I see. Another question: don't we need to wait for the cache to warm up?

diff --git i/openshift-kube-apiserver/admission/authorization/restrictusers/restrictusers.go w/openshift-kube-apiserver/admission/authorization/restrictusers/restrictusers.go
index 4dea00e61a4..e37ef8c0ff0 100644
--- i/openshift-kube-apiserver/admission/authorization/restrictusers/restrictusers.go
+++ w/openshift-kube-apiserver/admission/authorization/restrictusers/restrictusers.go
@@ -88,13 +88,16 @@ func (q *restrictUsersAdmission) SetRESTClientConfig(restClientConfig rest.Confi
}
func (q *restrictUsersAdmission) SetUserInformer(userInformers userinformer.SharedInformerFactory) {
- if err := userInformers.User().V1().Groups().Informer().AddIndexers(cache.Indexers{
+ groupInformer := userInformers.User().V1().Groups()
+ if err := groupInformer.Informer().AddIndexers(cache.Indexers{
usercache.ByUserIndexName: usercache.ByUserIndexKeys,
}); err != nil {
utilruntime.HandleError(err)
return
}
- q.groupCache = usercache.NewGroupCache(userInformers.User().V1().Groups())
+ q.groupCache = usercache.NewGroupCache(groupInformer)
+
+ q.SetReadyFunc(groupInformer.Informer().HasSynced)
}
// subjectsDelta returns the relative complement of elementsToIgnore in
@@ -129,6 +132,10 @@ func (q *restrictUsersAdmission) Validate(ctx context.Context, a admission.Attri
return nil
}
+ if !q.WaitForReady() {
+ return admission.NewForbidden(a, fmt.Errorf("not yet ready to handle request"))
+ }
+
// Ignore all operations that correspond to subresource actions.
if len(a.GetSubresource()) != 0 {
return nil

It's super indirect, but that appears to be already happening via the GroupCache... via newRoleBindingRestrictionContext. I wouldn't expect this PR to affect whether or not that's working as intended.
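On the warm-up point, here is a minimal, editorial sketch of how a cache wrapper like the GroupCache mentioned above can carry the informer's sync state with it. The field layout and informer package path are assumptions; the real usercache.GroupCache may differ.

package groupcache

import (
	userv1informers "github.com/openshift/client-go/user/informers/externalversions/user/v1"
	"k8s.io/client-go/tools/cache"
)

// GroupCache is a hypothetical wrapper around the Group informer's indexer.
type GroupCache struct {
	indexer   cache.Indexer
	hasSynced func() bool
}

func NewGroupCache(groups userv1informers.GroupInformer) *GroupCache {
	return &GroupCache{
		indexer:   groups.Informer().GetIndexer(),
		hasSynced: groups.Informer().HasSynced,
	}
}

// HasSynced lets consumers (for example, something like
// newRoleBindingRestrictionContext) wait until the underlying Group informer
// has completed its initial List before trusting lookups.
func (c *GroupCache) HasSynced() bool {
	return c.hasSynced()
}

In this shape, the sync gate rides along with the cache handle rather than being wired separately in the admission plugin, which matches the reply above.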
return ret, nil
}
This is how we're handling errors in other places in this file, so this seems correct because this is an admission plugin, i.e., it's a non-user-facing error.