Support for dynamic registration of EventSources #2120

Closed
ivanchuchulski opened this issue Nov 7, 2023 · 5 comments · Fixed by #2105
@ivanchuchulski

Hello,

We are migrating a Helm chart to Operator-based provisioning of an application. Our application should be able to work with either native K8s Ingress or the Istio service mesh. We're using standalone dependent resources in our reconciler, and we register the event sources by implementing EventSourceInitializer<T> and its prepareEventSources method. There we call the static EventSourceInitializer.nameEventSourcesFromDependentResource() method, passing an array that we create beforehand and that contains instances of the dependent resources we need. Currently that array includes a couple of Istio-specific resources, such as VirtualService and Gateway, for which we had to add the Istio client artifact to get the object definitions.
However, if we try to start the Operator in a cluster that doesn't have Istio installed, we get an error:

09:32:48.615 INFO  Operator.java                 (243) Registered reconciler: 'appreconciler' for resource: 'class com.company.MyApp' for namespace(s): [all namespaces]
09:32:48.616 INFO  Operator.java                 (147) Operator SDK 4.4.4 (commit: 609a55a) built on 2023-09-19T13:58:23.000+0300 starting...
09:32:48.616 INFO  Operator.java                 (153) Client version: 6.7.2
09:32:48.618 INFO  Controller.java               (334) Starting 'appreconciler' controller for reconciler: com.company.AppReconciler, resource: com.company.MyApp
09:32:48.905 WARN  VersionUsageUtils.java        (60) The client is using resource type 'virtualservices' with unstable version 'v1beta1'
09:32:48.905 WARN  VersionUsageUtils.java        (60) The client is using resource type 'gateways' with unstable version 'v1beta1'
09:32:49.034 ERROR Reflector.java                (153) listSyncAndWatch failed for networking.istio.io/v1beta1/virtualservices, will stop
java.util.concurrent.CompletionException: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://<cluster-ip>/apis/networking.istio.io/v1beta1/virtualservices?resourceVersion=0. Message: Not Found.
...
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://<cluster-ip>/apis/networking.istio.io/v1beta1/virtualservices?resourceVersion=0. Message: Not Found.
09:32:49.034 ERROR Reflector.java                (153) listSyncAndWatch failed for networking.istio.io/v1beta1/gateways, will stop
java.util.concurrent.CompletionException: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://<cluster-ip>/apis/networking.istio.io/v1beta1/gateways?resourceVersion=0. Message: Not Found.
...
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://<cluster-ip>/apis/networking.istio.io/v1beta1/gateways?resourceVersion=0. Message: Not Found.

09:32:49.052 ERROR InformerWrapper.java          (93) Informer startup error. Operator will be stopped. Informer: networking.istio.io/v1beta1/virtualservices
java.util.concurrent.ExecutionException: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://<cluster-ip>/apis/networking.istio.io/v1beta1/virtualservices?resourceVersion=0. Message: Not Found.
...
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://<cluster-ip>/apis/networking.istio.io/v1beta1/virtualservices?resourceVersion=0. Message: Not Found.
...
09:32:49.034 ERROR Reflector.java                (153) listSyncAndWatch failed for networking.istio.io/v1beta1/destinationrules, will stop
java.util.concurrent.CompletionException: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://<cluster-ip>/apis/networking.istio.io/v1beta1/destinationrules?resourceVersion=0. Message: Not Found.
...
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://<cluster-ip>/apis/networking.istio.io/v1beta1/destinationrules?resourceVersion=0. Message: Not Found.
...
09:32:49.053 ERROR InformerWrapper.java          (93) Informer startup error. Operator will be stopped. Informer: networking.istio.io/v1beta1/gateways
java.util.concurrent.ExecutionException: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://<cluster-ip>/apis/networking.istio.io/v1beta1/gateways?resourceVersion=0. Message: Not Found.
...
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://<cluster-ip>/apis/networking.istio.io/v1beta1/gateways?resourceVersion=0. Message: Not Found.

Exception in thread "main" io.javaoperatorsdk.operator.OperatorException: Error starting operator
	at io.javaoperatorsdk.operator.Operator.start(Operator.java:166)
	at com.company.MyAppOperator.main(MyAppOperator.java:27)
Caused by: io.javaoperatorsdk.operator.OperatorException: io.javaoperatorsdk.operator.OperatorException: io.javaoperatorsdk.operator.OperatorException: Couldn't start source io.javaoperatorsdk.operator.processing.event.source.informer.InformerEventSource#1298483237
	at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.lambda$executeAndWaitForAllToComplete$2(ExecutorServiceManager.java:81)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.executeAndWaitForAllToComplete(ExecutorServiceManager.java:76)
	at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.boundedExecuteAndWaitForAllToComplete(ExecutorServiceManager.java:56)
	at io.javaoperatorsdk.operator.ControllerManager.start(ControllerManager.java:42)
	at io.javaoperatorsdk.operator.Operator.start(Operator.java:161)
	... 1 more
Caused by: io.javaoperatorsdk.operator.OperatorException: io.javaoperatorsdk.operator.OperatorException: Couldn't start source io.javaoperatorsdk.operator.processing.event.source.informer.InformerEventSource#1298483237
	at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.lambda$executeAndWaitForAllToComplete$2(ExecutorServiceManager.java:81)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.executeAndWaitForAllToComplete(ExecutorServiceManager.java:76)
	at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.boundedExecuteAndWaitForAllToComplete(ExecutorServiceManager.java:56)
	at io.javaoperatorsdk.operator.processing.event.EventSourceManager.start(EventSourceManager.java:79)
	at io.javaoperatorsdk.operator.processing.Controller.start(Controller.java:342)
	at io.javaoperatorsdk.operator.ControllerManager.lambda$start$0(ControllerManager.java:43)
	at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.lambda$executeAndWaitForAllToComplete$0(ExecutorServiceManager.java:70)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: io.javaoperatorsdk.operator.OperatorException: Couldn't start source io.javaoperatorsdk.operator.processing.event.source.informer.InformerEventSource#1298483237
	at io.javaoperatorsdk.operator.processing.event.EventSourceManager.startEventSource(EventSourceManager.java:130)
	... 5 more
Caused by: io.javaoperatorsdk.operator.OperatorException: io.javaoperatorsdk.operator.OperatorException: Couldn't start informer for destinationrules.networking.istio.io/v1beta1 resources
	at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.lambda$executeAndWaitForAllToComplete$2(ExecutorServiceManager.java:81)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.executeAndWaitForAllToComplete(ExecutorServiceManager.java:76)
	at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.boundedExecuteAndWaitForAllToComplete(ExecutorServiceManager.java:56)
	at io.javaoperatorsdk.operator.processing.event.source.informer.InformerManager.start(InformerManager.java:59)
	at io.javaoperatorsdk.operator.processing.event.source.informer.ManagedInformerEventSource.start(ManagedInformerEventSource.java:80)
	at io.javaoperatorsdk.operator.processing.event.NamedEventSource.start(NamedEventSource.java:27)
	at io.javaoperatorsdk.operator.processing.event.EventSourceManager.startEventSource(EventSourceManager.java:125)
	... 5 more
Caused by: io.javaoperatorsdk.operator.OperatorException: Couldn't start informer for destinationrules.networking.istio.io/v1beta1 resources
	at io.javaoperatorsdk.operator.processing.event.source.informer.InformerWrapper.start(InformerWrapper.java:110)
	at io.javaoperatorsdk.operator.processing.event.source.informer.InformerManager.lambda$start$0(InformerManager.java:62)
	... 5 more
Caused by: io.javaoperatorsdk.operator.OperatorException: java.util.concurrent.ExecutionException: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://<cluster-ip>/apis/networking.istio.io/v1beta1/destinationrules?resourceVersion=0. Message: Not Found.
	at io.javaoperatorsdk.operator.processing.event.source.informer.InformerWrapper.start(InformerWrapper.java:94)
	... 6 more
Caused by: java.util.concurrent.ExecutionException: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://<cluster-ip>/apis/networking.istio.io/v1beta1/destinationrules?resourceVersion=0. Message: Not Found.
	at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2096)
	at io.javaoperatorsdk.operator.processing.event.source.informer.InformerWrapper.start(InformerWrapper.java:87)
	... 6 more
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://<cluster-ip>/apis/networking.istio.io/v1beta1/destinationrules?resourceVersion=0. Message: Not Found.
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.requestFailure(OperationSupport.java:671)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.requestFailure(OperationSupport.java:651)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.assertResponseCode(OperationSupport.java:600)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.lambda$handleResponse$0(OperationSupport.java:560)
	at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:646)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
	at io.fabric8.kubernetes.client.http.StandardHttpClient.lambda$completeOrCancel$10(StandardHttpClient.java:140)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
	at io.fabric8.kubernetes.client.http.ByteArrayBodyHandler.onBodyDone(ByteArrayBodyHandler.java:52)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
	at io.fabric8.kubernetes.client.okhttp.OkHttpClientImpl$OkHttpAsyncBody.doConsume(OkHttpClientImpl.java:137)
	... 3 more

which leaves the operator unable to process any further data.

Correct me if I'm wrong, but I believe this error happens because the informers of the DependentResources' EventSources, created in prepareEventSources, can't query the Istio resource APIs. Those APIs aren't present in the API server, since their CRDs are missing.

Do you have any suggestions on how to resolve this issue? Is there a mechanism that would allow us to dynamically change the set of EventSources associated with a given reconciler?

Best Regards,
Ivan

@csviri
Collaborator

csviri commented Nov 8, 2023

Hi,
There will actually be a mechanism for this; it is planned for the next release, see:
#2063

I've already started working on it.

For now, I would suggest using standalone dependent resources, checking with the client whether the CRDs are present on the server, and registering those informers only if they are. You can implement this logic directly in the EventSourceInitializer.
Another option might be a feature flag for the reconciler, set through the deployment.

I will close this issue as a duplicate if this is clear and there are no objections.
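
The workaround described above could look roughly like the following sketch. It is a plain-Java model, not actual Java Operator SDK code: `DependentSpec` and `selectRegistrable` are hypothetical names, and the `crdPresent` predicate is assumed to wrap a real lookup against the API server (with the fabric8 client, most likely fetching the CustomResourceDefinition by name and checking whether the result is null).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class ConditionalRegistration {

    /** Hypothetical stand-in for a dependent resource and the CRD it needs (null for built-in kinds). */
    record DependentSpec(String name, String requiredCrd) {}

    /** Keeps only the dependents that need no CRD or whose CRD the predicate reports as present. */
    static List<DependentSpec> selectRegistrable(List<DependentSpec> all, Predicate<String> crdPresent) {
        List<DependentSpec> result = new ArrayList<>();
        for (DependentSpec spec : all) {
            if (spec.requiredCrd() == null || crdPresent.test(spec.requiredCrd())) {
                result.add(spec);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<DependentSpec> all = List.of(
            new DependentSpec("deployment", null),
            new DependentSpec("ingress", null),
            new DependentSpec("virtualService", "virtualservices.networking.istio.io"),
            new DependentSpec("gateway", "gateways.networking.istio.io"));

        // Assumption: in a real operator this set would come from querying the API server.
        Set<String> installedCrds = Set.of(); // models a cluster without Istio

        List<DependentSpec> toRegister = selectRegistrable(all, installedCrds::contains);
        toRegister.forEach(spec -> System.out.println("registering event source for " + spec.name()));
    }
}
```

In a real reconciler, only the dependents that survive this filter would be put into the array passed to nameEventSourcesFromDependentResource, so no informer is started for an API that the server does not serve.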

@csviri csviri self-assigned this Nov 8, 2023
@ivanchuchulski
Author

ivanchuchulski commented Nov 8, 2023

Hello @csviri,

Thank you for the fast reply, it's really great news that this feature will be implemented soon!

Just to clarify, we're using standalone dependent resources because we have some logic in the reconcile method. This logic checks a property in the current custom resource that acts as an on/off switch. If it's true, for example, we need to create and register the EventSources for other standalone dependent resources, which model non-standard K8s resources such as Istio or CertManager ones, and then trigger the reconcile methods of those resources. Then, in a later reconciliation loop, if the property is set to false, we need to remove/unregister the EventSources and delete the resources from the cluster.

If that is what you're intending to implement, then we're looking forward to it.

Best Regards,
Ivan
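
To make the on/off behaviour described in this comment concrete, here is a minimal plain-Java sketch. Everything in it is hypothetical: `Source`, `ToggleableSources`, and `sync` are illustrative stand-ins rather than Java Operator SDK types, and the API of the planned feature may look quite different.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

public class ToggleableSources {

    /** Hypothetical stand-in for an event source with a start/stop lifecycle. */
    interface Source {
        void start();
        void stop();
    }

    // Not thread-safe: a real operator reconciling concurrently would need synchronization.
    private final Map<String, Source> active = new LinkedHashMap<>();

    /**
     * Aligns the registry with the switch read from the custom resource.
     * Returns true if a source was started or stopped, false if nothing changed.
     */
    boolean sync(String name, boolean enabled, Supplier<Source> factory) {
        if (enabled && !active.containsKey(name)) {
            Source source = factory.get();
            source.start();
            active.put(name, source);
            return true;
        }
        if (!enabled && active.containsKey(name)) {
            active.remove(name).stop();
            return true;
        }
        return false;
    }

    boolean isActive(String name) {
        return active.containsKey(name);
    }

    public static void main(String[] args) {
        ToggleableSources sources = new ToggleableSources();
        Supplier<Source> istio = () -> new Source() {
            public void start() { System.out.println("informer started"); }
            public void stop() { System.out.println("informer stopped"); }
        };
        sources.sync("virtualService", true, istio);  // switch on: registers and starts
        sources.sync("virtualService", true, istio);  // repeated reconcile: no change
        sources.sync("virtualService", false, istio); // switch off: stops and deregisters
    }
}
```

The point of the sketch is that `sync` is idempotent, so it can be called on every reconciliation without starting duplicate informers.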

@csviri
Collaborator

csviri commented Nov 13, 2023

@ivanchuchulski well, what you suggest would be possible, although it is advisable to have one informer per type (or as few informers as possible). If you register an informer per resource, that might result in lots of informers and therefore lots of WebSocket connections, which might not be desirable.

As for dependent resources, out of the box we will support the case where the informers are registered when the mentioned activation condition holds, but not per resource. In other words, think of the use case as: if the platform supports certManager, the whole workflow will work with cert manager.

But again, your use case will also be possible.

@ivanchuchulski
Author

Hi @csviri,
I think we're currently using separate event sources (and thus separate informers) for each of our dependent resources. As stated in the official docs:

When dealing with multiple dependents of the same type, one needs to decide whether these dependent resources should track the same resources and therefore share a common event source, or, to the contrary, track completely separate resources, in which case using separate event sources is advised.

Our use case is that we need a couple of Service, ConfigMap, and VirtualService objects, plus a regular Deployment, an Ingress, and some more. We've modeled each of them as a standalone dependent resource. We instantiate them all, put them in an array, and pass it to the static <K extends HasMetadata> Map<String, EventSource> nameEventSourcesFromDependentResource(EventSourceContext<K> context, DependentResource... dependentResources) method of the EventSourceInitializer interface in the prepareEventSources method of our Reconciler.

As I'm not familiar with the low-level details of the framework's abstractions, I don't know whether there would be any drawbacks to switching to one event source per resource type, or how to implement that correctly.
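
As a rough illustration of what "one event source per resource type" means for the number of informers, the plain-Java sketch below groups dependents by the kind they manage. `Dependent`, `groupByKind`, and the dependent names are hypothetical and are not part of the Java Operator SDK. The sketch also leaves out how each dependent would pick its own object out of a shared informer's cache, for example by name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SharedSourcesByType {

    /** Hypothetical stand-in: a dependent resource and the Kubernetes kind it manages. */
    record Dependent(String name, String kind) {}

    /** Groups dependents by kind, so that each kind needs only one shared informer. */
    static Map<String, List<String>> groupByKind(List<Dependent> dependents) {
        Map<String, List<String>> byKind = new TreeMap<>();
        for (Dependent d : dependents) {
            byKind.computeIfAbsent(d.kind(), k -> new ArrayList<>()).add(d.name());
        }
        return byKind;
    }

    public static void main(String[] args) {
        List<Dependent> dependents = List.of(
            new Dependent("frontendService", "Service"),
            new Dependent("backendService", "Service"),
            new Dependent("appConfig", "ConfigMap"),
            new Dependent("loggingConfig", "ConfigMap"),
            new Dependent("deployment", "Deployment"));

        Map<String, List<String>> byKind = groupByKind(dependents);
        System.out.println(dependents.size() + " dependents, " + byKind.size() + " informers");
        byKind.forEach((kind, names) -> System.out.println(kind + " -> " + names));
    }
}
```

With separate event sources, the five dependents here would mean five informers; grouped by kind, three are enough.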

@csviri csviri linked a pull request Nov 20, 2023 that will close this issue
@csviri csviri closed this as completed Nov 20, 2023