Conditionally register a dependent resource via @Dependent #2063

Closed
Javatar81 opened this issue Sep 19, 2023 · 12 comments · Fixed by #2105
@Javatar81

Javatar81 commented Sep 19, 2023

We need some way to register a dependent resource based on a certain condition via the @Dependent annotation. One use case is to disable dependent resources that are not available in a particular environment, e.g. a Route resource that is only available in OpenShift. For this use case it is not enough to specify a reconcilePrecondition, because even when the condition evaluates to false, the controller will still try to access the API, causing an error.

@Javatar81
Author

Javatar81 commented Sep 19, 2023

One workaround is to use the standalone configuration of dependent resources instead of @Dependent annotations. However, the registration via annotations is much easier and the preferred way to go.

@Javatar81
Author

Javatar81 commented Sep 19, 2023

When you register a CRUDKubernetesDependentResource<Route, P> via @Dependent using a reconcilePrecondition as follows:

public class RouteReconcileCondition implements Condition<Route, Gitea> {

    @Inject
    OpenShiftClient ocpClient;

    // Holds only when the cluster serves the OpenShift Route API group.
    @Override
    public boolean isMet(DependentResource<Route, Gitea> dependentResource, Gitea primary, Context<Gitea> context) {
        return ocpClient.supportsOpenShiftAPIGroup(OpenShiftAPIGroups.ROUTE);
    }
}

you will get this stack trace:

io.javaoperatorsdk.operator.OperatorException: Error starting operator
        at io.javaoperatorsdk.operator.Operator.start(Operator.java:166)
        at io.javaoperatorsdk.operator.OperatorProducer_ProducerMethod_operator_e1ad7713d0b252934b962fd9fbdd45e731130299_ClientProxy.start(Unknown Source)
        at io.quarkiverse.operatorsdk.runtime.AppEventListener.onStartup(AppEventListener.java:28)
        at io.quarkiverse.operatorsdk.runtime.AppEventListener_Observer_onStartup_b28deb793825eb1808af096a843376083fea4592.notify(Unknown Source)
        at io.quarkus.arc.impl.EventImpl$Notifier.notifyObservers(EventImpl.java:346)
        at io.quarkus.arc.impl.EventImpl$Notifier.notify(EventImpl.java:328)
        at io.quarkus.arc.impl.EventImpl.fire(EventImpl.java:82)
        at io.quarkus.arc.runtime.ArcRecorder.fireLifecycleEvent(ArcRecorder.java:155)
        at io.quarkus.arc.runtime.ArcRecorder.handleLifecycleEvents(ArcRecorder.java:106)
        at io.quarkus.deployment.steps.LifecycleEventsBuildStep$startupEvent1144526294.deploy_0(Unknown Source)
        at io.quarkus.deployment.steps.LifecycleEventsBuildStep$startupEvent1144526294.deploy(Unknown Source)
        ... 51 more
Caused by: io.javaoperatorsdk.operator.OperatorException: io.javaoperatorsdk.operator.OperatorException: io.javaoperatorsdk.operator.OperatorException: Couldn't start source giteaRoute
        at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.lambda$executeAndWaitForAllToComplete$2(ExecutorServiceManager.java:81)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
        at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.executeAndWaitForAllToComplete(ExecutorServiceManager.java:76)
        at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.boundedExecuteAndWaitForAllToComplete(ExecutorServiceManager.java:56)
        at io.javaoperatorsdk.operator.ControllerManager.start(ControllerManager.java:42)
        at io.javaoperatorsdk.operator.Operator.start(Operator.java:161)
        ... 61 more
Caused by: io.javaoperatorsdk.operator.OperatorException: io.javaoperatorsdk.operator.OperatorException: Couldn't start source giteaRoute
        at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.lambda$executeAndWaitForAllToComplete$2(ExecutorServiceManager.java:81)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
        at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.executeAndWaitForAllToComplete(ExecutorServiceManager.java:76)
        at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.boundedExecuteAndWaitForAllToComplete(ExecutorServiceManager.java:56)
        at io.javaoperatorsdk.operator.processing.event.EventSourceManager.start(EventSourceManager.java:79)
        at io.javaoperatorsdk.operator.processing.Controller.start(Controller.java:342)
        at io.javaoperatorsdk.operator.ControllerManager.lambda$start$0(ControllerManager.java:43)
        at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.lambda$executeAndWaitForAllToComplete$0(ExecutorServiceManager.java:70)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1623)
Caused by: io.javaoperatorsdk.operator.OperatorException: Couldn't start source giteaRoute
        at io.javaoperatorsdk.operator.processing.event.EventSourceManager.startEventSource(EventSourceManager.java:130)
        ... 5 more
Caused by: io.javaoperatorsdk.operator.OperatorException: io.javaoperatorsdk.operator.OperatorException: Couldn't start informer for routes.route.openshift.io/v1 resources
        at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.lambda$executeAndWaitForAllToComplete$2(ExecutorServiceManager.java:81)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
        at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.executeAndWaitForAllToComplete(ExecutorServiceManager.java:76)
        at io.javaoperatorsdk.operator.api.config.ExecutorServiceManager.boundedExecuteAndWaitForAllToComplete(ExecutorServiceManager.java:56)
        at io.javaoperatorsdk.operator.processing.event.source.informer.InformerManager.start(InformerManager.java:63)
        at io.javaoperatorsdk.operator.processing.event.source.informer.ManagedInformerEventSource.start(ManagedInformerEventSource.java:81)
        at io.javaoperatorsdk.operator.processing.event.NamedEventSource.start(NamedEventSource.java:27)
        at io.javaoperatorsdk.operator.processing.event.EventSourceManager.startEventSource(EventSourceManager.java:125)
        ... 5 more
Caused by: io.javaoperatorsdk.operator.OperatorException: Couldn't start informer for routes.route.openshift.io/v1 resources
        at io.javaoperatorsdk.operator.processing.event.source.informer.InformerWrapper.start(InformerWrapper.java:110)
        at io.javaoperatorsdk.operator.processing.event.source.informer.InformerManager.lambda$start$0(InformerManager.java:66)
        ... 5 more
Caused by: io.javaoperatorsdk.operator.OperatorException: java.util.concurrent.ExecutionException: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://localhost:36293/apis/route.openshift.io/v1/routes?resourceVersion=0. Message: Not Found.
        at io.javaoperatorsdk.operator.processing.event.source.informer.InformerWrapper.start(InformerWrapper.java:94)
        ... 6 more
Caused by: java.util.concurrent.ExecutionException: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://localhost:36293/apis/route.openshift.io/v1/routes?resourceVersion=0. Message: Not Found.
        at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
        at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2096)
        at io.javaoperatorsdk.operator.processing.event.source.informer.InformerWrapper.start(InformerWrapper.java:87)
        ... 6 more
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://localhost:36293/apis/route.openshift.io/v1/routes?resourceVersion=0. Message: Not Found.
        at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.requestFailure(OperationSupport.java:671)
        at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.requestFailure(OperationSupport.java:651)
        at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.assertResponseCode(OperationSupport.java:600)
        at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.lambda$handleResponse$0(OperationSupport.java:560)
        at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:646)
        at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
        at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2179)
        at io.fabric8.kubernetes.client.http.StandardHttpClient.lambda$completeOrCancel$10(StandardHttpClient.java:140)
        at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
        at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
        at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
        at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2179)
        at io.fabric8.kubernetes.client.http.ByteArrayBodyHandler.onBodyDone(ByteArrayBodyHandler.java:52)
        at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
        at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
        at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
        at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2179)
        at io.fabric8.kubernetes.client.vertx.VertxHttpRequest.lambda$null$1(VertxHttpRequest.java:122)
        at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:264)
        at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:246)
        at io.vertx.core.http.impl.HttpEventHandler.handleEnd(HttpEventHandler.java:76)
        at io.vertx.core.http.impl.HttpClientResponseImpl.handleEnd(HttpClientResponseImpl.java:250)
        at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.lambda$new$0(Http1xClientConnection.java:444)
        at io.vertx.core.streams.impl.InboundBuffer.handleEvent(InboundBuffer.java:255)
        at io.vertx.core.streams.impl.InboundBuffer.write(InboundBuffer.java:134)
        at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.handleEnd(Http1xClientConnection.java:708)
        at io.vertx.core.impl.EventLoopContext.execute(EventLoopContext.java:76)
        at io.vertx.core.impl.ContextBase.execute(ContextBase.java:232)
        at io.vertx.core.http.impl.Http1xClientConnection.handleResponseEnd(Http1xClientConnection.java:945)
        at io.vertx.core.http.impl.Http1xClientConnection.handleHttpMessage(Http1xClientConnection.java:814)
        at io.vertx.core.http.impl.Http1xClientConnection.handleMessage(Http1xClientConnection.java:778)
        at io.vertx.core.net.impl.ConnectionBase.read(ConnectionBase.java:158)
        at io.vertx.core.net.impl.VertxHandler.channelRead(VertxHandler.java:153)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
        at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1383)
        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1246)
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1295)
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        ... 1 more

@csviri csviri added this to the 4.5 milestone Sep 19, 2023
@csviri csviri self-assigned this Sep 19, 2023
@csviri
Collaborator

csviri commented Sep 19, 2023

I see this as a valid use case. It usually comes up with OpenShift vs. Kubernetes, but also, for example, when the controller implementation changes based on other capabilities of the cluster, i.e. whether other custom resource definitions are present.

@csviri
Collaborator

csviri commented Sep 19, 2023

I see multiple ways to approach the solution. One is to introduce a new condition, maybe useCondition, where the user can implement the logic that decides whether to use the resource or not. If it evaluates to false, multiple things will happen:

  1. The event source from the DR won't be registered (even if provided).
  2. The resource won't be reconciled at all, and that includes delete.
  3. All other conditions will be ignored.
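
The three effects listed above can be sketched in plain Java. This is only a minimal model of the proposal, not the JOSDK API: DependentModel, WorkflowModel and the name giteaIngress are hypothetical, introduced purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical model of a dependent resource; this is NOT a JOSDK type.
class DependentModel {
    final String name;
    final BooleanSupplier useCondition;          // the proposed new condition
    final BooleanSupplier reconcilePrecondition; // stands in for "all other conditions"
    int preconditionEvaluations = 0;

    DependentModel(String name, BooleanSupplier useCondition, BooleanSupplier reconcilePrecondition) {
        this.name = name;
        this.useCondition = useCondition;
        this.reconcilePrecondition = reconcilePrecondition;
    }

    boolean preconditionHolds() {
        preconditionEvaluations++;
        return reconcilePrecondition.getAsBoolean();
    }
}

// Hypothetical model of a workflow that honours useCondition.
class WorkflowModel {
    final List<String> registeredEventSources = new ArrayList<>();
    final List<String> reconciled = new ArrayList<>();
    final List<String> deleted = new ArrayList<>();

    // Effect 1: no event source is registered when useCondition is false.
    void start(List<DependentModel> dependents) {
        for (DependentModel d : dependents) {
            if (d.useCondition.getAsBoolean()) {
                registeredEventSources.add(d.name);
            }
        }
    }

    void reconcile(List<DependentModel> dependents) {
        for (DependentModel d : dependents) {
            if (!d.useCondition.getAsBoolean()) {
                // Effects 2 and 3: neither reconcile nor delete,
                // and the other conditions are not even evaluated.
                continue;
            }
            if (d.preconditionHolds()) {
                reconciled.add(d.name);
            } else {
                deleted.add(d.name);
            }
        }
    }
}
```

In this model a dependent whose useCondition is false never touches the API at all, which is the difference from a reconcilePrecondition that only guards the reconcile step.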

The question is what should happen if there are resources that depend on such a resource.

Should those not be reconciled either? I guess so, since depends_on semantically says that the resource is needed by the other resources.
Or they should just be evaluated the same way as if the reconcilePrecondition did not hold for this resource, taking into account that useCondition could change at runtime (since a CRD can be added to the cluster at runtime).

Alternatively, it could just ignore the resource as if it were not there, but that rather fits the (not implemented) after relation.

@Javatar81
Author

The question is what should happen if there are resources that depend on such a resource? Those should not be reconciled either?

I would say that, according to the semantics of dependsOn, defined as "The list of named dependents that need to be reconciled before this one can be.", this would mean:

useCondition evaluates to false => the dependent resource is never reconciled => a resource that depends on this resource will also never be reconciled
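
The implication chain above amounts to propagating a false useCondition along the dependsOn edges. A minimal sketch in plain Java, assuming the workflow is an acyclic graph; ActivationGraph and the resource names used with it are hypothetical and not part of JOSDK:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; models only the propagation rule discussed here.
class ActivationGraph {

    // Returns the dependents that may be reconciled: their own useCondition holds
    // and so does the useCondition of everything they (transitively) depend on.
    static Set<String> reconcilable(Map<String, List<String>> dependsOn,
                                    Map<String, Boolean> useCondition) {
        Map<String, Boolean> memo = new HashMap<>();
        Set<String> result = new TreeSet<>();
        for (String name : dependsOn.keySet()) {
            if (isActive(name, dependsOn, useCondition, memo)) {
                result.add(name);
            }
        }
        return result;
    }

    // Assumes an acyclic dependsOn graph; a missing useCondition counts as true.
    private static boolean isActive(String name,
                                    Map<String, List<String>> dependsOn,
                                    Map<String, Boolean> useCondition,
                                    Map<String, Boolean> memo) {
        Boolean cached = memo.get(name);
        if (cached != null) {
            return cached;
        }
        boolean active = useCondition.getOrDefault(name, true);
        for (String parent : dependsOn.getOrDefault(name, List.of())) {
            active = active && isActive(parent, dependsOn, useCondition, memo);
        }
        memo.put(name, active);
        return active;
    }
}
```

With a hypothetical routeConfig that depends on route, setting the useCondition of route to false removes both from the reconcilable set, while unrelated dependents stay in it.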

+1 to support changes at runtime.

@csviri
Collaborator

csviri commented Sep 19, 2023

useCondition evaluates to false => the dependent resource is never reconciled => a resource that depends on this resource will also never be reconciled

Yes, that is what I meant. I just compared it to reconcilePrecondition, which works such that if the condition does not hold (anymore), the resource is deleted, and the resources which (transitively) depend on it are deleted as well. This delete logic should apply here too.

@csviri
Collaborator

csviri commented Sep 19, 2023

Maybe platformCondition would be a better name.

@csviri csviri linked a pull request Sep 19, 2023 that will close this issue
@Javatar81
Author

I would rather refer to an official term (though I can't find one in the docs) used in the workflow definition to describe the transition into this state of becoming active. How would you define the process of enabling the KubernetesDependentResource? Some proposals:

  • activationCondition
  • registrationCondition
  • subscriptionCondition

@csviri
Collaborator

csviri commented Sep 19, 2023

IMO this very much depends on how we look at this.

subscriptionCondition, registrationCondition: we don't have such a notion in the vocabulary. We don't really subscribe or register a dependent resource, since they are statically added to the workflow anyway.
activationCondition or useCondition makes more sense to me, since we kind of want to use / activate that resource only under certain conditions. It's true that platformCondition might describe something more specific, namely that this is a condition for cases where we want to look at the specifics of a platform, but it does not really describe the effect.

So yeah, IMO then activationCondition or useCondition? (For now renamed to activationCondition in the PR.)

@csviri
Collaborator

csviri commented Oct 3, 2023

The problem with dynamic (runtime) handling is that registering a dependent resource also needs an event source / informer, and registering that at runtime is a bit problematic. Basically it needs to "stop the world", i.e. pause processing until the informer is synced.
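
One way to picture the "stop the world" requirement is a read/write lock: event processing shares the lock, while a runtime registration takes it exclusively until the initial sync is done. This is only an illustrative sketch in plain Java; RuntimeRegistrationModel is hypothetical and does not reflect how JOSDK implements event sources.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model; not a JOSDK class.
class RuntimeRegistrationModel {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> syncedSources = new ArrayList<>();

    // Normal event processing takes the shared (read) lock,
    // so many events can be processed concurrently.
    String process(String event) {
        lock.readLock().lock();
        try {
            return event + " handled with sources " + syncedSources;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Registration takes the exclusive (write) lock: no event is processed
    // until the new source has finished its initial sync.
    void registerAtRuntime(String sourceName, CompletableFuture<Void> initialSync) {
        lock.writeLock().lock();
        try {
            initialSync.join(); // blocks while the (modelled) informer syncs
            syncedSources.add(sourceName);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The cost is visible in the model: while initialSync is pending, every call to process blocks, which is the pause in processing described above.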

@csviri csviri modified the milestones: 4.5, 4.6 Oct 3, 2023
@csviri
Collaborator

csviri commented Oct 24, 2023

Came back to this, thinking about runtime registration, since the Kubernetes API is dynamic in terms of how CRDs are registered. Dynamic registration of EventSources is possible, and would even be a nice feature for other (special) use cases, but it also brings complexity, so it would be harder for users to understand what is happening in the background.

@csviri
Collaborator

csviri commented Oct 24, 2023

On the other hand, if this decision is made on startup, there might be some ordering issues. For example, if the operator checks whether cert manager is installed on the cluster, this would be nicer to do dynamically, since otherwise starting a (test) cluster and installing components would need some ordering: first install cert manager (at least the CRD), then run the controller.
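
The ordering issue can be shown with a small plain-Java model that contrasts a decision frozen at startup with one evaluated on demand. ClusterModel and StartupDecision are hypothetical names, and the CRD name is only used as an example value.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for "which CRDs does the cluster currently serve".
class ClusterModel {
    final Set<String> installedCrds = new HashSet<>();

    boolean has(String crd) {
        return installedCrds.contains(crd);
    }
}

// Evaluates the check exactly once, when constructed (i.e. at operator startup),
// and returns that frozen answer from then on.
class StartupDecision implements BooleanSupplier {
    private final boolean active;

    StartupDecision(BooleanSupplier check) {
        this.active = check.getAsBoolean();
    }

    @Override
    public boolean getAsBoolean() {
        return active;
    }
}
```

If the CRD is installed after the operator has started, the StartupDecision keeps returning false until a restart, whereas a check evaluated on each reconciliation would pick up the change.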

2 participants