
Slow reconciliations with JOSDK 5.0.3 release and failed reconciliation count in JOSDK metrics #2709

Closed
afalhambra-hivemq opened this issue Feb 27, 2025 · 15 comments


@afalhambra-hivemq

Bug Report

What did you do?

We already migrated to the JOSDK 5.0.2 release with no issues, but since the 5.0.3 release we are hitting performance degradation: slow reconciliations (greater than 1 second) during a rolling restart, along with unexpected error metrics from the JOSDK:

operator_sdk_reconciliations_failed_total{exception="KubernetesClientException",group="hivemq.com",kind="HiveMQPlatform",name="test-platform",namespace="customextensionwithsubpathinstallationit",scope="namespace",version="v1"} 1.0

What did you expect to see?

No slow reconciliations (greater than 1 second) and no error metrics when the reconciliation completes successfully.

What did you see instead? Under which circumstances?

Slow reconciliations (greater than 1 second) and an unexpected error count metric with the same KubernetesClientException when reconciling during a rolling restart.

Environment

Kubernetes cluster type:

K3S

java-operator-sdk version (from pom.xml):

5.0.3

$ java -version

openjdk 21.0.3 2024-04-16 LTS

$ kubectl version

Client Version: v1.32.2
Kustomize Version: v5.5.0
Server Version: v1.32.0

Possible Solution

Additional context

@metacosm
Collaborator

Would you happen to have the associated stacktraces?

@afalhambra-hivemq
Author

Would you happen to have the associated stacktraces?

As I mentioned in the ticket, there is no error or stacktrace available in the logs, since all the reconciliation loops complete without further issues. Could it be an internal error generated by the JOSDK or the Fabric8 client?

Would it be useful if I increase the log level of the JOSDK to DEBUG and attach the log?

@csviri
Collaborator

csviri commented Feb 27, 2025

@afalhambra-hivemq Yes, please turn on debug-level logs and see if there is something useful.

@csviri
Collaborator

csviri commented Feb 27, 2025

Also, is this operator open source?

@csviri
Collaborator

csviri commented Feb 27, 2025

It is quite strange; effectively only this changed:
https://github.com/operator-framework/java-operator-sdk/pull/2696/files
The only difference is that we now clone the resource before doing the patch update.

@metacosm
Collaborator

It is quite strange; effectively only this changed: https://github.com/operator-framework/java-operator-sdk/pull/2696/files The only difference is that we now clone the resource before doing the patch update.

well, cloning is a potentially costly operation so it's not that big of a surprise…
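
For illustration, a deep clone of a Kubernetes resource is typically done through a serialize/deserialize round trip, for example with the Fabric8 client's Serialization utility. This is only a minimal sketch of the idea, not the actual JOSDK code:

import io.fabric8.kubernetes.api.model.HasMetadata;
import io.fabric8.kubernetes.client.utils.Serialization;

final class CloneCostSketch {

    // Deep clone through a JSON serialize/deserialize round trip: semantically
    // safe, but it adds (de)serialization work to every patch, which can show
    // up as slower reconciliations for large custom resources.
    static <T extends HasMetadata> T deepClone(T resource) {
        return Serialization.clone(resource);
    }
}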

@afalhambra-hivemq
Author

Below is the only exception stacktrace I noticed after setting the log level to DEBUG.

11:20:19.705 [ReconcilerExecutor-hivemq-controller-524] INFO  c.h.p.o.h.HiveMQPlatformReconcilerRollingRestartHandler - [operator] Stopping surge Pod
11:20:19.728 [-1149346687-pool-2-thread-14] INFO  c.h.p.o.t.OperatorK3sContainer - [EVENT] Normal [RollingRestart] HiveMQ Platform is stopping the surge Pod (rolling restart) [operator-multi-label-selector-migration:test-platform]
11:20:19.728 [ReconcilerExecutor-hivemq-controller-524] DEBUG c.h.p.operator.event.EventSender - [operator] Updating K8s event rolling-restart-scale-down-in-progress: [RollingRestart] HiveMQ Platform is stopping the surge Pod (rolling restart)
11:20:19.728 [ReconcilerExecutor-hivemq-controller-524] INFO  c.h.p.o.h.AbstractHiveMQReconcilerStateHandler - [operator] Update HiveMQ Platform Status (ROLLING_RESTART [RESTART_PODS_IN_PROGRESS] -> ROLLING_RESTART [SCALE_DOWN_IN_PROGRESS]): HiveMQ Platform is stopping the surge Pod (rolling restart)
11:20:19.748 [ReconcilerExecutor-hivemq-controller-524] DEBUG i.j.o.p.event.EventProcessor - Event processing finished. Scope: ExecutionScope{ resource id: ResourceID{name='test-platform', namespace='operator-multi-label-selector-migration'}, version: 1141}, PostExecutionControl: PostExecutionControl{onlyFinalizerHandled=false, updatedCustomResource=null, runtimeException=io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PATCH at: https://localhost:32775/apis/hivemq.com/v1/namespaces/operator-multi-label-selector-migration/hivemq-platforms/test-platform/status?fieldManager=hivemq-controller&force=true. Message: Operation cannot be fulfilled on hivemq-platforms.hivemq.com "test-platform": the object has been modified; please apply your changes to the latest version and try again. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=hivemq.com, kind=hivemq-platforms, name=test-platform, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Operation cannot be fulfilled on hivemq-platforms.hivemq.com "test-platform": the object has been modified; please apply your changes to the latest version and try again, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Conflict, status=Failure, additionalProperties={}).}
11:20:19.748 [ReconcilerExecutor-hivemq-controller-524] DEBUG i.j.o.p.event.EventProcessor - Full client conflict error during event processing ExecutionScope{ resource id: ResourceID{name='test-platform', namespace='operator-multi-label-selector-migration'}, version: 1141}
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PATCH at: https://localhost:32775/apis/hivemq.com/v1/namespaces/operator-multi-label-selector-migration/hivemq-platforms/test-platform/status?fieldManager=hivemq-controller&force=true. Message: Operation cannot be fulfilled on hivemq-platforms.hivemq.com "test-platform": the object has been modified; please apply your changes to the latest version and try again. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=hivemq.com, kind=hivemq-platforms, name=test-platform, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Operation cannot be fulfilled on hivemq-platforms.hivemq.com "test-platform": the object has been modified; please apply your changes to the latest version and try again, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Conflict, status=Failure, additionalProperties={}).
	at io.fabric8.kubernetes.client.KubernetesClientException.copyAsCause(KubernetesClientException.java:205)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:507)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:524)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handlePatch(OperationSupport.java:419)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handlePatch(OperationSupport.java:397)
	at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handlePatch(BaseOperation.java:764)
	at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.lambda$patch$2(HasMetadataOperation.java:231)
	at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.patch(HasMetadataOperation.java:236)
	at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.patch(HasMetadataOperation.java:251)
	at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.patch(HasMetadataOperation.java:44)
	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher$CustomResourceFacade.patchStatus(ReconciliationDispatcher.java:413)
	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.reconcileExecution(ReconciliationDispatcher.java:168)
	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleReconcile(ReconciliationDispatcher.java:125)
	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleDispatch(ReconciliationDispatcher.java:94)
	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleExecution(ReconciliationDispatcher.java:67)
	at io.javaoperatorsdk.operator.processing.event.EventProcessor$ReconcilerExecutor.run(EventProcessor.java:444)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PATCH at: https://localhost:32775/apis/hivemq.com/v1/namespaces/operator-multi-label-selector-migration/hivemq-platforms/test-platform/status?fieldManager=hivemq-controller&force=true. Message: Operation cannot be fulfilled on hivemq-platforms.hivemq.com "test-platform": the object has been modified; please apply your changes to the latest version and try again. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=hivemq.com, kind=hivemq-platforms, name=test-platform, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Operation cannot be fulfilled on hivemq-platforms.hivemq.com "test-platform": the object has been modified; please apply your changes to the latest version and try again, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Conflict, status=Failure, additionalProperties={}).
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.requestFailure(OperationSupport.java:642)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.requestFailure(OperationSupport.java:622)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.assertResponseCode(OperationSupport.java:582)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.lambda$handleResponse$0(OperationSupport.java:549)
	at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:646)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2179)
	at io.fabric8.kubernetes.client.http.StandardHttpClient.lambda$completeOrCancel$10(StandardHttpClient.java:141)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2179)
	at io.fabric8.kubernetes.client.http.ByteArrayBodyHandler.onBodyDone(ByteArrayBodyHandler.java:51)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2179)
	at io.fabric8.kubernetes.client.vertx.VertxHttpRequest.lambda$consumeBytes$1(VertxHttpRequest.java:120)
	at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:270)
	at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:252)
	at io.vertx.core.http.impl.HttpEventHandler.handleEnd(HttpEventHandler.java:76)
	at io.vertx.core.http.impl.HttpClientResponseImpl.handleEnd(HttpClientResponseImpl.java:250)
	at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.lambda$new$0(Http1xClientConnection.java:421)
	at io.vertx.core.streams.impl.InboundBuffer.handleEvent(InboundBuffer.java:279)
	at io.vertx.core.streams.impl.InboundBuffer.write(InboundBuffer.java:157)
	at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.handleEnd(Http1xClientConnection.java:731)
	at io.vertx.core.impl.ContextImpl.execute(ContextImpl.java:327)
	at io.vertx.core.impl.ContextImpl.execute(ContextImpl.java:307)
	at io.vertx.core.http.impl.Http1xClientConnection.handleResponseEnd(Http1xClientConnection.java:962)
	at io.vertx.core.http.impl.Http1xClientConnection.handleHttpMessage(Http1xClientConnection.java:832)
	at io.vertx.core.http.impl.Http1xClientConnection.handleMessage(Http1xClientConnection.java:796)
	at io.vertx.core.net.impl.ConnectionBase.read(ConnectionBase.java:159)
	at io.vertx.core.net.impl.VertxHandler.channelRead(VertxHandler.java:153)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
	at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
	at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1515)
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1378)
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1427)
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:868)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:796)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:732)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:658)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:998)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	... 1 common frames omitted
11:20:19.758 [ReconcilerExecutor-hivemq-controller-524] WARN  i.j.o.p.event.EventProcessor - Resource Kubernetes Resource Creator/Update Conflict during reconciliation. Message: Failure executing: PATCH at: https://localhost:32775/apis/hivemq.com/v1/namespaces/operator-multi-label-selector-migration/hivemq-platforms/test-platform/status?fieldManager=hivemq-controller&force=true. Message: Operation cannot be fulfilled on hivemq-platforms.hivemq.com "test-platform": the object has been modified; please apply your changes to the latest version and try again. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=hivemq.com, kind=hivemq-platforms, name=test-platform, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Operation cannot be fulfilled on hivemq-platforms.hivemq.com "test-platform": the object has been modified; please apply your changes to the latest version and try again, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Conflict, status=Failure, additionalProperties={}). Resource name: null
11:20:19.759 [ReconcilerExecutor-hivemq-controller-524] DEBUG i.j.o.p.event.EventProcessor - Scheduling timer event for retry with delay:2000 for resource: ResourceID{name='test-platform', namespace='operator-multi-label-selector-migration'}

This is thrown when there is a rolling restart of the Pods. Our reconciler is explicitly configured to use SSA:

                .withUseSSAToPatchPrimaryResource(true)
                .withSSABasedCreateUpdateMatchForDependentResources(true));
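
For reference, roughly how these overrides are wired when the Operator is created (a minimal sketch, assuming JOSDK 5.x's ConfigurationServiceOverrider and a hypothetical HiveMQPlatformReconciler class):

import io.javaoperatorsdk.operator.Operator;

public class OperatorBootstrapSketch {

    public static void main(String[] args) {
        // Sketch only: enable SSA for patching the primary resource and
        // SSA-based create/update matching for dependent resources.
        Operator operator = new Operator(overrider -> overrider
                .withUseSSAToPatchPrimaryResource(true)
                .withSSABasedCreateUpdateMatchForDependentResources(true));
        operator.register(new HiveMQPlatformReconciler()); // hypothetical reconciler class
        operator.start();
    }
}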

It's also odd that the reconciliation loops are taking longer than with version 5.0.2.

@metacosm
Collaborator

If there are exceptions and retries in place, then it makes sense that the reconciliations take longer, as JOSDK will probably need multiple attempts to perform an operation that previously succeeded on the first try.

@csviri
Collaborator

csviri commented Feb 27, 2025

@afalhambra-hivemq What changed is that with SSA, since (I guess) you are not passing a fresh resource, the resourceVersion is now set on the resource, so the patch now performs optimistic locking (it did not before). What you can do is set metadata.resourceVersion to null before UpdateControl.patchStatus() is called. That will resolve this issue.

Will add this into the blog post, but see also: https://javaoperatorsdk.io/blog/2025/02/25/from-legacy-approach-to-server-side-apply/
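
A minimal sketch of that workaround inside the reconciler (assuming the HiveMQPlatform primary type from this thread): clearing metadata.resourceVersion makes the SSA status patch skip optimistic locking, so the 409 conflict can no longer occur.

import io.javaoperatorsdk.operator.api.reconciler.Context;
import io.javaoperatorsdk.operator.api.reconciler.Reconciler;
import io.javaoperatorsdk.operator.api.reconciler.UpdateControl;

public class HiveMQPlatformReconcilerSketch implements Reconciler<HiveMQPlatform> {

    @Override
    public UpdateControl<HiveMQPlatform> reconcile(HiveMQPlatform resource,
                                                   Context<HiveMQPlatform> context) {
        // ... regular reconciliation logic that updates resource.getStatus() ...

        // Drop the resourceVersion so the SSA status patch is not sent with
        // optimistic locking and cannot fail with a 409 Conflict.
        resource.getMetadata().setResourceVersion(null);
        return UpdateControl.patchStatus(resource);
    }
}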

@csviri
Collaborator

csviri commented Feb 27, 2025

see also: #2710

@afalhambra-hivemq
Author

To give some context, this is happening during a rolling restart in the main reconciliation loop. We have some managed dependent resources, in this case a StatefulSet:

@KubernetesDependent(informer = @Informer(labelSelector = LABEL_SELECTOR))
public class StatefulSetResource extends CRUDKubernetesDependentResource<StatefulSet, HiveMQPlatform> {
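
For context, a minimal sketch of how such a dependent resource is typically wired in JOSDK; the constructor and desired() body are illustrative only, not the actual HiveMQ implementation, and exact signatures may vary slightly between JOSDK versions.

import io.fabric8.kubernetes.api.model.apps.StatefulSet;
import io.fabric8.kubernetes.api.model.apps.StatefulSetBuilder;
import io.javaoperatorsdk.operator.api.reconciler.Context;
import io.javaoperatorsdk.operator.processing.dependent.kubernetes.CRUDKubernetesDependentResource;

public class StatefulSetResourceSketch
        extends CRUDKubernetesDependentResource<StatefulSet, HiveMQPlatform> {

    public StatefulSetResourceSketch() {
        super(StatefulSet.class);
    }

    @Override
    protected StatefulSet desired(HiveMQPlatform primary, Context<HiveMQPlatform> context) {
        // Build the desired StatefulSet from the primary's spec (details omitted);
        // JOSDK then creates/patches it and matches it against the cluster state via SSA.
        return new StatefulSetBuilder()
                .withNewMetadata()
                    .withName(primary.getMetadata().getName())
                    .withNamespace(primary.getMetadata().getNamespace())
                .endMetadata()
                .build();
    }
}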

For that dependent resource, the update is skipped in the reconciliation loop; no action is needed since it matches the existing one:

11:20:19.652 [pool-38-thread-33] DEBUG c.h.p.o.d.StatefulSetResource - [operator] Desired StatefulSet (3 replicas) for status ROLLING_RESTART (RESTART_PODS_IN_PROGRESS)
11:20:19.652 [pool-38-thread-33] DEBUG i.j.o.p.e.s.i.InformerEventSource - Using PrimaryToSecondaryMapper to find secondary resources for primary: CustomResource{kind='HiveMQPlatform', apiVersion='hivemq.com/v1', metadata=ObjectMeta(annotations={meta.helm.sh/release-name=test-platform, meta.helm.sh/release-namespace=operator-multi-label-selector-migration}, creationTimestamp=2025-02-27T11:16:50Z, deletionGracePeriodSeconds=null, deletionTimestamp=null, finalizers=[hivemq-platforms.hivemq.com/finalizer], generateName=null, generation=2, labels={app.kubernetes.io/instance=test-platform, app.kubernetes.io/managed-by=Helm, app.kubernetes.io/name=hivemq-platform, app.kubernetes.io/version=4.x.y, helm.sh/chart=hivemq-platform-0.x.y}, managedFields=[ManagedFieldsEntry(apiVersion=hivemq.com/v1, fieldsType=FieldsV1, fieldsV1=FieldsV1(additionalProperties={f:metadata={f:annotations={f:meta.helm.sh/release-name={}, f:meta.helm.sh/release-namespace={}}, f:finalizers={v:"hivemq-platforms.hivemq.com/finalizer"={}}, f:labels={f:app.kubernetes.io/instance={}, f:app.kubernetes.io/managed-by={}, f:app.kubernetes.io/name={}, f:app.kubernetes.io/version={}, f:helm.sh/chart={}}}, f:status={f:crdVersion={}, f:message={}, f:reconciliationRequests={}, f:recoveryInformation={f:message={}, f:state={}, f:statePhase={}}, f:restartExtensions={}, f:state={}, f:statePhase={}}}), manager=hivemq-controller, operation=Apply, subresource=status, time=2025-02-27T11:18:45Z, additionalProperties={}), ManagedFieldsEntry(apiVersion=hivemq.com/v1, fieldsType=FieldsV1, fieldsV1=FieldsV1(additionalProperties={f:status={.={}, f:crdVersion={}, f:recoveryInformation={}, f:restartExtensions={}}}), manager=fabric8-kubernetes-client, operation=Update, subresource=status, time=2025-02-27T11:16:50Z, additionalProperties={}), ManagedFieldsEntry(apiVersion=hivemq.com/v1, fieldsType=FieldsV1, fieldsV1=FieldsV1(additionalProperties={f:metadata={f:annotations={.={}, f:meta.helm.sh/release-name={}, f:meta.helm.sh/release-namespace={}}, f:finalizers={.={}, v:"hivemq-platforms.hivemq.com/finalizer"={}}, f:labels={.={}, f:app.kubernetes.io/instance={}, f:app.kubernetes.io/managed-by={}, f:app.kubernetes.io/name={}, f:app.kubernetes.io/version={}, f:helm.sh/chart={}}}, f:spec={.={}, f:configMapName={}, f:enabled={}, f:extensions={}, f:healthApiPort={}, f:logLevel={}, f:metricsPath={}, f:metricsPort={}, f:operatorRestApiPort={}, f:secretName={}, f:services={}, f:statefulSet={.={}, f:metadata={}, f:spec={.={}, f:replicas={}, f:template={.={}, f:spec={.={}, f:containers={}}}}}, f:statefulSetMigration={}}}), manager=fabric8-kubernetes-client, operation=Update, subresource=null, time=2025-02-27T11:18:08Z, additionalProperties={})], name=test-platform, namespace=operator-multi-label-selector-migration, ownerReferences=[], resourceVersion=1141, selfLink=null, uid=e6853a0d-65bf-402c-87fd-fd7de6c20153, additionalProperties={}), spec=com.hivemq.platform.operator.v1.HiveMQPlatformSpec@5a0d1321, status=com.hivemq.platform.operator.v1.HiveMQPlatformStatus@707e94f1, deprecated=false, deprecationWarning=null}. Found secondary ids: [ResourceID{name='hivemq-configuration-test-platform', namespace='operator-multi-label-selector-migration'}] 
11:20:19.652 [pool-38-thread-33] DEBUG i.j.o.p.e.s.i.ManagedInformerEventSource - Resource not found in temporary cache reading it from informer cache, for Resource ID: ResourceID{name='hivemq-configuration-test-platform', namespace='operator-multi-label-selector-migration'}
11:20:19.652 [pool-38-thread-33] DEBUG i.j.o.p.e.s.i.ManagedInformerEventSource - Resource found in cache: true for id: ResourceID{name='hivemq-configuration-test-platform', namespace='operator-multi-label-selector-migration'}
11:20:19.656 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: metadata actual map value: managedFieldValue: {f:annotations={f:javaoperatorsdk.io/previous={}, f:meta.helm.sh/release-name={}, f:meta.helm.sh/release-namespace={}}, f:labels={f:app.kubernetes.io/instance={}, f:app.kubernetes.io/managed-by={}, f:app.kubernetes.io/name={}, f:app.kubernetes.io/version={}, f:helm.sh/chart={}, f:hivemq-platform={}}, f:ownerReferences={k:{"uid":"e6853a0d-65bf-402c-87fd-fd7de6c20153"}={}}}
11:20:19.656 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: annotations actual map value: managedFieldValue: {f:javaoperatorsdk.io/previous={}, f:meta.helm.sh/release-name={}, f:meta.helm.sh/release-namespace={}}
11:20:19.656 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: labels actual map value: managedFieldValue: {f:app.kubernetes.io/instance={}, f:app.kubernetes.io/managed-by={}, f:app.kubernetes.io/name={}, f:app.kubernetes.io/version={}, f:helm.sh/chart={}, f:hivemq-platform={}}
11:20:19.656 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: spec actual map value: managedFieldValue: {f:replicas={}, f:selector={}, f:serviceName={}, f:template={f:metadata={f:annotations={f:kubernetes-resource-versions={}, f:meta.helm.sh/release-name={}, f:meta.helm.sh/release-namespace={}}, f:labels={f:app.kubernetes.io/instance={}, f:app.kubernetes.io/managed-by={}, f:app.kubernetes.io/name={}, f:app.kubernetes.io/version={}, f:helm.sh/chart={}, f:hivemq-platform={}}}, f:spec={f:containers={k:{"name":"hivemq"}={.={}, f:command={}, f:env={k:{"name":"HIVEMQ_BIND_ADDRESS"}={.={}, f:name={}, f:valueFrom={f:fieldRef={}}}, k:{"name":"JAVA_OPTS"}={.={}, f:name={}, f:value={}}}, f:image={}, f:imagePullPolicy={}, f:livenessProbe={f:failureThreshold={}, f:httpGet={f:path={}, f:port={}, f:scheme={}}, f:initialDelaySeconds={}, f:periodSeconds={}}, f:name={}, f:ports={k:{"containerPort":1883,"protocol":"TCP"}={.={}, f:containerPort={}, f:name={}}, k:{"containerPort":8080,"protocol":"TCP"}={.={}, f:containerPort={}, f:name={}}, k:{"containerPort":9399,"protocol":"TCP"}={.={}, f:containerPort={}, f:name={}}}, f:readinessProbe={f:failureThreshold={}, f:httpGet={f:path={}, f:port={}, f:scheme={}}, f:initialDelaySeconds={}, f:periodSeconds={}}, f:resources={}, f:volumeMounts={k:{"mountPath":"/etc/podinfo/"}={.={}, f:mountPath={}, f:name={}}, k:{"mountPath":"/opt/hivemq/conf-k8s/"}={.={}, f:mountPath={}, f:name={}}, k:{"mountPath":"/opt/hivemq/operator/"}={.={}, f:mountPath={}, f:name={}}}}}, f:initContainers={k:{"name":"hivemq-platform-operator-init"}={.={}, f:image={}, f:imagePullPolicy={}, f:name={}, f:resources={f:limits={f:cpu={}, f:ephemeral-storage={}, f:memory={}}, f:requests={f:cpu={}, f:ephemeral-storage={}, f:memory={}}}, f:volumeMounts={k:{"mountPath":"/hivemq"}={.={}, f:mountPath={}, f:name={}}}}}, f:securityContext={}, f:serviceAccountName={}, f:terminationGracePeriodSeconds={}, f:volumes={k:{"name":"broker-configuration"}={.={}, f:configMap={f:name={}}, f:name={}}, k:{"name":"operator-init"}={.={}, f:emptyDir={}, f:name={}}, k:{"name":"pod-info"}={.={}, f:configMap={f:name={}}, f:name={}}}}}, f:updateStrategy={f:type={}}}
11:20:19.657 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: template actual map value: managedFieldValue: {f:metadata={f:annotations={f:kubernetes-resource-versions={}, f:meta.helm.sh/release-name={}, f:meta.helm.sh/release-namespace={}}, f:labels={f:app.kubernetes.io/instance={}, f:app.kubernetes.io/managed-by={}, f:app.kubernetes.io/name={}, f:app.kubernetes.io/version={}, f:helm.sh/chart={}, f:hivemq-platform={}}}, f:spec={f:containers={k:{"name":"hivemq"}={.={}, f:command={}, f:env={k:{"name":"HIVEMQ_BIND_ADDRESS"}={.={}, f:name={}, f:valueFrom={f:fieldRef={}}}, k:{"name":"JAVA_OPTS"}={.={}, f:name={}, f:value={}}}, f:image={}, f:imagePullPolicy={}, f:livenessProbe={f:failureThreshold={}, f:httpGet={f:path={}, f:port={}, f:scheme={}}, f:initialDelaySeconds={}, f:periodSeconds={}}, f:name={}, f:ports={k:{"containerPort":1883,"protocol":"TCP"}={.={}, f:containerPort={}, f:name={}}, k:{"containerPort":8080,"protocol":"TCP"}={.={}, f:containerPort={}, f:name={}}, k:{"containerPort":9399,"protocol":"TCP"}={.={}, f:containerPort={}, f:name={}}}, f:readinessProbe={f:failureThreshold={}, f:httpGet={f:path={}, f:port={}, f:scheme={}}, f:initialDelaySeconds={}, f:periodSeconds={}}, f:resources={}, f:volumeMounts={k:{"mountPath":"/etc/podinfo/"}={.={}, f:mountPath={}, f:name={}}, k:{"mountPath":"/opt/hivemq/conf-k8s/"}={.={}, f:mountPath={}, f:name={}}, k:{"mountPath":"/opt/hivemq/operator/"}={.={}, f:mountPath={}, f:name={}}}}}, f:initContainers={k:{"name":"hivemq-platform-operator-init"}={.={}, f:image={}, f:imagePullPolicy={}, f:name={}, f:resources={f:limits={f:cpu={}, f:ephemeral-storage={}, f:memory={}}, f:requests={f:cpu={}, f:ephemeral-storage={}, f:memory={}}}, f:volumeMounts={k:{"mountPath":"/hivemq"}={.={}, f:mountPath={}, f:name={}}}}}, f:securityContext={}, f:serviceAccountName={}, f:terminationGracePeriodSeconds={}, f:volumes={k:{"name":"broker-configuration"}={.={}, f:configMap={f:name={}}, f:name={}}, k:{"name":"operator-init"}={.={}, f:emptyDir={}, f:name={}}, k:{"name":"pod-info"}={.={}, f:configMap={f:name={}}, f:name={}}}}}
11:20:19.657 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: metadata actual map value: managedFieldValue: {f:annotations={f:kubernetes-resource-versions={}, f:meta.helm.sh/release-name={}, f:meta.helm.sh/release-namespace={}}, f:labels={f:app.kubernetes.io/instance={}, f:app.kubernetes.io/managed-by={}, f:app.kubernetes.io/name={}, f:app.kubernetes.io/version={}, f:helm.sh/chart={}, f:hivemq-platform={}}}
11:20:19.657 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: annotations actual map value: managedFieldValue: {f:kubernetes-resource-versions={}, f:meta.helm.sh/release-name={}, f:meta.helm.sh/release-namespace={}}
11:20:19.657 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: labels actual map value: managedFieldValue: {f:app.kubernetes.io/instance={}, f:app.kubernetes.io/managed-by={}, f:app.kubernetes.io/name={}, f:app.kubernetes.io/version={}, f:helm.sh/chart={}, f:hivemq-platform={}}
11:20:19.658 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: spec actual map value: managedFieldValue: {f:containers={k:{"name":"hivemq"}={.={}, f:command={}, f:env={k:{"name":"HIVEMQ_BIND_ADDRESS"}={.={}, f:name={}, f:valueFrom={f:fieldRef={}}}, k:{"name":"JAVA_OPTS"}={.={}, f:name={}, f:value={}}}, f:image={}, f:imagePullPolicy={}, f:livenessProbe={f:failureThreshold={}, f:httpGet={f:path={}, f:port={}, f:scheme={}}, f:initialDelaySeconds={}, f:periodSeconds={}}, f:name={}, f:ports={k:{"containerPort":1883,"protocol":"TCP"}={.={}, f:containerPort={}, f:name={}}, k:{"containerPort":8080,"protocol":"TCP"}={.={}, f:containerPort={}, f:name={}}, k:{"containerPort":9399,"protocol":"TCP"}={.={}, f:containerPort={}, f:name={}}}, f:readinessProbe={f:failureThreshold={}, f:httpGet={f:path={}, f:port={}, f:scheme={}}, f:initialDelaySeconds={}, f:periodSeconds={}}, f:resources={}, f:volumeMounts={k:{"mountPath":"/etc/podinfo/"}={.={}, f:mountPath={}, f:name={}}, k:{"mountPath":"/opt/hivemq/conf-k8s/"}={.={}, f:mountPath={}, f:name={}}, k:{"mountPath":"/opt/hivemq/operator/"}={.={}, f:mountPath={}, f:name={}}}}}, f:initContainers={k:{"name":"hivemq-platform-operator-init"}={.={}, f:image={}, f:imagePullPolicy={}, f:name={}, f:resources={f:limits={f:cpu={}, f:ephemeral-storage={}, f:memory={}}, f:requests={f:cpu={}, f:ephemeral-storage={}, f:memory={}}}, f:volumeMounts={k:{"mountPath":"/hivemq"}={.={}, f:mountPath={}, f:name={}}}}}, f:securityContext={}, f:serviceAccountName={}, f:terminationGracePeriodSeconds={}, f:volumes={k:{"name":"broker-configuration"}={.={}, f:configMap={f:name={}}, f:name={}}, k:{"name":"operator-init"}={.={}, f:emptyDir={}, f:name={}}, k:{"name":"pod-info"}={.={}, f:configMap={f:name={}}, f:name={}}}}
11:20:19.658 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: valueFrom actual map value: managedFieldValue: {f:fieldRef={}}
11:20:19.658 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: livenessProbe actual map value: managedFieldValue: {f:failureThreshold={}, f:httpGet={f:path={}, f:port={}, f:scheme={}}, f:initialDelaySeconds={}, f:periodSeconds={}}
11:20:19.658 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: httpGet actual map value: managedFieldValue: {f:path={}, f:port={}, f:scheme={}}
11:20:19.658 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: readinessProbe actual map value: managedFieldValue: {f:failureThreshold={}, f:httpGet={f:path={}, f:port={}, f:scheme={}}, f:initialDelaySeconds={}, f:periodSeconds={}}
11:20:19.658 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: httpGet actual map value: managedFieldValue: {f:path={}, f:port={}, f:scheme={}}
11:20:19.658 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: resources actual map value: managedFieldValue: {f:limits={f:cpu={}, f:ephemeral-storage={}, f:memory={}}, f:requests={f:cpu={}, f:ephemeral-storage={}, f:memory={}}}
11:20:19.658 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: limits actual map value: managedFieldValue: {f:cpu={}, f:ephemeral-storage={}, f:memory={}}
11:20:19.658 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: requests actual map value: managedFieldValue: {f:cpu={}, f:ephemeral-storage={}, f:memory={}}
11:20:19.658 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: configMap actual map value: managedFieldValue: {f:name={}}
11:20:19.658 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: configMap actual map value: managedFieldValue: {f:name={}}
11:20:19.658 [pool-38-thread-33] DEBUG i.j.o.p.d.k.SSABasedGenericKubernetesResourceMatcher - key: updateStrategy actual map value: managedFieldValue: {f:type={}}
11:20:19.658 [pool-38-thread-33] DEBUG i.j.o.p.d.AbstractDependentResource - Update skipped for dependent ResourceID{name='test-platform', namespace='operator-multi-label-selector-migration'} as it matched the existing one

Then the main reconciliation is invoked, and when patching the status of the custom resource there, this exception is thrown. But no further changes are made to the custom resource, so it should be a fresh and up-to-date instance, or am I wrong?

@csviri
Collaborator

csviri commented Feb 27, 2025

this exception is thrown. But no further changes are made to the custom resource, so it should be a fresh and up-to-date instance, or am I wrong?

Well, unless something modifies it in the background, it should be. But this is clearly about the conflict, and for a status patch there is usually no reason to do optimistic locking, so I would just adjust it to not include the resourceVersion.

@afalhambra-hivemq
Author

OK, I confirm that setting the resourceVersion to null fixes this issue. I will try to find out what is causing the resource to not be up to date.

But I'm still curious, since this issue should have come up in the 5.0.2 release as well. I'm not sure how this PR for the 5.0.3 release can affect that now.

@csviri
Collaborator

csviri commented Feb 27, 2025

But I'm still curious, since this issue should have come up in the 5.0.2 release as well. I'm not sure how this PR for the 5.0.3 release can affect that now.

Before that PR, the resourceVersion was explicitly set to null for SSA too; after that PR it no longer is.

@csviri
Collaborator

csviri commented Feb 27, 2025

But mainly, I'm glad it helped. I will close this issue if there are no objections.

@csviri csviri closed this as completed Feb 27, 2025