Skip to content

ErrorStatusHandler - support for better error reporting in status #685

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Nov 19, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 24 additions & 0 deletions docs/documentation/features.md
Original file line number Diff line number Diff line change
Expand Up @@ -127,6 +127,30 @@ won't be a result of a retry, but the new event.

A successful execution resets the retry.

### Setting Error Status After Last Retry Attempt

In order to facilitate error reporting in case a last retry attempt fails, Reconciler can implement the following
interface:

```java
public interface ErrorStatusHandler<T extends CustomResource<?, ?>> {

T updateErrorStatus(T resource, RuntimeException e);

}
```

The `updateErrorStatus` resource is called when it's the last retry attempt according the retry configuration and the
reconciler execution still resulted in a runtime exception.

The result of the method call is used to make a status sub-resource update on the custom resource. This is always a
sub-resource update request, so no update on custom resource itself (like spec of metadata) happens. Note that this
update request will also produce an event, and will result in a reconciliation if the controller is not generation
aware.

Note that the scope of this feature is only the `reconcile` method of the reconciler, since there should not be updates
on custom resource after it is marked for deletion.

### Correctness and Automatic Retries

There is a possibility to turn of the automatic retries. This is not desirable, unless there is a very specific reason.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,6 @@

public interface Cloner {

CustomResource<?, ?> clone(CustomResource<?, ?> object);
<T extends CustomResource<?, ?>> T clone(T object);

}
Original file line number Diff line number Diff line change
Expand Up @@ -19,9 +19,10 @@ public interface ConfigurationService {
private final ObjectMapper OBJECT_MAPPER = new ObjectMapper();

@Override
public CustomResource<?, ?> clone(CustomResource<?, ?> object) {
public <T extends CustomResource<?, ?>> T clone(T object) {
try {
return OBJECT_MAPPER.readValue(OBJECT_MAPPER.writeValueAsString(object), object.getClass());
return OBJECT_MAPPER.readValue(OBJECT_MAPPER.writeValueAsString(object),
(Class<T>) object.getClass());
} catch (JsonProcessingException e) {
throw new IllegalStateException(e);
}
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
package io.javaoperatorsdk.operator.api.reconciler;

import io.fabric8.kubernetes.client.CustomResource;

public interface ErrorStatusHandler<T extends CustomResource<?, ?>> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we have the HasStatus (or similar) interface, this probably should use that instead of CustomResource

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be moved to HasMetada if the other PR is merged.


/**
* <p>
* Reconcile can implement this interface in order to update the status sub-resource in the case
* when the last reconciliation retry attempt is failed on the Reconciler. In that case the
* updateErrorStatus is called automatically.
* <p>
* The result of the method call is used to make a status sub-resource update on the custom
* resource. This is always a sub-resource update request, so no update on custom resource itself
* (like spec of metadata) happens. Note that this update request will also produce an event, and
* will result in a reconciliation if the controller is not generation aware.
* <p>
* Note that the scope of this feature is only the reconcile method of the reconciler, since there
* should not be updates on custom resource after it is marked for deletion.
*
* @param resource to update the status on
* @param e exception thrown from the reconciler
* @return the updated resource
*/
T updateErrorStatus(T resource, RuntimeException e);

}
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
import io.javaoperatorsdk.operator.api.reconciler.Context;
import io.javaoperatorsdk.operator.api.reconciler.DefaultContext;
import io.javaoperatorsdk.operator.api.reconciler.DeleteControl;
import io.javaoperatorsdk.operator.api.reconciler.ErrorStatusHandler;
import io.javaoperatorsdk.operator.api.reconciler.Reconciler;
import io.javaoperatorsdk.operator.api.reconciler.UpdateControl;

Expand Down Expand Up @@ -109,27 +110,76 @@ private PostExecutionControl<R> handleCreateOrUpdate(
updateCustomResourceWithFinalizer(resource);
return PostExecutionControl.onlyFinalizerAdded();
} else {
log.debug(
"Executing createOrUpdate for resource {} with version: {} with execution scope: {}",
getName(resource),
getVersion(resource),
executionScope);
try {
var resourceForExecution =
cloneResourceForErrorStatusHandlerIfNeeded(resource, context);
return createOrUpdateExecution(executionScope, resourceForExecution, context);
} catch (RuntimeException e) {
handleLastAttemptErrorStatusHandler(resource, context, e);
throw e;
}
}
}

/**
* Resource make sense only to clone for the ErrorStatusHandler. Otherwise, this operation can be
* skipped since it can be memory and time-consuming. However, it needs to be cloned since it's
* common that the custom resource is changed during an execution, and it's much cleaner to have
* to original resource in place for status update.
*/
private R cloneResourceForErrorStatusHandlerIfNeeded(R resource, Context context) {
if (isLastAttemptOfRetryAndErrorStatusHandlerPresent(context)) {
return controller.getConfiguration().getConfigurationService().getResourceCloner()
.clone(resource);
} else {
return resource;
}
}

UpdateControl<R> updateControl = controller.reconcile(resource, context);
R updatedCustomResource = null;
if (updateControl.isUpdateCustomResourceAndStatusSubResource()) {
updatedCustomResource = updateCustomResource(updateControl.getCustomResource());
updateControl
.getCustomResource()
.getMetadata()
.setResourceVersion(updatedCustomResource.getMetadata().getResourceVersion());
updatedCustomResource = updateStatusGenerationAware(updateControl.getCustomResource());
} else if (updateControl.isUpdateStatusSubResource()) {
updatedCustomResource = updateStatusGenerationAware(updateControl.getCustomResource());
} else if (updateControl.isUpdateCustomResource()) {
updatedCustomResource = updateCustomResource(updateControl.getCustomResource());
private PostExecutionControl<R> createOrUpdateExecution(ExecutionScope<R> executionScope,
R resource, Context context) {
log.debug(
"Executing createOrUpdate for resource {} with version: {} with execution scope: {}",
getName(resource),
getVersion(resource),
executionScope);

UpdateControl<R> updateControl = controller.reconcile(resource, context);
R updatedCustomResource = null;
if (updateControl.isUpdateCustomResourceAndStatusSubResource()) {
updatedCustomResource = updateCustomResource(updateControl.getCustomResource());
updateControl
.getCustomResource()
.getMetadata()
.setResourceVersion(updatedCustomResource.getMetadata().getResourceVersion());
updatedCustomResource = updateStatusGenerationAware(updateControl.getCustomResource());
} else if (updateControl.isUpdateStatusSubResource()) {
updatedCustomResource = updateStatusGenerationAware(updateControl.getCustomResource());
} else if (updateControl.isUpdateCustomResource()) {
updatedCustomResource = updateCustomResource(updateControl.getCustomResource());
}
return createPostExecutionControl(updatedCustomResource, updateControl);
}

private void handleLastAttemptErrorStatusHandler(R resource, Context context,
RuntimeException e) {
if (isLastAttemptOfRetryAndErrorStatusHandlerPresent(context)) {
try {
var updatedResource = ((ErrorStatusHandler<R>) controller.getReconciler())
.updateErrorStatus(resource, e);
customResourceFacade.updateStatus(updatedResource);
} catch (RuntimeException ex) {
log.error("Error during error status handling.", ex);
}
return createPostExecutionControl(updatedCustomResource, updateControl);
}
}

private boolean isLastAttemptOfRetryAndErrorStatusHandlerPresent(Context context) {
if (context.getRetryInfo().isPresent()) {
return context.getRetryInfo().get().isLastAttempt()
&& controller.getReconciler() instanceof ErrorStatusHandler;
} else {
return false;
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -172,7 +172,7 @@ public Optional<T> getCustomResource(CustomResourceID resourceID) {
if (resource == null) {
return Optional.empty();
} else {
return Optional.of((T) (cloner.clone(resource)));
return Optional.of(cloner.clone(resource));
}
}

Expand Down
Loading