Commit 7725ff4

docs: retry doc fixes (#2744)
1 parent 6f04b5a commit 7725ff4

1 file changed

+40
-40
lines changed

docs/content/en/docs/documentation/error-handling-retries.md

@@ -6,7 +6,7 @@ weight: 46
66
## Automatic Retries on Error
77

88
JOSDK will schedule an automatic retry of the reconciliation whenever an exception is thrown by
9-
your `Reconciler`. The retry is behavior is configurable but a default implementation is provided
9+
your `Reconciler`. The retry behavior is configurable, but a default implementation is provided
1010
covering most of the typical use-cases, see
1111
[GenericRetry](https://github.com/java-operator-sdk/java-operator-sdk/blob/master/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/processing/retry/GenericRetry.java)
1212
.
@@ -22,7 +22,7 @@ You can also configure the default retry behavior using the `@GradualRetry` anno
2222

2323
It is possible to provide a custom implementation using the `retry` field of the
2424
`@ControllerConfiguration` annotation and specifying the class of your custom implementation.
25-
Note that this class will need to provide an accessible no-arg constructor for automated
25+
Note that this class must provide an accessible no-arg constructor for automated
2626
instantiation. Additionally, your implementation can be automatically configured from an
2727
annotation that you can provide by having your `Retry` implementation implement the
2828
`AnnotationConfigurable` interface, parameterized with your annotation type. See the
@@ -32,41 +32,44 @@ Information about the current retry state is accessible from
3232
the [Context](https://github.com/java-operator-sdk/java-operator-sdk/blob/master/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/Context.java)
3333
object. Of note, particularly interesting is the `isLastAttempt` method, which could allow your
3434
`Reconciler` to implement a different behavior based on this status, by setting an error message
35-
in your resource' status, for example, when attempting a last retry.
35+
in your resource status, for example, when attempting a last retry.
3636

3737
Note, though, that reaching the retry limit won't prevent new events to be processed. New
3838
reconciliations will happen for new events as usual. However, if an error also occurs that
39-
would normally trigger a retry, the SDK won't schedule one at this point since the retry limit
40-
is already reached.
39+
would trigger a retry, the SDK won't schedule one at this point since the retry limit
40+
has already been reached.
4141

4242
A successful execution resets the retry state.
4343

44-
### Setting Error Status After Last Retry Attempt
44+
### Reconciler Error Handler
4545

46-
In order to facilitate error reporting, `Reconciler` can implement the
47-
[ErrorStatusHandler](https://github.com/java-operator-sdk/java-operator-sdk/blob/main/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/reconciler/ErrorStatusHandler.java)
48-
interface:
46+
In order to facilitate error reporting you can override [`updateErrorStatus`](https://github.com/operator-framework/java-operator-sdk/blob/main/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/reconciler/Reconciler.java#L52)
47+
method in `Reconciler`:
4948

5049
```java
51-
public interface ErrorStatusHandler<P extends HasMetadata> {
50+
public class MyReconciler implements Reconciler<WebPage> {
5251

53-
ErrorStatusUpdateControl<P> updateErrorStatus(P resource, Context<P> context, Exception e);
52+
@Override
53+
public ErrorStatusUpdateControl<WebPage> updateErrorStatus(
54+
WebPage resource, Context<WebPage> context, Exception e) {
55+
return handleError(resource, e);
56+
}
5457

5558
}
5659
```
5760

5861
The `updateErrorStatus` method is called in case an exception is thrown from the `Reconciler`. It is
5962
also called even if no retry policy is configured, just after the reconciler execution.
6063
`RetryInfo.getAttemptCount()` is zero after the first reconciliation attempt, since it is not a
61-
result of a retry (regardless of whether a retry policy is configured or not).
64+
result of a retry (regardless of whether a retry policy is configured).
6265

63-
`ErrorStatusUpdateControl` is used to tell the SDK what to do and how to perform the status
64-
update on the primary resource, always performed as a status sub-resource request. Note that
65-
this update request will also produce an event, and will result in a reconciliation if the
66-
controller is not generation aware.
66+
`ErrorStatusUpdateControl` tells the SDK what to do and how to perform the status
67+
update on the primary resource, which is always performed as a status sub-resource request. Note that
68+
this update request will also produce an event and result in a reconciliation if the
69+
controller is not generation-aware.
6770

6871
This feature is only available for the `reconcile` method of the `Reconciler` interface, since
69-
there should not be updates to resource that have been marked for deletion.
72+
there should not be updates to resources that have been marked for deletion.
7073

7174
Retry can be skipped in cases of unrecoverable errors:
7275

@@ -76,40 +79,37 @@ Retry can be skipped in cases of unrecoverable errors:
7679

7780
### Correctness and Automatic Retries
7881

79-
While it is possible to deactivate automatic retries, this is not desirable, unless for very
80-
specific reasons. Errors naturally occur, whether it be transient network errors or conflicts
81-
when a given resource is handled by a `Reconciler` but is modified at the same time by a user in
82-
a different process. Automatic retries handle these cases nicely and will usually result in a
82+
While it is possible to deactivate automatic retries, this is not desirable unless there is a particular reason.
83+
Errors naturally occur, whether it be transient network errors or conflicts
84+
when a given resource is handled by a `Reconciler` but modified simultaneously by a user in
85+
a different process. Automatic retries handle these cases nicely and will eventually result in a
8386
successful reconciliation.
8487

85-
## Retry and Rescheduling and Event Handling Common Behavior
88+
## Retry, Rescheduling and Event Handling Common Behavior
8689

87-
Retry, reschedule and standard event processing form a relatively complex system, each of these
90+
Retry, reschedule, and standard event processing form a relatively complex system, each of these
8891
functionalities interacting with the others. In the following, we describe the interplay of
8992
these features:
9093

91-
1. A successful execution resets a retry and the rescheduled executions which were present before
92-
the reconciliation. However, a new rescheduling can be instructed from the reconciliation
93-
outcome (`UpdateControl` or `DeleteControl`).
94+
1. A successful execution resets a retry and the rescheduled executions that were present before
95+
the reconciliation. However, the reconciliation outcome can instruct a new rescheduling (`UpdateControl` or `DeleteControl`).
9496

95-
For example, if a reconciliation had previously been re-scheduled after some amount of time, but an event triggered
96-
the reconciliation (or cleanup) in the mean time, the scheduled execution would be automatically cancelled, i.e.
97-
re-scheduling a reconciliation does not guarantee that one will occur exactly at that time, it simply guarantees that
98-
one reconciliation will occur at that time at the latest, triggering one if no event from the cluster triggered one.
99-
Of course, it's always possible to re-schedule a new reconciliation at the end of that "automatic" reconciliation.
97+
For example, if a reconciliation had previously been rescheduled for after some amount of time, but an event triggered
98+
the reconciliation (or cleanup) in the meantime, the scheduled execution would be automatically cancelled, i.e.
99+
rescheduling a reconciliation does not guarantee that one will occur precisely at that time; it simply guarantees that a reconciliation will occur at the latest.
100+
Of course, it's always possible to reschedule a new reconciliation at the end of that "automatic" reconciliation.
100101

101-
Similarly, if a retry was scheduled, any event from the cluster triggering a successful execution in the mean time
102+
Similarly, if a retry was scheduled, any event from the cluster triggering a successful execution in the meantime
102103
would cancel the scheduled retry (because there's now no point in retrying something that already succeeded)
103104

104-
2. In case an exception happened, a retry is initiated. However, if an event is received
105+
2. In case an exception is thrown, a retry is initiated. However, if an event is received
105106
meanwhile, it will be reconciled instantly, and this execution won't count as a retry attempt.
106107
3. If the retry limit is reached (so no more automatic retry would happen), but a new event
107108
received, the reconciliation will still happen, but won't reset the retry, and will still be
108-
marked as the last attempt in the retry info. The point (1) still holds, but in case of an
109-
error, no retry will happen.
110-
111-
The thing to keep in mind when it comes to retrying or rescheduling is that JOSDK tries to avoid unnecessary work. When
112-
you reschedule an operation, you instruct JOSDK to perform that operation at the latest by the end of the rescheduling
113-
delay. If something occurred on the cluster that triggers that particular operation (reconciliation or cleanup), then
109+
marked as the last attempt in the retry info. The point (1) still holds - thus successful reconciliation will reset the retry - but no retry will happen in case of an error.
110+
111+
The thing to remember when it comes to retrying or rescheduling is that JOSDK tries to avoid unnecessary work. When
112+
you reschedule an operation, you instruct JOSDK to perform that operation by the end of the rescheduling
113+
delay at the latest. If something occurred on the cluster that triggers that particular operation (reconciliation or cleanup), then
114114
JOSDK considers that there's no point in attempting that operation again at the end of the specified delay since there
115-
is now no point to do so anymore. The same idea also applies to retries.
115+
is no point in doing so anymore. The same idea also applies to retries.
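
The interplay of retries and events described in points 1 to 3 of the changed documentation can be sketched as a small state model. This is a toy illustration only, not JOSDK code: the class `RetryStateSketch`, its methods, and the `maxAttempts` limit are hypothetical names introduced here, and the model assumes (as a simplification) that a failed event-triggered run simply re-arms the pending retry while the limit has not been reached.

```java
/** Toy model of the retry bookkeeping described in the docs; hypothetical, not the JOSDK API. */
final class RetryStateSketch {
    private final int maxAttempts;          // hypothetical retry limit
    private int attemptCount = 0;           // retries consumed so far (0 after the first run)
    private boolean retryScheduled = false; // whether an automatic retry is pending

    RetryStateSketch(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    /**
     * Models one reconciliation.
     * @param triggeredByRetry true if the run is an automatic retry, false if an event triggered it
     * @param succeeds whether the reconciliation completes without an exception
     */
    void execute(boolean triggeredByRetry, boolean succeeds) {
        if (triggeredByRetry) {
            attemptCount++;          // only automatic retries count as attempts (point 2)
        }
        retryScheduled = false;      // the current run supersedes any pending retry
        if (succeeds) {
            attemptCount = 0;        // a successful execution resets the retry state (point 1)
            return;
        }
        if (attemptCount < maxAttempts) {
            retryScheduled = true;   // below the limit: schedule another retry
        }
        // at the limit: no further retry is scheduled (point 3)
    }

    int attemptCount() { return attemptCount; }

    boolean isRetryScheduled() { return retryScheduled; }

    /** Mirrors the idea of isLastAttempt: true once the retry limit has been used up. */
    boolean isLastAttempt() { return attemptCount >= maxAttempts; }

    public static void main(String[] args) {
        RetryStateSketch state = new RetryStateSketch(2);
        state.execute(false, false); // first reconciliation fails
        state.execute(true, false);  // first automatic retry fails
        state.execute(false, false); // an event arrives; this run is not counted as a retry
        System.out.println("attempts=" + state.attemptCount()
            + " retryScheduled=" + state.isRetryScheduled()
            + " lastAttempt=" + state.isLastAttempt());
    }
}
```

In this sketch an event-triggered failure after the limit leaves `isLastAttempt()` true and schedules nothing, while any successful run clears the counter, which is the behavior the reworded point 3 spells out.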
