-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Combine blocking and non-blocking error handling #2116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Thanks for the suggestion @mariecuriiee, that's a good article. I'm currently working on a couple of issues that should pave the way for this feature to work, by leveraging the existing DefaultErrorHandler we already set for the RetryableTopics. So hopefully we'll have this implemented soon. |
Before we hardcoded a no-ops back off in the DefaultErrorHandler used in the Retryable Topics feature. Adds a setter to let the user provide their own back off policy and configure blocking retries in conjunction with RT.
Hi @mariecuriiee, I've added a PR with this feature, maybe you want to take a look: LMWYT, thanks |
Before we hardcoded a no-ops back off in the DefaultErrorHandler used in the Retryable Topics feature. Adds a setter to let the user provide their own back off policy and configure blocking retries in conjunction with RT.
Hey @tomazfernandes, thanks for implementing this so quickly! If I understand the implementation correctly, I only have to set the backoff using |
Hi @mariecuriiee, thanks for looking into the solution. We’re currently reviewing it to improve the classification for blocking retries and make it allowlistable instead of the current denylist only behavior. Either way, you’d have to configure the classification for each type of retry, i.e. define which exceptions are fatal for the blocking retries, and which are fatal for the non-blocking retries. So if you want a non-default fatal exception to go straight to the DLT, you’d have to add it to both classifications. With this you can have any behavior you want - both retries, only one or another, or no retries. We’re also in the process of reviewing the configuration API for the retryable topics feature as a whole, so when that comes out we’ll probably have a nicer API for configuring this. As a side note, we’ve just introduced a way of setting global fatal exceptions for the non-blocking retries, maybe that’s something you’ll want to check out too. |
@tomazfernandes that makes sense, thanks for clarification! But as soon as it is released, I will use it in my project 😄 I also had a quick look at the global fatal exceptions - will test that too! |
Before we hardcoded a no-ops back off in the DefaultErrorHandler used in the Retryable Topics feature. Adds a setter to let the user provide their own back off policy and configure blocking retries in conjunction with RT.
* GH-2116: Add blocking retries to RT Before we hardcoded a no-ops back off in the DefaultErrorHandler used in the Retryable Topics feature. Adds a setter to let the user provide their own back off policy and configure blocking retries in conjunction with RT. * Change DHE in LCFC to defaultFalse With this we no longer need a no ops back off. Some minor adjustments were needed to maintain behavior when the logic gets to DLPR. * Change DHE in LCFC to defaultFalse With this we no longer need a no ops back off. Some minor adjustments were needed to maintain behavior when the logic gets to DLPR. * Improve API and docs Now retryable exceptions can be set directly in the lcfc class. Improved the docs on how to combine blocking and non-blocking behaviors. Added what's new entry for this feature. * Improve ExceptionClassifier JavaDoc Also add assertions to the LCFC new methods to warn the user if they already set the blocking configurations.
* GH-2116: Add blocking retries to RT Before we hardcoded a no-ops back off in the DefaultErrorHandler used in the Retryable Topics feature. Adds a setter to let the user provide their own back off policy and configure blocking retries in conjunction with RT. * Change DHE in LCFC to defaultFalse With this we no longer need a no ops back off. Some minor adjustments were needed to maintain behavior when the logic gets to DLPR. * Change DHE in LCFC to defaultFalse With this we no longer need a no ops back off. Some minor adjustments were needed to maintain behavior when the logic gets to DLPR. * Improve API and docs Now retryable exceptions can be set directly in the lcfc class. Improved the docs on how to combine blocking and non-blocking behaviors. Added what's new entry for this feature. * Improve ExceptionClassifier JavaDoc Also add assertions to the LCFC new methods to warn the user if they already set the blocking configurations.
@mariecuriiee, we’ve implemented the solution, don’t know if you had the chance to take a look. It’s on 2.8.4-SNAPSHOT, and referenced in the docs: https://docs.spring.io/spring-kafka/docs/2.8.4-SNAPSHOT/reference/html/#retry-topic-combine-blocking Thanks again for the suggestion, and feel free to provide any feedback. @garyrussell, DYT there’s any more work to be done regarding this issue, or maybe we could close it? Thanks |
Thanks @tomazfernandes ; closing; resolved. |
As before discussed here on Github.
Expected Behavior
A combination of blocking and non-blocking retry should be possible to handle three different types of exceptions:
non-blocking (but not fatal) exceptions should be retried.
non-blocking (but fatal, e.g. DeserializationException) exceptions should be routed to the DLT directly.
blocking exceptions (e.g. a database is currently not available) should not be routed to a retry topic since all following messages would be routed there as well. The consumer should remain at the current offset.
Current Behavior
It is possible to use @RetryableTopic annotation to satisfy the first two bullets. It is also possible to use DefaultErrorHandler to implement the last bullet. But a combination of both approaches cannot be configured at the moment.
Context
@garyrussell mentioned a workaround by stopping the container using the
stopImmediate
container property. Nevertheless, there are a few blog articles that address the different handling of error types, e.g., this article. Therefore, having a proper way to configure this behavior would be a useful possibility.The text was updated successfully, but these errors were encountered: