Avoid unnecessary partition seeks on successful record recovery #2195
Hi @v-chernyshev, thanks for bringing this up. I'm not sure why this logic is as is - @garyrussell sure can elucidate that. But it seems to me that if we don't do seeks in case the record is successfully recovered, …
It may indeed be possible with some API tweaks. The main obstacle, in my opinion, is this block of code.
Yeah, you are absolutely right. What I meant when I said that this exception is special is that targeted single-record recovery won't ever be possible for it.
It's historical (error handlers pre-dated the retryable topic code by several years). However, I can see that this might be helpful, even for normal (blocking) retries, after a successful recovery; but it is really exacerbated by retryable topics because recovery is generally successful (at least for the first/main topic). That said, one of the reasons for doing the seeks is to avoid a rebalance; if we do the retries without polling the consumer and the aggregate of the retry delays exceeds `max.poll.interval.ms`, a rebalance will occur anyway. I can see (at least) a couple of solutions:
spring-kafka/spring-kafka/src/main/java/org/springframework/kafka/listener/ErrorHandlingUtils.java Lines 66 to 104 in 31b78c4
We could probably combine 1 & 2 in the listener container. I'll give it some thought.
Just a couple of thoughts:
Do we still have any retry mechanisms that rely on backing off and retrying several times in memory? If error handling doesn't include a wait (other than when we wait to seek to current again), maybe the rebalancing problem might not be that much of an issue anymore.
Why do we need to drain the iterator early? Maybe we could just have the failed record and pass the iterator itself to the error handler, which we'd only iterate if we needed to. Of course, that would have API implications, so maybe it's not worth it. Thanks
Yes, I didn't mean to imply that this would be a problem for non-blocking retries, but it would be a problem for blocking retries; if we change the logic to never do seeks and the aggregate backoff is, say, 10 seconds per record and we have 500 records, we'd not call `poll()` for a very long time. I suppose we could change the logic to do the seeks whenever we are retrying and only continue from where we left off when recovery is successful; that would be ok (but that really wouldn't satisfy @v-chernyshev's complaint that the seeks aren't really necessary). I think my final comment will solve it because we can pause/poll/resume the consumer during the retries as well.
Yes, that's one possibility. Either way, these are non-trivial changes.
Sure, that's what I had in mind, but I see your point.
I see now that for blocking retries that fail this would probably be much better. Thanks.
I have had some fresh thoughts on this. Stay tuned for a PoC early next week.
Looking forward to it @garyrussell!
Resolves spring-projects#2195

Add an option to avoid seeks after handling exceptions. Instead, pause the consumer for one `poll()` and use the remaining records as the result of that poll.

New methods on `CommonErrorHandler`: `handleOne` for record listeners, returning a boolean to indicate whether the record was recovered and should not be redelivered; `handleBatchAndReturnRemaining` for batch listeners, returning either the complete set or a subset, e.g. when the `DEH` receives a `BatchListenerExecutionFailedException` and commits a partial batch.

Also includes the classifier refactoring discussed here: spring-projects#2185 (comment)

The new logic is disabled by default; we can consider enabling it in 3.0 and removing the deprecations.
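For later readers, opting in to the no-seek behaviour described in this commit message looks roughly like the sketch below. This assumes the 2.9 API; the recoverer and back-off values are purely illustrative, and `kafkaTemplate` / `containerFactory` stand in for beans you would already have:

```java
// Sketch: opt in to the new no-seek behaviour (2.9 API as described above).
// With seekAfterError=false the container pauses the consumer for one poll()
// after an error and then continues with the already-fetched records.
DefaultErrorHandler errorHandler = new DefaultErrorHandler(
        new DeadLetterPublishingRecoverer(kafkaTemplate), // kafkaTemplate: an existing KafkaTemplate bean
        new FixedBackOff(1000L, 2L));                     // illustrative back-off
errorHandler.setSeekAfterError(false);
containerFactory.setCommonErrorHandler(errorHandler);     // containerFactory: an existing factory bean
```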
I issued a PR for review; please take a look if you get a chance.
Resolves #2195

Add an option to avoid seeks after handling exceptions. Instead, pause the consumer for one `poll()` and use the remaining records as the result of that poll.

New methods on `CommonErrorHandler`: `handleOne` for record listeners, returning a boolean to indicate whether the record was recovered and should not be redelivered; `handleBatchAndReturnRemaining` for batch listeners, returning either the complete set or a subset, e.g. when the `DEH` receives a `BatchListenerExecutionFailedException` and commits a partial batch.

Also includes the classifier refactoring discussed here: #2185 (comment)

The new logic is disabled by default; we can consider enabling it in 3.0 and removing the deprecations.

* Fix race - do not call `resume()` on the container; the user might have paused after the error.
* Change Since to 2.9.
* Fix typos.

Co-authored-by: Artem Bilan <[email protected]>

* Remove unnecessary local variable; add docs.
* Polishing - see commit comment for more details
  - move the resume logic to after the invokes and don't resume if pending records
  - don't check `isPaused()` after empty poll due to errors; always restore the pending records
* Remove unnecessary boolean; fix deprecation warnings and delegating error handlers.
* Emergency stop container if the consumer returns records while paused after an error.
* Fix race in test - prevent consumer thread from changing pausedConsumers while the test thread is calling revoke/assign.
* Remove System.out().
* Add diagnostics to test.
* Fix race in test; wait until next poll after consumer thread pauses the partitions.
* Fix stubbing in emergency stop test.
* Remove unnecessary boolean.

**Cherry-pick to `2.9.x`**
- Revert to seeking when `CommitFailedException` - due to a rebalance
- For batch listeners, move committing to the try block
- For record listeners, `AckMode.RECORD` already commits in the try block
- With `AckMode.BATCH`, detect a no-seek retry and call the error handler `handleRemaining`

**cherry-pick to 2.9.x**
Related to: #2195

- Revert to seeking when `CommitFailedException` - due to a rebalance
- For batch listeners, move committing to the try block
- For record listeners, `AckMode.RECORD` already commits in the try block
- With `AckMode.BATCH`, detect a no-seek retry and call the error handler `handleRemaining`

**cherry-pick to 2.9.x**

* Fix test class names.
Related to spring-projects#2195

Do not process the remaining records if the first record is from a paused partition.
Related to #2195

- check `isPartitionPauseRequested()` instead of `isPartitionPaused()`
- add diagnostic debug logging
- prevent re-pausing the entire consumer each time a poll exits but the first partition is still paused.

**cherry-pick to 2.9.x**

* Checkstyle fix.
Related to #2195

Similar to the recent fix when the first partition is paused, we should not continue to process the remaining records if the container itself was paused during the poll after an error occurred with `seekAfterError=false`.

**cherry-pick to 2.9.x**

* Remove unneeded copy/paste code from test.
* Fix creation of remaining records. Previously, they were all inserted into the failed record's `TopicPartition` record list.
Use a `LinkedHashMap` for the remaining records so that the order is retained. Fix test after removing some logic in the last commit.
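As a side note, the ordering guarantee this commit relies on is a plain `LinkedHashMap` property. A self-contained illustration (the string keys merely stand in for `TopicPartition`; this is not spring-kafka code):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Why a LinkedHashMap: remaining records are grouped per partition, and the
// grouping must iterate in the order the partitions were first seen, which a
// plain HashMap does not guarantee.
public class RemainingRecordOrder {

    static List<String> partitionOrder(Map<String, List<Long>> grouped) {
        return new ArrayList<>(grouped.keySet()); // LinkedHashMap: insertion order
    }

    public static void main(String[] args) {
        Map<String, List<Long>> remaining = new LinkedHashMap<>();
        remaining.computeIfAbsent("topic-1", k -> new ArrayList<>()).add(10L); // failed record's partition first
        remaining.computeIfAbsent("topic-0", k -> new ArrayList<>()).add(7L);
        remaining.computeIfAbsent("topic-1", k -> new ArrayList<>()).add(11L); // appended to the existing group
        System.out.println(partitionOrder(remaining)); // [topic-1, topic-0]
    }
}
```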
Don't reference deprecated methods on `CommonErrorHandler`.
I've reordered the sections for this request because, in my opinion, it is easier to understand the desired behaviour when the limitations of the current approach are explained first.
Current Behavior
Let's consider a situation where `max.poll.records` (500 by default) records are available for consumption in a non-transactional record listener container, but the very first record cannot be processed immediately due to e.g. an issue with an external data provider. Here is what happens in the code in case of, at the very least, non-blocking retries:

1. `KafkaMessageListenerContainer.ListenerConsumer.doInvokeRecordListener` is eventually invoked with `record` pointing to the first record and `iterator` holding the remaining 499.
2. `this.invokeOnMessage` fails with an exception.
3. `this.invokeErrorHandler` is called.
4. `this.commonErrorHandler.remainingRecords()` returns `true`. Both the failed record and all the remaining records from the iterator are drained into a temporary array.
5. `this.commonErrorHandler.handleRemaining` leads to `DefaultErrorHandler.handleRemaining` that performs `SeekUtils.seekOrRecover`.
6. Even though the recovery succeeds, all the partitions are rewound to the drained records' offsets via `seekPartitions`.

Unfortunately, this unconditional rewind operation leads to enormous spikes in network I/O as pretty much the same records are requested from Kafka over and over again. To be precise, with a local Kafka broker and a test application that always triggers a recovery I could see spikes exceeding 2000 Mbps in `iptraf`.

Expected Behavior
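To make the cost of the rewind flow described above concrete, here is a toy model (not the spring-kafka API; all names are invented): the container polls up to `max.poll.records`, the first record fails once and is successfully recovered, and the container either seeks back (discarding and re-fetching the drained remainder) or simply continues with the iterator:

```java
// Toy model of the seek-after-recovery cost; counts records transferred
// from the broker over the whole run.
public class SeekCostModel {

    static int recordsFetched(int totalRecords, int maxPollRecords, boolean seekAfterRecovery) {
        int offset = 0;                // next offset the consumer will fetch
        int fetched = 0;               // records transferred over the network
        boolean pendingFailure = true; // the very first record fails once
        while (offset < totalRecords) {
            int batch = Math.min(maxPollRecords, totalRecords - offset);
            fetched += batch;          // network transfer for this poll
            if (pendingFailure) {
                pendingFailure = false;    // recovery succeeds (e.g. retry topic / DLT)
                if (seekAfterRecovery) {
                    offset += 1;           // only the recovered record advances
                    continue;              // the drained remainder is re-fetched next poll
                }
            }
            offset += batch;               // no seek: keep processing the iterator
        }
        return fetched;
    }

    public static void main(String[] args) {
        System.out.println(recordsFetched(500, 500, true));  // seeks: 999 records fetched
        System.out.println(recordsFetched(500, 500, false)); // no seeks: 500
    }
}
```

A single recovered failure nearly doubles the transfer; with a failure in every batch, the waste compounds into the sustained spikes described above.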
I believe that it is not necessary to always rewind the remaining 499 records if the recovery is successful. This is what may happen with the failed record when non-blocking retries are enabled:

- If the exception is retryable, the record is forwarded to the next retry topic.
- If the exception is fatal (e.g. `ClassCastException`) then it is sent straight into the DLT topic.

Both these cases allow the next `iterator` record to be processed immediately as if nothing happened. This very same record will be at the front of the batch during the next `poll` invocation anyway! The only special case I can think of is `KafkaBackoffException`, which does absolutely require rewinding all the offsets so that the same batch is consumed again when the affected partitions are resumed.

Context
The production rollout of our service that uses the non-blocking retries feature provided by Spring Kafka triggered a number of network I/O alarms, so we decided to find the root cause. This feature request is the result of the investigation :)
Implementing a workaround for this issue is possible but, admittedly, quite tricky as `invokeErrorHandler` is a private API method. These are the steps that may be taken:

- Subclass the default `CommonErrorHandler` implementation so that it:
  - allows `remainingRecords` to return either `true` or `false`;
  - overrides `handleRecord` in a way that delegates to `handleRemaining` of the default error handler with a singleton list as the second argument.
- Add a `RecordInterceptor` with an overridden `failure` method that uses the above API to make `remainingRecords` return `true` only if `SeekUtils.isBackoffException` is `true`.
- `ListenerContainerFactoryConfigurer.setContainerCustomizer` may then be used to tie things together, e.g.: