-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Simplify ClusterStateUpdateTask Timeout Handling #64117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify ClusterStateUpdateTask Timeout Handling #64117
Conversation
It's confusing and slightly error prone (see elastic#64116) to handle the timeouts via overrides but the priority via a field. This simplifies the code to to avoid future issues and save over 100 LOC.
Pinging @elastic/es-distributed (:Distributed/Cluster Coordination) |
…-update-timeout-handling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @original-brownbear
@@ -105,7 +105,8 @@ public void onTimeout(TimeValue timeout) { | |||
|
|||
private void submitClearVotingConfigExclusionsTask(ClearVotingConfigExclusionsRequest request, long startTimeMillis, | |||
ActionListener<ClearVotingConfigExclusionsResponse> listener) { | |||
clusterService.submitStateUpdateTask("clear-voting-config-exclusions", new ClusterStateUpdateTask(Priority.URGENT) { | |||
clusterService.submitStateUpdateTask("clear-voting-config-exclusions", new ClusterStateUpdateTask(Priority.URGENT, | |||
TimeValue.timeValueMillis(request.getTimeout().millis() + startTimeMillis - threadPool.relativeTimeInMillis())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unsafe in that it could be negative and cause an exception. Your change reduces the risk of that, but I wonder if we should fix it in this same PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed 5897ae9, that should give us a consistent (as in same as when the update would timeout on the queue) exception here I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small comment only.
@@ -105,7 +106,13 @@ public void onTimeout(TimeValue timeout) { | |||
|
|||
private void submitClearVotingConfigExclusionsTask(ClearVotingConfigExclusionsRequest request, long startTimeMillis, | |||
ActionListener<ClearVotingConfigExclusionsResponse> listener) { | |||
clusterService.submitStateUpdateTask("clear-voting-config-exclusions", new ClusterStateUpdateTask(Priority.URGENT) { | |||
final long timeout = request.getTimeout().millis() + startTimeMillis - threadPool.relativeTimeInMillis(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should just do Math.max(0, timeout) instead, just like in cluster health action? Seems like the right place to handle timeout=0 is inside submitStateUpdateTask
, rather than special handle it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++ I pushed 91e5fb3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. One last ask: can you update the issue description to mention the bug that this fixes?
Thanks Henning, I updated the description :) |
It's confusing and slightly error prone (see elastic#64116) to handle the timeouts via overrides but the priority via a field. This simplifies the code to to avoid future issues and save over 100 LOC. Also this fixes a bug in `TransportVotingConfigExclusionsAction` where trying to instantiate a time value with a negative time could throw and unexpected exception and as a result leak a listener.
It's confusing and slightly error prone (see #64116) to handle the timeouts via overrides but the priority via a field. This simplifies the code to to avoid future issues and save over 100 LOC. Also this fixes a bug in `TransportVotingConfigExclusionsAction` where trying to instantiate a time value with a negative time could throw and unexpected exception and as a result leak a listener.
It's confusing and slightly error prone (see #64116) to handle the timeouts
via overrides but the priority via a field. This simplifies the code to to avoid future
issues and save over 100 LOC.
Also this fixes a bug in
TransportVotingConfigExclusionsAction
where trying to instantiate a time value with a negative time could throw and unexpected exception and as a result leak a listener.