Throttling support for reindex #17039
This throttling support doesn't include throttling a running reindex request. That seems like a wonderful feature but it doesn't need to live in this PR.
@@ -961,6 +961,14 @@ public XContentBuilder dateValueField(XContentBuilderString rawFieldName, XConte
        return this;
    }

    public XContentBuilder timeValueField(String rawFieldName, String readableFieldName, TimeValue timeValue) throws IOException {
There's no need to duplicate the functionality here; this can call:
timeValueField(rawFieldName, readableFieldName, timeValue.millis(), TimeUnit.MILLISECONDS);
(or reverse them, so that a new TimeValue is only constructed if a readable rawTime is used)
Makes sense to me.
I think you had it in the right order the first time.
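For readers following this thread, the delegation being discussed can be sketched roughly as below. This is an illustration only: `TimeFieldSketch`, its `fields` map, and the `humanReadable` flag are hypothetical stand-ins, and `java.time.Duration` is used in place of Elasticsearch's `TimeValue` so that the example compiles on its own. It is not the code from this PR.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a builder; not the real XContentBuilder. */
final class TimeFieldSketch {
    final Map<String, Object> fields = new LinkedHashMap<>();
    boolean humanReadable = false;

    /** Primitive overload: the only place that actually writes fields. */
    TimeFieldSketch timeValueField(String rawFieldName, String readableFieldName,
                                   long rawTime, TimeUnit unit) {
        long millis = unit.toMillis(rawTime);
        if (humanReadable) {
            // The readable form is built only when it is needed.
            fields.put(readableFieldName, Duration.ofMillis(millis).toString());
        }
        fields.put(rawFieldName, millis);
        return this;
    }

    /** Convenience overload: delegates instead of duplicating the logic. */
    TimeFieldSketch timeValueField(String rawFieldName, String readableFieldName,
                                   Duration timeValue) {
        return timeValueField(rawFieldName, readableFieldName,
                timeValue.toMillis(), TimeUnit.MILLISECONDS);
    }
}
```

Delegating in this direction keeps a single code path for writing the fields; the reverse direction mentioned above would instead make the primitive overload build the time object.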
@dakrone I think this is ready for another round of review. Thanks for the review so far!
throttle the number of requests per second that the reindex issues. The
throttling is done between bulk batches so that it can manipulate the scroll
timeout. At this point it looks like it slightly over-throttles, producing
slightly lower requests per second than you asked for.
I think we should avoid documentation with this sort of voice:
"At this point it looks like it slightly over-throttles, producing slightly lower requests per second than you asked for."
Perhaps instead:
"Ensures that requests never exceed the requests_per_second setting at the cost of potentially over-throttling."
LGTM, left another minor comment. Another thing I'd love to see is an explanation in the docs about exactly how the throttling is implemented so that people monitoring their clusters can understand what the expected behavior is. For example, discussing how the initial scroll is executed, then processing documents is delayed depending on I think it'd be beneficial to have in the documentation, but it's up to you if you don't think it should be there.
@dakrone I've written the throttling docs a bit. Better?
Sure, I still don't care for using terms like "bursty" because they are harder to understand from a non-native English speaker's perspective (as are all colloquialisms).
Since we're done talking about the code I'm going to squash and rebase with master.
b6a754a to 9e33d64
Rebase wasn't automatic but was mostly clean. Mostly just a matter of adding both sets of changes. In one case I had to add a timeout parameter to a new test to make it all compile. Small changes.
Rebasing discovered some issues with headers and context. I'll add another test and fix.
This is now blocked on #17077
9e33d64 to 0c78e38
And that is now merged. So we can have this now.
LGTM
The throttle is applied when starting the next scroll request so that its timeout can include the throttle time.
55cfb8d to da96b6e
The throttling is done between batches so that we can add the wait time to the scroll timeout.
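The idea in these commit messages can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation in this PR: the class and method names are hypothetical, each document in a batch is assumed to count as one request, and the wait is rounded up so that the effective rate stays at or below the requested one (which is one way the over-throttling mentioned in the review could arise).

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of throttling between batches; not the PR's classes. */
final class ThrottleSketch {

    /**
     * Time to wait before asking for the next batch, so that the batch just
     * processed averages out to at most requestsPerSecond. Rounds up, which
     * errs on the side of waiting slightly too long.
     */
    static long throttleWaitNanos(int lastBatchSize, float requestsPerSecond) {
        if (!(requestsPerSecond > 0)) {
            throw new IllegalArgumentException("requestsPerSecond must be positive");
        }
        double seconds = lastBatchSize / (double) requestsPerSecond;
        return (long) Math.ceil(seconds * TimeUnit.SECONDS.toNanos(1));
    }

    /**
     * The scroll context must stay alive through the wait, so the wait is
     * added to the keep-alive sent with the next scroll request.
     */
    static long scrollKeepAliveNanos(long baseKeepAliveNanos, long throttleWaitNanos) {
        return baseKeepAliveNanos + throttleWaitNanos;
    }

    public static void main(String[] args) {
        long wait = throttleWaitNanos(1000, 500f);
        long keepAlive = scrollKeepAliveNanos(TimeUnit.MINUTES.toNanos(5), wait);
        System.out.println("wait ms: " + TimeUnit.NANOSECONDS.toMillis(wait));
        System.out.println("scroll keep-alive s: " + TimeUnit.NANOSECONDS.toSeconds(keepAlive));
    }
}
```

In this sketch a batch of 1000 at 500 requests per second gives a 2 second wait, and a 5 minute base keep-alive becomes 302 seconds. Waiting between batches, and not inside one, is what lets the wait be folded into the scroll timeout.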