Executes incremental reduce in the search thread pool #58461
This change forks the execution of partial reduces on the coordinating node to the search thread pool. It also ensures that partial reduces are executed sequentially and asynchronously, both to limit the memory and CPU that a single search request can use and to avoid blocking a network thread.
If a partial reduce fails with an exception, the search request is cancelled and the error is not reported until the start of the fetch phase (when the final reduce is performed). This ensures that we clean up the in-flight search requests before returning an error to the user.
Closes elastic#53411
Relates elastic#51857
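For readers skimming the thread, the behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SequentialPartialReducer`, its fields, and the `onFailure` callback are hypothetical names, the `Executor` stands in for the search thread pool, and `finalReduce` blocks for simplicity where the real coordinating-node code would stay asynchronous.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Executor;
import java.util.function.BinaryOperator;

/** Hypothetical sketch; not the actual QueryPhaseResultConsumer. */
final class SequentialPartialReducer<T> {
    private final Executor executor;         // stands in for the search thread pool
    private final BinaryOperator<T> reducer; // merges the running result with one batch
    private final Runnable onFailure;        // e.g. cancel the search request
    private final Queue<T> pending = new ArrayDeque<>();
    private boolean running = false;         // true while a reduce task is scheduled or executing
    private T current;                       // result of the partial reduces so far
    private RuntimeException failure;        // first failure, reported only at final reduce

    SequentialPartialReducer(Executor executor, T initial,
                             BinaryOperator<T> reducer, Runnable onFailure) {
        this.executor = executor;
        this.current = initial;
        this.reducer = reducer;
        this.onFailure = onFailure;
    }

    /** Called by the thread that receives a batch; the reduce itself is handed to the executor. */
    synchronized void consume(T batch) {
        if (failure != null) {
            return; // request already failed: drop further batches
        }
        pending.add(batch);
        if (!running) {
            running = true;
            executor.execute(this::runNext);
        }
    }

    /** Reduces exactly one batch, then reschedules itself, so reduces never overlap. */
    private void runNext() {
        T batch;
        T acc;
        synchronized (this) {
            batch = pending.poll();
            if (batch == null || failure != null) {
                pending.clear();
                running = false;
                notifyAll(); // wake up a waiting finalReduce()
                return;
            }
            acc = current;
        }
        try {
            T merged = reducer.apply(acc, batch);
            synchronized (this) {
                current = merged;
            }
        } catch (RuntimeException e) {
            synchronized (this) {
                failure = e;
                pending.clear();
            }
            onFailure.run(); // cancel now, report the error later
        }
        executor.execute(this::runNext);
    }

    /** Waits for in-flight reduces, then surfaces a deferred failure or returns the result. */
    synchronized T finalReduce() {
        while (running) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for partial reduces", e);
            }
        }
        if (failure != null) {
            throw failure;
        }
        return current;
    }
}
```

The point of the single `running` flag is that at most one reduce task per request is ever queued or executing, which is what bounds the per-request memory and CPU; the failure is stored rather than thrown so that cancellation can happen before the caller sees the error.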
Pinging @elastic/es-search (:Search/Search)
I left a couple of minor comments/questions. I think I will have to look at it another 3/4 times before I can properly review it :)
server/src/main/java/org/elasticsearch/action/search/QueryPhaseResultConsumer.java
server/src/main/java/org/elasticsearch/action/search/TransportSearchAction.java
server/src/main/java/org/elasticsearch/search/fetch/FetchPhase.java
I left a few things, mostly small. I think this is the sort of thing that we need, but I also feel like I don't know JVM threading stuff well enough to be super clear. I feel like we're being fairly paranoid here with the threading stuff. Which is probably fine because this won't trigger on small searches. But I don't know enough to be sure that my instinct is right about it either.
server/src/main/java/org/elasticsearch/action/search/QueryPhaseResultConsumer.java
I had another look this morning and I think I mostly understand it. I left two small things.
server/src/main/java/org/elasticsearch/action/search/QueryPhaseResultConsumer.java