-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Re-enable parallel collection for field sorted top hits #125916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Re-enable parallel collection for field sorted top hits #125916
Conversation
With elastic#123610 we disabled parallel collection for field and script sorted top hits, aligning its behaviour with that of top level search. This was mainly to work around a bug in script sorting that did not support inter-segment concurrency. The bug with script sort has been fixed with elastic#123757 and concurrency re-enabled for it. While sort by field is not optimized for search concurrency, top hits benefits from it and disabling concurrency for sort by field in top hits has caused performance regressions in our nightly benchmarks. This commit re-enables concurrency for top hits with sort by field is used. This introduces back a discrepancy between top level search and top hits, in that concurrency is applied for top hits despite sort by field normally disables it. The key difference is the context where sorting is applied, and the fact that concurrency is disabled only for performance reasons on top level searches and not for functional reasons.
Pinging @elastic/es-search-foundations (Team:Search Foundations) |
Hi @javanna, I've created a changelog YAML for you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! The code path clearly reverses the functionality. Is it worth having an external configuration, e.g., a YAML file, to control concurrency, or is it a single case and a workaround to investigate a bug?
@drempapis I don't understand your questions, sorry, could you clarify what functionality is reversed and how external config would help controlling the concurrency behavior? I think the key point here is that sort by field does support search concurrency, but it is in the average case not faster than sequential search across the segments, with the exception of top_hits, as highlighted by benchmark results. |
With elastic#123610 we disabled parallel collection for field and script sorted top hits, aligning its behaviour with that of top level search. This was mainly to work around a bug in script sorting that did not support inter-segment concurrency. The bug with script sort has been fixed with elastic#123757 and concurrency re-enabled for it. While sort by field is not optimized for search concurrency, top hits benefits from it and disabling concurrency for sort by field in top hits has caused performance regressions in our nightly benchmarks. This commit re-enables concurrency for top hits with sort by field is used. This introduces back a discrepancy between top level search and top hits, in that concurrency is applied for top hits despite sort by field normally disables it. The key difference is the context where sorting is applied, and the fact that concurrency is disabled only for performance reasons on top level searches and not for functional reasons.
With elastic#123610 we disabled parallel collection for field and script sorted top hits, aligning its behaviour with that of top level search. This was mainly to work around a bug in script sorting that did not support inter-segment concurrency. The bug with script sort has been fixed with elastic#123757 and concurrency re-enabled for it. While sort by field is not optimized for search concurrency, top hits benefits from it and disabling concurrency for sort by field in top hits has caused performance regressions in our nightly benchmarks. This commit re-enables concurrency for top hits with sort by field is used. This introduces back a discrepancy between top level search and top hits, in that concurrency is applied for top hits despite sort by field normally disables it. The key difference is the context where sorting is applied, and the fact that concurrency is disabled only for performance reasons on top level searches and not for functional reasons.
With elastic#123610 we disabled parallel collection for field and script sorted top hits, aligning its behaviour with that of top level search. This was mainly to work around a bug in script sorting that did not support inter-segment concurrency. The bug with script sort has been fixed with elastic#123757 and concurrency re-enabled for it. While sort by field is not optimized for search concurrency, top hits benefits from it and disabling concurrency for sort by field in top hits has caused performance regressions in our nightly benchmarks. This commit re-enables concurrency for top hits with sort by field is used. This introduces back a discrepancy between top level search and top hits, in that concurrency is applied for top hits despite sort by field normally disables it. The key difference is the context where sorting is applied, and the fact that concurrency is disabled only for performance reasons on top level searches and not for functional reasons.
…26012) With #123610 we disabled parallel collection for field and script sorted top hits, aligning its behaviour with that of top level search. This was mainly to work around a bug in script sorting that did not support inter-segment concurrency. The bug with script sort has been fixed with #123757 and concurrency re-enabled for it. While sort by field is not optimized for search concurrency, top hits benefits from it and disabling concurrency for sort by field in top hits has caused performance regressions in our nightly benchmarks. This commit re-enables concurrency for top hits with sort by field is used. This introduces back a discrepancy between top level search and top hits, in that concurrency is applied for top hits despite sort by field normally disables it. The key difference is the context where sorting is applied, and the fact that concurrency is disabled only for performance reasons on top level searches and not for functional reasons.
…26013) With #123610 we disabled parallel collection for field and script sorted top hits, aligning its behaviour with that of top level search. This was mainly to work around a bug in script sorting that did not support inter-segment concurrency. The bug with script sort has been fixed with #123757 and concurrency re-enabled for it. While sort by field is not optimized for search concurrency, top hits benefits from it and disabling concurrency for sort by field in top hits has caused performance regressions in our nightly benchmarks. This commit re-enables concurrency for top hits with sort by field is used. This introduces back a discrepancy between top level search and top hits, in that concurrency is applied for top hits despite sort by field normally disables it. The key difference is the context where sorting is applied, and the fact that concurrency is disabled only for performance reasons on top level searches and not for functional reasons.
…26011) With #123610 we disabled parallel collection for field and script sorted top hits, aligning its behaviour with that of top level search. This was mainly to work around a bug in script sorting that did not support inter-segment concurrency. The bug with script sort has been fixed with #123757 and concurrency re-enabled for it. While sort by field is not optimized for search concurrency, top hits benefits from it and disabling concurrency for sort by field in top hits has caused performance regressions in our nightly benchmarks. This commit re-enables concurrency for top hits with sort by field is used. This introduces back a discrepancy between top level search and top hits, in that concurrency is applied for top hits despite sort by field normally disables it. The key difference is the context where sorting is applied, and the fact that concurrency is disabled only for performance reasons on top level searches and not for functional reasons.
…26014) With #123610 we disabled parallel collection for field and script sorted top hits, aligning its behaviour with that of top level search. This was mainly to work around a bug in script sorting that did not support inter-segment concurrency. The bug with script sort has been fixed with #123757 and concurrency re-enabled for it. While sort by field is not optimized for search concurrency, top hits benefits from it and disabling concurrency for sort by field in top hits has caused performance regressions in our nightly benchmarks. This commit re-enables concurrency for top hits with sort by field is used. This introduces back a discrepancy between top level search and top hits, in that concurrency is applied for top hits despite sort by field normally disables it. The key difference is the context where sorting is applied, and the fact that concurrency is disabled only for performance reasons on top level searches and not for functional reasons.
@javanna, sorry not to be clear, I had in mind the pr #101535, which set the The other question was whether it is worth having an external config to change the boolean value of |
With #123610 we disabled parallel collection for field and script sorted top hits, aligning its behaviour with that of top level search. This was mainly to work around a bug in script sorting that did not support inter-segment concurrency.
The bug with script sort has been fixed with #123757 and concurrency re-enabled for it.
While sort by field is not optimized for search concurrency, top hits benefits from it and disabling concurrency for sort by field in top hits has caused performance regressions in our nightly benchmarks.
This commit re-enables concurrency for top hits with sort by field is used. This introduces back a discrepancy between top level search and top hits, in that concurrency is applied for top hits despite sort by field normally disables it. The key difference is the context where sorting is applied, and the fact that concurrency is disabled only for performance reasons on top level searches and not for functional reasons.