Disable concurrency when top_hits sorts on anything but _score #123610
We already disable inter-segment concurrency in SearchSourceBuilder whenever the top-level sort provided is not _score. We should apply the same rules in top_hits. We recently stumbled upon non-deterministic behaviour caused by script sorting defined within top hits. That is to be expected, given that script sorting does not support search concurrency. The sort script can be replaced with a runtime field, either defined in the mapping or in the search request, which does support concurrency and guarantees predictable behaviour.
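The runtime-field workaround described above can be sketched as a search request body. This is only an illustration and is not taken from the PR: the fields `category` and `price`, the runtime field name `discounted_price`, the aggregation names, and the Painless expression are all hypothetical. The body is written as a Java text block (Java 15 or later) so that it could be pasted into a test or a low-level client call.

```java
final class RuntimeFieldSortSketch {

    // Builds a search request body in which top_hits sorts on a runtime field
    // rather than on a "_script" sort. Every field and aggregation name here
    // is a hypothetical placeholder.
    static String requestBody(String runtimeField, String painlessSource) {
        return """
            {
              "size": 0,
              "runtime_mappings": {
                "%s": {
                  "type": "double",
                  "script": { "source": "%s" }
                }
              },
              "aggs": {
                "by_category": {
                  "terms": { "field": "category" },
                  "aggs": {
                    "top": {
                      "top_hits": {
                        "size": 1,
                        "sort": [ { "%s": { "order": "desc" } } ]
                      }
                    }
                  }
                }
              }
            }
            """.formatted(runtimeField, painlessSource, runtimeField);
    }

    public static void main(String[] args) {
        // Hypothetical runtime field computing a discounted price per document.
        System.out.println(requestBody("discounted_price", "emit(doc['price'].value * 0.9)"));
    }
}
```

The same runtime field could instead be declared in the index mapping, in which case the `runtime_mappings` section would be dropped from the request.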
Pinging @elastic/es-analytical-engine (Team:Analytics)
Pinging @elastic/es-search-foundations (Team:Search Foundations)
Hi @javanna, I've created a changelog YAML for you.
{
    SearchSourceBuilder searchSourceBuilder = newSearchSourceBuilder.get();
    searchSourceBuilder.aggregation(new TermsAggregationBuilder("terms").subAggregation(new TopHitsAggregationBuilder("tophits")));
    assertFalse(searchSourceBuilder.supportsParallelCollection(fieldCardinality));
Is this test unrelated to the "sort" change made here? Do we generally disallow parallel collection for nested aggs or is there something specific here?
yea I wanted to test the different combinations of top_hits within another agg as well as other agg with top_hits underneath.
Thanks, LGTM.
…) (#123642)
…) (#123640)
…) (#123643)
…) (#123641)
With elastic#123610 we disabled parallel collection for field-sorted and script-sorted top hits, aligning its behaviour with that of top-level search. This was mainly to work around a bug in script sorting, which did not support inter-segment concurrency. The bug with script sort has been fixed with elastic#123757 and concurrency re-enabled for it. While sort by field is not optimized for search concurrency, top hits benefits from it, and disabling concurrency for sort by field in top hits has caused performance regressions in our nightly benchmarks. This commit re-enables concurrency for top hits when sort by field is used. This reintroduces a discrepancy between top-level search and top hits, in that concurrency is applied for top hits even though sort by field normally disables it. The key difference is the context where sorting is applied, and the fact that concurrency is disabled only for performance reasons on top-level searches and not for functional reasons.
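The resulting discrepancy between the two contexts can be summarised with a small model. This is a sketch of the rule as described in the commit message only: the class, enums, and method below are hypothetical and do not correspond to the actual Elasticsearch code, which decides this inside SearchSourceBuilder and the aggregation builders.

```java
final class SortConcurrencySketch {

    // Hypothetical classification of a sort clause.
    enum Sort { SCORE, FIELD, SCRIPT }

    // Hypothetical marker for where the sort is applied.
    enum Context { TOP_LEVEL_SEARCH, TOP_HITS }

    // Models the state described after the follow-up commit:
    // - top-level search: any sort other than _score disables concurrency,
    //   for performance reasons rather than functional ones;
    // - top_hits: script sort was fixed and field sort re-enabled, so the
    //   sort no longer disables concurrency in this context.
    static boolean sortAllowsConcurrency(Context context, Sort sort) {
        if (context == Context.TOP_LEVEL_SEARCH) {
            return sort == Sort.SCORE;
        }
        return true;
    }

    public static void main(String[] args) {
        for (Context context : Context.values()) {
            for (Sort sort : Sort.values()) {
                System.out.println(context + " / " + sort + " -> " + sortAllowsConcurrency(context, sort));
            }
        }
    }
}
```

The model covers only the contribution of the sort clause. Whether a whole request is collected in parallel also depends on the other aggregations involved, which this sketch deliberately leaves out.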
…26012)
…26013)
…26011)
…26014)