Use global_ordinals_hash execution mode when sorting by sub aggregations. #26014

Merged
@@ -30,6 +30,7 @@
 import org.elasticsearch.search.aggregations.BucketOrder;
 import org.elasticsearch.search.aggregations.InternalAggregation;
 import org.elasticsearch.search.aggregations.InternalOrder;
+import org.elasticsearch.search.aggregations.InternalOrder.CompoundOrder;
 import org.elasticsearch.search.aggregations.NonCollectingAggregator;
 import org.elasticsearch.search.aggregations.bucket.BucketUtils;
 import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregator.BucketCountThresholds;
@@ -93,6 +94,17 @@ public InternalAggregation buildEmptyAggregation() {
         };
     }
 
+    private static boolean isAggregationSort(BucketOrder order) {
+        if (order instanceof InternalOrder.Aggregation) {
+            return true;
+        } else if (order instanceof InternalOrder.CompoundOrder) {
+            InternalOrder.CompoundOrder compoundOrder = (CompoundOrder) order;
+            return compoundOrder.orderElements().stream().anyMatch(TermsAggregatorFactory::isAggregationSort);
+        } else {
+            return false;
+        }
+    }
+
     @Override
     protected Aggregator doCreateInternal(ValuesSource valuesSource, Aggregator parent, boolean collectsFromSingleBucket,
             List<PipelineAggregator> pipelineAggregators, Map<String, Object> metaData) throws IOException {
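
The recursion in `isAggregationSort` above can be tried out without Elasticsearch on the classpath. The sketch below is only an illustration: `Order`, `KeyOrder`, `AggregationOrder` and `CompoundOrder` are hypothetical stand-ins for `BucketOrder` and the `InternalOrder` variants, not the real classes. It shows that a compound order counts as an aggregation sort as soon as any of its elements, at any nesting depth, sorts by a sub aggregation.

```java
import java.util.Arrays;
import java.util.List;

public class AggregationSortSketch {
    // Hypothetical stand-in for BucketOrder (assumption, not the real type).
    interface Order {}

    // Stand-in for an order on the bucket key; never an aggregation sort.
    static final class KeyOrder implements Order {}

    // Stand-in for InternalOrder.Aggregation: sorts by a sub aggregation path.
    static final class AggregationOrder implements Order {
        final String path;
        AggregationOrder(String path) { this.path = path; }
    }

    // Stand-in for InternalOrder.CompoundOrder: a list of order elements.
    static final class CompoundOrder implements Order {
        final List<Order> elements;
        CompoundOrder(Order... elements) { this.elements = Arrays.asList(elements); }
    }

    // Same shape as the helper in the diff: true for an aggregation order,
    // recursive over the elements of a compound order, false otherwise.
    static boolean isAggregationSort(Order order) {
        if (order instanceof AggregationOrder) {
            return true;
        } else if (order instanceof CompoundOrder) {
            return ((CompoundOrder) order).elements.stream()
                    .anyMatch(AggregationSortSketch::isAggregationSort);
        } else {
            return false;
        }
    }

    public static void main(String[] args) {
        Order byKey = new KeyOrder();
        Order byAvg = new AggregationOrder("avg_price"); // hypothetical sub aggregation name
        Order compound = new CompoundOrder(byKey, new CompoundOrder(byAvg));

        System.out.println("key only:  " + isAggregationSort(byKey));
        System.out.println("sub agg:   " + isAggregationSort(byAvg));
        System.out.println("compound:  " + isAggregationSort(compound));
    }
}
```
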
@@ -139,10 +151,17 @@ protected Aggregator doCreateInternal(ValuesSource valuesSource, Aggregator pare
                     // to be unbounded and most instances may only aggregate few
                     // documents, so use hashed based
                     // global ordinals to keep the bucket ords dense.
+
                     // Additionally, if using partitioned terms the regular global
                     // ordinals would be sparse so we opt for hash
+
+                    // Finally if we are sorting by sub aggregations, then these
+                    // aggregations cannot be deferred, so global_ordinals_hash is
+                    // a safer choice as we won't use memory for sub aggregations
+                    // for buckets that are not collected.
                     if (Aggregator.descendsFromBucketAggregator(parent) ||
-                            (includeExclude != null && includeExclude.isPartitionBased())) {
+                            (includeExclude != null && includeExclude.isPartitionBased()) ||
+                            isAggregationSort(order)) {
                         execution = ExecutionMode.GLOBAL_ORDINALS_HASH;
                     } else {
                         if (factories == AggregatorFactories.EMPTY) {
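
The condition changed in the last hunk can be read as a three-way "or". The sketch below is a simplified model, not the factory code: the three boolean parameters stand in for `Aggregator.descendsFromBucketAggregator(parent)`, `includeExclude != null && includeExclude.isPartitionBased()` and `isAggregationSort(order)`. The `else` branch is cut off in the diff shown here, so it is represented by a placeholder constant, `DECIDED_BY_ELIDED_BRANCH`, which is an assumption made purely for illustration.

```java
public class ExecutionModeSketch {
    // GLOBAL_ORDINALS_HASH mirrors the constant in the diff; the second value
    // is a hypothetical placeholder for the else branch not shown in the diff.
    enum ExecutionMode { GLOBAL_ORDINALS_HASH, DECIDED_BY_ELIDED_BRANCH }

    // Simplified model of the condition after this change: any one of the
    // three inputs is enough to pick the hash-based global ordinals mode.
    static ExecutionMode choose(boolean descendsFromBucketAggregator,
                                boolean partitionBasedIncludeExclude,
                                boolean sortsBySubAggregation) {
        if (descendsFromBucketAggregator
                || partitionBasedIncludeExclude
                || sortsBySubAggregation) {
            return ExecutionMode.GLOBAL_ORDINALS_HASH;
        }
        return ExecutionMode.DECIDED_BY_ELIDED_BRANCH;
    }

    public static void main(String[] args) {
        // Top-level terms aggregation, no partitioning, sorted by a sub aggregation:
        // before this change the third input did not exist, so this case fell
        // through to the else branch.
        System.out.println(choose(false, false, true));
        // Same aggregation sorted by key or count only.
        System.out.println(choose(false, false, false));
    }
}
```
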