Add an option to create "other" bucket for Terms aggregation #6804
Comments
One question that is related to that change is whether
I'd say just the doc counts, at least by default.
The suggested syntax is already an option, so the developer using this option should understand the cost. Based on that, I think the default should include all sub aggregations.
What's the point of using Anyway, the syntax can be future-proof, so instead of
It may also check if the value of
I have thought more about this issue, and computing the document count for other buckets is not possible in the general case without doing another pass over the data (think about multi-valued fields). The only thing it could do is return the number of other values (as opposed to documents). But we already have the

If a bucket or count for other docs is really needed, the right way to build it would be to run a first query with the terms aggregation, and a second query with a filter aggregation that excludes the returned terms.
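A minimal sketch of that two-query workaround, written as plain Python dicts so it does not depend on any client library. The helper names (`top_terms_query`, `other_docs_query`), the aggregation names, and the `bool`/`must_not` form of the exclusion filter are assumptions made for illustration; older Elasticsearch versions expressed the negation differently.

```python
def top_terms_query(field, size, agg_name="top_terms"):
    """Body of the first request: the usual terms aggregation."""
    return {
        "size": 0,  # only aggregation results are needed, not hits
        "aggs": {agg_name: {"terms": {"field": field, "size": size}}},
    }


def other_docs_query(field, first_response, agg_name="top_terms"):
    """Body of the second request, built from the first response.

    Counts documents that contain none of the terms already returned.
    """
    buckets = first_response["aggregations"][agg_name]["buckets"]
    returned_terms = [bucket["key"] for bucket in buckets]
    return {
        "size": 0,
        "aggs": {
            "other_docs": {
                "filter": {
                    "bool": {
                        "must_not": [{"terms": {field: returned_terms}}]
                    }
                }
            }
        },
    }


# Hand-written stand-in for the response to the first request.
sample_response = {
    "aggregations": {
        "top_terms": {
            "buckets": [
                {"key": "python", "doc_count": 120},
                {"key": "java", "doc_count": 95},
                {"key": "go", "doc_count": 60},
            ]
        }
    }
}

first_body = top_terms_query("tags", 3)
second_body = other_docs_query("tags", sample_response)
```

Because the filter is a negation, the second count also includes documents that have no value for the field at all, and on a multi-valued field it leaves out any document that carries at least one of the returned terms. Sub-aggregations for the "other" bucket could be nested under `other_docs` in the second request.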
This will be really useful with sub-aggregations:
When using the "terms" aggregation, it's often useful to get the top X terms (achieved by using the `size` parameter), but also to get a separate bucket for all other terms together (possibly constrained by a minimum doc count).
The query syntax might be:
And the response might look like:
The `_other_terms` bucket will be based on all tags with doc_count > 10 per tag, excluding those already listed (the top 3).
Related to #5324
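To make the intended semantics concrete, here is a small client-side sketch that folds a full list of per-term counts into the top 3 buckets plus one `_other_terms` bucket. It is not the proposed query syntax; the function name `fold_other_terms` and the sample counts are hypothetical. Note that summing per-term counts gives a number of values rather than documents when the field is multi-valued.

```python
def fold_other_terms(term_counts, size=3, min_doc_count=10):
    """Return the top `size` buckets plus one `_other_terms` bucket.

    The extra bucket sums the counts of the remaining terms whose own
    doc_count is greater than `min_doc_count`.
    """
    # Sort by count descending, then by term for a stable order on ties.
    ranked = sorted(term_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    top, rest = ranked[:size], ranked[size:]

    buckets = [{"key": term, "doc_count": count} for term, count in top]
    other_total = sum(count for _, count in rest if count > min_doc_count)
    buckets.append({"key": "_other_terms", "doc_count": other_total})
    return buckets


# Hypothetical per-tag counts.
counts = {"python": 120, "java": 95, "go": 60, "rust": 25, "scala": 11, "perl": 4}
result = fold_other_terms(counts)
```

With these sample counts, `rust` and `scala` are folded into `_other_terms`, while `perl` is dropped because its count does not exceed 10.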