-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Ordering term aggregation based on scripted metric. #15718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The CLA is now signed. |
…on to the script attribute (see description here elastic#15731).
For info, this pull request fixes this issue. |
@colings86 I think this is something you might be able to take a look at? |
@dakrone, @colings86 is possible to move forward here? So many people complain that ElasticSearch is not able to order term aggregation based on scripted metric? It's like a major missing feature IMO and for sure it's inconsistent with the rest of the aggregation framework. |
@lquerel sorry it's taken so long to get back to you on this PR. This is a tricky feature as there are problems with sorting the terms aggregation by any aggregation type which is the focus of #17588. Given this I am personally hesistant to add more cases where your can sort by a sub-aggregation but have no idea whether you actually have the top N or even what the error in the result could be. If we do end up implementing this feature I would also be keen for it to not change the aggregation framework in a way that is specific only to this aggregation since this IMO will make the aggregation framework tricky to understand and maintain. We actually want to move the |
I have also added a comment to the issue asking for use cases where users need to order by a scripted_metric aggregation to better understand where users think the current set of aggregations are lacking and what we can do to fill these gaps #8486 (comment) |
@colings86 Thanks for your response. I understand the need to remove all special cases from the core aggregation framework. The fact that scripted metrics don't behave as any other metrics is disturbing for the user. If scripted metrics are implemented as a plugin, then I hope the "order by" clause will be supported because otherwise ElasticSearch doesn't compete with the aggregation framework of a solution like crate.io (bases on ElasticSearch core) or even with the basic MongoDB aggregation framework. They support group by, order by on any computation. IMO, with a better support of different variants of average functions we could avoid a lot scripted metrics use cases (weighted average + other forms explained in this thread #8486 (comment)). Finally compute an accurate Top X on a distributed system is not easy and sometimes not really essential. A fast approximation is usually good enough in most case. |
@colings86 Where do we stand on this PR? It has been open for 1.5 years, we should make a decision on it. |
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
I am going to close this PR as any solution for this would need to work in a way that does not special case the |
Hi,
Scripted metrics have been introduced to compute complex computations based on a map/reduce approach. Unfortunately the usage of scripted metrics is limited because they can't be used in the 'order' clause of an aggregation. See this discussion for a full description of the issue.
https://discuss.elastic.co/t/ordering-terms-aggregation-based-on-pipeline-metric/31839
This pull-request is an attempt to fix this issue for scripted metrics returning a number as the result of their reduce method (probably more than 90% of use cases).
That's my first contribution to ES, so there is probably room for improvements and even may be mistakes here and there. Don't hesitate to guide me in the right direction in order to make this pull request a success.
Thanks
Laurent