Skip to content

Commit 222ee72

Browse files
authored
Add moving percentiles pipeline aggregation (#55441) (#56575)
Similar to what the moving function aggregation does, except merging windows of percentiles sketches together instead of cumulatively merging final metrics
1 parent 7b1f978 commit 222ee72

File tree

15 files changed

+1170
-16
lines changed

15 files changed

+1170
-16
lines changed

docs/reference/aggregations/pipeline.asciidoc

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -287,3 +287,4 @@ include::pipeline/bucket-script-aggregation.asciidoc[]
287287
include::pipeline/bucket-selector-aggregation.asciidoc[]
288288
include::pipeline/bucket-sort-aggregation.asciidoc[]
289289
include::pipeline/serial-diff-aggregation.asciidoc[]
290+
include::pipeline/moving-percentiles-aggregation.asciidoc[]
Lines changed: 162 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,162 @@
1+
[role="xpack"]
2+
[testenv="basic"]
3+
[[search-aggregations-pipeline-moving-percentiles-aggregation]]
4+
=== Moving Percentiles Aggregation
5+
6+
Given an ordered series of <<search-aggregations-metrics-percentile-aggregation, percentiles>>, the Moving Percentile aggregation
7+
will slide a window across those percentiles and allow the user to compute the cumulative percentile.
8+
9+
This is conceptually very similar to the <<search-aggregations-pipeline-movfn-aggregation, Moving Function>> pipeline aggregation,
10+
except it works on the percentiles sketches instead of the actual buckets values.
11+
12+
==== Syntax
13+
14+
A `moving_percentiles` aggregation looks like this in isolation:
15+
16+
[source,js]
17+
--------------------------------------------------
18+
{
19+
"moving_percentiles": {
20+
"buckets_path": "the_percentile",
21+
"window": 10
22+
}
23+
}
24+
--------------------------------------------------
25+
// NOTCONSOLE
26+
27+
[[moving-percentiles-params]]
28+
.`moving_percentiles` Parameters
29+
[options="header"]
30+
|===
31+
|Parameter Name |Description |Required |Default Value
32+
|`buckets_path` |Path to the percentile of interest (see <<buckets-path-syntax, `buckets_path` Syntax>> for more details |Required |
33+
|`window` |The size of window to "slide" across the histogram. |Required |
34+
|`shift` |<<shift-parameter, Shift>> of window position. |Optional | 0
35+
|===
36+
37+
`moving_percentiles` aggregations must be embedded inside of a `histogram` or `date_histogram` aggregation. They can be
38+
embedded like any other metric aggregation:
39+
40+
[source,console]
41+
--------------------------------------------------
42+
POST /_search
43+
{
44+
"size": 0,
45+
"aggs": {
46+
"my_date_histo":{ <1>
47+
"date_histogram":{
48+
"field":"date",
49+
"calendar_interval":"1M"
50+
},
51+
"aggs":{
52+
"the_percentile":{ <2>
53+
"percentiles":{
54+
"field": "price",
55+
"percents": [ 1.0, 99.0 ]
56+
}
57+
},
58+
"the_movperc": {
59+
"moving_percentiles": {
60+
"buckets_path": "the_percentile", <3>
61+
"window": 10
62+
}
63+
}
64+
}
65+
}
66+
}
67+
}
68+
--------------------------------------------------
69+
// TEST[setup:sales]
70+
71+
<1> A `date_histogram` named "my_date_histo" is constructed on the "timestamp" field, with one-day intervals
72+
<2> A `percentile` metric is used to calculate the percentiles of a field.
73+
<3> Finally, we specify a `moving_percentiles` aggregation which uses "the_percentile" sketch as its input.
74+
75+
Moving percentiles are built by first specifying a `histogram` or `date_histogram` over a field. You then add
76+
a percentile metric inside of that histogram. Finally, the `moving_percentiles` is embedded inside the histogram.
77+
The `buckets_path` parameter is then used to "point" at the percentiles aggregation inside of the histogram (see
78+
<<buckets-path-syntax>> for a description of the syntax for `buckets_path`).
79+
80+
And the following may be the response:
81+
82+
[source,console-result]
83+
--------------------------------------------------
84+
{
85+
"took": 11,
86+
"timed_out": false,
87+
"_shards": ...,
88+
"hits": ...,
89+
"aggregations": {
90+
"my_date_histo": {
91+
"buckets": [
92+
{
93+
"key_as_string": "2015/01/01 00:00:00",
94+
"key": 1420070400000,
95+
"doc_count": 3,
96+
"the_percentile": {
97+
"values": {
98+
"1.0": 150.0,
99+
"99.0": 200.0
100+
}
101+
}
102+
},
103+
{
104+
"key_as_string": "2015/02/01 00:00:00",
105+
"key": 1422748800000,
106+
"doc_count": 2,
107+
"the_percentile": {
108+
"values": {
109+
"1.0": 10.0,
110+
"99.0": 50.0
111+
}
112+
},
113+
"the_movperc": {
114+
"values": {
115+
"1.0": 150.0,
116+
"99.0": 200.0
117+
}
118+
}
119+
},
120+
{
121+
"key_as_string": "2015/03/01 00:00:00",
122+
"key": 1425168000000,
123+
"doc_count": 2,
124+
"the_percentile": {
125+
"values": {
126+
"1.0": 175.0,
127+
"99.0": 200.0
128+
}
129+
},
130+
"the_movperc": {
131+
"values": {
132+
"1.0": 10.0,
133+
"99.0": 200.0
134+
}
135+
}
136+
}
137+
]
138+
}
139+
}
140+
}
141+
--------------------------------------------------
142+
// TESTRESPONSE[s/"took": 11/"took": $body.took/]
143+
// TESTRESPONSE[s/"_shards": \.\.\./"_shards": $body._shards/]
144+
// TESTRESPONSE[s/"hits": \.\.\./"hits": $body.hits/]
145+
146+
The output format of the `moving_percentiles` aggregation is inherited from the format of the referenced
147+
<<search-aggregations-metrics-percentile-aggregation,`percentiles`>> aggregation.
148+
149+
Moving percentiles pipeline aggregations always run with `skip` gap policy.
150+
151+
152+
[[moving-percentiles-shift-parameter]]
153+
==== shift parameter
154+
155+
By default (with `shift = 0`), the window that is offered for calculation is the last `n` values excluding the current bucket.
156+
Increasing `shift` by 1 moves starting window position by `1` to the right.
157+
158+
- To include current bucket to the window, use `shift = 1`.
159+
- For center alignment (`n / 2` values before and after the current bucket), use `shift = window / 2`.
160+
- For right alignment (`n` values after the current bucket), use `shift = window`.
161+
162+
If either of window edges moves outside the borders of data series, the window shrinks to include available values only.

server/src/main/java/org/elasticsearch/search/aggregations/metrics/AbstractInternalHDRPercentiles.java

Lines changed: 19 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -86,7 +86,7 @@ public double value(String name) {
8686
return value(Double.parseDouble(name));
8787
}
8888

89-
DocValueFormat formatter() {
89+
public DocValueFormat formatter() {
9090
return format;
9191
}
9292

@@ -96,10 +96,27 @@ public long getEstimatedMemoryFootprint() {
9696
return state.getEstimatedFootprintInBytes();
9797
}
9898

99-
DoubleHistogram getState() {
99+
/**
100+
* Return the internal {@link DoubleHistogram} sketch for this metric.
101+
*/
102+
public DoubleHistogram getState() {
100103
return state;
101104
}
102105

106+
/**
107+
* Return the keys (percentiles) requested.
108+
*/
109+
public double[] getKeys() {
110+
return keys;
111+
}
112+
113+
/**
114+
* Should the output be keyed.
115+
*/
116+
public boolean keyed() {
117+
return keyed;
118+
}
119+
103120
@Override
104121
public AbstractInternalHDRPercentiles reduce(List<InternalAggregation> aggregations, ReduceContext reduceContext) {
105122
DoubleHistogram merged = null;

server/src/main/java/org/elasticsearch/search/aggregations/metrics/AbstractInternalTDigestPercentiles.java

Lines changed: 19 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -72,18 +72,35 @@ public double value(String name) {
7272

7373
public abstract double value(double key);
7474

75-
DocValueFormat formatter() {
75+
public DocValueFormat formatter() {
7676
return format;
7777
}
7878

7979
public long getEstimatedMemoryFootprint() {
8080
return state.byteSize();
8181
}
8282

83-
TDigestState getState() {
83+
/**
84+
* Return the internal {@link TDigestState} sketch for this metric.
85+
*/
86+
public TDigestState getState() {
8487
return state;
8588
}
8689

90+
/**
91+
* Return the keys (percentiles) requested.
92+
*/
93+
public double[] getKeys() {
94+
return keys;
95+
}
96+
97+
/**
98+
* Should the output be keyed.
99+
*/
100+
public boolean keyed() {
101+
return keyed;
102+
}
103+
87104
@Override
88105
public AbstractInternalTDigestPercentiles reduce(List<InternalAggregation> aggregations, ReduceContext reduceContext) {
89106
TDigestState merged = null;

x-pack/plugin/analytics/src/main/java/org/elasticsearch/xpack/analytics/AnalyticsPlugin.java

Lines changed: 13 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -35,8 +35,8 @@
3535
import org.elasticsearch.xpack.analytics.boxplot.BoxplotAggregationBuilder;
3636
import org.elasticsearch.xpack.analytics.boxplot.InternalBoxplot;
3737
import org.elasticsearch.xpack.analytics.cumulativecardinality.CumulativeCardinalityPipelineAggregationBuilder;
38-
import org.elasticsearch.xpack.analytics.cumulativecardinality.CumulativeCardinalityPipelineAggregator;
3938
import org.elasticsearch.xpack.analytics.mapper.HistogramFieldMapper;
39+
import org.elasticsearch.xpack.analytics.movingPercentiles.MovingPercentilesPipelineAggregationBuilder;
4040
import org.elasticsearch.xpack.analytics.stringstats.InternalStringStats;
4141
import org.elasticsearch.xpack.analytics.stringstats.StringStatsAggregationBuilder;
4242
import org.elasticsearch.xpack.analytics.topmetrics.InternalTopMetrics;
@@ -73,14 +73,18 @@ public AnalyticsPlugin(Settings settings) {
7373

7474
@Override
7575
public List<PipelineAggregationSpec> getPipelineAggregations() {
76-
return singletonList(
77-
new PipelineAggregationSpec(
78-
CumulativeCardinalityPipelineAggregationBuilder.NAME,
79-
CumulativeCardinalityPipelineAggregationBuilder::new,
80-
CumulativeCardinalityPipelineAggregator::new,
81-
usage.track(AnalyticsStatsAction.Item.CUMULATIVE_CARDINALITY,
82-
checkLicense(CumulativeCardinalityPipelineAggregationBuilder.PARSER)))
83-
);
76+
List<PipelineAggregationSpec> pipelineAggs = new ArrayList<>();
77+
pipelineAggs.add(new PipelineAggregationSpec(
78+
CumulativeCardinalityPipelineAggregationBuilder.NAME,
79+
CumulativeCardinalityPipelineAggregationBuilder::new,
80+
usage.track(AnalyticsStatsAction.Item.CUMULATIVE_CARDINALITY,
81+
checkLicense(CumulativeCardinalityPipelineAggregationBuilder.PARSER))));
82+
pipelineAggs.add(new PipelineAggregationSpec(
83+
MovingPercentilesPipelineAggregationBuilder.NAME,
84+
MovingPercentilesPipelineAggregationBuilder::new,
85+
usage.track(AnalyticsStatsAction.Item.MOVING_PERCENTILES,
86+
checkLicense(MovingPercentilesPipelineAggregationBuilder.PARSER))));
87+
return pipelineAggs;
8488
}
8589

8690
@Override

0 commit comments

Comments
 (0)