@@ -117,6 +117,7 @@ Example:
  --------------------------------------------------
  GET /_search
  {
+     "size": 0,
      "aggs" : {
          "my_buckets": {
              "composite" : {
@@ -135,6 +136,7 @@ Like the `terms` aggregation it is also possible to use a script to create the v
  --------------------------------------------------
  GET /_search
  {
+     "size": 0,
      "aggs" : {
          "my_buckets": {
              "composite" : {
@@ -170,6 +172,7 @@ Example:
  --------------------------------------------------
  GET /_search
  {
+     "size": 0,
      "aggs" : {
          "my_buckets": {
              "composite" : {
@@ -188,6 +191,7 @@ The values are built from a numeric field or a script that return numerical valu
  --------------------------------------------------
  GET /_search
  {
+     "size": 0,
      "aggs" : {
          "my_buckets": {
              "composite" : {
@@ -220,6 +224,7 @@ is specified by date/time expression:
  --------------------------------------------------
  GET /_search
  {
+     "size": 0,
      "aggs" : {
          "my_buckets": {
              "composite" : {
@@ -249,6 +254,7 @@ the format specified with the format parameter:
  --------------------------------------------------
  GET /_search
  {
+     "size": 0,
      "aggs" : {
          "my_buckets": {
              "composite" : {
@@ -291,6 +297,7 @@ For example:
  --------------------------------------------------
  GET /_search
  {
+     "size": 0,
      "aggs" : {
          "my_buckets": {
              "composite" : {
@@ -313,6 +320,7 @@ in the composite buckets.
  --------------------------------------------------
  GET /_search
  {
+     "size": 0,
      "aggs" : {
          "my_buckets": {
              "composite" : {
@@ -342,6 +350,7 @@ For example:
  --------------------------------------------------
  GET /_search
  {
+     "size": 0,
      "aggs" : {
          "my_buckets": {
              "composite" : {
@@ -368,6 +377,7 @@ It is possible to include them in the response by setting `missing_bucket` to
  --------------------------------------------------
  GET /_search
  {
+     "size": 0,
      "aggs" : {
          "my_buckets": {
              "composite" : {
@@ -393,7 +403,7 @@ first 10 composite buckets created from the values source.
  The response contains the values for each composite bucket in an array containing the values extracted
  from each value source.

- ==== After
+ ==== Pagination

  If the number of composite buckets is too high (or unknown) to be returned in a single response
  it is possible to split the retrieval in multiple requests.
@@ -407,6 +417,7 @@ For example:
  --------------------------------------------------
  GET /_search
  {
+     "size": 0,
      "aggs" : {
          "my_buckets": {
              "composite" : {
@@ -472,6 +483,7 @@ round of result can be retrieved with:
  --------------------------------------------------
  GET /_search
  {
+     "size": 0,
      "aggs" : {
          "my_buckets": {
              "composite" : {
@@ -489,6 +501,116 @@ GET /_search

  <1> Should restrict the aggregation to buckets that sort **after** the provided values.
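
As a rough illustration of the pagination loop described here, the following Python sketch keeps re-issuing the request with `after` set until no more buckets come back. The `search` callable is a hypothetical stand-in for whatever client sends the request body to `GET /_search`, and `iter_composite_buckets` is an illustrative name, not part of any client library. The sketch also assumes the response reports the last returned key as `after_key` next to `buckets`.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional


def iter_composite_buckets(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],
    sources: List[Dict[str, Any]],
    page_size: int = 10,
    agg_name: str = "my_buckets",
) -> Iterator[Dict[str, Any]]:
    """Yield every composite bucket, one page at a time.

    `search` is a hypothetical callable: it takes a request body and
    returns the parsed JSON response of `GET /_search`.
    """
    after: Optional[Dict[str, Any]] = None
    while True:
        composite: Dict[str, Any] = {"size": page_size, "sources": sources}
        if after is not None:
            # Only ask for buckets that sort after the last key seen.
            composite["after"] = after
        body = {"size": 0, "aggs": {agg_name: {"composite": composite}}}
        result = search(body)["aggregations"][agg_name]
        buckets = result.get("buckets", [])
        if not buckets:
            return
        yield from buckets
        # Assumption: the response carries the key of the last bucket.
        after = result.get("after_key")
        if after is None:
            return
```

The generator stops either on an empty page or when no `after_key` is returned, so the caller can simply iterate over it without tracking page state.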

+ ==== Early termination
+
+ For optimal performance, the <<index-modules-index-sorting,index sort>> should be set on the index so that it matches
+ part or all of the source order in the composite aggregation.
+ For instance, the following index sort:
+
+ [source,console]
+ --------------------------------------------------
+ PUT twitter
+ {
+     "settings" : {
+         "index" : {
+             "sort.field" : ["user_name", "timestamp"], <1>
+             "sort.order" : ["asc", "desc"] <2>
+         }
+     },
+     "mappings": {
+         "properties": {
+             "user_name": {
+                 "type": "keyword",
+                 "doc_values": true
+             },
+             "timestamp": {
+                 "type": "date"
+             }
+         }
+     }
+ }
+ --------------------------------------------------
+
+ <1> This index is sorted by `user_name` first, then by `timestamp`.
+ <2> ... in ascending order for the `user_name` field and in descending order for the `timestamp` field.
+
+ ... could be used to optimize these composite aggregations:
+
+ [source,console]
+ --------------------------------------------------
+ GET /_search
+ {
+     "size": 0,
+     "aggs" : {
+         "my_buckets": {
+             "composite" : {
+                 "sources" : [
+                     { "user_name": { "terms" : { "field": "user_name" } } } <1>
+                 ]
+             }
+         }
+     }
+ }
+ --------------------------------------------------
+
+ <1> `user_name` is a prefix of the index sort and the order matches (`asc`).
+
+ [source,console]
+ --------------------------------------------------
+ GET /_search
+ {
+     "size": 0,
+     "aggs" : {
+         "my_buckets": {
+             "composite" : {
+                 "sources" : [
+                     { "user_name": { "terms" : { "field": "user_name" } } }, <1>
+                     { "date": { "date_histogram": { "field": "timestamp", "calendar_interval": "1d", "order": "desc" } } } <2>
+                 ]
+             }
+         }
+     }
+ }
+ --------------------------------------------------
+
+ <1> `user_name` is a prefix of the index sort and the order matches (`asc`).
+ <2> `timestamp` also matches the index sort prefix, and the order matches (`desc`).
+
+ To optimize early termination, it is advised to set `track_total_hits` to `false` in the request.
+ The total number of hits that match the request can be retrieved with the first request,
+ and it would be costly to compute this number on every page:
+
+ [source,console]
+ --------------------------------------------------
+ GET /_search
+ {
+     "size": 0,
+     "track_total_hits": false,
+     "aggs" : {
+         "my_buckets": {
+             "composite" : {
+                 "sources" : [
+                     { "user_name": { "terms" : { "field": "user_name" } } },
+                     { "date": { "date_histogram": { "field": "timestamp", "calendar_interval": "1d", "order": "desc" } } }
+                 ]
+             }
+         }
+     }
+ }
+ --------------------------------------------------
+
+ Note that the order of the sources is important: in the example above, switching `user_name` with `timestamp`
+ would deactivate the sort optimization since this configuration would no longer match the index sort specification.
+ If the order of the sources does not matter for your use case, you can follow these simple guidelines:
+
+ * Put the fields with the highest cardinality first.
+ * Make sure that the order of each field matches the order of the index sort.
+ * Put multi-valued fields last since they cannot be used for early termination.
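
The matching rule behind these guidelines can be sketched in Python. This is only an illustration of the prefix check described in this section, not how Elasticsearch implements it: `matching_prefix_length` is a hypothetical helper, the index sort is passed in as plain `(field, order)` pairs, and a source without an explicit `order` is assumed to sort ascending.

```python
from typing import Any, Dict, List, Tuple


def matching_prefix_length(
    sources: List[Dict[str, Any]],
    index_sort: List[Tuple[str, str]],
) -> int:
    """Count how many leading composite sources line up with the index sort.

    `index_sort` is a list of (field, order) pairs, for example
    [("user_name", "asc"), ("timestamp", "desc")].
    A result of 0 means the sources do not follow the index sort at all.
    """
    matched = 0
    for source, (sort_field, sort_order) in zip(sources, index_sort):
        # Each source has the shape {name: {type: {"field": ..., "order": ...}}}.
        (definition,) = source.values()
        (params,) = definition.values()
        # Assumption: a source without "order" sorts in ascending order.
        same_field = params.get("field") == sort_field
        same_order = params.get("order", "asc") == sort_order
        if not (same_field and same_order):
            break
        matched += 1
    return matched
```

With the index sort `[("user_name", "asc"), ("timestamp", "desc")]`, the `user_name` terms source followed by the descending `timestamp` date histogram would match on both positions, while swapping the two sources would match on none.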
+
+ WARNING: <<index-modules-index-sorting,index sort>> can slow down indexing. It is very important to test index sorting
+ with your specific use case and dataset to ensure that it matches your requirements. If it doesn't, note that `composite`
+ aggregations will also try to early terminate on non-sorted indices if the query matches all documents (`match_all` query).
+
  ==== Sub-aggregations

  Like any `multi-bucket` aggregations the `composite` aggregation can hold sub-aggregations.
@@ -501,6 +623,7 @@ per composite bucket:
  --------------------------------------------------
  GET /_search
  {
+     "size": 0,
      "aggs" : {
          "my_buckets": {
              "composite" : {