@@ -116,6 +116,7 @@ Example:
116
116
--------------------------------------------------
117
117
GET /_search
118
118
{
119
+ "size": 0,
119
120
"aggs" : {
120
121
"my_buckets": {
121
122
"composite" : {
@@ -134,6 +135,7 @@ Like the `terms` aggregation it is also possible to use a script to create the v
134
135
--------------------------------------------------
135
136
GET /_search
136
137
{
138
+ "size": 0,
137
139
"aggs" : {
138
140
"my_buckets": {
139
141
"composite" : {
@@ -168,6 +170,7 @@ Example:
168
170
--------------------------------------------------
169
171
GET /_search
170
172
{
173
+ "size": 0,
171
174
"aggs" : {
172
175
"my_buckets": {
173
176
"composite" : {
@@ -186,6 +189,7 @@ The values are built from a numeric field or a script that return numerical valu
186
189
--------------------------------------------------
187
190
GET /_search
188
191
{
192
+ "size": 0,
189
193
"aggs" : {
190
194
"my_buckets": {
191
195
"composite" : {
@@ -218,6 +222,7 @@ is specified by date/time expression:
218
222
--------------------------------------------------
219
223
GET /_search
220
224
{
225
+ "size": 0,
221
226
"aggs" : {
222
227
"my_buckets": {
223
228
"composite" : {
@@ -247,6 +252,7 @@ the format specified with the format parameter:
247
252
--------------------------------------------------
248
253
GET /_search
249
254
{
255
+ "size": 0,
250
256
"aggs" : {
251
257
"my_buckets": {
252
258
"composite" : {
@@ -289,6 +295,7 @@ For example:
289
295
--------------------------------------------------
290
296
GET /_search
291
297
{
298
+ "size": 0,
292
299
"aggs" : {
293
300
"my_buckets": {
294
301
"composite" : {
@@ -311,6 +318,7 @@ in the composite buckets.
311
318
--------------------------------------------------
312
319
GET /_search
313
320
{
321
+ "size": 0,
314
322
"aggs" : {
315
323
"my_buckets": {
316
324
"composite" : {
@@ -340,6 +348,7 @@ For example:
340
348
--------------------------------------------------
341
349
GET /_search
342
350
{
351
+ "size": 0,
343
352
"aggs" : {
344
353
"my_buckets": {
345
354
"composite" : {
@@ -366,6 +375,7 @@ It is possible to include them in the response by setting `missing_bucket` to
366
375
--------------------------------------------------
367
376
GET /_search
368
377
{
378
+ "size": 0,
369
379
"aggs" : {
370
380
"my_buckets": {
371
381
"composite" : {
@@ -391,7 +401,7 @@ first 10 composite buckets created from the values source.
391
401
The response contains the values for each composite bucket in an array containing the values extracted
392
402
from each value source.
393
403
394
- ==== After
404
+ ==== Pagination
395
405
396
406
If the number of composite buckets is too high (or unknown) to be returned in a single response
397
407
it is possible to split the retrieval into multiple requests.
@@ -405,6 +415,7 @@ For example:
405
415
--------------------------------------------------
406
416
GET /_search
407
417
{
418
+ "size": 0,
408
419
"aggs" : {
409
420
"my_buckets": {
410
421
"composite" : {
@@ -470,6 +481,7 @@ round of result can be retrieved with:
470
481
--------------------------------------------------
471
482
GET /_search
472
483
{
484
+ "size": 0,
473
485
"aggs" : {
474
486
"my_buckets": {
475
487
"composite" : {
@@ -487,6 +499,116 @@ GET /_search
487
499
488
500
<1> Should restrict the aggregation to buckets that sort **after** the provided values.
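The pagination described above can be driven from a client with a small loop. The following is a minimal sketch, not an official client API: `search` is a hypothetical callable that sends the request body to `GET /_search` and returns the parsed JSON response, and `iter_composite_buckets` is an illustrative name. It assumes the key of the last bucket of a page is passed as `after` for the next page.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional


def iter_composite_buckets(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],
    sources: List[Dict[str, Any]],
    page_size: int = 10,
    agg_name: str = "my_buckets",
) -> Iterator[Dict[str, Any]]:
    """Yield every composite bucket, one page at a time.

    `search` is a hypothetical callable (an assumption of this sketch)
    that posts `body` to `GET /_search` and returns the parsed response.
    """
    after: Optional[Dict[str, Any]] = None
    while True:
        composite: Dict[str, Any] = {"size": page_size, "sources": sources}
        if after is not None:
            # Only return buckets that sort after the last one already seen.
            composite["after"] = after
        body = {"size": 0, "aggs": {agg_name: {"composite": composite}}}
        buckets = search(body)["aggregations"][agg_name]["buckets"]
        if not buckets:
            return  # an empty page means there is nothing left to fetch
        yield from buckets
        # The key of the last bucket is the starting point of the next page.
        after = buckets[-1]["key"]
```

The loop stops on the first empty page, so it issues one request more than the number of non-empty pages.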
489
501
502
+ ==== Early termination
503
+
504
+ For optimal performance the <<index-modules-index-sorting,index sort>> should be set on the index so that it matches
505
+ part or all of the source order in the composite aggregation.
506
+ For instance the following index sort:
507
+
508
+ [source,console]
509
+ --------------------------------------------------
510
+ PUT twitter
511
+ {
512
+ "settings" : {
513
+ "index" : {
514
+ "sort.field" : ["username", "timestamp"], <1>
515
+ "sort.order" : ["asc", "desc"] <2>
516
+ }
517
+ },
518
+ "mappings": {
519
+ "properties": {
520
+ "username": {
521
+ "type": "keyword",
522
+ "doc_values": true
523
+ },
524
+ "timestamp": {
525
+ "type": "date"
526
+ }
527
+ }
528
+ }
529
+ }
530
+ --------------------------------------------------
531
+
532
+ <1> This index is sorted by `username` first then by `timestamp`.
533
+ <2> ... in ascending order for the `username` field and in descending order for the `timestamp` field.
534
+
535
+ ... could be used to optimize these composite aggregations:
536
+
537
+ [source,console]
538
+ --------------------------------------------------
539
+ GET /_search
540
+ {
541
+ "size": 0,
542
+ "aggs" : {
543
+ "my_buckets": {
544
+ "composite" : {
545
+ "sources" : [
546
+ { "username": { "terms" : { "field": "username" } } } <1>
547
+ ]
548
+ }
549
+ }
550
+ }
551
+ }
552
+ --------------------------------------------------
553
+
554
+ <1> `username` is a prefix of the index sort and the order matches (`asc`).
555
+
556
+ [source,console]
557
+ --------------------------------------------------
558
+ GET /_search
559
+ {
560
+ "size": 0,
561
+ "aggs" : {
562
+ "my_buckets": {
563
+ "composite" : {
564
+ "sources" : [
565
+ { "username": { "terms" : { "field": "username" } } }, <1>
566
+ { "date": { "date_histogram": { "field": "timestamp", "calendar_interval": "1d", "order": "desc" } } } <2>
567
+ ]
568
+ }
569
+ }
570
+ }
571
+ }
572
+ --------------------------------------------------
573
+
574
+ <1> `username` is a prefix of the index sort and the order matches (`asc`).
574
+ <2> `timestamp` is the next field of the index sort and its order also matches (`desc`).
576
+
577
+ To get the most out of early termination, it is advised to set `track_total_hits` in the request
578
+ to `false`. The number of total hits that match the request can be retrieved on the first request
579
+ and it would be costly to compute this number on every page:
580
+
581
+ [source,console]
582
+ --------------------------------------------------
583
+ GET /_search
584
+ {
585
+ "size": 0,
586
+ "track_total_hits": false,
587
+ "aggs" : {
588
+ "my_buckets": {
589
+ "composite" : {
590
+ "sources" : [
591
+ { "username": { "terms" : { "field": "username" } } },
592
+ { "date": { "date_histogram": { "field": "timestamp", "calendar_interval": "1d", "order": "desc" } } }
593
+ ]
594
+ }
595
+ }
596
+ }
597
+ }
598
+ --------------------------------------------------
599
+
600
+ Note that the order of the sources is important: in the example above, switching `username` with `timestamp`
601
+ would deactivate the sort optimization since this configuration wouldn't match the index sort specification.
602
+ If the order of the sources does not matter for your use case, you can follow these simple guidelines:
603
+
604
+ * Put the fields with the highest cardinality first.
605
+ * Make sure that the order of the fields matches the order of the index sort.
606
+ * Put multi-valued fields last since they cannot be used for early termination.
607
+
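The prefix rule behind these guidelines can be modelled in a few lines. This is a simplified sketch of the rule as stated in this section, not the logic Elasticsearch actually runs: `sort_prefix_length` and `SortSpec` are illustrative names, and the model ignores multi-valued fields and anything specific to the source type.

```python
from typing import List, Tuple

# (field, order) pairs, where order is "asc" or "desc"
SortSpec = List[Tuple[str, str]]


def sort_prefix_length(index_sort: SortSpec, sources: SortSpec) -> int:
    """Return how many leading sources line up with the index sort.

    A source lines up when it uses the same field and the same order as
    the index sort entry at the same position.
    """
    matched = 0
    for (sort_field, sort_order), (src_field, src_order) in zip(index_sort, sources):
        if sort_field != src_field or sort_order != src_order:
            break  # the first mismatch ends the usable prefix
        matched += 1
    return matched


index_sort = [("username", "asc"), ("timestamp", "desc")]
same_order = sort_prefix_length(index_sort, [("username", "asc"), ("timestamp", "desc")])
swapped = sort_prefix_length(index_sort, [("timestamp", "desc"), ("username", "asc")])
```

With the index sort from this section, `same_order` is 2 and `swapped` is 0, which mirrors the statement that switching the two sources loses the sort optimization.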
608
+ WARNING: <<index-modules-index-sorting,index sort>> can slow down indexing; it is very important to test index sorting
609
+ with your specific use case and dataset to ensure that it matches your requirements. If it doesn't, note that `composite`
610
+ aggregations will also try to terminate early on non-sorted indices if the query matches all documents (`match_all` query).
611
+
490
612
==== Sub-aggregations
491
613
492
614
Like any `multi-bucket` aggregation, the `composite` aggregation can hold sub-aggregations.
@@ -499,6 +621,7 @@ per composite bucket:
499
621
--------------------------------------------------
500
622
GET /_search
501
623
{
624
+ "size": 0,
502
625
"aggs" : {
503
626
"my_buckets": {
504
627
"composite" : {