Commit 6e01367

Add docs for the fields retrieval API. (#58787)
This PR adds docs for the `fields` parameter. We now present `fields` as the preferred way to load specific fields in a search, with `docvalue_fields` and `stored_fields` as other options to look into. Source filtering is no longer featured prominently, and its section is moved to the end.
1 parent 50150f2 commit 6e01367

6 files changed: +231 -39 lines changed

docs/build.gradle

+11 -6
@@ -144,23 +144,28 @@ Closure setupTwitter = { String name, int count ->
               type: date
             likes:
               type: long
+            location:
+              properties:
+                city:
+                  type: keyword
+                country:
+                  type: keyword
   - do:
       bulk:
         index: twitter
         refresh: true
         body: |'''
   for (int i = 0; i < count; i++) {
-    String user, text
+    String body
     if (i == 0) {
-      user = 'kimchy'
-      text = 'trying out Elasticsearch'
+      body = """{"user": "kimchy", "message": "trying out Elasticsearch", "date": "2009-11-15T14:12:12", "likes": 0,
+              "location": { "city": "Amsterdam", "country": "Netherlands" }}"""
     } else {
-      user = 'test'
-      text = "some message with the number $i"
+      body = """{"user": "test", "message": "some message with the number $i", "date": "2009-11-15T14:12:12", "likes": $i}"""
     }
     buildRestTests.setups[name] += """
           {"index":{"_id": "$i"}}
-          {"user": "$user", "message": "$text", "date": "2009-11-15T14:12:12", "likes": $i}"""
+          $body"""
   }
 }
 setupTwitter('twitter', 5)
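
Note (illustration, not part of the commit): the bulk body that the updated `setupTwitter` closure assembles can be approximated in Python as below. This is a minimal sketch for readers who want to see the resulting documents without running the Gradle build; the function name `twitter_bulk_lines` is hypothetical, and the sketch emits compact JSON rather than reproducing the exact whitespace of the Groovy strings.

```python
import json


def twitter_bulk_lines(count):
    """Return NDJSON lines mirroring the bulk body built by setupTwitter."""
    lines = []
    for i in range(count):
        if i == 0:
            # Only the first document carries the new nested `location` object.
            doc = {
                "user": "kimchy",
                "message": "trying out Elasticsearch",
                "date": "2009-11-15T14:12:12",
                "likes": 0,
                "location": {"city": "Amsterdam", "country": "Netherlands"},
            }
        else:
            doc = {
                "user": "test",
                "message": f"some message with the number {i}",
                "date": "2009-11-15T14:12:12",
                "likes": i,
            }
        # Each document is preceded by its bulk action line.
        lines.append(json.dumps({"index": {"_id": str(i)}}))
        lines.append(json.dumps(doc))
    return lines


if __name__ == "__main__":
    print("\n".join(twitter_bulk_lines(5)))
```

Because only document `0` has a `location`, the `location.*` patterns used in the documentation examples resolve to values for that document alone.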

docs/reference/aggregations/misc.asciidoc

+8 -8
@@ -105,7 +105,8 @@ GET /twitter/_search?typed_keys
   "aggregations": {
     "top_users": {
       "top_hits": {
-        "size": 1
+        "size": 1,
+        "_source": ["user", "likes", "message"]
       }
     }
   }
@@ -141,9 +142,8 @@ In the response, the aggregations names will be changed to respectively `date_hi
           "_id": "0",
           "_score": 1.0,
           "_source": {
-            "date": "2009-11-15T14:12:12",
-            "message": "trying out Elasticsearch",
             "user": "kimchy",
+            "message": "trying out Elasticsearch",
             "likes": 0
           }
         }
@@ -167,12 +167,12 @@ request. This is the case for Terms, Significant Terms and Percentiles aggregati
 also contains information about the type of the targeted field: `lterms` (for a terms aggregation on a Long field),
 `sigsterms` (for a significant terms aggregation on a String field), `tdigest_percentiles` (for a percentile
 aggregation based on the TDigest algorithm).
-
+

 [[indexing-aggregation-results]]
 == Indexing aggregation results with {transforms}
-
-<<transforms,{transforms-cap}>> enable you to convert existing {es} indices
-into summarized indices, which provide opportunities for new insights and
-analytics. You can use {transforms} to persistently index your aggregation
+
+<<transforms,{transforms-cap}>> enable you to convert existing {es} indices
+into summarized indices, which provide opportunities for new insights and
+analytics. You can use {transforms} to persistently index your aggregation
 results into entity-centric indices.

docs/reference/docs/get.asciidoc

+5 -1
@@ -241,7 +241,11 @@ The API returns the following result:
         "user" : "kimchy",
         "date" : "2009-11-15T14:12:12",
         "likes": 0,
-        "message" : "trying out Elasticsearch"
+        "message" : "trying out Elasticsearch",
+        "location" : {
+          "city": "Amsterdam",
+          "country": "Netherlands"
+        }
     }
 }
 --------------------------------------------------

docs/reference/modules/cross-cluster-search.asciidoc

+4 -6
@@ -76,7 +76,8 @@ GET /cluster_one:twitter/_search
     "match": {
       "user": "kimchy"
     }
-  }
+  },
+  "_source": ["user", "message", "likes"]
 }
 --------------------------------------------------
 // TEST[continued]
@@ -113,7 +114,6 @@ The API returns the following response:
         "_score": 1,
         "_source": {
           "user": "kimchy",
-          "date": "2009-11-15T14:12:12",
           "message": "trying out Elasticsearch",
           "likes": 0
         }
@@ -147,7 +147,8 @@ GET /twitter,cluster_one:twitter,cluster_two:twitter/_search
     "match": {
       "user": "kimchy"
     }
-  }
+  },
+  "_source": ["user", "message", "likes"]
 }
 --------------------------------------------------
 // TEST[continued]
@@ -184,7 +185,6 @@ The API returns the following response:
         "_score": 2,
         "_source": {
           "user": "kimchy",
-          "date": "2009-11-15T14:12:12",
           "message": "trying out Elasticsearch",
           "likes": 0
         }
@@ -195,7 +195,6 @@ The API returns the following response:
         "_score": 1,
         "_source": {
           "user": "kimchy",
-          "date": "2009-11-15T14:12:12",
           "message": "trying out Elasticsearch",
           "likes": 0
         }
@@ -206,7 +205,6 @@ The API returns the following response:
         "_score": 1,
         "_source": {
           "user": "kimchy",
-          "date": "2009-11-15T14:12:12",
           "message": "trying out Elasticsearch",
           "likes": 0
         }
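
Note (illustration, not part of the commit): the hunks above add `"_source": ["user", "message", "likes"]` to the requests and drop `date` from the documented responses. The effect of such a filter on the sample document can be sketched in Python as follows. The helper `filter_source` is hypothetical and handles only exact top-level field names; it is an assumption-laden simplification, since real source filtering in {es} also accepts wildcard patterns and `includes`/`excludes` objects.

```python
def filter_source(source, includes):
    """Keep only the top-level keys listed in `includes` (exact names only)."""
    wanted = set(includes)
    return {key: value for key, value in source.items() if key in wanted}


if __name__ == "__main__":
    # The first test document as indexed by the updated setupTwitter closure.
    document = {
        "user": "kimchy",
        "message": "trying out Elasticsearch",
        "date": "2009-11-15T14:12:12",
        "likes": 0,
        "location": {"city": "Amsterdam", "country": "Netherlands"},
    }
    print(filter_source(document, ["user", "message", "likes"]))
```

With the filter in place, neither `date` nor the newly added `location` object appears in the returned `_source`, which matches the trimmed responses shown in the diff.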

docs/reference/search/search-fields.asciidoc

+193 -16
@@ -4,33 +4,212 @@

 By default, each hit in the search response includes the document
 <<mapping-source-field,`_source`>>, which is the entire JSON object that was
-provided when indexing the document. If you only need certain source fields in
-the search response, you can use the <<source-filtering,source filtering>> to
-restrict what parts of the source are returned.
+provided when indexing the document. To retrieve specific fields in the search
+response, you can use the `fields` parameter:

-Returning fields using only the document source has some limitations:
+[source,console]
+----
+POST twitter/_search
+{
+  "query": {
+    "match": {
+      "message": "elasticsearch"
+    }
+  },
+  "fields": ["user", "date"],
+  "_source": false
+}
+----
+// TEST[setup:twitter]

-* The `_source` field does not include <<multi-fields, multi-fields>> or
-<<alias, field aliases>>. Likewise, a field in the source does not contain
-values copied using the <<copy-to,`copy_to`>> mapping parameter.
-* Since the `_source` is stored as a single field in Lucene, the whole source
-object must be loaded and parsed, even if only a small number of fields are
-needed.
+The `fields` parameter consults both a document's `_source` and the index
+mappings to load and return values. Because it makes use of the mappings,
+`fields` has some advantages over referencing the `_source` directly: it
+accepts <<multi-fields, multi-fields>> and <<alias, field aliases>>, and
+also formats field values like dates in a consistent way.

-To avoid these limitations, you can:
+A document's `_source` is stored as a single field in Lucene. So the whole
+`_source` object must be loaded and parsed even if only a small number of
+fields are requested. To avoid this limitation, you can try another option for
+loading fields:

 * Use the <<docvalue-fields, `docvalue_fields`>>
 parameter to get values for selected fields. This can be a good
 choice when returning a fairly small number of fields that support doc values,
 such as keywords and dates.
-* Use the <<request-body-search-stored-fields, `stored_fields`>> parameter to get the values for specific stored fields. (Fields that use the <<mapping-store,`store`>> mapping option.)
+* Use the <<request-body-search-stored-fields, `stored_fields`>> parameter to
+get the values for specific stored fields (fields that use the
+<<mapping-store,`store`>> mapping option).

-You can find more detailed information on each of these methods in the
+You can find more detailed information on each of these methods in the
 following sections:

-* <<source-filtering>>
+* <<search-fields-param>>
 * <<docvalue-fields>>
 * <<stored-fields>>
+* <<source-filtering>>
+
+[discrete]
+[[search-fields-param]]
+=== Fields
+
+The `fields` parameter allows for retrieving a list of document fields in
+the search response. It consults both the document `_source` and the index
+mappings to return each value in a standardized way that matches its mapping
+type. By default, date fields are formatted according to the
+<<mapping-date-format,date format>> parameter in their mappings.
+
+.*Example*
+[%collapsible]
+====
+The following search request uses the `fields` parameter to retrieve values
+for the `user` field, all fields starting with `location.`, and the
+`date` field:
+
+[source,console]
+----
+POST twitter/_search
+{
+  "query": {
+    "match": {
+      "message": "elasticsearch"
+    }
+  },
+  "fields": [
+    "user",
+    "location.*", <1>
+    {
+      "field": "date",
+      "format": "epoch_millis" <2>
+    }
+  ],
+  "_source": false
+}
+----
+// TEST[continued]
+
+<1> Both full field names and wildcard patterns are accepted.
+<2> Using object notation, you can pass a `format` parameter to apply a custom
+format for the field's values. This is currently supported for
+<<date,`date` fields>> and <<date_nanos, `date_nanos` fields>>, which
+accept a <<mapping-date-format,date format>>.
+
+The values are returned as a flat list in the `fields` section in each hit:
+
+[source,console-result]
+----
+{
+  "took" : 2,
+  "timed_out" : false,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  },
+  "hits" : {
+    "total" : {
+      "value" : 1,
+      "relation" : "eq"
+    },
+    "max_score" : 1.0,
+    "hits" : [
+      {
+        "_index" : "twitter",
+        "_id" : "0",
+        "_score" : 1.0,
+        "fields" : {
+          "user" : [
+            "kimchy"
+          ],
+          "date" : [
+            "1258294332000"
+          ],
+          "location.city": [
+            "Amsterdam"
+          ],
+          "location.country": [
+            "Netherlands"
+          ]
+        }
+      }
+    ]
+  }
+}
+----
+// TESTRESPONSE[s/"took" : 2/"took": $body.took/]
+// TESTRESPONSE[s/"max_score" : 1.0/"max_score" : $body.hits.max_score/]
+// TESTRESPONSE[s/"_score" : 1.0/"_score" : $body.hits.hits.0._score/]
+
+Only leaf fields are returned -- `fields` does not allow for fetching entire
+objects.
+
+====
+
+The `fields` parameter handles field types like <<alias, field aliases>> and
+<<constant-keyword, `constant_keyword`>> whose values aren't always present in
+the `_source`. Other mapping options are also respected, including
+<<ignore-above, `ignore_above`>>, <<ignore-malformed, `ignore_malformed`>> and
+<<null-value, `null_value`>>.
+
+[discrete]
+[[docvalue-fields]]
+=== Doc value fields
+
+You can use the <<docvalue-fields,`docvalue_fields`>> parameter to return
+<<doc-values,doc values>> for one or more fields in the search response.
+
+Doc values store the same values as the `_source` but in an on-disk,
+column-based structure that's optimized for sorting and aggregations. Since each
+field is stored separately, {es} only reads the field values that were requested
+and can avoid loading the whole document `_source`.
+
+Doc values are stored for supported fields by default. However, doc values are
+not supported for <<text,`text`>> or
+{plugins}/mapper-annotated-text-usage.html[`text_annotated`] fields.
+
+.*Example*
+[%collapsible]
+====
+The following search request uses the `docvalue_fields` parameter to retrieve
+doc values for the `user` field, all fields starting with `location.`, and the
+`date` field:
+
+
+[source,console]
+----
+GET twitter/_search
+{
+  "query": {
+    "match": {
+      "message": "elasticsearch"
+    }
+  },
+  "docvalue_fields": [
+    "user",
+    "location.*", <1>
+    {
+      "field": "date",
+      "format": "epoch_millis" <2>
+    }
+  ]
+}
+----
+// TEST[continued]
+
+<1> Both full field names and wildcard patterns are accepted.
+<2> Using object notation, you can pass a `format` parameter to apply a custom
+format for the field's doc values. <<date,Date fields>> support a
+<<mapping-date-format,date `format`>>. <<number,Numeric fields>> support a
+https://docs.oracle.com/javase/8/docs/api/java/text/DecimalFormat.html[DecimalFormat
+pattern]. Other field datatypes do not support the `format` parameter.
+====
+
+TIP: You cannot use the `docvalue_fields` parameter to retrieve doc values for
+nested objects. If you specify a nested object, the search returns an empty
+array (`[ ]`) for the field. To access nested fields, use the
+<<request-body-search-inner-hits, `inner_hits`>> parameter's `docvalue_fields`
+property.

 [discrete]
 [[source-filtering]]
@@ -122,7 +301,6 @@ GET /_search
 ----
 ====

-
 [discrete]
 [[docvalue-fields]]
 === Doc value fields
@@ -184,7 +362,6 @@ array (`[ ]`) for the field. To access nested fields, use the
 <<request-body-search-inner-hits, `inner_hits`>> parameter's `docvalue_fields`
 property.

-
 [discrete]
 [[stored-fields]]
 === Stored fields
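
Note (illustration, not part of the commit): the new `fields` documentation says that values come back as a flat list per leaf field and that `date` can be formatted as `epoch_millis`. The Python sketch below reproduces the shape of the documented `fields` section for the sample document. It is not how {es} implements the feature: the helpers `leaf_fields` and `to_epoch_millis` are hypothetical, mappings are ignored, arrays of objects are not handled, and the zone-less date string is assumed to be UTC.

```python
from datetime import datetime, timezone


def leaf_fields(source, prefix=""):
    """Flatten nested objects into dotted leaf names, each mapped to a list."""
    flat = {}
    for key, value in source.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Objects are never returned whole; recurse down to their leaves.
            flat.update(leaf_fields(value, prefix=f"{path}."))
        elif isinstance(value, list):
            flat[path] = list(value)
        else:
            flat[path] = [value]
    return flat


def to_epoch_millis(iso_date):
    """Treat a zone-less ISO-8601 date as UTC; return epoch millis as a string."""
    parsed = datetime.strptime(iso_date, "%Y-%m-%dT%H:%M:%S")
    parsed = parsed.replace(tzinfo=timezone.utc)
    return str(int(parsed.timestamp() * 1000))


if __name__ == "__main__":
    document = {
        "user": "kimchy",
        "date": "2009-11-15T14:12:12",
        "location": {"city": "Amsterdam", "country": "Netherlands"},
    }
    fields = leaf_fields(document)
    fields["date"] = [to_epoch_millis(value) for value in fields["date"]]
    print(fields)
```

For the sample document this yields `location.city` and `location.country` as separate entries rather than a single `location` object, and the date converts to the `1258294332000` value shown in the documented response.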
