Commit 9fb427d

Add documentation for JSON fields. (#35281)
* Add documentation for JSON fields.
1 parent 963d6c8 commit 9fb427d

2 files changed, 200 additions and 0 deletions

docs/reference/mapping/types.asciidoc

Lines changed: 4 additions & 0 deletions
@@ -44,6 +44,8 @@ string:: <<text,`text`>> and <<keyword,`keyword`>>
 
 <<alias>>:: Defines an alias to an existing field.
 
+<<json>>:: Allows an entire JSON object to be indexed as a single field.
+
 <<rank-feature>>:: Record numeric feature to boost hits at query time.
 
 <<rank-features>>:: Record numeric features to boost hits at query time.
@@ -87,6 +89,8 @@ include::types/geo-shape.asciidoc[]
 
 include::types/ip.asciidoc[]
 
+include::types/json.asciidoc[]
+
 include::types/keyword.asciidoc[]
 
 include::types/nested.asciidoc[]
Lines changed: 196 additions & 0 deletions
@@ -0,0 +1,196 @@
[[json]]
=== JSON datatype

experimental[The `json` field type is experimental and may be changed in a breaking way in future releases.]

By default, each subfield in an object is mapped and indexed separately. If
the names or types of the subfields are not known in advance, then they are
<<dynamic-mapping, mapped dynamically>>.

The `json` type provides an alternative approach, where the entire object is
mapped as a single field. Given an object, the `json` mapping will parse out
its leaf values and index them into one field. The object's contents can then
be searched through simple keyword-style queries.

This data type can be useful for indexing objects with a very large number of
distinct keys. Compared to mapping each field separately, `json` fields have
the following advantages:

- Only one field mapping is created for the whole object, which can help
  prevent a <<mapping-limit-settings, mappings explosion>> due to a large
  number of field mappings.
- A `json` field may take up less space in the index, as only one underlying
  field is created.

However, `json` fields present a trade-off in terms of search functionality.
Only basic queries are allowed, with no support for numeric range queries or
aggregations. Further information on the limitations can be found in the
<<supported-operations, Supported operations>> section.

NOTE: The `json` mapping type should **not** be used for indexing all JSON
content, as it provides only limited search functionality. The default
approach, where each subfield has its own entry in the mappings, works well in
the majority of cases.

A `json` field can be created as follows:
[source,js]
--------------------------------
PUT bug_reports
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text"
      },
      "labels": {
        "type": "json"
      }
    }
  }
}

POST bug_reports/_doc/1
{
  "title": "Results are not sorted correctly.",
  "labels": {
    "priority": "urgent",
    "release": ["v1.2.5", "v1.3.0"],
    "timestamp": {
      "created": 1541458026,
      "closed": 1541457010
    }
  }
}
--------------------------------
// CONSOLE
// TESTSETUP

During indexing, tokens are created for each leaf value in the JSON object. The
values are indexed as string keywords, without analysis or special handling for
numbers or dates.

Querying the top-level `json` field searches all leaf values in the object:
[source,js]
--------------------------------
POST bug_reports/_search
{
  "query": {
    "term": {"labels": "urgent"}
  }
}
--------------------------------
// CONSOLE

To query on a specific key in the JSON object, object dot notation is used:
[source,js]
--------------------------------
POST bug_reports/_search
{
  "query": {
    "term": {"labels.release": "v1.3.0"}
  }
}
--------------------------------
// CONSOLE
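
Keys of nested objects should be addressable in the same way, by chaining the key names with dots. The following sketch assumes the `bug_reports` index and document created earlier. Because leaf values are indexed as string keywords, the numeric timestamp is matched as a term rather than as a number:

[source,js]
--------------------------------
POST bug_reports/_search
{
  "query": {
    "term": {"labels.timestamp.created": "1541458026"}
  }
}
--------------------------------
// CONSOLE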

[[supported-operations]]
==== Supported operations

Currently, `json` fields can be used with the following query types:

- `term`, `terms`, and `terms_set`
- `prefix`
- `range`
- `match` and `multi_match`
- `query_string` and `simple_query_string`
- `exists`
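
As an illustration of one of the query types listed above, a `prefix` query on a single key might look like the following. This is a sketch that assumes the `bug_reports` index and document from the earlier example:

[source,js]
--------------------------------
POST bug_reports/_search
{
  "query": {
    "prefix": {"labels.release": "v1.3"}
  }
}
--------------------------------
// CONSOLE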

When querying, it is not possible to refer to field keys using wildcards, as in
`{ "term": {"labels.time*": 1541457010}}`. Note that all queries, including
`range`, treat the values as string keywords.
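
Because values are compared as strings, a `range` query follows lexicographic order rather than numeric order. The sketch below, which assumes the `bug_reports` document indexed earlier, behaves like a numeric range only because all of the timestamps involved have the same number of digits. With values of differing lengths the results can be surprising: as strings, `"10"` sorts before `"9"`.

[source,js]
--------------------------------
POST bug_reports/_search
{
  "query": {
    "range": {
      "labels.timestamp.created": {
        "gte": "1541458000",
        "lte": "1541459000"
      }
    }
  }
}
--------------------------------
// CONSOLE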

Aggregating, highlighting, or sorting on a `json` field is not supported.

Finally, because of the way leaf values are stored in the index, the null
character `\0` is not allowed to appear in the keys of the JSON object.

[[stored-fields]]
==== Stored fields

If the <<mapping-store,`store`>> option is enabled, the entire JSON object will
be stored in pretty-printed format. It can be retrieved through the top-level
`json` field:

[source,js]
--------------------------------
POST bug_reports/_search
{
  "query": { "match": { "title": "results not sorted" }},
  "stored_fields": ["labels"]
}
--------------------------------
// CONSOLE

Field keys cannot be used to load stored content. For example, specifying
`"stored_fields": ["labels.timestamp"]` will return an empty list.
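
For reference, a mapping that enables `store` on a `json` field might look like the following sketch. The index name `bug_reports_stored` is hypothetical and used only for illustration:

[source,js]
--------------------------------
PUT bug_reports_stored
{
  "mappings": {
    "properties": {
      "labels": {
        "type": "json",
        "store": true
      }
    }
  }
}
--------------------------------
// CONSOLE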

[[json-params]]
==== Parameters for JSON fields

Because of the similarities in the way values are indexed, the `json` type
shares many mapping options with <<keyword, `keyword`>>. The following
parameters are accepted:

[horizontal]

<<mapping-boost,`boost`>>::

Mapping field-level query time boosting. Accepts a floating point number,
defaults to `1.0`.

`depth_limit`::

The maximum allowed depth of the JSON field, in terms of nested inner
objects. If a JSON field exceeds this limit, then an error will be
thrown. Defaults to `20`.

<<ignore-above,`ignore_above`>>::

Leaf values longer than this limit will not be indexed. By default, there
is no limit and all values will be indexed. Note that this limit applies
to the leaf values within the JSON field, and not the length of the entire
field.

<<mapping-index,`index`>>::

Determines if the field should be searchable. Accepts `true` (default) or
`false`.

<<index-options,`index_options`>>::

What information should be stored in the index for scoring purposes.
Defaults to `docs` but can also be set to `freqs` to take term frequency
into account when computing scores.

<<null-value,`null_value`>>::

A string value which is substituted for any explicit `null` values within
the JSON field. Defaults to `null`, which means null fields are treated as
if they were missing.

<<similarity,`similarity`>>::

Which scoring algorithm or _similarity_ should be used. Defaults
to `BM25`.

`split_queries_on_whitespace`::

Whether <<full-text-queries,full text queries>> should split the input on
whitespace when building a query for this field. Accepts `true` or `false`
(default).

<<mapping-store,`store`>>::

Whether the field value should be stored and retrievable separately from
the <<mapping-source-field,`_source`>> field. Accepts `true` or `false`
(default).
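
As a sketch of how several of these parameters combine, the mapping below sets a depth limit, a maximum leaf value length, and a substitute for explicit nulls. The index name `bug_reports_tuned` and the chosen values are illustrative assumptions, not recommendations:

[source,js]
--------------------------------
PUT bug_reports_tuned
{
  "mappings": {
    "properties": {
      "labels": {
        "type": "json",
        "depth_limit": 5,
        "ignore_above": 256,
        "null_value": "NULL"
      }
    }
  }
}
--------------------------------
// CONSOLE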
