Commit d77388f

Adam Locke and jrodewig authored
[DOCS] Add links to flattened datatype (#56794)
* Changes for #52239.
* Incorporating review feedback from Julie T. Also single-sourcing nested options in the Mapping page and referencing them in the Nested page.
* Moving tip after the introduction and clarifying limits.
* Update docs/reference/mapping.asciidoc
  Co-authored-by: James Rodewig <[email protected]>
* Update docs/reference/mapping/types/nested.asciidoc
  Co-authored-by: James Rodewig <[email protected]>

Co-authored-by: James Rodewig <[email protected]>
1 parent b8a4e00 commit d77388f

3 files changed, 57 additions and 48 deletions

docs/reference/ingest/processors/kv.asciidoc

Lines changed: 5 additions & 4 deletions
@@ -1,9 +1,8 @@
11
[[kv-processor]]
22
=== KV Processor
3-
This processor helps automatically parse messages (or specific event fields) which are of the foo=bar variety.
4-
5-
For example, if you have a log message which contains `ip=1.2.3.4 error=REFUSED`, you can parse those automatically by configuring:
3+
This processor helps automatically parse messages (or specific event fields) which are of the `foo=bar` variety.
64

5+
For example, if you have a log message which contains `ip=1.2.3.4 error=REFUSED`, you can parse those fields automatically by configuring:
76

87
[source,js]
98
--------------------------------------------------
@@ -17,8 +16,10 @@ For example, if you have a log message which contains `ip=1.2.3.4 error=REFUSED`
1716
--------------------------------------------------
1817
// NOTCONSOLE
1918

19+
TIP: Using the KV Processor can result in field names that you cannot control. Consider using the <<flattened>> datatype instead, which maps an entire object as a single field and allows for simple searches over its contents.
20+
2021
[[kv-options]]
21-
.Kv Options
22+
.KV Options
2223
[options="header"]
2324
|======
2425
| Name | Required | Default | Description
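
To make the `foo=bar` parsing described in this file concrete, here is a minimal Python sketch of the idea. It is not the Elasticsearch implementation: the function name `parse_kv` and the parameters `pair_sep` and `kv_sep` are hypothetical, and the sketch uses plain string separators only.

```python
def parse_kv(message, pair_sep=" ", kv_sep="="):
    """Split a message into key-value fields (illustrative helper, not an Elasticsearch API)."""
    fields = {}
    for pair in message.split(pair_sep):
        # Skip tokens that do not contain the key-value separator.
        if kv_sep not in pair:
            continue
        # Split only on the first separator so values may contain it.
        key, value = pair.split(kv_sep, 1)
        fields[key] = value
    return fields


parsed = parse_kv("ip=1.2.3.4 error=REFUSED")
print(parsed)
```

Every distinct key in the input becomes its own field name, which is the uncontrolled field growth that the added TIP warns about.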

docs/reference/mapping.asciidoc

Lines changed: 27 additions & 17 deletions
@@ -17,7 +17,7 @@ A mapping definition has:
1717

1818
<<mapping-fields,Meta-fields>>::
1919

20-
Meta-fields are used to customize how a document's metadata associated is
20+
Meta-fields are used to customize how a document's associated metadata is
2121
treated. Examples of meta-fields include the document's
2222
<<mapping-index-field,`_index`>>, <<mapping-id-field,`_id`>>, and
2323
<<mapping-source-field,`_source`>> fields.
@@ -58,17 +58,16 @@ via the <<multi-fields>> parameter.
5858
[float]
5959
=== Settings to prevent mappings explosion
6060

61-
Defining too many fields in an index is a condition that can lead to a
61+
Defining too many fields in an index can lead to a
6262
mapping explosion, which can cause out of memory errors and difficult
63-
situations to recover from. This problem may be more common than expected.
64-
As an example, consider a situation in which every new document inserted
65-
introduces new fields. This is quite common with dynamic mappings.
66-
Every time a document contains new fields, those will end up in the index's
67-
mappings. This isn't worrying for a small amount of data, but it can become a
63+
situations to recover from.
64+
65+
Consider a situation where every new document inserted
66+
introduces new fields, such as with <<dynamic-mapping,dynamic mapping>>.
67+
Each new field is added to the index mapping, which can become a
6868
problem as the mapping grows.
69-
The following settings allow you to limit the number of field mappings that
70-
can be created manually or dynamically, in order to prevent bad documents from
71-
causing a mapping explosion:
69+
70+
Use the following settings to limit the number of field mappings (created manually or dynamically) and prevent documents from causing a mapping explosion:
7271

7372
`index.mapping.total_fields.limit`::
7473
The maximum number of fields in an index. Field and object mappings, as well as
@@ -84,26 +83,37 @@ If you increase this setting, we recommend you also increase the
8483
<<search-settings,`indices.query.bool.max_clause_count`>> setting, which
8584
limits the maximum number of <<query-dsl-bool-query,boolean clauses>> in a query.
8685
====
86+
+
87+
[TIP]
88+
====
89+
If your field mappings contain a large, arbitrary set of keys, consider using the <<flattened,flattened>> datatype.
90+
====
8791

8892
`index.mapping.depth.limit`::
8993
The maximum depth for a field, which is measured as the number of inner
9094
objects. For instance, if all fields are defined at the root object level,
9195
then the depth is `1`. If there is one object mapping, then the depth is
92-
`2`, etc. The default is `20`.
96+
`2`, etc. Default is `20`.
9397

98+
// tag::nested-fields-limit[]
9499
`index.mapping.nested_fields.limit`::
95-
The maximum number of distinct `nested` mappings in an index, defaults to `50`.
100+
The maximum number of distinct `nested` mappings in an index. The `nested` type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting
101+
limits the number of unique `nested` types per index. Default is `50`.
102+
// end::nested-fields-limit[]
96103

104+
// tag::nested-objects-limit[]
97105
`index.mapping.nested_objects.limit`::
98-
The maximum number of `nested` JSON objects within a single document across
99-
all nested types, defaults to 10000.
106+
The maximum number of nested JSON objects that a single document can contain across all
107+
`nested` types. This limit helps to prevent out of memory errors when a document contains too many nested
108+
objects. Default is `10000`.
109+
// end::nested-objects-limit[]
100110

101111
`index.mapping.field_name_length.limit`::
102-
Setting for the maximum length of a field name. The default value is
103-
Long.MAX_VALUE (no limit). This setting isn't really something that addresses
112+
Setting for the maximum length of a field name. This setting isn't really something that addresses
104113
mappings explosion but might still be useful if you want to limit the field length.
105114
It usually shouldn't be necessary to set this setting. The default is okay
106-
unless a user starts to add a huge number of fields with really long names.
115+
unless a user starts to add a huge number of fields with really long names. Default is
116+
`Long.MAX_VALUE` (no limit).
107117

108118
[float]
109119
== Dynamic mapping

docs/reference/mapping/types/nested.asciidoc

Lines changed: 25 additions & 27 deletions
@@ -5,14 +5,17 @@
55
++++
66

77
The `nested` type is a specialised version of the <<object,`object`>> datatype
8-
that allows arrays of objects to be indexed in a way that they can be queried
8+
that allows arrays of objects to be indexed in a way that they can be queried
99
independently of each other.
1010

11+
TIP: When ingesting key-value pairs with a large, arbitrary set of keys, you might consider modeling each key-value pair as its own nested document with `key` and `value` fields. Instead, consider using the <<flattened,flattened>> datatype, which maps an entire object as a single field and allows for simple searches over its contents.
12+
Nested documents and queries are typically expensive, so using the `flattened` datatype for this use case is a better option.
13+
14+
[[nested-arrays-flattening-objects]]
1115
==== How arrays of objects are flattened
1216

13-
Arrays of inner <<object,`object` fields>> do not work the way you may expect.
14-
Lucene has no concept of inner objects, so Elasticsearch flattens object
15-
hierarchies into a simple list of field names and values. For instance, the
17+
Elasticsearch has no concept of inner objects. Therefore, it flattens object
18+
hierarchies into a simple list of field names and values. For instance, consider the
1619
following document:
1720

1821
[source,console]
@@ -35,7 +38,7 @@ PUT my_index/_doc/1
3538

3639
<1> The `user` field is dynamically added as a field of type `object`.
3740

38-
would be transformed internally into a document that looks more like this:
41+
The previous document would be transformed internally into a document that looks more like this:
3942

4043
[source,js]
4144
--------------------------------------------------
@@ -71,10 +74,12 @@ GET my_index/_search
7174
==== Using `nested` fields for arrays of objects
7275

7376
If you need to index arrays of objects and to maintain the independence of
74-
each object in the array, you should use the `nested` datatype instead of the
75-
<<object,`object`>> datatype. Internally, nested objects index each object in
77+
each object in the array, use the `nested` datatype instead of the
78+
<<object,`object`>> datatype.
79+
80+
Internally, nested objects index each object in
7681
the array as a separate hidden document, meaning that each nested object can be
77-
queried independently of the others, with the <<query-dsl-nested-query,`nested` query>>:
82+
queried independently of the others with the <<query-dsl-nested-query,`nested` query>>:
7883

7984
[source,console]
8085
--------------------------------------------------
@@ -152,6 +157,8 @@ GET my_index/_search
152157
<4> `inner_hits` allow us to highlight the matching nested documents.
153158

154159

160+
[[nested-accessing-documents]]
161+
==== Interacting with `nested` documents
155162
Nested documents can be:
156163

157164
* queried with the <<query-dsl-nested-query,`nested`>> query.
@@ -207,29 +214,20 @@ document as standard (flat) fields. Defaults to `false`.
207214
[float]
208215
=== Limits on `nested` mappings and objects
209216

210-
As described earlier, each nested object is indexed as a separate document under the hood.
211-
Continuing with the example above, if we indexed a single document containing 100 `user` objects,
212-
then 101 Lucene documents would be created -- one for the parent document, and one for each
217+
As described earlier, each nested object is indexed as a separate Lucene document.
218+
Continuing with the previous example, if we indexed a single document containing 100 `user` objects,
219+
then 101 Lucene documents would be created: one for the parent document, and one for each
213220
nested object. Because of the expense associated with `nested` mappings, Elasticsearch puts
214221
settings in place to guard against performance problems:
215222

216-
`index.mapping.nested_fields.limit`::
217-
218-
The `nested` type should only be used in special cases, when arrays of objects need to be
219-
queried independently of each other. To safeguard against poorly designed mappings, this setting
220-
limits the number of unique `nested` types per index. In our example, the `user` mapping would
221-
count as only 1 towards this limit. Defaults to 50.
222-
223-
`index.mapping.nested_objects.limit`::
224-
225-
This setting limits the number of nested objects that a single document may contain across all
226-
`nested` types, in order to prevent out of memory errors when a document contains too many nested
227-
objects. To illustrate how the setting works, say we added another `nested` type called `comments`
228-
to our example mapping above. Then for each document, the combined number of `user` and `comment`
229-
objects it contains must be below the limit. Defaults to 10000.
223+
include::{docdir}/mapping.asciidoc[tag=nested-fields-limit]
230224

231-
Additional background on these settings, including information on their default values, can be found
232-
in <<mapping-limit-settings>>.
225+
In the previous example, the `user` mapping would count as only 1 towards this limit.
233226

227+
include::{docdir}/mapping.asciidoc[tag=nested-objects-limit]
234228

229+
To illustrate how this setting works, consider adding another `nested` type called `comments`
230+
to the previous example mapping. For each document, the combined number of `user` and `comment`
231+
objects it contains must be below the limit.
235232

233+
See <<mapping-limit-settings>> regarding additional settings for preventing mappings explosion.
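
The two behaviors this file describes, flattening of object arrays and the one-hidden-document-per-nested-object count, can be sketched in Python. This is an illustration under stated assumptions, not Elasticsearch code: the helper names `flatten_object_arrays` and `count_lucene_docs` are hypothetical, the sample values are made up for the example, and the count only considers top-level `nested` fields.

```python
def flatten_object_arrays(doc, prefix=""):
    """Collapse inner objects into dotted field names holding lists of values."""
    flat = {}
    for key, value in doc.items():
        path = prefix + key
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                # Merge values from every object in the array under the same dotted name.
                for sub_path, sub_values in flatten_object_arrays(item, path + ".").items():
                    flat.setdefault(sub_path, []).extend(sub_values)
            else:
                flat.setdefault(path, []).append(item)
    return flat


def count_lucene_docs(doc, nested_fields):
    """One document for the parent plus one per object under each top-level nested field."""
    return 1 + sum(len(doc.get(field, [])) for field in nested_fields)


# Hypothetical sample document with a `user` array of objects.
sample = {
    "group": "fans",
    "user": [
        {"first": "John", "last": "Smith"},
        {"first": "Alice", "last": "White"},
    ],
}
print(flatten_object_arrays(sample))

# A document with 100 `user` objects mapped as `nested`.
big = {"user": [{"first": "u%d" % i} for i in range(100)]}
print(count_lucene_docs(big, ["user"]))
```

After flattening, `user.first` and `user.last` are independent value lists, so the pairing between a given first and last name is lost. The second helper reproduces the arithmetic from the text: 100 `user` objects plus the parent give 101 Lucene documents.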
