Add support for dots in field names for metrics use cases #86166

Merged
merged 33 commits into from
May 17, 2022
e4c9458
Add support for dots in field names for metrics usecases
javanna Apr 25, 2022
5d34e0f
Merge branch 'main' into enhancement/collapsed_objects
javanna Apr 26, 2022
af1d3fa
checkstyle
javanna Apr 26, 2022
64cb72b
iter
javanna Apr 26, 2022
b96ae79
add test
javanna Apr 26, 2022
8d47bfe
spotless
javanna Apr 26, 2022
c6c3b2c
Merge branch 'main' into enhancement/collapsed_objects
javanna Apr 26, 2022
f825e00
more tests
javanna Apr 26, 2022
7d12ca7
spotless
javanna Apr 26, 2022
227fe53
iter
javanna Apr 26, 2022
6175fc2
iter
javanna Apr 27, 2022
9761211
spotless
javanna Apr 28, 2022
33d570f
docs
javanna Apr 28, 2022
5034bc1
Update docs/changelog/86166.yaml
javanna Apr 28, 2022
4ebdd8f
Update docs/changelog/86166.yaml
javanna Apr 28, 2022
7a6143d
Update docs/changelog/86166.yaml
javanna Apr 28, 2022
3acbdfe
iter
javanna Apr 28, 2022
f48a4a1
add array of objects tests
javanna Apr 28, 2022
eab7d83
typo
javanna Apr 28, 2022
4ddb176
rename
javanna Apr 28, 2022
a5b5f9c
line length
javanna Apr 28, 2022
4df6111
changelog
javanna Apr 28, 2022
1430aff
rename leftover
javanna Apr 28, 2022
17bc4dd
spotless
javanna Apr 28, 2022
e5b0e20
Merge branch 'main' into enhancement/collapsed_objects
javanna May 16, 2022
ef89e05
iter
javanna May 16, 2022
df40355
changelog
javanna May 16, 2022
9e48d1b
iter
javanna May 16, 2022
cd3a6e5
iter
javanna May 16, 2022
cc7b420
add yaml test
javanna May 16, 2022
1ebb167
add test for synthetic source
javanna May 17, 2022
153a150
add another test for synthetic source
javanna May 17, 2022
6db29eb
update changelog
javanna May 17, 2022
48 changes: 48 additions & 0 deletions docs/changelog/86166.yaml
@@ -0,0 +1,48 @@
pr: 86166
summary: Add support for dots in field names for metrics use cases
area: Mapping
type: feature
issues:
 - 63530
highlight:
  title: Add support for dots in field names for metrics use cases
  body: |-
    Metrics data is often made up of several fields with dots in their names
    that share common prefixes, as in the following example:

    ```
    {
      "metrics.time" : 10,
      "metrics.time.min" : 1,
      "metrics.time.max" : 500
    }
    ```

    Such a format causes a mapping conflict because `metrics.time` holds a value
    but also needs to be mapped as an object in order to hold the `min` and
    `max` leaf fields.

    A new object mapping parameter called `subobjects`, which defaults to `true`,
    has been introduced. Setting it to `false` preserves dots in field names: an
    object with `subobjects` set to `false` can only ever hold leaf sub-fields and
    no further objects. The following example shows how it can be configured in
    the mappings for the `metrics` object:

    ```
    {
      "mappings": {
        "properties" : {
          "metrics" : {
            "type" : "object",
            "subobjects" : false
          }
        }
      }
    }
    ```

    With this configuration any child of `metrics` will be mapped unchanged,
    without expanding dots in field names to the corresponding object structure.
    That makes it possible to store the metrics document above.

  notable: true
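
The conflict described in the changelog can be illustrated with a small, self-contained sketch. It is not code from this PR: `DottedFieldConflicts` and `findConflicts` are hypothetical names, and the check is a simplification that only looks at field names. It reports every name that would have to be both a leaf and an object once dots are expanded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Elasticsearch: it only inspects field names.
final class DottedFieldConflicts {

    // Returns the names that hold a value and are also a dotted prefix of another name.
    static List<String> findConflicts(List<String> fieldNames) {
        Set<String> names = new TreeSet<>(fieldNames);
        List<String> conflicts = new ArrayList<>();
        for (String candidate : names) {
            String prefix = candidate + ".";
            for (String other : names) {
                if (other.startsWith(prefix)) {
                    // candidate is a leaf, but expanding `other` would need it to be an object
                    conflicts.add(candidate);
                    break;
                }
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        // prints [metrics.time]
        System.out.println(findConflicts(List.of("metrics.time", "metrics.time.min", "metrics.time.max")));
    }
}
```

With `subobjects` set to `false` on `metrics`, no expansion takes place below that object, so the situation this sketch flags no longer arises there.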
3 changes: 3 additions & 0 deletions docs/reference/mapping/params.asciidoc
@@ -31,6 +31,7 @@ The following mapping parameters are common to some or all field data types:
 * <<properties,`properties`>>
 * <<search-analyzer,`search_analyzer`>>
 * <<similarity,`similarity`>>
+* <<subobjects,`subobjects`>>
 * <<mapping-store,`store`>>
 * <<term-vector,`term_vector`>>

@@ -83,4 +84,6 @@ include::params/similarity.asciidoc[]

 include::params/store.asciidoc[]

+include::params/subobjects.asciidoc[]
+
 include::params/term-vector.asciidoc[]
107 changes: 107 additions & 0 deletions docs/reference/mapping/params/subobjects.asciidoc
@@ -0,0 +1,107 @@
[[subobjects]]
=== `subobjects`

When indexing a document or updating mappings, Elasticsearch accepts fields that contain dots in their names,
which get expanded to their corresponding object structure. For instance, the field `metrics.time.max`
is mapped as a `max` leaf field with a parent `time` object, belonging to its parent `metrics` object.

The default behaviour described above is reasonable for most scenarios, but causes problems in certain situations,
for instance when a field `metrics.time` holds a value too, which is common when indexing metrics data.
A document holding a value for both `metrics.time.max` and `metrics.time` gets rejected, given that `time`
would need to be both a leaf field to hold a value and an object to hold the `max` sub-field.

The `subobjects` setting, which can be applied only to the top-level mapping definition and
to <<object,`object`>> fields, disables the ability for an object to hold further subobjects and makes it possible
to store documents where field names contain dots and share common prefixes. In the example above, if the object
container `metrics` has `subobjects` set to `false`, it can hold values for both `time` and `time.max` directly
without the need for any intermediate object, as dots in field names are preserved.

[source,console]
--------------------------------------------------
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "metrics": {
        "type": "object",
        "subobjects": false <1>
      }
    }
  }
}

PUT my-index-000001/_doc/metric_1
{
  "metrics.time" : 100, <2>
  "metrics.time.min" : 10,
  "metrics.time.max" : 900
}

PUT my-index-000001/_doc/metric_2
{
  "metrics" : {
    "time" : 100, <3>
    "time.min" : 10,
    "time.max" : 900
  }
}

GET my-index-000001/_mapping
--------------------------------------------------

[source,console-result]
--------------------------------------------------
{
  "my-index-000001" : {
    "mappings" : {
      "properties" : {
        "metrics" : {
          "subobjects" : false,
          "properties" : {
            "time" : {
              "type" : "long"
            },
            "time.min" : { <4>
              "type" : "long"
            },
            "time.max" : {
              "type" : "long"
            }
          }
        }
      }
    }
  }
}
--------------------------------------------------

<1> The `metrics` field cannot hold other objects.
<2> Sample document holding flat paths.
<3> Sample document holding an object (configured to not hold subobjects) and its leaf sub-fields.
<4> The resulting mapping, in which dots in field names are preserved.

The entire mapping may be configured to not support subobjects as well, in which case the document can
only ever hold leaf sub-fields:

[source,console]
--------------------------------------------------
PUT my-index-000001
{
  "mappings": {
    "subobjects": false <1>
  }
}

PUT my-index-000001/_doc/metric_1
{
  "time" : "100ms", <2>
  "time.min" : "10ms",
  "time.max" : "900ms"
}

--------------------------------------------------

<1> The entire mapping is configured to not support objects.
<2> The document does not support objects, so it holds only leaf fields.

The `subobjects` setting for existing fields and the top-level mapping definition cannot be updated.
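
The rule described above can be sketched as follows. This is only an illustration: `FieldPathSketch`, `split` and the `leafObjects` set are hypothetical names rather than Elasticsearch APIs. The set holds the full paths of objects mapped with `subobjects: false`, with the empty string standing for the top-level mapping, and empty path segments are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of how a dotted field name resolves to path segments.
final class FieldPathSketch {

    // leafObjects: full paths of objects with subobjects set to false ("" = top-level mapping).
    static List<String> split(String fieldName, Set<String> leafObjects) {
        List<String> segments = new ArrayList<>();
        String parent = "";
        String remaining = fieldName;
        // Keep expanding dots until we enter an object that cannot hold subobjects.
        while (!leafObjects.contains(parent)) {
            int dot = remaining.indexOf('.');
            if (dot < 0) {
                break;
            }
            String head = remaining.substring(0, dot);
            segments.add(head);
            parent = parent.isEmpty() ? head : parent + "." + head;
            remaining = remaining.substring(dot + 1);
        }
        // Whatever is left is the leaf field name, dots included.
        segments.add(remaining);
        return segments;
    }

    public static void main(String[] args) {
        System.out.println(split("metrics.time.max", Set.of()));          // [metrics, time, max]
        System.out.println(split("metrics.time.max", Set.of("metrics"))); // [metrics, time.max]
        System.out.println(split("time.max", Set.of("")));                // [time.max]
    }
}
```

The three calls correspond to the default behaviour, the `metrics` object with `subobjects` set to `false`, and the top-level mapping with `subobjects` set to `false`.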
6 changes: 6 additions & 0 deletions docs/reference/mapping/types/object.asciidoc
@@ -90,6 +90,12 @@ The following parameters are accepted by `object` fields:
     Whether the JSON value given for the object field should be
     parsed and indexed (`true`, default) or completely ignored (`false`).

+<<subobjects,`subobjects`>>::
+
+    Whether the object can hold subobjects (`true`, default) or not (`false`). If not, sub-fields
+    with dots in their names are treated as leaves; otherwise, their field names
+    are expanded to their corresponding object structure.
+
 <<properties,`properties`>>::

     The fields within the object, which can be of any
@@ -0,0 +1,37 @@
---
"Metrics indexing":
  - skip:
      version: " - 8.2.99"
      reason: added in 8.3.0

  - do:
      indices.put_template:
        name: test
        body:
          index_patterns: test-*
          mappings:
            dynamic_templates:
              - no_subobjects:
                  match: metrics
                  mapping:
                    type: object
                    subobjects: false

  - do:
      index:
        index: test-1
        id: 1
        refresh: true
        body:
          { metrics.time: 10, metrics.time.max: 100, metrics.time.min: 1 }

  - do:
      field_caps:
        index: test-1
        fields: metrics.time*
  - match: {fields.metrics\.time.long.searchable: true}
  - match: {fields.metrics\.time.long.aggregatable: true}
  - match: {fields.metrics\.time\.max.long.searchable: true}
  - match: {fields.metrics\.time\.max.long.aggregatable: true}
  - match: {fields.metrics\.time\.min.long.searchable: true}
  - match: {fields.metrics\.time\.min.long.aggregatable: true}
@@ -20,6 +20,8 @@ public final class ContentPath {

     private String[] path = new String[10];

+    private boolean withinLeafObject = false;
+
     public ContentPath() {
         this(0);
     }
@@ -54,6 +56,14 @@ public void remove() {
         path[index--] = null;
     }

+    public void setWithinLeafObject(boolean withinLeafObject) {
+        this.withinLeafObject = withinLeafObject;
+    }
+
+    public boolean isWithinLeafObject() {
+        return withinLeafObject;
+    }
+
     public String pathAsText(String name) {
         sb.setLength(0);
         for (int i = offset; i < index; i++) {
@@ -445,7 +445,13 @@ private static void parseObject(final DocumentParserContext context, ObjectMappe
         Mapper objectMapper = getMapper(context, mapper, currentFieldName);
         if (objectMapper != null) {
             context.path().add(currentFieldName);
+            if (objectMapper instanceof ObjectMapper objMapper) {
+                if (objMapper.subobjects() == false) {
+                    context.path().setWithinLeafObject(true);
+                }
+            }
             parseObjectOrField(context, objectMapper);
+            context.path().setWithinLeafObject(false);
             context.path().remove();
         } else {
             parseObjectDynamic(context, mapper, currentFieldName);
@@ -474,7 +480,13 @@ private static void parseObjectDynamic(DocumentParserContext context, ObjectMapp
                 throwOnCreateDynamicNestedViaCopyTo(dynamicObjectMapper);
             }
             context.path().add(currentFieldName);
+            if (dynamicObjectMapper instanceof ObjectMapper objectMapper) {
+                if (objectMapper.subobjects() == false) {
+                    context.path().setWithinLeafObject(true);
+                }
+            }
             parseObjectOrField(context, dynamicObjectMapper);
+            context.path().setWithinLeafObject(false);
             context.path().remove();
         }
     }
@@ -789,7 +801,7 @@ protected String contentType() {

     private static class NoOpObjectMapper extends ObjectMapper {
         NoOpObjectMapper(String name, String fullPath) {
-            super(name, fullPath, Explicit.IMPLICIT_TRUE, Dynamic.RUNTIME, Collections.emptyMap());
+            super(name, fullPath, Explicit.IMPLICIT_TRUE, Explicit.IMPLICIT_TRUE, Dynamic.RUNTIME, Collections.emptyMap());
         }
     }

@@ -815,7 +827,11 @@ private static class InternalDocumentParserContext extends DocumentParserContext
             XContentParser parser
         ) throws IOException {
             super(mappingLookup, indexSettings, indexAnalyzers, parserContext, source);
-            this.parser = DotExpandingXContentParser.expandDots(parser);
+            if (mappingLookup.getMapping().getRoot().subobjects()) {
+                this.parser = DotExpandingXContentParser.expandDots(parser, this.path::isWithinLeafObject);
+            } else {
+                this.parser = parser;
+            }
             this.document = new LuceneDocument();
             this.documents.add(document);
             this.maxAllowedNumNestedDocs = indexSettings().getMappingNestedDocsLimit();
@@ -314,7 +314,7 @@ public LuceneDocument doc() {
      */
     public final DocumentParserContext createCopyToContext(String copyToField, LuceneDocument doc) throws IOException {
         ContentPath path = new ContentPath(0);
-        XContentParser parser = DotExpandingXContentParser.expandDots(new CopyToParser(copyToField, parser()));
+        XContentParser parser = DotExpandingXContentParser.expandDots(new CopyToParser(copyToField, parser()), path::isWithinLeafObject);
         return new Wrapper(this) {
             @Override
             public ContentPath path() {
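
The Java hunks above share one pattern: the path object carries a `withinLeafObject` flag, the document parser sets it while it is inside an object whose `subobjects` is `false`, and the dot-expanding parser receives a supplier (`path::isWithinLeafObject`) that it can consult. The following is a minimal sketch of that pattern only. `PathState` and `expand` are hypothetical stand-ins, not the real `ContentPath` or `DotExpandingXContentParser`, and the real parser works on a token stream rather than on a single field name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

final class LeafObjectFlagSketch {

    // Stand-in for the flag this PR adds to ContentPath.
    static final class PathState {
        private boolean withinLeafObject = false;

        void setWithinLeafObject(boolean withinLeafObject) {
            this.withinLeafObject = withinLeafObject;
        }

        boolean isWithinLeafObject() {
            return withinLeafObject;
        }
    }

    // Hypothetical stand-in for dot expansion: the supplier is consulted on every call,
    // so flag changes made by the caller take effect immediately.
    static List<String> expand(String fieldName, BooleanSupplier withinLeafObject) {
        if (withinLeafObject.getAsBoolean()) {
            return List.of(fieldName);
        }
        return Arrays.asList(fieldName.split("\\."));
    }

    static List<List<String>> demo() {
        PathState path = new PathState();
        BooleanSupplier supplier = path::isWithinLeafObject;
        List<List<String>> seen = new ArrayList<>();
        seen.add(expand("time.max", supplier)); // outside a leaf object: dots are expanded
        path.setWithinLeafObject(true);         // entering an object with subobjects: false
        seen.add(expand("time.max", supplier)); // dots are preserved
        path.setWithinLeafObject(false);        // leaving the object again
        seen.add(expand("time.max", supplier)); // back to expansion
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [[time, max], [time.max], [time, max]]
    }
}
```

Passing a supplier rather than a boolean lets a parser that is created once per document observe the flag as it changes during parsing.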