Make document parsing aware of runtime fields #65210

javanna · 2020-11-18T15:26:41Z

Runtime fields are defined in a separate runtime section in the mappings. Since the runtime section was introduced, runtime fields are not taken into account when parsing documents. As a result, when an indexed document contains a field that is already defined as a runtime field, that field gets dynamically mapped as a concrete field, even though it will always be shadowed by the runtime field defined with the same name.

A more sensible default would be to treat runtime fields like ordinary mapped fields during document parsing, so that no dynamic mapping update is triggered for a field that is already defined in the runtime section. As a consequence, the field does not get indexed. Users who prefer to keep indexing the field even though it is shadowed (which we consider the exception) can do so by mapping it explicitly under properties.
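To make the proposed default concrete, here is a minimal, self-contained sketch of the decision described above. It is not the Elasticsearch implementation: the class, enum, and method names are hypothetical, mappings are reduced to plain sets of field names, and dynamic mapping is assumed to be enabled.

```java
import java.util.Set;

// Hypothetical sketch of the behaviour described in this PR; not Elasticsearch code.
final class RuntimeAwareParsingSketch {

    enum Outcome {
        INDEX_AS_MAPPED,        // field is mapped under properties and gets indexed
        SKIP_SHADOWED,          // field is only a runtime field: not indexed, no mapping update
        DYNAMIC_MAPPING_UPDATE  // unknown field: dynamically mapped as a concrete field
    }

    // Decide what happens to a leaf field found in an incoming document.
    static Outcome resolve(String field, Set<String> properties, Set<String> runtime) {
        if (properties.contains(field)) {
            // Explicit mapping under properties wins, even if a runtime field shadows it.
            return Outcome.INDEX_AS_MAPPED;
        }
        if (runtime.contains(field)) {
            // Already defined in the runtime section: no dynamic update is needed.
            return Outcome.SKIP_SHADOWED;
        }
        return Outcome.DYNAMIC_MAPPING_UPDATE;
    }
}
```

Checking properties before the runtime section is what lets users opt back in to indexing a shadowed field by mapping it explicitly.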

Relates to #62906

elasticmachine · 2020-11-18T15:26:44Z

Pinging @elastic/es-search (Team:Search)

javanna · 2020-11-18T15:27:41Z

server/src/main/java/org/elasticsearch/index/mapper/DocumentParser.java

+        return null;
+    }
+
+    private static class NoOpFieldMapper extends FieldMapper {


It is quite a shame to have to define all these methods (not all of them are abstract, but I wanted to make sure that none of them is called), as only two are effectively needed: parseCreateField and copyTo.

nik9000 · 2020-11-18T15:49:10Z

It is, but 🤷. I think there are worse things.

romseygeek · 2020-11-19T10:39:00Z

Yeah, it would be nice to somehow build an interface here, but I can't see how that would easily work with ObjectMappers. For the future.
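As a rough illustration of the interface idea floated in this thread, the sketch below reduces a mapper to the two operations said to be needed, parseCreateField and copyTo. Everything here is hypothetical: the type names, the method signatures, the use of a plain list in place of a parse context, and the assumption that a no-op mapper has no copy_to targets. None of it mirrors the real FieldMapper API.

```java
import java.util.List;

// Hypothetical minimal contract; the real FieldMapper has many more methods.
interface ParseTargetSketch {
    // Adds whatever should be indexed for this field to the given list.
    void parseCreateField(List<String> indexedFields, String name, String value);

    // Names of fields that the value should additionally be copied to.
    List<String> copyTo();
}

// Stand-in for an ordinary concrete field: it records the value as indexed.
final class ConcreteTargetSketch implements ParseTargetSketch {
    @Override
    public void parseCreateField(List<String> indexedFields, String name, String value) {
        indexedFields.add(name + "=" + value);
    }

    @Override
    public List<String> copyTo() {
        return List.of();
    }
}

// Stand-in for the no-op mapper: parsing succeeds but nothing is indexed.
final class NoOpTargetSketch implements ParseTargetSketch {
    @Override
    public void parseCreateField(List<String> indexedFields, String name, String value) {
        // Intentionally empty: the field is shadowed by a runtime field.
    }

    @Override
    public List<String> copyTo() {
        return List.of(); // assumption: a shadowed field has no copy_to targets
    }
}
```

With a contract this small, the no-op variant would not need to override unrelated methods just to throw UnsupportedOperationException. How object mappers would fit into such an interface is the open question raised above.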

nik9000

Left some initial stuff. I imagine there are a ton of other tests worth adding but I haven't thought them through so I don't know what they are. And, you know, maybe we have enough tests and I'm wrong.

nik9000 · 2020-11-18T15:47:19Z

server/src/test/java/org/elasticsearch/index/mapper/DocumentParserTests.java

+                .startObject("properties")
+                  .startObject("field").field("type", "keyword").endObject()
+                .endObject()
+            .endObject().endObject();


When we hit this file with the formatter one day it'll make this look like garbage. Would you be ok running the formatter on this now and playing with the output some to make it look ok when the formatter runs?

romseygeek · 2020-11-19T10:41:40Z

You can do something like:

    createDocumentMapper(topMapping(b -> {
        b.startObject("runtime");
        {
            b.startObject("field").field("type", "test").endObject();
        }
        b.endObject();
        b.startObject("properties");
        {
            b.startObject("field").field("type", "keyword").endObject();
        }
        b.endObject();
    }));

Keeps the formatter happy and there's a bit less ceremony around builders.

nik9000 · 2020-11-18T15:51:37Z

server/src/test/java/org/elasticsearch/index/mapper/DocumentParserTests.java

+                    .startObject("path1.path2.path3.field").field("type", "test").endObject()
+                .endObject()
+                .startObject("properties")
+                    .startObject("path1").field("type", "object").field("enabled", false).endObject()


What happens if enabled isn't set? I think we should continue to do nothing if enabled is actually true.

javanna · 2020-11-18

Look at the test below for dynamic mappings. We map the objects but not the leaves. I agree it is debatable. The point is that objects get mapped the first time they are seen, regardless of what they hold. For instance, if you index a doc with an empty object, the object still gets mapped, which makes me think that it is the correct behaviour to map objects under properties, as they may hold concrete fields in the future.

On the other hand, when we introduce the dynamic:runtime mode, given that everything is runtime, we may not want to create objects under properties at all.
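The "objects but not the leaves" behaviour can be illustrated with a small sketch. For a dotted runtime field path, it lists the parent objects that would still be dynamically mapped under properties, leaving the leaf out. The class and method names are hypothetical, and dynamic mapping is assumed to be enabled for every parent object.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; it only illustrates which paths are objects and which is the leaf.
final class ParentObjectsSketch {

    // Returns every parent object path of a dotted field path, outermost first.
    static List<String> parentObjectPaths(String fullFieldPath) {
        List<String> parents = new ArrayList<>();
        int dot = fullFieldPath.indexOf('.');
        while (dot != -1) {
            // Everything before this dot is an object that contains the rest of the path.
            parents.add(fullFieldPath.substring(0, dot));
            dot = fullFieldPath.indexOf('.', dot + 1);
        }
        return parents;
    }
}
```

For the runtime field path1.path2.path3.field used in the test, this yields path1, path1.path2, and path1.path2.path3 as the objects to map, while the leaf itself is left to the runtime section.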

romseygeek · 2020-11-18T16:41:25Z

server/src/main/java/org/elasticsearch/index/mapper/DocumentParser.java

+        @Override
+        public Builder getMergeBuilder() {
+            throw new UnsupportedOperationException();
+        }


Is this going to cause problems if there is a dynamic mapper added elsewhere? Let's say we send a document with two fields, one of which gets mapped by the dynamic template as a keyword, and the other as a runtime field. In that case, there will be a call back to the master with the new Mapping which will contain the new dynamic mapper, plus this NoOpFieldMapper, and it will blow up when we try to serialize it I think?

javanna · 2020-11-18

I don't think so, because the no-op mapper is currently only used while parsing a document. It is not added to the dynamic mappers. Its sole purpose is to avoid causing a dynamic mapping update. Keep me honest though :)

romseygeek · 2020-11-19T10:39:26Z

Gotcha, I had misunderstood where we were creating these.

romseygeek

LGTM. I left one suggestion around test formatting.

javanna · 2020-11-19T13:59:32Z

run elasticsearch-ci/2

javanna · 2020-11-19T13:59:44Z

run elasticsearch-ci/packaging-sample-windows

nik9000

LGTM2

javanna added the >non-issue, :Search Foundations/Mapping (Index mappings, including merging and defining field types), v8.0.0, and v7.11.0 labels Nov 18, 2020

javanna requested review from nik9000 and romseygeek November 18, 2020 15:26

elasticmachine added the Team:Search (Meta label for search team) label Nov 18, 2020

javanna commented Nov 18, 2020

nik9000 reviewed Nov 18, 2020

romseygeek reviewed Nov 18, 2020

romseygeek approved these changes Nov 19, 2020

javanna added 3 commits November 19, 2020 12:14

formatting (e6ce5a8)

Merge branch 'master' into enhancement/runtime_fields_document_parsing (542fa16)

more tests (40ff986)

nik9000 approved these changes Nov 19, 2020

javanna merged commit 3aef586 into elastic:master Nov 19, 2020

javanna mentioned this pull request Dec 3, 2020: Shadowed mappers need to skip their children #65811 (Closed)

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
