Commit 87f3579

Add nanosecond field mapper (#37755)
This adds a dedicated field mapper that supports nanosecond resolution, at the price of a reduced date range. When using the date field mapper, the time is stored as milliseconds since the epoch in a long in Lucene. This field mapper stores the time in nanoseconds since the epoch, which means its range is much smaller, roughly from 1970 to 2262. Note that aggregations will still be in milliseconds. However, docvalue fields will have full nanosecond resolution. Relates #27330
1 parent 15510da commit 87f3579
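
The 1970 to 2262 range in the commit message follows from storing nanoseconds since the epoch in a signed 64-bit long. The following is a minimal sketch of that arithmetic using only `java.time`. The class `DateNanosRange` and its methods are hypothetical names for illustration and are not the mapper code from this commit.

```java
import java.time.Instant;

public class DateNanosRange {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    // Largest instant that fits in a signed 64-bit count of nanoseconds since the epoch.
    public static Instant maxInstant() {
        return Instant.ofEpochSecond(0L, Long.MAX_VALUE);
    }

    // Converts an Instant to nanoseconds since the epoch, rejecting values outside the range.
    // (Hypothetical helper; the error texts are illustrative.)
    public static long toNanos(Instant instant) {
        if (instant.isBefore(Instant.EPOCH)) {
            throw new IllegalArgumentException("date[" + instant + "] is before the epoch in 1970");
        }
        if (instant.isAfter(maxInstant())) {
            throw new IllegalArgumentException("date[" + instant + "] is after " + maxInstant());
        }
        // Cannot overflow: the range check above bounds the result by Long.MAX_VALUE.
        return instant.getEpochSecond() * NANOS_PER_SECOND + instant.getNano();
    }

    public static void main(String[] args) {
        System.out.println(maxInstant()); // 2262-04-11T23:47:16.854775807Z
        System.out.println(toNanos(Instant.parse("2018-10-29T12:12:12.123456789Z")));
    }
}
```

The upper bound printed here matches the limit quoted in the REST test error message further down in this commit.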

File tree

23 files changed

+725
-48
lines changed

docs/reference/mapping/types.asciidoc

Lines changed: 10 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -7,12 +7,13 @@ document:
77
[float]
88
=== Core datatypes
99

10-
string:: <<text,`text`>> and <<keyword,`keyword`>>
11-
<<number>>:: `long`, `integer`, `short`, `byte`, `double`, `float`, `half_float`, `scaled_float`
12-
<<date>>:: `date`
13-
<<boolean>>:: `boolean`
14-
<<binary>>:: `binary`
15-
<<range>>:: `integer_range`, `float_range`, `long_range`, `double_range`, `date_range`
10+
string:: <<text,`text`>> and <<keyword,`keyword`>>
11+
<<number>>:: `long`, `integer`, `short`, `byte`, `double`, `float`, `half_float`, `scaled_float`
12+
<<date>>:: `date`
13+
<<date_nanos>>:: `date_nanos`
14+
<<boolean>>:: `boolean`
15+
<<binary>>:: `binary`
16+
<<range>>:: `integer_range`, `float_range`, `long_range`, `double_range`, `date_range`
1617

1718
[float]
1819
=== Complex datatypes
@@ -78,6 +79,8 @@ include::types/boolean.asciidoc[]
7879

7980
include::types/date.asciidoc[]
8081

82+
include::types/date_nanos.asciidoc[]
83+
8184
include::types/geo-point.asciidoc[]
8285

8386
include::types/geo-shape.asciidoc[]
@@ -106,4 +109,4 @@ include::types/rank-features.asciidoc[]
106109

107110
include::types/dense-vector.asciidoc[]
108111

109-
include::types/sparse-vector.asciidoc[]
112+
include::types/sparse-vector.asciidoc[]
Lines changed: 99 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,99 @@
1+
[[date_nanos]]
2+
=== date_nanos datatype
3+
4+
This datatype is an addition to the `date` datatype. However, there is an
5+
important distinction between the two. The existing `date` datatype stores
6+
dates in millisecond resolution. The `date_nanos` data type stores dates
7+
in nanosecond resolution, which limits its range of dates from roughly
8+
1970 to 2262, as dates are still stored as a long representing nanoseconds
9+
since the epoch.
10+
11+
Queries on nanoseconds are internally converted to range queries on this long
12+
representation, and the result of aggregations and stored fields is converted
13+
back to a string depending on the date format that is associated with the field.
14+
15+
Date formats can be customised, but if no `format` is specified then it uses
16+
the default:
17+
18+
"strict_date_optional_time||epoch_millis"
19+
20+
This means that it will accept dates with optional timestamps, which conform
21+
to the formats supported by
22+
<<strict-date-time,`strict_date_optional_time`>>, including up to nine
23+
fractional-second digits, or milliseconds-since-the-epoch (thus losing
24+
precision on the nanosecond part).
25+
26+
For instance:
27+
28+
[source,js]
29+
--------------------------------------------------
30+
PUT my_index?include_type_name=true
31+
{
32+
"mappings": {
33+
"_doc": {
34+
"properties": {
35+
"date": {
36+
"type": "date_nanos" <1>
37+
}
38+
}
39+
}
40+
}
41+
}
42+
43+
PUT my_index/_doc/1
44+
{ "date": "2015-01-01" } <2>
45+
46+
PUT my_index/_doc/2
47+
{ "date": "2015-01-01T12:10:30.123456789Z" } <3>
48+
49+
PUT my_index/_doc/3
50+
{ "date": 1420070400000 } <4>
51+
52+
GET my_index/_search
53+
{
54+
"sort": { "date": "asc"} <5>
55+
}
56+
57+
GET my_index/_search
58+
{
59+
"script_fields" : {
60+
"my_field" : {
61+
"script" : {
62+
"lang" : "painless",
63+
"source" : "doc['date'].date.nanos" <6>
64+
}
65+
}
66+
}
67+
}
68+
69+
GET my_index/_search
70+
{
71+
"docvalue_fields" : [
72+
{
73+
"field" : "date",
74+
"format": "strict_date_time" <7>
75+
}
76+
]
77+
}
78+
--------------------------------------------------
79+
// CONSOLE
80+
81+
<1> The `date` field uses the default `format`.
82+
<2> This document uses a plain date.
83+
<3> This document includes a time.
84+
<4> This document uses milliseconds-since-the-epoch.
85+
<5> Note that the `sort` values that are returned are all in
86+
nanoseconds-since-the-epoch.
87+
<6> Access the nanosecond part of the date in a script.
88+
<7> Use doc value fields, which can be formatted in nanosecond
89+
resolution.
90+
91+
You can also specify multiple date formats separated by `||`. The
92+
same mapping parameters as for the `date` field can be used.
93+
94+
[[date-nanos-limitations]]
95+
==== Limitations
96+
97+
Aggregations still use millisecond resolution, even when using a
98+
`date_nanos` field.
99+
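
The documentation above notes that `epoch_millis` input loses the sub-millisecond part, and that aggregations stay at millisecond resolution. A minimal Java sketch of those two conversions follows. The class `DateNanosInput` and its method names are hypothetical and only illustrate the arithmetic; they are not the parsing code added in this commit.

```java
import java.time.Instant;

public class DateNanosInput {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;
    private static final long NANOS_PER_MILLI = 1_000_000L;

    // An ISO-8601 string with up to nine fractional digits keeps full nanosecond precision.
    public static long nanosFromIso(String iso) {
        Instant instant = Instant.parse(iso);
        return Math.addExact(Math.multiplyExact(instant.getEpochSecond(), NANOS_PER_SECOND), instant.getNano());
    }

    // A milliseconds-since-the-epoch value can only yield nanos that are a multiple of 1,000,000.
    public static long nanosFromEpochMillis(long millis) {
        return Math.multiplyExact(millis, NANOS_PER_MILLI);
    }

    // Dropping to millisecond resolution, as aggregations do, truncates the last six digits.
    public static long millisForAggregation(long nanos) {
        return nanos / NANOS_PER_MILLI;
    }

    public static void main(String[] args) {
        System.out.println(nanosFromIso("2015-01-01T12:10:30.123456789Z")); // 1420114230123456789
        System.out.println(nanosFromEpochMillis(1420070400000L));           // 1420070400000000000
        System.out.println(millisForAggregation(1420114230123456789L));     // 1420114230123
    }
}
```

`Math.multiplyExact` and `Math.addExact` are used so that an out-of-range value raises an `ArithmeticException` instead of silently wrapping around.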

rest-api-spec/src/main/resources/rest-api-spec/test/field_caps/10_basic.yml

Lines changed: 39 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,8 @@ setup:
1414
type: double
1515
geo:
1616
type: geo_point
17+
date:
18+
type: date
1719
object:
1820
type: object
1921
properties:
@@ -45,6 +47,8 @@ setup:
4547
type: keyword
4648
number:
4749
type: double
50+
date:
51+
type: date
4852
geo:
4953
type: geo_point
5054
object:
@@ -77,6 +81,8 @@ setup:
7781
type: keyword
7882
number:
7983
type: long
84+
date:
85+
type: date
8086
geo:
8187
type: keyword
8288
object:
@@ -104,7 +110,7 @@ setup:
104110
- do:
105111
field_caps:
106112
index: 'test1,test2,test3'
107-
fields: [text, keyword, number, geo]
113+
fields: [text, keyword, number, date, geo]
108114

109115
- match: {fields.text.text.searchable: true}
110116
- match: {fields.text.text.aggregatable: false}
@@ -126,6 +132,11 @@ setup:
126132
- match: {fields.number.long.indices: ["test3"]}
127133
- is_false: fields.number.long.non_searchable_indices
128134
- is_false: fields.number.long.non_aggregatable_indices
135+
- match: {fields.date.date.searchable: true}
136+
- match: {fields.date.date.aggregatable: true}
137+
- is_false: fields.date.date.indices
138+
- is_false: fields.date.date.non_searchable_indices
139+
- is_false: fields.date.date.non_aggregatable_indices
129140
- match: {fields.geo.geo_point.searchable: true}
130141
- match: {fields.geo.geo_point.aggregatable: true}
131142
- match: {fields.geo.geo_point.indices: ["test1", "test2"]}
@@ -137,6 +148,33 @@ setup:
137148
- is_false: fields.geo.keyword.non_searchable_indices
138149
- is_false: fields.geo.keyword.non_aggregatable_indices
139150
---
151+
"Get date_nanos field caps":
152+
- skip:
153+
version: " - 6.99.99"
154+
reason: date_nanos field mapping type has been introduced in 7.0
155+
156+
- do:
157+
indices.create:
158+
include_type_name: false
159+
index: test_nanos
160+
body:
161+
mappings:
162+
properties:
163+
date_nanos:
164+
type: date_nanos
165+
166+
- do:
167+
field_caps:
168+
index: 'test_nanos'
169+
fields: [date_nanos]
170+
171+
- match: {fields.date_nanos.date_nanos.searchable: true}
172+
- match: {fields.date_nanos.date_nanos.aggregatable: true}
173+
- is_false: fields.date_nanos.date_nanos.indices
174+
- is_false: fields.date_nanos.date_nanos.non_searchable_indices
175+
- is_false: fields.date_nanos.date_nanos.non_aggregatable_indices
176+
177+
---
140178
"Get leaves field caps":
141179

142180
- do:
Lines changed: 161 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,161 @@
1+
setup:
2+
- skip:
3+
version: " - 6.99.99"
4+
reason: "Implemented in 7.0"
5+
6+
- do:
7+
indices.create:
8+
index: date_ns
9+
body:
10+
settings:
11+
number_of_shards: 3
12+
number_of_replicas: 0
13+
mappings:
14+
properties:
15+
date:
16+
type: date_nanos
17+
field:
18+
type: long
19+
20+
- do:
21+
indices.create:
22+
index: date_ms
23+
body:
24+
settings:
25+
number_of_shards: 3
26+
number_of_replicas: 0
27+
mappings:
28+
properties:
29+
date:
30+
type: date
31+
field:
32+
type: long
33+
34+
---
35+
"test sorting against date_nanos only fields":
36+
37+
- do:
38+
bulk:
39+
refresh: true
40+
body:
41+
- '{ "index" : { "_index" : "date_ns", "_id" : "first" } }'
42+
# millis [1540815132123] to nanos [1540815132123456789]
43+
- '{"date" : "2018-10-29T12:12:12.123456789Z", "field" : 1 }'
44+
- '{ "index" : { "_index" : "date_ns", "_id" : "second" } }'
45+
# millis [1540815132987] to nanos [1540815132987654321]
46+
- '{"date" : "2018-10-29T12:12:12.987654321Z", "field" : 2 }'
47+
48+
- do:
49+
search:
50+
rest_total_hits_as_int: true
51+
index: date_ns*
52+
body:
53+
sort: [ { "date": "desc" } ]
54+
55+
- match: { hits.total: 2 }
56+
- length: { hits.hits: 2 }
57+
- match: { hits.hits.0._id: "second" }
58+
- match: { hits.hits.0.sort: [1540815132987654321] }
59+
- match: { hits.hits.1._id: "first" }
60+
- match: { hits.hits.1.sort: [1540815132123456789] }
61+
62+
- do:
63+
search:
64+
rest_total_hits_as_int: true
65+
index: date_ns*
66+
body:
67+
sort: [ { "date": "asc" } ]
68+
69+
- match: { hits.total: 2 }
70+
- length: { hits.hits: 2 }
71+
- match: { hits.hits.0._id: "first" }
72+
- match: { hits.hits.0.sort: [1540815132123456789] }
73+
- match: { hits.hits.1._id: "second" }
74+
- match: { hits.hits.1.sort: [1540815132987654321] }
75+
76+
77+
---
78+
"date_nanos requires dates after 1970 and before 2262":
79+
80+
- do:
81+
bulk:
82+
refresh: true
83+
body:
84+
- '{ "index" : { "_index" : "date_ns", "_id" : "date_ns_1" } }'
85+
- '{"date" : "1969-10-28T12:12:12.123456789Z" }'
86+
- '{ "index" : { "_index" : "date_ns", "_id" : "date_ns_2" } }'
87+
- '{"date" : "2263-10-29T12:12:12.123456789Z" }'
88+
89+
- match: { errors: true }
90+
- match: { items.0.index.status: 400 }
91+
- match: { items.0.index.error.type: mapper_parsing_exception }
92+
- match: { items.0.index.error.caused_by.reason: "date[1969-10-28T12:12:12.123456789Z] is before the epoch in 1970 and cannot be stored in nanosecond resolution" }
93+
- match: { items.1.index.status: 400 }
94+
- match: { items.1.index.error.type: mapper_parsing_exception }
95+
- match: { items.1.index.error.caused_by.reason: "date[2263-10-29T12:12:12.123456789Z] is after 2262-04-11T23:47:16.854775807 and cannot be stored in nanosecond resolution" }
96+
97+
98+
---
99+
"doc value fields are working as expected across date and date_nanos fields":
100+
101+
- do:
102+
bulk:
103+
refresh: true
104+
body:
105+
- '{ "index" : { "_index" : "date_ns", "_id" : "date_ns_1" } }'
106+
- '{"date" : "2018-10-29T12:12:12.123456789Z", "field" : 1 }'
107+
- '{ "index" : { "_index" : "date_ms", "_id" : "date_ms_1" } }'
108+
- '{"date" : "2018-10-29T12:12:12.987Z" }'
109+
110+
- do:
111+
search:
112+
rest_total_hits_as_int: true
113+
index: date*
114+
body:
115+
docvalue_fields: [ { "field": "date", "format" : "strict_date_optional_time" }, { "field": "date", "format": "epoch_millis" }, { "field" : "date", "format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSSSSX" } ]
116+
sort: [ { "date": "desc" } ]
117+
118+
- match: { hits.total: 2 }
119+
- length: { hits.hits: 2 }
120+
- match: { hits.hits.0._id: "date_ns_1" }
121+
- match: { hits.hits.1._id: "date_ms_1" }
122+
- match: { hits.hits.0.fields.date: [ "2018-10-29T12:12:12.123Z", "1540815132123.456789", "2018-10-29T12:12:12.123456789Z" ] }
123+
- match: { hits.hits.1.fields.date: [ "2018-10-29T12:12:12.987Z", "1540815132987", "2018-10-29T12:12:12.987000000Z" ] }
124+
125+
---
126+
"date histogram aggregation with date and date_nanos mapping":
127+
128+
- do:
129+
bulk:
130+
refresh: true
131+
body:
132+
- '{ "index" : { "_index" : "date_ns", "_id" : "date_ns_1" } }'
133+
- '{"date" : "2018-10-29T12:12:12.123456789Z" }'
134+
- '{ "index" : { "_index" : "date_ms", "_id" : "date_ms_1" } }'
135+
- '{"date" : "2018-10-29T12:12:12.987Z" }'
136+
- '{ "index" : { "_index" : "date_ns", "_id" : "date_ns_2" } }'
137+
- '{"date" : "2018-10-30T12:12:12.123456789Z" }'
138+
- '{ "index" : { "_index" : "date_ms", "_id" : "date_ms_2" } }'
139+
- '{"date" : "2018-10-30T12:12:12.987Z" }'
140+
141+
- do:
142+
search:
143+
rest_total_hits_as_int: true
144+
index: date*
145+
body:
146+
size: 0
147+
aggs:
148+
date:
149+
date_histogram:
150+
field: date
151+
interval: 1d
152+
153+
- match: { hits.total: 4 }
154+
- length: { aggregations.date.buckets: 2 }
155+
- match: { aggregations.date.buckets.0.key: 1540771200000 }
156+
- match: { aggregations.date.buckets.0.key_as_string: "2018-10-29T00:00:00.000Z" }
157+
- match: { aggregations.date.buckets.0.doc_count: 2 }
158+
- match: { aggregations.date.buckets.1.key: 1540857600000 }
159+
- match: { aggregations.date.buckets.1.key_as_string: "2018-10-30T00:00:00.000Z" }
160+
- match: { aggregations.date.buckets.1.doc_count: 2 }
161+
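
The expected values in the doc value and date histogram tests above can be reproduced with plain long arithmetic. The sketch below is an illustration under assumptions: `NanosToMillisBucket` and its methods are hypothetical names, the bucket computation assumes UTC and a fixed one-day interval, and the formatting rule for whole-millisecond values is a guess rather than the formatter used by Elasticsearch.

```java
import java.util.Locale;

public class NanosToMillisBucket {
    private static final long NANOS_PER_MILLI = 1_000_000L;
    private static final long MILLIS_PER_DAY = 86_400_000L;

    // Reduces a nanosecond timestamp to milliseconds, then rounds down to the start of its UTC day.
    public static long dayBucketKey(long nanos) {
        long millis = nanos / NANOS_PER_MILLI;
        return Math.floorDiv(millis, MILLIS_PER_DAY) * MILLIS_PER_DAY;
    }

    // Renders nanoseconds as milliseconds with the sub-millisecond part as a six-digit fraction.
    // Assumption: a value with no sub-millisecond part is printed without a fraction.
    public static String epochMillisWithFraction(long nanos) {
        long millis = nanos / NANOS_PER_MILLI;
        long remainder = nanos % NANOS_PER_MILLI;
        if (remainder == 0) {
            return Long.toString(millis);
        }
        return String.format(Locale.ROOT, "%d.%06d", millis, remainder);
    }

    public static void main(String[] args) {
        System.out.println(dayBucketKey(1540815132123456789L));            // 1540771200000
        System.out.println(epochMillisWithFraction(1540815132123456789L)); // 1540815132123.456789
    }
}
```

Both printed values agree with the assertions in the tests: the first bucket key is `1540771200000`, and the `epoch_millis` doc value for the `date_nanos` document is `1540815132123.456789`.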

server/src/main/java/org/elasticsearch/action/search/SearchPhaseController.java

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -512,7 +512,7 @@ private InternalAggregations reduceAggsIncrementally(List<InternalAggregations>
512512
}
513513

514514
private static InternalAggregations reduceAggs(List<InternalAggregations> aggregationsList,
515-
List<SiblingPipelineAggregator> pipelineAggregators, ReduceContext reduceContext) {
515+
List<SiblingPipelineAggregator> pipelineAggregators, ReduceContext reduceContext) {
516516
InternalAggregations aggregations = InternalAggregations.reduce(aggregationsList, reduceContext);
517517
if (pipelineAggregators != null) {
518518
List<InternalAggregation> newAggs = StreamSupport.stream(aggregations.spliterator(), false)
