Commit 22bbf10

Handle missing and multiple values in script
Previously, scripts on numeric fields had no way to check whether a document
is missing a value. Certain operations on multi-valued fields were also
missing. This PR adds the following:

- add the following functions for multi-valued numeric fields:
    doc['field'].min returns the minimum among values
    doc['field'].max returns the maximum among values
    doc['field'].sum returns the sum of values
    doc['field'].avg returns the average of values

- return null for doc['field'] if a document is missing the field 'field'.
  Now we can do this:
    if (doc['field'] == null) {return -1;} return doc['field'].value;
  or
    doc['field']?.value ?: -1

This new behaviour only works if the following system property is set:
`export ES_JAVA_OPTS="-Des.script.null_for_missing_value=true"`

Closes elastic#29286
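
To make the difference between the two behaviours concrete, here is a minimal Java sketch. It is not Elasticsearch code: `MissingValueSketch` and its methods are hypothetical names, a `long[]` stands in for the values of one document, and an empty array models a document without the field. `valueOrDefault` spells out the Painless expression `doc['field']?.value ?: -1` from the message above.

```java
// Minimal sketch, not Elasticsearch code: every name here is hypothetical.
final class MissingValueSketch {

    // Legacy default: a missing numeric field reads as 0.
    static long legacyValue(long[] fieldValues) {
        return fieldValues.length == 0 ? 0L : fieldValues[0];
    }

    // With es.script.null_for_missing_value=true the lookup itself yields null.
    // A null array reference models doc['field'] == null.
    static long[] docField(long[] fieldValues) {
        return fieldValues.length == 0 ? null : fieldValues;
    }

    // Java spelling of the Painless expression: doc['field']?.value ?: fallback
    static long valueOrDefault(long[] fieldValues, long fallback) {
        long[] field = docField(fieldValues);
        return field == null ? fallback : field[0];
    }
}
```

Under the legacy default a stored value of 0 and a missing field look the same to the script, which is the ambiguity the null result removes.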
1 parent 6011516 commit 22bbf10

15 files changed, +303 -24 lines changed

docs/CHANGELOG.asciidoc

+4
@@ -41,6 +41,10 @@ written to by an older Elasticsearch after writing to it with a newer Elasticsea
 
 Do not ignore request analysis/similarity settings on index resize operations when the source index already contains such settings ({pull}30216[#30216])
 
+=== Deprecations
+
+Returning 0 for missing numeric fields in script is deprecated. PR:
+
 === Regressions
 
 === Known Issues

docs/painless/painless-getting-started.asciidoc

+26
@@ -68,6 +68,7 @@ GET hockey/_search
 ----------------------------------------------------------------
 // CONSOLE
 
+
 Alternatively, you could do the same thing using a script field instead of a function score:
 
 [source,js]
@@ -119,6 +120,31 @@ GET hockey/_search
 ----------------------------------------------------------------
 // CONSOLE
 
+[float]
+===== Missing values
+
+Currently by default, if a document is missing a numeric field `field`,
+`doc['field'].value` returns `0` for this document. This default behaviour
+will be changed in the next major version of Elasticsearch. Starting from 7.0,
+if a document is missing a field `field`, `doc['field']` for this document
+will return `null`. From version 6.4, you can set a system property
+`export ES_JAVA_OPTS="-Des.script.null_for_missing_value=true"` on a node
+to make this node's behaviour compatible with the future major version.
+Otherwise, every time the node starts, a deprecation warning will remind you
+about this forthcoming change in 7.x.
+
+
+===== Multiple values
+
+There are a number of operations designed for numeric fields,
+if a document has multiple values in such a field:
+
+- `doc['field'].min` - gets the minimum value among values
+- `doc['field'].max` - gets the maximum value among values
+- `doc['field'].sum` - gets the sum of all values
+- `doc['field'].avg` - gets the average of all values
+
+
 [float]
 ==== Updating Fields with Painless

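The four operations documented in this hunk can be sketched in plain Java as follows. This is an illustration under assumptions, not the `ScriptDocValues` implementation: `MultiValueSketch` is a hypothetical name, a `double[]` stands in for the values of one document, and the array is assumed to be non-empty.

```java
// Minimal sketch, not the ScriptDocValues implementation: names are hypothetical.
// Every method assumes the array holds at least one value.
final class MultiValueSketch {

    // Smallest of the document's values.
    static double min(double[] values) {
        double result = values[0];
        for (double v : values) {
            result = Math.min(result, v);
        }
        return result;
    }

    // Largest of the document's values.
    static double max(double[] values) {
        double result = values[0];
        for (double v : values) {
            result = Math.max(result, v);
        }
        return result;
    }

    // Sum of all values.
    static double sum(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    // Arithmetic mean of all values.
    static double avg(double[] values) {
        return sum(values) / values.length;
    }
}
```

For the values `[5.5, 3.5, 4.5]` used by the REST test in this commit, the sketch gives min 3.5, max 5.5, sum 13.5 and avg 4.5, matching the test's expectations. The same test expects `.value` to be 3.5 for that document, the same number as `.min`.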
modules/lang-painless/build.gradle

+1
@@ -25,6 +25,7 @@ esplugin {
 }
 
 integTestCluster {
+  systemProperty 'es.script.null_for_missing_value', 'true'
   module project.project(':modules:mapper-extras')
 }

modules/lang-painless/src/main/resources/org/elasticsearch/painless/spi/org.elasticsearch.txt

+8
@@ -73,6 +73,10 @@ class org.elasticsearch.index.fielddata.ScriptDocValues$Strings {
 class org.elasticsearch.index.fielddata.ScriptDocValues$Longs {
   Long get(int)
   long getValue()
+  long getMin()
+  long getMax()
+  long getSum()
+  double getAvg()
   List getValues()
   org.joda.time.ReadableDateTime getDate()
   List getDates()
@@ -89,6 +93,10 @@ class org.elasticsearch.index.fielddata.ScriptDocValues$Dates {
 class org.elasticsearch.index.fielddata.ScriptDocValues$Doubles {
   Double get(int)
   double getValue()
+  double getMin()
+  double getMax()
+  double getSum()
+  double getAvg()
   List getValues()
 }

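Note that the whitelist declares `double getAvg()` for `Longs` while `getMin`, `getMax` and `getSum` stay `long`. A plausible reason, stated here as an assumption rather than taken from the commit, is that an integer average would truncate. The sketch below uses the hypothetical class `LongAggregatesSketch` and a non-empty `long[]` to show the difference.

```java
// Minimal sketch with hypothetical names; not the Elasticsearch implementation.
// Both methods assume a non-empty array.
final class LongAggregatesSketch {

    // The sum of long values stays a long.
    static long sum(long[] values) {
        long total = 0L;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    // The average is computed in floating point so that fractions survive.
    static double avg(long[] values) {
        return (double) sum(values) / values.length;
    }
}
```

With the values `[5, 3, 4]` from the REST test in this commit the sum is 12 and the average is 4.0. With `[1, 2]` the average is 1.5, which integer division would have reduced to 1.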
modules/lang-painless/src/test/resources/rest-api-spec/test/painless/20_scriptfield.yml

+1-1
@@ -95,7 +95,7 @@ setup:
           script_fields:
             bar:
               script:
-                source: "(doc['missing'].value?.length() ?: 0) + params.x;"
+                source: "(doc['missing']?.value?.length() ?: 0) + params.x;"
                 params:
                   x: 5

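The one-character change in this hunk matters because the integration test cluster now runs with `es.script.null_for_missing_value` set to `true`, so `doc['missing']` can itself be `null` and the null-safe operator has to guard the field lookup as well as the value. A minimal Java sketch of the updated expression, using the hypothetical names `NullSafeChainSketch`, `StringField` and `lengthPlusX`:

```java
// Minimal sketch with hypothetical names; not Elasticsearch or Painless code.
final class NullSafeChainSketch {

    // Hypothetical stand-in for the object that doc['missing'] evaluates to.
    static final class StringField {
        final String value;

        StringField(String value) {
            this.value = value;
        }
    }

    // Java spelling of: (doc['missing']?.value?.length() ?: 0) + params.x
    static int lengthPlusX(StringField field, int x) {
        // doc['missing']?.value : null when the field lookup returned null
        String value = (field == null) ? null : field.value;
        // ?.length() : null when the value is null
        Integer length = (value == null) ? null : Integer.valueOf(value.length());
        // ?: 0 : fall back to 0, then add params.x
        return (length == null ? 0 : length) + x;
    }
}
```

Without the first guard, the Java equivalent of the old script would dereference a `null` field and fail before the `?: 0` fallback could apply.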
@@ -0,0 +1,117 @@
+setup:
+  - do:
+      indices.create:
+        index: test
+        body:
+          settings:
+            number_of_shards: 1
+          mappings:
+            _doc:
+              properties:
+                dval:
+                  type: double
+                lval:
+                  type: long
+
+  - do:
+      index:
+        index: test
+        type: _doc
+        id: 1
+        body: { "dval": 5.5, "lval": 5 }
+
+  - do:
+      index:
+        index: test
+        type: _doc
+        id: 2
+        body: { "dval": [5.5, 3.5, 4.5] }
+
+
+  - do:
+      index:
+        index: test
+        type: _doc
+        id: 3
+        body: { "lval": [5, 3, 4] }
+
+  - do:
+      indices.refresh: {}
+
+---
+"check double and long values: missing values and operations on multiple values":
+  - skip:
+      version: " - 6.3.99"
+      reason: Handling missing values and operations on multiple values were added from 6.4
+
+  - do:
+      search:
+        body:
+          script_fields:
+            val_dval:
+              script:
+                source: "doc['dval']?.value ?: -1.0"
+            min_dval:
+              script:
+                source: "doc['dval']?.min ?: -1.0"
+            max_dval:
+              script:
+                source: "doc['dval']?.max ?: -1.0"
+            sum_dval:
+              script:
+                source: "doc['dval']?.sum ?: -1.0"
+            avg_dval:
+              script:
+                source: "doc['dval']?.avg ?: -1.0"
+            val_lval:
+              script:
+                source: "doc['lval']?.value ?: -1"
+            min_lval:
+              script:
+                source: "doc['lval']?.min ?: -1"
+            max_lval:
+              script:
+                source: "doc['lval']?.max ?: -1"
+            sum_lval:
+              script:
+                source: "doc['lval']?.sum ?: -1"
+            avg_lval:
+              script:
+                source: "doc['lval']?.avg ?: -1"
+
+  - match: { hits.hits.0.fields.val_dval.0: 5.5}
+  - match: { hits.hits.0.fields.min_dval.0: 5.5}
+  - match: { hits.hits.0.fields.max_dval.0: 5.5}
+  - match: { hits.hits.0.fields.sum_dval.0: 5.5}
+  - match: { hits.hits.0.fields.avg_dval.0: 5.5}
+
+  - match: { hits.hits.0.fields.val_lval.0: 5}
+  - match: { hits.hits.0.fields.min_lval.0: 5}
+  - match: { hits.hits.0.fields.max_lval.0: 5}
+  - match: { hits.hits.0.fields.sum_lval.0: 5}
+  - match: { hits.hits.0.fields.avg_lval.0: 5}
+
+  - match: { hits.hits.1.fields.val_dval.0: 3.5}
+  - match: { hits.hits.1.fields.min_dval.0: 3.5}
+  - match: { hits.hits.1.fields.max_dval.0: 5.5}
+  - match: { hits.hits.1.fields.sum_dval.0: 13.5}
+  - match: { hits.hits.1.fields.avg_dval.0: 4.5}
+
+  - match: { hits.hits.1.fields.val_lval.0: -1}
+  - match: { hits.hits.1.fields.min_lval.0: -1}
+  - match: { hits.hits.1.fields.max_lval.0: -1}
+  - match: { hits.hits.1.fields.sum_lval.0: -1}
+  - match: { hits.hits.1.fields.avg_lval.0: -1}
+
+  - match: { hits.hits.2.fields.val_dval.0: -1.0}
+  - match: { hits.hits.2.fields.min_dval.0: -1.0}
+  - match: { hits.hits.2.fields.max_dval.0: -1.0}
+  - match: { hits.hits.2.fields.sum_dval.0: -1.0}
+  - match: { hits.hits.2.fields.avg_dval.0: -1.0}
+
+  - match: { hits.hits.2.fields.val_lval.0: 3}
+  - match: { hits.hits.2.fields.min_lval.0: 3}
+  - match: { hits.hits.2.fields.max_lval.0: 5}
+  - match: { hits.hits.2.fields.sum_lval.0: 12}
+  - match: { hits.hits.2.fields.avg_lval.0: 4}
+
