
Commit 89c6575

Update the signature of vector script functions. (elastic#48653)
Previously the functions accepted a doc values reference, whereas they now accept the name of the vector field. Here's an example of how a vector function was called before and after the change.

```
Before: cosineSimilarity(params.query_vector, doc['field'])
After: cosineSimilarity(params.query_vector, 'field')
```

This seems more intuitive, since we don't allow direct access to vector doc values and the meaning of `doc['field']` is unclear.

The PR makes the following changes (broken into distinct commits):
* Add new function signatures of the form `function(params.query_vector, 'field')` and deprecate the old ones. Because Painless doesn't allow two methods with the same name and number of arguments, we allow a generic `Object` to be passed in to the function and decide on the behavior through an `instanceof` check.
* Refactor the class bindings so that the document field is passed to the constructor instead of the instance method. This allows us to avoid retrieving the vector doc values on every function invocation, which gives a tiny speed-up in benchmarks.

Note that this PR adds new signatures for the sparse vector functions too, even though sparse vectors are deprecated. It seemed simplest to understand (for both us and users) to keep everything symmetric between dense and sparse vectors.
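The two changes listed in the commit message can be sketched in plain Java. This is only an illustration of the pattern, not the code from this commit: `FakeVectorDocValues` and `DotProductSketch` are hypothetical names, the doc values hold a single fixed vector, and the deprecation warning is collected in a list instead of going through Elasticsearch's deprecation logging.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for vector doc values; not the Elasticsearch class. */
final class FakeVectorDocValues {
    final float[] vector;

    FakeVectorDocValues(float[] vector) {
        this.vector = vector;
    }
}

/** Sketch of a class binding whose field argument is resolved once, in the constructor. */
final class DotProductSketch {
    private final float[] queryVector;
    private final FakeVectorDocValues docValues;
    /** Warnings are kept in a list for illustration only. */
    final List<String> warnings = new ArrayList<>();

    DotProductSketch(Map<String, FakeVectorDocValues> doc, List<? extends Number> query, Object field) {
        this.queryVector = new float[query.size()];
        for (int i = 0; i < queryVector.length; i++) {
            queryVector[i] = query.get(i).floatValue();
        }
        if (field instanceof String) {
            // New form: function(query, 'field'). Look the field up by name, once.
            FakeVectorDocValues values = doc.get((String) field);
            if (values == null) {
                throw new IllegalArgumentException("No vector field named [" + field + "]");
            }
            this.docValues = values;
        } else if (field instanceof FakeVectorDocValues) {
            // Old form: function(query, doc['field']). Still accepted, but flagged.
            warnings.add("function(query, doc['field']) is deprecated; use function(query, 'field')");
            this.docValues = (FakeVectorDocValues) field;
        } else {
            throw new IllegalArgumentException("Unsupported field argument: " + field);
        }
    }

    /** Called per invocation; no field lookup happens here. */
    double dotProduct() {
        float[] docVector = docValues.vector;
        if (docVector.length != queryVector.length) {
            throw new IllegalArgumentException("Vector dimensions differ");
        }
        double result = 0;
        for (int i = 0; i < queryVector.length; i++) {
            result += queryVector[i] * docVector[i];
        }
        return result;
    }
}
```

Taking `Object` in the constructor is what lets one binding serve both call forms, at the cost of moving the type check from compile time to run time.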
1 parent 25724c5 commit 89c6575

13 files changed: +376 -143 lines changed

docs/painless/painless-api-reference/painless-api-reference-score/index.asciidoc

Lines changed: 4 additions & 4 deletions
@@ -10,8 +10,8 @@ The following specialized API is available in the Score context.
 ==== Static Methods
 The following methods are directly callable without a class/instance qualifier. Note parameters denoted by a (*) are treated as read-only values.
 
-* double cosineSimilarity(List *, VectorScriptDocValues.DenseVectorScriptDocValues)
-* double cosineSimilaritySparse(Map *, VectorScriptDocValues.SparseVectorScriptDocValues)
+* double cosineSimilarity(List *, String)
+* double cosineSimilaritySparse(Map *, String)
 * double decayDateExp(String *, String *, String *, double *, JodaCompatibleZonedDateTime)
 * double decayDateGauss(String *, String *, String *, double *, JodaCompatibleZonedDateTime)
 * double decayDateLinear(String *, String *, String *, double *, JodaCompatibleZonedDateTime)
@@ -21,8 +21,8 @@ The following methods are directly callable without a class/instance qualifier.
 * double decayNumericExp(double *, double *, double *, double *, double)
 * double decayNumericGauss(double *, double *, double *, double *, double)
 * double decayNumericLinear(double *, double *, double *, double *, double)
-* double dotProduct(List, VectorScriptDocValues.DenseVectorScriptDocValues)
-* double dotProductSparse(Map *, VectorScriptDocValues.SparseVectorScriptDocValues)
+* double dotProduct(List, String)
+* double dotProductSparse(Map *, String)
 * double randomScore(int *)
 * double randomScore(int *, String *)
 * double saturation(double, double)

docs/reference/migration/migrate_7_6.asciidoc

Lines changed: 7 additions & 0 deletions
@@ -29,3 +29,10 @@ We have not seen much interest in this experimental field type, and don't see
 a clear use case as it's currently designed. If you have feedback or
 suggestions around sparse vector functionality, please let us know through
 GitHub or the 'discuss' forums.
+
+[discrete]
+==== Update to vector function signatures
+The vector functions of the form `function(query, doc['field'])` are
+deprecated, and the form `function(query, 'field')` should be used instead.
+For example, `cosineSimilarity(query, doc['field'])` is replaced by
+`cosineSimilarity(query, 'field')`.
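For script sources that still use the deprecated form, the rewrite described in this migration note is mechanical. The helper below is a hypothetical sketch, not part of this commit or of Elasticsearch: `VectorScriptMigrationSketch` and `migrate` are made-up names, and the regular expression assumes the first argument is a simple expression such as `params.query_vector` with no nested commas or parentheses.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that rewrites function(query, doc['field']) to function(query, 'field'). */
final class VectorScriptMigrationSketch {
    // Group 1: function name, group 2: first argument, group 3: field name.
    private static final Pattern OLD_FORM = Pattern.compile(
        "\\b((?:cosineSimilarity|dotProduct|l1norm|l2norm)(?:Sparse)?)"
            + "\\(([^,()]+),\\s*doc\\['([^']+)'\\]\\)");

    static String migrate(String source) {
        // Other uses of doc['field'], such as doc['field'].size(), are left untouched.
        return OLD_FORM.matcher(source).replaceAll("$1($2, '$3')");
    }
}
```

A text rewrite like this is only a starting point; scripts with more complex first arguments would need to be checked by hand.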

docs/reference/vectors/vector-functions.asciidoc

Lines changed: 9 additions & 9 deletions
@@ -68,7 +68,7 @@ GET my_index/_search
 }
 },
 "script": {
-"source": "cosineSimilarity(params.query_vector, doc['my_dense_vector']) + 1.0", <2>
+"source": "cosineSimilarity(params.query_vector, 'my_dense_vector') + 1.0", <2>
 "params": {
 "query_vector": [4, 3.4, -0.2] <3>
 }
@@ -105,7 +105,7 @@ GET my_index/_search
 },
 "script": {
 "source": """
-double value = dotProduct(params.query_vector, doc['my_dense_vector']);
+double value = dotProduct(params.query_vector, 'my_dense_vector');
 return sigmoid(1, Math.E, -value); <1>
 """,
 "params": {
@@ -139,7 +139,7 @@ GET my_index/_search
 }
 },
 "script": {
-"source": "1 / (1 + l1norm(params.queryVector, doc['my_dense_vector']))", <1>
+"source": "1 / (1 + l1norm(params.queryVector, 'my_dense_vector'))", <1>
 "params": {
 "queryVector": [4, 3.4, -0.2]
 }
@@ -178,7 +178,7 @@ GET my_index/_search
 }
 },
 "script": {
-"source": "1 / (1 + l2norm(params.queryVector, doc['my_dense_vector']))",
+"source": "1 / (1 + l2norm(params.queryVector, 'my_dense_vector'))",
 "params": {
 "queryVector": [4, 3.4, -0.2]
 }
@@ -196,7 +196,7 @@ You can check if a document has a value for the field `my_vector` by
 
 [source,js]
 --------------------------------------------------
-"source": "doc['my_vector'].size() == 0 ? 0 : cosineSimilarity(params.queryVector, doc['my_vector'])"
+"source": "doc['my_vector'].size() == 0 ? 0 : cosineSimilarity(params.queryVector, 'my_vector')"
 --------------------------------------------------
 // NOTCONSOLE
 
@@ -262,7 +262,7 @@ GET my_sparse_index/_search
 }
 },
 "script": {
-"source": "cosineSimilaritySparse(params.query_vector, doc['my_sparse_vector']) + 1.0",
+"source": "cosineSimilaritySparse(params.query_vector, 'my_sparse_vector') + 1.0",
 "params": {
 "query_vector": {"2": 0.5, "10" : 111.3, "50": -1.3, "113": 14.8, "4545": 156.0}
 }
@@ -294,7 +294,7 @@ GET my_sparse_index/_search
 },
 "script": {
 "source": """
-double value = dotProductSparse(params.query_vector, doc['my_sparse_vector']);
+double value = dotProductSparse(params.query_vector, 'my_sparse_vector');
 return sigmoid(1, Math.E, -value);
 """,
 "params": {
@@ -327,7 +327,7 @@ GET my_sparse_index/_search
 }
 },
 "script": {
-"source": "1 / (1 + l1normSparse(params.queryVector, doc['my_sparse_vector']))",
+"source": "1 / (1 + l1normSparse(params.queryVector, 'my_sparse_vector'))",
 "params": {
 "queryVector": {"2": 0.5, "10" : 111.3, "50": -1.3, "113": 14.8, "4545": 156.0}
 }
@@ -358,7 +358,7 @@ GET my_sparse_index/_search
 }
 },
 "script": {
-"source": "1 / (1 + l2normSparse(params.queryVector, doc['my_sparse_vector']))",
+"source": "1 / (1 + l2normSparse(params.queryVector, 'my_sparse_vector'))",
 "params": {
 "queryVector": {"2": 0.5, "10" : 111.3, "50": -1.3, "113": 14.8, "4545": 156.0}
 }
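The score expressions in these documentation examples can be read as ordinary vector arithmetic. The sketch below works on plain `double[]` arrays and is not the Elasticsearch implementation: `VectorScoreSketch` and its method names are illustrative, and the `sigmoid` body assumes the definition `value^a / (k^a + value^a)`, under which `sigmoid(1, Math.E, -value)` reduces to `1 / (1 + e^-value)`.

```java
/** Illustrative re-implementation of the scoring math on plain arrays. */
final class VectorScoreSketch {
    static double dotProduct(double[] q, double[] d) {
        checkDims(q, d);
        double sum = 0;
        for (int i = 0; i < q.length; i++) {
            sum += q[i] * d[i];
        }
        return sum;
    }

    /** Ranges over [-1, 1]; the docs examples add 1.0 to keep the score non-negative. */
    static double cosineSimilarity(double[] q, double[] d) {
        return dotProduct(q, d) / (Math.sqrt(dotProduct(q, q)) * Math.sqrt(dotProduct(d, d)));
    }

    /** Manhattan distance; the docs examples use 1 / (1 + l1norm) so closer vectors score higher. */
    static double l1norm(double[] q, double[] d) {
        checkDims(q, d);
        double sum = 0;
        for (int i = 0; i < q.length; i++) {
            sum += Math.abs(q[i] - d[i]);
        }
        return sum;
    }

    /** Euclidean distance. */
    static double l2norm(double[] q, double[] d) {
        checkDims(q, d);
        double sum = 0;
        for (int i = 0; i < q.length; i++) {
            double diff = q[i] - d[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Assumed definition: value^a / (k^a + value^a). */
    static double sigmoid(double value, double k, double a) {
        double numerator = Math.pow(value, a);
        return numerator / (Math.pow(k, a) + numerator);
    }

    private static void checkDims(double[] q, double[] d) {
        if (q.length != d.length) {
            throw new IllegalArgumentException("Vector dimensions differ");
        }
    }
}
```

Each transformation serves the same purpose: mapping a similarity or distance onto a non-negative score that grows as the document vector gets closer to the query vector.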

server/src/main/java/org/elasticsearch/script/ScoreScript.java

Lines changed: 1 addition & 1 deletion
@@ -119,7 +119,7 @@ public Map<String, Object> getParams() {
 }
 
 /** The doc lookup for the Lucene segment this script was created for. */
-public final Map<String, ScriptDocValues<?>> getDoc() {
+public Map<String, ScriptDocValues<?>> getDoc() {
 return leafLookup.doc();
 }
 

x-pack/plugin/src/test/resources/rest-api-spec/test/vectors/10_dense_vector_basic.yml

Lines changed: 26 additions & 3 deletions
@@ -1,6 +1,6 @@
 setup:
 - skip:
-features: headers
+features: [headers, warnings]
 version: " - 7.2.99"
 reason: "dense_vector dims parameter was added from 7.3"
 
@@ -52,7 +52,7 @@ setup:
 script_score:
 query: {match_all: {} }
 script:
-source: "dotProduct(params.query_vector, doc['my_dense_vector'])"
+source: "dotProduct(params.query_vector, 'my_dense_vector')"
 params:
 query_vector: [0.5, 111.3, -13.0, 14.8, -156.0]
 
@@ -82,7 +82,7 @@ setup:
 script_score:
 query: {match_all: {} }
 script:
-source: "cosineSimilarity(params.query_vector, doc['my_dense_vector'])"
+source: "cosineSimilarity(params.query_vector, 'my_dense_vector')"
 params:
 query_vector: [0.5, 111.3, -13.0, 14.8, -156.0]
 
@@ -99,3 +99,26 @@ setup:
 - match: {hits.hits.2._id: "1"}
 - gte: {hits.hits.2._score: 0.78}
 - lte: {hits.hits.2._score: 0.791}
+
+---
+"Deprecated function signature":
+- do:
+headers:
+Content-Type: application/json
+warnings:
+- The vector functions of the form function(query, doc['field']) are deprecated, and the form function(query, 'field') should be used instead. For example, cosineSimilarity(query, doc['field']) is replaced by cosineSimilarity(query, 'field').
+search:
+rest_total_hits_as_int: true
+body:
+query:
+script_score:
+query: {match_all: {} }
+script:
+source: "cosineSimilarity(params.query_vector, doc['my_dense_vector'])"
+params:
+query_vector: [0.5, 111.3, -13.0, 14.8, -156.0]
+
+- match: {hits.total: 3}
+- match: {hits.hits.0._id: "3"}
+- match: {hits.hits.1._id: "2"}
+- match: {hits.hits.2._id: "1"}

x-pack/plugin/src/test/resources/rest-api-spec/test/vectors/15_dense_vector_l1l2.yml

Lines changed: 2 additions & 2 deletions
@@ -53,7 +53,7 @@ setup:
 script_score:
 query: {match_all: {} }
 script:
-source: "l1norm(params.query_vector, doc['my_dense_vector'])"
+source: "l1norm(params.query_vector, 'my_dense_vector')"
 params:
 query_vector: [0.5, 111.3, -13.0, 14.8, -156.0]
 
@@ -83,7 +83,7 @@ setup:
 script_score:
 query: {match_all: {} }
 script:
-source: "l2norm(params.query_vector, doc['my_dense_vector'])"
+source: "l2norm(params.query_vector, 'my_dense_vector')"
 params:
 query_vector: [0.5, 111.3, -13.0, 14.8, -156.0]
 

x-pack/plugin/src/test/resources/rest-api-spec/test/vectors/20_dense_vector_special_cases.yml

Lines changed: 7 additions & 7 deletions
@@ -62,7 +62,7 @@ setup:
 script_score:
 query: {match_all: {} }
 script:
-source: "cosineSimilarity(params.query_vector, doc['my_dense_vector'])"
+source: "cosineSimilarity(params.query_vector, 'my_dense_vector')"
 params:
 query_vector: [10, 10, 10]
 
@@ -81,7 +81,7 @@ setup:
 script_score:
 query: {match_all: {} }
 script:
-source: "cosineSimilarity(params.query_vector, doc['my_dense_vector'])"
+source: "cosineSimilarity(params.query_vector, 'my_dense_vector')"
 params:
 query_vector: [10.0, 10.0, 10.0]
 
@@ -111,7 +111,7 @@ setup:
 script_score:
 query: {match_all: {} }
 script:
-source: "cosineSimilarity(params.query_vector, doc['my_dense_vector'])"
+source: "cosineSimilarity(params.query_vector, 'my_dense_vector')"
 params:
 query_vector: [1, 2, 3, 4]
 - match: { error.root_cause.0.type: "script_exception" }
@@ -125,7 +125,7 @@ setup:
 script_score:
 query: {match_all: {} }
 script:
-source: "dotProduct(params.query_vector, doc['my_dense_vector'])"
+source: "dotProduct(params.query_vector, 'my_dense_vector')"
 params:
 query_vector: [1, 2, 3, 4]
 - match: { error.root_cause.0.type: "script_exception" }
@@ -161,7 +161,7 @@ setup:
 script_score:
 query: {match_all: {} }
 script:
-source: "cosineSimilarity(params.query_vector, doc['my_dense_vector'])"
+source: "cosineSimilarity(params.query_vector, 'my_dense_vector')"
 params:
 query_vector: [10.0, 10.0, 10.0]
 - match: { error.root_cause.0.type: "script_exception" }
@@ -177,7 +177,7 @@ setup:
 script_score:
 query: {match_all: {} }
 script:
-source: "doc['my_dense_vector'].size() == 0 ? 0 : cosineSimilarity(params.query_vector, doc['my_dense_vector'])"
+source: "doc['my_dense_vector'].size() == 0 ? 0 : cosineSimilarity(params.query_vector, 'my_dense_vector')"
 params:
 query_vector: [10.0, 10.0, 10.0]
 
@@ -208,7 +208,7 @@ setup:
 script_score:
 query: {match_all: {} }
 script:
-source: "dotProductSparse(params.query_vector, doc['my_dense_vector'])"
+source: "dotProductSparse(params.query_vector, 'my_dense_vector')"
 params:
 query_vector: {"2": 0.5, "10" : 111.3, "3": 44}
 - match: { error.root_cause.0.type: "script_exception" }

x-pack/plugin/src/test/resources/rest-api-spec/test/vectors/30_sparse_vector_basic.yml

Lines changed: 26 additions & 2 deletions
@@ -55,7 +55,7 @@ setup:
 script_score:
 query: {match_all: {} }
 script:
-source: "dotProductSparse(params.query_vector, doc['my_sparse_vector'])"
+source: "dotProductSparse(params.query_vector, 'my_sparse_vector')"
 params:
 query_vector: {"2": 0.5, "10" : 111.3, "50": -13.0, "113": 14.8, "4545": -156.0}
 
@@ -87,7 +87,7 @@ setup:
 script_score:
 query: {match_all: {} }
 script:
-source: "cosineSimilaritySparse(params.query_vector, doc['my_sparse_vector'])"
+source: "cosineSimilaritySparse(params.query_vector, 'my_sparse_vector')"
 params:
 query_vector: {"2": -0.5, "10" : 111.3, "50": -13.0, "113": 14.8, "4545": -156.0}
 
@@ -104,3 +104,27 @@ setup:
 - match: {hits.hits.2._id: "1"}
 - gte: {hits.hits.2._score: 0.78}
 - lte: {hits.hits.2._score: 0.791}
+
+---
+"Deprecated function signature":
+- do:
+headers:
+Content-Type: application/json
+warnings:
+- The [sparse_vector] field type is deprecated and will be removed in 8.0.
+- The vector functions of the form function(query, doc['field']) are deprecated, and the form function(query, 'field') should be used instead. For example, cosineSimilarity(query, doc['field']) is replaced by cosineSimilarity(query, 'field').
+search:
+rest_total_hits_as_int: true
+body:
+query:
+script_score:
+query: {match_all: {} }
+script:
+source: "cosineSimilaritySparse(params.query_vector, doc['my_sparse_vector'])"
+params:
+query_vector: {"2": -0.5, "10" : 111.3, "50": -13.0, "113": 14.8, "4545": -156.0}
+
+- match: {hits.total: 3}
+- match: {hits.hits.0._id: "3"}
+- match: {hits.hits.1._id: "2"}
+- match: {hits.hits.2._id: "1"}

x-pack/plugin/src/test/resources/rest-api-spec/test/vectors/35_sparse_vector_l1l2.yml

Lines changed: 2 additions & 2 deletions
@@ -55,7 +55,7 @@ setup:
 script_score:
 query: {match_all: {} }
 script:
-source: "l1normSparse(params.query_vector, doc['my_sparse_vector'])"
+source: "l1normSparse(params.query_vector, 'my_sparse_vector')"
 params:
 query_vector: {"2": 0.5, "10" : 111.3, "50": -13.0, "113": 14.8, "4545": -156.0}
 
@@ -88,7 +88,7 @@ setup:
 script_score:
 query: {match_all: {} }
 script:
-source: "l2normSparse(params.query_vector, doc['my_sparse_vector'])"
+source: "l2normSparse(params.query_vector, 'my_sparse_vector')"
 params:
 query_vector: {"2": 0.5, "10" : 111.3, "50": -13.0, "113": 14.8, "4545": -156.0}
 
