Commit 32cb47b

Add l1norm and l2norm distances for vectors (#44116)
Add L1norm - Manhattan distance
Add L2norm - Euclidean distance

relates to #37947
1 parent 31725ef commit 32cb47b
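
For orientation, the two distances named in the commit message can be sketched in plain Python. This only illustrates the underlying math and is not the code added by this commit: the helper names `l1_distance` and `l2_distance` are hypothetical, and the dimension check is an assumption modelled on the documentation's note that mismatched dense vector dimensions raise an error.

```python
import math


def l1_distance(query, doc):
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    if len(query) != len(doc):
        # Assumption: reject mismatched dimensions, as the docs describe for dense vectors.
        raise ValueError("vectors must have the same number of dimensions")
    return sum(abs(q - d) for q, d in zip(query, doc))


def l2_distance(query, doc):
    """Euclidean (L2) distance: square root of the summed squared differences."""
    if len(query) != len(doc):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((q - d) ** 2 for q, d in zip(query, doc)))


if __name__ == "__main__":
    query_vector = [4, 3.4, -0.2]  # query vector borrowed from the docs example
    doc_vector = [1, 3.4, 3.8]     # made-up document vector, for illustration only
    print("L1:", l1_distance(query_vector, doc_vector))
    print("L2:", l2_distance(query_vector, doc_vector))
```

Both functions return a distance, so smaller values mean closer vectors. A relevance score built from them would typically need to invert the value, which is a design choice outside this sketch.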

8 files changed, +895 -163 lines changed

docs/reference/query-dsl/script-score-query.asciidoc

+4 -129
@@ -11,7 +11,6 @@ a function to be used to compute a new score for each document returned
 by the query. For more information on scripting see
 <<modules-scripting, scripting documentation>>.
 
-
 Here is an example of using `script_score` to assign each matched document
 a score equal to the number of likes divided by 10:
 
@@ -32,7 +31,6 @@ GET /_search
 }
 --------------------------------------------------
 // CONSOLE
-// TEST[setup:twitter]
 
 ==== Accessing the score of a document within a script
 
@@ -72,131 +70,6 @@ to be the most efficient by using the internal mechanisms.
 --------------------------------------------------
 // NOTCONSOLE
 
-[role="xpack"]
-[testenv="basic"]
-[[vector-functions]]
-===== Functions for vector fields
-
-experimental[]
-
-These functions are used for
-for <<dense-vector,`dense_vector`>> and
-<<sparse-vector,`sparse_vector`>> fields.
-
-NOTE: During vector functions' calculation, all matched documents are
-linearly scanned. Thus, expect the query time grow linearly
-with the number of matched documents. For this reason, we recommend
-to limit the number of matched documents with a `query` parameter.
-
-For dense_vector fields, `cosineSimilarity` calculates the measure of
-cosine similarity between a given query vector and document vectors.
-
-[source,js]
---------------------------------------------------
-{
-"query": {
-"script_score": {
-"query": {
-"match_all": {}
-},
-"script": {
-"source": "cosineSimilarity(params.queryVector, doc['my_dense_vector'])",
-"params": {
-"queryVector": [4, 3.4, -0.2] <1>
-}
-}
-}
-}
-}
---------------------------------------------------
-// NOTCONSOLE
-<1> To take advantage of the script optimizations, provide a query vector as a script parameter.
-
-Similarly, for sparse_vector fields, `cosineSimilaritySparse` calculates cosine similarity
-between a given query vector and document vectors.
-
-[source,js]
---------------------------------------------------
-{
-"query": {
-"script_score": {
-"query": {
-"match_all": {}
-},
-"script": {
-"source": "cosineSimilaritySparse(params.queryVector, doc['my_sparse_vector'])",
-"params": {
-"queryVector": {"2": 0.5, "10" : 111.3, "50": -1.3, "113": 14.8, "4545": 156.0}
-}
-}
-}
-}
-}
---------------------------------------------------
-// NOTCONSOLE
-
-For dense_vector fields, `dotProduct` calculates the measure of
-dot product between a given query vector and document vectors.
-
-[source,js]
---------------------------------------------------
-{
-"query": {
-"script_score": {
-"query": {
-"match_all": {}
-},
-"script": {
-"source": "dotProduct(params.queryVector, doc['my_dense_vector'])",
-"params": {
-"queryVector": [4, 3.4, -0.2]
-}
-}
-}
-}
-}
---------------------------------------------------
-// NOTCONSOLE
-
-Similarly, for sparse_vector fields, `dotProductSparse` calculates dot product
-between a given query vector and document vectors.
-
-[source,js]
---------------------------------------------------
-{
-"query": {
-"script_score": {
-"query": {
-"match_all": {}
-},
-"script": {
-"source": "dotProductSparse(params.queryVector, doc['my_sparse_vector'])",
-"params": {
-"queryVector": {"2": 0.5, "10" : 111.3, "50": -1.3, "113": 14.8, "4545": 156.0}
-}
-}
-}
-}
-}
---------------------------------------------------
-// NOTCONSOLE
-
-NOTE: If a document doesn't have a value for a vector field on which
-a vector function is executed, an error will be thrown.
-
-You can check if a document has a value for the field `my_vector` by
-`doc['my_vector'].size() == 0`. Your overall script can look like this:
-
-[source,js]
---------------------------------------------------
-"source": "doc['my_vector'].size() == 0 ? 0 : cosineSimilarity(params.queryVector, doc['my_vector'])"
---------------------------------------------------
-// NOTCONSOLE
-
-NOTE: If a document's dense vector field has a number of dimensions
-different from the query's vector, an error will be thrown.
-
-
 [[random-score-function]]
 ===== Random score function
 `random_score` function generates scores that are uniformly distributed
@@ -310,6 +183,9 @@ You can read more about decay functions
 NOTE: Decay functions on dates are limited to dates in the default format
 and default time zone. Also calculations with `now` are not supported.
 
+===== Functions for vector fields
+<<vector-functions, Functions for vector fields>> are accessible through
+`script_score` query.
 
 ==== Faster alternatives
 Script Score Query calculates the score for every hit (matching document).
@@ -409,5 +285,4 @@ through a script:
 Script Score query has equivalent <<decay-functions, decay functions>>
 that can be used in script.
 
-
-
+include::{es-repo-dir}/vectors/vector-functions.asciidoc[]
