<titleabbrev>Distance feature</titleabbrev>
++++

- The `distance_feature` query is a specialized query that only works
- on <<date, `date`>>, <<date_nanos, `date_nanos`>> or <<geo-point,`geo_point`>>
- fields. Its goal is to boost documents' scores based on proximity
- to some given origin. For example, use this query if you want to
- give more weight to documents with dates closer to a certain date,
- or to documents with locations closer to a certain location.
-
- This query is called `distance_feature` query, because it dynamically
- calculates distances between the given origin and documents' field values,
- and use these distances as features to boost the documents' scores.
-
- `distance_feature` query is typically used on its own to find the nearest
- neighbors to a given point, or put in a `should` clause of a
- <<query-dsl-bool-query,`bool`>> query so that its score is added to the score
- of the query.
-
- Compared to using <<query-dsl-function-score-query,`function_score`>> or other
- ways to modify the score, this query has the benefit of being able to
- efficiently skip non-competitive hits when
- <<search-uri-request,`track_total_hits`>> is not set to `true`.
-
- ==== Syntax of distance_feature query
-
- `distance_feature` query has the following syntax:
- [source,js]
- --------------------------------------------------
- "distance_feature": {
-     "field": <field>,
-     "origin": <origin>,
-     "pivot": <pivot>,
-     "boost" : <boost>
- }
- --------------------------------------------------
- // NOTCONSOLE
-
- [horizontal]
- `field`::
- Required parameter. Defines the name of the field on which to calculate
- distances. Must be a field of the type `date`, `date_nanos` or `geo_point`,
- and must be indexed (`"index": true`, which is the default) and has
- <<doc-values, doc values>> (`"doc_values": true`, which is the default).
-
- `origin`::
- Required parameter. Defines a point of origin used for calculating
- distances. Must be a date for date and date_nanos fields,
- and a geo-point for geo_point fields. Date math (for example `now-1h`) is
- supported for a date origin.
-
- `pivot`::
- Required parameter. Defines the distance from origin at which the computed
- score will equal to a half of the `boost` parameter. Must be
- a `number+date unit` ("1h", "10d",...) for date and date_nanos fields,
- and a `number + geo unit` ("1km", "12m",...) for geo fields.
+ Boosts the <<query-filter-context, relevance score>> of documents closer to a
+ provided `origin` date or point. For example, you can use this query to give
+ more weight to documents closer to a certain date or location.

- `boost`::
- Optional parameter with a default value of `1`. Defines the factor by which
- to multiply the score. Must be a non-negative float number.
+ You can use the `distance_feature` query to find the nearest neighbors to a
+ location. You can also use the query in a <<query-dsl-bool-query,`bool`>>
+ search's `should` filter to add boosted relevance scores to the `bool` query's
+ scores.


- The `distance_feature` query computes a document's score as following:
+ [[distance-feature-query-ex-request]]
+ ==== Example request

- `score = boost * pivot / (pivot + distance)`
+ [[distance-feature-index-setup]]
+ ===== Index setup
+ To use the `distance_feature` query, your index must include a <<date, `date`>>,
+ <<date_nanos, `date_nanos`>> or <<geo-point,`geo_point`>> field.

- where `distance` is the absolute difference between the origin and
- a document's field value.
+ To see how you can set up an index for the `distance_feature` query, try the
+ following example.

- ==== Example using distance_feature query
+ . Create an `items` index with the following field mapping:
+ +
+ --

- Let's look at an example. We index several documents containing
- information about sales items, such as name, production date,
- and location.
+ * `name`, a <<keyword,`keyword`>> field
+ * `production_date`, a <<date, `date`>> field
+ * `location`, a <<geo-point,`geo_point`>> field

[source,js]
- --------------------------------------------------
- PUT items
+ ----
+ PUT /items
{
  "mappings": {
    "properties": {
@@ -94,40 +51,54 @@ PUT items
    }
  }
}
+ ----
+ // CONSOLE
+ // TESTSETUP
+ --

- PUT items/_doc/1
+ . Index several documents to this index.
+ +
+ --
+ [source,js]
+ ----
+ PUT /items/_doc/1?refresh
{
  "name" : "chocolate",
  "production_date": "2018-02-01",
  "location": [-71.34, 41.12]
}

- PUT items/_doc/2
+ PUT /items/_doc/2?refresh
{
  "name" : "chocolate",
  "production_date": "2018-01-01",
  "location": [-71.3, 41.15]
}


- PUT items/_doc/3
+ PUT /items/_doc/3?refresh
{
  "name" : "chocolate",
  "production_date": "2017-12-01",
  "location": [-71.3, 41.12]
}
-
- POST items/_refresh
- --------------------------------------------------
+ ----
// CONSOLE
+ --
+
+
+ [[distance-feature-query-ex-query]]
+ ===== Example queries

- We look for all chocolate items, but we also want chocolates
- that are produced recently (closer to the date `now`)
- to be ranked higher.
+ [[distance-feature-query-date-ex]]
+ ====== Boost documents based on date
+ The following `bool` search returns documents with a `name` value of
+ `chocolate`. The search also uses the `distance_feature` query to increase the
+ relevance score of documents with a `production_date` value closer to `now`.

[source,js]
- --------------------------------------------------
- GET items/_search
+ ----
+ GET /items/_search
{
  "query": {
    "bool": {
@@ -146,17 +117,18 @@ GET items/_search
    }
  }
}
- --------------------------------------------------
+ ----
// CONSOLE
- // TEST[continued]
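
To make the date boost concrete, the following Python sketch reproduces only the `distance_feature` contribution for the three indexed documents. The `distance_feature` clause of the request is elided in this diff, so the `origin` (a fixed date standing in for `now`), the `pivot` of 7 days, and the `boost` of 1.0 are assumptions chosen for illustration, not values taken from the request.

```python
from datetime import date

# production_date values of the three documents indexed above
production_dates = {
    "1": date(2018, 2, 1),
    "2": date(2018, 1, 1),
    "3": date(2017, 12, 1),
}

# Assumptions for illustration only (the real clause is elided in the diff)
origin = date(2018, 3, 1)   # fixed stand-in for `now`
pivot_days = 7.0
boost = 1.0


def date_score(production_date):
    """boost * pivot / (pivot + distance), with distance measured in days."""
    distance_days = abs((origin - production_date).days)
    return boost * pivot_days / (pivot_days + distance_days)


scores = {doc_id: date_score(d) for doc_id, d in production_dates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for doc_id in ranked:
    print(doc_id, round(scores[doc_id], 4))
```

Under these assumptions, the most recently produced document receives the largest boost and the oldest the smallest.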

- We can look for all chocolate items, but we also want chocolates
- that are produced locally (closer to our geo origin)
- come first in the result list.
+ [[distance-feature-query-distance-ex]]
+ ====== Boost documents based on location
+ The following `bool` search returns documents with a `name` value of
+ `chocolate`. The search also uses the `distance_feature` query to increase the
+ relevance score of documents with a `location` value closer to `[-71.3, 41.15]`.

[source,js]
- --------------------------------------------------
- GET items/_search
+ ----
+ GET /items/_search
{
  "query": {
    "bool": {
@@ -175,6 +147,83 @@ GET items/_search
    }
  }
}
- --------------------------------------------------
+ ----
// CONSOLE
- // TEST[continued]
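
The location boost can be sketched the same way. This Python example assumes the distance is a haversine great-circle distance in meters on a sphere of radius 6,371,008.8 m, and it assumes a `pivot` of 1000 m and a `boost` of 1.0; the diff elides the actual clause and does not state which distance function Elasticsearch uses internally, so treat the numbers as illustrative. Coordinates follow the `[lon, lat]` array order used in the indexed documents.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius, in meters

# [lon, lat] values of the three documents indexed above
locations = {
    "1": (-71.34, 41.12),
    "2": (-71.3, 41.15),
    "3": (-71.3, 41.12),
}
origin = (-71.3, 41.15)   # origin named in the text above
pivot_m = 1000.0          # assumption: the real pivot is elided in the diff
boost = 1.0               # assumption


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def geo_score(location):
    """boost * pivot / (pivot + distance), with distance in meters."""
    distance_m = haversine_m(*origin, *location)
    return boost * pivot_m / (pivot_m + distance_m)


scores = {doc_id: geo_score(loc) for doc_id, loc in locations.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for doc_id in ranked:
    print(doc_id, round(scores[doc_id], 4))
```

Document 2 sits exactly at the origin, so its distance is zero and it receives the full `boost`; the other two documents are a few kilometers away and receive a fraction of it.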
+
+
+ [[distance-feature-top-level-params]]
+ ==== Top-level parameters for `distance_feature`
+ `field`::
+ (Required, string) Name of the field used to calculate distances. This field
+ must meet the following criteria:
+
+ * Be a <<date, `date`>>, <<date_nanos, `date_nanos`>> or
+ <<geo-point,`geo_point`>> field
+ * Have an <<mapping-index,`index`>> mapping parameter value of `true`, which is
+ the default
+ * Have an <<doc-values,`doc_values`>> mapping parameter value of `true`, which
+ is the default
+
+ `origin`::
+ +
+ --
+ (Required, string) Date or point of origin used to calculate distances.
+
+ If the `field` value is a <<date, `date`>> or <<date_nanos, `date_nanos`>>
+ field, the `origin` value must be a <<date-format-pattern,date>>.
+ <<date-math,Date Math>>, such as `now-1h`, is supported.
+
+ If the `field` value is a <<geo-point,`geo_point`>> field, the `origin` value
+ must be a geopoint.
+ --
+
+ `pivot`::
+ +
+ --
+ (Required, <<time-units,time unit>> or <<distance-units,distance unit>>)
+ Distance from the `origin` at which relevance scores receive half of the `boost`
+ value.
+
+ If the `field` value is a <<date, `date`>> or <<date_nanos, `date_nanos`>>
+ field, the `pivot` value must be a <<time-units,time unit>>, such as `1h` or
+ `10d`.
+
+ If the `field` value is a <<geo-point,`geo_point`>> field, the `pivot` value
+ must be a <<distance-units,distance unit>>, such as `1km` or `12m`.
+ --
+
+ `boost`::
+ +
+ --
+ (Optional, float) Floating point number used to multiply the
+ <<query-filter-context, relevance score>> of matching documents. This value
+ cannot be negative. Defaults to `1.0`.
+ --
+
+
+ [[distance-feature-notes]]
+ ==== Notes
+
+ [[distance-feature-calculation]]
+ ===== How the `distance_feature` query calculates relevance scores
+ The `distance_feature` query dynamically calculates the distance between the
+ `origin` value and a document's field values. It then uses this distance as a
+ feature to boost the <<query-filter-context, relevance score>> of closer
+ documents.
+
+ The `distance_feature` query calculates a document's <<query-filter-context,
+ relevance score>> as follows:
+
+ ```
+ relevance score = boost * pivot / (pivot + distance)
+ ```
+
+ The `distance` is the absolute difference between the `origin` value and a
+ document's field value.
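
As a quick numeric check of the formula, here is a minimal Python sketch. The function name `distance_feature_score` and the input checks are hypothetical and written for this illustration; they are not part of any Elasticsearch client. The `distance` and `pivot` arguments must share a unit (days in this example).

```python
def distance_feature_score(distance, pivot, boost=1.0):
    """Evaluate boost * pivot / (pivot + distance) for one document.

    `distance` and `pivot` must be expressed in the same unit.
    """
    if pivot <= 0:
        # Sanity check added for this sketch
        raise ValueError("pivot must be positive")
    if boost < 0:
        # Mirrors the documented rule that boost cannot be negative
        raise ValueError("boost cannot be negative")
    return boost * pivot / (pivot + abs(distance))


# With a pivot of 7 days and a boost of 2.0:
at_origin = distance_feature_score(0, pivot=7.0, boost=2.0)    # full boost
at_pivot = distance_feature_score(7, pivot=7.0, boost=2.0)     # half the boost
far_away = distance_feature_score(70, pivot=7.0, boost=2.0)    # small boost

print(at_origin, at_pivot, far_away)
```

A document at the origin receives the full `boost`, a document exactly one `pivot` away receives half of it, and the score keeps decaying toward zero as the distance grows, without ever becoming negative.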
+
+ [[distance-feature-skip-hits]]
+ ===== Skip non-competitive hits
+ Unlike the <<query-dsl-function-score-query,`function_score`>> query or other
+ ways to change <<query-filter-context, relevance scores>>, the
+ `distance_feature` query efficiently skips non-competitive hits when the
+ <<search-uri-request,`track_total_hits`>> parameter is **not** `true`.