Commit 7b9675b

jrodewig authored and jkakavas committed
[DOCS] Reformat distance feature query (#44916)
1 parent 3eb5d26 commit 7b9675b

1 file changed, +135 -86 lines changed

docs/reference/query-dsl/distance-feature-query.asciidoc

Lines changed: 135 additions & 86 deletions
@@ -4,81 +4,38 @@
44
<titleabbrev>Distance feature</titleabbrev>
55
++++
66

7-
The `distance_feature` query is a specialized query that only works
8-
on <<date, `date`>>, <<date_nanos, `date_nanos`>> or <<geo-point,`geo_point`>>
9-
fields. Its goal is to boost documents' scores based on proximity
10-
to some given origin. For example, use this query if you want to
11-
give more weight to documents with dates closer to a certain date,
12-
or to documents with locations closer to a certain location.
13-
14-
This query is called `distance_feature` query, because it dynamically
15-
calculates distances between the given origin and documents' field values,
16-
and use these distances as features to boost the documents' scores.
17-
18-
`distance_feature` query is typically used on its own to find the nearest
19-
neighbors to a given point, or put in a `should` clause of a
20-
<<query-dsl-bool-query,`bool`>> query so that its score is added to the score
21-
of the query.
22-
23-
Compared to using <<query-dsl-function-score-query,`function_score`>> or other
24-
ways to modify the score, this query has the benefit of being able to
25-
efficiently skip non-competitive hits when
26-
<<search-uri-request,`track_total_hits`>> is not set to `true`.
27-
28-
==== Syntax of distance_feature query
29-
30-
`distance_feature` query has the following syntax:
31-
[source,js]
32-
--------------------------------------------------
33-
"distance_feature": {
34-
"field": <field>,
35-
"origin": <origin>,
36-
"pivot": <pivot>,
37-
"boost" : <boost>
38-
}
39-
--------------------------------------------------
40-
// NOTCONSOLE
41-
42-
[horizontal]
43-
`field`::
44-
Required parameter. Defines the name of the field on which to calculate
45-
distances. Must be a field of the type `date`, `date_nanos` or `geo_point`,
46-
and must be indexed (`"index": true`, which is the default) and has
47-
<<doc-values, doc values>> (`"doc_values": true`, which is the default).
48-
49-
`origin`::
50-
Required parameter. Defines a point of origin used for calculating
51-
distances. Must be a date for date and date_nanos fields,
52-
and a geo-point for geo_point fields. Date math (for example `now-1h`) is
53-
supported for a date origin.
54-
55-
`pivot`::
56-
Required parameter. Defines the distance from origin at which the computed
57-
score will equal to a half of the `boost` parameter. Must be
58-
a `number+date unit` ("1h", "10d",...) for date and date_nanos fields,
59-
and a `number + geo unit` ("1km", "12m",...) for geo fields.
7+
Boosts the <<query-filter-context, relevance score>> of documents closer to a
8+
provided `origin` date or point. For example, you can use this query to give
9+
more weight to documents closer to a certain date or location.
6010

61-
`boost`::
62-
Optional parameter with a default value of `1`. Defines the factor by which
63-
to multiply the score. Must be a non-negative float number.
11+
You can use the `distance_feature` query to find the nearest neighbors to a
12+
location. You can also use the query in a <<query-dsl-bool-query,`bool`>>
13+
search's `should` clause to add boosted relevance scores to the `bool` query's
14+
scores.
6415

6516

66-
The `distance_feature` query computes a document's score as following:
17+
[[distance-feature-query-ex-request]]
18+
==== Example request
6719

68-
`score = boost * pivot / (pivot + distance)`
20+
[[distance-feature-index-setup]]
21+
===== Index setup
22+
To use the `distance_feature` query, your index must include a <<date, `date`>>,
23+
<<date_nanos, `date_nanos`>> or <<geo-point,`geo_point`>> field.
6924

70-
where `distance` is the absolute difference between the origin and
71-
a document's field value.
25+
To see how you can set up an index for the `distance_feature` query, try the
26+
following example.
7227

73-
==== Example using distance_feature query
28+
. Create an `items` index with the following field mapping:
29+
+
30+
--
7431

75-
Let's look at an example. We index several documents containing
76-
information about sales items, such as name, production date,
77-
and location.
32+
* `name`, a <<keyword,`keyword`>> field
33+
* `production_date`, a <<date, `date`>> field
34+
* `location`, a <<geo-point,`geo_point`>> field
7835

7936
[source,js]
80-
--------------------------------------------------
81-
PUT items
37+
----
38+
PUT /items
8239
{
8340
"mappings": {
8441
"properties": {
@@ -94,40 +51,54 @@ PUT items
9451
}
9552
}
9653
}
54+
----
55+
// CONSOLE
56+
// TESTSETUP
57+
--
9758

98-
PUT items/_doc/1
59+
. Index several documents to this index.
60+
+
61+
--
62+
[source,js]
63+
----
64+
PUT /items/_doc/1?refresh
9965
{
10066
"name" : "chocolate",
10167
"production_date": "2018-02-01",
10268
"location": [-71.34, 41.12]
10369
}
10470
105-
PUT items/_doc/2
71+
PUT /items/_doc/2?refresh
10672
{
10773
"name" : "chocolate",
10874
"production_date": "2018-01-01",
10975
"location": [-71.3, 41.15]
11076
}
11177
11278
113-
PUT items/_doc/3
79+
PUT /items/_doc/3?refresh
11480
{
11581
"name" : "chocolate",
11682
"production_date": "2017-12-01",
11783
"location": [-71.3, 41.12]
11884
}
119-
120-
POST items/_refresh
121-
--------------------------------------------------
85+
----
12286
// CONSOLE
87+
--
88+
89+
90+
[[distance-feature-query-ex-query]]
91+
===== Example queries
12392

124-
We look for all chocolate items, but we also want chocolates
125-
that are produced recently (closer to the date `now`)
126-
to be ranked higher.
93+
[[distance-feature-query-date-ex]]
94+
====== Boost documents based on date
95+
The following `bool` search returns documents with a `name` value of
96+
`chocolate`. The search also uses the `distance_feature` query to increase the
97+
relevance score of documents with a `production_date` value closer to `now`.
12798

12899
[source,js]
129-
--------------------------------------------------
130-
GET items/_search
100+
----
101+
GET /items/_search
131102
{
132103
"query": {
133104
"bool": {
@@ -146,17 +117,18 @@ GET items/_search
146117
}
147118
}
148119
}
149-
--------------------------------------------------
120+
----
150121
// CONSOLE
151-
// TEST[continued]
152122

153-
We can look for all chocolate items, but we also want chocolates
154-
that are produced locally (closer to our geo origin)
155-
come first in the result list.
123+
[[distance-feature-query-distance-ex]]
124+
====== Boost documents based on location
125+
The following `bool` search returns documents with a `name` value of
126+
`chocolate`. The search also uses the `distance_feature` query to increase the
127+
relevance score of documents with a `location` value closer to `[-71.3, 41.15]`.
156128

157129
[source,js]
158-
--------------------------------------------------
159-
GET items/_search
130+
----
131+
GET /items/_search
160132
{
161133
"query": {
162134
"bool": {
@@ -175,6 +147,83 @@ GET items/_search
175147
}
176148
}
177149
}
178-
--------------------------------------------------
150+
----
179151
// CONSOLE
180-
// TEST[continued]
152+
153+
154+
[[distance-feature-top-level-params]]
155+
==== Top-level parameters for `distance_feature`
156+
`field`::
157+
(Required, string) Name of the field used to calculate distances. This field
158+
must meet the following criteria:
159+
160+
* Be a <<date, `date`>>, <<date_nanos, `date_nanos`>> or
161+
<<geo-point,`geo_point`>> field
162+
* Have an <<mapping-index,`index`>> mapping parameter value of `true`, which is
163+
the default
164+
* Have an <<doc-values,`doc_values`>> mapping parameter value of `true`, which
165+
is the default
166+
167+
`origin`::
168+
+
169+
--
170+
(Required, string) Date or point of origin used to calculate distances.
171+
172+
If the `field` value is a <<date, `date`>> or <<date_nanos, `date_nanos`>>
173+
field, the `origin` value must be a <<date-format-pattern,date>>.
174+
<<date-math,Date Math>>, such as `now-1h`, is supported.
175+
176+
If the `field` value is a <<geo-point,`geo_point`>> field, the `origin` value
177+
must be a geopoint.
178+
--
179+
180+
`pivot`::
181+
+
182+
--
183+
(Required, <<time-units,time unit>> or <<distance-units,distance unit>>)
184+
Distance from the `origin` at which relevance scores receive half of the `boost`
185+
value.
186+
187+
If the `field` value is a <<date, `date`>> or <<date_nanos, `date_nanos`>>
188+
field, the `pivot` value must be a <<time-units,time unit>>, such as `1h` or
189+
`10d`.
190+
191+
If the `field` value is a <<geo-point,`geo_point`>> field, the `pivot` value
192+
must be a <<distance-units,distance unit>>, such as `1km` or `12m`.
193+
--
194+
195+
`boost`::
196+
+
197+
--
198+
(Optional, float) Floating point number used to multiply the
199+
<<query-filter-context, relevance score>> of matching documents. This value
200+
cannot be negative. Defaults to `1.0`.
201+
--
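
As a rough illustration of how the four top-level parameters fit together, the following Python sketch assembles a request body for the date example. The helper name `distance_feature_query`, the `match` clause on `name`, and the `7d` pivot are assumptions made for this sketch; they are not taken from the commit.

```python
import json


def distance_feature_query(field, origin, pivot, boost=1.0):
    """Build a `distance_feature` clause from its documented parameters.

    Hypothetical helper: it only assembles a dict and does not call Elasticsearch.
    """
    if boost < 0:
        # The documentation states that `boost` cannot be negative.
        raise ValueError("boost cannot be negative")
    return {
        "distance_feature": {
            "field": field,    # date, date_nanos, or geo_point field
            "origin": origin,  # date (date math allowed) or geo-point
            "pivot": pivot,    # time unit or distance unit, e.g. "7d" or "1km"
            "boost": boost,
        }
    }


# Assumed example: match chocolate items and boost those produced closer to now.
body = {
    "query": {
        "bool": {
            "must": {"match": {"name": "chocolate"}},
            "should": distance_feature_query("production_date", "now", "7d"),
        }
    }
}

print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of a `GET /items/_search` request.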
202+
203+
204+
[[distance-feature-notes]]
205+
==== Notes
206+
207+
[[distance-feature-calculation]]
208+
===== How the `distance_feature` query calculates relevance scores
209+
The `distance_feature` query dynamically calculates the distance between the
210+
`origin` value and a document's field values. It then uses this distance as a
211+
feature to boost the <<query-filter-context, relevance score>> of closer
212+
documents.
213+
214+
The `distance_feature` query calculates a document's <<query-filter-context,
215+
relevance score>> as follows:
216+
217+
```
218+
relevance score = boost * pivot / (pivot + distance)
219+
```
220+
221+
The `distance` is the absolute difference between the `origin` value and a
222+
document's field value.
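
The formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Elasticsearch's implementation; the function name is hypothetical, and `distance` and `pivot` are assumed to already be plain numbers in the same unit (for example, both in days or both in meters).

```python
def distance_feature_score(distance, pivot, boost=1.0):
    """Return boost * pivot / (pivot + distance).

    Hypothetical helper for illustration; `distance` and `pivot` must be
    expressed in the same unit.
    """
    if pivot <= 0:
        raise ValueError("pivot must be positive")
    if distance < 0:
        raise ValueError("distance is an absolute difference and cannot be negative")
    if boost < 0:
        raise ValueError("boost cannot be negative")
    return boost * pivot / (pivot + distance)


# A document exactly at the origin receives the full boost.
at_origin = distance_feature_score(distance=0, pivot=10)

# A document exactly one pivot away receives half of the boost.
at_pivot = distance_feature_score(distance=10, pivot=10)

# Farther documents keep a positive but shrinking score.
far_away = distance_feature_score(distance=30, pivot=10, boost=2.0)

print(at_origin, at_pivot, far_away)
```

The score never reaches zero, so distant documents still match; they just contribute less to the ranking.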
223+
224+
[[distance-feature-skip-hits]]
225+
===== Skip non-competitive hits
226+
Unlike the <<query-dsl-function-score-query,`function_score`>> query or other
227+
ways to change <<query-filter-context, relevance scores>>, the
228+
`distance_feature` query efficiently skips non-competitive hits when the
229+
<<search-uri-request,`track_total_hits`>> parameter is **not** `true`.
