Commit 58df375

[DOCS] Reformats interval query (elastic#45350)
1 parent 49bcad9 commit 58df375

1 file changed

+150
-77
lines changed

docs/reference/query-dsl/intervals-query.asciidoc

Lines changed: 150 additions & 77 deletions
@@ -4,17 +4,25 @@
44
<titleabbrev>Intervals</titleabbrev>
55
++++
66

7-
An `intervals` query allows fine-grained control over the order and proximity of
8-
matching terms. Matching rules are constructed from a small set of definitions,
9-
and the rules are then applied to terms from a particular `field`.
7+
Returns documents based on the order and proximity of matching terms.
8+
9+
The `intervals` query uses *matching rules*, constructed from a small set of
10+
definitions. These rules are then applied to terms from a specified `field`.
1011

1112
The definitions produce sequences of minimal intervals that span terms in a
12-
body of text. These intervals can be further combined and filtered by
13+
body of text. These intervals can be further combined and filtered by
1314
parent sources.
1415

15-
The example below will search for the phrase `my favourite food` appearing
16-
before the terms `hot` and `water` or `cold` and `porridge` in any order, in
17-
the field `my_text`
16+
17+
[[intervals-query-ex-request]]
18+
==== Example request
19+
20+
The following `intervals` search returns documents containing `my
21+
favorite food` immediately followed by `hot water` or `cold porridge` in the
22+
`my_text` field.
23+
24+
This search would match a `my_text` value of `my favorite food is cold
25+
porridge` but not `when it's cold my favorite food is porridge`.
1826

1927
[source,js]
2028
--------------------------------------------------
@@ -28,7 +36,7 @@ POST _search
2836
"intervals" : [
2937
{
3038
"match" : {
31-
"query" : "my favourite food",
39+
"query" : "my favorite food",
3240
"max_gaps" : 0,
3341
"ordered" : true
3442
}
@@ -42,81 +50,158 @@ POST _search
4250
}
4351
}
4452
]
45-
},
46-
"_name" : "favourite_food"
53+
}
4754
}
4855
}
4956
}
5057
}
5158
--------------------------------------------------
5259
// CONSOLE
5360

54-
In the above example, the text `my favourite food is cold porridge` would
55-
match because the two intervals matching `my favourite food` and `cold
56-
porridge` appear in the correct order, but the text `when it's cold my
57-
favourite food is porridge` would not match, because the interval matching
58-
`cold porridge` starts before the interval matching `my favourite food`.
61+
[[intervals-top-level-params]]
62+
==== Top-level parameters for `intervals`
63+
[[intervals-rules]]
64+
`<field>`::
65+
+
66+
--
67+
(Required, rule object) Field you wish to search.
68+
69+
The value of this parameter is a rule object used to match documents
70+
based on matching terms, order, and proximity.
71+
72+
Valid rules include:
73+
74+
* <<intervals-match,`match`>>
75+
* <<intervals-all_of,`all_of`>>
76+
* <<intervals-any_of,`any_of`>>
77+
* <<interval_filter,`filter`>>
78+
--
5979

6080
[[intervals-match]]
61-
==== `match`
81+
==== `match` rule parameters
6282

63-
The `match` rule matches analyzed text, and takes the following parameters:
83+
The `match` rule matches analyzed text.
6484

65-
[horizontal]
6685
`query`::
67-
The text to match.
86+
(Required, string) Text you wish to find in the provided `<field>`.
87+
6888
`max_gaps`::
69-
Specify a maximum number of gaps between the terms in the text. Terms that
70-
appear further apart than this will not match. If unspecified, or set to -1,
71-
then there is no width restriction on the match. If set to 0 then the terms
72-
must appear next to each other.
89+
+
90+
--
91+
(Optional, integer) Maximum number of positions between the matching terms.
92+
Terms further apart than this are not considered matches. Defaults to
93+
`-1`.
94+
95+
If unspecified or set to `-1`, there is no width restriction on the match. If
96+
set to `0`, the terms must appear next to each other.
97+
--
98+
7399
`ordered`::
74-
Whether or not the terms must appear in their specified order. Defaults to
75-
`false`
100+
(Optional, boolean)
101+
If `true`, matching terms must appear in their specified order. Defaults to
102+
`false`.
103+
76104
`analyzer`::
77-
Which analyzer should be used to analyze terms in the `query`. By
78-
default, the search analyzer of the top-level field will be used.
105+
(Optional, string) <<analysis, analyzer>> used to analyze terms in the `query`.
106+
Defaults to the top-level `<field>`'s analyzer.
107+
79108
`filter`::
80-
An optional <<interval_filter,interval filter>>
109+
(Optional, <<interval_filter,interval filter>> rule object) An optional interval
110+
filter.
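
As an illustration of the `match` parameters above, the following sketch searches the `my_text` field for `hot` followed by `porridge`, allowing at most two positions between them. The choice of the `standard` analyzer and the `max_gaps` value are assumptions made for this example, not values taken from the commit.

```js
POST _search
{
  "query": {
    "intervals" : {
      "my_text" : {
        "match" : {
          "query" : "hot porridge",
          "max_gaps" : 2,
          "ordered" : true,
          "analyzer" : "standard"
        }
      }
    }
  }
}
```

With these settings, a value such as `hot and salty porridge` should match, while `porridge served hot` should not, because the terms appear in the wrong order.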
81111

82112
[[intervals-all_of]]
83-
==== `all_of`
113+
==== `all_of` rule parameters
84114

85-
`all_of` returns returns matches that span a combination of other rules.
115+
The `all_of` rule returns matches that span a combination of other rules.
86116

87-
[horizontal]
88117
`intervals`::
89-
An array of rules to combine. All rules must produce a match in a
90-
document for the overall source to match.
118+
(Required, array of rule objects) An array of rules to combine. All rules must
119+
produce a match in a document for the overall source to match.
120+
91121
`max_gaps`::
92-
Specify a maximum number of gaps between the rules. Combinations that match
93-
across a distance greater than this will not match. If set to -1 or
94-
unspecified, there is no restriction on this distance. If set to 0, then the
95-
matches produced by the rules must all appear immediately next to each other.
122+
+
123+
--
124+
(Optional, integer) Maximum number of positions between the matching terms.
125+
Intervals produced by the rules further apart than this are not considered
126+
matches. Defaults to `-1`.
127+
128+
If unspecified or set to `-1`, there is no width restriction on the match. If
129+
set to `0`, the terms must appear next to each other.
130+
--
131+
96132
`ordered`::
97-
Whether the intervals produced by the rules should appear in the order in
98-
which they are specified. Defaults to `false`
133+
(Optional, boolean) If `true`, intervals produced by the rules should appear in
134+
the order in which they are specified. Defaults to `false`.
135+
99136
`filter`::
100-
An optional <<interval_filter,interval filter>>
137+
(Optional, <<interval_filter,interval filter>> rule object) Rule used to filter
138+
returned intervals.
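
A minimal sketch of an `all_of` rule, assuming the same `my_text` field used elsewhere on this page. It combines two `match` rules and requires their intervals to appear in order with no positions between them. The query terms are illustrative.

```js
POST _search
{
  "query": {
    "intervals" : {
      "my_text" : {
        "all_of" : {
          "ordered" : true,
          "max_gaps" : 0,
          "intervals" : [
            { "match" : { "query" : "cold" } },
            { "match" : { "query" : "porridge" } }
          ]
        }
      }
    }
  }
}
```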
101139

102140
[[intervals-any_of]]
103-
==== `any_of`
141+
==== `any_of` rule parameters
104142

105-
The `any_of` rule emits intervals produced by any of its sub-rules.
143+
The `any_of` rule returns intervals produced by any of its sub-rules.
106144

107-
[horizontal]
108145
`intervals`::
109-
An array of rules to match
146+
(Required, array of rule objects) An array of rules to match.
147+
110148
`filter`::
111-
An optional <<interval_filter,interval filter>>
149+
(Optional, <<interval_filter,interval filter>> rule object) Rule used to filter
150+
returned intervals.
112151

113152
[[interval_filter]]
114-
==== filters
153+
==== `filter` rule parameters
154+
155+
The `filter` rule returns intervals based on a query. See
156+
<<interval-filter-rule-ex>> for an example.
157+
158+
`after`::
159+
(Optional, query object) Query used to return intervals that follow an interval
160+
from the `filter` rule.
115161

116-
You can filter intervals produced by any rules by their relation to the
117-
intervals produced by another rule. The following example will return
118-
documents that have the words `hot` and `porridge` within 10 positions
119-
of each other, without the word `salty` in between:
162+
`before`::
163+
(Optional, query object) Query used to return intervals that occur before an
164+
interval from the `filter` rule.
165+
166+
`contained_by`::
167+
(Optional, query object) Query used to return intervals contained by an interval
168+
from the `filter` rule.
169+
170+
`containing`::
171+
(Optional, query object) Query used to return intervals that contain an interval
172+
from the `filter` rule.
173+
174+
`not_contained_by`::
175+
(Optional, query object) Query used to return intervals that are *not*
176+
contained by an interval from the `filter` rule.
177+
178+
`not_containing`::
179+
(Optional, query object) Query used to return intervals that do *not* contain
180+
an interval from the `filter` rule.
181+
182+
`not_overlapping`::
183+
(Optional, query object) Query used to return intervals that do *not* overlap
184+
with an interval from the `filter` rule.
185+
186+
`overlapping`::
187+
(Optional, query object) Query used to return intervals that overlap with an
188+
interval from the `filter` rule.
189+
190+
`script`::
191+
(Optional, <<modules-scripting-using, script object>>) Script used to return
192+
matching documents. This script must return a boolean value, `true` or `false`.
193+
See <<interval-script-filter>> for an example.
194+
195+
196+
[[intervals-query-note]]
197+
==== Notes
198+
199+
[[interval-filter-rule-ex]]
200+
===== Filter example
201+
202+
The following search includes a `filter` rule. It returns documents that have
203+
the words `hot` and `porridge` within 10 positions of each other, without the
204+
word `salty` in between:
120205

121206
[source,js]
122207
--------------------------------------------------
@@ -143,25 +228,12 @@ POST _search
143228
--------------------------------------------------
144229
// CONSOLE
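
The body of this request is not shown in the hunk above. A request fitting the description could look like the following sketch, which assumes the `my_text` field from the earlier example and uses a `not_containing` filter to exclude intervals that span `salty`:

```js
POST _search
{
  "query": {
    "intervals" : {
      "my_text" : {
        "match" : {
          "query" : "hot porridge",
          "max_gaps" : 10,
          "filter" : {
            "not_containing" : {
              "match" : {
                "query" : "salty"
              }
            }
          }
        }
      }
    }
  }
}
```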
145230

146-
The following filters are available:
147-
[horizontal]
148-
`containing`::
149-
Produces intervals that contain an interval from the filter rule
150-
`contained_by`::
151-
Produces intervals that are contained by an interval from the filter rule
152-
`not_containing`::
153-
Produces intervals that do not contain an interval from the filter rule
154-
`not_contained_by`::
155-
Produces intervals that are not contained by an interval from the filter rule
156-
`not_overlapping`::
157-
Produces intervals that do not overlap with an interval from the filter rule
158-
159231
[[interval-script-filter]]
160-
==== Script filters
232+
===== Script filters
161233

162-
You can also filter intervals based on their start position, end position and
163-
internal gap count, using a script. The script has access to an `interval`
164-
variable, with `start`, `end` and `gaps` methods:
234+
You can use a script to filter intervals based on their start position, end
235+
position, and internal gap count. The following `filter` script uses the
236+
`interval` variable with the `start`, `end`, and `gaps` methods:
165237

166238
[source,js]
167239
--------------------------------------------------
@@ -185,12 +257,13 @@ POST _search
185257
--------------------------------------------------
186258
// CONSOLE
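
The request body is likewise collapsed here. As a hedged sketch, a `script` filter that reads the `start`, `end`, and `gaps` values of the `interval` variable might look like this. The position thresholds are arbitrary values chosen for illustration, and `my_text` is assumed as the field name.

```js
POST _search
{
  "query": {
    "intervals" : {
      "my_text" : {
        "match" : {
          "query" : "hot porridge",
          "filter" : {
            "script" : {
              "source" : "interval.start > 10 && interval.end < 20 && interval.gaps == 0"
            }
          }
        }
      }
    }
  }
}
```

The script is evaluated per candidate interval and keeps only those for which it returns `true`.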
187259

260+
188261
[[interval-minimization]]
189-
==== Minimization
262+
===== Minimization
190263

191264
The intervals query always minimizes intervals, to ensure that queries can
192-
run in linear time. This can sometimes cause surprising results, particularly
193-
when using `max_gaps` restrictions or filters. For example, take the
265+
run in linear time. This can sometimes cause surprising results, particularly
266+
when using `max_gaps` restrictions or filters. For example, take the
194267
following query, searching for `salty` contained within the phrase `hot
195268
porridge`:
196269

@@ -218,15 +291,15 @@ POST _search
218291
--------------------------------------------------
219292
// CONSOLE
220293

221-
This query will *not* match a document containing the phrase `hot porridge is
294+
This query does *not* match a document containing the phrase `hot porridge is
222295
salty porridge`, because the intervals returned by the match query for `hot
223296
porridge` only cover the initial two terms in this document, and these do not
224297
overlap the intervals covering `salty`.
225298

226299
Another restriction to be aware of is the case of `any_of` rules that contain
227-
sub-rules which overlap. In particular, if one of the rules is a strict
228-
prefix of the other, then the longer rule will never be matched, which can
229-
cause surprises when used in combination with `max_gaps`. Consider the
300+
sub-rules which overlap. In particular, if one of the rules is a strict
301+
prefix of the other, then the longer rule can never match, which can
302+
cause surprises when used in combination with `max_gaps`. Consider the
230303
following query, searching for `the` immediately followed by `big` or `big bad`,
231304
immediately followed by `wolf`:
232305

@@ -257,10 +330,10 @@ POST _search
257330
--------------------------------------------------
258331
// CONSOLE
259332

260-
Counter-intuitively, this query *will not* match the document `the big bad
261-
wolf`, because the `any_of` rule in the middle will only produce intervals
333+
Counter-intuitively, this query does *not* match the document `the big bad
334+
wolf`, because the `any_of` rule in the middle only produces intervals
262335
for `big` - intervals for `big bad` being longer than those for `big`, while
263-
starting at the same position, and so being minimized away. In these cases,
336+
starting at the same position, and so being minimized away. In these cases,
264337
it's better to rewrite the query so that all of the options are explicitly
265338
laid out at the top level:
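
One way to lay out the options explicitly is sketched here, under the assumption that the text lives in the `my_text` field. Each alternative phrase becomes its own `match` rule inside a single top-level `any_of`:

```js
POST _search
{
  "query": {
    "intervals" : {
      "my_text" : {
        "any_of" : {
          "intervals" : [
            { "match" : {
                "query" : "the big bad wolf",
                "ordered" : true,
                "max_gaps" : 0 } },
            { "match" : {
                "query" : "the big wolf",
                "ordered" : true,
                "max_gaps" : 0 } }
          ]
        }
      }
    }
  }
}
```

Written this way, neither phrase is a sub-rule nested next to a shorter prefix, so the longer phrase is no longer minimized away.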
266339
