[DOCS] Rewrite fuzzy query docs #42078

Merged 13 commits on Aug 14, 2019
118 changes: 71 additions & 47 deletions docs/reference/query-dsl/fuzzy-query.asciidoc
@@ -1,75 +1,99 @@
[[query-dsl-fuzzy-query]]
=== Fuzzy Query

The fuzzy query uses similarity based on Levenshtein edit distance.
Returns documents that contain terms similar to the search term.

==== String fields
{es} measures similarity, or fuzziness, using a
http://en.wikipedia.org/wiki/Levenshtein_distance[Levenshtein edit distance]. An
edit distance is the number of one-character changes needed to turn one term
into another. These changes can include:

The `fuzzy` query generates matching terms that are within the
maximum edit distance specified in `fuzziness` and then checks the term
dictionary to find out which of those generated terms actually exist in the
index. The final query uses up to `max_expansions` matching terms.
* Changing a character (**b**ox → **f**ox)
* Removing a character (**b**lack → lack)
* Inserting a character (sic → sic**k**)
* Transposing two adjacent characters (**ac**t → **ca**t)
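
As a rough illustration of the edit types listed above, the following Python sketch computes an edit distance in which a transposition of two adjacent characters counts as a single edit (the optimal string alignment variant). The function name `edit_distance` is hypothetical, and this is not presented as how {es} computes fuzziness internally; it only shows how each kind of change contributes one edit.

```python
def edit_distance(a: str, b: str) -> int:
    """Illustrative edit distance (hypothetical helper, not an Elasticsearch API).

    Substitutions, deletions, insertions, and swaps of two adjacent
    characters each cost one edit.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # dist[i][j] = edits needed to turn a[:i] into b[:j]
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete every character of a[:i]
    for j in range(cols):
        dist[0][j] = j  # insert every character of b[:j]

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # remove a character
                dist[i][j - 1] + 1,         # insert a character
                dist[i - 1][j - 1] + cost,  # change a character (or keep it)
            )
            # Transpose two adjacent characters, e.g. "ac" -> "ca"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)
    return dist[rows - 1][cols - 1]
```

With this sketch, each of the four example pairs in the list (box/fox, black/lack, sic/sick, act/cat) is one edit apart.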

Here is a simple example:
To find similar terms, the `fuzzy` query creates a set of all possible
variations, or expansions, of the search term within a specified edit distance.
The query then returns exact matches for each expansion.
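
The expansion step described above can be sketched in Python as follows. This is a simplified model under stated assumptions: it only generates variations one edit away, it uses a lowercase ASCII alphabet, and the names `expansions`, `fuzzy_match`, and `indexed_terms` are hypothetical. A real search engine is unlikely to enumerate variations this naively, and the way this sketch truncates results to `max_expansions` (alphabetical order) is an arbitrary choice made for illustration.

```python
import string


def expansions(term, transpositions=True, alphabet=string.ascii_lowercase):
    """Return strings at most one edit away from `term` (illustrative only)."""
    splits = [(term[:i], term[i:]) for i in range(len(term) + 1)]
    deletes = {left + right[1:] for left, right in splits if right}
    replaces = {left + c + right[1:] for left, right in splits if right for c in alphabet}
    inserts = {left + c + right for left, right in splits for c in alphabet}
    swaps = set()
    if transpositions:
        # Swap two adjacent characters, e.g. "act" -> "cat"
        swaps = {left + right[1] + right[0] + right[2:]
                 for left, right in splits if len(right) > 1}
    return deletes | replaces | inserts | swaps


def fuzzy_match(term, indexed_terms, max_expansions=50, transpositions=True):
    """Keep only the variations that exist as exact terms in a toy index."""
    candidates = expansions(term, transpositions) | {term}
    matches = sorted(candidates & set(indexed_terms))
    # Assumption: cap the result at max_expansions in alphabetical order.
    return matches[:max_expansions]
```

For example, `fuzzy_match("box", ["fox", "bo", "boxy", "obx", "cat"])` keeps every term except `cat`, and passing `transpositions=False` additionally drops `obx`.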

[source,js]
--------------------------------------------------
GET /_search
{
"query": {
"fuzzy" : { "user" : "ki" }
}
}
--------------------------------------------------
// CONSOLE

Or with more advanced settings:
[[fuzzy-query-ex-request]]
==== Example request

[source,js]
--------------------------------------------------
----
GET /_search
{
"query": {
"fuzzy" : {
"user" : {
"fuzzy": {
"user": {
"value": "ki",
"boost": 1.0,
"fuzziness": 2,
"fuzziness": "AUTO",
"max_expansions": 50,
"prefix_length": 0,
"max_expansions": 100
"transpositions": true,
"rewrite": "constant_score"
}
}
}
}
--------------------------------------------------
----
// CONSOLE

[float]
===== Parameters
[[fuzzy-query-top-level-params]]
==== Top-level parameters for `fuzzy`
`<field>`::
(Required, object) Field you wish to search.

[horizontal]
`fuzziness`::
[[fuzzy-query-field-params]]
==== Parameters for `<field>`
`value`::
(Required, string) Term you wish to find in the provided `<field>`.

The maximum edit distance. Defaults to `AUTO`. See <<fuzziness>>.

`prefix_length`::

The number of initial characters which will not be ``fuzzified''. This
helps to reduce the number of terms which must be examined. Defaults
to `0`.
`fuzziness`::
(Optional, string) Maximum edit distance used to create expansions. Valid values
are:
+
--
`AUTO` (Default)::
Changes edit distance based on the length of the search term.
+
You can set a minimum and maximum edit distance using the
`AUTO:[minimum],[maximum]` syntax. The default value is `AUTO:3,6`, which uses
the following edit distances based on search term length:
+
* For search terms of 2 characters or fewer, terms must match exactly.
* For search terms of 3–5 characters, terms must match within an edit distance
of one.
* For search terms longer than 5 characters, terms must match within an edit
distance of two.

`0`:: No edits allowed. Terms must match exactly.

`1`:: Terms must match within one edit.

`2`:: Terms must match within two edits.
--
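
The `AUTO` rules above can be summarized in a short Python sketch. It assumes the two numbers in `AUTO:3,6` act as term-length thresholds, which is what the listed rules imply; the function name `auto_fuzziness` and its `low` and `high` parameters are hypothetical and exist only for this illustration.

```python
def auto_fuzziness(term: str, low: int = 3, high: int = 6) -> int:
    """Map a search term's length to a maximum edit distance.

    Assumption: `low` and `high` mirror the two numbers in `AUTO:3,6`
    and are interpreted as term-length thresholds.
    """
    length = len(term)
    if length < low:
        return 0  # short terms must match exactly
    if length < high:
        return 1  # mid-length terms allow one edit
    return 2      # longer terms allow two edits
```

Under these assumptions, the two-character term `ki` from the example request would have to match exactly when `fuzziness` is `AUTO`.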

`max_expansions`::
+
--
(Optional, integer) Maximum number of variations created. Defaults to `50`.
+
WARNING: Avoid using a high value in the `max_expansions` parameter, especially
if the `prefix_length` parameter value is `0`. High values in the
`max_expansions` parameter can cause poor performance due to the high number of
variations examined.
--

The maximum number of terms that the `fuzzy` query will expand to.
Defaults to `50`.
`prefix_length`::
(Optional, integer) Number of beginning characters left unchanged when creating
expansions. Defaults to `0`.

`transpositions`::
(Optional, boolean) Indicates whether edits include transpositions of two
adjacent characters (ab → ba). Defaults to `true`.

Whether fuzzy transpositions (`ab` -> `ba`) are supported.
Default is `true`.

WARNING: This query can be very heavy if `prefix_length` is set to `0` and if
`max_expansions` is set to a high number. It could result in every term in the
index being examined!


`rewrite`::
(Optional, string) Method used to rewrite the query. For valid values and more
information, see the <<query-dsl-multi-term-rewrite, `rewrite` parameter>>.