[DOCS] Rewrite fuzzy query docs #42078

Merged
merged 13 commits into from
Aug 14, 2019
100 changes: 62 additions & 38 deletions docs/reference/query-dsl/fuzzy-query.asciidoc
@@ -4,75 +4,99 @@
<titleabbrev>Fuzzy</titleabbrev>
++++

The fuzzy query uses similarity based on Levenshtein edit distance.
Returns documents that contain terms similar to the search term, as measured by
a http://en.wikipedia.org/wiki/Levenshtein_distance[Levenshtein edit distance].

==== String fields
An edit distance is the number of one-character changes needed to turn one term
into another. These changes can include:

The `fuzzy` query generates matching terms that are within the
maximum edit distance specified in `fuzziness` and then checks the term
dictionary to find out which of those generated terms actually exist in the
index. The final query uses up to `max_expansions` matching terms.
* Changing a character (**b**ox → **f**ox)
* Removing a character (**b**lack → lack)
* Inserting a character (sic → sic**k**)
* Transposing two adjacent characters (**ac**t → **ca**t)
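
The four kinds of change listed above can be counted with a small dynamic-programming routine. The following is a minimal sketch in Python, not the code Elasticsearch or Lucene runs: the function name `edit_distance` and its `transpositions` flag are illustrative assumptions, and the variant shown is the optimal string alignment distance, which counts a swap of two adjacent characters as a single edit.

```python
def edit_distance(a: str, b: str, transpositions: bool = True) -> int:
    """Count the one-character changes needed to turn `a` into `b`.

    Hypothetical helper for illustration only. With `transpositions`
    enabled, swapping two adjacent characters costs one edit instead
    of two.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # remove every character of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of b[:j]

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # removing a character
                d[i][j - 1] + 1,         # inserting a character
                d[i - 1][j - 1] + cost,  # changing a character (or keeping it)
            )
            if (
                transpositions
                and i > 1
                and j > 1
                and a[i - 1] == b[j - 2]
                and a[i - 2] == b[j - 1]
            ):
                # transposing two adjacent characters
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]
```

Under this sketch, each example pair in the list above is one edit apart. Passing `transpositions=False` makes act → cat cost two edits, because the swap must then be expressed as two character changes.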

Here is a simple example:
To find similar terms, the `fuzzy` query creates a set of all possible
variations, or expansions, of the search term within a specified edit distance.
The query then returns exact matches for each expansion.
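
The expand-then-match idea described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual implementation: the function names `expansions` and `fuzzy_match`, the lowercase ASCII alphabet, the fixed edit distance of 1, and the alphabetical cut-off at `max_expansions` are all hypothetical choices made to keep the example short. The keyword arguments only mirror the names of the query parameters.

```python
import string


def expansions(term, prefix_length=0, transpositions=True,
               alphabet=string.ascii_lowercase):
    """Return every variation of `term` exactly one edit away.

    Hypothetical helper: the first `prefix_length` characters are
    left unchanged, and only characters from `alphabet` are used.
    """
    prefix, rest = term[:prefix_length], term[prefix_length:]
    variants = set()
    for i in range(len(rest) + 1):
        head, tail = rest[:i], rest[i:]
        for ch in alphabet:
            variants.add(head + ch + tail)           # insert a character
        if tail:
            variants.add(head + tail[1:])            # remove a character
            for ch in alphabet:
                variants.add(head + ch + tail[1:])   # change a character
        if transpositions and len(tail) > 1:
            # transpose two adjacent characters
            variants.add(head + tail[1] + tail[0] + tail[2:])
    variants.discard(rest)  # a no-op substitution is not a variation
    return {prefix + v for v in variants}


def fuzzy_match(term, term_dictionary, max_expansions=50, **kwargs):
    """Return indexed terms that exactly match the term or an expansion.

    Assumption for illustration: matches are kept in alphabetical
    order and truncated to `max_expansions`.
    """
    candidates = expansions(term, **kwargs) | {term}
    matches = sorted(t for t in term_dictionary if t in candidates)
    return matches[:max_expansions]
```

For instance, with a made-up term dictionary `{"kim", "ki", "li", "ik", "kimchy", "k"}`, `fuzzy_match("ki", ...)` keeps every term except `kimchy`, which is four edits away. Setting `prefix_length=1` additionally drops `li` and `ik`, because the leading `k` can no longer change.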

[[fuzzy-query-ex-request]]
==== Example requests

[[fuzzy-query-ex-simple]]
===== Simple example

[source,js]
--------------------------------------------------
----
GET /_search
{
"query": {
"fuzzy" : { "user" : "ki" }
"fuzzy": {
"user": {
"value": "ki"
}
}
}
}
--------------------------------------------------
----
// CONSOLE

Or with more advanced settings:
[[fuzzy-query-ex-advanced]]
===== Example using advanced parameters

[source,js]
--------------------------------------------------
----
GET /_search
{
"query": {
"fuzzy" : {
"user" : {
"fuzzy": {
"user": {
"value": "ki",
"boost": 1.0,
"fuzziness": 2,
"fuzziness": "AUTO",
"max_expansions": 50,
"prefix_length": 0,
"max_expansions": 100
"transpositions": true,
"rewrite": "constant_score"
}
}
}
}
--------------------------------------------------
----
// CONSOLE

[float]
===== Parameters
[[fuzzy-query-top-level-params]]
==== Top-level parameters for `fuzzy`
`<field>`::
(Required, object) Field you wish to search.

[horizontal]
`fuzziness`::

The maximum edit distance. Defaults to `AUTO`. See <<fuzziness>>.
[[fuzzy-query-field-params]]
==== Parameters for `<field>`
`value`::
(Required, string) Term you wish to find in the provided `<field>`.

`prefix_length`::
`fuzziness`::
(Optional, string) Maximum edit distance allowed for matching. See <<fuzziness>>
for valid values and more information.

The number of initial characters which will not be ``fuzzified''. This
helps to reduce the number of terms which must be examined. Defaults
to `0`.

`max_expansions`::
+
--
(Optional, integer) Maximum number of variations created. Defaults to `50`.

The maximum number of terms that the `fuzzy` query will expand to.
Defaults to `50`.

`transpositions`::

Whether fuzzy transpositions (`ab` -> `ba`) are supported.
Default is `true`.
WARNING: Avoid using a high value in the `max_expansions` parameter, especially
if the `prefix_length` parameter value is `0`. High values in the
`max_expansions` parameter can cause poor performance due to the high number of
variations examined.
--

WARNING: This query can be very heavy if `prefix_length` is set to `0` and if
`max_expansions` is set to a high number. It could result in every term in the
index being examined!
`prefix_length`::
(Optional, integer) Number of beginning characters left unchanged when creating
expansions. Defaults to `0`.

`transpositions`::
(Optional, boolean) Indicates whether edits include transpositions of two
adjacent characters (ab → ba). Defaults to `true`.

`rewrite`::
(Optional, string) Method used to rewrite the query. For valid values and more
information, see the <<query-dsl-multi-term-rewrite, `rewrite` parameter>>.