[DOCS] Rewrite fuzzy query docs #42078
Merged
Changes from 6 commits (13 commits total)
e17c341 [DOCS] Rewrite fuzzy query (jrodewig)
222eccd [DOCS] Correct character removal example (jrodewig)
cdecfb8 Remove generic boost parm documentation (jrodewig)
b9f4c41 Update parameter docs to better match Elastic API Reference template (jrodewig)
7a5f4a4 [DOCS] Update parameter format (jrodewig)
d6eebfc Correct <field> datatype (jrodewig)
9b97c9e reword intro sentence (jrodewig)
7b5e10a Re-add simple example (jrodewig)
7a1daa3 Correct max_expansions warning (jrodewig)
9cec50a Replace fuzziness def with xref (jrodewig)
e715757 Merge branch 'master' into fuzzy-query-rewrite (elasticmachine)
17824ba Replace fuzziness def with xref (jrodewig)
9b9df5c Remove extraneous "+" (jrodewig)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,75 +1,99 @@ | ||
[[query-dsl-fuzzy-query]] | ||
=== Fuzzy Query | ||
|
||
The fuzzy query uses similarity based on Levenshtein edit distance. | ||
Returns documents that contain terms similar to the search term. | ||
|
||
==== String fields | ||
{es} measures similarity, or fuzziness, using a | ||
http://en.wikipedia.org/wiki/Levenshtein_distance[Levenshtein edit distance]. An | ||
edit distance is the number of one-character changes needed to turn one term | ||
into another. These changes can include: | ||
|
||
The `fuzzy` query generates matching terms that are within the | ||
maximum edit distance specified in `fuzziness` and then checks the term | ||
dictionary to find out which of those generated terms actually exist in the | ||
index. The final query uses up to `max_expansions` matching terms. | ||
* Changing a character (**b**ox → **f**ox) | ||
* Removing a character (**b**lack → lack) | ||
* Inserting a character (sic → sic**k**) | ||
* Transposing two adjacent characters (**ac**t → **ca**t) | ||
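The four edit types listed above can be counted with a small dynamic-programming routine. The following is a minimal Python sketch for illustration only: the function name `edit_distance` and its `transpositions` argument are hypothetical, and it uses the textbook optimal string alignment method rather than whatever Lucene does internally.

```python
def edit_distance(a: str, b: str, transpositions: bool = True) -> int:
    """Count the one-character edits needed to turn `a` into `b`."""
    rows, cols = len(a) + 1, len(b) + 1
    # dist[i][j] holds the edits needed to turn a[:i] into b[:j].
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # remove all i characters
    for j in range(cols):
        dist[0][j] = j  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # remove a character
                dist[i][j - 1] + 1,         # insert a character
                dist[i - 1][j - 1] + cost,  # change a character (or keep it)
            )
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                # transpose two adjacent characters as a single edit
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)
    return dist[-1][-1]
```

With this sketch, each example pair in the list is one edit apart, while `edit_distance("act", "cat", transpositions=False)` counts the swap as two separate character changes.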
|
||
Here is a simple example: | ||
To find similar terms, the `fuzzy` query creates a set of all possible | ||
variations, or expansions, of the search term within a specified edit distance. | ||
The query then returns exact matches for each expansion. | ||
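To make the idea of expansions concrete, here is a rough Python sketch that enumerates every string within one edit of a term and keeps only those found in a set of indexed terms. The `expansions` helper, the lowercase-ASCII alphabet, and the `indexed_terms` set are all assumptions made for illustration. A real implementation would more likely walk the term dictionary than enumerate strings like this.

```python
import string


def expansions(term: str, alphabet: str = string.ascii_lowercase) -> set:
    """Return every string within one edit of `term`, including `term` itself."""
    splits = [(term[:i], term[i:]) for i in range(len(term) + 1)]
    changes = {l + c + r[1:] for l, r in splits if r for c in alphabet}
    removals = {l + r[1:] for l, r in splits if r}
    inserts = {l + c + r for l, r in splits for c in alphabet}
    swaps = {l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1}
    return changes | removals | inserts | swaps


# Hypothetical indexed terms; only expansions present here can match.
indexed_terms = {"fox", "box", "boxy", "ox", "obx", "cat"}
matches = sorted(expansions("box") & indexed_terms)
```

The number of candidate strings grows quickly with term length and edit distance, which is one way to see why parameters such as `max_expansions` and `prefix_length` affect performance.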
|
||
[source,js] | ||
-------------------------------------------------- | ||
GET /_search | ||
{ | ||
"query": { | ||
"fuzzy" : { "user" : "ki" } | ||
} | ||
} | ||
-------------------------------------------------- | ||
// CONSOLE | ||
|
||
Or with more advanced settings: | ||
[[fuzzy-query-ex-request]] | ||
==== Example request | ||
|
||
[source,js] | ||
-------------------------------------------------- | ||
---- | ||
GET /_search | ||
{ | ||
"query": { | ||
"fuzzy" : { | ||
"user" : { | ||
"fuzzy": { | ||
"user": { | ||
"value": "ki", | ||
"boost": 1.0, | ||
"fuzziness": 2, | ||
"fuzziness": "AUTO", | ||
"max_expansions": 50, | ||
"prefix_length": 0, | ||
"max_expansions": 100 | ||
"transpositions": true, | ||
"rewrite": "constant_score" | ||
} | ||
} | ||
} | ||
} | ||
-------------------------------------------------- | ||
---- | ||
// CONSOLE | ||
|
||
[float] | ||
===== Parameters | ||
[[fuzzy-query-top-level-params]] | ||
==== Top-level parameters for `fuzzy` | ||
`<field>`:: | ||
(Required, object) Field you wish to search. | ||
|
||
[horizontal] | ||
`fuzziness`:: | ||
[[fuzzy-query-field-params]] | ||
==== Parameters for `<field>` | ||
`value`:: | ||
(Required, string) Term you wish to find in the provided `<field>`. | ||
|
||
The maximum edit distance. Defaults to `AUTO`. See <<fuzziness>>. | ||
|
||
`prefix_length`:: | ||
|
||
The number of initial characters which will not be ``fuzzified''. This | ||
helps to reduce the number of terms which must be examined. Defaults | ||
to `0`. | ||
`fuzziness`:: | ||
(Optional, string) Maximum edit distance used to create expansions. Valid values | ||
are: | ||
+ | ||
-- | ||
`AUTO` (Default):: | ||
Changes edit distance based on the length of the search term. | ||
+ | ||
You can set a minimum and maximum edit distance using the | ||
`AUTO:[minimum],[maximum]` syntax. The default value is `AUTO:3,6`, which uses | ||
the following edit distances based on search term length: | ||
+ | ||
* For search terms of 2 characters or fewer, terms must match exactly. | ||
* For search terms of 3–5 characters, terms must match within an edit distance | ||
of one. | ||
* For search terms longer than 5 characters, terms must match within an edit | ||
distance of two. | ||
|
||
`0`:: No edits allowed. Terms must match exactly. | ||
|
||
`1`:: Terms must match within one edit. | ||
|
||
`2`:: Terms must match within two edits. | ||
-- | ||
|
||
`max_expansions`:: | ||
+ | ||
-- | ||
(Optional, integer) Maximum number of variations created. Defaults to `50`. | ||
+ | ||
WARNING: Avoid using a high value in the `max_expansions` parameter, especially | ||
if the `prefix_length` parameter value is `0`. High values in the | ||
`max_expansions` parameter can cause poor performance due to the high number of | ||
variations examined. | ||
-- | ||
|
||
The maximum number of terms that the `fuzzy` query will expand to. | ||
Defaults to `50`. | ||
`prefix_length`:: | ||
(Optional, integer) Number of beginning characters left unchanged when creating | ||
expansions. Defaults to `0`. | ||
|
||
`transpositions`:: | ||
(Optional, boolean) Indicates whether edits include transpositions of two | ||
adjacent characters (ab → ba). Defaults to `true`. | ||
|
||
Whether fuzzy transpositions (`ab` -> `ba`) are supported. | ||
Default is `true`. | ||
|
||
WARNING: This query can be very heavy if `prefix_length` is set to `0` and if | ||
`max_expansions` is set to a high number. It could result in every term in the | ||
index being examined! | ||
|
||
|
||
`rewrite`:: | ||
(Optional, string) Method used to rewrite the query. For valid values and more | ||
information, see the <<query-dsl-multi-term-rewrite, `rewrite` parameter>>. |
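The length rules described for the `AUTO` value of `fuzziness` can be written as a small lookup. This is a Python sketch under the assumption that term length is counted in characters. The `auto_fuzziness` function and its `low` and `high` arguments are hypothetical names that mirror the `AUTO:[minimum],[maximum]` syntax, not part of any Elasticsearch client.

```python
def auto_fuzziness(term: str, low: int = 3, high: int = 6) -> int:
    """Map a search term's length to a maximum edit distance, AUTO:low,high style."""
    length = len(term)
    if length < low:
        return 0  # short terms must match exactly
    if length < high:
        return 1  # mid-length terms allow one edit
    return 2      # longer terms allow two edits
```

With the default `AUTO:3,6`, the two-character search term `ki` from the example request maps to an edit distance of 0, so only exact matches would be returned.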