| 1 | 1 | [[query-dsl-multi-term-rewrite]] |
| 2 | -== Multi Term Query Rewrite |
| 3 | - |
| 4 | -Multi term queries, like |
| 5 | -<<query-dsl-wildcard-query,wildcard>> and |
| 6 | -<<query-dsl-prefix-query,prefix>> are called |
| 7 | -multi term queries and end up going through a process of rewrite. This |
| 8 | -also happens on the |
| 9 | -<<query-dsl-query-string-query,query_string>>. |
| 10 | -All of those queries allow to control how they will get rewritten using |
| 11 | -the `rewrite` parameter: |
| 12 | - |
| 13 | -* `constant_score` (default): A rewrite method that performs like |
| 14 | -`constant_score_boolean` when there are few matching terms and otherwise |
| 15 | -visits all matching terms in sequence and marks documents for that term. |
| 16 | -Matching documents are assigned a constant score equal to the query's |
| 17 | -boost. |
| 18 | -* `scoring_boolean`: A rewrite method that first translates each term |
| 19 | -into a should clause in a boolean query, and keeps the scores as |
| 20 | -computed by the query. Note that typically such scores are meaningless |
| 21 | -to the user, and require non-trivial CPU to compute, so it's almost |
| 22 | -always better to use `constant_score`. This rewrite method will hit |
| 23 | -too many clauses failure if it exceeds the boolean query limit (defaults |
| 24 | -to `1024`). |
| 25 | -* `constant_score_boolean`: Similar to `scoring_boolean` except scores |
| 26 | -are not computed. Instead, each matching document receives a constant |
| 27 | -score equal to the query's boost. This rewrite method will hit too many |
| 28 | -clauses failure if it exceeds the boolean query limit (defaults to |
| 29 | -`1024`). |
| 30 | -* `top_terms_N`: A rewrite method that first translates each term into |
| 31 | -should clause in boolean query, and keeps the scores as computed by the |
| 32 | -query. This rewrite method only uses the top scoring terms so it will |
| 33 | -not overflow boolean max clause count. The `N` controls the size of the |
| 34 | -top scoring terms to use. |
| 35 | -* `top_terms_boost_N`: A rewrite method that first translates each term |
| 36 | -into should clause in boolean query, but the scores are only computed as |
| 37 | -the boost. This rewrite method only uses the top scoring terms so it |
| 38 | -will not overflow the boolean max clause count. The `N` controls the |
| 39 | -size of the top scoring terms to use. |
| 40 | -* `top_terms_blended_freqs_N`: A rewrite method that first translates each |
| 41 | -term into should clause in boolean query, but all term queries compute scores |
| 42 | -as if they had the same frequency. In practice the frequency which is used |
| 43 | -is the maximum frequency of all matching terms. This rewrite method only uses |
| 44 | -the top scoring terms so it will not overflow boolean max clause count. The |
| 45 | -`N` controls the size of the top scoring terms to use. |
| 2 | +== `rewrite` Parameter |
| 3 | + |
| 4 | +WARNING: This parameter is for expert users only. Changing the value of |
| 5 | +this parameter can impact search performance and relevance. |
| 6 | + |
| 7 | +{es} uses https://lucene.apache.org/core/[Apache Lucene] internally to power |
| 8 | +indexing and searching. Lucene cannot execute the following queries in |
| 9 | +their original form: |
| 10 | + |
| 11 | +* <<query-dsl-fuzzy-query, `fuzzy`>> |
| 12 | +* <<query-dsl-prefix-query, `prefix`>> |
| 13 | +* <<query-dsl-query-string-query, `query_string`>> |
| 14 | +* <<query-dsl-regexp-query, `regexp`>> |
| 15 | +* <<query-dsl-wildcard-query, `wildcard`>> |
| 16 | + |
| 17 | +To execute them, Lucene changes these queries to a simpler form, such as a |
| 18 | +<<query-dsl-bool-query, `bool` query>> or a |
| 19 | +https://en.wikipedia.org/wiki/Bit_array[bit set]. |
| 20 | + |
| 21 | +The `rewrite` parameter determines: |
| 22 | + |
| 23 | +* How Lucene calculates the relevance scores for each matching document |
| 24 | +* Whether Lucene changes the original query to a `bool` |
| 25 | +query or bit set |
| 26 | +* If changed to a `bool` query, which `term` query clauses are included |
| 27 | + |
| 28 | +[float] |
| 29 | +[[rewrite-param-valid-values]] |
| 30 | +=== Valid values |
| 31 | + |
| 32 | +`constant_score` (Default):: |
| 33 | +Uses the `constant_score_boolean` method when there are few matching terms. |
| 34 | +Otherwise, this method finds all matching terms in sequence and returns |
| 35 | +matching documents using a bit set. |
| 36 | + |
| 37 | +`constant_score_boolean`:: |
| 38 | +Assigns each document a relevance score equal to the `boost` |
| 39 | +parameter. |
| 40 | ++ |
| 41 | +This method changes the original query to a <<query-dsl-bool-query, `bool` |
| 42 | +query>>. This `bool` query contains a `should` clause and |
| 43 | +<<query-dsl-term-query, `term` query>> for each matching term. |
| 44 | ++ |
| 45 | +This method can cause the final `bool` query to exceed the clause limit in the |
| 46 | +<<indices-query-bool-max-clause-count, `indices.query.bool.max_clause_count`>> |
| 47 | +setting. If the query exceeds this limit, {es} returns an error. |
| 48 | + |
| 49 | +`scoring_boolean`:: |
| 50 | +Calculates a relevance score for each matching document. |
| 51 | ++ |
| 52 | +This method changes the original query to a <<query-dsl-bool-query, `bool` |
| 53 | +query>>. This `bool` query contains a `should` clause and |
| 54 | +<<query-dsl-term-query, `term` query>> for each matching term. |
| 55 | ++ |
| 56 | +This method can cause the final `bool` query to exceed the clause limit in the |
| 57 | +<<indices-query-bool-max-clause-count, `indices.query.bool.max_clause_count`>> |
| 58 | +setting. If the query exceeds this limit, {es} returns an error. |
| 59 | + |
| 60 | +`top_terms_blended_freqs_N`:: |
| 61 | +Calculates a relevance score for each matching document as if all terms had the |
| 62 | +same frequency. This frequency is the maximum frequency of all matching terms. |
| 63 | ++ |
| 64 | +This method changes the original query to a <<query-dsl-bool-query, `bool` |
| 65 | +query>>. This `bool` query contains a `should` clause and |
| 66 | +<<query-dsl-term-query, `term` query>> for each matching term. |
| 67 | ++ |
| 68 | +The final `bool` query only includes `term` queries for the top `N` scoring |
| 69 | +terms. |
| 70 | ++ |
| 71 | +You can use this method to avoid exceeding the clause limit in the |
| 72 | +<<indices-query-bool-max-clause-count, `indices.query.bool.max_clause_count`>> |
| 73 | +setting. |
| 74 | + |
| 75 | +`top_terms_boost_N`:: |
| 76 | +Assigns each matching document a relevance score equal to the `boost` parameter. |
| 77 | ++ |
| 78 | +This method changes the original query to a <<query-dsl-bool-query, `bool` |
| 79 | +query>>. This `bool` query contains a `should` clause and |
| 80 | +<<query-dsl-term-query, `term` query>> for each matching term. |
| 81 | ++ |
| 82 | +The final `bool` query only includes `term` queries for the top `N` terms. |
| 83 | ++ |
| 84 | +You can use this method to avoid exceeding the clause limit in the |
| 85 | +<<indices-query-bool-max-clause-count, `indices.query.bool.max_clause_count`>> |
| 86 | +setting. |
| 87 | + |
| 88 | +`top_terms_N`:: |
| 89 | +Calculates a relevance score for each matching document. |
| 90 | ++ |
| 91 | +This method changes the original query to a <<query-dsl-bool-query, `bool` |
| 92 | +query>>. This `bool` query contains a `should` clause and |
| 93 | +<<query-dsl-term-query, `term` query>> for each matching term. |
| 94 | ++ |
| 95 | +The final `bool` query |
| 96 | +only includes `term` queries for the top `N` scoring terms. |
| 97 | ++ |
| 98 | +You can use this method to avoid exceeding the clause limit in the |
| 99 | +<<indices-query-bool-max-clause-count, `indices.query.bool.max_clause_count`>> |
| 100 | +setting. |
| 101 | + |
| 102 | +[float] |
| 103 | +[[rewrite-param-perf-considerations]] |
| 104 | +=== Performance considerations for the `rewrite` parameter |
| 105 | +For most uses, we recommend using the `constant_score`, |
| 106 | +`constant_score_boolean`, or `top_terms_boost_N` rewrite methods. |
| 107 | + |
| 108 | +Other methods calculate relevance scores. These score calculations are often |
| 109 | +expensive and do not improve query results. |
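
To make the valid values above concrete, here is a minimal Python sketch that builds a `wildcard` search body with a `rewrite` value and checks the value against the names listed in this page. The helper names (`is_valid_rewrite`, `wildcard_query`), the field name `user`, and the pattern `ki*y` are hypothetical and not part of the commit. Treating `N` as a positive integer is an assumption, and the client-side check only illustrates the naming scheme; it is not how {es} itself validates the parameter.

```python
import json
import re

# Fixed rewrite values listed under "Valid values".
FIXED_METHODS = {"constant_score", "constant_score_boolean", "scoring_boolean"}

# Parameterized values such as top_terms_10 or top_terms_boost_5.
# Assumption: N is a positive integer.
TOP_TERMS_PATTERN = re.compile(r"top_terms_(?:blended_freqs_|boost_)?[1-9]\d*")


def is_valid_rewrite(value: str) -> bool:
    """Return True if value matches one of the documented rewrite names."""
    return value in FIXED_METHODS or TOP_TERMS_PATTERN.fullmatch(value) is not None


def wildcard_query(field: str, pattern: str, rewrite: str = "constant_score") -> dict:
    """Build a search body for a wildcard query with the given rewrite method."""
    if not is_valid_rewrite(rewrite):
        raise ValueError(f"unsupported rewrite value: {rewrite!r}")
    return {"query": {"wildcard": {field: {"value": pattern, "rewrite": rewrite}}}}


if __name__ == "__main__":
    # Hypothetical field and pattern; top_terms_boost_10 keeps at most 10 terms.
    print(json.dumps(wildcard_query("user", "ki*y", "top_terms_boost_10"), indent=2))
```

The resulting dictionary can be sent as the JSON body of a search request. Choosing `top_terms_boost_10` here follows the performance recommendation above: it avoids per-term score calculation and caps the number of `term` clauses in the final `bool` query.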