diff --git a/docs/reference/query-dsl/combined-fields-query.asciidoc b/docs/reference/query-dsl/combined-fields-query.asciidoc
index 390d71276cf3d..8e703965e9462 100644
--- a/docs/reference/query-dsl/combined-fields-query.asciidoc
+++ b/docs/reference/query-dsl/combined-fields-query.asciidoc
@@ -5,14 +5,14 @@
 ++++
 
 The `combined_fields` query supports searching multiple text fields as if their
-contents had been indexed into one combined field. It takes a term-centric
-view of the query: first it analyzes the query string into individual terms,
+contents had been indexed into one combined field. The query takes a term-centric
+view of the input string: first it analyzes the query string into individual terms,
 then looks for each term in any of the fields. This query is particularly
 useful when a match could span multiple text fields, for example the `title`,
-`abstract` and `body` of an article:
+`abstract`, and `body` of an article:
 
 [source,console]
---------------------------------------------------
+----
 GET /_search
 {
   "query": {
@@ -23,31 +23,36 @@ GET /_search
     }
   }
 }
---------------------------------------------------
+----
 
 The `combined_fields` query takes a principled approach to scoring based on the
 simple BM25F formula described in
 http://www.staff.city.ac.uk/~sb317/papers/foundations_bm25_review.pdf[The Probabilistic Relevance Framework: BM25 and Beyond].
 When scoring matches, the query combines term and collection statistics across
-fields. This allows it to score each match as if the specified fields had been
-indexed into a single combined field. (Note that this is a best attempt --
-`combined_fields` makes some approximations and scores will not obey this
-model perfectly.)
+fields to score each match as if the specified fields had been indexed into a
+single, combined field. This scoring is a best attempt; `combined_fields` makes
+some approximations and scores will not obey the BM25F model perfectly.
 
+// tag::max-clause-limit[]
 [WARNING]
 .Field number limit
 ===================================================
-There is a limit on the number of fields that can be queried at once. It is
-defined by the `indices.query.bool.max_clause_count` <>
-which defaults to 1024.
+By default, there is a limit to the number of clauses a query can contain. This
+limit is defined by the
+<>
+setting, which defaults to `1024`. For `combined_fields` queries, the number of
+clauses is calculated as the number of fields multiplied by the number of terms.
 ===================================================
+// end::max-clause-limit[]
 
 ==== Per-field boosting
 
-Individual fields can be boosted with the caret (`^`) notation:
+Field boosts are interpreted according to the combined field model. For example,
+if the `title` field has a boost of 2, the score is calculated as if each term
+in the title appeared twice in the synthetic combined field.
 
 [source,console]
---------------------------------------------------
+----
 GET /_search
 {
   "query": {
@@ -57,11 +62,8 @@ GET /_search
     }
   }
 }
---------------------------------------------------
-
-Field boosts are interpreted according to the combined field model. For example,
-if the `title` field has a boost of 2, the score is calculated as if each term
-in the title appeared twice in the synthetic combined field.
+----
+<1> Individual fields can be boosted with the caret (`^`) notation.
 
 NOTE: The `combined_fields` query requires that field boosts are greater than or
 equal to 1.0. Field boosts are allowed to be fractional.
@@ -149,7 +151,7 @@ term-centric: `operator` and `minimum_should_match` are applied per-term,
 instead of per-field. Concretely, a query like
 
 [source,console]
---------------------------------------------------
+----
 GET /_search
 {
   "query": {
@@ -160,12 +162,15 @@ GET /_search
     }
   }
 }
---------------------------------------------------
+----
 
-is executed as
+is executed as:
 
-    +(combined("database", fields:["title" "abstract"]))
-    +(combined("systems", fields:["title", "abstract"]))
+[source,txt]
+----
++(combined("database", fields:["title", "abstract"]))
++(combined("systems", fields:["title", "abstract"]))
+----
 
 In other words, each term must be present in at least one field for a
 document to match.
@@ -178,8 +183,8 @@ to scoring based on the BM25F algorithm.
 [NOTE]
 .Custom similarities
 ===================================================
-The `combined_fields` query currently only supports the `BM25` similarity
-(which is the default unless a <>
-is configured). <> are also not allowed.
+The `combined_fields` query currently only supports the BM25 similarity,
+which is the default unless a <>
+is configured. <> are also not allowed.
 Using `combined_fields` in either of these cases will result in an error.
 ===================================================
diff --git a/docs/reference/query-dsl/multi-match-query.asciidoc b/docs/reference/query-dsl/multi-match-query.asciidoc
index 706050efa9df2..7f1dfc6b95e0f 100644
--- a/docs/reference/query-dsl/multi-match-query.asciidoc
+++ b/docs/reference/query-dsl/multi-match-query.asciidoc
@@ -67,9 +67,7 @@ index settings, which in turn defaults to `*`. `*` extracts all fields in the ma
 are eligible to term queries and filters the metadata fields. All extracted fields are then
 combined to build a query.
 
-WARNING: There is a limit on the number of fields that can be queried
-at once. It is defined by the `indices.query.bool.max_clause_count` <>
-which defaults to 1024.
+include::combined-fields-query.asciidoc[tag=max-clause-limit]
 
 [[multi-match-types]]
 [discrete]
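
The rule added in the `max-clause-limit` warning (for `combined_fields`, clauses are counted as the number of fields multiplied by the number of terms) can be illustrated with a minimal Python sketch. The function names are hypothetical and are not Elasticsearch APIs. The sketch assumes the caller already knows how many terms the analyzer produces, and it assumes a query is rejected only when the count is strictly greater than the limit.

```python
# Illustrative sketch only: these helpers are hypothetical, not part of Elasticsearch.

# Default value of indices.query.bool.max_clause_count, as stated in the warning.
DEFAULT_MAX_CLAUSE_COUNT = 1024


def estimated_clause_count(num_fields: int, num_terms: int) -> int:
    """Apply the rule from the warning: clauses = fields x terms."""
    if num_fields < 0 or num_terms < 0:
        raise ValueError("field and term counts must be non-negative")
    return num_fields * num_terms


def exceeds_limit(num_fields: int, num_terms: int,
                  limit: int = DEFAULT_MAX_CLAUSE_COUNT) -> bool:
    """Return True when the estimate is above the limit.

    Assumption: the limit is inclusive, so a count equal to the limit is allowed.
    """
    return estimated_clause_count(num_fields, num_terms) > limit


if __name__ == "__main__":
    # "database systems" analyzed into 2 terms, searched over 3 fields.
    print(estimated_clause_count(3, 2))
    # 600 fields with the same 2 terms goes past the default limit.
    print(exceeds_limit(600, 2))
```

The number of terms depends on the analyzer applied to the query string, which this sketch does not model, so treat the result as an estimate rather than the exact figure the cluster computes.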