Description
Completion Suggester V2
The completion suggester provides auto-complete/search-as-you-type functionality.
This is a navigational feature to guide users to relevant results as they are typing, improving search precision. It is not meant for spell correction or did-you-mean functionality like the term or phrase suggesters.
The completions are indexed as a weighted FST (finite state transducer) to provide fast Top N prefix-based
searches suitable for serving relevant results as a user types.
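As a rough illustration of what a weighted Top N prefix search returns, the sketch below uses a sorted list and a binary search. It is not an FST and not how Lucene stores completions; `top_n_by_prefix` and the sample entries are hypothetical names chosen for this example.

```python
import bisect
import heapq


def top_n_by_prefix(entries, prefix, n):
    """Return up to n (text, weight) pairs whose text starts with prefix, heaviest first."""
    ordered = sorted(entries)  # sorted by text, so all matches form one contiguous run
    start = bisect.bisect_left(ordered, (prefix,))  # first entry with text >= prefix
    matches = []
    for text, weight in ordered[start:]:
        if not text.startswith(prefix):
            break  # left the run of matching entries
        matches.append((text, weight))
    return heapq.nlargest(n, matches, key=lambda pair: pair[1])


entries = [("title1", 7), ("alternate_title", 7), ("titanic", 14), ("tiger", 3)]
result = top_n_by_prefix(entries, "tit", 2)
print(result)
```

A real FST shares prefixes and carries weights on its arcs, so it can find the best completions without scanning every match.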
Notable Features:
- Document oriented suggestions:
  - Near-real time.
  - Deleted document filtering.
  - Multiple Context support.
  - Return document field values via payload.
- Query Interface:
  - Regular expression support via regex.
  - Typo tolerance via fuzzy.
  - Context boosting at query time.
Completion Suggester V2 is based on LUCENE-6339 and LUCENE-6459, the first iteration of Lucene's new suggest API.
Mapping
Completion fields are indexed in a special way, so a field mapping has to be defined.
The following shows a field mapping for a completion field named title_suggest:
PUT {INDEX_NAME}
{
"mappings": {
"{TYPE_NAME}": {
"properties": {
"title_suggest": {
"type": "completion"
}
}
}
}
}
You can choose the index-time and search-time analyzers for the completion field by adding the analyzer and search_analyzer options.
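A minimal sketch of a mapping body that sets both analyzer options, built as a Python dictionary so it can be serialized and sent with any HTTP client. The helper name `completion_mapping` is hypothetical, and the `simple` analyzer is only an illustrative choice.

```python
import json


def completion_mapping(type_name, field, analyzer=None, search_analyzer=None):
    """Build a mapping body for one completion field with optional analyzers."""
    field_mapping = {"type": "completion"}
    if analyzer is not None:
        field_mapping["analyzer"] = analyzer
    if search_analyzer is not None:
        field_mapping["search_analyzer"] = search_analyzer
    return {"mappings": {type_name: {"properties": {field: field_mapping}}}}


# "{TYPE_NAME}" is kept as the same placeholder used in the requests above.
body = completion_mapping("{TYPE_NAME}", "title_suggest",
                          analyzer="simple", search_analyzer="simple")
print(json.dumps(body, indent=2))
```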
Context Mappings
Adding a contexts option in the field mapping defines a context-enabled completion field. You may want
a context-enabled completion field if you need to filter or boost suggestions by criteria other than
the prefix alone. Note that adding high-cardinality context values will increase the size of the in-memory
index significantly.
Two context types are supported: category and geo.
Category Context Mapping
Category contexts are indexed as prefixes to the completion field value.
The following adds a category context named genre:
...
"contexts": [
{
"name": "genre",
"type": "category"
}
]
You can also pull context values from another field in a document by using the path option to specify the field name.
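To make "indexed as prefixes" concrete, here is a small sketch in which each category value is prepended to the completion value, so that filtering by context becomes an ordinary prefix match. The separator character and both helper names are assumptions made for this example; the actual encoding is not described here.

```python
SEPARATOR = "\x1f"  # hypothetical separator between context and completion value


def context_entries(value, contexts):
    """Produce one indexed key per context value, each carrying the context as a prefix."""
    return [f"{context}{SEPARATOR}{value}" for context in contexts]


def matches(key, context, prefix):
    """A context-filtered prefix query is a plain prefix match on the combined key."""
    return key.startswith(f"{context}{SEPARATOR}{prefix}")


keys = context_entries("title1", ["genre1", "genre2"])
print(keys)
```

Each additional context value adds another key for the same completion, which is one way to see why high-cardinality contexts enlarge the index.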
Geo Context Mapping
Geo points are encoded as geohash strings and prefixed to the completion field value.
The following adds a geo context named location:
...
"contexts": [
{
"name": "location",
"type": "geo"
}
]
You can also set the precision option to choose the geohash length, and the path option to pull context values from another
field in the document.
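The sketch below shows standard geohash encoding, to illustrate how a longer precision yields a longer hash for a smaller cell and why a shorter hash is always a prefix of a longer one for the same point. It is a generic implementation of the public geohash scheme, not code taken from this change.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash(lat, lon, precision):
    """Encode a point as a geohash string with `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    value = 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            value <<= 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            bits = 0
            value = 0
    return "".join(chars)


coarse = geohash(57.64911, 10.40744, 3)
fine = geohash(57.64911, 10.40744, 6)
print(coarse, fine)
```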
Indexing
Just like any other field, you can add multiple completion fields to a document. You can also index multiple completions
for a completion field per document. Each completion value is tied to its document and can be assigned an index-time
weight, which determines its relative rank among other completion values which share a common prefix.
The following indexes a completion value and its weight for the title_suggest completion field:
POST {INDEX_NAME}/{TYPE_NAME}
{
"title_suggest": {
"input": "title1",
"weight": 7
}
}
You can use the short form if you prefer not to add a weight to the completions:
POST {INDEX_NAME}/{TYPE_NAME}
{
"title_suggest": "title1"
}
Arrays are also supported to index multiple values.
The following indexes multiple completion entries (input and weight) for a single document:
POST {INDEX_NAME}/{TYPE_NAME}
{
"title_suggest": [
{
"input": "title1",
"weight": 14
},
{
"input": "alternate_title",
"weight": 7
}
]
}
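The three accepted shapes of a completion field value (short form, single object, array) can be reduced to one list of (input, weight) pairs. The following is a client-side sketch of that normalization; `normalize_completions` is a hypothetical helper, and the default weight of 1 is an assumption because the default is not stated here.

```python
DEFAULT_WEIGHT = 1  # assumption: the default weight is not given in the text above


def normalize_completions(field_value):
    """Turn a short-form string, a single entry, or an array into (input, weight) pairs."""
    items = field_value if isinstance(field_value, list) else [field_value]
    entries = []
    for item in items:
        if isinstance(item, str):
            entries.append((item, DEFAULT_WEIGHT))  # short form carries no weight
        else:
            entries.append((item["input"], item.get("weight", DEFAULT_WEIGHT)))
    return entries


array_form = [
    {"input": "title1", "weight": 14},
    {"input": "alternate_title", "weight": 7},
]
print(normalize_completions(array_form))
```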
Indexing context-enabled fields
You can use the path option mentioned previously to pull context values from another field
in the document, or add a contexts option to the completion entry while indexing.
The following explicitly indexes context values along with completions:
POST {INDEX_NAME}/{TYPE_NAME}
{
"genre_title_suggest": {
"input": "title1",
"contexts": {
"genre": ["genre1", "genre2"]
},
"weight": 7
}
}
You can also configure the path option in the context mapping to pull values from another
field as follows (assuming path for the genre context has been set to the genre field):
POST {INDEX_NAME}/{TYPE_NAME}
{
"genre_title_suggest": "title1",
"genre": ["genre1", "genre2"]
}
Query Interface
The point of indexing values as completions is to be able to run fast prefix-based searches on them.
You can run Prefix, Fuzzy and Regex queries on all completion fields. For a context-enabled
completion field, providing no context means that all contexts are considered. However, you
cannot run a Context query on a completion field with no contexts. When a query is run on a
context-enabled field, the contexts of each completion are returned with the suggestion.
Prefix Query
The following suggests completions from the field title_suggest that start with titl:
POST {INDEX}/_suggest
{
"suggest-namespace" : {
"prefix" : "titl",
"completion" : {
"field" : "title_suggest"
}
}
}
The suggestions are sorted by their index-time weight.
Fuzzy Prefix Query
A fuzzy prefix query can serve typo-tolerant suggestions. Suggestions that are closer to the provided prefix
(measured by edit distance) score higher, regardless of their weight.
POST {INDEX}/_suggest
{
"suggest-namespace" : {
"prefix" : "sug",
"completion" : {
"field" : "suggest",
"fuzzy" : { (1)
"fuzziness" : 2
}
}
}
}
Specify fuzzy as shown in (1) to use the typo-tolerant suggester. See also: full options for fuzzy.
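The ranking described for fuzzy prefix queries can be sketched as follows: compute the smallest edit distance between the typed prefix and any prefix of each candidate, keep candidates within the allowed fuzziness, and order them by distance. This uses plain Levenshtein distance, which is an assumption, since the edit-distance variant is not stated here. Breaking ties by weight is also an assumption, and both function names are hypothetical.

```python
def prefix_edit_distance(query, candidate):
    """Smallest Levenshtein distance between query and any prefix of candidate."""
    previous = list(range(len(candidate) + 1))  # distance from "" to candidate[:j] is j
    for i, q in enumerate(query, start=1):
        current = [i]  # distance from query[:i] to "" is i
        for j, c in enumerate(candidate, start=1):
            cost = 0 if q == c else 1
            current.append(min(previous[j] + 1,          # delete from query
                               current[j - 1] + 1,       # insert into query
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return min(previous)  # best match against any candidate prefix


def fuzzy_suggest(entries, prefix, fuzziness):
    """Rank (text, weight) entries by distance first; weight only breaks ties (assumption)."""
    scored = [(prefix_edit_distance(prefix, text), -weight, text) for text, weight in entries]
    return [text for distance, _, text in sorted(scored) if distance <= fuzziness]


entries = [("suggestion", 1), ("sugar", 50), ("sag", 99), ("title1", 7)]
ranked = fuzzy_suggest(entries, "sug", 1)
print(ranked)
```

Note that "sag" ranks last despite having the highest weight, because its distance to the prefix is larger.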
Regex Prefix Query
A regex prefix query matches all the term prefixes that match a regular expression. The regex is anchored at the beginning but not at the end.
The suggestions are sorted by their index-time weight.
POST {INDEX}/_suggest
{
"suggest-namespace" : {
"regex" : "s[ua]g", (1)
"completion" : {
"field" : "suggest"
}
}
}
Specify regex as shown in (1), instead of prefix, to use regular expressions. See also: supported regular expression syntax.
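The anchoring behavior can be tried out with Python's `re.match`, which also anchors at the start of the string but not at the end. This is only an analogy: Python's regular expression syntax is not the syntax the suggester supports, and `regex_suggest` is a hypothetical helper. The example also shows that inside a character class `|` is a literal character, not an alternation.

```python
import re


def regex_suggest(entries, pattern):
    """Keep (text, weight) entries whose text starts with a match, heaviest first."""
    compiled = re.compile(pattern)
    hits = [(text, weight) for text, weight in entries if compiled.match(text)]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


entries = [("suggest", 3), ("sage", 5), ("desugar", 9), ("s|g", 1)]
plain = regex_suggest(entries, "s[ua]g")
with_pipe = regex_suggest(entries, "s[u|a]g")
print(plain)
print(with_pipe)
```

"desugar" is never returned because the match must begin at the first character, while "suggest" and "sage" match even though they continue past the pattern.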
Context Query
Adding the contexts option (1) to the query enables filtering and/or boosting suggestions based on their context values.
This query scores suggestions by multiplying the query-time boost with the suggestion weight.
POST {INDEX}/_suggest
{
"suggest-namespace" : {
"prefix" : "sug",
"completion" : {
"field" : "genre_title_suggest",
"contexts": { (1)
"genre": [
{
"context" : "rock",
"boost" : 3
},
{
"context" : "indie",
"boost" : 2
}
]
}
}
}
}
The contexts can also be specified without any boost:
...
"contexts": {
"genre" : ["rock", "indie"]
}
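A small sketch of the scoring rule stated above, where the score is the query-time boost multiplied by the index-time weight. Two details are assumptions not covered by the text: a context given without a boost is treated as boost 1, and a suggestion matching several query contexts takes the largest boost. `context_score` is a hypothetical name.

```python
def context_score(entry_contexts, weight, query_contexts):
    """Return boost * weight for the best matching context, or None if none matches."""
    boosts = []
    for item in query_contexts:
        # A bare string means "no boost given"; treat it as boost 1 (assumption).
        value, boost = (item, 1) if isinstance(item, str) else item
        if value in entry_contexts:
            boosts.append(boost)
    return max(boosts) * weight if boosts else None


boosted_query = [("rock", 3), ("indie", 2)]
print(context_score({"rock"}, 7, boosted_query))
print(context_score({"indie"}, 7, boosted_query))
```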
Geo Context Query:
The results are scored so that suggestions are sorted first by the distance between their geo context and the provided
geo location, and then by the weight of the suggestions.
...
"contexts" : {
"location" : {
"context" : {
"lat" : ..,
"lon" : ..
},
"precision" : ..
}
}
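The ordering described for geo context queries (distance first, then weight) can be sketched with a two-part sort key. The haversine formula and the sample coordinates are illustrative assumptions; the suggester works on geohash cells, so its notion of distance is likely coarser than the exact great-circle distance used here.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def sort_by_distance_then_weight(suggestions, origin):
    """Nearest first; among equally distant suggestions, the heavier one wins."""
    return sorted(suggestions,
                  key=lambda s: (haversine_km(s["location"], origin), -s["weight"]))


origin = (52.52, 13.405)
suggestions = [
    {"text": "far_heavy", "location": (48.8566, 2.3522), "weight": 100},
    {"text": "near_light", "location": (52.52, 13.405), "weight": 1},
    {"text": "near_heavy", "location": (52.52, 13.405), "weight": 5},
]
ordered = [s["text"] for s in sort_by_distance_then_weight(suggestions, origin)]
print(ordered)
```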
Example
The following performs a Fuzzy Prefix Query combined with a Context Query on a context-enabled completion field named genre_song_suggest.
POST {INDEX}/_suggest
{
"suggest-namespace" : {
"prefix" : "like a roling st",
"completion" : {
"field" : "genre_song_suggest",
"fuzzy" : {
"fuzziness" : 2
},
"contexts" : {
"genre": [
{
"context" : "rock",
"boost" : 3
},
{
"context" : "indie",
"boost" : 2
}
]
}
}
}
}
This query will return all song names for the genres rock and indie that are within an edit distance of 2 from the prefix like a roling st.
The song names with a genre of rock will be boosted higher than those of indie.
The completion field values that share the longest prefix with like a roling st will be additionally boosted higher.
Payload
You can retrieve any document field values along with the completions by using the payload option.
The following returns the url field with each suggestion entry:
POST {INDEX}/_suggest
{
"suggest-namespace" : {
"prefix" : "titl",
"completion" : {
"field" : "title_suggest",
"payload" : ["url"]
}
}
}
The response format is as follows:
{
...
"suggest-namespace" : [
{
"prefix" : "sugg",
"offset" : 0,
"length" : 4,
"options" : [
{
"text" : "suggestion",
"score" : 34.0,
"payload": {
"url" : [ "url_1" ]
}
}
]
}
]
}
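A client could flatten a response of the shape shown above into rows of text, score and payload. The sketch below assumes the response has already been parsed into a Python dictionary; `extract_options` is a hypothetical helper, and the sample data simply mirrors the response format above.

```python
def extract_options(response, namespace):
    """Collect (text, score, payload) for every option under one suggest namespace."""
    rows = []
    for entry in response.get(namespace, []):
        for option in entry.get("options", []):
            # payload is only present when the query asked for it
            rows.append((option["text"], option["score"], option.get("payload", {})))
    return rows


response = {
    "suggest-namespace": [
        {
            "prefix": "sugg",
            "offset": 0,
            "length": 4,
            "options": [
                {"text": "suggestion", "score": 34.0, "payload": {"url": ["url_1"]}}
            ],
        }
    ]
}
rows = extract_options(response, "suggest-namespace")
print(rows)
```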