Make it easier to optimize search with better analysis

We have a number of filters that can help make search faster:
 - shingles for faster phrases
 - n-grams for infix search
 - edge n-grams for prefix/suffix search

Yet leveraging them to improve search speed typically makes Elasticsearch much harder to use since query parsers are not aware of whether these filters are in use.

To give the example of prefix search, I'm wondering whether we should add a `MappedFieldType.prefixQuery` factory method that would be called by query parsers. Regular `text` fields would still create a `PrefixQuery`, but we could have a new field type optimized for prefix search that would automatically add a filter to the analysis chain at index time. It would be like the edge n-gram filter, except that it would add a marker to differentiate prefixes from actual terms. For instance, if we want to optimize prefix queries for prefixes of up to 4 chars, we could analyze `foobar` as [`foobar`, `\0f`, `\0fo`, `\0foo`, `\0foob`]. I'm using `\0` here, but anything that differentiates prefixes from the original term while preventing collisions would work.
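To make the index-time side concrete, here is a minimal sketch in plain Java of the expansion described above. It is not an actual Lucene `TokenFilter`; the class name `PrefixIndexSketch`, the method `indexTerms`, and the `maxPrefixLength` parameter are hypothetical names chosen for illustration, and it counts `char`s rather than code points for simplicity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: a real implementation would be a filter
// in the analysis chain, not a standalone helper like this one.
final class PrefixIndexSketch {
    // Marker that separates prefix terms from actual terms (an assumption;
    // any character that cannot collide with real terms would do).
    static final char MARKER = '\0';

    // Returns the original term followed by its marked prefixes of
    // length 1..maxPrefixLength (capped at the term's own length).
    static List<String> indexTerms(String term, int maxPrefixLength) {
        List<String> out = new ArrayList<>();
        out.add(term);
        int limit = Math.min(term.length(), maxPrefixLength);
        for (int len = 1; len <= limit; len++) {
            out.add(MARKER + term.substring(0, len));
        }
        return out;
    }

    public static void main(String[] args) {
        // Print the terms with the marker made visible as "\0".
        for (String t : indexTerms("foobar", 4)) {
            System.out.println(t.replace("\0", "\\0"));
        }
    }
}
```

With `maxPrefixLength = 4`, `indexTerms("foobar", 4)` yields the five terms from the example: `foobar`, `\0f`, `\0fo`, `\0foo`, `\0foob`.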

Then at search time, `MappedFieldType.prefixQuery` would look at the length of the prefix: if it is 4 chars or fewer, it would prepend a `\0` and run a `term` query; otherwise it would run a regular `PrefixQuery`.
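The search-time decision could look roughly like the sketch below. To keep it self-contained it does not depend on Lucene: the `Plan` class and `Kind` enum are hypothetical stand-ins for the `term` query and `PrefixQuery` that a real `MappedFieldType.prefixQuery` would build, and `maxPrefixLength` is an assumed per-field setting. Falling back to a regular prefix query for an empty prefix is also an assumption, since no marked term would be indexed for it.

```java
// Hypothetical illustration only: Plan/Kind stand in for real query objects.
final class PrefixSearchSketch {
    enum Kind { TERM, PREFIX }

    static final class Plan {
        final Kind kind;
        final String text;

        Plan(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    // Must match the marker used at index time (an assumption).
    static final char MARKER = '\0';

    // Short prefixes become an exact lookup on the marked term;
    // longer (or empty) prefixes fall back to a regular prefix query.
    static Plan prefixQuery(String prefix, int maxPrefixLength) {
        if (!prefix.isEmpty() && prefix.length() <= maxPrefixLength) {
            return new Plan(Kind.TERM, MARKER + prefix);
        }
        return new Plan(Kind.PREFIX, prefix);
    }

    public static void main(String[] args) {
        Plan shortPlan = prefixQuery("foo", 4);
        Plan longPlan = prefixQuery("fooba", 4);
        System.out.println(shortPlan.kind + " " + shortPlan.text.replace("\0", "\\0"));
        System.out.println(longPlan.kind + " " + longPlan.text);
    }
}
```

The point of the design is that the caller (a query parser) never needs to know whether the field indexes marked prefixes; the field type picks the cheaper query when it can.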

We could do the same for infix search or phrase queries using similar ideas.


Make it easier to optimize search with better analysis #27049
