Update Edge NGram Tokenizer documentation #48956
Pinging @elastic/es-search (:Search/Analysis)
Pinging @elastic/es-docs (:Docs)
Hi @Stefanqn, thanks for reaching out. Unfortunately, I wasn't able to reproduce your reported problem using the current snippet setup. I've outlined my steps below. Can you highlight the expected behavior or where your steps differed?
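For reference, the tokenizer-level behavior can be checked with the `_analyze` API. The sketch below is illustrative, not the exact request from the thread; the inline `min_gram`/`max_gram` values are assumptions:

```
POST _analyze
{
  "tokenizer": {
    "type": "edge_ngram",
    "min_gram": 1,
    "max_gram": 3,
    "token_chars": ["letter"]
  },
  "text": "aaaa"
}
```

With these settings the response contains only the grams `a`, `aa`, and `aaa`; the full term `aaaa` is never emitted, which is the crux of the report below.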
please set …
Thanks @Stefanqn. I was able to reproduce. I'll work on getting this fixed in our docs. Thanks for reporting.
thanks!
…for index analyzers (#49007) The `edge_ngram` tokenizer limits tokens to the `max_gram` character length. Autocomplete searches for terms longer than this limit return no results. To prevent this, you can use the `truncate` token filter to truncate tokens to the `max_gram` character length. However, this could return irrelevant results. This commit adds some advisory text to make users aware of this limitation and outline the tradeoffs for each approach. Closes #48956.
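As a sketch of the workaround the commit message describes, a search analyzer can truncate query terms to the `max_gram` length before matching. The index, analyzer, and filter names here are illustrative assumptions, not taken from the commit:

```
PUT my-index
{
  "settings": {
    "analysis": {
      "filter": {
        "truncate_3": {
          "type": "truncate",
          "length": 3
        }
      },
      "analyzer": {
        "autocomplete_search": {
          "tokenizer": "lowercase",
          "filter": ["truncate_3"]
        }
      }
    }
  }
}
```

A query term such as `aaaa` is cut down to `aaa`, which can match an indexed gram. The tradeoff, as noted above, is that `aaaa` and `aaab` become indistinguishable after truncation, so irrelevant results are possible.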
Elasticsearch version (`bin/elasticsearch --version`): current

Plugins installed: []

JVM version (`java -version`): -

OS version (`uname -a` if on a Unix-like system): -

Description of the problem including expected versus actual behavior: bad behaviour
Steps to reproduce:

1. Configure both `analyzer` and `search_analyzer` on a field, with a small `max_gram` length, e.g. `"max_gram": 3`.
2. Index a term longer than the `max_gram` length, such as "aaaa".
3. Searching for the existing, indexed "aaaa" returns an empty result set, showing the need for a `truncate` filter (see the sketch below).

Provide logs (if relevant): -
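A minimal, self-contained sketch of these steps; the index name, field name, analyzer names, and `min_gram` value are illustrative assumptions:

```
PUT test-index
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "edge3": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 3,
          "token_chars": ["letter"]
        }
      },
      "analyzer": {
        "autocomplete": {
          "tokenizer": "edge3",
          "filter": ["lowercase"]
        },
        "autocomplete_search": {
          "tokenizer": "lowercase"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "autocomplete_search"
      }
    }
  }
}

PUT test-index/_doc/1?refresh
{
  "title": "aaaa"
}

GET test-index/_search
{
  "query": {
    "match": {
      "title": "aaaa"
    }
  }
}
```

Indexing emits only the grams `a`, `aa`, and `aaa`, while the search analyzer leaves the query term `aaaa` intact, so the match query returns no hits.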