-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Add limits for ngram and shingle settings #27411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add limits for ngram and shingle settings #27411
Conversation
Create index-level settings: max_ngram_diff - maximum allowed difference between max_gram and min_gram in NGramTokenFilter/NGramTokenizer. Default is 1. max_shingle_diff - maximum allowed difference between max_shingle_size and min_shingle_size in ShingleTokenFilter. Default is 3. Log a warning when trying to create NGramTokenFilter, NGramTokenizer, ShingleTokenFilter where difference between max_size and min_size exceeds the settings's value. Closes elastic#25887
b5093d5
to
836f0c6
Compare
@@ -27,6 +27,23 @@ | |||
- match: { detail.tokenizer.tokens.2.token: od } | |||
|
|||
--- | |||
"nGram_exception": | |||
- skip: | |||
version: " - 6.0.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jimczi Jim, I am not sure about this 6.0.0
number. I assumed since we are on 6.0.1
now, I would skip up to 6.0.0
.
Or should this rather be 6.0.99
considering that you label this PR as v6.1.0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pr targets 6.x, that's 6.1.0. We don't want to break the build so even if you plan to backport to 6.0.1 you need to do it in multiple steps. The first commit sets the expectation to the current version (so 6.0.99 here), otherwise the bwc tests would fail and you can change the expectations only after the backport to previous versions.
For this deprecation, I think it's ok to not backport to 6.0.1 and start only in 6.1.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mayya-sharipova
If the backport is simple you can directly cherry-pick and push. We usually just tag the pr on master with all the targeted versions.
Though this pr is against 6.x, that's 6.1.0 currently so the deprecation will start in this version, you need to change the rest tests.
"nGram_exception": | ||
- skip: | ||
version: " - 6.0.0" | ||
reason: starting from version 6.0.1 this produces a warning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The deprecation will start in 6.1.0 (that's the current 6.x).
Create index-level settings: max_ngram_diff - maximum allowed difference between max_gram and min_gram in NGramTokenFilter/NGramTokenizer. Default is 1. max_shingle_diff - maximum allowed difference between max_shingle_size and min_shingle_size in ShingleTokenFilter. Default is 3. Log a warning when trying to create NGramTokenFilter, NGramTokenizer, ShingleTokenFilter where difference between max_size and min_size exceeds the settings's value. Closes elastic#25887
* 6.x: (41 commits) [TEST] Fix `GeoShapeQueryTests#testPointsOnly` failure Transition transport apis to use void listeners (#27440) AwaitsFix GeoShapeQueryTests#testPointsOnly #27454 Ensure nested documents have consistent version and seq_ids (#27455) Tests: Add Fedora-27 to packaging tests #26800: Fix docs rendering Move the CLI into its own subproject (#27114) Correct usage of "an" to "a" in getting started docs Avoid NPE when getting build information Remove manual tracking of registered channels (#27445) Standardize underscore requirements in parameters (#27414) Remove parameters on HandshakeResponseHandler (#27444) [GEO] fix pointsOnly bug for MULTIPOINT peanut butter hamburgers Uses TransportMasterNodeAction to update shard snapshot status (#27165) Log primary-replica resync failures Add limits for ngram and shingle settings (#27411) Enforce a minimum task execution and service time of 1 nanosecond Fix place-holder in allocation decider messages (#27436) Remove newline from log message (#27425) ...
Create index-level settings:
max_ngram_diff - maximum allowed difference between max_gram and min_gram in
NGramTokenFilter/NGramTokenizer. Default is 1.
max_shingle_diff - maximum allowed difference between max_shingle_size and
min_shingle_size in ShingleTokenFilter. Default is 3.
Log a warning when
trying to create NGramTokenFilter, NGramTokenizer, ShingleTokenFilter
where difference between max_size and min_size exceeds the settings's value.
Relates to #27211
Closes #25887