Skip to content

Fix exception message when validating the name of tokenizer(edgeNGram) #112705

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

YeonghyeonKO
Copy link
Contributor

@YeonghyeonKO YeonghyeonKO commented Sep 10, 2024

The method getTokenizers in CommonAnalysisPlugin returns IllegalArgumentException with the message:

The [edgeNGram] tokenizer name was deprecated in 7.6. Please use the tokenizer name to [edge_nGram] for indices created in versions 8 or higher instead.

After IndexVersion v.8.0.0, edge_ngram (not edge_nGram) is the exact name of edgeNGram tokenizer so I ask to change the word in this Pull Request. The PR could reduce confusion in defining edgeNGram tokenizers when statically mapping indices in Elasticsearch.

@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v8.16.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Sep 10, 2024
@YeonghyeonKO YeonghyeonKO changed the title fix exception message when validating the name of tokenizer(edgeNGram) Fix exception message when validating the name of tokenizer(edgeNGram) Sep 10, 2024
@YeonghyeonKO
Copy link
Contributor Author

@mark-vieira Hi,

  • I'd like to ask you that since elastic decided to add APGL 3.0 to existing licenses, will it be applied starting from v9.0.0?
    (ref : https://www.elastic.co/kr/blog/elasticsearch-is-open-source-again)
  • Also, this PR is about index mappings when using edgeNGram tokenizer then which team is going to review this PR? (If a meeting for triage is held regularlym, please let me know the schedule.

@javanna javanna added :Search Relevance/Analysis How text is split into tokens and removed needs:triage Requires assignment of a team area label labels Sep 13, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Sep 13, 2024
@mark-vieira
Copy link
Contributor

As mentioned in the licensing FAQ the AGPL change will apply starting with version 8.16.

@carlosdelest carlosdelest self-assigned this Sep 17, 2024
@carlosdelest carlosdelest added auto-backport Automatically create backport pull requests when merged v8.16.0 >non-issue labels Sep 17, 2024
@carlosdelest
Copy link
Member

@elasticsearchmachine run elasticsearch-ci

@carlosdelest
Copy link
Member

@elasticsearchmachine update branch

@carlosdelest
Copy link
Member

@elasticmachine test this please

@carlosdelest
Copy link
Member

@YeonghyeonKO I tried to update the branch and resolve conflicts, but some compilation errors resulted from it. Can you please update from the branch and correct them?

Thanks!

@benwtrent
Copy link
Member

@carlosdelest @YeonghyeonKO take a look here: #113009

@YeonghyeonKO
Copy link
Contributor Author

@carlosdelest Thanks for pulling codes from main branch to this branch. I fixed compilation error by importing classes and defining static variable(deprecationLogger).

However as @benwtrent commented above that after @javanna's PR( #113009 ) had been merged already, this PR can be CLOSED.

@@ -296,6 +303,22 @@ public Map<String, AnalysisProvider<TokenizerFactory>> getTokenizers() {
tokenizers.put("thai", ThaiTokenizerFactory::new);
tokenizers.put("ngram", NGramTokenizerFactory::new);
tokenizers.put("edge_ngram", EdgeNGramTokenizerFactory::new);
tokenizers.put("edgeNGram", (IndexSettings indexSettings, Environment environment, String name, Settings settings) -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, this should just be removed :). I think this pr can be closed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for reviewing my PR :)

@javanna
Copy link
Member

javanna commented Sep 18, 2024

This fix was still valid for 8.x I believe, my commit with the removal went only to main.

@carlosdelest
Copy link
Member

Thanks for clarifying @javanna !

@YeonghyeonKO , can you open another PR with this code, but to 8.x branch, and assign it to me?

Thanks!

@YeonghyeonKO YeonghyeonKO reopened this Sep 20, 2024
@YeonghyeonKO YeonghyeonKO changed the base branch from main to 8.x September 20, 2024 00:57
@YeonghyeonKO YeonghyeonKO requested review from a team as code owners September 20, 2024 00:57
@YeonghyeonKO YeonghyeonKO changed the base branch from 8.x to main September 20, 2024 00:58
@YeonghyeonKO
Copy link
Contributor Author

YeonghyeonKO commented Sep 20, 2024

@carlosdelest @javanna
OK, I'll open a new PR for 8.x version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged external-contributor Pull request authored by a developer outside the Elasticsearch team >non-issue :Search Relevance/Analysis How text is split into tokens Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v8.16.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants