Skip to content

Add version to prebuilt analyzers #3966

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 28, 2013

Conversation

spinscale
Copy link
Contributor

This patch takes the version of the created index into account when a
prebuilt analyzer is created.
So, if an index was created with 0.90.4, then the prebuilt analyzers
will be the same than on the 0.90.4 release.

One reason for this feature is the possibility to change pre built
analyzers like the standard one.

The patch tries to reuse analyzers as mutch as possible. So even if
version X.Y.Z and X.Y.A use the same lucene analyzers, the same instance
is reused in order to prevent overcreation of lucene analyzer instances.

Closes #3790

@kimchy
Copy link
Member

kimchy commented Oct 24, 2013

am I correct in the fact that we just cache the analyzers for the current version, and if the version is not CURRENT, then we create a new instance each time? If so:

First, this is not very evident in terms of design in the PreBuiltAnalyzerProviderFactory class (i.e. the CURRENT version is provided as a constructor parameter, which you don't know that its based on CURRENT, and then the check for CURRENT is done when asking to provide the analyzer.

The other part is that if someone upgrades, for example, from 0.90.5 to 0.90.6, then all the other indices in previous versions will now create all the analyzers for all the different indices on that version (0.90.5). I was thinking that IndicesAnalysisService would also cache the fact that an analyzer was create for version X, so it its asked again, it will return it. Otherwise, on many indices case deployment and a minor upgrade, we end up creating many analyzers. The caching can be a tuple of analyzer name and the Version, with its own built PreBuiltAnalyzerProviderFactory. Make sense?

@kimchy
Copy link
Member

kimchy commented Oct 24, 2013

Ahh, I missed the fact that we have the cache on the enum itself. 2 things that I think are still problematic, will comment over it.

Version indexVersion = settings.getAsVersion(IndexMetaData.SETTING_VERSION_CREATED, Version.CURRENT);
if (!Version.CURRENT.equals(indexVersion)) {
Analyzer analyzer = PreBuiltAnalyzers.valueOf(name.toUpperCase(Locale.ROOT)).getAnalyzer(indexVersion);
return new PreBuiltAnalyzerProvider(name, AnalyzerScope.INDEX, analyzer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be AnalyzerScope.INDEX, cause its cached on the enum level, so the scope should probably AnalyzerScope.INDICES. I would simply use the scope provided in the constructor (which will be the INDICES one).

@spinscale
Copy link
Contributor Author

Updated: Set the Scope to INDICES and made the getAnalyzer method synchronized. Thx for reviewing.

@s1monw
Copy link
Contributor

s1monw commented Oct 25, 2013

@spinscale maybe we should have a test that creates and closes indices randomly and makes sure we don't get the Scope wrong in the future. I think such a test would be very helpful!

This patch takes the version of the created index into account when a
prebuilt analyzer is created.
So, if an index was created with 0.90.4, then the prebuilt analyzers
will be the same than on the 0.90.4 release.

One reason for this feature is the possibility to change pre built
analyzers like the standard one.

The patch tries to reuse analyzers as mutch as possible. So even if
version X.Y.Z and X.Y.A use the same lucene analyzers, the same instance
is reused in order to prevent overcreation of lucene analyzer instances.

Closes elastic#3790
@spinscale spinscale merged commit ec0880d into elastic:master Oct 28, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add version to Analyzers
3 participants