Skip to content

Resolve ILM settings based loaded template or/and template_api #1102

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

mashhurs
Copy link
Contributor

@mashhurs mashhurs commented Dec 16, 2022

Issue description

When installing the template (either custom or auto resolved), sometimes there is a mismatch. For example, without this change, the case with template_api=>'legacy' connecting to ES 8+ resolves the wrong ILM setting field and crashes.

What this change introduces?

This change introduces ILM setting resolution based on custom/loaded template or/and template_api settings.

Acceptance criteria

For the given any settings, including custom template or/and template_api, plugin needs to resolve ILM setting instead crashing.

Testing

  • Unit tests
Finished in 1 minute 22.92 seconds (files took 7.1 seconds to load)
269 examples, 0 failures

Randomized with seed 22550

History
The change mainly solves two problems:

  1. the plugin crashes when resolving and installing default template using legacy template (legacy with ES 8.x versions, composable with ES 7.x versions behaves same). This change encourages customers using custom their own custom template when using legacy template API (legacy or composable) with any ES versions.
  2. the plugin crashes when using custom template with ILM enabled. This change opts out the ILM setting processes when using custom template and discourages users using custom template when ILM enabled, informs to set ILM settings in the template.

@andsel
Copy link
Contributor

andsel commented Dec 16, 2022

I don't know why this change impact also the test with datastream and alias, maybe are not related.
However I think we should add an integration test to cover this use case, maybe a test that's enabled when running on 7.x stack that check the template installation.

@yaauie
Copy link
Contributor

yaauie commented Dec 19, 2022

I have giant warning bells about exploding complexity with this change.

We already have significant complexity in the matrix of templates we need to maintain, and this change as-proposed only solves one possible edge-case (e.g., template_api => composable when connected to ES7 would attempt to push a non-composable template to the compostable template API). Extending this change-request to add a third dimension to the matrix of templates we maintain is costly from a maintenance/complexity standpoint.

I would much rather we make using the template_api require that the user also provides a template, because doing so is less surprising than secretly using an ES7 template when we also provide ES8 templates.

@mashhurs
Copy link
Contributor Author

I have giant warning bells about exploding complexity with this change.

We already have significant complexity in the matrix of templates we need to maintain, and this change as-proposed only solves one possible edge-case (e.g., template_api => composable when connected to ES7 would attempt to push a non-composable template to the compostable template API). Extending this change-request to add a third dimension to the matrix of templates we maintain is costly from a maintenance/complexity standpoint.

I would much rather we make using the template_api require that the user also provides a template, because doing so is less surprising than secretly using an ES7 template when we also provide ES8 templates.

Thanks for @yaauie for good insights. I think we have to make a clear behavior for default template + ES version combination with our plugin since following edge cases are open:

  1. template_api is legacy with ES8 (current behavior is to use legacy template API with ES8+ecs_compatibility template)
  2. template_api is composable with ES7 (current behavior is to use index template API with ES7+ecs_compatibility template)

For the 1st case, as per ES doc, it sounds to me we need to support default template as legacy template is not yet deprecated.
For the 2nd case, I think we need to simply fail since ES7 doesn't even have a index template (_index_template index)

Your opinion is great that simplifies and encourages users to know template details what they are doing and future changes. But my concern is, we seem not aligning with ES8. It seems ES8 (the attachment below with 8.5.3) has a legacy template API and default templates. Let's have a quick discussion when you get available to be on same page.

image

@mashhurs mashhurs self-assigned this Dec 30, 2022
@mashhurs mashhurs requested a review from yaauie January 4, 2023 11:21
@mashhurs mashhurs changed the title Load and use legacy template with ES v8+ when using legacy template API. Properly handling the case where using resolving custom template with legacy template API and ES8+ versions Jan 4, 2023
@mashhurs mashhurs requested a review from jsvd January 9, 2023 15:11
@mashhurs mashhurs marked this pull request as ready for review January 9, 2023 15:11
Copy link
Contributor

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Breaks real-world configs by silently ignoring ilm_* settings when template is provided.
  • Guards only against specific combination of (template_api=>legacy) + (with default template) + ( when connected to ES8), when other combinations cause the same root issue (template_api is a hint about a provided template, not a directive)

@@ -15,16 +15,24 @@ def self.install_template(plugin)
"The legacy template API is slated for removal in Elasticsearch 9.")
end

# we need to guide customers using their own (custom) template when they want to use legacy template API with Elasticsearch 8.x (or up) versions,
# otherwise `load_default_template` method loads the template based on ES 8.x versions which fits the `_index_template` index mapping but not `_template`.
if !plugin.template && plugin.template_api == 'legacy' && plugin.maximum_seen_major_version >= 8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not just about template_api => legacy; it is about the provided template_api setting matching the format of the template. It exists so that users can (a) continue to use their own legacy-formatted templates when their LS (any version) is connected to ES8 or (b) begin using their own composable-formatted templates before upgrading to ES8. When a user does not provide a template, they cannot provide a hint as to the format of the template that we will auto-load.

For example:

  • An LS pipeline configured with template_api => composable (and no explicit template), connected to ES7, will not work because the default template when connected to ES7 is a legacy-formatted template.
  • Similarly, a user running any version of LS, connected to ES7, configured explicitly with template_api => legacy (and no explicit template) will work by accident because the nature of being connected to ES7 causes the default value for template to be a legacy template; when the user upgrades their ES to ES8 without touching their LS config, the nature of them being connected to ES8 will cause this plugin to select a composable-format template, but their setting template_api => legacy will cause it to be routed to an API that can't make sense of it, which will in turn cause the pipeline to crash until it is modified.

Explicitly setting template_api at all should require a template to be explicitly given.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Ry for the clarification!
Indeed the case is same when template_api is composable with ES 7.x versions. And we have cases where customer is using the ES 7.x versions and continuously upgrading es-output plugin.
I have applied the change which abstracts overall case. Please review it one more time.
Thanks.

template = read_template_file(plugin.template)
else
plugin.logger.info("Using a default mapping template", :es_version => plugin.maximum_seen_major_version,
:ecs_compatibility => plugin.ecs_compatibility)
template = load_default_template(plugin.maximum_seen_major_version, plugin.ecs_compatibility)
# we set ILM setting ONLY when we automatically resolve template
add_ilm_settings_to_template(plugin, template) if plugin.ilm_in_use?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving this into "if template provided" clause will silently break users who rely on the current injection when they provide both their own base template and also configure `ilm_* settings.

If we intend to change our behaviour to prevent combining template and ilm_*, then we should error descriptively when the two incompatible options are both provided. This is still breaking, but not silent.

A better path may be to deprecate providing the two options simultaneously, and to attempt to apply the given ilm_* settings to the given template, rejecting the configuration only if applying those settings fails.

This would allow us to:

  • discourage the use of template and ilm_* together
  • continue supporting working-in-the-wild configurations
  • guide uses with currently-non-working solutions to manage their entire template in one place

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the 1st and 2nd points, I was thinking to warn
Related lines: elasticsearch.rb:L350 and template_manager.rb:L31~L34

Can you please provide more info about the 3rd point, what would be an ideal suggestion manage their entire template in one place?!

@mashhurs mashhurs changed the title Properly handling the case where using resolving custom template with legacy template API and ES8+ versions Clarifying two cases: installing default template when using legacy template API and using custom template when ILM enabled. Jan 16, 2023
@mashhurs mashhurs changed the title Clarifying two cases: installing default template when using legacy template API and using custom template when ILM enabled. Clarifying two cases: installing default template when using legacy template API and using custom template when ILM enabled. Jan 16, 2023
Copy link
Contributor

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot remove support for combining ilm_* settings with template in a patch release, silently or otherwise. While doing so technically prevents the specific situation that leads up to a crash, it also breaks real-world configurations and violates our promise to our users to provide them with adequate deprecation windows so that they can resolve pending breaking changes independently of upgrades to our stack components.

Half-managed ILM/templates historically have worked, and our docs talk about ILM injection in a manner that makes no caveats for the source of the template that they are injecting the ILM settings into. When ILM injection guesses the format of the template correctly, it works even if the user provided their own base template (e.g., template_api => legacy with a legacy-formatted template connected to ES 7; OR template_api => composable with a composable-formatted template connected to ES 8).

There are two issues at play, both with the newly-introduced template_api hint about the format of the provided template (which allows the user to provide a template whose format doesn't match the favored format for the version of ES we are connected to):

  1. template_api does not make sense in the absence of a user-provided template (it does not change how a default template is resolved or generated, and making it do so was never part of its scope).
  2. Even when a template_api hint is available, it is not used by the ILM injection code when it is guessing how to apply the ILM settings. Instead, it continues to switch based solely on which version of Elasticsearch it is connected to, and when it guesses incorrectly it crashes opaquely.

Therefore:

  • Providing the new template_api setting without also providing a template should be a configuration error (e.g., a `template` is required when providing a `template_api` hint about how to handle a user-provided `template`; If you wish to use the #{template_api} template API, you need to provide your own #{template_api}-formatted template).
  • ILM should use the template_api hint when deciding how to apply the ILM settings.

Additionally, we know that half-managed templates with ILM are unnecessarily risky for our users, and the complexity is only going to continue to grow; therefore:

  • we should deprecate (but continue to support) providing ilm_* settings along-side an explicit template, and should guide users to resolve the deprecation by adding their ILM settings to the template that they are providing (e.g., "Injecting ILM configuration into a custom template is deprecated and support for doing so will be removed in a future release; when managing your own template, please include your ILM configuration directly in the template you provide instead of using this plugin's `ilm_*` settings")
  • ILM injection should be audited to ensure we are as resilient as we can possibly be (e.g., tolerate missing intermediate nestings by creating them)
  • When ILM injection does crash (e.g., if we need to set a key on a key/value map, but the value at that address is a scalar), we should provide clear messaging and helpful guidance (e.g., "failed to apply ILM settings to the provided template: #{e.message}; when managing your own template, we advise including your ILM configuration directly in the template you provide and avoiding this plugin's `ilm_*` settings")

@@ -1079,6 +1079,8 @@ the *$JDK_HOME/conf/security/java.security* configuration file. That is, `TLSv1.
* There is no default value for this setting.

You can set the path to your own template here, if you so desire.
When a custom template is used, the limitation is that ILM setups will be ignored. It is recommended to add ILM settings in the template, manually manage rollover and policy.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking change that can be avoided, and certainly not one that should be released in a patch-level fix.

@mashhurs mashhurs force-pushed the legacy-template-with-es-v8-plus-bugfix branch from c8f2089 to 46815e2 Compare February 7, 2023 13:04
@mashhurs mashhurs changed the title Clarifying two cases: installing default template when using legacy template API and using custom template when ILM enabled. Resolve ILM settings based loaded template or/and template_api Feb 7, 2023
Comment on lines 57 to 61
return composable_index_template_settings(template) if template.key?('template')
return legacy_index_template_settings(template) if template.key?('settings')
return template_endpoint(plugin) == INDEX_TEMPLATE_ENDPOINT ?
composable_index_template_settings(template) :
legacy_index_template_settings(template)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot of logic and guesswork here and I'm a bit concerned with the lack of logging

Suggested change
return composable_index_template_settings(template) if template.key?('template')
return legacy_index_template_settings(template) if template.key?('settings')
return template_endpoint(plugin) == INDEX_TEMPLATE_ENDPOINT ?
composable_index_template_settings(template) :
legacy_index_template_settings(template)
if template.key?('template')
logger.trace("template_settings: settings location for template is under 'template' key", :template => template, :template_api => plugin.template_api, :es_version => plugin.maximum_seen_major_version)
composable_index_template_settings(template)
elsif template.key?('settings')
logger.trace("template_settings: settings location for template is at object's root", :template => template, :template_api => plugin.template_api, :es_version => plugin.maximum_seen_major_version)
legacy_index_template_settings(template)
else
template_endpoint = template_endpoint(plugin)
logger.trace("template_settings: template doesn't have 'settings' or 'template' fields, falling back to auto detection, :template => template, :template_api => plugin.template_api, :es_version => plugin.maximum_seen_major_version, :detected_endpoint => template_endpoint)
template_endpoint == INDEX_TEMPLATE_ENDPOINT ?
composable_index_template_settings(template) :
legacy_index_template_settings(template)
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the logs between as well.

legacy_index_template_settings(template)
end

# Returns (if exists) or creates _template API compatible template settings
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Returns (if exists) or creates _template API compatible template settings
# Sets ['settings'] field to be compatible with _template API structure

template['settings'] ||= {}
end

# Returns (if exists) or creates _index_template API compatible template settings
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Returns (if exists) or creates _index_template API compatible template settings
# Sets the ['template']['settings'] fields if not exist to be compatible with _index_template API structure

Copy link
Member

@jsvd jsvd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I've tested the multiple scenarios and it all seems to work correctly.

@mashhurs mashhurs force-pushed the legacy-template-with-es-v8-plus-bugfix branch from 0eedf94 to 0dd560a Compare February 7, 2023 17:47
@mashhurs mashhurs merged commit 4507240 into logstash-plugins:main Feb 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants