Add setting to decommission legacy monitoring cluster alerts #62668
Conversation
This will allow us to test if the exporters actually pull the watches back down when they aren't in use.
…sioning setting is present.
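For reference, a minimal sketch of what the decommissioning setting's full definition might look like (the diffs quoted later in this conversation show only its first line; the false default and the property flags here are assumptions, not confirmed by the PR):

    // Sketch only: the false default and Dynamic/NodeScope properties are assumed.
    public static final Setting<Boolean> MIGRATION_DECOMMISSION_ALERTS = boolSetting(
        "xpack.monitoring.migration.decommission_alerts",
        false,
        Setting.Property.Dynamic,
        Setting.Property.NodeScope);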
Pinging @elastic/es-core-features (:Core/Features/Monitoring)
@elasticmachine update branch
A couple of comments:
@@ -62,7 +63,9 @@ public Exporters(Settings settings, Map<String, Exporter.Factory> factories,

    final List<Setting.AffixSetting<?>> dynamicSettings =
        getSettings().stream().filter(Setting::isDynamic).collect(Collectors.toList());
    clusterService.getClusterSettings().addSettingsUpdateConsumer(this::setExportersSetting, dynamicSettings);
    final List<Setting<?>> updateSettings = new ArrayList<Setting<?>>(dynamicSettings);
Should #getSettings() also return this new setting? (I think it is only really used for testing, but it seems more correct to return all the settings up through the plugin.)
It seems I missed that step, good catch!
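To illustrate the fix being discussed, a hedged sketch of the plugin's setting registration with the new setting included (the method shape and the neighboring entries are assumptions about the code, not the PR's exact contents):

    // Hypothetical sketch of the Monitoring plugin's setting list,
    // now including the decommission setting alongside the existing ones.
    public List<Setting<?>> getSettings() {
        List<Setting<?>> settings = new ArrayList<>();
        settings.add(CLEAN_WATCHER_HISTORY);
        settings.add(MIGRATION_DECOMMISSION_ALERTS); // the addition in question
        // ... remaining monitoring and exporter settings
        return settings;
    }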
@@ -77,6 +77,9 @@

    public static final Setting<Boolean> CLEAN_WATCHER_HISTORY = boolSetting("xpack.watcher.history.cleaner_service.enabled",
        true, Setting.Property.Dynamic, Setting.Property.NodeScope, Setting.Property.Deprecated);

    public static final Setting<Boolean> MIGRATION_DECOMMISSION_ALERTS = boolSetting("xpack.monitoring.migration.decommission_alerts",
It feels kinda weird to have this AND the xpack.monitoring.exporters.*.cluster_alerts.management.enabled setting.
I wonder if it makes sense to:
- expose a POST _monitoring/cluster_alert/disable endpoint (or whatever the name is)
- when that is called, iterate through all of the current exporters, set xpack.monitoring.exporters.<i>.cluster_alerts.management.enabled=false, and set some cluster state to no longer try to install cluster alerts (and reach out to delete the cluster alerts for those exporters)
- if a new exporter is created, set xpack.monitoring.exporters.<new_exporter>.cluster_alerts.management.enabled=false based on the cluster state
- expose a POST _monitoring/cluster_alert/enable endpoint

Thoughts? (A rough sketch of the disable step follows below.)
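Purely as a hypothetical sketch of the proposed disable step (this is not what the PR implements, and exporterNames is an assumed input):

    // Build one persistent settings update that disables cluster alert
    // management for every currently configured exporter.
    Settings.Builder disable = Settings.builder();
    for (String exporterName : exporterNames) { // exporterNames: assumed input
        disable.put("xpack.monitoring.exporters." + exporterName
                + ".cluster_alerts.management.enabled", false);
    }
    ClusterUpdateSettingsRequest request =
            new ClusterUpdateSettingsRequest().persistentSettings(disable);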
I'd be in favour of fewer settings, but OTOH the current flow with this setting seems simpler and less error-prone. And currently you can't enable a single exporter back (with xpack.monitoring.exporters.<i>.cluster_alerts.management.enabled=true), which I think is a good thing (easier to reason about cluster state).
We can still expose a _monitoring/cluster_alert/disable endpoint if we want; it may be easier for the user to just call an API than to look up the setting to use.
I consider this more of a first step, with a new action that performs the migration to follow. If the exporters are disabled, this setting only ensures that the alerts are not created if they are turned back on (since alerts are created lazily, much to my chagrin).
The main concern raised in the migration issue was that if a new exporter is configured, either on purpose or by accident, the cluster alerts for that exporter will still be installed. A global setting to control the existence of the alerts is the easiest way to fix that, even if it overlaps with other settings.
That makes sense, thanks for explaining.
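To make the overlap between the two settings concrete, the intended interaction can be read as the global setting taking precedence over the per-exporter flag (a sketch of the decision logic, with clusterAlertsManagementEnabled and settings as assumed local names, not the PR's exact code):

    // Assumed logic: cluster alert watches are only installed when the
    // per-exporter flag is on AND the global decommission setting is off.
    boolean installClusterAlerts = clusterAlertsManagementEnabled
            && Monitoring.MIGRATION_DECOMMISSION_ALERTS.get(settings) == false;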
@@ -484,6 +488,56 @@ public void testWatchPublishBlocksAfterSuccessfulWatcherCheck() {

        verifyNoMoreInteractions(client);
    }

    public void testWatchRemovalOnDecomissionAlerts() {
I think this name is a bit misleading? (maybe testDeployClusterAlerts?)
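For orientation only, a bare outline of what a test along these lines might check (the steps in comments are hypothetical; the actual test relies on this suite's existing mocks):

    public void testWatchRemovalOnDecommissionAlerts() {
        // Hypothetical outline: with the decommission setting enabled, the
        // exporter should remove existing cluster alert watches rather than
        // (re)deploying them.
        Settings settings = Settings.builder()
                .put("xpack.monitoring.migration.decommission_alerts", true)
                .build();
        // ... build the exporter with these settings, trigger its resource
        // setup, then verify delete-watch calls and no put-watch calls.
    }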
…/monitoring/exporter/local/LocalExporter.java Co-authored-by: Przemko Robakowski <[email protected]>
LGTM... a couple of minor points that could be addressed (but no need for additional reviews).
@elasticmachine run elasticsearch-ci/bwc
@elasticmachine run elasticsearch-ci/default-distro
@elasticmachine update branch
…62668) (#64443) Adds a setting that, when enabled, directs any currently running monitoring exporters to treat every cluster alert definition as excluded from the list of allowed cluster alert watches. This is the first step in adding a migration path away from cluster alerts configured by the monitoring plugin and toward those managed by the stack monitoring solutions on the new alerting feature. Co-authored-by: Przemko Robakowski <[email protected]>
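As a usage illustration (not taken from the PR), the new dynamic setting could be flipped cluster-wide with a standard cluster settings update; client here is an assumed org.elasticsearch.client.Client:

    // Enable decommissioning of legacy monitoring cluster alerts.
    ClusterUpdateSettingsRequest request = new ClusterUpdateSettingsRequest()
            .persistentSettings(Settings.builder()
                    .put("xpack.monitoring.migration.decommission_alerts", true));
    client.admin().cluster().updateSettings(request).actionGet();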