-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Discontinue archiving broken cluster settings #28253
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Discontinue archiving broken cluster settings #28253
Conversation
89b88f0
to
24a8d4e
Compare
Currently unknown or invalid cluster settings get archived. For a better user experience, we stop archving broken cluster settings. Instead, we will fail to recover the cluster state. The solution for users in an upgrade case would be to rollback to the previous version, address the settings that would be unknown or invalid the next major version, and then proceed with the upgrade. Closes elastic#28026
24a8d4e
to
a8ec291
Compare
Pinging @elastic/es-core-infra |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mayya-sharipova I am sorry that reviewing this fell through the cracks. I did an initial review based on the current code. Glancing at the current labels I also wonder whether we should just keep this for 7.0.0 and not backport it given that this is a breaking change?
/** | ||
* Checks invalid or unknown settings. Any setting that is not recognized or fails validation | ||
* will be processed by consumers. | ||
* An exception will be thrown if any invalid or unknown setting is found. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please mention the exact type that is thrown (IllegalStateException
)?
if (setting != null) { | ||
setting.get(settings); | ||
} else { | ||
if (!isPrivateSetting(key)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We usually use the x == false
idiom instead of !x
.
@@ -64,6 +66,8 @@ | |||
@ClusterScope(scope = Scope.TEST, numDataNodes = 0) | |||
public class GatewayIndexStateIT extends ESIntegTestCase { | |||
|
|||
@Rule | |||
public ExpectedException expectedException = ExpectedException.none(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not saying that this is wrong but it seems we don't use it (so far?) in our code base. For consistency I'd probably check for expected exceptions as we did before but maybe we want to use ExpectedException
more broadly in the future?
assertNull(state.metaData().persistentSettings().get("archived." | ||
+ ElectMasterService.DISCOVERY_ZEN_MINIMUM_MASTER_NODES_SETTING.getKey())); | ||
assertHitCount(client().prepareSearch().setQuery(matchAllQuery()).get(), 1L); | ||
expectedException.expect(ElasticsearchException.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should do a deeper inspection of the underlying cause and maybe even the error message (at least whether it matches a certain pattern)?
} | ||
} | ||
if (failedKeys.size() > 0) { | ||
throw new IllegalStateException("Invalid or unknown settings: " + String.join(", ", failedKeys)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this error will occur during an upgrade which is a high stress scenario anyway and the mitigation is not entirely straightforward, I think it would be good if we could provide specific guidance how to address this problem? Otherwise, people would need to turn to the migration guide for 7.0 and check the changes to find out how to handle this?
@danielmitterdorfer Hi Daniel, thank so much for your feedback. I wonder if this change is still relevant after @jasontedor's PR: #28888 |
@mayya-sharipova I agree that this PR seems obsolete with #28888 already being merged. As far as I understood the discussion in the related issue #28026, and specifically #28026 (comment), the PR #28888 should actually fix #28026? Can you please double-check with Jason and close this PR and update / close issue #28026 accordingly depending on the result of your discussion? |
Closing this PR, as it looks like it is stalled. If the issue is still relevant, people are welcome to work on it. |
Currently unknown or invalid cluster settings get archived.
For a better user experience, we stop archving broken cluster settings.
Instead, we will fail to recover the cluster state.
The solution for users in an upgrade case would be to rollback
to the previous version, address the settings that would be unknown
or invalid the next major version, and then proceed with the upgrade.
Closes #28026