Skip to content

[Settings] Allow Deletion of unknown settings #28609

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
pickypg opened this issue Feb 9, 2018 · 4 comments
Closed

[Settings] Allow Deletion of unknown settings #28609

pickypg opened this issue Feb 9, 2018 · 4 comments
Labels
:Core/Infra/Settings Settings infrastructure and APIs >feature Team:Core/Infra Meta label for core/infra team

Comments

@pickypg
Copy link
Member

pickypg commented Feb 9, 2018

When upgrading a cluster with a /_cluster/settings that disappears (or is changed and not properly registered by a plugin), then it becomes impossible to unset a persistent setting. Worse still, it then becomes a problem to apply new settings because it tries to merge the old settings.

PUT /_cluster/settings
{
  "persistent": {
    "unknown.setting": null
  }
}

should be allowed to run even though the setting is unrecognized.

@pickypg pickypg added the :Core/Infra/Settings Settings infrastructure and APIs label Feb 9, 2018
@mayya-sharipova
Copy link
Contributor

We have an alternative issue #28026 that says: "we should not archive unknown and broken cluster settings. Instead, we should fail to recover the cluster state. The solution for users in an upgrade case would be to rollback to the previous version, address the settings that would be unknown or broken in the next major version, and then proceed with the upgrade."

Implementing #28026 makes this issue unnecessary, as the broken settings should NOT exist anymore in the cluster state.

But the valid question is what to do with clusters that can't downgrade and have broken settings stuck in the cluster state.

@pickypg
Copy link
Member Author

pickypg commented Feb 13, 2018

But the valid question is what to do with clusters that can't downgrade and have broken settings stuck in the cluster state.

In such a case, I think the ability to remove them is the only option?

@bleskes
Copy link
Contributor

bleskes commented Feb 16, 2018

We talked about it in fix it friday and we think this happens due to rolling restarts. Those currently don't validate anything and if a 6.x node joins a 5.x cluster which has setting that 6.x doesn't understand no error message will be triggered. Once the rolling restart is finished, we end up with a 6.x cluster with broken settings, preventing any future change.

To deal with the above we decided we need to add a join validator that checks that the joining node understands all current settings in the cluster.

While that is good and will prevent this specific issue as described, it still has problems:

  1. People that are in this state - i.e., a 6.x cluster with broken settings - needs a solution.
  2. It's still possible that while the cluster is in mixed mode and the master is on 5.x, people will update their setting to include a 5.x only setting.

To address the first point we want a 6.x to automatically archive any settings it doesn't understand on any setting update (regardless of which settings is being update).

To address the second, we thought to add a functionality to the TransportClusterUpdateSettingsAction where it first looks at the current cluster version nodes and if there are nodes that are with a higher version than the master, it will first reach out to them to validate the setting. Once the setting change is validate on the node with highest version, it will proceed to update the setting under the cluster state thread. While the setting update task is on the cluster state thread, it will again check that the preflight check is valid - i.e., no node with a newer version has joined. If it is not valid, the entire request will fail.

@s1monw another option that occured to me while writing is - maybe we should extend the Deprecated property with a required "remove in version x" parameter. Then the master can validate the settings on it's own based on this information. WDYT?

@rjernst rjernst added the Team:Core/Infra Meta label for core/infra team label May 4, 2020
@rjernst rjernst added the needs:triage Requires assignment of a team area label label Dec 3, 2020
@gwbrown
Copy link
Contributor

gwbrown commented Dec 3, 2020

I believe this issue has been addressed by #28888, and so I'm going to close it. If you believe I am incorrect about that, please comment and/or reopen this issue.

@gwbrown gwbrown closed this as completed Dec 3, 2020
@gwbrown gwbrown removed the needs:triage Requires assignment of a team area label label Dec 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Settings Settings infrastructure and APIs >feature Team:Core/Infra Meta label for core/infra team
Projects
None yet
Development

No branches or pull requests

7 participants