-
Notifications
You must be signed in to change notification settings - Fork 25.2k
[Zen2] Hide not recovered state #36224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Zen2] Hide not recovered state #36224
Conversation
Pinging @elastic/es-distributed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've request one small change to the hiding logic, o.w. looking good.
@Override | ||
protected Settings nodeSettings(int nodeOrdinal) { | ||
return Settings.builder().put(super.nodeSettings(nodeOrdinal)) | ||
.put(TestZenDiscovery.USE_ZEN2.getKey(), false) // no state persistence yet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also remove import?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
internalCluster().stopRandomDataNode(); | ||
internalCluster().stopRandomDataNode(); | ||
internalCluster().fullRestart(); | ||
internalCluster().stopRandomNode(InternalTestCluster.nameFilter(nodes.get(1))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to readd more randomization, maybe write nodes.get(1), nodes.get(2)
both here and in the next line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Override | ||
protected Settings nodeSettings(int nodeOrdinal) { | ||
return Settings.builder().put(super.nodeSettings(nodeOrdinal)) | ||
.put(TestZenDiscovery.USE_ZEN2.getKey(), false) // no state persistence yet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove import
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ywelsch I've fixed nits, for some reason I don't see your comments regarding changing hiding logic, can you please comment again? |
weird, my comment has disappeared. I had asked for the hiding logic to keep the coordinationmetadata and the cluster uuid in the empty metadata, as these are particularly useful at this stage where the state might not be recovered yet, but the cluster state is already accessible through the cluster state API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* master: (133 commits) SNAPSHOT: Increase Timeout to Stabilize Test (elastic#36294) Fix get certificates HLRC API (elastic#36198) Avoid shutting down the only master (elastic#36272) Fix typo in comment Fix total hits serialization of the search response (elastic#36290) Fix FullClusterRestartIT#testRollupIDSchemeAfterRestart Mute FullClusterRestartIT#testRollupIDSchemeAfterRestart as we await a fix. [Docs] Add Profile API limitations (elastic#36252) Make sure test don't use Math.random for reproducability (elastic#36241) Fix compilation ingest: support default pipeline through an alias (elastic#36231) Zen2: Rename tombstones to exclusions (elastic#36226) [Zen2] Hide not recovered state (elastic#36224) Test: mute testDataNodeRestartWithBusyMasterDuringSnapshot Test: mute testSnapshotAndRestoreWithNested Revert "Test: mute failing mtermvector rest test" Test: mute failing mtermvector rest test add version 6.5.3 (elastic#36268) Make hits.total an object in the search response (elastic#35849) [DOCS] Fixes broken link in execute watch ...
This PR hides
ClusterState
, that hasSTATE_NOT_RECOVERED_BLOCK
,from
ClusterStateAppliers
.This is needed, because some appliers, such as
IngestService
, rely onthe fact, that cluster state with
STATE_NOT_RECOVERED_BLOCK
won'tcontain anything useful.
Once the state is recovered it's fully available for the appliers.
Also, this PR switches most of the tests that require persistence/state
recovery from Zen1 to Zen2.
There are still the following tests, that were assumed to work after
persistence/recovery is implemented, but do not, namely:
PrimaryAllocationIT- fixed with 7d26c64testIndexDeletionWhenNodeRejoins