-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Add information when master node left to DiscoveryNodes' shortSummary() #28197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some questions / suggestions. let me know what you think.
@@ -448,7 +452,7 @@ public boolean hasChanges() { | |||
} | |||
|
|||
public boolean masterNodeChanged() { | |||
return newMasterNode != null; | |||
return newMasterNode != previousMasterNode; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should use ephimeral ids, not object identity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, it's a bit scary, sorry.
@@ -400,10 +400,14 @@ public Delta delta(DiscoveryNodes other) { | |||
DiscoveryNode previousMasterNode = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just set them both and figure things out in the delta class?
if (newMasterNode.getId().equals(localNodeId)) { | ||
// we are the master, no nodes we removed, we are actually the first master | ||
sb.append("new_master ").append(newMasterNode()); | ||
if (newMasterNode() != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need all this string crafting? can we just check for change and say "its was x, now it's y"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would simplify the summary, indeed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there is some effort involved with string crafting. But I think it helps the end user understand what happened, for example,
- "stepped down as master" (if localNode == previousMaster && newMaster == null)
- "became master" (if localNode == newMaster)
- "lost master" (if localNode != previousMaster && newMaster == null)
- "detected master" (newMaster != null && newMaster != localNode)
- ...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good if you guys think it's worth it. My only concern is that people will start thinking stuff that are necessarily true - (or might be not true in the future) - like are we guaranteed to always have an even with a master stepped down? Maybe we have a master swap, or a restart and the intermediate cluster state was lost in batching?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We talked with @ywelsch and we agreed on logging a simple message with previous/current master values for now, as crafting more detailed messages could lead the user to expect a specific message while the related event could be lost because of cluster state batching etc.
For now it will log messages like:
[node_tm0] master node changed {previous [], current [{node_tm1}{...}
[node_tm2] master node changed {previous [{node_tm2}{...}], current []}, reason: not enough master nodes
but we may revisit this in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also look at the DiscoveryNodes.delta(...)
, which has a pretty unconventional definition of previousMasterNode
and newMasterNode
.
I guess this is because the Delta reports null values for previous and master nodes when it didn't change. I changed and simplified things, please let me know if that's too much. Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left two more suggestions
return newMasterNode != null; | ||
String newMasterId = (newMasterNode != null) ? newMasterNode.getEphemeralId() : null; | ||
String previousMasterId = (previousMasterNode != null) ? previousMasterNode.getEphemeralId() : null; | ||
return Objects.equals(newMasterId, previousMasterId) == false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of this is equivalent to the simpler
return Objects.equals(newMasterNode, previousMasterNode) == false;
(see equals implementation of DiscoveryNode)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, thanks
DiscoveryNode previousMasterNode = null; | ||
if (other.masterNodeId != null) { | ||
previousMasterNode = other.getMasterNode(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getMasterNode()
already returns null if masterNodeId
is null.
Maybe cleaner to make that explicit in the getMasterNode()
method (and not rely on the map implementation to do that for us) and also use @Nullable
on the return type. The effect is that we won't need these conditions here and can just write return new Delta(other.getMasterNode(), getMasterNode(), ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @tlrx
This commit changes `DiscoveryNodes.Delta.shortSummary()` in order to add information to the summary when the master node left.
* master: Trim down usages of `ShardOperationFailedException` interface (#28312) Do not return all indices if a specific alias is requested via get aliases api. [Test] Lower bwc version for rank-eval rest tests CountedBitSet doesn't need to extend BitSet. (#28239) Calculate sum in Kahan summation algorithm in aggregations (#27807) (#27848) Remove the `update_all_types` option. (#28288) Add information when master node left to DiscoveryNodes' shortSummary() (#28197) Provide explanation of dangling indices, fixes #26008 (#26999)
* master: (94 commits) Completely remove Painless Type from AnalyzerCaster in favor of Java Class. (elastic#28329) Fix spelling error Reindex: Wait for deletion in test Reindex: log more on rare test failure Ensure we protect Collections obtained from scripts from self-referencing (elastic#28335) [Docs] Fix asciidoc style in composite agg docs Adds the ability to specify a format on composite date_histogram source (elastic#28310) Provide a better error message for the case when all shards failed (elastic#28333) [Test] Re-Add integer_range and date_range field types for query builder tests (elastic#28171) Added Put Mapping API to high-level Rest client (elastic#27869) Revert change that does not return all indices if a specific alias is requested via get alias api. (elastic#28294) Painless: Replace Painless Type with Java Class during Casts (elastic#27847) Notify affixMap settings when any under the registered prefix matches (elastic#28317) Trim down usages of `ShardOperationFailedException` interface (elastic#28312) Do not return all indices if a specific alias is requested via get aliases api. [Test] Lower bwc version for rank-eval rest tests CountedBitSet doesn't need to extend BitSet. (elastic#28239) Calculate sum in Kahan summation algorithm in aggregations (elastic#27807) (elastic#27848) Remove the `update_all_types` option. (elastic#28288) Add information when master node left to DiscoveryNodes' shortSummary() (elastic#28197) ...
The DiscoveryNodes.Delta was changed in elastic#28197. Previous/Master nodes are now always set in the `Delta` (before the change they were set only if the master changed) and the `masterChanged()` method is now based on object equality and nodes ephemeral ids (before the change it was based on nodes id). This commit adapts the DiscoveryNodesTests.testDeltas() to reflect the changes.
The DiscoveryNodes.Delta was changed in #28197. Previous/Master nodes are now always set in the `Delta` (before the change they were set only if the master changed) and the `masterChanged()` method is now based on object equality and nodes ephemeral ids (before the change it was based on nodes id). This commit adapts the DiscoveryNodesTests.testDeltas() to reflect the changes.
I don't think it worth backporting this to 6.2.3 |
This commit changes
DiscoveryNodes.Delta.shortSummary()
in order toadd information to the summary when the master node left.
Related to #28169