ES master re-election algorithm tries electing a non-reachable master. #31801
Pinging @elastic/es-distributed
When printing …
Yes, I checked. The above WARN was logged on all 8 data nodes.
Are you using any discovery.zen.* settings that deviate from the defaults?
Can you try setting discovery.zen.master_election.ignore_non_master_pings to true and see if that helps?
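For anyone trying this, a minimal sketch of applying the suggested setting. The config path and service name below assume a default package install and are illustrative only; the setting is static, so each node needs a restart for it to take effect.

```sh
# Sketch: enable the suggested setting on a data node and restart it.
# /etc/elasticsearch/elasticsearch.yml and the systemd unit name are assumptions
# about a default package install; adjust to your deployment.
echo 'discovery.zen.master_election.ignore_non_master_pings: true' | \
  sudo tee -a /etc/elasticsearch/elasticsearch.yml
sudo systemctl restart elasticsearch
```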
Just a thought: it appeared to me as though the cluster state did not change even though master fault detection stopped, which in turn left stale data in the cluster state. If we associated some staleness with the master value stored in the cluster state, the master would have been re-elected even though fault detection stopped for whatever reason. We should still find the concurrency issue that kept the cluster state from changing (or reverting) when master pings stopped, but invalidating stale master entries in the cluster state would have helped here.
Please provide a response to this.
The scenario you described had all master-eligible nodes restart. If it's the data nodes holding poisonous information about who they think the master is, this setting makes sure that mastership information coming from data nodes is discarded, i.e., only the values coming from the cluster state of the master-eligible nodes will be trusted for that information. As those nodes have been restarted, they will not contain information about the mastership of any node, thereby allowing a proper election.
We should not be building new features into some components just to work around bugs in other components. If there's a bug here, we need to find and fix it. Once the cluster is in this situation, it would be interesting to see on each node what their local cluster state looks like (
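For anyone following along: the exact command is elided above, but one way to compare each node's local view is the cluster state API with local=true. Host names below are placeholders.

```sh
# Ask each node for its own local copy of the cluster state (master_node and nodes
# sections only), without forwarding the request to whatever node it believes is master.
# Host names are placeholders; 9200 assumes the default HTTP port.
for host in data-node-1 data-node-2 master-node-1; do
  echo "== $host =="
  curl -s "http://$host:9200/_cluster/state/master_node,nodes?local=true&pretty"
done
```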
The changes in #32006 mean that we are now much more picky about the nodes that we consider to be "live" during the discovery phase, and a master is chosen from the known-live nodes more sensibly. In particular, we check that we're actually connected to a node during discovery, so I think this poisonous situation can no longer occur.
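As a crude manual counterpart to that connectivity check, one can verify from each other node that the suspect master's transport port is actually reachable; the host name below is a placeholder and 9300 assumes the default transport port.

```sh
# Check TCP reachability of the suspect master's transport port from another node.
# "suspect-master-host" is a placeholder; 9300 assumes the default transport port.
nc -zv -w 5 suspect-master-host 9300
```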
Elasticsearch version (bin/elasticsearch --version): 5.5
Plugins installed: []
JVM version (java -version): 1.8
OS version (uname -a if on a Unix-like system): 4.9.38-16.35.amzn1.x86_64
Description of the problem including expected versus actual behavior:
This issue was observed in a cluster with 8 data nodes and 3 dedicated master-eligible nodes.
We have observed that master re-election sometimes gets stuck in a loop even though all master-eligible nodes are up and running and only one of them is not reachable. The UnicastHostsProvider returns that host, but it is not ping-able from any other node. It is only after UnicastHostsProvider stops returning the host that master re-election is triggered. In this particular case, master re-election was stuck in a loop for more than 1.5 hours because UnicastHostsProvider continued to return it. That particular host was the master before the re-election was triggered.

Scenario:
I want to understand the following:
Why does the master node re-appear in the current nodes even though the master left? Please note that the master node's ephemeral id is old. Also, since the master pinger died, we have no way of removing a stale master now.
Steps to reproduce:
Unfortunately I don't have the steps to reproduce the issue, but I have seen this happen more than once. I have the relevant logs, and I have looked at the code flow, which makes me think this is a bug.
Provide logs (if relevant):
Log level is INFO for all of the logs below.
Logs on one of the data nodes [IP addresses removed]: these log lines appear only once on each data node.
After this, all I see on the data nodes is a failed attempt to connect to the master; note also that the master node's ephemeral id is old.
Logs on the master node: please note that the master node's ephemeral id is different here, hence it tries to connect to itself over a remote connection instead of waiting for incoming joins.
My hypothesis of the issue: