Do not allow stale replicas to automatically be promoted to primary #14671
Labels
Comments
Thanks @jasontedor. Can we also update the resiliency page?
Thanks Jason! On 11 nov. 2015 4:35 PM +0100, Jason [email protected], wrote:
Closed by #15281
Consider a primary shard `P` hosted on node `p` and its replica shard `Q` hosted on node `q`. If `p` is isolated from the cluster (e.g., through node failure, a flapping NIC, or an excessively long garbage collection pause), indexing operations can continue on `q` after `Q` is promoted to primary; these indexing operations will be acknowledged to the requesting clients. If `q` is subsequently isolated before `p` rejoins and before a new replica is assigned to another node in the cluster, the subsequent rejoining of `p` can currently lead to `P` being promoted to primary again. The indexing operations acknowledged by `q` will be lost.

A mechanism needs to be built to prevent the automatic promotion of a stale shard in such a scenario and instead only promote a non-stale shard to primary (if a non-stale shard is available). The only scenario in which a stale shard should be promoted to primary is through manual intervention by a system operator (e.g., in cases when `q` suffers a total hardware failure).

Relates #10933
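To make the failure sequence concrete, here is a minimal Python sketch of one way such a mechanism could work. It assumes the cluster keeps a record of which shard copies hold every acknowledged write, and it refuses to promote any copy outside that record unless an operator explicitly overrides. `ShardGroup`, `NoInSyncCopyError`, `in_sync`, and `allow_stale` are hypothetical names chosen for illustration; this is not Elasticsearch's implementation or API.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


class NoInSyncCopyError(Exception):
    """Raised when only stale copies are reachable and no override was given."""


@dataclass
class ShardGroup:
    # Hypothetical bookkeeping for one shard; not Elasticsearch's data structures.
    primary: Optional[str]                            # node holding the primary, if any
    in_sync: Set[str] = field(default_factory=set)    # nodes whose copy has every acked write
    reachable: Set[str] = field(default_factory=set)  # nodes currently part of the cluster

    def node_left(self, node: str) -> None:
        """Record that a node was isolated from the cluster."""
        self.reachable.discard(node)
        if node == self.primary:
            self.primary = None

    def node_joined(self, node: str) -> None:
        """Record that a node rejoined the cluster."""
        self.reachable.add(node)

    def acknowledge_write(self) -> None:
        """Acknowledging a write marks every unreachable copy as stale."""
        if self.primary is None:
            raise RuntimeError("no active primary to accept the write")
        self.in_sync &= self.reachable

    def promote(self, allow_stale: bool = False) -> str:
        """Promote a reachable in-sync copy; fall back to a stale copy only on override."""
        candidates = self.in_sync & self.reachable
        stale_promotion = False
        if not candidates and allow_stale:
            # Manual operator intervention: accept the loss of acknowledged writes.
            candidates = set(self.reachable)
            stale_promotion = True
        if not candidates:
            raise NoInSyncCopyError("no non-stale copy is available")
        self.primary = sorted(candidates)[0]  # deterministic choice for the sketch
        if stale_promotion:
            # The promoted stale copy becomes the only authoritative copy.
            self.in_sync = {self.primary}
        return self.primary
```

Replaying the scenario above against this sketch: once `p` leaves, `q` is promoted and acknowledges a write, so `in_sync` shrinks to `{"q"}`. If `q` then leaves and `p` rejoins, `promote()` raises `NoInSyncCopyError` instead of silently promoting the stale copy, while `promote(allow_stale=True)` models the operator's manual override.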