-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Prevent TransportReplicationAction to route request based on stale local routing table #16274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@bleskes instead of using the cluster state version, we could as well use the index metadata version. The index metadata version is updated whenever a new shard is started (thanks to active allocation ids). wdyt? On a related note, we could use this field as well to wait for dynamic mapping updates to be applied. (for that the update mappings api would have to return the current index metadata version). |
} else { | ||
// chasing the node with the active primary for a second hop requires that we are at least up-to-date with the current cluster state version | ||
// this prevents redirect loops between two nodes when a primary was relocated and the relocation target is not aware that it is the active primary shard already. | ||
request.minimumClusterStateVersion(state.version()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we assert that the version is not set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got confused. It actually likely the previous routing node has an older cluster state in which we should override the existing value.. nevermind
@bleskes renamed the field and removed integration test. |
@@ -155,6 +157,16 @@ public final Request consistencyLevel(WriteConsistencyLevel consistencyLevel) { | |||
return (Request) this; | |||
} | |||
|
|||
@SuppressWarnings("unchecked") | |||
Request routedBasedOnClusterVersion(long routedBasedOnClusterVersion) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add some java docs here?
LGTM . Thanks @ywelsch - Left some minor comments, no need for another cycle. |
397b6dc
to
96be074
Compare
…cal routing table Closes elastic#16274 Closes elastic#12573 Closes elastic#12574
96be074
to
af1f637
Compare
Prevent TransportReplicationAction to route request based on stale local routing table
…cal routing table Closes elastic#16274 Closes elastic#12573 Closes elastic#12574
…cal routing table (#19296) When relocating a primary shard, there is a cluster state update at the end of relocation where the active primary is switched from the relocation source to the relocation target. If relocation source receives and processes this cluster state before the relocation target, there is a time span where relocation source believes active primary to be on relocation target and relocation target believes active primary to be on relocation source. This results in index/delete/flush requests being sent back and forth and can end in an OOM on both nodes. Backport of #16274 to 2.4.0.
Relates to #12573
When relocating a primary shard, there is a cluster state update at the end of relocation where the active primary is switched from the relocation source to the relocation target. If relocation source receives and processes this cluster state before the relocation target, there is a time span where relocation source believes active primary to be on relocation target and relocation target believes active primary to be on relocation source. This results in index/delete/flush requests being sent back and forth and can end in an OOM on both nodes.
This PR adds a field to the index/delete/flush request that helps detect the case where we locally have stale routing information. In case this staleness is detected, we wait until we have received an up-to-date cluster state before rerouting the request.
I have included the test from #12574 in this PR to demonstrate the fix in an integration test. That integration test will not be part of the final commit, however.