-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Snapshot: Migrate TransportRequestHandler to TransportMasterNodeAction #27165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Currently, we are using a plain TransportRequestHandler to post snapshot status messages to the master. However, it doesn't have a robust retry mechanism as TransportMasterNodeAction. This changes migrate from TransportRequestHandler to TransportMasterNodeAction. Most of code in TransportSnapshotUpdateStatusAction is copied from SnapshotShardsService. Serializing a MasterNodeRequest requires 8 bytes more than a TransportRequest. In order to maintain the BWC in a mixed cluster, we have to serialize/deserialize a MasterNodeRequest as a TransportRequest without timeout. Closes elastic#27151
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a couple of comments/questions. In general, it looks good to me, but we need to figure out and outline our backport strategy more clearly. I feel like a lot of code here will not be actually needed in 7.x, when we will backport this to 6.x.
* This method serializes a {@link MasterNodeRequest} as a {@link org.elasticsearch.transport.TransportRequest} | ||
* without timeout. The master will have to use the default timeout setting. | ||
*/ | ||
protected final void readFromAsTransportRequest(StreamInput in) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is clever, but at the same time might be trap-y. This class is used in a lot of places and I think I would rather have 2 different clean implementation in 6.x and one clean implementation in 7.x then this here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, having two handlers may be safer than this approach.
@Override | ||
public void readFrom(StreamInput in) throws IOException { | ||
// To keep BWC, we have to deserialize a MasterNodeRequest from a TransportRequest from older versions. | ||
if (in.getVersion().before(Version.V_7_0_0_alpha1)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this temporary solution until we backport his to 6.x branch? If this is the case, could you mark all this places with //TODO: or something like this so it would be obvious what we keep in 7.x and what goes only into 6.x.
// addLowPriorityApplier to make sure that Repository will be created before snapshot | ||
clusterService.addLowPriorityApplier(this); | ||
// this is only useful on the nodes that can hold data. | ||
clusterService.addListener(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment here why this is needs to be a listener and not an applier and why it's ok to have this as addListener
and not addLast
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test DedicatedClusterSnapshotRestoreIT#testSnapshotWithStuckNode
was failed with an Applier
but passed with a Listener
. However, I don't really know the root cause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@imotov, I think I have figured out the root cause. A Listener is called after the new state is set, while an Applier is called before that. In SnapshotShardsService
, we call #syncShardStatsOnNewMaster
-> #updateIndexShardSnapshotStatus
which in turn uses a TransportMasterNodeAction. The TransportMasterNodeAction accesses the state of the cluster directly which may not be available yet if it's invoked from the Applier.
Caused by: java.lang.AssertionError: should not be called by a cluster state applier. reason [the applied cluster state is not yet available]
This explains why #testSnapshotWithStuckNode
was failed with an Applier but passed with a Listener.
@imotov, I have reverted the first commit and added a new commit with 2 separate handlers. Could you please have another look? Thank you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I would like @ywelsch to also take a look from ClusterState perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I found an issue with the new action being invoked incorrectly. Also, we should think about a sensible timeout. Finally, I would like to see a test which shows that the retry mechanism works. Please also update PR description.
transportService.sendRequest(master, UPDATE_SNAPSHOT_ACTION_NAME, request, EmptyTransportResponseHandler.INSTANCE_SAME); | ||
if (master.getVersion().onOrAfter(Version.V_7_0_0_alpha1)) { | ||
UpdateIndexShardSnapshotStatusRequest request = new UpdateIndexShardSnapshotStatusRequest(snapshot, shardId, status); | ||
transportService.sendRequest(master, UPDATE_SNAPSHOT_STATUS_ACTION_NAME, request, EmptyTransportResponseHandler.INSTANCE_SAME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this request has a default masterNodeTimeout
of 30 seconds. Shouldn't we wait longer (possibly forever?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed cb73eea
transportService.sendRequest(master, UPDATE_SNAPSHOT_ACTION_NAME, request, EmptyTransportResponseHandler.INSTANCE_SAME); | ||
if (master.getVersion().onOrAfter(Version.V_7_0_0_alpha1)) { | ||
UpdateIndexShardSnapshotStatusRequest request = new UpdateIndexShardSnapshotStatusRequest(snapshot, shardId, status); | ||
transportService.sendRequest(master, UPDATE_SNAPSHOT_STATUS_ACTION_NAME, request, EmptyTransportResponseHandler.INSTANCE_SAME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you send the request like this, and the master is unavailable, there will be no retry (i.e. the retry code in TransportMasterNodeAction won't come into play). The action needs to be sent to to the node itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed ea7ec38
public void messageReceived(UpdateIndexShardSnapshotStatusRequest request, final TransportChannel channel) throws Exception { | ||
innerUpdateSnapshotState(request); | ||
public void messageReceived(UpdateSnapshotStatusRequestV6 requestV6, final TransportChannel channel) throws Exception { | ||
final UpdateIndexShardSnapshotStatusRequest request = new UpdateIndexShardSnapshotStatusRequest(requestV6.snapshot(), requestV6.shardId(), requestV6.status()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, masterNodeTimeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@ywelsch, I have added a disruption test and verified that the retry mechanism works. Could you please take another look? Thank you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the newly added test actually fail without your changes?
|
||
logger.info("--> unblocking repository"); | ||
unblockNode("test-repo", blockedNode); | ||
Thread.sleep(200); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should try to avoid Thread.sleep calls in our tests
unblockNode("test-repo", blockedNode); | ||
Thread.sleep(200); | ||
logger.info("--> stop disrupting cluster"); | ||
internalCluster().clearDisruptionScheme(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is usually done by calling networkDisruption.stopDisrupting();
@ywelsch, The result of the test is false positive. As we enabled the delayed disruption, the elasticsearch/core/src/main/java/org/elasticsearch/snapshots/SnapshotShardsService.java Lines 125 to 128 in dfdf496
This causes the MockTransportService throw NPE when simulating network delay. Lines 430 to 433 in 4c06b8f
I registered that handler for all nodes, then the test was passed with both approaches. I tried with other kinds of disruption but no luck.
What do you think? Thank you. |
We probably can't get rid of |
@ywelsch, @dnhatn Yes, I think 6.x needs both because 6.x can talk to 6.0 and 6.0 is using the old way. However, 7.0 is only going to talk to 6.last and 6.last is going to have both protocols so we don't need support for both in 7.0. That's should be the final state, I think. However, to get there we need to figure out how we are going to merge these changes in without breaking backward compatibility tests. One way to do it is to merge both protocols in 7.0, bake it there for a while, then back port it to 6.x, and remove old protocol from 7.0 at the same time. |
@imotov Thank you for your suggestion. Just to confirm the backport steps.
|
Yes, except you can combine 7.x portion of 2) and 3) into a single commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Currently, we are using a plain TransportRequestHandler to post snapshot status messages to the master. However, it doesn't have a robust retry mechanism as TransportMasterNodeAction. This change migrates from TransportRequestHandler to TransportMasterNodeAction for the new versions and keeps the current implementation for the old versions. Closes #27151
* master: (31 commits) [TEST] Fix `GeoShapeQueryTests#testPointsOnly` failure Transition transport apis to use void listeners (#27440) AwaitsFix GeoShapeQueryTests#testPointsOnly #27454 Bump test version after backport Ensure nested documents have consistent version and seq_ids (#27455) Tests: Add Fedora-27 to packaging tests Delete some seemingly unused exceptions (#27439) #26800: Fix docs rendering Remove config prompting for secrets and text (#27216) Move the CLI into its own subproject (#27114) Correct usage of "an" to "a" in getting started docs Avoid NPE when getting build information Removes BWC snapshot status handler used in 6.x (#27443) Remove manual tracking of registered channels (#27445) Remove parameters on HandshakeResponseHandler (#27444) [GEO] fix pointsOnly bug for MULTIPOINT Standardize underscore requirements in parameters (#27414) peanut butter hamburgers Log primary-replica resync failures Uses TransportMasterNodeAction to update shard snapshot status (#27165) ...
* 6.x: (41 commits) [TEST] Fix `GeoShapeQueryTests#testPointsOnly` failure Transition transport apis to use void listeners (#27440) AwaitsFix GeoShapeQueryTests#testPointsOnly #27454 Ensure nested documents have consistent version and seq_ids (#27455) Tests: Add Fedora-27 to packaging tests #26800: Fix docs rendering Move the CLI into its own subproject (#27114) Correct usage of "an" to "a" in getting started docs Avoid NPE when getting build information Remove manual tracking of registered channels (#27445) Standardize underscore requirements in parameters (#27414) Remove parameters on HandshakeResponseHandler (#27444) [GEO] fix pointsOnly bug for MULTIPOINT peanut butter hamburgers Uses TransportMasterNodeAction to update shard snapshot status (#27165) Log primary-replica resync failures Add limits for ngram and shingle settings (#27411) Enforce a minimum task execution and service time of 1 nanosecond Fix place-holder in allocation decider messages (#27436) Remove newline from log message (#27425) ...
Currently, we are using a plain TransportRequestHandler to post snapshot status messages to the master. However, it doesn't have a robust retry mechanism as TransportMasterNodeAction. This change migrates from TransportRequestHandler to TransportMasterNodeAction for the new versions and keeps the current implementation for the old versions (BWC).
Closes #27151