Snapshot: Migrate TransportRequestHandler to TransportMasterNodeAction #27165

dnhatn · 2017-10-30T02:17:27Z

Currently, we use a plain TransportRequestHandler to post snapshot status messages to the master. However, unlike TransportMasterNodeAction, it has no robust retry mechanism. This change migrates from TransportRequestHandler to TransportMasterNodeAction for new versions and keeps the current implementation for old versions (BWC).

Closes #27151

Currently, we use a plain TransportRequestHandler to post snapshot status messages to the master. However, unlike TransportMasterNodeAction, it has no robust retry mechanism. This change migrates from TransportRequestHandler to TransportMasterNodeAction. Most of the code in TransportSnapshotUpdateStatusAction is copied from SnapshotShardsService. Serializing a MasterNodeRequest requires 8 more bytes than serializing a TransportRequest. To maintain BWC in a mixed cluster, we have to serialize/deserialize a MasterNodeRequest as a TransportRequest without the timeout. Closes elastic#27151
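A minimal sketch of the version-gated serialization idea described above, assuming a toy wire format. The class and method names are hypothetical and are not Elasticsearch APIs; the timeout is modeled as a single `long` only so that the extra field happens to be 8 bytes here. When the sender omits the timeout, the reader falls back to a default.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Toy model only: none of these names are Elasticsearch classes.
final class WireFormatSketch {

    /** Writes the payload, plus the timeout only when the peer is known to expect it. */
    static byte[] write(String shardStatus, long masterTimeoutMillis, boolean peerExpectsTimeout) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(shardStatus);              // payload common to both formats
            if (peerExpectsTimeout) {
                out.writeLong(masterTimeoutMillis); // extra 8 bytes in this toy format
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the timeout if the sender wrote one, otherwise falls back to the default. */
    static long readTimeoutMillis(byte[] wire, boolean senderWroteTimeout, long defaultTimeoutMillis) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire))) {
            in.readUTF(); // skip the payload
            return senderWroteTimeout ? in.readLong() : defaultTimeoutMillis;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] oldFormat = write("SUCCESS", 5_000L, false);
        byte[] newFormat = write("SUCCESS", 5_000L, true);
        System.out.println("size difference: " + (newFormat.length - oldFormat.length));
        System.out.println("timeout from old sender: " + readTimeoutMillis(oldFormat, false, 30_000L));
    }
}
```

The point of the sketch is that both sides must agree, per connection, on whether the trailing field is present; a mismatch would misread the stream.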

imotov

Left a couple of comments/questions. In general, it looks good to me, but we need to figure out and outline our backport strategy more clearly. I feel like a lot of the code here will not actually be needed in 7.x once we backport this to 6.x.

imotov · 2017-10-30T21:52:28Z

core/src/main/java/org/elasticsearch/action/support/master/MasterNodeRequest.java

+     * This method serializes a {@link MasterNodeRequest} as a {@link org.elasticsearch.transport.TransportRequest}
+     * without timeout. The master will have to use the default timeout setting.
+     */
+    protected final void readFromAsTransportRequest(StreamInput in) throws IOException {


This is clever, but it might also be a trap. This class is used in a lot of places, and I think I would rather have two different clean implementations in 6.x and one clean implementation in 7.x than this.

I agree, having two handlers may be safer than this approach.

imotov · 2017-10-30T21:54:30Z

core/src/main/java/org/elasticsearch/snapshots/UpdateSnapshotStatusRequest.java

+    @Override
+    public void readFrom(StreamInput in) throws IOException {
+        // To keep BWC, we have to deserialize a MasterNodeRequest from a TransportRequest from older versions.
+        if (in.getVersion().before(Version.V_7_0_0_alpha1)) {


Is this a temporary solution until we backport this to the 6.x branch? If so, could you mark all these places with //TODO: or something similar, so it is obvious what we keep in 7.x and what goes only into 6.x?

imotov · 2017-10-30T22:03:37Z

core/src/main/java/org/elasticsearch/snapshots/SnapshotShardsService.java

-            // addLowPriorityApplier to make sure that Repository will be created before snapshot
-            clusterService.addLowPriorityApplier(this);
+            // this is only useful on the nodes that can hold data.
+            clusterService.addListener(this);


Could you add a comment here explaining why this needs to be a listener and not an applier, and why it's OK to register it with addListener and not addLast?

The test DedicatedClusterSnapshotRestoreIT#testSnapshotWithStuckNode failed with an Applier but passed with a Listener. However, I don't really know the root cause.

@imotov, I think I have figured out the root cause. A Listener is called after the new state is set, while an Applier is called before that. In SnapshotShardsService, we call #syncShardStatsOnNewMaster -> #updateIndexShardSnapshotStatus which in turn uses a TransportMasterNodeAction. The TransportMasterNodeAction accesses the state of the cluster directly which may not be available yet if it's invoked from the Applier.

Caused by: java.lang.AssertionError: should not be called by a cluster state applier. reason [the applied cluster state is not yet available]

This explains why #testSnapshotWithStuckNode failed with an Applier but passed with a Listener.
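The ordering described above can be sketched with a toy model. This is not the real ClusterService; the class, its fields, and the exception type are hypothetical stand-ins. It only reproduces the rule from the discussion: appliers run before the new state becomes visible, listeners run after.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Toy model only; it mimics the ordering discussed above, not the real ClusterService.
final class ApplierListenerSketch {
    private String appliedState = null; // the state visible through state()
    private boolean applying = false;   // true while appliers are running
    private final List<Consumer<String>> appliers = new ArrayList<>();
    private final List<Consumer<String>> listeners = new ArrayList<>();

    void addApplier(Consumer<String> applier) { appliers.add(applier); }
    void addListener(Consumer<String> listener) { listeners.add(listener); }

    /** Refuses access while appliers run, similar in spirit to the assertion quoted above. */
    String state() {
        if (applying) {
            throw new IllegalStateException("the applied cluster state is not yet available");
        }
        return appliedState;
    }

    void publish(String newState) {
        applying = true;
        try {
            for (Consumer<String> applier : appliers) {
                applier.accept(newState); // runs BEFORE the new state is set
            }
        } finally {
            applying = false;
        }
        appliedState = newState;
        for (Consumer<String> listener : listeners) {
            listener.accept(newState); // runs AFTER the new state is set
        }
    }

    /** Records what an applier and a listener each observe when they call state(). */
    static List<String> demo() {
        ApplierListenerSketch service = new ApplierListenerSketch();
        List<String> observed = new ArrayList<>();
        service.addApplier(s -> {
            try {
                observed.add("applier saw " + service.state());
            } catch (IllegalStateException e) {
                observed.add("applier failed: " + e.getMessage());
            }
        });
        service.addListener(s -> observed.add("listener saw " + service.state()));
        service.publish("state-v2");
        return observed;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Any code path that reads the applied state, as a master node action does according to the comment above, is therefore only safe from the listener side in this model.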

This reverts commit 1f6bf50.

dnhatn · 2017-11-06T19:10:53Z

@imotov, I have reverted the first commit and added a new commit with 2 separate handlers. Could you please have another look? Thank you.

imotov

LGTM, but I would like @ywelsch to also take a look from ClusterState perspective.

ywelsch

I think I found an issue with the new action being invoked incorrectly. Also, we should think about a sensible timeout. Finally, I would like to see a test which shows that the retry mechanism works. Please also update PR description.

ywelsch · 2017-11-13T14:36:52Z

core/src/main/java/org/elasticsearch/snapshots/SnapshotShardsService.java

-            transportService.sendRequest(master, UPDATE_SNAPSHOT_ACTION_NAME, request, EmptyTransportResponseHandler.INSTANCE_SAME);
+            if (master.getVersion().onOrAfter(Version.V_7_0_0_alpha1)) {
+                UpdateIndexShardSnapshotStatusRequest request = new UpdateIndexShardSnapshotStatusRequest(snapshot, shardId, status);
+                transportService.sendRequest(master, UPDATE_SNAPSHOT_STATUS_ACTION_NAME, request, EmptyTransportResponseHandler.INSTANCE_SAME);


This request has a default masterNodeTimeout of 30 seconds. Shouldn't we wait longer (possibly forever)?

I pushed cb73eea

ywelsch · 2017-11-13T14:48:36Z

core/src/main/java/org/elasticsearch/snapshots/SnapshotShardsService.java

-            transportService.sendRequest(master, UPDATE_SNAPSHOT_ACTION_NAME, request, EmptyTransportResponseHandler.INSTANCE_SAME);
+            if (master.getVersion().onOrAfter(Version.V_7_0_0_alpha1)) {
+                UpdateIndexShardSnapshotStatusRequest request = new UpdateIndexShardSnapshotStatusRequest(snapshot, shardId, status);
+                transportService.sendRequest(master, UPDATE_SNAPSHOT_STATUS_ACTION_NAME, request, EmptyTransportResponseHandler.INSTANCE_SAME);


If you send the request like this and the master is unavailable, there will be no retry (i.e. the retry code in TransportMasterNodeAction won't come into play). The action needs to be sent to the node itself.
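A rough sketch of the difference between the two call paths, assuming a toy model in which each observed cluster state either names a master or does not. All names are hypothetical and no real transport is involved; the retry bound stands in for the master node timeout.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;

// Toy model only; the names are hypothetical and no real transport is involved.
final class MasterRetrySketch {

    /** Direct send: resolve the master once and give up if there is none. */
    static boolean sendDirect(Iterator<Optional<String>> clusterStates) {
        return clusterStates.next().isPresent();
    }

    /**
     * Local master-node-style action: whenever no master is known, wait for the
     * next cluster state and try again, for at most maxStates observations.
     */
    static boolean sendViaLocalAction(Iterator<Optional<String>> clusterStates, int maxStates) {
        for (int seen = 0; seen < maxStates && clusterStates.hasNext(); seen++) {
            if (clusterStates.next().isPresent()) {
                return true; // a master is known, so the update can be delivered
            }
        }
        return false; // gave up, analogous to the master node timeout expiring
    }

    /** Two states without a master, followed by the election of "node-2". */
    static List<Optional<String>> electionSequence() {
        return Arrays.<Optional<String>>asList(Optional.empty(), Optional.empty(), Optional.of("node-2"));
    }

    public static void main(String[] args) {
        System.out.println("direct: " + sendDirect(electionSequence().iterator()));
        System.out.println("local action: " + sendViaLocalAction(electionSequence().iterator(), 10));
    }
}
```

In this model a small `maxStates` behaves like a short timeout: with only two observations allowed, the local action also gives up before the election completes, which is the concern behind waiting longer.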

I pushed ea7ec38

ywelsch · 2017-11-13T14:55:33Z

core/src/main/java/org/elasticsearch/snapshots/SnapshotShardsService.java

-        public void messageReceived(UpdateIndexShardSnapshotStatusRequest request, final TransportChannel channel) throws Exception {
-            innerUpdateSnapshotState(request);
+        public void messageReceived(UpdateSnapshotStatusRequestV6 requestV6, final TransportChannel channel) throws Exception {
+            final UpdateIndexShardSnapshotStatusRequest request = new UpdateIndexShardSnapshotStatusRequest(requestV6.snapshot(), requestV6.shardId(), requestV6.status());


again, masterNodeTimeout?

dnhatn · 2017-11-14T20:45:19Z

@ywelsch, I have added a disruption test and verified that the retry mechanism works. Could you please take another look? Thank you.

ywelsch

Does the newly added test actually fail without your changes?

ywelsch · 2017-11-15T11:10:11Z

core/src/test/java/org/elasticsearch/snapshots/SnapshotShardsServiceIT.java

+
+        logger.info("--> unblocking repository");
+        unblockNode("test-repo", blockedNode);
+        Thread.sleep(200);


we should try to avoid Thread.sleep calls in our tests
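One common alternative to a fixed sleep is to poll for the expected condition with a deadline. The helper below is a self-contained, hypothetical stand-in for that pattern (Elasticsearch's test framework offers `assertBusy` for a similar purpose); it is not necessarily what the follow-up commit did.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not an Elasticsearch API.
final class AwaitSketch {

    /** Polls until the condition holds or the timeout elapses; returns whether it held. */
    static boolean awaitCondition(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        long pauseMillis = 1;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pauseMillis); // short, bounded pause between polls
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
            pauseMillis = Math.min(pauseMillis * 2, 50); // back off, capped at 50 ms
        }
    }
}
```

Unlike `Thread.sleep(200)`, the test returns as soon as the condition is met and fails explicitly when it never is, instead of depending on a guessed delay.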

ywelsch · 2017-11-15T11:11:56Z

core/src/test/java/org/elasticsearch/snapshots/SnapshotShardsServiceIT.java

+        unblockNode("test-repo", blockedNode);
+        Thread.sleep(200);
+        logger.info("--> stop disrupting cluster");
+        internalCluster().clearDisruptionScheme(true);


this is usually done by calling networkDisruption.stopDisrupting();

dnhatn · 2017-11-15T15:07:55Z

@ywelsch, the newly added test failed with a plain transport handler but passed with TransportMasterNodeAction. I pushed 85da45d to address your suggestions. Could you please take another quick look? Thank you.

dnhatn · 2017-11-16T03:24:09Z

@ywelsch, the test result is a false positive. Because we enabled the delayed disruption, the MockTransportService clones an outgoing request using the request handler registered for that request on the sender node (e.g. the data node in the test). However, in SnapshotShardsService, we register the request handler only on master nodes.

elasticsearch/core/src/main/java/org/elasticsearch/snapshots/SnapshotShardsService.java

Lines 125 to 128 in dfdf496

    
    if (DiscoveryNode.isMasterNode(settings)) {
        // This needs to run only on nodes that can become masters
        transportService.registerRequestHandler(UPDATE_SNAPSHOT_ACTION_NAME, UpdateIndexShardSnapshotStatusRequest::new, ThreadPool.Names.SAME, new UpdateSnapshotStateRequestHandler());
    }

This causes the MockTransportService to throw an NPE when simulating network delay.

elasticsearch/test/framework/src/main/java/org/elasticsearch/test/transport/MockTransportService.java

Lines 430 to 433 in 4c06b8f

    
    RequestHandlerRegistry reg = MockTransportService.this.getRequestHandler(action);
    BytesStreamOutput bStream = new BytesStreamOutput();
    request.writeTo(bStream);
    final TransportRequest clonedRequest = reg.newRequest(bStream.bytes().streamInput());

After I registered that handler on all nodes, the test passed with both approaches.
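The failure mode described above can be reproduced in a toy model: cloning a request looks up the handler registered on the sending node, and the lookup returns null when that node never registered it. The class, the action name, and the string-based "request" are all hypothetical; this is not MockTransportService itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model only; it imitates a per-node handler registry, not MockTransportService.
final class HandlerRegistrySketch {
    static final String ACTION = "hypothetical:update_snapshot"; // made-up action name

    private final Map<String, Function<byte[], String>> readers = new HashMap<>();

    /** A node registers the handler only when registerHandler is true (e.g. master-eligible). */
    HandlerRegistrySketch(boolean registerHandler) {
        if (registerHandler) {
            readers.put(ACTION, bytes -> new String(bytes, StandardCharsets.UTF_8));
        }
    }

    /** Clones a request by re-reading its bytes with the handler registered on THIS node. */
    String cloneRequest(String request) {
        Function<byte[], String> reader = readers.get(ACTION); // null if never registered here
        byte[] bytes = request.getBytes(StandardCharsets.UTF_8);
        return reader.apply(bytes); // throws NullPointerException when reader is null
    }

    public static void main(String[] args) {
        System.out.println(new HandlerRegistrySketch(true).cloneRequest("shard-0: SUCCESS"));
        try {
            new HandlerRegistrySketch(false).cloneRequest("shard-0: SUCCESS");
        } catch (NullPointerException e) {
            System.out.println("sender without the handler: NullPointerException");
        }
    }
}
```

Registering the handler on every node removes the null lookup, which matches the observation that the test then passed with both approaches and so no longer distinguished them.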

I tried other kinds of disruption, but with no luck.

NetworkDisruption.NetworkDisconnect: the two approaches either both succeeded or both failed.

Stopping the current master node: this actually showed a difference between a plain transport handler and a master node action, but only if I commented out syncShardStatsOnNewMaster(event).

elasticsearch/core/src/main/java/org/elasticsearch/snapshots/SnapshotShardsService.java

Lines 158 to 169 in dfdf496

    
    public void applyClusterState(ClusterChangedEvent event) {
        try {
            SnapshotsInProgress prev = event.previousState().custom(SnapshotsInProgress.TYPE);
            SnapshotsInProgress curr = event.state().custom(SnapshotsInProgress.TYPE);
            if ((prev == null && curr != null) || (prev != null && prev.equals(curr) == false)) {
                processIndexShardSnapshots(event);
            }
            String masterNodeId = event.state().nodes().getMasterNodeId();
            if (masterNodeId != null && masterNodeId.equals(event.previousState().nodes().getMasterNodeId()) == false) {
                syncShardStatsOnNewMaster(event);
            }

syncShardStatsOnNewMaster(event) actually sends the snapshot status to a newly elected master (I think a master node action does more than this). In fact, when I commented out syncShardStatsOnNewMaster(event), all existing tests passed with the new approach but not with the existing one.
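The master-change condition in the quoted snippet can be isolated as a small predicate. This is a sketch with hypothetical names; it mirrors only the `masterNodeId` comparison shown above, not the rest of the method.

```java
// Sketch only: isolates the master-change check from the snippet quoted above.
final class MasterChangeSketch {

    /** True when a master is known and it differs from the previously known master. */
    static boolean shouldResync(String previousMasterId, String currentMasterId) {
        return currentMasterId != null && currentMasterId.equals(previousMasterId) == false;
    }

    public static void main(String[] args) {
        System.out.println(shouldResync(null, "node-1"));     // first master elected
        System.out.println(shouldResync("node-1", "node-1")); // master unchanged
        System.out.println(shouldResync("node-1", "node-2")); // failover to a new master
        System.out.println(shouldResync("node-1", null));     // master lost, nothing to sync to
    }
}
```

Note that losing the master does not trigger a resync in this predicate; only the later state that names a new master does.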

What do you think? Thank you.

ywelsch · 2017-11-16T09:13:01Z

We probably can't get rid of syncShardStatsOnNewMaster as long as we run in mixed-version clusters with 6.0 nodes. If this is backported to v6.x, we could get rid of it in v7.0.0, however. @imotov WDYT?

imotov · 2017-11-16T15:42:22Z

@ywelsch, @dnhatn Yes, I think 6.x needs both because 6.x can talk to 6.0, and 6.0 uses the old way. However, 7.0 is only going to talk to 6.last, and 6.last is going to have both protocols, so we don't need support for both in 7.0. That should be the final state, I think.

However, to get there we need to figure out how to merge these changes without breaking the backward compatibility tests. One way to do it is to merge both protocols into 7.0, let it bake there for a while, then backport it to 6.x and remove the old protocol from 7.0 at the same time.
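A minimal sketch of the mixed-version rule discussed above: a node picks the protocol from the master's version, and the version threshold moves when the change is backported. The types are hypothetical, and the concrete thresholds (7.0 before the backport, 6.1 after) are assumptions chosen for illustration rather than values taken from the code.

```java
// Toy model only; NodeVersion and Protocol are hypothetical, not Elasticsearch types.
final class ProtocolChoiceSketch {
    enum Protocol { LEGACY_HANDLER, MASTER_NODE_ACTION }

    static final class NodeVersion {
        final int major;
        final int minor;

        NodeVersion(int major, int minor) {
            this.major = major;
            this.minor = minor;
        }

        boolean onOrAfter(NodeVersion other) {
            return major > other.major || (major == other.major && minor >= other.minor);
        }
    }

    /** Uses the new action only when the master is new enough to understand it. */
    static Protocol choose(NodeVersion master, NodeVersion firstWithNewAction) {
        return master.onOrAfter(firstWithNewAction) ? Protocol.MASTER_NODE_ACTION : Protocol.LEGACY_HANDLER;
    }

    public static void main(String[] args) {
        NodeVersion beforeBackport = new NodeVersion(7, 0); // assumed threshold while only in 7.0
        NodeVersion afterBackport = new NodeVersion(6, 1);  // assumed threshold once backported
        System.out.println(choose(new NodeVersion(6, 0), afterBackport));
        System.out.println(choose(new NodeVersion(6, 1), afterBackport));
        System.out.println(choose(new NodeVersion(6, 1), beforeBackport));
    }
}
```

Under these assumptions, a 6.0 master always gets the legacy handler, which is why 6.x has to keep both protocols while 7.0 can eventually drop the old one.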

dnhatn · 2017-11-16T20:28:01Z

@imotov Thank you for your suggestion. Just to confirm the backport steps:

1. Merge this PR to 7.0 only.
2. Backport this PR, with a new commit updating the version check to 6.x.
3. Add a clean-up commit to 7.0 only. This commit merely removes the BWC code.

imotov · 2017-11-17T01:33:32Z

Yes, except you can combine the 7.x portion of 2) and 3) into a single commit.

ywelsch

LGTM

dnhatn · 2017-11-17T16:51:26Z

Thanks @imotov and @ywelsch for your helpful reviews.

Currently, we use a plain TransportRequestHandler to post snapshot status messages to the master. However, unlike TransportMasterNodeAction, it has no robust retry mechanism. This change migrates from TransportRequestHandler to TransportMasterNodeAction for new versions and keeps the current implementation for old versions. Closes #27151

* master: (31 commits) [TEST] Fix `GeoShapeQueryTests#testPointsOnly` failure Transition transport apis to use void listeners (#27440) AwaitsFix GeoShapeQueryTests#testPointsOnly #27454 Bump test version after backport Ensure nested documents have consistent version and seq_ids (#27455) Tests: Add Fedora-27 to packaging tests Delete some seemingly unused exceptions (#27439) #26800: Fix docs rendering Remove config prompting for secrets and text (#27216) Move the CLI into its own subproject (#27114) Correct usage of "an" to "a" in getting started docs Avoid NPE when getting build information Removes BWC snapshot status handler used in 6.x (#27443) Remove manual tracking of registered channels (#27445) Remove parameters on HandshakeResponseHandler (#27444) [GEO] fix pointsOnly bug for MULTIPOINT Standardize underscore requirements in parameters (#27414) peanut butter hamburgers Log primary-replica resync failures Uses TransportMasterNodeAction to update shard snapshot status (#27165) ...

* 6.x: (41 commits) [TEST] Fix `GeoShapeQueryTests#testPointsOnly` failure Transition transport apis to use void listeners (#27440) AwaitsFix GeoShapeQueryTests#testPointsOnly #27454 Ensure nested documents have consistent version and seq_ids (#27455) Tests: Add Fedora-27 to packaging tests #26800: Fix docs rendering Move the CLI into its own subproject (#27114) Correct usage of "an" to "a" in getting started docs Avoid NPE when getting build information Remove manual tracking of registered channels (#27445) Standardize underscore requirements in parameters (#27414) Remove parameters on HandshakeResponseHandler (#27444) [GEO] fix pointsOnly bug for MULTIPOINT peanut butter hamburgers Uses TransportMasterNodeAction to update shard snapshot status (#27165) Log primary-replica resync failures Add limits for ngram and shingle settings (#27411) Enforce a minimum task execution and service time of 1 nanosecond Fix place-holder in allocation decider messages (#27436) Remove newline from log message (#27425) ...

dnhatn added :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement v6.1.0 v7.0.0 labels Oct 30, 2017

dnhatn requested review from jasontedor, imotov and ywelsch October 30, 2017 02:17

imotov reviewed Oct 30, 2017

dnhatn added 3 commits November 4, 2017 21:52

Revert "Snapshot: Use TransportMasterNodeAction to update"

37b8050

This reverts commit 1f6bf50.

Merge branch 'master' into snapshot-migrate-transport

82a98ef

Uses two handlers

97c4724

imotov approved these changes Nov 8, 2017

ywelsch suggested changes Nov 13, 2017

dnhatn added 4 commits November 14, 2017 14:36

Merge branch 'master' into snapshot-migrate-transport

82aedcc

add a disruption test

06292d1

post update messages to the local node

ea7ec38

try posting status forever

cb73eea

ywelsch reviewed Nov 15, 2017

do not use sleep

85da45d

ywelsch approved these changes Nov 17, 2017

Merge branch 'master' into snapshot-migrate-transport

57daa0d

dnhatn merged commit db688e1 into elastic:master Nov 17, 2017

dnhatn deleted the snapshot-migrate-transport branch November 17, 2017 16:55

dnhatn added the backport pending label Nov 17, 2017

dnhatn removed the backport pending label Nov 17, 2017

colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
