Remove Redundant Loading of RepositoryData during Restore #51977

original-brownbear · 2020-02-06T06:30:00Z

We can put the `IndexId`, rather than just the index name, into the recovery source and save one load of `RepositoryData` on each shard restore that way.

elasticmachine · 2020-02-06T06:30:03Z

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

ywelsch · 2020-02-06T16:21:11Z

server/src/main/java/org/elasticsearch/cluster/routing/RecoverySource.java

@@ -252,6 +278,9 @@ protected void writeAdditionalFields(StreamOutput out) throws IOException {
            snapshot.writeTo(out);
            Version.writeVersion(version, out);
            out.writeString(index);


Can we avoid sending this field if `indexId != null` (i.e. in the non-BWC case)?

Not sure, see the other question. It seems to me we need to write the UUID string as optional, or maybe not?
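For readers unfamiliar with the "optional" encoding floated in this reply, here is a minimal sketch of the usual shape: a presence flag followed by the value. It uses plain `java.io` streams as a stand-in for the stream classes in the diff, and the class and method names (`OptionalStringSketch`, `roundTrip`) are hypothetical, not code from this PR.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class OptionalStringSketch {
    /** Writes a presence flag, then the value only if it is non-null. */
    public static void writeOptionalString(DataOutputStream out, String value) throws IOException {
        out.writeBoolean(value != null);
        if (value != null) {
            out.writeUTF(value);
        }
    }

    /** Mirrors writeOptionalString: reads the flag, then the value if present. */
    public static String readOptionalString(DataInputStream in) throws IOException {
        return in.readBoolean() ? in.readUTF() : null;
    }

    /** Helper for experiments: serializes and deserializes one value in memory. */
    public static String roundTrip(String value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                writeOptionalString(out, value);
            }
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return readOptionalString(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("some-index-uuid"));
        System.out.println(roundTrip(null));
    }
}
```

The cost of this encoding is that every reader has to handle a `null` UUID, which is the concern the rest of the thread works through.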

ywelsch

I've left one comment that allows simplifying the code here after the backport.

ywelsch · 2020-02-06T16:25:11Z

server/src/main/java/org/elasticsearch/cluster/routing/RecoverySource.java

        }

        SnapshotRecoverySource(StreamInput in) throws IOException {
            restoreUUID = in.readString();
            snapshot = new Snapshot(in);
            version = Version.readVersion(in);
            index = in.readString();
+            if (in.getVersion().onOrAfter(Version.V_8_0_0)) {


The approach here does not allow us to later switch to reading `IndexId` directly from the stream after the backport to 7.x.

Can you change things so that you conditionally do either `in.readString()` or `new IndexId(in)`?

Tell me if I'm missing something here, but doesn't this break in the following edge case?

1. Running a mixed cluster with an old-version master
2. Fail over to a new-version master
3. The new-version master sends the existing recovery source over the wire, and it still has `null` for the UUID because the old master didn't add that when creating the `RecoverySource`. (Am I missing a spot where this is upgraded/recreated during master fail-over?)

Maybe we should separate out the `in.readString()` / `new IndexId(in)` serialization part and make the logic rely only on `IndexId`, both on master and 7.x, using `INDEX_UUID_NA_VALUE` to resolve the index id later when needed?
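The suggestion above can be sketched as follows: the reader always produces an `IndexId`, and a peer on an older wire format yields one whose id is the "not available" sentinel instead of `null`. This is an illustrative sketch only. `java.io` streams and an integer version stand in for the real stream and version classes, `FIRST_VERSION_WITH_INDEX_ID` and `roundTrip` are hypothetical, and the `"_na_"` literal is an assumption about the value of `INDEX_UUID_NA_VALUE`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class IndexIdWireSketch {
    // Stand-in for IndexMetaData.INDEX_UUID_NA_VALUE; the literal is an assumption.
    public static final String INDEX_UUID_NA_VALUE = "_na_";
    // Hypothetical wire version from which peers understand a full IndexId.
    public static final int FIRST_VERSION_WITH_INDEX_ID = 2;

    /** Simplified stand-in for the real IndexId: an index name plus its UUID. */
    public static final class IndexId {
        public final String name;
        public final String id;

        public IndexId(String name, String id) {
            this.name = name;
            this.id = id;
        }
    }

    /** Newer peers get name and id; older peers only get the name they expect. */
    public static void writeIndex(DataOutputStream out, int peerVersion, IndexId index) throws IOException {
        out.writeUTF(index.name);
        if (peerVersion >= FIRST_VERSION_WITH_INDEX_ID) {
            out.writeUTF(index.id);
        }
    }

    /** Always returns an IndexId; the sentinel marks an id that must be resolved later. */
    public static IndexId readIndex(DataInputStream in, int peerVersion) throws IOException {
        final String name = in.readUTF();
        if (peerVersion >= FIRST_VERSION_WITH_INDEX_ID) {
            return new IndexId(name, in.readUTF());
        }
        return new IndexId(name, INDEX_UUID_NA_VALUE);
    }

    /** Helper for experiments: writes and reads back using the same peer version. */
    public static IndexId roundTrip(IndexId index, int peerVersion) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                writeIndex(out, peerVersion, index);
            }
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return readIndex(in, peerVersion);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this shape, only the serialization code knows about versions, and the version branch can be dropped once every supported peer sends the full id.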

original-brownbear · 2020-02-07T10:58:48Z

Thanks @tlrx, I totally forgot about `INDEX_UUID_NA_VALUE`. I think that's much cleaner than what I had before, and we already use it in other spots in snapshots as well.

Should be good for another review :)

tlrx

LGTM - thanks for the simplification. I left a few minor comments.

tlrx · 2020-02-07T11:33:58Z

server/src/main/java/org/elasticsearch/index/shard/StoreRecovery.java

-            if (!shardId.getIndexName().equals(indexName)) {
-                snapshotShardId = new ShardId(indexName, IndexMetaData.INDEX_UUID_NA_VALUE, shardId.id());
-            } else {
+            final IndexId indexIdFromCS = restoreSource.index();


Maybe just `final IndexId indexId = restoreSource.index();`

tlrx · 2020-02-07T11:34:29Z

server/src/main/java/org/elasticsearch/index/shard/StoreRecovery.java

+            } else {
+                snapshotShardId = new ShardId(indexIdFromCS.getName(), IndexMetaData.INDEX_UUID_NA_VALUE, shardId.id());
+            }
+            // If the index UUID was not found in the recovery source we will have to load RepositoryData and resolve it buy index name


tlrx · 2020-02-07T11:35:17Z

server/src/main/java/org/elasticsearch/index/shard/StoreRecovery.java

+                snapshotShardId = new ShardId(indexIdFromCS.getName(), IndexMetaData.INDEX_UUID_NA_VALUE, shardId.id());
+            }
+            // If the index UUID was not found in the recovery source we will have to load RepositoryData and resolve it buy index name
+            final boolean indexUUIDUnavailable = indexIdFromCS.getId().equals(IndexMetaData.INDEX_UUID_NA_VALUE);


Maybe `final boolean resolveIndexId = IndexMetaData.INDEX_UUID_NA_VALUE.equals(indexId.getId());`
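To illustrate why this check is where the PR saves work, here is a rough sketch of the resolution step: repository data is only loaded when the id carried in the recovery source is the sentinel. The names are hypothetical, a `Map` from index name to UUID stands in for `RepositoryData`, and the `"_na_"` literal is an assumption about the value of `INDEX_UUID_NA_VALUE`.

```java
import java.util.Map;
import java.util.function.Supplier;

final class ResolveIndexIdSketch {
    // Stand-in for IndexMetaData.INDEX_UUID_NA_VALUE; the literal is an assumption.
    public static final String INDEX_UUID_NA_VALUE = "_na_";

    /**
     * Returns the index UUID to restore from. The loader (a stand-in for loading
     * RepositoryData) is only invoked when the recovery source carried the sentinel.
     */
    public static String resolveIndexUuid(String indexName,
                                          String idFromRecoverySource,
                                          Supplier<Map<String, String>> repositoryDataLoader) {
        // Constant-first comparison, as in the suggestion above.
        final boolean resolveIndexId = INDEX_UUID_NA_VALUE.equals(idFromRecoverySource);
        if (resolveIndexId == false) {
            return idFromRecoverySource; // fast path: no repository load needed
        }
        final String resolved = repositoryDataLoader.get().get(indexName);
        if (resolved == null) {
            throw new IllegalStateException("index [" + indexName + "] not found in repository data");
        }
        return resolved;
    }

    public static void main(String[] args) {
        Supplier<Map<String, String>> loader = () -> Map.of("logs", "uuid-1");
        System.out.println(resolveIndexUuid("logs", "uuid-1", loader));
        System.out.println(resolveIndexUuid("logs", INDEX_UUID_NA_VALUE, loader));
    }
}
```

In the common case, where the id is already known, the loader is never called, which corresponds to the one load of `RepositoryData` per shard restore that this PR removes.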

original-brownbear · 2020-02-07T11:52:10Z

@tlrx thanks for the suggestions, I found a refactoring to address those I think :) See: 11f84fe

original-brownbear · 2020-02-07T12:25:42Z

Jenkins run elasticsearch-ci/1 (unrelated failure)

ywelsch

LGTM

original-brownbear · 2020-02-07T14:24:38Z

Thanks Tanguy + Yannick!

Remove Redundant Loading of RepositoryData during Restore

c9f0cc2

We can just put the `IndexId` instead of just the index name into the recovery source and save one load of `RepositoryData` on each shard restore that way.

original-brownbear added the >non-issue, :Distributed Coordination/Snapshot/Restore, v8.0.0, and v7.7.0 labels Feb 6, 2020

original-brownbear added 4 commits February 6, 2020 07:33

nicer

d0081c5

safer bwc

946ce25

Merge remote-tracking branch 'elastic/master' into cleaner-restore

14e71fb

shorter

f6a6f7a

original-brownbear requested review from ywelsch and tlrx February 6, 2020 08:24

ywelsch reviewed Feb 6, 2020

original-brownbear requested a review from ywelsch February 6, 2020 19:39

original-brownbear added 3 commits February 7, 2020 10:47

Merge remote-tracking branch 'elastic/master' into cleaner-restore

c6110c8

CR: nicer serialization

9bf7a99

shorter diff

274a7e5

tlrx approved these changes Feb 7, 2020

nicer

11f84fe

original-brownbear requested a review from tlrx February 7, 2020 11:52

tlrx approved these changes Feb 7, 2020

ywelsch approved these changes Feb 7, 2020

original-brownbear merged commit e79e6d9 into elastic:master Feb 7, 2020

original-brownbear deleted the cleaner-restore branch February 7, 2020 14:24

original-brownbear added the backport pending label Feb 7, 2020

original-brownbear mentioned this pull request Feb 9, 2020

Remove Redundant Loading of RepositoryData during Restore (#51977) #52108

Merged

original-brownbear removed the backport pending label Feb 9, 2020

original-brownbear restored the cleaner-restore branch January 6, 2021 14:08

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
