-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Remove Redundant Loading of RepositoryData during Restore #51977
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove Redundant Loading of RepositoryData during Restore #51977
Conversation
We can just put the `IndexId` instead of just the index name into the recovery soruce and save one load of `RepositoryData` on each shard restore that way.
Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore) |
@@ -252,6 +278,9 @@ protected void writeAdditionalFields(StreamOutput out) throws IOException { | |||
snapshot.writeTo(out); | |||
Version.writeVersion(version, out); | |||
out.writeString(index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we avoid sending this field if indexId != null
(i.e. in the non-bwc case)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, see the other question, seems to me we need to write the uuid string as optional, maybe not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left one comment that allows to simplify the code here after backport.
} | ||
|
||
SnapshotRecoverySource(StreamInput in) throws IOException { | ||
restoreUUID = in.readString(); | ||
snapshot = new Snapshot(in); | ||
version = Version.readVersion(in); | ||
index = in.readString(); | ||
if (in.getVersion().onOrAfter(Version.V_8_0_0)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the approach here does not allow us to later switch to just reading directly IndexId from stream after backport to 7.x.
Can you change things so that you conditionally do in.readString();
or new IndexId(in)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tell me if I'm missing something here. But doesn't this break in the edge case of:
- Running mixed cluster with old version master
- Fail over to new version master
- New version master sends the existing recovery source over the wire because it still has
null
for the uuid because the old master didn't add that when creating theRecoverySource
? (am I missing a spot where this is upgraded/recreated during master fail-over?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should differentiate the in.readString()
/ new IndexId(in)
serialization part and make the logic only rely on IndexId
, both on master and 7.x, and INDEX_UUID_NA_VALUE to resolve the index id later when needed?
Thanks @tlrx I totally forgot about Should be good for another review :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - thanks for the simplification. I left few minor comments.
if (!shardId.getIndexName().equals(indexName)) { | ||
snapshotShardId = new ShardId(indexName, IndexMetaData.INDEX_UUID_NA_VALUE, shardId.id()); | ||
} else { | ||
final IndexId indexIdFromCS = restoreSource.index(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just final IndexId indexId = restoreSource.index();
} else { | ||
snapshotShardId = new ShardId(indexIdFromCS.getName(), IndexMetaData.INDEX_UUID_NA_VALUE, shardId.id()); | ||
} | ||
// If the index UUID was not found in the recovery source we will have to load RepositoryData and resolve it buy index name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo buy
snapshotShardId = new ShardId(indexIdFromCS.getName(), IndexMetaData.INDEX_UUID_NA_VALUE, shardId.id()); | ||
} | ||
// If the index UUID was not found in the recovery source we will have to load RepositoryData and resolve it buy index name | ||
final boolean indexUUIDUnavailable = indexIdFromCS.getId().equals(IndexMetaData.INDEX_UUID_NA_VALUE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe final boolean resolveIndexId = IndexMetaData.INDEX_UUID_NA_VALUE.equals(indexId.getId());
Jenkins run elasticsearch-ci/1 (unrelated failure) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks Tanguy + Yannick! |
We can just put the
IndexId
instead of just the index name into the recovery soruce andsave one load of
RepositoryData
on each shard restore that way.