Skip to content

Remove Redundant Loading of RepositoryData during Restore #51977

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

original-brownbear
Copy link
Member

We can just put the IndexId instead of just the index name into the recovery soruce and
save one load of RepositoryData on each shard restore that way.

We can just put the `IndexId` instead of just the index name into the recovery soruce and
save one load of `RepositoryData` on each shard restore that way.
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@@ -252,6 +278,9 @@ protected void writeAdditionalFields(StreamOutput out) throws IOException {
snapshot.writeTo(out);
Version.writeVersion(version, out);
out.writeString(index);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we avoid sending this field if indexId != null (i.e. in the non-bwc case)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, see the other question, seems to me we need to write the uuid string as optional, maybe not?

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left one comment that allows to simplify the code here after backport.

}

SnapshotRecoverySource(StreamInput in) throws IOException {
restoreUUID = in.readString();
snapshot = new Snapshot(in);
version = Version.readVersion(in);
index = in.readString();
if (in.getVersion().onOrAfter(Version.V_8_0_0)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the approach here does not allow us to later switch to just reading directly IndexId from stream after backport to 7.x.

Can you change things so that you conditionally do in.readString(); or new IndexId(in)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tell me if I'm missing something here. But doesn't this break in the edge case of:

  1. Running mixed cluster with old version master
  2. Fail over to new version master
  3. New version master sends the existing recovery source over the wire because it still has null for the uuid because the old master didn't add that when creating the RecoverySource? (am I missing a spot where this is upgraded/recreated during master fail-over?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should differentiate the in.readString() / new IndexId(in) serialization part and make the logic only rely on IndexId, both on master and 7.x, and INDEX_UUID_NA_VALUE to resolve the index id later when needed?

@original-brownbear
Copy link
Member Author

Thanks @tlrx I totally forgot about INDEX_UUID_NA_VALUE. I think that's much cleaner than what I had before and we use it in other spots in snapshots as well already.

Should be good for another review :)

Copy link
Member

@tlrx tlrx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks for the simplification. I left few minor comments.

if (!shardId.getIndexName().equals(indexName)) {
snapshotShardId = new ShardId(indexName, IndexMetaData.INDEX_UUID_NA_VALUE, shardId.id());
} else {
final IndexId indexIdFromCS = restoreSource.index();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just final IndexId indexId = restoreSource.index();

} else {
snapshotShardId = new ShardId(indexIdFromCS.getName(), IndexMetaData.INDEX_UUID_NA_VALUE, shardId.id());
}
// If the index UUID was not found in the recovery source we will have to load RepositoryData and resolve it buy index name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo buy

snapshotShardId = new ShardId(indexIdFromCS.getName(), IndexMetaData.INDEX_UUID_NA_VALUE, shardId.id());
}
// If the index UUID was not found in the recovery source we will have to load RepositoryData and resolve it buy index name
final boolean indexUUIDUnavailable = indexIdFromCS.getId().equals(IndexMetaData.INDEX_UUID_NA_VALUE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe final boolean resolveIndexId = IndexMetaData.INDEX_UUID_NA_VALUE.equals(indexId.getId());

@original-brownbear
Copy link
Member Author

@tlrx thanks for the suggestions, I found a refactoring to address those I think :) See: 11f84fe

@original-brownbear
Copy link
Member Author

Jenkins run elasticsearch-ci/1 (unrelated failure)

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@original-brownbear
Copy link
Member Author

Thanks Tanguy + Yannick!

@original-brownbear original-brownbear merged commit e79e6d9 into elastic:master Feb 7, 2020
@original-brownbear original-brownbear deleted the cleaner-restore branch February 7, 2020 14:24
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Feb 9, 2020
)

We can just put the `IndexId` instead of just the index name into the recovery soruce and
save one load of `RepositoryData` on each shard restore that way.
original-brownbear added a commit that referenced this pull request Feb 9, 2020
…52108)

We can just put the `IndexId` instead of just the index name into the recovery soruce and
save one load of `RepositoryData` on each shard restore that way.
@original-brownbear original-brownbear restored the cleaner-restore branch January 6, 2021 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants