Skip to content

Refactor RestoreService Restore Path #73258

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

original-brownbear
Copy link
Member

@original-brownbear original-brownbear commented May 20, 2021

Make the restore path a little easier to follow by splitting it up into
the cluster state update and the steps that happen before the CS update.
Also, document more pieces of it and remove some confusing redundant code.
Outside of spots that have inline comments to the contrary, this is a purely mechanical
refactoring, no logical changes.

Partly also motivated by upcoming changes to how SnapshotInfo is loaded (with the current structure it's essentially impossible to make loading SnapshotInfo async).

Note to reviewers: this change is much smaller than it appears from the diff. Unfortunately, even with ?w=1 it doesn't render so neatly. If you separately check the methods that moved from the CS update task anonymous class to the top level like updateIndexSettings you should find that there's barely any changes to the code. If this is too bothersome to review though, I thing we could just do 2 steps here instead and move the code around to separate methods and the named CS update task class in one PR and then do the other adjustments separately. I just opted not to do that because the first step would be so awkward (because of all the redundant variables that would have to be passed around).

Make the restore path a little easier to follow by splitting it up into
the cluster state update and the steps that happen before the CS update.
Also, document more pieces of it and remove some confusing redundant code.
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label May 20, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

return allocationService.reroute(updatedState, "restored snapshot [" + snapshot + "]");
}

private void checkAliasNameConflicts(Map<String, String> renamedIndices, Set<String> aliases) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this works was adjusted here. No need to collect all aliases into a set up front and then check against the renamedIndices keyset when renamedIndices is constant throughout the process. Just check alias names against the keyset in-line as we loop through the indices.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why we check this one case here? There are other possible problems with aliases too: the new aliases might conflict with the names of existing indices and/or the names of the indices being restored might conflict with existing aliases. I mean we do check these things in Metadata.Builder#build() I’m just wondering why we check this one extra case explicitly like this too.

ImmutableOpenMap<ShardId, RestoreInProgress.ShardRestoreStatus> shards;
Set<String> aliases = new HashSet<>();

if (indices.isEmpty() == false) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this check, just run the loop (or not if there's no indices) -> dropped it

ClusterBlocks.Builder blocks = ClusterBlocks.builder().blocks(currentState.blocks());
RoutingTable.Builder rtBuilder = RoutingTable.builder(currentState.routingTable());
ImmutableOpenMap<ShardId, RestoreInProgress.ShardRestoreStatus> shards;
Set<String> aliases = new HashSet<>();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See other comment, no need for this state in the loop

createIndexService.validateIndexName(renamedIndexName, currentState);
createIndexService.validateDotIndex(renamedIndexName, isHidden);
createIndexService.validateIndexSettings(renamedIndexName, snapshotIndexMetadata.getSettings(), false);
IndexMetadata.Builder indexMdBuilder = IndexMetadata.builder(snapshotIndexMetadata)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplified the builder usage here and in the existing closed index path to use as much chaining as possible and keep things tidy and extracted the logic to separate methods to make this a little shorter to read.

@@ -1095,9 +754,12 @@ public static int failedShards(ImmutableOpenMap<ShardId, RestoreInProgress.Shard
return failedShards;
}

private static Map<String, String> renamedIndices(RestoreSnapshotRequest request, List<String> filteredIndices,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made this return Map<String, IndexId> instead right away so we don't have to keep a hold of the RepositoryData for the duration of the CS update request only to look up the IndexId later.

@original-brownbear
Copy link
Member Author

Jenkins run elasticsearch-ci/rest-compatibility

@original-brownbear
Copy link
Member Author

Jenkins run
elasticsearch-ci/bwc

Copy link
Member

@tlrx tlrx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - Took me some time to process but I agree the overall change makes things easier to read.

@original-brownbear
Copy link
Member Author

Thanks Tanguy!

@original-brownbear original-brownbear merged commit b13f43b into elastic:master May 20, 2021
@original-brownbear original-brownbear deleted the cleanup-restore-service branch May 20, 2021 13:51
Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM2 after repeating the (50+!) steps I needed to get here. I left one question.

return allocationService.reroute(updatedState, "restored snapshot [" + snapshot + "]");
}

private void checkAliasNameConflicts(Map<String, String> renamedIndices, Set<String> aliases) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why we check this one case here? There are other possible problems with aliases too: the new aliases might conflict with the names of existing indices and/or the names of the indices being restored might conflict with existing aliases. I mean we do check these things in Metadata.Builder#build() I’m just wondering why we check this one extra case explicitly like this too.

original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Jun 13, 2021
Make the restore path a little easier to follow by splitting it up into
the cluster state update and the steps that happen before the CS update.
Also, document more pieces of it and remove some confusing redundant code.
original-brownbear added a commit that referenced this pull request Jun 14, 2021
Make the restore path a little easier to follow by splitting it up into
the cluster state update and the steps that happen before the CS update.
Also, document more pieces of it and remove some confusing redundant code.
@jakelandis jakelandis removed the v8.0.0 label Jul 26, 2021
@original-brownbear original-brownbear restored the cleanup-restore-service branch April 18, 2023 20:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >refactoring Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.14.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants