Do not start snapshots that are deleted during initialization #27931

tlrx · 2017-12-20T19:53:04Z

When a new snapshot is created, it is added to the cluster state as a snapshot-in-progress in INIT state, and the initialization is kicked off in a new runnable task by SnapshotService.beginSnapshot(). The initialization writes multiple files before updating the cluster state to change the snapshot-in-progress to STARTED state.

This leaves a short window in which the snapshot could be deleted (say, because the snapshot is stuck in INIT or because uploading the initialization files for all snapshotted indices takes too long). If the INIT snapshot is deleted, a race begins between two actors: the deletion, which sets the snapshot-in-progress to ABORTED in the cluster state and tries to finalize the snapshot, and the initialization in SnapshotService.beginSnapshot(), which changes the state back to STARTED.

This pull request changes SnapshotService.beginSnapshot() so that an ABORTED snapshot is not started if it has been deleted during initialization. It also adds a test that would have failed with the previous behaviour, and changes a few method names here and there.
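The guard described above can be pictured with a small state model. This is only a sketch under stated assumptions, not the code from this pull request: the class `SnapshotStartGuard`, its `State` enum and the method `stateAfterInitialization()` are hypothetical names made up for illustration, and the real check lives inside a cluster state update in SnapshotService.beginSnapshot().

```java
// Hypothetical, simplified model of the guard; not the actual Elasticsearch code.
final class SnapshotStartGuard {

    /** The three snapshot-in-progress states discussed in this pull request. */
    enum State { INIT, STARTED, ABORTED }

    /**
     * Computes the state a snapshot-in-progress should have once its
     * initialization has finished, given the state found in the latest
     * cluster state at that moment.
     */
    static State stateAfterInitialization(State current) {
        if (current == State.ABORTED) {
            // The snapshot was deleted while it was initializing:
            // keep it ABORTED instead of flipping it back to STARTED.
            return State.ABORTED;
        }
        if (current == State.INIT) {
            // Normal path: initialization finished and nobody aborted it.
            return State.STARTED;
        }
        // Any other state is left untouched by this sketch.
        return current;
    }

    public static void main(String[] args) {
        System.out.println(stateAfterInitialization(State.INIT));    // prints STARTED
        System.out.println(stateAfterInitialization(State.ABORTED)); // prints ABORTED
    }
}
```

The point of the sketch is that the decision is based on the state observed when initialization ends, not on the state the snapshot had when initialization began.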

When a new snapshot is created, it is added to the cluster state as a snapshot-in-progress in INIT state, and the initialization is kicked off in a new runnable task by SnapshotService.beginSnapshot(). The initialization writes multiple files before updating the cluster state to change the snapshot-in-progress to STARTED state. This leaves a short window in which the snapshot could be deleted (say, because the snapshot is stuck in INIT or because uploading the initialization files for all snapshotted indices takes too long). If the INIT snapshot is deleted, the snapshot-in-progress becomes ABORTED, but once the initialization in SnapshotService.beginSnapshot() has finished it is changed back to STARTED state again. This commit prevents an ABORTED snapshot from being started if it has been deleted during initialization. It also adds a test that would have failed with the previous behavior, and changes a few method names here and there.

imotov

Nice cleanup! Thanks!

…snapshot

With the current snapshot/restore logic, a newly created snapshot is added by the SnapshotService.createSnapshot() method as a SnapshotInProgress object in the cluster state. This snapshot has the INIT state. Once the cluster state update is processed, the beginSnapshot() method is executed using the SNAPSHOT thread pool.

The beginSnapshot() method starts the initialization of the snapshot using the initializeSnapshot() method. This method reads the repository data and then writes the global metadata file and an index metadata file per index to be snapshotted. These operations can take some time to complete (many minutes).

At this stage, if the master node is disconnected, the snapshot can get stuck in INIT state on versions 5.6.4/6.0.0 or lower (pull request elastic#27214 fixed this on 5.6.5/6.0.1 and higher).

If the snapshot is not stuck but the initialization takes some time and the user decides to abort the snapshot, a delete snapshot request can sneak in. The deletion updates the cluster state to check the state of the SnapshotInProgress. When the snapshot is in INIT, it executes the endSnapshot() method (which returns immediately) and then the snapshot's state is updated to ABORTED in the cluster state. The deletion then listens for the snapshot completion in order to continue with the deletion of the snapshot. But before returning, the endSnapshot() method added a new Runnable to the SNAPSHOT thread pool that forces the finalization of the initializing snapshot. This finalization writes the snapshot metadata file and updates the index-N file in the repository.

At this stage two things can potentially be executed concurrently: the initialization of the snapshot and the finalization of the snapshot. When initializeSnapshot() terminates, the cluster state is updated to start the snapshot and to move it to the STARTED state (this is before elastic#27931, which prevents an ABORTED snapshot from being started at all). The snapshot is started and shards start to be snapshotted, but they quickly fail because the snapshot was ABORTED by the deletion. All shards are reported as FAILED to the master node, which executes endSnapshot() too (using SnapshotStateExecutor).

Then many things can happen, depending on the execution of tasks by the SNAPSHOT thread pool and the time taken by each read/write/delete operation by the repository implementation. This is especially true on S3, where operations can take time (disconnections, retries, timeouts) and where the data consistency model allows old data to be read or requires some time for objects to be replicated. Here are some scenarios seen in cluster logs:

a) The snapshot is finalized by the snapshot deletion. The snapshot metadata file exists in the repository, so the future finalization by the snapshot creation will fail with a "fail to finalize snapshot" message in logs. The deletion process continues.

b) The snapshot is finalized by the snapshot creation. The snapshot metadata file exists in the repository, so the future finalization by the snapshot deletion will fail with a "fail to finalize snapshot" message in logs. The deletion process continues.

c) Both finalizations are executed concurrently, and things can fail at different read or write operations. Shard failures can be lost, as well as the final snapshot state, depending on which SnapshotInProgress.Entry is used to finalize the snapshot.

d) The snapshot is finalized by the snapshot deletion, and the snapshot in progress is removed from the cluster state, triggering the execution of the completion listeners. The deletion process continues and deleteSnapshotFromRepository() is executed using the SNAPSHOT thread pool. This method reads the repository data, the snapshot metadata and the index metadata for all indices included in the snapshot before updating the index-N file in the repository. It can also take some time, and I think these operations could potentially be executed concurrently with the finalization of the snapshot by the snapshot creation, leading to corrupted data.

This commit does not solve all the issues reported here, but it removes the finalization of the snapshot by the snapshot deletion. This way, the deletion marks the snapshot as ABORTED in the cluster state and waits for the snapshot completion. It is the responsibility of the snapshot execution to detect the abort and terminate itself correctly. This avoids concurrent snapshot finalizations and also orders the operations: the deletion aborts the snapshot and waits for the snapshot completion, the creation detects the abort, stops by itself and finalizes the snapshot, then the deletion resumes and continues the deletion process.
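The ordering described in that commit message (the deletion only aborts and waits, the creation is the single place that finalizes) can be modelled in a few lines. This is a deliberately simplified, single-process sketch under stated assumptions: `AbortThenDeleteSketch` and all of its members are hypothetical names, a `CompletableFuture` stands in for the snapshot completion listeners, and no repository or cluster state is involved. It only covers a delete request that arrives while the snapshot is still in INIT.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical model of the ordering; not the actual Elasticsearch code.
final class AbortThenDeleteSketch {

    enum State { INIT, STARTED, ABORTED }

    private State state = State.INIT;
    // Stand-in for the snapshot completion listeners.
    private final CompletableFuture<Void> completion = new CompletableFuture<>();
    // Records the order in which the steps happen.
    final List<String> log = new ArrayList<>();

    synchronized State state() {
        return state;
    }

    /** Deletion side: mark the snapshot ABORTED and wait; never finalize here. */
    synchronized void delete() {
        state = State.ABORTED;
        log.add("delete: marked ABORTED, waiting for completion");
        // The rest of the deletion runs only once the creation side completes.
        completion.thenRun(() -> log.add("delete: removing snapshot from repository"));
    }

    /** Creation side: called when initialization has finished. */
    synchronized void onInitializationDone() {
        if (state == State.ABORTED) {
            // The creation detects the abort and is the only one to finalize.
            log.add("create: abort detected, finalizing snapshot");
            completion.complete(null);
            return;
        }
        state = State.STARTED;
        log.add("create: snapshot started");
    }

    public static void main(String[] args) {
        AbortThenDeleteSketch snapshot = new AbortThenDeleteSketch();
        snapshot.delete();               // delete request sneaks in during INIT
        snapshot.onInitializationDone(); // initialization finishes afterwards
        snapshot.log.forEach(System.out::println);
    }
}
```

Because only the creation side ever finalizes, the two concurrent finalizations from scenarios a) to c) cannot occur in this model, and the deletion step is always logged last.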

tlrx · 2018-01-15T14:52:38Z

This has been backported to 6.0.3 along with #28078 in 9b6d37a, and to 5.6.7 in 84503a1.

tlrx · 2018-01-15T20:27:18Z

Sorry, I mixed up labels. This was merged in 5.6.7 and 6.1.3.

tlrx added the :Distributed Coordination/Snapshot/Restore, review, v6.0.2, v6.1.2, v6.2.0 and v7.0.0 labels Dec 20, 2017

tlrx requested a review from imotov December 20, 2017 19:53

imotov approved these changes Dec 20, 2017

tlrx merged commit bd9daf4 into elastic:master Dec 22, 2017

tlrx removed the v6.0.2 label Dec 22, 2017

tlrx deleted the do-not-start-aborted-snapshots branch December 22, 2017 14:58

tlrx mentioned this pull request Dec 23, 2017

[CI] SharedClusterSnapshotRestoreIT.testAbortedSnapshotDuringInitDoesNotStart() fails #27974

Closed

tlrx mentioned this pull request Jan 4, 2018

Avoid concurrent snapshot finalizations when deleting an INIT snapshot #28078

Merged

tlrx mentioned this pull request Jan 10, 2018

[CI] StackOverflowError when executing SnapshotDisruptionIT.testDisruptionOnSnapshotInitialization #28169

Closed

tlrx added the v6.0.2 label Jan 15, 2018

tlrx added v5.6.6 v5.6.7 v6.1.3 and removed v5.6.6 v6.1.2 labels Jan 15, 2018

colings86 added the >bug label Jan 22, 2018

colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
