|
| 1 | +/* |
| 2 | + * Licensed to Elasticsearch under one or more contributor |
| 3 | + * license agreements. See the NOTICE file distributed with |
| 4 | + * this work for additional information regarding copyright |
| 5 | + * ownership. Elasticsearch licenses this file to you under |
| 6 | + * the Apache License, Version 2.0 (the "License"); you may |
| 7 | + * not use this file except in compliance with the License. |
| 8 | + * You may obtain a copy of the License at |
| 9 | + * |
| 10 | + * http://www.apache.org/licenses/LICENSE-2.0 |
| 11 | + * |
| 12 | + * Unless required by applicable law or agreed to in writing, |
| 13 | + * software distributed under the License is distributed on an |
| 14 | + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| 15 | + * KIND, either express or implied. See the License for the |
| 16 | + * specific language governing permissions and limitations |
| 17 | + * under the License. |
| 18 | + */ |
| 19 | + |
| 20 | +/** |
| 21 | + * <p>This package exposes the Elasticsearch Snapshot functionality.</p> |
| 22 | + * |
| 23 | + * <h1>Preliminaries</h1> |
| 24 | + * |
| 25 | + * <p>There are two communication channels between all nodes and master in the snapshot functionality:</p> |
| 26 | + * <ul> |
| 27 | + * <li>The master updates the cluster state by adding, removing or altering the contents of its custom entry |
| 28 | + * {@link org.elasticsearch.cluster.SnapshotsInProgress}. All nodes consume the state of the {@code SnapshotsInProgress} and will start or |
| 29 | + * abort relevant shard snapshot tasks accordingly.</li> |
| 30 | + * <li>Nodes that are executing shard snapshot tasks report either success or failure of their snapshot task by submitting a |
| 31 | + * {@link org.elasticsearch.snapshots.SnapshotShardsService.UpdateIndexShardSnapshotStatusRequest} to the master node that will update the |
| 32 | + * snapshot's entry in the cluster state accordingly.</li> |
| 33 | + * </ul> |
| 34 | + * |
| 35 | + * <h1>Snapshot Creation</h1> |
| 36 | + * <p>Snapshots are created by the following sequence of events:</p> |
| 37 | + * <ol> |
| 38 | + * <li>An invocation of {@link org.elasticsearch.snapshots.SnapshotsService#createSnapshot} enqueues a cluster state update to create |
| 39 | + * a {@link org.elasticsearch.cluster.SnapshotsInProgress.Entry} in the cluster state's {@code SnapshotsInProgress}. This initial snapshot |
| 40 | + * entry has its state set to {@code INIT} and an empty map set for the state of the individual shards' snapshots.</li> |
| 41 | + * |
| 42 | + * <li>After the snapshot's entry with state {@code INIT} is in the cluster state, {@link org.elasticsearch.snapshots.SnapshotsService} |
| 43 | + * determines the primary shards' assignments for all indices that are being snapshotted and updates the existing |
| 44 | + * {@code SnapshotsInProgress.Entry} with state {@code STARTED} and adds the map of {@link org.elasticsearch.index.shard.ShardId} to |
| 45 | + * {@link org.elasticsearch.cluster.SnapshotsInProgress.ShardSnapshotStatus} that tracks the assignment of which node is to snapshot which |
| 46 | + * shard. All shard snapshots are executed on the shard's primary node. Thus all shards for which the primary node was found to have a |
| 47 | + * healthy copy of the shard are marked as being in state {@code INIT} in this map. If the primary for a shard is unassigned, it is marked |
| 48 | + * as {@code MISSING} in this map. In case the primary is initializing at this point, it is marked as in state {@code WAITING}. In case a |
| 49 | + * shard's primary is relocated at any point after its {@code SnapshotsInProgress.Entry} has moved to state {@code STARTED} and thus been |
| 50 | + * assigned to a specific cluster node, that shard's snapshot will fail and move to state {@code FAILED}.</li> |
| 51 | + * |
| 52 | + * <li>The new {@code SnapshotsInProgress.Entry} is then observed by |
| 53 | + * {@link org.elasticsearch.snapshots.SnapshotShardsService#clusterChanged} on all nodes and since the entry is in state {@code STARTED} |
| 54 | + * the {@code SnapshotShardsService} will check if any local primary shards are to be snapshotted (signaled by the shard's snapshot state |
| 55 | + * being {@code INIT}). For those local primary shards found in state {@code INIT}, the snapshot process of writing the shard's data files |
| 56 | + * to the snapshot's {@link org.elasticsearch.repositories.Repository} is executed. Once the snapshot execution finishes for a shard, an |
| 57 | + * {@code UpdateIndexShardSnapshotStatusRequest} is sent to the master node signaling either status {@code SUCCESS} or {@code FAILED}. |
| 58 | + * The master node will then update a shard's state in the snapshot's {@code SnapshotsInProgress.Entry} whenever it receives such a |
| 59 | + * {@code UpdateIndexShardSnapshotStatusRequest}.</li> |
| 60 | + * |
| 61 | + * <li>If, as a result of the received status update requests, all shards in the cluster state are in a completed state, i.e. are marked as |
| 62 | + * either {@code SUCCESS}, {@code FAILED} or {@code MISSING}, the {@code SnapshotShardsService} will update the state of the {@code Entry} |
| 63 | + * itself and mark it as {@code SUCCESS}. At the same time {@link org.elasticsearch.snapshots.SnapshotsService#endSnapshot} is executed, |
| 64 | + * writing the metadata necessary to finalize the snapshot to the repository.</li> |
| 65 | + * |
| 66 | + * <li>After writing the final metadata to the repository, a cluster state update to remove the snapshot from the cluster state is |
| 67 | + * submitted and the removal of the snapshot's {@code SnapshotsInProgress.Entry} from the cluster state completes the snapshot process. |
| 68 | + * </li> |
| 69 | + * </ol> |
| 70 | + * |
| 71 | + * <h1>Deleting a Snapshot</h1> |
| 72 | + * |
| 73 | + * <p>Deleting a snapshot can take the form of either simply deleting it from the repository or (if it has not completed yet) aborting it |
| 74 | + * and subsequently deleting it from the repository.</p> |
| 75 | + * |
| 76 | + * <h2>Aborting a Snapshot</h2> |
| 77 | + * |
| 78 | + * <ol> |
| 79 | + * <li>Aborting a snapshot starts by updating the state of the snapshot's {@code SnapshotsInProgress.Entry} to {@code ABORTED}.</li> |
| 80 | + * |
| 81 | + * <li>The snapshot's state change to {@code ABORTED} in cluster state is then picked up by the {@code SnapshotShardsService} on all nodes. |
| 82 | + * Those nodes that have shard snapshot actions for the snapshot assigned to them will abort them and notify master about the shards' |
| 83 | + * snapshot status accordingly. If the shard snapshot action completed or was in state {@code FINALIZE} when the abort was registered by |
| 84 | + * the {@code SnapshotShardsService}, then the shard's state will be reported to master as {@code SUCCESS}. |
| 85 | + * Otherwise, it will be reported as {@code FAILED}.</li> |
| 86 | + * |
| 87 | + * <li>Once all the shards are reported to master as either {@code SUCCESS} or {@code FAILED}, the {@code SnapshotsService} on the master |
| 88 | + * will finish the snapshot process, as all shards' states are now completed and hence the snapshot can be completed as explained in step 4 |
| 89 | + * of the snapshot creation section above.</li> |
| 90 | + * </ol> |
| 91 | + * |
| 92 | + * <h2>Deleting a Snapshot from a Repository</h2> |
| 93 | + * |
| 94 | + * <ol> |
| 95 | + * <li>Assuming there are no entries in the cluster state's {@code SnapshotsInProgress}, deleting a snapshot starts with the |
| 96 | + * {@code SnapshotsService} creating an entry for deleting the snapshot in the cluster state's |
| 97 | + * {@link org.elasticsearch.cluster.SnapshotDeletionsInProgress}.</li> |
| 98 | + * |
| 99 | + * <li>Once the cluster state contains the deletion entry in {@code SnapshotDeletionsInProgress} the {@code SnapshotsService} will invoke |
| 100 | + * {@link org.elasticsearch.repositories.Repository#deleteSnapshot} for the given snapshot, which will remove files associated with the |
| 101 | + * snapshot from the repository as well as update its metadata to reflect the deletion of the snapshot.</li> |
| 102 | + * |
| 103 | + * <li>After the deletion of the snapshot's data from the repository finishes, the {@code SnapshotsService} will submit a cluster state |
| 104 | + * update to remove the deletion's entry in {@code SnapshotDeletionsInProgress} which concludes the process of deleting a snapshot.</li> |
| 105 | + * </ol> |
| 106 | + */ |
| 107 | +package org.elasticsearch.snapshots; |
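
The shard-level bookkeeping described in the Javadoc above (step 2 and step 4 of snapshot creation) can be illustrated with a small, self-contained sketch. This is not the Elasticsearch implementation: `SnapshotStateSketch`, `PrimaryRouting`, `initialState` and `allShardsCompleted` are hypothetical names introduced only for illustration, and the mapping of a "healthy" primary to a routing value called `STARTED` is an assumption made to keep the model small.

```java
import java.util.Map;

/** Hypothetical, simplified model of the shard snapshot states; not the Elasticsearch classes. */
final class SnapshotStateSketch {

    /** Assumed stand-in for the routing state of a shard's primary. */
    enum PrimaryRouting { STARTED, INITIALIZING, UNASSIGNED }

    /** Shard-level snapshot states named in the package documentation. */
    enum ShardState {
        INIT(false), WAITING(false), SUCCESS(true), FAILED(true), MISSING(true);

        private final boolean completed;

        ShardState(boolean completed) {
            this.completed = completed;
        }

        boolean completed() {
            return completed;
        }
    }

    /** Step 2: derive the initial shard snapshot state from the primary's routing. */
    static ShardState initialState(PrimaryRouting routing) {
        switch (routing) {
            case STARTED:
                return ShardState.INIT;      // healthy primary: its node snapshots the shard
            case INITIALIZING:
                return ShardState.WAITING;   // primary still initializing
            default:
                return ShardState.MISSING;   // primary unassigned
        }
    }

    /** Step 4: the snapshot can be finalized once every shard is SUCCESS, FAILED or MISSING. */
    static boolean allShardsCompleted(Map<String, ShardState> shards) {
        return shards.values().stream().allMatch(ShardState::completed);
    }

    private SnapshotStateSketch() {}
}
```

In this model a snapshot whose shards are all `MISSING` counts as completed right away, which matches the documentation's definition of a completed state, while a single shard in `INIT` or `WAITING` keeps the snapshot open until a status update moves it to `SUCCESS` or `FAILED`.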