Commit e564c4d

Add Package Level JavaDoc on Snapshots (#38108) (#39514)
* Add Package Level JavaDoc on Snapshots
1 parent 8a19d98 commit e564c4d

1 file changed, 107 additions and 0 deletions

@@ -0,0 +1,107 @@
+/*
+ * Licensed to Elasticsearch under one or more contributor
+ * license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright
+ * ownership. Elasticsearch licenses this file to you under
+ * the Apache License, Version 2.0 (the "License"); you may
+ * not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/**
+ * <p>This package exposes the Elasticsearch Snapshot functionality.</p>
+ *
+ * <h1>Preliminaries</h1>
+ *
+ * <p>There are two communication channels between all nodes and master in the snapshot functionality:</p>
+ * <ul>
+ * <li>The master updates the cluster state by adding, removing or altering the contents of its custom entry
+ * {@link org.elasticsearch.cluster.SnapshotsInProgress}. All nodes consume the state of the {@code SnapshotsInProgress} and will start or
+ * abort relevant shard snapshot tasks accordingly.</li>
+ * <li>Nodes that are executing shard snapshot tasks report either success or failure of their snapshot task by submitting a
+ * {@link org.elasticsearch.snapshots.SnapshotShardsService.UpdateIndexShardSnapshotStatusRequest} to the master node that will update the
+ * snapshot's entry in the cluster state accordingly.</li>
+ * </ul>
+ *
+ * <h1>Snapshot Creation</h1>
+ * <p>Snapshots are created by the following sequence of events:</p>
+ * <ol>
+ * <li>An invocation of {@link org.elasticsearch.snapshots.SnapshotsService#createSnapshot} enqueues a cluster state update to create
+ * a {@link org.elasticsearch.cluster.SnapshotsInProgress.Entry} in the cluster state's {@code SnapshotsInProgress}. This initial snapshot
+ * entry has its state set to {@code INIT} and an empty map set for the state of the individual shards' snapshots.</li>
+ *
+ * <li>After the snapshot's entry with state {@code INIT} is in the cluster state, {@link org.elasticsearch.snapshots.SnapshotsService}
+ * determines the primary shards' assignments for all indices that are being snapshotted and updates the existing
+ * {@code SnapshotsInProgress.Entry} with state {@code STARTED} and adds the map of {@link org.elasticsearch.index.shard.ShardId} to
+ * {@link org.elasticsearch.cluster.SnapshotsInProgress.ShardSnapshotStatus} that tracks the assignment of which node is to snapshot which
+ * shard. All shard snapshots are executed on the shard's primary node. Thus all shards for which the primary node was found to have a
+ * healthy copy of the shard are marked as being in state {@code INIT} in this map. If the primary for a shard is unassigned, it is marked
+ * as {@code MISSING} in this map. In case the primary is initializing at this point, it is marked as in state {@code WAITING}. In case a
+ * shard's primary is relocated at any point after its {@code SnapshotsInProgress.Entry} has moved to state {@code STARTED} and thus been
+ * assigned to a specific cluster node, that shard's snapshot will fail and move to state {@code FAILED}.</li>
+ *
+ * <li>The new {@code SnapshotsInProgress.Entry} is then observed by
+ * {@link org.elasticsearch.snapshots.SnapshotShardsService#clusterChanged} on all nodes. Since the entry is in state {@code STARTED},
+ * the {@code SnapshotShardsService} will check if any local primary shards are to be snapshotted (signaled by the shard's snapshot state
+ * being {@code INIT}). For those local primary shards found in state {@code INIT}, the snapshot process of writing the shard's data files
+ * to the snapshot's {@link org.elasticsearch.repositories.Repository} is executed. Once the snapshot execution finishes for a shard, an
+ * {@code UpdateIndexShardSnapshotStatusRequest} is sent to the master node signaling either status {@code SUCCESS} or {@code FAILED}.
+ * The master node will then update a shard's state in the snapshot's {@code SnapshotsInProgress.Entry} whenever it receives such a
+ * {@code UpdateIndexShardSnapshotStatusRequest}.</li>
+ *
+ * <li>If, as a result of the received status update requests, all shards in the cluster state are in a completed state, i.e. are marked
+ * as either {@code SUCCESS}, {@code FAILED} or {@code MISSING}, the {@code SnapshotShardsService} will update the state of the
+ * {@code Entry} itself and mark it as {@code SUCCESS}. At the same time {@link org.elasticsearch.snapshots.SnapshotsService#endSnapshot}
+ * is executed, which writes the metadata necessary to finalize the snapshot to the repository.</li>
+ *
+ * <li>After writing the final metadata to the repository, a cluster state update to remove the snapshot from the cluster state is
+ * submitted and the removal of the snapshot's {@code SnapshotsInProgress.Entry} from the cluster state completes the snapshot process.
+ * </li>
+ * </ol>
+ *
+ * <h1>Deleting a Snapshot</h1>
+ *
+ * <p>Deleting a snapshot can take the form of either simply deleting it from the repository or (if it has not completed yet) aborting it
+ * and subsequently deleting it from the repository.</p>
+ *
+ * <h2>Aborting a Snapshot</h2>
+ *
+ * <ol>
+ * <li>Aborting a snapshot starts by updating the state of the snapshot's {@code SnapshotsInProgress.Entry} to {@code ABORTED}.</li>
+ *
+ * <li>The snapshot's state change to {@code ABORTED} in cluster state is then picked up by the {@code SnapshotShardsService} on all nodes.
+ * Those nodes that have shard snapshot actions for the snapshot assigned to them will abort them and notify master about the shard's
+ * snapshot status accordingly. If the shard snapshot action completed or was in state {@code FINALIZE} when the abort was registered by
+ * the {@code SnapshotShardsService}, then the shard's state will be reported to master as {@code SUCCESS}.
+ * Otherwise, it will be reported as {@code FAILED}.</li>
+ *
+ * <li>Once all the shards are reported to master as either {@code SUCCESS} or {@code FAILED}, the {@code SnapshotsService} on the master
+ * will finish the snapshot process, as all shards' states are now completed and hence the snapshot can be completed as explained in
+ * point 4 of the snapshot creation section above.</li>
+ * </ol>
+ *
+ * <h2>Deleting a Snapshot from a Repository</h2>
+ *
+ * <ol>
+ * <li>Assuming there are no entries in the cluster state's {@code SnapshotsInProgress}, deleting a snapshot starts with the
+ * {@code SnapshotsService} creating an entry for deleting the snapshot in the cluster state's
+ * {@link org.elasticsearch.cluster.SnapshotDeletionsInProgress}.</li>
+ *
+ * <li>Once the cluster state contains the deletion entry in {@code SnapshotDeletionsInProgress}, the {@code SnapshotsService} will invoke
+ * {@link org.elasticsearch.repositories.Repository#deleteSnapshot} for the given snapshot, which will remove files associated with the
+ * snapshot from the repository as well as update its metadata to reflect the deletion of the snapshot.</li>
+ *
+ * <li>After the deletion of the snapshot's data from the repository finishes, the {@code SnapshotsService} will submit a cluster state
+ * update to remove the deletion's entry in {@code SnapshotDeletionsInProgress}, which concludes the process of deleting a snapshot.</li>
+ * </ol>
+ */
+package org.elasticsearch.snapshots;
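
The per-shard bookkeeping described in the Javadoc above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `SnapshotFlowSketch`, `PrimaryState`, `ShardState`, `initialShardState` and `allShardsCompleted` are hypothetical names invented for illustration and are not the Elasticsearch classes or API. The sketch shows how a primary's routing state might map to the initial shard snapshot state (step 2 of snapshot creation) and when all shards count as completed (step 4).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified model of the states named in the package Javadoc.
// None of these types are the real Elasticsearch classes.
public class SnapshotFlowSketch {

    /** Stand-in for the routing state of a shard's primary when the snapshot starts. */
    enum PrimaryState { UNASSIGNED, INITIALIZING, STARTED }

    /** Stand-in for the per-shard snapshot states mentioned in the Javadoc. */
    enum ShardState {
        INIT, WAITING, MISSING, SUCCESS, FAILED;

        /** SUCCESS, FAILED and MISSING are the completed states listed in step 4. */
        boolean completed() {
            return this == SUCCESS || this == FAILED || this == MISSING;
        }
    }

    /** Step 2: derive the initial shard snapshot state from the primary's state. */
    static ShardState initialShardState(PrimaryState primary) {
        switch (primary) {
            case UNASSIGNED:
                return ShardState.MISSING;   // no primary to snapshot from
            case INITIALIZING:
                return ShardState.WAITING;   // primary not ready yet
            default:
                return ShardState.INIT;      // healthy primary, snapshot can start
        }
    }

    /** Step 4: the snapshot can be finalized once every shard is in a completed state. */
    static boolean allShardsCompleted(Map<String, ShardState> shards) {
        // Note: an empty map is treated as completed by allMatch.
        return shards.values().stream().allMatch(ShardState::completed);
    }

    public static void main(String[] args) {
        Map<String, ShardState> shards = new LinkedHashMap<>();
        shards.put("index-a[0]", initialShardState(PrimaryState.STARTED));
        shards.put("index-a[1]", initialShardState(PrimaryState.UNASSIGNED));
        System.out.println(shards + " completed=" + allShardsCompleted(shards));

        // A data node reports success for the one shard that was still running.
        shards.put("index-a[0]", ShardState.SUCCESS);
        System.out.println(shards + " completed=" + allShardsCompleted(shards));
    }
}
```

In this model a shard whose primary is unassigned never blocks completion, because `MISSING` is already a completed state, which mirrors the wording of step 4 above.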
