|
| 1 | +/* |
| 2 | + * Licensed to Elasticsearch under one or more contributor |
| 3 | + * license agreements. See the NOTICE file distributed with |
| 4 | + * this work for additional information regarding copyright |
| 5 | + * ownership. Elasticsearch licenses this file to you under |
| 6 | + * the Apache License, Version 2.0 (the "License"); you may |
| 7 | + * not use this file except in compliance with the License. |
| 8 | + * You may obtain a copy of the License at |
| 9 | + * |
| 10 | + * http://www.apache.org/licenses/LICENSE-2.0 |
| 11 | + * |
| 12 | + * Unless required by applicable law or agreed to in writing, |
| 13 | + * software distributed under the License is distributed on an |
| 14 | + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| 15 | + * KIND, either express or implied. See the License for the |
| 16 | + * specific language governing permissions and limitations |
| 17 | + * under the License. |
| 18 | + */ |
| 19 | + |
| 20 | +/** |
| 21 | + * <p>This package exposes the blobstore repository used by Elasticsearch Snapshots.</p> |
| 22 | + * |
| 23 | + * <h1>Preliminaries</h1> |
| 24 | + * |
| 25 | + * <p>The {@link org.elasticsearch.repositories.blobstore.BlobStoreRepository} forms the basis of implementations of |
| 26 | + * {@link org.elasticsearch.repositories.Repository} on top of a blob store. A blobstore can be used as the basis for an implementation |
| 27 | + * as long as it provides for GET, PUT, DELETE, and LIST operations. For a read-only repository, it suffices if the blobstore provides only |
| 28 | + * GET operations. |
| 29 | + * These operations are formally specified by the {@link org.elasticsearch.common.blobstore.BlobContainer} interface that |
| 30 | + * any {@code BlobStoreRepository} implementation must provide via its implementation of |
| 31 | + * {@link org.elasticsearch.repositories.blobstore.BlobStoreRepository#getBlobContainer()}.</p> |
| 32 | + * |
| 33 | + * <p>The blob store is written to and read from by master-eligible nodes and data nodes. All metadata related to a snapshot's |
| 34 | + * scope and health is written by the master node.</p> |
| 35 | + * <p>The data-nodes, on the other hand, write the data for each individual shard but do not write any blobs outside of shard directories for |
| 36 | + * shards that they hold the primary of. For each shard, the data-node holding the shard's primary writes the actual data in the form of |
| 37 | + * the shard's segment files to the repository as well as metadata about all the segment files that the repository stores for the shard.</p> |
| 38 | + * |
| 39 | + * <p>For the specifics on how the operations on the repository documented below are invoked during the snapshot process please refer to |
| 40 | + * the documentation of the {@link org.elasticsearch.snapshots} package.</p> |
| 41 | + * |
| 42 | + * <p>{@code BlobStoreRepository} maintains the following structure of blobs containing data and metadata in the blob store. The exact |
| 43 | + * operations executed on these blobs are explained below.</p> |
| 44 | + * <pre> |
| 45 | + * {@code |
| 46 | + * STORE_ROOT |
| 47 | + * |- index-N - JSON serialized {@link org.elasticsearch.repositories.RepositoryData} containing a list of all snapshot ids |
| 48 | + * | and the indices belonging to each snapshot, N is the generation of the file |
| 49 | + * |- index.latest - contains the numeric value of the latest generation of the index file (i.e. N from above) |
| 50 | + * |- incompatible-snapshots - list of all snapshot ids that are no longer compatible with the current version of the cluster |
| 51 | + * |- snap-20131010.dat - SMILE serialized {@link org.elasticsearch.snapshots.SnapshotInfo} for snapshot "20131010" |
| 52 | + * |- meta-20131010.dat - SMILE serialized {@link org.elasticsearch.cluster.metadata.MetaData} for snapshot "20131010" |
| 53 | + * | (includes only global metadata) |
| 54 | + * |- snap-20131011.dat - SMILE serialized {@link org.elasticsearch.snapshots.SnapshotInfo} for snapshot "20131011" |
| 55 | + * |- meta-20131011.dat - SMILE serialized {@link org.elasticsearch.cluster.metadata.MetaData} for snapshot "20131011" |
| 56 | + * ..... |
| 57 | + * |- indices/ - data for all indices |
| 58 | + * |- Ac1342-B_x/ - data for index "foo" which was assigned the unique id Ac1342-B_x (not to be confused with the actual index uuid) |
| 59 | + * | | in the repository |
| 60 | + * | |- meta-20131010.dat - JSON Serialized {@link org.elasticsearch.cluster.metadata.IndexMetaData} for index "foo" |
| 61 | + * | |- 0/ - data for shard "0" of index "foo" |
| 62 | + * | | |- __1 \ (files with numeric names were created by older ES versions) |
| 63 | + * | | |- __2 | |
| 64 | + * | | |- __VPO5oDMVT5y4Akv8T_AO_A |- files from different segments see snap-* for their mappings to real segment files |
| 65 | + * | | |- __1gbJy18wS_2kv1qI7FgKuQ | |
| 66 | + * | | |- __R8JvZAHlSMyMXyZc2SS8Zg / |
| 67 | + * | | ..... |
| 68 | + * | | |- snap-20131010.dat - SMILE serialized {@link org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardSnapshot} for |
| 69 | + * | | | snapshot "20131010" |
| 70 | + * | | |- snap-20131011.dat - SMILE serialized {@link org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardSnapshot} for |
| 71 | + * | | | snapshot "20131011" |
| 72 | + * | | |- index-123 - SMILE serialized {@link org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardSnapshots} for |
| 73 | + * | | | the shard |
| 74 | + * | | |
| 75 | + * | |- 1/ - data for shard "1" of index "foo" |
| 76 | + * | | |- __1 |
| 77 | + * | | ..... |
| 78 | + * | | |
| 79 | + * | |-2/ |
| 80 | + * | ...... |
| 81 | + * | |
| 82 | + * |- 1xB0D8_B3y/ - data for index "bar" which was assigned the unique id of 1xB0D8_B3y in the repository |
| 83 | + * ...... |
| 84 | + * } |
| 85 | + * </pre> |
| 86 | + * |
| 87 | + * <h1>Getting the Repository's RepositoryData</h1> |
| 88 | + * |
| 89 | + * <p>Loading the {@link org.elasticsearch.repositories.RepositoryData} that holds the list of all snapshots as well as the mapping of |
| 90 | + * indices' names to their repository {@link org.elasticsearch.repositories.IndexId} is done by invoking |
| 91 | + * {@link org.elasticsearch.repositories.blobstore.BlobStoreRepository#getRepositoryData} and implemented as follows:</p> |
| 92 | + * <ol> |
| 93 | + * <li> |
| 94 | + * <ol> |
| 95 | + * <li>The blobstore repository stores the {@code RepositoryData} in blobs named with incrementing suffix {@code N} at {@code /index-N} |
| 96 | + * directly under the repository's root.</li> |
| 97 | + * <li>The blobstore also stores the most recent {@code N} as a 64-bit long in the blob {@code /index.latest} directly under the |
| 98 | + * repository's root.</li> |
| 99 | + * </ol> |
| 100 | + * </li> |
| 101 | + * <li> |
| 102 | + * <ol> |
| 103 | + * <li>First, find the most recent {@code RepositoryData} by getting a list of all index-N blobs through listing all blobs with prefix |
| 104 | + * "index-" under the repository root and then selecting the one with the highest value for N.</li> |
| 105 | + * <li>If this operation fails because the repository's {@code BlobContainer} does not support list operations (in the case of read-only |
| 106 | + * repositories), read the highest value of N from the index.latest blob.</li> |
| 107 | + * </ol> |
| 108 | + * </li> |
| 109 | + * <li> |
| 110 | + * <ol> |
| 111 | + * <li>Use the just determined value of {@code N} and get the {@code /index-N} blob and deserialize the {@code RepositoryData} from it.</li> |
| 112 | + * <li>If no value of {@code N} could be found since neither an {@code index.latest} nor any {@code index-N} blobs exist in the repository, |
| 113 | + * it is assumed to be empty and {@link org.elasticsearch.repositories.RepositoryData#EMPTY} is returned.</li> |
| 114 | + * </ol> |
| 115 | + * </li> |
| 116 | + * </ol> |
| 117 | + * <h1>Creating a Snapshot</h1> |
| 118 | + * |
| 119 | + * <p>Creating a snapshot in the repository happens in the three steps described in detail below.</p> |
| 120 | + * |
| 121 | + * <h2>Initializing a Snapshot in the Repository</h2> |
| 122 | + * |
| 123 | + * <p>Creating a snapshot in the repository starts with a call to {@link org.elasticsearch.repositories.Repository#initializeSnapshot} which |
| 124 | + * the blob store repository implements via the following actions:</p> |
| 125 | + * <ol> |
| 126 | + * <li>Verify that no snapshot by the requested name exists.</li> |
| 127 | + * <li>Write a blob containing the cluster metadata to the root of the blob store repository at {@code /meta-${snapshot-uuid}.dat}.</li> |
| 128 | + * <li>Write the metadata for each index to a blob in that index's directory at |
| 129 | + * {@code /indices/${index-snapshot-uuid}/meta-${snapshot-uuid}.dat}.</li> |
| 130 | + * </ol> |
| 131 | + * TODO: This behavior is problematic, adjust these docs once https://github.com/elastic/elasticsearch/issues/41581 is fixed |
| 132 | + * |
| 133 | + * <h2>Writing Shard Data (Segments)</h2> |
| 134 | + * |
| 135 | + * <p>Once all the metadata has been written by the snapshot initialization, the snapshot process moves on to writing the actual shard data |
| 136 | + * to the repository by invoking {@link org.elasticsearch.repositories.Repository#snapshotShard} on the data-nodes that hold the primaries |
| 137 | + * for the shards in the current snapshot. It is implemented as follows:</p> |
| 138 | + * |
| 139 | + * <p>Note:</p> |
| 140 | + * <ul> |
| 141 | + * <li>For each shard {@code i} in a given index, its path in the blob store is located at {@code /indices/${index-snapshot-uuid}/${i}}</li> |
| 142 | + * <li>All the following steps are executed exclusively on the shard's primary's data node.</li> |
| 143 | + * </ul> |
| 144 | + * |
| 145 | + * <ol> |
| 146 | + * <li>Create the {@link org.apache.lucene.index.IndexCommit} for the shard to snapshot.</li> |
| 147 | + * <li>List all blobs in the shard's path. Find the {@link org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardSnapshots} blob |
| 148 | + * with name {@code index-${N}} for the highest possible value of {@code N} in the list to determine which segment files are |
| 149 | + * already available in the blobstore.</li> |
| 150 | + * <li>By comparing the files in the {@code IndexCommit} and the available file list from the previous step, determine the segment files |
| 151 | + * that need to be written to the blob store. For each segment that needs to be added to the blob store, generate a unique name by combining |
| 152 | + * the segment data blob prefix {@code __} and a UUID and write the segment to the blobstore.</li> |
| 153 | + * <li>After completing all segment writes, a blob containing a |
| 154 | + * {@link org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardSnapshot} with name {@code snap-${snapshot-uuid}.dat} is written to |
| 155 | + * the shard's path and contains a list of all the files referenced by the snapshot as well as some metadata about the snapshot. See the |
| 156 | + * documentation of {@code BlobStoreIndexShardSnapshot} for details on its contents.</li> |
| 157 | + * <li>Once all the segments and the {@code BlobStoreIndexShardSnapshot} blob have been written, an updated |
| 158 | + * {@code BlobStoreIndexShardSnapshots} blob is written to the shard's path with name {@code index-${N+1}}.</li> |
| 159 | + * </ol> |
| 160 | + * |
| 161 | + * <h2>Finalizing the Snapshot</h2> |
| 162 | + * |
| 163 | + * <p>After all primaries have finished writing the necessary segment files to the blob store in the previous step, the master node moves on |
| 164 | + * to finalizing the snapshot by invoking {@link org.elasticsearch.repositories.Repository#finalizeSnapshot}. This method executes the |
| 165 | + * following actions in order:</p> |
| 166 | + * <ol> |
| 167 | + * <li>Write the {@link org.elasticsearch.snapshots.SnapshotInfo} blob for the given snapshot to the key {@code /snap-${snapshot-uuid}.dat} |
| 168 | + * directly under the repository root.</li> |
| 169 | + * <li>Write an updated {@code RepositoryData} blob to the key {@code /index-${N+1}} using the {@code N} determined when initializing the |
| 170 | + * snapshot in the first step. When doing this, the implementation checks that the blob for generation {@code N + 1} has not yet been |
| 171 | + * written to prevent concurrent updates to the repository. If the blob for {@code N + 1} already exists, the execution of finalization |
| 172 | + * stops under the assumption that a master failover occurred and the snapshot has already been finalized by the new master.</li> |
| 173 | + * <li>Write the updated {@code /index.latest} blob containing the new repository generation {@code N + 1}.</li> |
| 174 | + * </ol> |
| 175 | + * |
| 176 | + * <h1>Deleting a Snapshot</h1> |
| 177 | + * |
| 178 | + * <p>Deleting a snapshot is an operation that is executed exclusively on the master node. It runs through the following sequence of |
| 179 | + * actions when {@link org.elasticsearch.repositories.blobstore.BlobStoreRepository#deleteSnapshot} is invoked:</p> |
| 180 | + * |
| 181 | + * <ol> |
| 182 | + * <li>Get the current {@code RepositoryData} from the latest {@code index-N} blob at the repository root.</li> |
| 183 | + * <li>Write an updated {@code RepositoryData} blob with the deleted snapshot removed to key {@code /index-${N+1}} directly under the |
| 184 | + * repository root.</li> |
| 185 | + * <li>Write an updated {@code index.latest} blob containing {@code N + 1}.</li> |
| 186 | + * <li>Delete the global {@code MetaData} blob {@code meta-${snapshot-uuid}.dat} stored directly under the repository root for the snapshot |
| 187 | + * as well as the {@code SnapshotInfo} blob at {@code /snap-${snapshot-uuid}.dat}.</li> |
| 188 | + * <li>For each index referenced by the snapshot: |
| 189 | + * <ol> |
| 190 | + * <li>Delete the snapshot's {@code IndexMetaData} at {@code /indices/${index-snapshot-uuid}/meta-${snapshot-uuid}.dat}.</li> |
| 191 | + * <li>Go through all shard directories {@code /indices/${index-snapshot-uuid}/${i}} and: |
| 192 | + * <ol> |
| 193 | + * <li>Remove the {@code BlobStoreIndexShardSnapshot} blob at {@code /indices/${index-snapshot-uuid}/${i}/snap-${snapshot-uuid}.dat}.</li> |
| 194 | + * <li>List all blobs in the shard path {@code /indices/${index-snapshot-uuid}/${i}} and build a new {@code BlobStoreIndexShardSnapshots} from |
| 195 | + * the remaining {@code BlobStoreIndexShardSnapshot} blobs in the shard. Afterwards, write it to the next shard generation blob at |
| 196 | + * {@code /indices/${index-snapshot-uuid}/${i}/index-${N+1}} (The shard's generation is determined from the list of {@code index-N} blobs |
| 197 | + * in the shard directory).</li> |
| 198 | + * <li>Delete all segment blobs (identified by having the data blob prefix {@code __}) in the shard directory which are not referenced by |
| 199 | + * the new {@code BlobStoreIndexShardSnapshots} that has been written in the previous step.</li> |
| 200 | + * </ol> |
| 201 | + * </li> |
| 202 | + * </ol> |
| 203 | + * </li> |
| 204 | + * </ol> |
| 205 | + * TODO: The above sequence of actions can lead to leaking files when an index completely goes out of scope. Adjust this documentation once |
| 206 | + * https://github.com/elastic/elasticsearch/issues/13159 is fixed. |
| 207 | + */ |
| 208 | +package org.elasticsearch.repositories.blobstore; |
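
The generation lookup described under "Getting the Repository's RepositoryData" can be sketched in isolation. The following is a minimal illustration, not the code from this change: the class and method names are hypothetical, a failed or unsupported LIST is modelled by passing `null` for the listing, and the 64-bit value in `index.latest` is assumed to be stored big-endian.

```java
import java.nio.ByteBuffer;
import java.util.Collection;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; not part of Elasticsearch. All names here are hypothetical. */
final class RepositoryGenerationSketch {

    // Matches root-level blobs named index-N; at most 18 digits so N always fits in a long.
    private static final Pattern INDEX_BLOB = Pattern.compile("index-(\\d{1,18})");

    private RepositoryGenerationSketch() {}

    /** Returns the highest N among blob names of the form index-N, or empty if there are none. */
    static OptionalLong latestFromListing(Collection<String> rootBlobNames) {
        return rootBlobNames.stream()
            .map(INDEX_BLOB::matcher)
            .filter(Matcher::matches)
            .mapToLong(m -> Long.parseLong(m.group(1)))
            .max();
    }

    /** Decodes the content of index.latest; big-endian byte order is an assumption of this sketch. */
    static long decodeIndexLatest(byte[] content) {
        if (content.length != Long.BYTES) {
            throw new IllegalArgumentException("expected 8 bytes but got " + content.length);
        }
        return ByteBuffer.wrap(content).getLong();
    }

    /**
     * Resolves the latest repository generation.
     *
     * @param rootBlobNamesOrNull names of the blobs under the repository root,
     *                            or null if listing is not supported
     * @param indexLatestOrNull   content of the index.latest blob, or null if it does not exist
     * @return the latest generation, or empty if the repository is assumed to be empty
     */
    static OptionalLong resolve(Collection<String> rootBlobNamesOrNull, byte[] indexLatestOrNull) {
        if (rootBlobNamesOrNull != null) {
            // Preferred path: pick the highest N from the listed index-N blobs.
            return latestFromListing(rootBlobNamesOrNull);
        }
        if (indexLatestOrNull != null) {
            // Fallback for repositories that cannot list blobs (for example read-only ones).
            return OptionalLong.of(decodeIndexLatest(indexLatestOrNull));
        }
        // Neither source yielded a generation: treat the repository as empty.
        return OptionalLong.empty();
    }
}
```

An empty result corresponds to the case in which the documentation says `RepositoryData#EMPTY` is returned. Otherwise the caller would go on to read and deserialize `/index-N` for the resolved `N`.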