
Add Package Level Documentation to o.e.r.blobstore #42101


Merged
@@ -121,45 +121,9 @@
* <p>
* This repository works with any {@link BlobStore} implementation. The blobStore could be (and preferred) lazy initialized in
* {@link #createBlobStore()}.
* <p>
* BlobStoreRepository maintains the following structure in the blob store
* <pre>
* {@code
* STORE_ROOT
* |- index-N - JSON serialized {@link RepositoryData} containing a list of all snapshot ids and the indices belonging to
* | each snapshot, N is the generation of the file
* |- index.latest - contains the numeric value of the latest generation of the index file (i.e. N from above)
* |- incompatible-snapshots - list of all snapshot ids that are no longer compatible with the current version of the cluster
* |- snap-20131010.dat - SMILE serialized {@link SnapshotInfo} for snapshot "20131010"
* |- meta-20131010.dat - SMILE serialized {@link MetaData} for snapshot "20131010" (includes only global metadata)
* |- snap-20131011.dat - SMILE serialized {@link SnapshotInfo} for snapshot "20131011"
* |- meta-20131011.dat - SMILE serialized {@link MetaData} for snapshot "20131011"
* .....
* |- indices/ - data for all indices
* |- Ac1342-B_x/ - data for index "foo" which was assigned the unique id of Ac1342-B_x in the repository
* | |- meta-20131010.dat - JSON Serialized {@link IndexMetaData} for index "foo"
* | |- 0/ - data for shard "0" of index "foo"
* | | |- __1 \ (files with numeric names were created by older ES versions)
* | | |- __2 |
* | | |- __VPO5oDMVT5y4Akv8T_AO_A |- files from different segments see snap-* for their mappings to real segment files
* | | |- __1gbJy18wS_2kv1qI7FgKuQ |
* | | |- __R8JvZAHlSMyMXyZc2SS8Zg /
* | | .....
* | | |- snap-20131010.dat - SMILE serialized {@link BlobStoreIndexShardSnapshot} for snapshot "20131010"
* | | |- snap-20131011.dat - SMILE serialized {@link BlobStoreIndexShardSnapshot} for snapshot "20131011"
* | | |- index-123 - SMILE serialized {@link BlobStoreIndexShardSnapshots} for the shard
* | |
* | |- 1/ - data for shard "1" of index "foo"
* | | |- __1
* | | .....
* | |
* | |-2/
* | ......
* |
* |- 1xB0D8_B3y/ - data for index "bar" which was assigned the unique id of 1xB0D8_B3y in the repository
* ......
* }
* </pre>
* </p>
* For in depth documentation on how exactly implementations of this class interact with the snapshot functionality please refer to the
* documentation of the package {@link org.elasticsearch.repositories.blobstore}.
*/
public abstract class BlobStoreRepository extends AbstractLifecycleComponent implements Repository {
private static final Logger logger = LogManager.getLogger(BlobStoreRepository.class);
@@ -0,0 +1,207 @@
/*
* Licensed to Elasticsearch under one or more contributor
* license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright
* ownership. Elasticsearch licenses this file to you under
* the Apache License, Version 2.0 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

/**
* <p>This package exposes the blobstore repository used by Elasticsearch Snapshots.</p>
*
* <h1>Preliminaries</h1>
*
* <p>The {@link org.elasticsearch.repositories.blobstore.BlobStoreRepository} forms the basis of implementations of
* {@link org.elasticsearch.repositories.Repository} on top of a blob store. A blobstore can be used as the basis for an implementation
* as long as it provides for GET, PUT and (except for in the case of read-only repositories) LIST operations.
Contributor

This says that LIST is not required on read-only repositories, but PUT is. That sounds like it should be the other way round.

Member Author

Actually neither is required for a read-only repo :) You only need GET for those, will adjust accordingly :)

* These operations are formally defined as specified by the {@link org.elasticsearch.common.blobstore.BlobContainer} interface that
* any {@code BlobStoreRepository} implementation must provide via its implementation of
* {@link org.elasticsearch.repositories.blobstore.BlobStoreRepository#getBlobContainer()}.</p>
*
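* <p>For illustration, the sketches further down in this documentation use the following simplified stand-in for these operations. It is
* not the actual {@code BlobContainer} API, just a condensed model of the operations discussed here, which the later examples reuse:</p>
* <pre>
* {@code
* // Illustrative only: a simplified stand-in for the blob store operations discussed above.
* // The authoritative contract is the BlobContainer interface linked above; the method names
* // and signatures here are not the actual Elasticsearch API.
* interface SimpleBlobContainer {
*     byte[] get(String blobName) throws java.io.IOException;                        // GET
*     void put(String blobName, byte[] bytes) throws java.io.IOException;            // PUT
*     java.util.List<String> listByPrefix(String prefix) throws java.io.IOException; // LIST
*     void delete(String blobName) throws java.io.IOException;                       // DELETE, used by the snapshot delete flow below
* }
* }
* </pre>
*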
* <p>The blob store is written to and read from by both the master node and the data-nodes. All metadata related to a snapshot's
* scope and health (i.e. the indices a snapshot contains, a snapshot of the cluster state, index metadata and the status of the snapshot)
* is written by the master node.</p>
Contributor

I think the above sentence could be clearer if it were a few shorter sentences instead.

Member Author

I just dropped the part in brackets here now, it seemed really redundant? Given that the text below eventually just links to the classes that get serialized here anyway?

* <p>The data-nodes, on the other hand, write the data for each individual shard, but do not write any blobs outside of the directories
* of the shards for which they hold the primary. For each shard, the data-node holding the shard's primary writes the actual data in the
* form of the shard's segment files to the repository, as well as metadata about all the segment files that the repository stores for
* the shard.</p>
*
* <p>For the specifics on how the operations on the repository documented below are invoked during the snapshot process please refer to
* the documentation of the {@link org.elasticsearch.snapshots} package.</p>
*
* <p>{@code BlobStoreRepository} maintains the following structure of blobs containing data and metadata in the blob store. The exact
* operations executed on these blobs are explained below.</p>
* <pre>
* {@code
* STORE_ROOT
* |- index-N - JSON serialized {@link org.elasticsearch.repositories.RepositoryData} containing a list of all snapshot ids
* | and the indices belonging to each snapshot, N is the generation of the file
* |- index.latest - contains the numeric value of the latest generation of the index file (i.e. N from above)
* |- incompatible-snapshots - list of all snapshot ids that are no longer compatible with the current version of the cluster
* |- snap-20131010.dat - SMILE serialized {@link org.elasticsearch.snapshots.SnapshotInfo} for snapshot "20131010"
* |- meta-20131010.dat - SMILE serialized {@link org.elasticsearch.cluster.metadata.MetaData} for snapshot "20131010"
* | (includes only global metadata)
* |- snap-20131011.dat - SMILE serialized {@link org.elasticsearch.snapshots.SnapshotInfo} for snapshot "20131011"
* |- meta-20131011.dat - SMILE serialized {@link org.elasticsearch.cluster.metadata.MetaData} for snapshot "20131011"
* .....
* |- indices/ - data for all indices
* |- Ac1342-B_x/ - data for index "foo" which was assigned the unique id of Ac1342-B_x in the repository
* | |- meta-20131010.dat - JSON Serialized {@link org.elasticsearch.cluster.metadata.IndexMetaData} for index "foo"
* | |- 0/ - data for shard "0" of index "foo"
* | | |- __1 \ (files with numeric names were created by older ES versions)
* | | |- __2 |
* | | |- __VPO5oDMVT5y4Akv8T_AO_A |- files from different segments see snap-* for their mappings to real segment files
* | | |- __1gbJy18wS_2kv1qI7FgKuQ |
* | | |- __R8JvZAHlSMyMXyZc2SS8Zg /
* | | .....
* | | |- snap-20131010.dat - SMILE serialized {@link org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardSnapshot} for
* | | | snapshot "20131010"
* | | |- snap-20131011.dat - SMILE serialized {@link org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardSnapshot} for
* | | | snapshot "20131011"
* | | |- index-123 - SMILE serialized {@link org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardSnapshots} for
* | | | the shard
* | |
* | |- 1/ - data for shard "1" of index "foo"
* | | |- __1
* | | .....
* | |
* | |-2/
* | ......
* |
* |- 1xB0D8_B3y/ - data for index "bar" which was assigned the unique id of 1xB0D8_B3y in the repository
* ......
* }
* </pre>
*
* <h1>Getting the Repository's RepositoryData</h1>
*
* <p>Loading the {@link org.elasticsearch.repositories.RepositoryData} that holds the list of all snapshots as well as the mapping of
* indices' names to their repository {@link org.elasticsearch.repositories.IndexId} is done by invoking
* {@link org.elasticsearch.repositories.blobstore.BlobStoreRepository#getRepositoryData} and is implemented as follows (a condensed
* sketch of this lookup follows the numbered steps below):</p>
* <ol>
* <li>
* <ol>
* <li>The blobstore repository stores the {@code RepositoryData} in blobs named with incrementing suffix {@code N} at {@code /index-N}
* directly under the repositories' root.</li>
Contributor

s/repositories'/repository's/ here and I think in most other places too, because we're only really talking about a single repository here.

Member Author

fair point :) fixed

* <li>The blobstore also stores the most recent {@code N} as a 64bit long in the blob {@code /index.latest} directly under the
* repositories' root.</li>
* </ol>
* </li>
* <li>
* <ol>
* <li>First, find the most recent {@code RepositoryData} by getting a list of all index-N blobs through listing all blobs with prefix
* "index-" under the repository root and then selecting the one with the highest value for N.</li>
* <li>If this operation fails because the repositories' {@code BlobContainer} does not support list operations (in the case of read-only
* repositories), read the highest value of N from the index.latest blob.</li>
* </ol>
* </li>
* <li>
* <ol>
* <li>Use the just determined value of {@code N} and get the {@code /index-N} blob and deserialize the {@code RepositoryData} from it.</li>
* <li>If no value of {@code N} could be found since neither an {@code index.latest} nor any {@code index-N} blobs exist in the repository,
* it is assumed to be empty and {@link org.elasticsearch.repositories.RepositoryData#EMPTY} is returned.</li>
* </ol>
* </li>
* </ol>
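* <p>The following is a minimal, illustrative sketch of this lookup, reusing the simplified {@code SimpleBlobContainer} from the
* Preliminaries section. A container that does not support LIST is modeled here by an {@code UnsupportedOperationException}, and the
* encoding of {@code index.latest} is simplified to raw big-endian bytes:</p>
* <pre>
* {@code
* // Sketch only: resolve the latest repository generation N as described in the steps above.
* long latestGeneration(SimpleBlobContainer root) throws java.io.IOException {
*     long latest = -1;                                          // -1 means "no index-N blob found"
*     try {
*         // 2.1: list all index-${N} blobs under the repository root and keep the highest N
*         for (String blobName : root.listByPrefix("index-")) {
*             latest = Math.max(latest, Long.parseLong(blobName.substring("index-".length())));
*         }
*     } catch (UnsupportedOperationException listNotSupported) {
*         // 2.2: a read-only repository may not support LIST, so fall back to the index.latest blob
*         latest = java.nio.ByteBuffer.wrap(root.get("index.latest")).getLong();
*     }
*     // 3.1: the caller now loads and deserializes /index-${latest}, or
*     // 3.2: treats the repository as empty (RepositoryData.EMPTY) if no generation could be found
*     return latest;
* }
* }
* </pre>
*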
* <h1>Creating a Snapshot</h1>
*
* <p>Creating a snapshot in the repository happens in the three steps described in detail below.</p>
*
* <h2>Initializing a Snapshot in the Repository</h2>
*
* <p>Creating a snapshot in the repository starts with a call to {@link org.elasticsearch.repositories.Repository#initializeSnapshot},
* which the blob store repository implements via the following actions (a condensed sketch follows the list below):</p>
* <ol>
* <li>Verify that no snapshot by the requested name exists.</li>
* <li>Write a blob containing the cluster metadata to the root of the blob store repository at {@code /meta-${snapshot-uuid}.dat}</li>
* <li>Write the metadata for each index to a blob in that index's directory at
* {@code /indices/${index-snapshot-uuid}/meta-${snapshot-uuid}.dat}</li>
* </ol>
* TODO: This behavior is problematic, adjust these docs once https://github.com/elastic/elasticsearch/issues/41581 is fixed
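*
* <p>A condensed sketch of these initialization steps, again using {@code SimpleBlobContainer}. Serialization is reduced to raw
* {@code byte[]} values, blob names include their full path below the repository root for brevity, and the set of existing snapshot
* names is assumed to come from the {@code RepositoryData} loaded as described in the previous section:</p>
* <pre>
* {@code
* // Illustrative sketch only, not the actual implementation.
* void initializeSnapshot(SimpleBlobContainer root, String snapshotName, String snapshotUuid,
*                         java.util.Set<String> existingSnapshotNames, byte[] clusterMetaData,
*                         java.util.Map<String, byte[]> indexMetaDataByIndexUuid) throws java.io.IOException {
*     // 1. verify that no snapshot by the requested name exists
*     if (existingSnapshotNames.contains(snapshotName)) {
*         throw new IllegalStateException("snapshot with name [" + snapshotName + "] already exists");
*     }
*     // 2. write the global cluster MetaData blob to /meta-${snapshot-uuid}.dat at the repository root
*     root.put("meta-" + snapshotUuid + ".dat", clusterMetaData);
*     // 3. write each index's IndexMetaData to /indices/${index-snapshot-uuid}/meta-${snapshot-uuid}.dat
*     for (java.util.Map.Entry<String, byte[]> index : indexMetaDataByIndexUuid.entrySet()) {
*         root.put("indices/" + index.getKey() + "/meta-" + snapshotUuid + ".dat", index.getValue());
*     }
* }
* }
* </pre>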
*
* <h2>Writing Shard Data (Segments)</h2>
*
* <p>Once all the metadata has been written by the snapshot initialization, the snapshot process moves on to writing the actual shard data
* to the repository by invoking {@link org.elasticsearch.repositories.Repository#snapshotShard} on the data-nodes that hold the primaries
* for the shards in the current snapshot. It is implemented as follows (a condensed sketch follows the numbered steps below):</p>
*
* <p>Note:</p>
* <ul>
* <li>For each shard {@code i} in a given index, its path in the blob store is located at {@code /indices/${index-snapshot-uuid}/${i}}</li>
* <li>All the following steps are executed exclusively on the shard's primary's data node.</li>
* </ul>
*
* <ol>
* <li>Create the {@link org.apache.lucene.index.IndexCommit} for the shard to snapshot.</li>
* <li>List all blobs in the shard's path. Find the {@link org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardSnapshots} blob
* with name {@code index-${N}} for the highest possible value of {@code N} in the list to determine which segment files are
* already available in the blobstore.</li>
* <li>By comparing the files in the {@code IndexCommit} and the available file list from the previous step, determine the segment files
* that need to be written to the blob store. For each segment that needs to be added to the blob store, generate a unique name by combining
* the segment data blob prefix {@code __} and a UUID and write the segment to the blobstore.</li>
* <li>After completing all segment writes, a blob containing a
* {@link org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardSnapshot} with name {@code snap-${snapshot-uuid}.dat} is written to
* the shard's path and contains a list of all the files referenced by the snapshot as well as some metadata about the snapshot. See the
* documentation of {@code BlobStoreIndexShardSnapshot} for details on its contents.</li>
* <li>Once all the segments and the {@code BlobStoreIndexShardSnapshot} blob have been written, an updated
Contributor

Can we elaborate here why we need both BlobStoreIndexShardSnapshot and BlobStoreIndexShardSnapshots? AFAIK, it's needed exclusively for delete snapshot operations.

Member Author

We currently use BlobStoreIndexShardSnapshots to determine the segments to upload when creating incremental snapshots of a shard as well as what to delete.
Technically speaking BlobStoreIndexShardSnapshots is redundant as you could rebuild it from all the BlobStoreIndexShardSnapshot blobs. In practice this might get tricky since it implies downloading potentially hundreds of them (you have exactly as many as you have snapshots referencing the shard).
I think I'd really just leave the details of what is in BlobStoreIndexShardSnapshot precisely to the link to that class (it contains some additional metadata about the time it took to snapshot the shard that we expose via an API). It seems to me it's easier to read this by not mentioning the duplication of data here, that just has the reader wondering why (and it's a fair question :D) when understanding that the BlobStoreIndexShardSnapshots holds the list of segments is all that really matters for the big picture here?

Contributor

Ok

* {@code BlobStoreIndexShardSnapshots} blob is written to the shard's path with name {@code index-${N+1}}.</li>
* </ol>
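* <p>The following sketch illustrates the incremental upload logic above for a single shard. {@code commitFiles} models the files of the
* {@code IndexCommit} from step 1 as a map from file name to contents, while {@code readFileToBlobMap} and {@code smile} are hypothetical
* placeholders for deserializing the existing {@code BlobStoreIndexShardSnapshots} and for the SMILE serialization respectively:</p>
* <pre>
* {@code
* // Illustrative sketch only; the shard container addresses /indices/${index-snapshot-uuid}/${i}.
* void snapshotShard(SimpleBlobContainer shard, String snapshotUuid,
*                    java.util.Map<String, byte[]> commitFiles) throws java.io.IOException {
*     // 2. find the highest existing generation N among the index-${N} blobs in the shard's path
*     long gen = -1;
*     for (String blobName : shard.listByPrefix("index-")) {
*         gen = Math.max(gen, Long.parseLong(blobName.substring("index-".length())));
*     }
*     // the BlobStoreIndexShardSnapshots read from index-${gen} is modeled as a map from segment
*     // file name to the __${uuid} blob that stores it (readFileToBlobMap is a hypothetical helper)
*     java.util.Map<String, String> fileToBlob = new java.util.HashMap<>();
*     if (gen >= 0) {
*         fileToBlob.putAll(readFileToBlobMap(shard, gen));
*     }
*     // 3. upload only those segment files that are not yet present in the blob store
*     for (java.util.Map.Entry<String, byte[]> file : commitFiles.entrySet()) {
*         if (fileToBlob.containsKey(file.getKey()) == false) {
*             String blobName = "__" + java.util.UUID.randomUUID();
*             shard.put(blobName, file.getValue());
*             fileToBlob.put(file.getKey(), blobName);
*         }
*     }
*     // 4. write the BlobStoreIndexShardSnapshot for this snapshot as snap-${snapshot-uuid}.dat
*     shard.put("snap-" + snapshotUuid + ".dat", smile(commitFiles.keySet()));
*     // 5. write the updated BlobStoreIndexShardSnapshots for the whole shard as index-${N+1}
*     shard.put("index-" + (gen + 1), smile(fileToBlob));
* }
* }
* </pre>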
*
* <h2>Finalizing the Snapshot</h2>
*
* <p>After all primaries have finished writing the necessary segment files to the blob store in the previous step, the master node moves on
* to finalizing the snapshot by invoking {@link org.elasticsearch.repositories.Repository#finalizeSnapshot}. This method executes the
* following actions in order (a condensed sketch follows the list below):</p>
* <ol>
* <li>Write the {@link org.elasticsearch.snapshots.SnapshotInfo} blob for the given snapshot to the key {@code /snap-${snapshot-uuid}.dat}
* directly under the repository root.</li>
* <li>Write an updated {@code RepositoryData} blob to the key {@code /index-${N+1}} using the {@code N} determined when initializing the
* snapshot in the first step. When doing this, the implementation checks that the blob for generation {@code N + 1} has not yet been
* written to prevent concurrent updates to the repository. If the blob for {@code N + 1} already exists the execution of finalization
* stops under the assumption that a master failover occurred and the snapshot has already been finalized by the new master.</li>
* <li>Write the updated {@code /index.latest} blob containing the new repository generation {@code N + 1}.</li>
* </ol>
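* <p>A minimal sketch of these finalization steps, with the serialized {@code SnapshotInfo} and the updated {@code RepositoryData} passed
* in as raw bytes and the {@code index.latest} encoding simplified to raw big-endian bytes:</p>
* <pre>
* {@code
* // Illustrative sketch only, not the actual implementation.
* void finalizeSnapshot(SimpleBlobContainer root, String snapshotUuid, byte[] snapshotInfo,
*                       byte[] updatedRepositoryData, long expectedGen) throws java.io.IOException {
*     // 1. write the SnapshotInfo blob for the snapshot directly under the repository root
*     root.put("snap-" + snapshotUuid + ".dat", snapshotInfo);
*     // 2. write the updated RepositoryData as index-${N+1}, but only if that blob does not exist
*     //    yet - if it does, a new master has presumably finalized this snapshot already and we stop
*     String nextGenBlob = "index-" + (expectedGen + 1);
*     if (root.listByPrefix(nextGenBlob).contains(nextGenBlob)) {
*         return;
*     }
*     root.put(nextGenBlob, updatedRepositoryData);
*     // 3. point index.latest at the new generation N+1
*     root.put("index.latest", java.nio.ByteBuffer.allocate(Long.BYTES).putLong(expectedGen + 1).array());
* }
* }
* </pre>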
*
* <h1>Deleting a Snapshot</h1>
*
* <p>Deleting a snapshot is an operation that is exclusively executed on the master node and that runs through the following sequence of
* actions when {@link org.elasticsearch.repositories.blobstore.BlobStoreRepository#deleteSnapshot} is invoked (a condensed sketch follows
* the list below):</p>
*
* <ol>
* <li>Get the current {@code RepositoryData} from the latest {@code index-N} blob at the repository root.</li>
* <li>Write an updated {@code RepositoryData} blob with the deleted snapshot removed to key {@code /index-${N+1}} directly under the
* repository root.</li>
* <li>Write an updated {@code index.latest} blob containing {@code N + 1}.</li>
* <li>Delete the global {@code MetaData} blob {@code meta-${snapshot-uuid}.dat} stored directly under the repository root for the snapshot
* as well as the {@code SnapshotInfo} blob at {@code /snap-${snapshot-uuid}.dat}.</li>
* <li>For each index referenced by the snapshot:
* <ol>
* <li>Delete the snapshot's {@code IndexMetaData} at {@code /indices/${index-snapshot-uuid}/meta-${snapshot-uuid}}.</li>
* <li>Go through all shard directories {@code /indices/${index-snapshot-uuid}/${i}} and:
* <ol>
* <li>Remove the {@code BlobStoreIndexShardSnapshot} blob at {@code /indices/${index-snapshot-uuid}/${i}/snap-${snapshot-uuid}.dat}.</li>
* <li>List all blobs in the shard path {@code /indices/${index-snapshot-uuid}/${i}} and build a new {@code BlobStoreIndexShardSnapshots}
* from the remaining {@code BlobStoreIndexShardSnapshot} blobs in the shard. Afterwards, write it to the next shard generation blob at
* {@code /indices/${index-snapshot-uuid}/${i}/index-${N+1}} (the shard's generation is determined from the list of {@code index-N} blobs
* in the shard directory).</li>
* <li>Delete all segment blobs (identified by having the data blob prefix {@code __}) in the shard directory which are not referenced by
* the new {@code BlobStoreIndexShardSnapshots} that has been written in the previous step.</li>
* </ol>
* </li>
* </ol>
* </li>
* </ol>
* TODO: The above sequence of actions can lead to leaking files when an index completely goes out of scope. Adjust this documentation once
* https://github.com/elastic/elasticsearch/issues/13159 is fixed.
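*
* <p>A condensed sketch of this delete flow, with the current generation, the updated {@code RepositoryData} (the deleted snapshot already
* removed) and the indices and shards referenced by the snapshot passed in as parameters; the per-shard rebuild of
* {@code BlobStoreIndexShardSnapshots} is only indicated by a comment:</p>
* <pre>
* {@code
* // Illustrative sketch only, not the actual implementation.
* void deleteSnapshot(SimpleBlobContainer root, String snapshotUuid, long currentGen,
*                     byte[] updatedRepositoryData,
*                     java.util.Map<String, java.util.List<Integer>> shardsByIndexUuid) throws java.io.IOException {
*     // 1. the current RepositoryData and generation N were read as described above and are passed
*     //    in here as updatedRepositoryData (with the snapshot removed) and currentGen
*     // 2. + 3. write the updated RepositoryData as index-${N+1} and bump index.latest
*     root.put("index-" + (currentGen + 1), updatedRepositoryData);
*     root.put("index.latest", java.nio.ByteBuffer.allocate(Long.BYTES).putLong(currentGen + 1).array());
*     // 4. delete the snapshot's global MetaData and SnapshotInfo blobs at the repository root
*     root.delete("meta-" + snapshotUuid + ".dat");
*     root.delete("snap-" + snapshotUuid + ".dat");
*     // 5. for each index referenced by the snapshot, delete its IndexMetaData and clean up each shard
*     for (java.util.Map.Entry<String, java.util.List<Integer>> index : shardsByIndexUuid.entrySet()) {
*         String indexPath = "indices/" + index.getKey() + "/";
*         root.delete(indexPath + "meta-" + snapshotUuid + ".dat");
*         for (int shardId : index.getValue()) {
*             String shardPath = indexPath + shardId + "/";
*             root.delete(shardPath + "snap-" + snapshotUuid + ".dat");
*             // rebuild the BlobStoreIndexShardSnapshots from the remaining snap-* blobs, write it as
*             // the shard's next index-${N+1} generation and delete every __${uuid} segment blob it
*             // no longer references (details elided here, see the numbered steps above)
*         }
*     }
* }
* }
* </pre>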
*/
package org.elasticsearch.repositories.blobstore;