-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Add Package Level Documentation to o.e.r.blobstore #42101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Package Level Documentation to o.e.r.blobstore #42101
Conversation
* Added verbose documentation for the `o.e.r.blobstore` package similar to that added for the snapshot package in elastic#38108 * Moved the documentation on the BlobStoreRepository to the package level to have things in a single place for easier readability.
Pinging @elastic/es-distributed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Armin, I left a few small suggestions.
* | ||
* <p>The {@link org.elasticsearch.repositories.blobstore.BlobStoreRepository} forms the basis of implementations of | ||
* {@link org.elasticsearch.repositories.Repository} on top of a blob store. A blobstore can be used as the basis for an implementation | ||
* as long as it provides for GET, PUT and (except for in the case of read-only repositories) LIST operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This says that LIST
is not required on read-only repositories, but PUT
is. That sounds like it should be the other way round.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually neither is required for a read-only repo :) You only need GET for those, will adjust accordingly :)
server/src/main/java/org/elasticsearch/repositories/blobstore/package-info.java
Outdated
Show resolved
Hide resolved
* <li> | ||
* <ol> | ||
* <li>The blobstore repository stores the {@code RepositoryData} in blobs named with incrementing suffix {@code N} at {@code /index-N} | ||
* directly under the repositories' root.</li> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/repositories'/repository's/
here and I think in most other places too, because we're only really talking about a single repository here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair point :) fixed
* | ||
* <p>The blob store is written to and read from by both the master- as well as the data-nodes. All metadata related to a snapshots' | ||
* scope and health (i.e. the indices a snapshot contains, a snapshot of the cluster state, index metadata and the status of the snapshot) | ||
* are written by the master node.</p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the above sentence could be clearer if it were a few shorter sentences instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just dropped the part in brackets here now, it seemed really redundant? Given that the text below eventually just links to the classes that get serialized here anyway?
…package-info.java Co-Authored-By: David Turner <[email protected]>
…lasticsearch into document-blob-store
thanks for taking a look @DaveCTurner, all points addressed now I think :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrershov should have a look here as well before merging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@original-brownbear thanks for your PR, this documentation is really useful! I've asked for some clarifications/elaborations.
server/src/main/java/org/elasticsearch/repositories/blobstore/package-info.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/repositories/blobstore/package-info.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/repositories/blobstore/package-info.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/repositories/blobstore/package-info.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/repositories/blobstore/package-info.java
Show resolved
Hide resolved
* {@link org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardSnapshot} with name {@code snap-${snapshot-uuid}.dat} is written to | ||
* the shard's path and contains a list of all the files referenced by the snapshot as well as some metadata about the snapshot. See the | ||
* documentation of {@code BlobStoreIndexShardSnapshot} for details on its contents.</li> | ||
* <li>Once all the segments and the {@code BlobStoreIndexShardSnapshot} blob have been written, an updated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we elaborate here why we need both BlobStoreIndexShardSnapshot and BlobStoreIndexShardSnapshots? AFAIK, it's needed exclusively for delete snapshot operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We currently use BlobStoreIndexShardSnapshots
to determine the segments to upload when creating incremental snapshots of a shard as well as what to delete.
Technically speaking BlobStoreIndexShardSnapshots
is redundant as you could rebuild it from all the BlobStoreIndexShardSnapshot
blobs. In practice this might get tricky since it implies downloading potentially hundreds of them (you have exactly as many as you have snapshots referencing the shard).
I think I'd really just leave the details of what is in BlobStoreIndexShardSnapshot
precisely to the link to that class (it contains some additional metadata about the time it took to snapshot the shard that we expose via an API). It seems to me it's easier to read this by not mentioning the duplication of data here, that just has the reader wondering why (and it's a fair question :D) when understanding that the BlobStoreIndexShardSnapshots
holds the list of segments is all that really matters for the big picture here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
@andrershov all addressed I think :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Jenkins run |
thanks @andrershov @ywelsch @DaveCTurner ! |
* Add Package Level Documentation to o.e.r.blobstore * Added verbose documentation for the `o.e.r.blobstore` package similar to that added for the snapshot package in elastic#38108 * Moved the documentation on the BlobStoreRepository to the package level to have things in a single place for easier readability.
o.e.r.blobstore
package similar tothat added for the snapshot package in
Add Package Level JavaDoc on Snapshots #38108
level to have things in a single place for easier readability.