-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Add Package Level JavaDoc on Snapshots #38108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Package Level JavaDoc on Snapshots #38108
Conversation
Pinging @elastic/es-distributed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thank you! @DaveCTurner can you have a look as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work. I think it's very useful to have this written down, and this is a very clear explanation. I left a few suggestions for gaps to fill.
Could you try and change many of the @code
tags into @link
ones? I think you should be able to remove many of the package qualifiers by adding import
statements.
Could you reflow the text to some fixed width (maybe 140 characters)? The right margin is rather ragged.
server/src/main/java/org/elasticsearch/snapshots/package-info.java
Outdated
Show resolved
Hide resolved
* <h2>Deleting a Snapshot from a Repository</h2> | ||
* | ||
* <ol> | ||
* <li>Assuming there are no entries in the cluster-state's {@code SnapshotsInProgress}, deleting a snapshot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd also be interesting to hear how we delete a snapshot after aborting it, i.e. what if this Assuming
is not true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is covered in the paragraph above, that's why I put Aborting a Snapshot
as a subtopic of Deleting a Snapshot
. Should I move things around a bit to make it clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The detail I felt we were missing is how we achieve that "after" relationship. Aborting a snapshot isn't atomic, it requires acks from all the shards first, so how do we know to delete the snapshot after aborting it? It looks like we add a listener on the current master so all is good as long as we don't elect a new master first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea true, the delete after abort will not happen if we have a master failover. Then the abort will go through clean but no delete will happen subsequently. That's something we could maybe fix I guess but I'm not sure it's worth the risk (at least at this point in time).
* </li> | ||
* | ||
* <li>After the snapshot's entry with state {@code INIT} is in the cluster-state, | ||
* {@link org.elasticsearch.snapshots.SnapshotsService} determines the primary shards' assignments for all indices that are being |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This raise the question of what happens if a primary shard moves in between the creation of the SnapshotsInProgress
entry and the time when it is populated by the SnapshotsService
. It'd be good to hear more on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a sentence on that in a699b72, hope it helps. Relocation between INIT
and STARTED
isn't really an issue since the assignment of nodes and moving to STARTED
happens in the same ClusterStateUpdateTask
(Maybe make that more explicit in the docs? I opted not to since it seemed too detailed here but maybe not?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, the addition is helpful. Is it right that we do not start to gracefully relocate a primary while it is being snapshotted (i.e. we have SnapshotInProgressAllocationDecider
), so that if the primary moves later then it's a genuine failure?
What do we do with a primary that is already being relocated when the snapshot starts? Do we use the relocation target (which is initialising, so goes into the WAITING
state)?
I'm still getting to grips with all of this, so perhaps I'm going into too much detail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it right that we do not start to gracefully relocate a primary while it is being snapshotted (i.e. we have
SnapshotInProgressAllocationDecider
), so that if the primary moves later then it's a genuine failure?
Yes, that's exactly right :) I felt like it was more confusing than helpful to add the "not graceful" qualifier to the text, but maybe I'm wrong?
Yea exactly, we have this in SnapshotsService
:
What do we do with a primary that is already being relocated when the snapshot starts? Do we use the relocation target (which is initialising, so goes into the WAITING
state)?
} else if (primary.relocating() || primary.initializing()) {
builder.put(shardId, new SnapshotsInProgress.ShardSnapshotStatus(primary.currentNodeId(), State.WAITING));
The problem with that is, that I can't add an |
@DaveCTurner @ywelsch thanks for taking a look, I fixed or commented on the points by @DaveCTurner => take another look when you have some time :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* Add Package Level JavaDoc on Snapshots
* Added verbose documentation for the `o.e.r.blobstore` package similar to that added for the snapshot package in elastic#38108 * Moved the documentation on the BlobStoreRepository to the package level to have things in a single place for easier readability.
* Add Package Level Documentation to o.e.r.blobstore * Added verbose documentation for the `o.e.r.blobstore` package similar to that added for the snapshot package in #38108 * Moved the documentation on the BlobStoreRepository to the package level to have things in a single place for easier readability.
* Add Package Level Documentation to o.e.r.blobstore * Added verbose documentation for the `o.e.r.blobstore` package similar to that added for the snapshot package in elastic#38108 * Moved the documentation on the BlobStoreRepository to the package level to have things in a single place for easier readability.
As discussed in October, gave it a shot adding some basic JavaDoc on how the snapshot operation works to the package.
I would've found it helpful but let's see if others agree that this is the kind of docs, level of verbosity we're interested in :)