Add Repository Consistency Assertion to SnapshotResiliencyTests #40857
* Add some quick validation on not leaving behind any dangling metadata or dangling indices to the snapshot resiliency tests
* Added todo about expanding this assertion further

Pinging @elastic/es-distributed
     * Assert that there are no unreferenced indices or unreferenced root-level metadata blobs in any repository.
     * TODO: Expand the logic here to also check for unreferenced segment blobs and shard level metadata
     */
    private void assertNoStaleRepositoryData() throws IOException {
I prefer to split this method into separate methods and give them reasonable names, such as assertSnapshotUUIDs, assertIndexUUIDs, etc.
sure extracted those methods :)
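
For readers following this thread, here is a minimal sketch of how such a check might be split into small, separately named helpers. It is not the code from this PR. The class name, the method names, and the assumed repository layout (an `indices/` folder with one directory per index, and root-level `snap-<uuid>.dat` and `meta-<uuid>.dat` blobs) are illustrative assumptions, and the sets of referenced ids would in practice come from the repository's own metadata.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these names come from the Elasticsearch code base.
final class RepoConsistencySketch {

    /** Lists the names of the entries directly under dir (empty if dir does not exist). */
    static Set<String> listNames(Path dir) throws IOException {
        Set<String> names = new TreeSet<>();
        if (Files.isDirectory(dir) == false) {
            return names;
        }
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                names.add(entry.getFileName().toString());
            }
        }
        return names;
    }

    /** Index directories found on disk that no snapshot references. */
    static Set<String> unreferencedIndexDirs(Set<String> foundDirNames, Set<String> referencedIndexIds) {
        Set<String> stale = new TreeSet<>(foundDirNames);
        stale.removeAll(referencedIndexIds);
        return stale;
    }

    /** Root-level snap-/meta- blobs whose UUID is not referenced (assumed naming scheme). */
    static Set<String> unreferencedRootBlobs(Set<String> rootBlobNames, Set<String> referencedSnapshotUuids) {
        Set<String> stale = new TreeSet<>();
        for (String name : rootBlobNames) {
            for (String prefix : new String[] {"snap-", "meta-"}) {
                if (name.startsWith(prefix) && name.endsWith(".dat")) {
                    String uuid = name.substring(prefix.length(), name.length() - ".dat".length());
                    if (referencedSnapshotUuids.contains(uuid) == false) {
                        stale.add(name);
                    }
                }
            }
        }
        return stale;
    }
}
```

A test could then call `listNames(repoRoot.resolve("indices"))` and `listNames(repoRoot)` and assert that both helper results are empty, which keeps each assertion small and individually named.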
        testClusterNodes.nodes.values().forEach(TestClusterNode::stop);
    public void stopServices() throws IOException {
        try {
            assertNoStaleRepositoryData();
Can we have two `After` methods: the first one `stopServices`, the second one `assertNoStaleRepositoryData`?
Even if we can have only one `After` method, it probably makes sense to rename it.
Renamed it. I can't really use another `After` here, since I need to validate consistency before closing the services, and closing them deletes the files.

@andrershov thanks for taking a look, addressed both points :)
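
The ordering constraint described here can be sketched without any test framework. This is a rough illustration under stated assumptions, not the actual test class: the class name, the `repoFilesPresent` flag, and the combined method name `verifyRepositoryAndStopServices` are hypothetical (the thread does not say what the method was renamed to).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a single teardown that verifies first and always stops afterwards.
final class TeardownOrderSketch {
    final List<String> events = new ArrayList<>();
    boolean repoFilesPresent = true;

    // Stand-in for the consistency check: it needs the repository files to still exist.
    void assertNoStaleRepositoryData() {
        if (repoFilesPresent == false) {
            throw new IllegalStateException("repository files were already deleted");
        }
        events.add("verified");
    }

    // Stand-in for stopping the services, which (per the thread) deletes the files.
    void stopServices() {
        repoFilesPresent = false;
        events.add("stopped");
    }

    // In a JUnit test this one method would carry the teardown annotation.
    void verifyRepositoryAndStopServices() {
        try {
            assertNoStaleRepositoryData();
        } finally {
            stopServices(); // runs even if the assertion above fails
        }
    }
}
```

Keeping both steps in one method makes the order explicit, and the `finally` block means the services are still stopped when the consistency check fails.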
LGTM
thanks @andrershov
…tic#40857) * Add Repository Consistency Assertion to SnapshotResiliencyTests * Add some quick validation on not leaving behind any dangling metadata or dangling indices to the snapshot resiliency tests * Added todo about expanding this assertion further
* Add Repository Consistency Assertion to SnapshotResiliencyTests (#40857)
  * Add Repository Consistency Assertion to SnapshotResiliencyTests
  * Add some quick validation on not leaving behind any dangling metadata or dangling indices to the snapshot resiliency tests
  * Added todo about expanding this assertion further
* Fix SnapshotResiliencyTest Repo Consistency Check (#41332)
  * Fix SnapshotResiliencyTest Repo Consistency Check
  * Due to the random creation of an empty `extra0` file by the Lucene mockFS we see broken tests because we use the existence of an index folder in assertions and the index deletion doesn't go through if there are extra files in an index folder
  * Fixed by removing the `extra0` file and resulting empty directory trees before asserting repo consistency
  * Closes #41326
* Reenable SnapshotResiliency Test (#41437)
  This was fixed in #41332 but I forgot to reenable the test.
* fix compile on java8
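
The cleanup step mentioned in #41332 (removing the `extra0` file and the resulting empty directory trees before asserting) could look roughly like the sketch below. It uses only `java.nio.file`, and the class and method names are hypothetical. The actual fix may well differ.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical sketch: delete stray extra0 files, then prune directories left empty.
final class MockFsCleanupSketch {

    static void removeExtraFilesAndEmptyDirs(final Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                // Only the file name from the commit message is matched here.
                if (file.getFileName().toString().equals("extra0")) {
                    Files.delete(file);
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                if (exc != null) {
                    throw exc;
                }
                // Children are visited first, so empty trees collapse bottom-up.
                if (dir.equals(root) == false && isEmpty(dir)) {
                    Files.delete(dir);
                }
                return FileVisitResult.CONTINUE;
            }
        });
    }

    private static boolean isEmpty(Path dir) throws IOException {
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            return stream.iterator().hasNext() == false;
        }
    }
}
```

Running this before the consistency assertion means an index folder that holds nothing but `extra0` no longer counts as an existing index folder, while folders with real blobs are left untouched.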
.dat there :) I just figured I'd add this real quick as it was easy to add and might help spot issues quicker now that we're refactoring the deletes more and more.