Unblock blocked repositories after test execution #61703
If a test fails before a repository has been unblocked, cleanup cannot proceed, which makes all subsequent tests fail. Closes elastic#61541
Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)
I like the idea of doing this for shared cluster but I have some doubts about the implementation.
test/framework/src/main/java/org/elasticsearch/snapshots/AbstractSnapshotIntegTestCase.java
@@ -101,6 +102,24 @@ protected Settings nodeSettings(int nodeOrdinal) {
        return Arrays.asList(MockRepository.Plugin.class);
    }

    @After
I like the idea of this, but I'm a little suspicious that this is the right place to put this. Many tests don't reuse the test cluster in which case the work here is unnecessary? Maybe we should make this change in the logic that cleans up the test cluster when its reused instead?
The problem is that @After hooks from AbstractSnapshotIntegTestCase are executed before the hooks in ESIntegTestCase. Is there a way to determine the scope of a cluster? We could bypass this cleanup if the cluster is not reused.
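The ordering described here follows from how JUnit 4 treats inherited hooks: @After methods declared in a subclass run before those declared in its superclasses. The sketch below only models that rule with plain reflection so it runs without JUnit. The `AfterHook` annotation and the `BaseCase`, `SnapshotCase` and `HookOrderDemo` classes are hypothetical stand-ins, not part of the Elasticsearch test framework.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for JUnit's @After, so the sketch needs no JUnit dependency.
@Retention(RetentionPolicy.RUNTIME)
@interface AfterHook {}

// Plays the role of the base integration test class.
class BaseCase {
    static final List<String> LOG = new ArrayList<>();

    @AfterHook
    public void baseCleanup() {
        LOG.add("BaseCase.baseCleanup");
    }
}

// Plays the role of the snapshot-specific subclass.
class SnapshotCase extends BaseCase {
    @AfterHook
    public void assertRepoConsistency() {
        LOG.add("SnapshotCase.assertRepoConsistency");
    }
}

class HookOrderDemo {
    // Walks from the concrete class up towards Object and runs each level's hooks,
    // mirroring the subclass-first order JUnit 4 uses for @After methods.
    static List<String> runAfterHooks(Object test) {
        BaseCase.LOG.clear();
        try {
            for (Class<?> c = test.getClass(); c != Object.class; c = c.getSuperclass()) {
                for (Method m : c.getDeclaredMethods()) {
                    if (m.isAnnotationPresent(AfterHook.class)) {
                        m.invoke(test);
                    }
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(BaseCase.LOG);
    }
}
```

Calling `HookOrderDemo.runAfterHooks(new SnapshotCase())` records the subclass hook first and the base cleanup second, which is why a consistency check in the subclass can hit a still-blocked repository before the base class has had a chance to clean up.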
Maybe we can do something simpler and skip the repo consistency checks when a test has failed, by wrapping them in:
if (getSuiteFailureMarker().wasSuccessful()) {
}
That should fix all the cases, because I think closing the repos upstream in the ESIntegTestCase cleanup logic should deal with unblocking and removing all the repos cleanly.
That would also make test failures easier to interpret, since we get rid of failed repo verifications on failed tests.
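A minimal sketch of that guard, assuming a simple boolean failure marker: the consistency check runs only while nothing has failed. `FailureMarker` and `GuardedConsistencyCheck` are hypothetical classes written for illustration; only the `wasSuccessful()` name is taken from the comment above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a suite-level failure marker; not the real framework class.
class FailureMarker {
    private boolean failed;

    void markFailed() {
        failed = true;
    }

    boolean wasSuccessful() {
        return !failed;
    }
}

class GuardedConsistencyCheck {
    private final FailureMarker marker;
    // Records which repositories were checked, so the behavior is observable.
    final List<String> checked = new ArrayList<>();

    GuardedConsistencyCheck(FailureMarker marker) {
        this.marker = marker;
    }

    // Runs the repository check only when no failure has been recorded so far.
    void afterTest(List<String> repositories) {
        if (marker.wasSuccessful()) {
            for (String repo : repositories) {
                checked.add(repo); // stand-in for the real consistency assertion
            }
        }
    }
}
```

With a marker that has recorded a failure, `afterTest` does nothing, so a blocked repository left behind by the failed test is never touched by the check.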
jenkins run elasticsearch-ci/packaging-sample-windows
Thanks @fcofdez looks good, just one question left :)
@@ -569,6 +583,27 @@ private void afterInternal(boolean afterClass) throws Exception {
        }
    }

    public void unblockRepositories() throws Exception {
I'm a little unsure about this. I think we should only run this if the original test failed, shouldn't we? (This is what we do in REST tests.)
Otherwise tests could leak running or blocked snapshots in the background, and we would be quietly cleaning them up here.
Yes, that makes sense. Do we have something like getSuiteFailureMarker() but for a single test? I cannot find any method like that on ESTestCase.
Maybe org.elasticsearch.test.ESTestCase#afterIfFailed will work here (I haven't checked the exact order of things, I must admit). If it's called too late, maybe we can do something with a @Rule? (I must admit I'm not an expert in JUnit, so the latter is a bit of a guess.)
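To make the idea concrete, here is a rough sketch of failure-only cleanup: a wrapper runs the test body and dispatches to exactly one of two hooks, depending on whether the body threw. `FailureAwareRunner` and the signatures of its hooks are assumptions made for illustration. They are loosely named after the `afterIfFailed` idea above and do not reproduce the actual ESTestCase API or a JUnit rule.

```java
import java.util.ArrayList;
import java.util.List;

class FailureAwareRunner {
    // Records which hook ran, so the dispatch is observable.
    final List<String> events = new ArrayList<>();

    // Hypothetical hook: would unblock any repository the failed test left blocked.
    void afterIfFailed(Throwable cause) {
        events.add("unblock repositories");
    }

    // Hypothetical hook: would verify repositories only after a passing test.
    void afterIfSuccessful() {
        events.add("check repository consistency");
    }

    // Runs the test body, calls exactly one hook, and rethrows the original failure.
    void run(Runnable testBody) {
        try {
            testBody.run();
        } catch (RuntimeException | Error t) {
            afterIfFailed(t);
            throw t;
        }
        afterIfSuccessful();
    }
}
```

Because the unblocking step runs only on failure, a passing test that leaks a blocked snapshot is not silently cleaned up and would still be caught by the consistency check.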
@@ -2205,4 +2240,36 @@ public static String resolveCustomDataPath(String index) {
    public static boolean inFipsJvm() {
        return Boolean.parseBoolean(System.getProperty(FIPS_SYSPROP));
    }

    protected void awaitNoMoreSnapshotRunningOperations(String viaNode) throws Exception {
NIT: reorder this to natural word order: awaitNoMoreRunningSnapshotOperations :)
jenkins test this
Unfortunately it seems like rules are executed after |
@fcofdez sorry this kinda fell off of my radar :( I wonder if this can be dealt with in a very easy way now though because I found two things in the last two weeks:
diff --git a/test/framework/src/main/java/org/elasticsearch/snapshots/AbstractSnapshotIntegTestCase.java b/test/framework/src/main/java/org/elasticsearch/snapshots/AbstractSnapshotIntegTestCase.java
index cf09ca5cebf..2a835263db1 100644
--- a/test/framework/src/main/java/org/elasticsearch/snapshots/AbstractSnapshotIntegTestCase.java
+++ b/test/framework/src/main/java/org/elasticsearch/snapshots/AbstractSnapshotIntegTestCase.java
@@ -120,8 +120,9 @@ public abstract class AbstractSnapshotIntegTestCase extends ESIntegTestCase {
     private String skipRepoConsistencyCheckReason;
-    @After
-    public void assertRepoConsistency() {
+    @Override
+    public void afterIfSuccessful() throws Exception {
+        super.afterIfSuccessful();
         if (skipRepoConsistencyCheckReason == null) {
             clusterAdmin().prepareGetRepositories().get().repositories().forEach(repositoryMetadata -> {
                 final String name = repositoryMetadata.name();
and fix all our problems that way by simply not doing any repo health checks for failed tests?
Closing this. I think this hasn't come up again, and new tests generally seem not to use shared clusters, so this has become somewhat irrelevant now, I hope.
If a test fails before a repository has been unblocked, cleanup
cannot proceed, which makes all subsequent tests fail during
the cleanup phase.
Closes #61541