-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Rerun test task when test jdk crashed with System exit #71881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rerun test task when test jdk crashed with System exit #71881
Conversation
Pinging @elastic/es-delivery (Team:Delivery) |
return activeDescriptorsById.size() == 1; | ||
} | ||
|
||
public RoundResult getResult() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem to be used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
return new RoundResult(currentRoundFailedTests, previousRoundFailedTests, lastRun()); | ||
} | ||
|
||
public void reset(boolean lastRetry) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the lastRetry
argument isn't used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cleaned up
|
||
private final Map<Object, TestDescriptorInternal> activeDescriptorsById = new HashMap<>(); | ||
|
||
private TestNames currentRoundFailedTests = new TestNames(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since getResult()
isn't used, I don't think either of these are either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
break; | ||
} catch (ExecException e) { | ||
if (retryCount == maxRetries) { | ||
throw new GradleException("Max retries hit", e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if we hit the retry limit we never call storeActiveDescriptors()
. I think that would mean the crash reporter would report the wrong failures, should there have been any in the last round.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, what's the benefit of the indirection of having a finalizing task action report the failures rather than just doing so right here? Is this code executed in the worker JVM vs the daemon or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed this to report the trace for each run and simplified things here.
public void report() { | ||
if(failPaths.size() > 0) { | ||
String report = "================\n" + | ||
"Test JDK System exit trace:\n" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should include in this message something to the effect of "test jvm exited unexpectedly" or would Gradle still report such a thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tweaked the reporting a bit to include this. I think there's not much value to show the whole stack trace in gradle reporting the jvm exit as it doesn't provide any real value to trace this down.
Also, what does this look like in build scans if we have partial test execution an then retry? Do test get reported as flaky as they do with the Gradle test retry plugin? |
- report jvm trace each round - add test coverage for max reruns
d29060a
to
d5bd493
Compare
@mark-vieira sorry that PR wasn't meant to be reviewed already and was meant to be still a draft. Please review again. I cleaned this up and also did some further rework. |
It depends on the result of the tests. If all tests pass in the first run and in the 2nd run (not taking the system exit causing one into account) then tests are only shown as one time executed in build scan (see example of the build scan I added to the infra test of this PR: https://gradle-enterprise.elastic.co/s/wsczh3rwdcvq4/tests) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sweet, LGTM.
|
||
private final Map<Object, TestDescriptorInternal> activeDescriptorsById = new HashMap<>(); | ||
|
||
private Object rootTestDescriptorId; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the "root" descriptor? It's the one for the task, right? Can we clarify this with a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exactly. added a clarifying comment.
Related to elastic#52610 this PR introduces a rerun of all tests for a test task if the test jvm has crashed because of a system exit. We furthermore log potential tests that caused the System.exit based on which tests have been active at the time of the system exit. We also modified the build scan logic to track unexpected test jvm exists with the tag `unexpected-test-jvm-exit`
Related to #52610 this PR introduces a rerun of all tests for a test task if the test jvm has crashed because of a system exit. We furthermore log potential tests that caused the System.exit based on which tests have been active at the time of the system exit. We also modified the build scan logic to track unexpected test jvm exists with the tag `unexpected-test-jvm-exit`
Related to #52610 this PR introduces a rerun of all tests for a test task if the test jvm has crashed because of a system exit. We furthermore log potential tests that caused the System.exit based on which tests have been active at the time of the system exit.