bpo-44708: Only re-run test methods that match names of previously failing test methods #27287
```python
    self.good.append(test_name)
elif ok in (FAILED, CHILD_ERROR):
```
The different order here is required due to how inheritance works: isinstance(result, Failed) would catch all the cases listed lower in the chain.
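A quick illustration of the ordering point, as a hedged sketch (only Failed and ChildError are named in this thread; the other class names are placeholders):

```python
class TestResult:
    def __init__(self, name):
        self.name = name

class Passed(TestResult): ...
class Failed(TestResult): ...
class ChildError(Failed): ...  # subclasses Failed

def classify(result):
    # The more specific check must come first: isinstance(result, Failed)
    # is also True for a ChildError, so a Failed branch placed earlier
    # would swallow it.
    if isinstance(result, ChildError):
        return "child error"
    elif isinstance(result, Failed):
        return "failed"
    elif isinstance(result, Passed):
        return "passed"
    return "other"

print(classify(ChildError("test_os")))  # -> child error
```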
```diff
-if rerun and ok not in {FAILED, CHILD_ERROR, INTERRUPTED}:
+if rerun and not isinstance(result, (Failed, Interrupted)):
```
ChildError is a Failed subclass so we don't have to enumerate it. Additionally, I feel like this list was missing other failures like uncaught exceptions, environment changes, and refleaks. Now it doesn't.
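A sketch of the coverage point: with every failure kind modeled as a Failed subclass, a single isinstance check picks them all up. The extra subclass names below are assumptions inferred from the failures listed above, not necessarily the names in the PR:

```python
class TestResult: ...
class Interrupted(TestResult): ...
class Failed(TestResult): ...
class ChildError(Failed): ...         # named in the diff
class UncaughtException(Failed): ...  # assumed names for the other
class EnvChanged(Failed): ...         # failure kinds mentioned above
class RefleakFailed(Failed): ...

def blocks_rerun_cleanup(result) -> bool:
    # One check covers every Failed subclass; the old set literal
    # {FAILED, CHILD_ERROR, INTERRUPTED} had to enumerate each constant
    # and silently missed the ones it didn't list.
    return isinstance(result, (Failed, Interrupted))

assert blocks_rerun_cleanup(RefleakFailed())
```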
For refleaks/envchanged I don't know if it makes sense to rerun, because if those fail, there is no way the test suite is going to succeed after that, no? OTOH I don't see how it would hurt (let's just make sure that it works as we expect and doesn't interfere with the refleak machinery).
This function is a little cryptic. The rerun boolean here means "we are already re-running", and this if statement figures out whether we should remove the current test that just re-ran from the "bad" tests (which were populated on the first run). The bug that was there previously: a refleak or env change in the second run would count as an "ok" outcome, and the test would be removed from "bad", leading to a green CI badge that's wrong.
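Putting that description into code, as a hedged sketch (attribute and class names are assumptions apart from what is visible in the diff):

```python
class TestResult:
    def __init__(self, name):
        self.name = name

class Passed(TestResult): ...
class Failed(TestResult): ...
class EnvChanged(Failed): ...   # assumed to subclass Failed, per the thread
class Interrupted(TestResult): ...

class Regrtest:
    def __init__(self):
        self.good: list = []
        self.bad: list = []

    def accumulate_result(self, result, rerun=False):
        if isinstance(result, Passed):
            self.good.append(result.name)
        elif isinstance(result, (Failed, Interrupted)):
            self.bad.append(result.name)
        # "rerun" means we are already on the second pass: only a genuinely
        # clean outcome may remove the test from "bad".  The old constant
        # check let refleaks/env changes slip through here, wrongly turning
        # the CI green.
        if rerun and not isinstance(result, (Failed, Interrupted)):
            if result.name in self.bad:
                self.bad.remove(result.name)
```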
```python
self.ns.match_tests.extend(error_names)
self.ns.match_tests.extend(failure_names)
```
This is the money shot.
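What makes it the money shot: the method names recorded on the first run become --match patterns for the re-run, so only the failing methods run again instead of the whole file. A hedged sketch of that flow (the helper name and the (name, traceback) tuple shape are assumptions):

```python
def narrow_rerun(match_tests, errors, failures):
    """Extend the --match pattern list with previously failing method names."""
    error_names = [name for name, _traceback in errors]
    failure_names = [name for name, _traceback in failures]
    if match_tests is None:
        match_tests = []
    match_tests.extend(error_names)
    match_tests.extend(failure_names)
    return match_tests

# First run: test_spam errored, test_eggs failed -> the re-run behaves like
#   ./python -m test test_mod --match test_spam --match test_eggs
patterns = narrow_rerun(None, [("test_spam", "...")], [("test_eggs", "...")])
assert patterns == ["test_spam", "test_eggs"]
```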
```python
PASSED = 1
FAILED = 0
ENV_CHANGED = -1
SKIPPED = -2
RESOURCE_DENIED = -3
INTERRUPTED = -4
CHILD_ERROR = -5  # error in a child process
TEST_DID_NOT_RUN = -6
TIMEOUT = -7
```
Replacing those with objects was necessary to store related context when returning, which allows matching test method names in main.py. This helps with consistency when running across processes as well.
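A rough sketch of the constant-to-object move (field shapes are assumptions, apart from errors/failures which appear elsewhere in this diff; the __future__ import keeps the pipe unions 3.9-compatible, per the annotation comment below):

```python
from __future__ import annotations  # keeps list[...] | None valid pre-3.10

from dataclasses import dataclass

@dataclass
class TestResult:
    name: str
    duration_sec: float = 0.0

@dataclass
class Passed(TestResult):
    pass

@dataclass
class Failed(TestResult):
    errors: list[tuple[str, str]] | None = None
    failures: list[tuple[str, str]] | None = None

# An int like FAILED = 0 can only say *that* a test failed; an object can
# also say *which* methods failed, and that context survives the trip back
# from a worker process.
result = Failed("test_ssl", errors=[("test_method", "Traceback ...")])
```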
This is an excellent change; dealing with these has been quite painful before.
```
ENV_CHANGED       test failed because it changed the execution environment
FAILED            test failed
PASSED            test passed
EMPTY_TEST_SUITE  test ran no subtests.
```
This docstring was outdated anyway.
```diff
 ns_dict, test_name = json.loads(worker_args)
-ns = types.SimpleNamespace(**ns_dict)
+ns = Namespace(**ns_dict)
```
Custom Namespace subclass pays off.
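A hedged sketch of the payoff: because the subclass declares every attribute with a default (the ones below are just representative), a plain dict decoded from JSON in the worker reconstructs a fully populated namespace:

```python
import argparse
import json

class Namespace(argparse.Namespace):
    def __init__(self, **kwargs):
        self.verbose = 0          # representative attributes; the real
        self.match_tests = None   # class declares the full regrtest set
        self.rerun = False
        super().__init__(**kwargs)

ns = Namespace(verbose=2)
ns_dict = json.loads(json.dumps(vars(ns)))   # what the worker receives
assert vars(Namespace(**ns_dict)) == vars(ns)
```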
👏
```python
if display_failure:
    msg = f"{msg} -- {exc}"
print(msg, file=sys.stderr, flush=True)
return Failed(test_name, errors=exc.errors, failures=exc.failures)
```
This is how we store the failed and erroring method names to be matched later for re-running.
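Where those names come from, roughly: unittest reports (test, traceback) pairs, and the runner can reduce them to bare method names that survive pickling/JSON back to the main process. A sketch under that assumption (the helper itself is hypothetical):

```python
import unittest

def failing_method_names(result: unittest.TestResult) -> list:
    # test.id() looks like "test_mod.TestCase.test_method"; the final
    # component is what --match-style filtering needs on the re-run.
    return [test.id().split(".")[-1]
            for test, _tb in result.errors + result.failures]
```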
```python
self,
name: str,
duration_sec: float = 0.0,
xml_data: list[str] | None = None,
```
Drive-by comment: T1 | T2 syntax is 3.10-only and won't be able to land in 3.9 unless you use from __future__ import annotations or stringify the annotations manually. Ditto for other occurrences below.
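For illustration, the two 3.9-compatible spellings mentioned above:

```python
# Option 1: postponed evaluation (PEP 563); the annotation is never
# evaluated at runtime, so the pipe union parses fine on 3.9.
from __future__ import annotations

def run_a(name: str, xml_data: list[str] | None = None) -> None: ...

# Option 2: stringify manually; equivalent, no __future__ import needed.
def run_b(name: str, xml_data: "list[str] | None" = None) -> None: ...
```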
I don't have enough experience in libregrtest to review the rest, but this is a change I look forward to :).
```python
    if display_failure:
        msg = f"{msg} -- {exc}"
    print(msg, file=sys.stderr, flush=True)
    return Failed(test_name, errors=exc.errors, failures=exc.failures)
except support.TestFailed as exc:
```
In what scenarios are we raising Failed but not TestFailedWithDetails now?
There are quite a few bare TestFailed raises in custom test_main functions in various test_*.py files. Example (line 5044 in c05a790):

```python
raise support.TestFailed("Can't read certificate file %r" % filename)
```
```python
if errors or failures:
    if self.ns.match_tests is None:
        self.ns.match_tests = []
    self.ns.match_tests.extend(error_names)
```
Question: Do we want to rerun errors? In theory those are things like failures to set up the test and similar. Do you think that this is something we should run again?
The only case where we gather errors is with TestFailedWithDetails, which means those are unittest failures and/or errors. We've been re-running the entire file in this scenario before. I think we still should, as some of those are exceptions raised due to race conditions. We might think of tightening this up in a future update.
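For context, a hedged sketch of the exception shape this relies on (constructor details are assumptions; only the errors/failures attributes are visible in this diff):

```python
class TestFailed(Exception):
    """Base failure; bare raises of this carry no per-method details."""

class TestFailedWithDetails(TestFailed):
    def __init__(self, msg, errors, failures):
        super().__init__(msg)
        self.errors = errors      # unittest ERROR outcomes (uncaught
                                  # exceptions, setUp problems, races, ...)
        self.failures = failures  # plain assertion failures
```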
Fantastic job! 🚀 In general LGTM!
I left some comments with some questions and suggestions.
Co-authored-by: Pablo Galindo Salgado <[email protected]>
Thanks @ambv for the PR 🌮🎉.. I'm working now to backport this PR to: 3.9.
Thanks @ambv for the PR 🌮🎉.. I'm working now to backport this PR to: 3.10.
Sorry, @ambv, I could not cleanly backport this to 3.9.
Sorry @ambv, I had trouble checking out the …
GH-27290 is a backport of this pull request to the 3.10 branch.
…iling test methods (pythonGH-27287)

* Move to a static argparse.Namespace subclass
* Roughly annotate runtest.py
* Refactor libregrtest to use lossless test result objects
* Only re-run test methods that match names of previously failing test methods
* Adopt tests to cover test method name matching

Co-authored-by: Pablo Galindo Salgado <[email protected]>
(cherry picked from commit f1afef5)
Co-authored-by: Łukasz Langa <[email protected]>

…sly failing test methods (pythonGH-27287)

* Move to a static argparse.Namespace subclass
* Roughly annotate runtest.py
* Refactor libregrtest to use lossless test result objects
* Only re-run test methods that match names of previously failing test methods
* Adopt tests to cover test method name matching

Co-authored-by: Pablo Galindo Salgado <[email protected]>.
(cherry picked from commit f1afef5)
Co-authored-by: Łukasz Langa <[email protected]>

…iling test methods (GH-27287) (GH-27290)

* Move to a static argparse.Namespace subclass
* Roughly annotate runtest.py
* Refactor libregrtest to use lossless test result objects
* Only re-run test methods that match names of previously failing test methods
* Adopt tests to cover test method name matching

Co-authored-by: Pablo Galindo Salgado <[email protected]>
(cherry picked from commit f1afef5)
Co-authored-by: Łukasz Langa <[email protected]>

…sly failing test methods (GH-27287) (GH-27293)

* Move to a static argparse.Namespace subclass
* Roughly annotate runtest.py
* Refactor libregrtest to use lossless test result objects
* Only re-run test methods that match names of previously failing test methods
* Adopt tests to cover test method name matching

Co-authored-by: Pablo Galindo Salgado <[email protected]>.
(cherry picked from commit f1afef5)
Co-authored-by: Łukasz Langa <[email protected]>
Details are on the issue. It's probably easiest to review this PR commit by commit, as every commit is an atomic change.
I'd also like to backport to 3.10 and 3.9 to speed up open PRs on those branches as well.
https://bugs.python.org/issue44708