bpo-44708: Only re-run test methods that match names of previously failing test methods #27287

Merged
ambv merged 11 commits into python:main on Jul 22, 2021

@ambv (Contributor) commented Jul 22, 2021

Details on the issue. It's probably easiest to review this commit by commit as every commit is an atomic change.

I'd also like to backport to 3.10 and 3.9 to speed up open PRs on those branches as well.

https://bugs.python.org/issue44708

self.good.append(test_name)
elif ok in (FAILED, CHILD_ERROR):
Contributor Author

The different order here is required due to how inheritance works. An isinstance check against Failed would also match all of its subclasses, which are handled in the branches below it.
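To make the ordering concern concrete, here is a minimal sketch. The classes are simplified stand-ins that borrow names from this PR; the exact hierarchy (in particular EnvChanged deriving from Failed) is an assumption for illustration, not a copy of the libregrtest code.

```python
# Simplified stand-ins for the result classes; not the real libregrtest code.
class TestResult:
    def __init__(self, name):
        self.name = name

class Failed(TestResult):
    pass

class EnvChanged(Failed):   # assumed to derive from Failed for this sketch
    pass

class ChildError(Failed):   # the PR states ChildError is a Failed subclass
    pass

def classify(result):
    # Most specific subclasses first; the generic Failed check comes last.
    if isinstance(result, EnvChanged):
        return "env changed"
    if isinstance(result, ChildError):
        return "child error"
    if isinstance(result, Failed):
        return "failed"
    return "other"

def classify_wrong_order(result):
    # Base class first: every subclass instance is caught by this branch.
    if isinstance(result, Failed):
        return "failed"
    if isinstance(result, EnvChanged):
        return "env changed"  # never reached for EnvChanged instances
    return "other"
```

With the base class checked first, an EnvChanged result is reported as a plain failure, which is the situation the reordering avoids.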


- if rerun and ok not in {FAILED, CHILD_ERROR, INTERRUPTED}:
+ if rerun and not isinstance(result, (Failed, Interrupted)):
Contributor Author

ChildError is a Failed subclass so we don't have to enumerate it. Additionally, I feel like this list was missing other failures like uncaught exceptions, environment changes, and refleaks. Now it doesn't.

Member

For refleaks/envchanged I don't know if it makes sense to rerun because if those fail, then there is no way that the test suite is going to succeed after that, no? OTOH I don't see how it would hurt (let's just make sure that it works as we expect and that it doesn't interfere with the refleak machinery).

Contributor Author

This function is a little cryptic. The rerun boolean here means "we are already re-running" and this if statement figures out whether we should remove the current test that just re-ran from "bad" tests (which were populated on the first run). The bug that was there previously is that a refleak or env change in the second run would count as an "ok" outcome and the test would be removed from "bad", leading to a green CI badge that's wrong.
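A rough sketch of the behavior described above, under simplifying assumptions: the result classes are stand-ins, `accumulate` is a hypothetical helper, and the real function tracks more state than a single "bad" list.

```python
# Stand-in result classes; the hierarchy is assumed for illustration.
class TestResult:
    def __init__(self, name):
        self.name = name

class Passed(TestResult):
    pass

class Failed(TestResult):
    pass

class EnvChanged(Failed):
    pass

class Interrupted(TestResult):
    pass

def accumulate(bad, result, rerun):
    """Update the list of bad test names in place (hypothetical helper)."""
    if rerun:
        # Already re-running: only a result that is neither a failure
        # nor an interruption clears the test from the "bad" list.
        if not isinstance(result, (Failed, Interrupted)) and result.name in bad:
            bad.remove(result.name)
    elif isinstance(result, Failed):
        # First run: remember the failing test for the re-run.
        bad.append(result.name)
```

Because the check is against the Failed base class, an environment change during the second run leaves the test in "bad" rather than counting as a pass.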

Comment on lines +328 to +329
self.ns.match_tests.extend(error_names)
self.ns.match_tests.extend(failure_names)
Contributor Author

This is the money shot.

Comment on lines -21 to -29
PASSED = 1
FAILED = 0
ENV_CHANGED = -1
SKIPPED = -2
RESOURCE_DENIED = -3
INTERRUPTED = -4
CHILD_ERROR = -5 # error in a child process
TEST_DID_NOT_RUN = -6
TIMEOUT = -7
Contributor Author

Replacing those with objects was necessary to store related context when returning, which allows matching test method names in main.py. This helps with consistency when running across processes as well.

Member

This is an excellent change; dealing with these has been quite painful before.
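As a sketch of why objects help here: an integer code can only say that a test failed, while an object can also carry which methods failed and can be serialized when results cross a process boundary. The dataclass, the `encode`/`decode` helpers, and the test names below are hypothetical, chosen only to illustrate the idea.

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass
class Failed:
    """Hypothetical lossless result: the name plus per-method details."""
    name: str
    errors: list = field(default_factory=list)    # [test id, traceback] pairs
    failures: list = field(default_factory=list)  # [test id, traceback] pairs

def encode(result):
    # Serialize for transport, for example from a worker process.
    return json.dumps(asdict(result))

def decode(payload):
    # Rebuild the same kind of object on the receiving side.
    return Failed(**json.loads(payload))

original = Failed(
    "test_example",
    failures=[["test_method (test_example.ExampleTests)", "traceback text"]],
)
restored = decode(encode(original))
```

The restored object compares equal to the original, so the parent process sees the same failure details the worker recorded.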

ENV_CHANGED test failed because it changed the execution environment
FAILED test failed
PASSED test passed
EMPTY_TEST_SUITE test ran no subtests.
Contributor Author

This docstring was outdated anyway.

ns_dict, test_name = json.loads(worker_args)
- ns = types.SimpleNamespace(**ns_dict)
+ ns = Namespace(**ns_dict)
Contributor Author

Custom Namespace subclass pays off.

Member

👏
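For context, a minimal sketch of the round trip this snippet takes part in. The Namespace class below is a simplified stand-in with two assumed attributes; the real subclass declares many more options.

```python
import argparse
import json

class Namespace(argparse.Namespace):
    """Stand-in for a static Namespace subclass with declared defaults."""
    def __init__(self, **kwargs):
        # Defaults are set first so the attributes always exist,
        # then overridden by whatever the caller passes in.
        self.match_tests = None
        self.verbose = 0
        super().__init__(**kwargs)

# Parent side: serialize the options together with the test name.
ns = Namespace(match_tests=["test_method"])
worker_args = json.dumps((vars(ns), "test_example"))

# Worker side: rebuild the same type from the decoded dictionary.
ns_dict, test_name = json.loads(worker_args)
worker_ns = Namespace(**ns_dict)
```

Rebuilding the same class on the worker side means both processes see identical attributes and defaults, which a generic SimpleNamespace does not guarantee.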

if display_failure:
    msg = f"{msg} -- {exc}"
print(msg, file=sys.stderr, flush=True)
return Failed(test_name, errors=exc.errors, failures=exc.failures)
Contributor Author

@ambv ambv Jul 22, 2021

This is how we store the failed and erroring method names to be matched later for re-running.
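A self-contained sketch of that flow, assuming simplified stand-ins for the exception and result classes. `run_single`, `detailed`, and `bare` are hypothetical helpers, and the test ids are made up.

```python
class TestFailed(Exception):
    """Stand-in for a bare test failure with no per-method details."""

class TestFailedWithDetails(TestFailed):
    """Stand-in that also carries unittest errors and failures."""
    def __init__(self, msg, errors, failures):
        self.msg = msg
        self.errors = errors
        self.failures = failures
        super().__init__(msg, errors, failures)

    def __str__(self):
        return self.msg

class Failed:
    """Simplified result; errors and failures stay None when unknown."""
    def __init__(self, name, errors=None, failures=None):
        self.name = name
        self.errors = errors
        self.failures = failures

def run_single(test_name, test_func):
    try:
        test_func()
    except TestFailedWithDetails as exc:
        # Detailed failure: keep the per-method information for the re-run.
        return Failed(test_name, errors=exc.errors, failures=exc.failures)
    except TestFailed:
        # Bare failure: there are no method names to narrow the re-run with.
        return Failed(test_name)
    return None

def detailed():
    raise TestFailedWithDetails(
        "2 problems",
        errors=[("test_a (pkg.Case)", "traceback text")],
        failures=[("test_b (pkg.Case)", "traceback text")],
    )

def bare():
    raise TestFailed("setup problem")
```

The subclass handler has to come before the base-class handler, otherwise the details would be dropped for every failure.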

self,
name: str,
duration_sec: float = 0.0,
xml_data: list[str] | None = None,
Member

@Fidget-Spinner Fidget-Spinner Jul 22, 2021

Drive-by comment: T1 | T2 syntax is 3.10 only and won't be able to land in 3.9 unless you use from __future__ import annotations or stringify them manually. Ditto for other occurrences below.

I don't have enough experience in libregrtest to review the rest, but this is a change I look forward to :).
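One way the manual stringification could look, as a sketch: the class is a stand-in built from the parameter list quoted above, and the attribute assignments are assumed. Placing `from __future__ import annotations` as the first statement of the module would have the same effect without the quotes.

```python
class TestResult:
    """Stand-in using the parameter list from the snippet above."""
    def __init__(
        self,
        name: str,
        duration_sec: float = 0.0,
        # Quoted, so the interpreter stores the annotation as a string
        # and never evaluates the "|" operator on types.
        xml_data: "list[str] | None" = None,
    ) -> None:
        self.name = name
        self.duration_sec = duration_sec
        self.xml_data = xml_data
```

Since the annotation is kept as plain text, defining the class does not depend on `|` being supported between types at runtime.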

    if display_failure:
        msg = f"{msg} -- {exc}"
    print(msg, file=sys.stderr, flush=True)
    return Failed(test_name, errors=exc.errors, failures=exc.failures)
except support.TestFailed as exc:
Member

In what scenarios are we raising Failed but not TestFailedWithDetails now?

Contributor Author

There are quite a few bare TestFailed in custom test_main functions in various test_*.py files. Example:

raise support.TestFailed("Can't read certificate file %r" % filename)

if errors or failures:
    if self.ns.match_tests is None:
        self.ns.match_tests = []
    self.ns.match_tests.extend(error_names)
Member

Question: Do we want to rerun errors? In theory those are things like failures to set up the test and similar. Do you think that this is something we should run again?

Contributor Author

The only case where we gather errors is with TestFailedWithDetails, which means those are unittest failures and/or errors. We've been re-running the entire file in this scenario before. I think we still should, as some of those are exceptions raised due to race conditions. We might think of tightening this up in a future update.
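To illustrate how the re-run gets narrowed, here is a sketch under stated assumptions: test ids are assumed to look like unittest's default string form, "test_method (module.Class)", and `method_name`, `narrow_rerun`, and the ids themselves are hypothetical.

```python
from types import SimpleNamespace

def method_name(test_id):
    # Assumes ids of the form "test_method (module.Class)".
    return test_id.split(" ")[0]

def narrow_rerun(ns, errors, failures):
    """Add failing and erroring method names to the match list."""
    error_names = [method_name(test_id) for test_id, *_ in errors]
    failure_names = [method_name(test_id) for test_id, *_ in failures]
    if errors or failures:
        if ns.match_tests is None:
            ns.match_tests = []
        ns.match_tests.extend(error_names)
        ns.match_tests.extend(failure_names)
    return ns

ns = narrow_rerun(
    SimpleNamespace(match_tests=None),
    errors=[("test_connect (test_example.NetworkTests)", "traceback text")],
    failures=[("test_rename (test_example.FileTests)", "traceback text")],
)
```

When there are no recorded errors or failures, the match list is left untouched, so the whole file is re-run as before.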

Member

@pablogsal pablogsal left a comment

Fantastic job! 🚀 In general LGTM!

I left some comments with some questions and suggestions.

@ambv ambv merged commit f1afef5 into python:main Jul 22, 2021
@ambv ambv added the needs backport to 3.9 and needs backport to 3.10 labels Jul 22, 2021
@miss-islington
Contributor

Thanks @ambv for the PR 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒⛏🤖

@miss-islington
Contributor

Thanks @ambv for the PR 🌮🎉.. I'm working now to backport this PR to: 3.10.
🐍🍒⛏🤖 I'm not a witch! I'm not a witch!

@miss-islington
Contributor

Sorry, @ambv, I could not cleanly backport this to 3.9 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker f1afef5e0d93d66fbf3c9aaeab8b3b8da9617583 3.9

@miss-islington
Contributor

Sorry @ambv, I had trouble checking out the 3.10 backport branch.
Please backport using cherry_picker on command line.
cherry_picker f1afef5e0d93d66fbf3c9aaeab8b3b8da9617583 3.10

@ambv ambv added the needs backport to 3.10 label and removed the needs backport to 3.9 and needs backport to 3.10 labels Jul 22, 2021
@miss-islington
Contributor

Thanks @ambv for the PR 🌮🎉.. I'm working now to backport this PR to: 3.10.
🐍🍒⛏🤖

@bedevere-bot

GH-27290 is a backport of this pull request to the 3.10 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.10 label Jul 22, 2021
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jul 22, 2021
…iling test methods (pythonGH-27287)

* Move to a static argparse.Namespace subclass
* Roughly annotate runtest.py
* Refactor libregrtest to use lossless test result objects
* Only re-run test methods that match names of previously failing test methods
* Adopt tests to cover test method name matching

Co-authored-by: Pablo Galindo Salgado <[email protected]>
(cherry picked from commit f1afef5)

Co-authored-by: Łukasz Langa <[email protected]>
ambv added a commit to ambv/cpython that referenced this pull request Jul 22, 2021
…sly failing test methods (pythonGH-27287)

* Move to a static argparse.Namespace subclass
* Roughly annotate runtest.py
* Refactor libregrtest to use lossless test result objects
* Only re-run test methods that match names of previously failing test methods
* Adopt tests to cover test method name matching

Co-authored-by: Pablo Galindo Salgado <[email protected]>.
(cherry picked from commit f1afef5)

Co-authored-by: Łukasz Langa <[email protected]>
ambv added a commit that referenced this pull request Jul 22, 2021
…iling test methods (GH-27287) (GH-27290)

* Move to a static argparse.Namespace subclass
* Roughly annotate runtest.py
* Refactor libregrtest to use lossless test result objects
* Only re-run test methods that match names of previously failing test methods
* Adopt tests to cover test method name matching

Co-authored-by: Pablo Galindo Salgado <[email protected]>
(cherry picked from commit f1afef5)

Co-authored-by: Łukasz Langa <[email protected]>
ambv added a commit that referenced this pull request Jul 22, 2021
…sly failing test methods (GH-27287) (GH-27293)

* Move to a static argparse.Namespace subclass
* Roughly annotate runtest.py
* Refactor libregrtest to use lossless test result objects
* Only re-run test methods that match names of previously failing test methods
* Adopt tests to cover test method name matching

Co-authored-by: Pablo Galindo Salgado <[email protected]>.
(cherry picked from commit f1afef5)

Co-authored-by: Łukasz Langa <[email protected]>
@ambv ambv deleted the regrtest-single-rerun branch May 3, 2022 16:52
6 participants