chore(ci_visibility): refactor test retry logic #13224
if setup_report.outcome == outcomes.PASSED:
call_call, call_report = _retry_run_when(item, "call", outcomes)
if call_report.outcome == outcomes.FAILED:
if setup_outcome == outcomes.PASSED:
# Teardown does not happen if setup skipped
if not setup_report.skipped:
teardown_call, teardown_report = _retry_run_when(item, "teardown", outcomes)
if not setup_outcome == outcomes.SKIPPED:
Bootstrap import analysis

Comparison of import times between this PR and base.

Summary

The average import time from this PR is: 252 ± 5 ms.
The average import time from base is: 256 ± 7 ms.
The import time difference between this PR and base is: -4.1 ± 0.3 ms.

Import time breakdown

The following import paths have shrunk:
Benchmarks

Benchmark execution time: 2025-04-23 18:12:33

Comparing candidate commit 931fb82 in PR branch

Found 5 performance improvements and 2 performance regressions! Performance is the same for 483 metrics, 2 unstable metrics.

scenario:ddtracerun-auto_profiling
scenario:iast_aspects-swapcase_aspect
scenario:otelspan-start-finish
scenario:otelspan-start-finish-telemetry
scenario:span-start-finish
scenario:span-start-finish-telemetry
scenario:span-start-finish-traceid128
log = get_logger(__name__)


class RetryManager:
…3376)

Currently we let pytest's builtin `pytest_runtest_protocol` hook run the test, and we check whether to retry at the makereport stage. This has a number of consequences:

- We don't have access to the setup, call, and teardown reports all at once; we see one at a time during `makereport`, and we have to patch them and stash information to keep state across each pass through the reports for a given test.
- pytest logs each report as it is created, so we have to patch them at the right time so they get printed correctly (e.g. to change the outcome from `FAILED` to `ATR INITIAL ATTEMPT FAILED`; this affects not only printing, but also the session exit status).
- In particular, for EFD when the initial attempt passes, we run too late and the `PASSED` status has already been logged by the time we patch it, so it shows `PASSED` instead of `EFD INITIAL ATTEMPT PASSED`. Not only that, but we have to [handle this case specially](https://github.com/DataDog/dd-trace-py/blob/0efbb6c3cc6ab4c59a838771f2db92f1b729e87a/ddtrace/contrib/internal/pytest/_efd_utils.py#L237) when generating the terminal summary.

This PR moves the retry logic from the `pytest_runtest_makereport` hook to the `pytest_runtest_protocol` hook. This means we _replace_ pytest's own `pytest_runtest_protocol` with our own. We invoke pytest's internal `runtestprotocol` function directly from our hook, so the behavior of our hook is similar to pytest's own hook. The difference is that we call this function with `log=False`, so pytest doesn't log the setup, call, and teardown reports as they are created. Instead, we collect all reports, patch them as needed, and _then_ print them out. This means we can write the logic with full knowledge of the final status of a test run, instead of patching things as we see them during setup, call, and teardown.
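
The collect, patch, then log flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `patch_reports`, the `INITIAL_ATTEMPT_FAILED` label, and the "retry if anything failed" policy are hypothetical placeholders, and the only pytest pieces relied on are the `runtestprotocol` function and the `pytest_runtest_log*` hooks.

```python
from types import SimpleNamespace

# Hypothetical outcome label, modelled on the "ATR INITIAL ATTEMPT FAILED" text above.
INITIAL_ATTEMPT_FAILED = "dd_initial_attempt_failed"


def patch_reports(reports, will_retry):
    """Relabel failed reports once all phases of the attempt are known (hypothetical helper)."""
    if will_retry:
        for report in reports:
            if report.outcome == "failed":
                report.outcome = INITIAL_ATTEMPT_FAILED
    return reports


def pytest_runtest_protocol(item, nextitem):
    # Imported lazily so the helper above can be exercised without pytest installed.
    from _pytest.runner import runtestprotocol

    item.ihook.pytest_runtest_logstart(nodeid=item.nodeid, location=item.location)
    # log=False: pytest returns the setup/call/teardown reports without logging them.
    reports = runtestprotocol(item, log=False, nextitem=nextitem)
    # Placeholder policy (assumption): treat any failed phase as "will be retried".
    will_retry = any(report.outcome == "failed" for report in reports)
    for report in patch_reports(reports, will_retry):
        # Log only after patching, so the terminal shows the patched outcome.
        item.ihook.pytest_runtest_logreport(report=report)
    item.ihook.pytest_runtest_logfinish(nodeid=item.nodeid, location=item.location)
    return True  # a non-None result stops pytest from running its own protocol


# Stand-in reports to show the patching step in isolation.
fake_reports = [
    SimpleNamespace(when="setup", outcome="passed"),
    SimpleNamespace(when="call", outcome="failed"),
    SimpleNamespace(when="teardown", outcome="passed"),
]
print([report.outcome for report in patch_reports(fake_reports, will_retry=True)])
```

Because all three reports are available before anything is logged, the patching step can look at the whole attempt at once instead of stashing state between `makereport` calls.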
For retries, the responsibility for logging the statuses is moved to the retry handlers themselves, so they can delay printing until after the reports have been patched. In principle, we could even decide to _not_ print the retry results individually and only print the final status (which would make for cleaner output), but this can come in a future version.

For EFD, the special final outcomes (`dd_efd_final_passed`, etc.) are replaced with plain `passed`, `failed`, `skipped` states, which xdist can handle, and the final states are only used in `efd_get_teststatus` (called from the `pytest_report_teststatus` hook).

Future work:

- Attempt to Fix has to be modified in similar ways to EFD, but it also has to handle quarantine, so it's a bit more involved.
- xdist still prints `EFD INITIAL ATTEMPT` for all attempts (not just the first one).
- The whole retry logic outside of pytest should be refactored (see #13224), but this PR is a first step to make the rest possible.

## Checklist

- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Reviewer has checked that all the criteria below are met
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
…3448)

This is essentially #13376, recreated now that the fix in the unittest suite (#13445) has been merged, and with the previous `pytest_runtest_protocol` logic wrapped in a try/except to avoid breaking the pipeline in case of an internal error in CI Visibility.
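
One way the try/except guard mentioned in the #13448 commit message could look is sketched below. This is an assumption about the shape of the guard, not the merged code: `guarded_protocol` and `_run_with_retries` are hypothetical names. It relies on `pytest_runtest_protocol` being a first-result hook, so returning `None` lets pytest move on to the next implementation (ultimately its builtin one).

```python
import logging

log = logging.getLogger(__name__)


def guarded_protocol(protocol, item, nextitem):
    """Run a custom test protocol, returning None if it raises (hypothetical helper)."""
    try:
        return protocol(item, nextitem)
    except Exception:
        # An internal plugin error should not break the user's test session.
        log.exception("internal error in custom test protocol, deferring to pytest")
        return None


def _run_with_retries(item, nextitem):
    # Placeholder for the retry-aware protocol; a real one would run the test and return True.
    raise NotImplementedError


def pytest_runtest_protocol(item, nextitem):
    # None means "not handled", so pytest tries the next pytest_runtest_protocol implementation.
    return guarded_protocol(_run_with_retries, item, nextitem)
```

A caveat of this sketch: if the custom protocol fails after the test has already started running, deferring to pytest's builtin protocol would run the test again, so a real guard may need to distinguish errors raised before and after execution.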
This pull request has been automatically closed after a period of inactivity.