Fix git pull error in core contrib test #3357
My idea for this one would be to just do an actions/checkout first. If the checkout works, it means we will be able to install things from that revision.
We already do a checkout before hitting this error. The actions/checkout action has a retry built in, which is why we do not see this error there. I couldn't see a flag for tox that would let us use the output of that checkout in:

    runs-on: ubuntu-latest
    steps:
      - name: Checkout contrib repo @ SHA - ${% raw %}{{ env.CONTRIB_REPO_SHA }}{% endraw %}
        uses: actions/checkout@v4
        with:
          repository: open-telemetry/opentelemetry-python-contrib
          ref: ${% raw %}{{ env.CONTRIB_REPO_SHA }}{% endraw %}
      - name: Set up Python 3.8
        uses: actions/setup-python@v5
        with:
          python-version: "3.8"
          architecture: "x64"
      - name: Install tox
        run: pip install tox-uv
      - name: Run tests
        run: tox -e {{ job_data.tox_env }} -- -ra

The revision always exists (it's used for all of the other actions that run successfully and for the previous checkout step), but we fail without a retry when tox pulls it in.
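The contrast drawn above, a step that retries on a transient failure versus one that fails on the first error, can be illustrated with a generic retry wrapper. This is only a sketch of the general idea: `with_retries` is a hypothetical helper, and it is not how actions/checkout or tox are implemented.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 0.0,
) -> T:
    """Call `operation` until it succeeds or `attempts` calls have failed.

    Hypothetical helper for illustration only.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            # Re-raise the last failure once the attempts are used up.
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")
```

With `attempts=1` the wrapper behaves like the tox pull described above: a single transient network error fails the whole step.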
I mean, check out the core repo at the core repo revision we want before the contrib one. We only have this problem for
Understood - I'll update my PR so that it only adds that checkout for core_contrib tests.
Thanks @jomcgi, I think this can work. But how can we make tox use the local repo to install packages instead of pointing to the remote repository? I'm wondering about configuring CORE_REPO to point to something like: git+file:///absolute/path/opentelemetry-python
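As a rough sketch of the suggestion above, a `git+file://` reference can be derived from the absolute path of a local checkout. The helper name `local_git_requirement_url` is hypothetical, and `PurePosixPath` is used on the assumption that the runner is a POSIX system such as ubuntu-latest.

```python
from pathlib import PurePosixPath


def local_git_requirement_url(checkout_dir: str) -> str:
    """Build a `git+file://` URL for a local checkout directory.

    Hypothetical helper; assumes a POSIX-style absolute path.
    """
    path = PurePosixPath(checkout_dir)
    if not path.is_absolute():
        raise ValueError("checkout_dir must be an absolute path")
    # as_uri() yields "file:///..." and percent-encodes special characters.
    return "git+" + path.as_uri()


# Example using the placeholder path from the comment above.
core_repo = local_git_requirement_url("/absolute/path/opentelemetry-python")
```

The resulting string could then be exported as CORE_REPO before tox runs, if the tox configuration reads that variable.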
Core contrib is intermittently failing when pulling the core repo in tox. Add gh actions/checkout for the core repo before running tox to mitigate this.
Good callout - I've updated tox.ini to allow passing the core repo as an environment variable. Logs show the expected
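The behaviour described here, using an environment variable when it is set and falling back to the remote repository otherwise, can be sketched in Python as follows. This illustrates the fallback logic only and is not the actual tox.ini change; tox offers a comparable `{env:KEY:DEFAULT}` substitution. The `DEFAULT_CORE_REPO` value and the `resolve_core_repo` name are assumptions made for this example.

```python
import os
from typing import Mapping, Optional

# Assumed default for illustration; the real tox.ini value may differ.
DEFAULT_CORE_REPO = "git+https://github.com/open-telemetry/opentelemetry-python.git"


def resolve_core_repo(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return CORE_REPO from the environment, or the remote default.

    Treating an empty value as unset is a choice made in this sketch.
    """
    env = os.environ if environ is None else environ
    return env.get("CORE_REPO") or DEFAULT_CORE_REPO
```

Passing a mapping explicitly keeps the function easy to check without touching the real process environment.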
.github/workflows/generate_workflows_lib/src/generate_workflows_lib/core_contrib_test.yml.j2
Declare env var for the workflow instead of each individual job to reduce unnecessary repetition.
Passing test from the core repo with the latest commit:
@@ -30,6 +31,13 @@ jobs:
          repository: open-telemetry/opentelemetry-python-contrib
          ref: ${% raw %}{{ env.CONTRIB_REPO_SHA }}{% endraw %}

      - name: Checkout core repo @ SHA - ${% raw %}{{ env.CORE_REPO_SHA }}{% endraw %}
We are performing a checkout action for each job? Wouldn't this add quite a bit of overhead to the build run time?
According to @emdneto, the checkout action takes ~1s each. This will also grow linearly with the number of instrumentations we have.
Seeing as we can't easily prove that this fixes the transient issue, and it only affects a very small percentage of runs (where the workaround is to run the test again), I'm not sure this change is worth the increased run time for ALL PR runs.
This specific error has happened in 7 out of the last 50 runs, which indicates it could account for a significant portion of that test's 34% failure rate.
The average runner time consumed by each execution of this test over the last 30 days was 85 minutes, so the retries will be consuming a significant amount of time as well.
Maybe I'm missing permissions, but when I run into this error I can't retry failed steps; I have to run everything again.
I can investigate this further - it should be possible to retrieve the data required to quantify the impact (retries) and the frequency of this error.
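To make the trade-off in this thread concrete, the figures quoted above (7 of 50 runs, a 34% failure rate, 85 runner minutes per execution, ~1s per checkout) can be combined in a back-of-the-envelope calculation. This is a sketch under stated assumptions: the 34% rate is assumed to apply to the same 50 runs, every occurrence of the error is assumed to force one full rerun, and the job count of 60 is a made-up placeholder.

```python
def error_share_of_failures(error_runs: int, total_runs: int, failure_rate: float):
    """Return (error rate per run, share of all failed runs caused by the error)."""
    error_rate = error_runs / total_runs
    failed_runs = failure_rate * total_runs
    return error_rate, error_runs / failed_runs


def expected_rerun_minutes(error_rate: float, minutes_per_run: float) -> float:
    """Expected extra runner minutes per run if each error forces a full rerun."""
    return error_rate * minutes_per_run


def checkout_overhead_minutes(job_count: int, seconds_per_checkout: float = 1.0) -> float:
    """Total added runner time from one extra checkout step per job."""
    return job_count * seconds_per_checkout / 60


# Figures quoted in the thread.
error_rate, share = error_share_of_failures(7, 50, 0.34)  # about 0.14 and 0.41
rerun_cost = expected_rerun_minutes(error_rate, 85)       # about 11.9 minutes
# Hypothetical job count, used only to show the linear growth.
overhead = checkout_overhead_minutes(60)                   # about 1 minute
```

Under these assumptions the error would explain roughly 4 in 10 failed runs, and the expected rerun cost per run would be larger than the added checkout time unless the job count were in the hundreds.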
We can test whether this PR solves the issue by opening a PR in core pointing to this branch, and I can trigger the CI a couple of times. From what I observed, every PR has at least ~1 error occurrence.
@jomcgi https://github.com/open-telemetry/opentelemetry-python/actions/runs/14369769807
I left the workflow pointing to main on purpose to see if it would fail. It is still using git+https://, and there have been no failures in ~5 runs. Probably all we need is to do the checkout as was done before.
Description
Core contrib is intermittently failing when pulling the core repo in tox.
Add gh actions/checkout for the core repo before running tox to mitigate this.
As the error is transient I'm not 100% sure that this resolves it but I have not encountered it with any of the GH Actions that have been triggered.
Fixes #3352
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
git+file://.. references instead of the remote repo
Does This PR Require a Core Repo Change?
Checklist: