test: reduce TAV test matrix for slowest jobs #3321
This tweaks the .tav.yml config for most TAV test matrix jobs that were observed to be taking >20 minutes in CI recently. One job was taking longer than 40 minutes and being aborted: `test-tav (14, bluebird,got,mimic-response)`.

- mimic-response: The agent instruments, and will only ever instrument, a single version: v1.0.0. Therefore there is no need to "test all versions" as long as the regular non-TAV tests test that version. Lock package.json to that version.
- bluebird, got, pg, mongodb-core, knex, tedious: Massively reduce the number of versions tested when the latest release for a given major version was more than 2-3 years ago.
- fastify, aws-sdk: Update the tested ranges to test the first, last, and approximately 5 versions in between. (Added ./dev-utils/tav-versions-fastify.sh for this.)

Also:

- Fix a test in bluebird.test.js to be less flaky. In a local run it was observed to fail the 'new Promise -> timeout (timed out)' test case with: `Error: start + 49 should be <= 1683227232779 - was 1683227232782`.
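As an illustration of the "first, last, and approximately 5 versions in between" selection described for fastify and aws-sdk, here is a minimal JavaScript sketch. It is not the logic of ./dev-utils/tav-versions-fastify.sh (which is a shell script not shown here); the `sampleVersions` name, its `numBetween` parameter, and the even-spacing strategy are all assumptions made for this example. It also assumes the input list is already sorted in ascending version order.

```javascript
// Hypothetical helper, for illustration only. It is NOT the logic of
// dev-utils/tav-versions-fastify.sh; the name and parameters are assumptions.
//
// `versions` is assumed to be sorted in ascending version order already.
function sampleVersions (versions, numBetween = 5) {
  // With few enough versions there is nothing to trim: test them all.
  if (versions.length <= numBetween + 2) {
    return versions.slice()
  }

  const lastIdx = versions.length - 1
  // Always keep the first and the last version.
  const picked = new Set([0, lastIdx])

  // Spread `numBetween` indices roughly evenly across the interior.
  for (let i = 1; i <= numBetween; i++) {
    picked.add(Math.round((i * lastIdx) / (numBetween + 1)))
  }

  // A Set drops any duplicate indices; sort to restore ascending order.
  return [...picked].sort((a, b) => a - b).map((idx) => versions[idx])
}
```

The returned list could then be joined with ` || ` to form a semver range for a `versions` entry in .tav.yml, assuming that entry takes a semver range string.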
@@ -0,0 +1,42 @@
#!/bin/sh
We already have a couple of files with the same logic. What do you think about a more generic script in a follow-up issue?
I'm fine either way. I am kind of hoping to spend some time on the side improving tav
to support functionality like this natively. That's vapourware right now, however.
Timings from the TAV run after this merge (https://github.com/elastic/apm-agent-nodejs/actions/runs/4895356707):
So we had some success.
…re/support-specific-modules * 'main' of github.com:elastic/apm-agent-nodejs: (54 commits) chore: fix dev-utils/ci-tav-slow-jobs.sh (elastic#3319) test: reduce TAV test matrix for slowest jobs (elastic#3321) chore: sync package-lock so 'npm ci' can work (elastic#3318) docs: document `useElasticTraceparentHeader` config var (elastic#3316) chore, test: test driver improvements (elastic#3293) test: drop node 14 from RC tests now that it is EOL (elastic#3315) test: fix running fastify.test.js with node v8 (elastic#3317) feat: add @apollo/server@4 support (elastic#3203) chore: update nvm (elastic#3309) tests: stop testing 'express-graphql' instrumentation (elastic#3304) chore: fix bitrot.js dev util for recent changes (elastic#3308) test: restore testing of Azure Functions on node >=18.x (elastic#3307) fix: support Lambda instrumentation for `contextManager: 'patch'`; refactor Lambda tests (elastic#3305) test: fix fastify TAV test failures (elastic#3314) test: fix @aws-sdk/client-s3 TAV test failures (elastic#3312) feat: add instrumentation for aws-sdk S3 client (elastic#3287) feat(fastify): add captureBody support (elastic#2681) feat: mysql2@3 support (elastic#3301) chore(deps): bump @opentelemetry/exporter-prometheus from 0.37.0 to 0.38.0 in /test/opentelemetry-metrics/fixtures (elastic#3295) chore(deps-dev): bump fastify from 4.16.3 to 4.17.0 (elastic#3296) ...
Details
Here is a table of the slowest TAV jobs on a recent run in CI. These are all
those that took >20 minutes on that run. The latest column is a count of the
number of module versions that were tested for each job.
For example, for the `test-tav (14, tedious)` job there were 143 versions of tedious that were installed and tested. That's a waste of resources and time.
The `-> 21` added for some of the rows shows the reduced number of versions after the changes in this PR. I only added that for some.
(Note to self: I calculated these counts of tested module versions with a command like:
`TAV=tedious node14 ~/tm/test-all-versions/index.js -n --verbose | \rg '^\{' | json -ga numVersions | paste -s -d+ - | bc`.)
Detail on the bluebird.test.js change
In local testing I noticed this flaky test failure:
The bluebird.test.js change is to hopefully avoid this possible flaky failure in the future.
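The error above shows a timer callback running about 3 ms earlier than the test's wall-clock check expected (`start + 49` was 1683227232782, while the observed time was 1683227232779). One common way to harden that kind of assertion is to allow a small tolerance. The sketch below is only an illustration of that idea, not necessarily what the bluebird.test.js change does; `elapsedAtLeast` and its `slopMs` default are hypothetical.

```javascript
// Hypothetical helper, for illustration only; this is not the actual change
// made to bluebird.test.js. The default tolerance is an assumption.
//
// Returns true if at least `expectedMs` (minus a tolerance of `slopMs`)
// elapsed between the `start` and `now` timestamps (both in milliseconds).
function elapsedAtLeast (start, now, expectedMs, slopMs = 10) {
  return now - start >= expectedMs - slopMs
}

// Values taken from the failure message above:
// start + 49 === 1683227232782, observed time === 1683227232779.
const start = 1683227232782 - 49
const observed = 1683227232779

// A strict check (no tolerance) reproduces the flaky failure...
console.log(elapsedAtLeast(start, observed, 49, 0)) // false
// ...while a small tolerance accepts the 3 ms shortfall.
console.log(elapsedAtLeast(start, observed, 49)) // true
```

The trade-off is that a larger tolerance makes the test less sensitive to a timer that genuinely fires too early, so the slop should stay small relative to the timeout under test.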