test: reduce TAV test matrix for slowest jobs #3321

trentm · 2023-05-04T23:47:31Z

This tweaks the .tav.yml config for most TAV test matrix jobs that were
observed to be taking >20 minutes in CI recently. One job was taking
longer than 40 minutes and being aborted:
test-tav (14, bluebird,got,mimic-response)

mimic-response: The agent instruments and will only ever instrument a
single version: v1.0.0. Therefore there is no need for to "test all
versions" as long as the regular non-TAV tests test that version. Lock
package.json to that version.
bluebird, got, pg, mongodb-core, knex, tedious: Massively reduce the
number of versions tested when the latest release for a given major
version was more than 2-3 years ago.
fastify, aws-sdk: Update the tested ranges to test the first, last,
and approximately 5 versions in between. (Added
./dev-utils/tav-versions-fastify.sh for this.)

Also:

Fix a test in bluebird.test.js to be less flaky. In a local run it was
observed to fail the 'new Promise -> timeout (timed out)' test case
with:
Error: start + 49 should be <= 1683227232779 - was 1683227232782.

Details

Here is a table of the slowest TAV jobs on a recent run in CI. These are all
those that took >20 minutes on that run. The latest column is a count of the
number of module versions that were tested for each job.

TIME        TEST JOB                                    NUM MODULE VERSIONS TESTED
----------- ------------------------------------------  --------------------------
1217 20m17s test-tav (10, pg)                            46 versions
1217 20m17s test-tav (14, next)                          29 versions *not changed*
1255 20m55s test-tav (18, bluebird,got,mimic-response)  223 versions
1268 21m08s test-tav (14, express)                       83 versions
1290 21m30s test-tav (8, pg)                             46 versions
1306 21m46s test-tav (10, tedious)                      116 versions
1321 22m01s test-tav (10, mongodb-core)                 102 versions
1339 22m19s test-tav (12, pg)                            46 versions
1352 22m32s test-tav (14, tedious)                      143 versions -> 21
1370 22m50s test-tav (8, knex)                           68 versions
1390 23m10s test-tav (12, mongodb-core)                 102 versions
1410 23m30s test-tav (14, knex)                         106 versions
1454 24m14s test-tav (8, mongodb-core)                  102 versions -> 8
1539 25m39s test-tav (14, aws-sdk)                       77 versions
1543 25m43s test-tav (12, knex)                         106 versions
1657 27m37s test-tav (8, aws-sdk)                        77 versions
1666 27m46s test-tav (10, aws-sdk)                       77 versions
1671 27m51s test-tav (10, knex)                         106 versions -> 6
1707 28m27s test-tav (8, bluebird,got,mimic-response)   223 versions
1820 30m20s test-tav (12, aws-sdk)                       77 versions
1900 31m40s test-tav (12, bluebird,got,mimic-response)  223 versions
2001 33m21s test-tav (14, fastify)                      125 versions -> 16
2042 34m02s test-tav (10, bluebird,got,mimic-response)  223 versions
2414 40m14s test-tav (14, bluebird,got,mimic-response)  223 versions -> 5

For example for the test-tav (14, tedious) job there were 143 versions of
tedious that were installed and tested. That's a waste of resources and time.
The -> 21 added for some of the rows shows the reduced number of versions after
the changes in this PR. I only added that for some.

(Note to self: I calculated these number of module versions tests with a command like:
TAV=tedious node14 ~/tm/test-all-versions/index.js -n --verbose | \rg '^\{' | json -ga numVersions | paste -s -d+ - | bc.)

Detail on the bluebird.test.js change

In local testing I noticed this flaky test failure:

node_tests_1  | # new Promise -> timeout (reject in time)
node_tests_1  | ok 531 should be strictly equal
node_tests_1  | ok 532 should be strictly equal
node_tests_1  | ok 533 should be strictly equal
node_tests_1  | ok 534 should be strictly equal
node_tests_1  | # new Promise -> timeout (timed out)
node_tests_1  | not ok 535 start + 49 should be <= 1683227232779 - was 1683227232782
node_tests_1  |   ---
node_tests_1  |     operator: ok
node_tests_1  |     expected: true
node_tests_1  |     actual:   false
node_tests_1  |     at: <anonymous> (/app/test/instrumentation/modules/bluebird/bluebird.test.js:1156:9)
node_tests_1  |     stack: |-
node_tests_1  |       Error: start + 49 should be <= 1683227232779 - was 1683227232782
node_tests_1  |           at Test.assert [as _assert] (/app/node_modules/tape/lib/test.js:312:48)
node_tests_1  |           at Test.bound [as _assert] (/app/node_modules/tape/lib/test.js:95:17)
node_tests_1  |           at Test.assert (/app/node_modules/tape/lib/test.js:431:7)
node_tests_1  |           at Test.bound [as ok] (/app/node_modules/tape/lib/test.js:95:17)
node_tests_1  |           at /app/test/instrumentation/modules/bluebird/bluebird.test.js:1156:9
node_tests_1  |           at /app/lib/instrumentation/run-context/AbstractRunContextManager.js:95:49
node_tests_1  |           at AsyncLocalStorage.run (async_hooks.js:303:14)
node_tests_1  |           at AsyncLocalStorageRunContextManager.with (/app/lib/instrumentation/run-context/AsyncLocalStorageRunContextManager.js:57:36)
node_tests_1  |           at wrapper (/app/lib/instrumentation/run-context/AbstractRunContextManager.js:95:23)
node_tests_1  |           at /app/lib/instrumentation/run-context/AbstractRunContextManager.js:95:49
node_tests_1  |           at AsyncLocalStorage.run (async_hooks.js:303:14)
node_tests_1  |           at AsyncLocalStorageRunContextManager.with (/app/lib/instrumentation/run-context/AsyncLocalStorageRunContextManager.js:57:36)
node_tests_1  |           at wrapper (/app/lib/instrumentation/run-context/AbstractRunContextManager.js:95:23)
node_tests_1  |           at tryCatch1 (/app/node_modules/bluebird/js/main/util.js:29:21)
node_tests_1  |           at Promise._settlePromiseFromHandler (/app/node_modules/bluebird/js/main/promise.js:588:13)
node_tests_1  |           at Promise._settlePromiseAt (/app/node_modules/bluebird/js/main/promise.js:752:18)
node_tests_1  |           at Promise._settlePromises (/app/node_modules/bluebird/js/main/promise.js:869:14)
node_tests_1  |           at Async._drainQueue (/app/node_modules/bluebird/js/main/async.js:78:16)
node_tests_1  |           at Async._drainQueues (/app/node_modules/bluebird/js/main/async.js:88:10)
node_tests_1  |           at Immediate.Async.drainQueues [as _onImmediate] (/app/node_modules/bluebird/js/main/async.js:13:14)
node_tests_1  |           at processImmediate (internal/timers.js:464:21)
node_tests_1  |           at process.callbackTrampoline (internal/async_hooks.js:130:17)
node_tests_1  |   ...
node_tests_1  | ok 536 should be truthy
node_tests_1  | ok 537 should be strictly equal
node_tests_1  | not ok 538 start + 49 should be <= 1683227232782 - was 1683227232783
node_tests_1  |   ---
node_tests_1  |     operator: ok
node_tests_1  |     expected: true
node_tests_1  |     actual:   false
node_tests_1  |     at: <anonymous> (/app/test/instrumentation/modules/bluebird/bluebird.test.js:1156:9)
node_tests_1  |     stack: |-
node_tests_1  |       Error: start + 49 should be <= 1683227232782 - was 1683227232783
node_tests_1  |           at Test.assert [as _assert] (/app/node_modules/tape/lib/test.js:312:48)
node_tests_1  |           at Test.bound [as _assert] (/app/node_modules/tape/lib/test.js:95:17)
node_tests_1  |           at Test.assert (/app/node_modules/tape/lib/test.js:431:7)
node_tests_1  |           at Test.bound [as ok] (/app/node_modules/tape/lib/test.js:95:17)
node_tests_1  |           at /app/test/instrumentation/modules/bluebird/bluebird.test.js:1156:9
node_tests_1  |           at /app/lib/instrumentation/run-context/AbstractRunContextManager.js:95:49
node_tests_1  |           at AsyncLocalStorage.run (async_hooks.js:303:14)
node_tests_1  |           at AsyncLocalStorageRunContextManager.with (/app/lib/instrumentation/run-context/AsyncLocalStorageRunContextManager.js:57:36)
node_tests_1  |           at wrapper (/app/lib/instrumentation/run-context/AbstractRunContextManager.js:95:23)
node_tests_1  |           at /app/lib/instrumentation/run-context/AbstractRunContextManager.js:95:49
node_tests_1  |           at AsyncLocalStorage.run (async_hooks.js:314:14)
node_tests_1  |           at AsyncLocalStorageRunContextManager.with (/app/lib/instrumentation/run-context/AsyncLocalStorageRunContextManager.js:57:36)
node_tests_1  |           at wrapper (/app/lib/instrumentation/run-context/AbstractRunContextManager.js:95:23)
node_tests_1  |           at tryCatch1 (/app/node_modules/bluebird/js/main/util.js:29:21)
node_tests_1  |           at Promise._settlePromiseFromHandler (/app/node_modules/bluebird/js/main/promise.js:588:13)
node_tests_1  |           at Promise._settlePromiseAt (/app/node_modules/bluebird/js/main/promise.js:752:18)
node_tests_1  |           at Promise._settlePromises (/app/node_modules/bluebird/js/main/promise.js:869:14)
node_tests_1  |           at Async._drainQueue (/app/node_modules/bluebird/js/main/async.js:78:16)
node_tests_1  |           at Async._drainQueues (/app/node_modules/bluebird/js/main/async.js:88:10)
node_tests_1  |           at Immediate.Async.drainQueues [as _onImmediate] (/app/node_modules/bluebird/js/main/async.js:13:14)
node_tests_1  |           at processImmediate (internal/timers.js:464:21)
node_tests_1  |           at process.callbackTrampoline (internal/async_hooks.js:130:17)
node_tests_1  |   ...
node_tests_1  | ok 539 should be truthy
node_tests_1  | ok 540 should be strictly equal
node_tests_1  | # new Promise -> reject -> tap -> catch
node_tests_1  | ok 541 should be strictly equal

The bluebird.test.js change is to hopefully avoid this possible flaky failure in the future.

This tweaks the .tav.yml config for most TAV test matrix jobs that were observed to be taking >20 minutes in CI recently. One job was taking longer than 40 minutes and being aborted: test-tav (14, bluebird,got,mimic-response) - mimic-response: The agent instruments and will only ever instrument a single version: v1.0.0. Therefore there is no need for to "test all versions" as long as the regular non-TAV tests test that version. Lock package.json to that version. - bluebird, got, pg, mongodb-core, knex, tedious: Massively reduce the number of versions tested when the latest release for a given major version was more than 2-3 years ago. - fastify, aws-sdk: Update the tested ranges to test the first, last, and approximately 5 versions in between. (Added ./dev-utils/tav-versions-fastify.sh for this.) Also: - Fix a test in bluebird.test.js to be less flaky. In a local run it was observed to fail the 'new Promise -> timeout (timed out)' test case with: `Error: start + 49 should be <= 1683227232779 - was 1683227232782`.

david-luna · 2023-05-05T13:50:15Z

dev-utils/tav-versions-fastify.sh

@@ -0,0 +1,42 @@
+#!/bin/sh


We have already a couple of files with the same logic. What do you think about a more generic script in a follow up issue?

I'm fine either way. I am kind of hoping to spend some time on the side improving tav to support functionality like this natively. That's vapourware right now, however.

trentm · 2023-05-05T18:49:46Z

Timings from the TAV run after this merge (https://github.com/elastic/apm-agent-nodejs/actions/runs/4895356707):

...
934 15m34s test-tav (12, graphql)
935 15m35s test-tav (14, express)
950 15m50s test-tav (14, handlebars,pug)
954 15m54s test-tav (12, ws)
979 16m19s test-tav (10, graphql)
988 16m28s test-tav (14, redis)
999 16m39s test-tav (8, ws)
1014 16m54s test-tav (8, graphql)
1032 17m12s test-tav (14, graphql)
1046 17m26s test-tav (16, next)
1142 19m02s test-tav (18, next)
1176 19m36s test-tav (14, next)
1267 21m07s test-tav (8, express)
2413 40m13s test-tav (16, knex)
2413 40m13s test-tav (20, knex)
2414 40m14s test-tav (10, knex)
2414 40m14s test-tav (12, knex)
2414 40m14s test-tav (14, knex)
2414 40m14s test-tav (18, knex)
2416 40m16s test-tav (8, knex)

So we had some success.

something going on with knex that is weird
express, next, etc. are likely next targets for improving

…re/support-specific-modules * 'main' of github.com:elastic/apm-agent-nodejs: (54 commits) chore: fix dev-utils/ci-tav-slow-jobs.sh (elastic#3319) test: reduce TAV test matrix for slowest jobs (elastic#3321) chore: sync package-lock so 'npm ci' can work (elastic#3318) docs: document `useElasticTraceparentHeader` config var (elastic#3316) chore, test: test driver improvements (elastic#3293) test: drop node 14 from RC tests now that it is EOL (elastic#3315) test: fix running fastify.test.js with node v8 (elastic#3317) feat: add @apollo/server@4 support (elastic#3203) chore: update nvm (elastic#3309) tests: stop testing 'express-graphql' instrumentation (elastic#3304) chore: fix bitrot.js dev util for recent changes (elastic#3308) test: restore testing of Azure Functions on node >=18.x (elastic#3307) fix: support Lambda instrumentation for `contextManager: 'patch'`; refactor Lambda tests (elastic#3305) test: fix fastify TAV test failures (elastic#3314) test: fix @aws-sdk/client-s3 TAV test failures (elastic#3312) feat: add instrumentation for aws-sdk S3 client (elastic#3287) feat(fastify): add captureBody support (elastic#2681) feat: mysql2@3 support (elastic#3301) chore(deps): bump @opentelemetry/exporter-prometheus from 0.37.0 to 0.38.0 in /test/opentelemetry-metrics/fixtures (elastic#3295) chore(deps-dev): bump fastify from 4.16.3 to 4.17.0 (elastic#3296) ...

This tweaks the .tav.yml config for most TAV test matrix jobs that were observed to be taking >20 minutes in CI recently. One job was taking longer than 40 minutes and being aborted: test-tav (14, bluebird,got,mimic-response) - mimic-response: The agent instruments and will only ever instrument a single version: v1.0.0. Therefore there is no need for to "test all versions" as long as the regular non-TAV tests test that version. Lock package.json to that version. - bluebird, got, pg, mongodb-core, knex, tedious: Massively reduce the number of versions tested when the latest release for a given major version was more than 2-3 years ago. - fastify, aws-sdk: Update the tested ranges to test the first, last, and approximately 5 versions in between. (Added ./dev-utils/tav-versions-fastify.sh for this.) Also: - Fix a test in bluebird.test.js to be less flaky. In a local run it was observed to fail the 'new Promise -> timeout (timed out)' test case with: `Error: start + 49 should be <= 1683227232779 - was 1683227232782`.

trentm requested a review from david-luna May 4, 2023 23:47

trentm self-assigned this May 4, 2023

github-actions bot added the agent-nodejs Make available for APM Agents project planning. label May 4, 2023

david-luna approved these changes May 5, 2023

View reviewed changes

trentm merged commit e36f003 into main May 5, 2023

trentm deleted the trentm/tav-slow-jobs-reduce-matrix branch May 5, 2023 16:08

trentm mentioned this pull request May 8, 2023

chore(deps-dev): bump mimic-response from 1.0.0 to 2.1.0 #3326

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: reduce TAV test matrix for slowest jobs #3321

test: reduce TAV test matrix for slowest jobs #3321

trentm commented May 4, 2023 •

edited

Loading

david-luna May 5, 2023

trentm May 5, 2023

trentm commented May 5, 2023

test: reduce TAV test matrix for slowest jobs #3321

test: reduce TAV test matrix for slowest jobs #3321

Conversation

trentm commented May 4, 2023 • edited Loading

Details

Detail on the bluebird.test.js change

david-luna May 5, 2023

Choose a reason for hiding this comment

trentm May 5, 2023

Choose a reason for hiding this comment

trentm commented May 5, 2023

trentm commented May 4, 2023 •

edited

Loading