Skip to content

ci: tav tests failing for kafkajs v1.7.0 #2784

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
JamieDanielson opened this issue Apr 10, 2025 · 8 comments · Fixed by #2787 or #2789
Closed

ci: tav tests failing for kafkajs v1.7.0 #2784

JamieDanielson opened this issue Apr 10, 2025 · 8 comments · Fixed by #2787 or #2789
Assignees
Labels
bug Something isn't working pkg:instrumentation-kafkajs priority:p4 Bugs and spec inconsistencies which do not fall into a higher prioritization

Comments

@JamieDanielson
Copy link
Member

JamieDanielson commented Apr 10, 2025

TAV tests are failing for kafka instrumentation on an older version of kafkajs.

In older tests, there were warnings about a change in KafkaJS v2.0.0 for default partitioners. To see this, search the raw log for required packages ["kafkajs@2..

{"level":"WARN","timestamp":"2025-03-19T09:42:57.788Z","logger":"kafkajs","message":"KafkaJS v2.0.0 switched default partitioner. To retain the same partitioning behavior as in previous versions, create the producer with the option "createPartitioner: Partitioners.LegacyPartitioner". See the migration guide at https://kafka.js.org/docs/migration-guide-v2.0.0#producer-new-default-partitioner for details. Silence this warning by setting the environment variable "KAFKAJS_NO_PARTITIONER_WARNING=1""}

In #2752 we updated the test to remove the warning, with createPartitioner: kafkajs.Partitioners.LegacyPartitioner. It seems that PR didn't run TAV tests so we didn't see the failing test.

Now we have a failing test with "[email protected]": TypeError: Cannot read properties of undefined (reading 'LegacyPartitioner')

@JamieDanielson JamieDanielson added the bug Something isn't working label Apr 10, 2025
@JamieDanielson
Copy link
Member Author

I guess the options are

  • Revert the test change. Revert the change in the test using LegacyPartitioner which means we continue to get the warning and punt the update to a later time.
  • Update test based on version. Only use the LegacyPartitioner change for the versions that support it (v1.8.0+).
  • Stop testing v1.7.0.

@JamieDanielson JamieDanielson added priority:p4 Bugs and spec inconsistencies which do not fall into a higher prioritization pkg:instrumentation-kafkajs labels Apr 10, 2025
@JamieDanielson
Copy link
Member Author

JamieDanielson commented Apr 10, 2025

Interesting. I think we inadvertently skipped testing below 1.x with #2758 with the mode: latest-minors. So while it shows include: ">=0.3.0 <3" in the tav file, I think it's effectively include: ">=1 <3" because I don't see any tests for required packages ["kafkajs@0. anymore.

@trentm
Copy link
Contributor

trentm commented Apr 10, 2025

In older tests,

@JamieDanielson
That link doesn't work for me.

@trentm
Copy link
Contributor

trentm commented Apr 10, 2025

I think it's effectively include: ">=1 <3" because I don't see any tests for required packages ["kafkajs@0. anymore.

Possibly because TAV will abort on the first failure. So because it is failing on 1.7.0, it doesn't get down to 0.x versions.

@trentm
Copy link
Contributor

trentm commented Apr 10, 2025

So because it is failing on 1.7.0, it doesn't get down to 0.x versions.

Yah, you can see that by running tav with the --compat option:

[12:48:28 trentm@peach:~/tm/opentelemetry-js-contrib7/plugins/node/instrumentation-kafkajs]
% npm run test-all-versions -- --compat

> @opentelemetry/[email protected] test-all-versions
> tav --compat

Testing compatibility with kafkajs:
✔ 2.2.4
✔ 2.1.0
✔ 2.0.2
✔ 1.16.0
✔ 1.15.0
✔ 1.14.0
✔ 1.13.0
✔ 1.12.0
✔ 1.11.0
✔ 1.10.0
✔ 1.9.3
✔ 1.8.1
✖ 1.7.0
✖ 1.6.0
✖ 1.5.2
✖ 1.4.8
✖ 1.3.1
✖ 1.2.0
✖ 1.1.0
✖ 1.0.1
✖ 0.8.1
✖ 0.7.1
✖ 0.6.8
✖ 0.5.0
✖ 0.4.1
✖ 0.3.2

@trentm trentm self-assigned this Apr 10, 2025
@JamieDanielson
Copy link
Member Author

@JamieDanielson That link doesn't work for me.

Oops sorry about that! Basically I clicked on a workflow run in a previous PR, clicked Raw Logs, and searched through there.

@JamieDanielson
Copy link
Member Author

Yah, you can see that by running tav with the --compat option:

Ah good to know, thank you!

@trentm
Copy link
Contributor

trentm commented Apr 10, 2025

first issue: failing with [email protected] and lower

I am guessing that part of the #2752 change used this to attempt to not change the Partitioner that was being used for tests:

        producer = kafka.producer({
          createPartitioner: kafkajs.Partitioners.LegacyPartitioner,
        });

https://github.com/tulios/kafkajs/blob/master/docs/Producing.md#default-partitioners talks about LegacyPartitioner being the default until [email protected]. First I tried this:

        producer = kafka.producer(
          kafkajs.Partitioners?.LegacyPartitioner && {
            createPartitioner: kafkajs.Partitioners.LegacyPartitioner,
          }
        );

which works. But this also works:

        producer = kafka.producer();

I prefer the latter, because we are then testing with the kafka Producer defaults instead of tying to a legacy client thing for no particular reason.

Making this change gets more versions passing, but we start failing on v1.4.8:

Testing compatibility with kafkajs:
✔ 2.2.4
✔ 2.1.0
✔ 2.0.2
✔ 1.16.0
✔ 1.15.0
✔ 1.14.0
✔ 1.13.0
✔ 1.12.0
✔ 1.11.0
✔ 1.10.0
✔ 1.9.3
✔ 1.8.1
✔ 1.7.0
✔ 1.6.0
✔ 1.5.2
✖ 1.4.8
✖ 1.3.1
✖ 1.2.0
✖ 1.1.0
✖ 1.0.1
✖ 0.8.1
✖ 0.7.1
✖ 0.6.8
✖ 0.5.0
✖ 0.4.1
✖ 0.3.2

second issue: failing with [email protected] and lower

The test failure is this:

  1) instrumentation-kafkajs
       producer
         successful send
           "before each" hook for "simple send create span with right attributes, pass return value correctly and propagate context":
     KafkaJSNonRetriableError: Event name should be one of producer.events.CONNECT, producer.events.DISCONNECT
      at Object.on (node_modules/kafkajs/src/producer/index.js:136:13)
      at KafkaJsInstrumentation._setKafkaEventListeners (src/instrumentation.ts:15:4)
      at Client.consumer (src/instrumentation.ts:15:4)
      at Client.kafkajs.Kafka.producer (test/kafkajs.test.ts:151:44)
      at Client.consumer (src/instrumentation.ts:15:4)
      at initializeProducer (test/kafkajs.test.ts:229:26)
      at Context.<anonymous> (test/kafkajs.test.ts:235:9)
      at processImmediate (node:internal/timers:476:21)

#2752 added support for the METRIC_MESSAGING_CLIENT_OPERATION_DURATION metric by adding a handler for kafkajs's consumer.events.REQUEST instrumentation event. That instrumentation event was added in [email protected]:

commit b9581cb96f077240e848094d79778bb8a3ac2c2e
Date:   2018-12-07T17:06:38+01:00 (6 years ago)

    Accept network instrumentation events on the producer

diff --git a/src/producer/instrumentationEvents.js b/src/producer/instrumentationEvents.js
index c5eb56c4..95fb68a2 100644
--- a/src/producer/instrumentationEvents.js
+++ b/src/producer/instrumentationEvents.js
@@ -1,7 +1,26 @@
+const swapObject = require('../utils/swapObject')
+const networkEvents = require('../network/InstrumentationEvents')
 const InstrumentationEventType = require('../instrumentation/eventType')
 const producerType = InstrumentationEventType('producer')

-module.exports = {
+const events = {
   CONNECT: producerType('connect'),
   DISCONNECT: producerType('disconnect'),
+  REQUEST: producerType(networkEvents.NETWORK_REQUEST),
+  REQUEST_TIMEOUT: producerType(networkEvents.NETWORK_REQUEST_TIMEOUT),
+}

We can fix this by not supporting the metric in older kafkajs versions:

--- a/plugins/node/instrumentation-kafkajs/src/instrumentation.ts
+++ b/plugins/node/instrumentation-kafkajs/src/instrumentation.ts
@@ -251,10 +251,13 @@ export class KafkaJsInstrumentation extends InstrumentationBase<KafkaJsInstrumen
   private _setKafkaEventListeners(kafkaObj: KafkaEventEmitter) {
     if (kafkaObj[EVENT_LISTENERS_SET]) return;

-    kafkaObj.on(
-      kafkaObj.events.REQUEST,
-      this._recordClientDurationMetric.bind(this)
-    );
+    // The REQUEST Consumer event was added in [email protected].
+    if (kafkaObj.events?.REQUEST) {
+      kafkaObj.on(
+      kafkaObj.on(
+        kafkaObj.events.REQUEST,
+        this._recordClientDurationMetric.bind(this)
+      );
+    }

After this tests pass. (I guess there isn't a test for that particular metric being emitted.)

trentm added a commit to trentm/opentelemetry-js-contrib that referenced this issue Apr 10, 2025
trentm added a commit that referenced this issue Apr 11, 2025
…d earlier (#2787)

The tests broke on [email protected] and earlier.
The instrumentation crashed on [email protected] and earlier.

Refs: #2784 (comment)
Fixes: #2784
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working pkg:instrumentation-kafkajs priority:p4 Bugs and spec inconsistencies which do not fall into a higher prioritization
Projects
None yet
2 participants