Connection terminated unexpectedly only in Node.js v20 #3083
Comments
Running out of ideas here. It appears that a connection inside a pool is getting terminated prematurely. The query happens immediately after the pool is created, so it is not a timeout issue. It looks like some weird race condition. Odd that it is only happening in Node.js v20. |
Sorry that's happening! Do you have any self-contained code to reproduce? |
Been breaking my neck the whole day trying to isolate this. So far I've got nothing that I can share. I will close this until I can share something more concrete. |
Have you tried bisecting the node versions or pg versions to see when the break happens? Given the randomness of the error and how hard it is to reproduce, I suggest starting with v20.0.0 (vs. v18.LATEST) as a start to see if that's where the error is starting. Nothing comes to mind in this module or node proper that would cause a break, but major versions are when internals like compiler versions, libc requirements, V8 runtimes, and many other things get updated. It could be any one of them or a combination of them that's leading to some new race condition. |
alright...I'm standing by! |
What's strange is that it happens only on 1 query.
const setupPool = await createPool(CAS_POSTGRES_DSN, {
  captureStackTrace: true,
  connectionTimeout: 5_000,
  maximumPoolSize: 1,
});
const { TEST_RUN_ID } = globalThis;
if (!TEST_RUN_ID) {
  throw new Error('TEST_RUN_ID is not set.');
}
const database = 'test_' + TEST_RUN_ID.replaceAll(/[^\da-z]/gu, '');
// https://github.com/brianc/node-postgres/issues/3083
await retry(
  async () => {
    await setupPool.query(sql.typeAlias('void')`
      CREATE DATABASE ${sql.identifier([database])}
      TEMPLATE contra_template
    `);
  },
  { delayMs: 100, maxRetries: 3 },
);
None of the other queries are affected by this. But the one that's responsible for setting up the test database keeps erroring with "Connection terminated unexpectedly". I cannot replicate the same error even once on v18. Adding a retry here fixed our CI from flaking, but the issue definitely persists. |
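For readers following along: the `retry` helper used in the snippet above is not shown anywhere in this thread. A minimal sketch of what such a helper might look like, assuming `maxRetries` counts retries after the first attempt and `delayMs` is a fixed pause between attempts (both are guesses at its semantics, not the actual implementation):

```javascript
// Hypothetical stand-in for the `retry` helper referenced above.
// Assumption: one initial attempt, then up to `maxRetries` further attempts,
// waiting `delayMs` milliseconds between them.
async function retry(task, { delayMs, maxRetries }) {
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      // Only wait if another attempt is still coming.
      if (attempt < maxRetries) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the most recent error.
  throw lastError;
}
```

Note that a helper like this retries on every error, which hides the flake rather than explaining it.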
So still no clue after debugging this for 24 hours. I've enabled logging of everything, but haven't caught anything definitive, especially nothing that would link this to the version update. We are also seeing the same issue in Node.js v21. Nothing if we downgrade to v18. One interesting data point that surfaced is ...
A burst of unexpected terminations happened at 2023-10-25T20:19:23. At 20:19:22 the database logged this:
I don’t have an educated guess here, but I am curious if it is somehow related to the FATAL error we are seeing because the database is being dropped while it is being vacuumed. We don’t need autovacuum in these test databases anyway, so I am removing it. Will try to validate this hypothesis. |
The node client crashing when the database does a fast shutdown is a known thing: #1324 (comment) Without an error event handler, the client will crash the node process if the connection gets killed. That only applies while the connection is checked out: the pool attaches its own error handlers, but they are only in place while the connection is sitting in the pool. To handle all errant situations, you need to add error handlers yourself, because the connection-crashed error arrives as an event, not as a promise rejection or callback error. Maybe this is purely a race condition in the tests, with the newer node versions running faster (or slower?) and triggering that test case at the exact moment when the DB is being dropped. |
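A rough sketch of the handler setup described in that comment. It assumes `pool` behaves like a node-postgres `Pool` (emits `'error'` for idle clients, `connect()` resolves to a client that emits `'error'` and has `release()`); the wrapper name `withGuardedClient` and the `onConnectionError` callback are made up for illustration:

```javascript
// Sketch only: `withGuardedClient` and `onConnectionError` are hypothetical names.
// `pool` is assumed to follow the node-postgres Pool interface.
function withGuardedClient(pool, onConnectionError) {
  // Errors on idle clients sitting in the pool surface as a pool-level 'error' event.
  pool.on('error', (error) => onConnectionError('idle', error));

  return async function run(task) {
    const client = await pool.connect();
    let connectionError;
    // While the client is checked out, attach our own listener so that a killed
    // connection does not become an unhandled 'error' event.
    const listener = (error) => {
      connectionError = error;
      onConnectionError('checked-out', error);
    };
    client.on('error', listener);
    try {
      return await task(client);
    } finally {
      client.removeListener('error', listener);
      // Passing an error to release() asks the pool to discard this client
      // instead of handing it out again.
      client.release(connectionError);
    }
  };
}
```

Usage would look something like `const run = withGuardedClient(pool, console.error); await run((client) => client.query('SELECT 1'));`. The in-flight query can still reject; the listener only keeps the event itself from going unhandled.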
Thank you. This gives me something to work with. |
Looks like you tried five years ago 😄 #1700 Try killing the connection with |
oh lol. The That would have caused a different error to propagate.
I realized I can replicate this by force killing the process that's running Postgres.
But it looks like I am already attaching error handlers to the pool and to the client? |
Added a test for unexpected backend termination https://github.com/gajus/slonik/pull/522/files#diff-ba6517f3e26c5f80c89793d89578ca3350fe0ff30f2c4a4c4e8ff051d92d720bR1 |
I am not entirely sure what the best way to handle this is. Is it safe to retry a query in this scenario? I would think so. Need to do more research. |
Depends on the query. It may or may not have completed. |
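To make the "may or may not have completed" point concrete for the `CREATE DATABASE` case above, here is one possible sketch. It assumes the driver error exposes the PostgreSQL SQLSTATE as `error.code` and that a dropped connection can be recognised by the "Connection terminated unexpectedly" message; `createDatabase` is a hypothetical callback that runs the statement, and whether your wrapper preserves these fields is something to verify:

```javascript
// SQLSTATE 42P04 is PostgreSQL's duplicate_database error.
const DUPLICATE_DATABASE = '42P04';

// Sketch only: `createDatabase` is a hypothetical callback that issues CREATE DATABASE.
// Assumptions: errors carry `code` (SQLSTATE) and connection loss is identifiable by message.
async function createDatabaseWithRetry(createDatabase, maxAttempts = 3) {
  let connectionLost = false;
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await createDatabase();
      return 'created';
    } catch (error) {
      // After a dropped connection, "already exists" may mean the earlier attempt
      // actually completed on the server before the connection died.
      if (connectionLost && error.code === DUPLICATE_DATABASE) {
        return 'already-created';
      }
      // Anything other than a dropped connection is not retried.
      if (!String(error.message).includes('Connection terminated unexpectedly')) {
        throw error;
      }
      connectionLost = true;
      lastError = error;
    }
  }
  throw lastError;
}
```

The same idea does not carry over to statements that are not idempotent, such as a plain `INSERT`; there a blind retry could apply the change twice.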
I am having the exact same error here. Sometimes it happens when reaching out to a Redis instance and sometimes when I have to reach out to the DB itself (MySQL). I could not find a way to solve this without downgrading the Node.js version. It seems that it is indeed something with Node. |
@JafetGuerra which Node.js version did you downgrade to? |
@timurturbil I am using Node.js 18 in some projects and Node.js 16 in only one. I believe that downgrading to 18 should work. |
In my case, this error occurred because the server whose APIs I was calling was serving through nginx http/1.1 to my node@20 API call. Updating the API server to nginx http/2.x resolved the issue. |
Has anyone encountered an issue migrating from Node.js v18 to v20?
I am trying to isolate exactly what's breaking, but so far without success.
All tests are passing in Node.js v18, but as soon as we try to run them in v20, they start to fail with:
It happens to about 1% of all tests (out of about a thousand).
It is not happening to the same tests.
Only happens when running a large number of parallel tests.