Redshift Long Running Query hangs indefinitely: Query completes on Redshift #1863
I'm not sure what changes in the library would have surfaced this issue. Could it be that there were other changes (a different OS) in your environment? This sounds to me like it could be related to a firewall or NAT somewhere along the way dropping idle connections. To keep the connection alive, TCP keepalive has to be set up with at most 200 seconds of idle timeout. You should be able to do this with the currently released version by setting the keepAlive connection option. The #1847 PR adds the keepAliveInitialDelayMillis option as well.
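A minimal sketch of that configuration, assuming the keepAlive and keepAliveInitialDelayMillis options mentioned above; the host, credentials, and database names are placeholders:

```js
const { Pool } = require('pg');

const pool = new Pool({
  host: 'my-cluster.example.us-east-1.redshift.amazonaws.com', // placeholder
  user: 'app',
  password: process.env.PGPASSWORD,
  database: 'analytics',
  port: 5439,
  // Send TCP keepalive probes on the underlying socket...
  keepAlive: true,
  // ...starting after 200 seconds of idle time, so nothing on the network
  // path ever sees the connection go quiet for longer than that.
  keepAliveInitialDelayMillis: 200 * 1000,
});
```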
This also seems to be happening to me. I am running a single query; it takes about 18-19 minutes. pgAdmin shows the connection dropped and the query result successfully committed, yet the promise from the pool client's query method never resolves. I tried

As mentioned by @OTooleMichael, it is working for small queries in my case. Other things I have noticed (they might have nothing to do with the issue):
I tested with both
The above did help to resolve the issue on my local machine (although no settings have changed locally save for the pg module), as per:

@boromisp I'm looking forward to the merge request. However, will it cover throwing an error when a TCP connection is closed from the other side? That, I think, is the main bug I faced here: the complete silence made it hard to track down. A promise/callback remaining forever open doesn't make sense, and will do things like depleting the Pool, right? Also, until it is merged, will this be a workaround?

```js
const { Pool } = require('pg');

const Pools = {}; // per-user pool cache

function initPool(userId) {
  const config = getAuth(userId); // getAuth() returns a pg connection config
  Pools[userId] = new Pool(config);
  Pools[userId].once('connect', function setKeepAlive(client) {
    // Overwrite the connection keepalive to be short (200 seconds)
    client.connection.stream.setKeepAlive(true, 200 * 1000);
  });
  return Pools[userId];
}
```
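For context, a hedged sketch of how this workaround might be wired up; the user id and the query are placeholders, and pg_sleep works on stock Postgres only (on Redshift, substitute any query that runs long enough):

```js
// Hypothetical usage of initPool() from the snippet above.
const pool = initPool('user-42');

pool.query('SELECT pg_sleep(1200)') // ~20 minutes, long enough to trip idle timeouts
  .then(() => console.log('query finished'))
  .catch((err) => console.error('connection error surfaced:', err));
```

One caveat with the snippet as written: once('connect') patches only the first connection the pool hands out; for pools that open several connections, on('connect') would cover them all.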
@OTooleMichael It seems OK to me. (Disclaimer: I'm not an expert; these are just my thoughts on the topic.)

Related discussions: pgsql-odbc, pgsql-hackers/1, pgsql-hackers/2

As I understand it, the current version of PostgreSQL relies on the transport layer being well behaved. If the connection is closed gracefully by either party, both the client and the server will handle it as expected. Unfortunately, the server could simply disappear or become unresponsive. If setting more aggressive TCP keepalives solves the problem, then most likely a firewall somewhere along the way drops the connection. If you read through the linked discussions, you will find other edge cases where only some of the components become unresponsive while others are still working.

After reading this blog post (and later skimming the linked discussions), I don't think there is a way to detect a "broken" connection reliably without application-level health checks. The (Linux-specific) TCP_USER_TIMEOUT socket option comes closest, but even that leaves edge cases.

Quick fixes, some of which you could implement depending on your use case (a rough sketch follows below):

- Simple queries could be run with an appropriate timeout.
- If you use a persistent connection to wait for notifications, then regularly running a trivial query as a heartbeat will surface a dead connection.
- Where it gets more complicated is the legitimately long-running queries. In those cases the connection will have no activity for extended periods, and/or it takes a long time to receive all the rows. If you expect to receive a lot of rows, you could listen on, and set up a timeout for, the row events.
- And finally, you could use the TCP keepalive settings discussed above to keep middleboxes from dropping the connection in the first place.

As an aside, whenever you abort a query locally because the connection seems broken, it might be a good idea to cancel it on the server side as well (e.g. pg_cancel_backend() from a separate connection), so the server does not keep working on a query nobody is waiting for.

And just a reminder: whenever we use timeouts to abort queries or close connections, there will be a race condition on whether or not the query succeeds.
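A rough sketch of the first two quick fixes, assuming node-postgres and a stock PostgreSQL server (Redshift does not support LISTEN/NOTIFY); the timeout values, channel name, and connection string are placeholders, not recommendations:

```js
const { Pool } = require('pg');

const pool = new Pool({
  connectionString: process.env.DATABASE_URL, // placeholder
  // Server-side cap: the server aborts statements running longer than 30s,
  // so the client receives an error instead of silence.
  statement_timeout: 30000,
  // Client-side cap: pg rejects the query promise if no result arrives in time.
  query_timeout: 35000,
});

// Heartbeat for a persistent connection held open for notifications:
// a trivial query on an interval keeps traffic flowing and surfaces a
// dead connection as an error instead of an endless wait.
async function listenWithHeartbeat(channel, onNotification) {
  const client = await pool.connect();
  client.on('notification', onNotification);
  await client.query(`LISTEN ${channel}`);
  const heartbeat = setInterval(() => {
    client.query('SELECT 1').catch((err) => {
      clearInterval(heartbeat);
      client.release(err); // destroy the broken connection instead of reusing it
    });
  }, 60 * 1000);
  // Return a cleanup function for a graceful shutdown.
  return () => { clearInterval(heartbeat); client.release(); };
}
```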
Just came here to say that this fixed the issue for me - thanks! I'm not using Redshift but an AWS-based managed TimescaleDB instance. I'm also running the application/queries on my machine inside a Docker container, i.e. from outside AWS, so I'm guessing that's what's doing it, as per the docs in the link above. I had initially tried using
First off, love the module and use it often for large ETL processes for multiple clients; it's certainly the best DB connection module on npm (and I often have to connect to literally every type of DB in the same project).
After upgrading from v6.4 to v7.9, long-running queries began to hang indefinitely despite completing successfully on the Redshift (RS) instance. I can't find a similar issue.
- Other queries can still be run against RS (and return results).
- No errors are thrown, and there is no evidence of success or failure.
- Queries work fine if they are shorter.
- I've tried listening to every error event I can find in the API and using pg.Client rather than pg.Pool, but there is no difference.
I'm out of ideas; code below.
Help much appreciated.
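For reference, a sketch of the kind of error wiring the report describes trying (not the reporter's actual code, which isn't shown here); handler bodies are placeholders, and none of these handlers fires in the failure mode described, which is the point of the issue:

```js
const { Pool, Client } = require('pg');

const pool = new Pool(/* config */);
// Fires for errors on idle clients in the pool (e.g. the backend died while idle).
pool.on('error', (err) => console.error('idle client error:', err));

const client = new Client(/* config */);
// Fires for errors on this specific connection after connect().
client.on('error', (err) => console.error('client error:', err));
client.on('end', () => console.log('connection ended'));
```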