connect failures cause queries to black-hole #718

Closed
davepacheco opened this issue Jan 27, 2015 · 7 comments

@davepacheco

This may be a dup of #632, but in my case, I'm not using a connection pool at all. I've found that if I issue a connect(), then queue up a few queries, and the connect() fails, I never get the callbacks for my queries. Similarly, if I issue a connect(), then queue up a few queries, and then the connect() succeeds, but I immediately call client.end(), then I get a callback for only one of the two queries. I understand why it's happening, but it doesn't seem like the desired behavior. I'd expect that when the connect() fails, any queries that have been queued would fail with an error indicating that the client is not connected.

Here's a pretty simple test program that shows all three cases:

/*
 * test-conn.js: connects to postgres and issues two trivial queries.  This is
 * used to test behavior when the postgres connection fails.  The user supplies
 * the conn string so that it's easy to test both the success and failure cases.
 */

var pg = require('pg');
var util = require('util');

var query = 'select NOW()';     /* sample trivial query */
var nqueries = 0;               /* number of outstanding queries */
var client;                     /* postgres client */
var timer;                      /* 5-second timer (see below) */

function main()
{
        if (process.argv.length < 3) {
                console.error('node test-conn.js PG_CONN_STR');
                console.error('example conn string: ' +
                    'postgres://username:password@localhost/database');
                process.exit(2);
        }

        /* Connect to postgres. */
        client = new pg.Client(process.argv[2]);
        client.connect(function (err) {
                if (err) {
                        console.error('pg client connect: %s', err.message);
                } else {
                        console.error('pg client connected');
                        if (process.argv[3] == '--close-on-connect') {
                                console.error('immediately closing pg client');
                                client.end();
                        }
                }
        });

        /*
         * Immediately issue two queries.  These will be queued on the client.
         * If the connection completes successfully, these queries will
         * generally complete successfully.  But if the connection fails, they
         * appear to "black-hole" (we never get the callbacks).
         */
        nqueries++;
        client.query(query, function (err, result) {
                console.error('pg.query[0] callback: ', err, result);
                queryDone();
        });

        nqueries++;
        client.query(query, function (err, result) {
                console.error('pg.query[1] callback: ', err, result);
                queryDone();
        });

        /*
         * If we just did that, then if the connection fails, the program prints
         * the error message and exits.  One might think that something
         * triggered an abnormal Node exit before the query callbacks get
         * called.  By setting a timer for 5 seconds, we can prove that Node is
         * still running for a while after the error, and we still never got the
         * query callbacks.  When the timer expires, the program will exit.
         */
        timer = setTimeout(function () {
                console.error('5 second timer expired!');
        }, 5000);
}

function queryDone()
{
        /*
         * When all of the queries complete, shut down the client and clear the
         * timer so that the program exits normally.
         */
        if (--nqueries === 0) {
                client.end();
                clearTimeout(timer);
                timer = null;
        }
}

main();

If I run it with a valid connection string, I get the expected results from the two queries:

$ node test-conn.js postgres://localhost:5432/zag 
pg client connected
pg.query[0] callback:  null { command: 'SELECT',
  rowCount: 1,
  oid: NaN,
  rows: [ { now: Mon Jan 26 2015 16:07:01 GMT-0800 (PST) } ],
  fields: 
   [ { name: 'now',
       tableID: 0,
       columnID: 0,
       dataTypeID: 1184,
       dataTypeSize: 8,
       dataTypeModifier: -1,
       format: 'text' } ],
  _parsers: [ [Function] ],
  RowCtor: [Function],
  rowAsArray: false,
  _getTypeParser: [Function] }
pg.query[1] callback:  null { command: 'SELECT',
  rowCount: 1,
  oid: NaN,
  rows: [ { now: Mon Jan 26 2015 16:07:02 GMT-0800 (PST) } ],
  fields: 
   [ { name: 'now',
       tableID: 0,
       columnID: 0,
       dataTypeID: 1184,
       dataTypeSize: 8,
       dataTypeModifier: -1,
       format: 'text' } ],
  _parsers: [ [Function] ],
  RowCtor: [Function],
  rowAsArray: false,
  _getTypeParser: [Function] }

If I run it with the wrong port number in the connection string, I get the connect() error message and no query callbacks:

$ node test-conn.js postgres://localhost:5433/zag 
pg client connect: connect ECONNREFUSED
5 second timer expired!

If I run it with the valid connection string again but give the --close-on-connect flag, the program calls client.end() immediately upon connect(), and I get exactly one callback:

$ node test-conn.js postgres://localhost:5432/zag --close-on-connect
pg client connected
immediately closing pg client
pg.query[0] callback:  [Error: Connection terminated] undefined
5 second timer expired!
@davepacheco

Is there any update on this issue?

@brianc

brianc commented Jun 21, 2016

Using the query queue isn't really recommended if you need consistent error handling. I recommend not issuing queries until after the connection is successful, and then using async flow control to send queries only after others succeed.

@brianc brianc closed this as completed Jun 21, 2016
@davepacheco

I'm not looking for sophisticated error handling. I'm just looking to maintain the invariant that if I kick off an asynchronous operation with a callback, then the callback will eventually be invoked. If the queue doesn't provide that, shouldn't it be an assertion failure to issue more than one query at once, at least by default?

I totally understand if fixing this isn't a priority. But the behavior seems extremely non-idiomatic, and I'm not sure how someone could use the queueing interface in a correct program.
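
For illustration, one way to approximate that invariant from the application side is to track pending callbacks yourself and fail them when the client emits 'error' or 'end'. This is a hedged sketch, assuming the client is an EventEmitter that emits those events (as pg's Client does); `makeGuardedQuery` is illustrative, not part of pg:

```javascript
/*
 * Wrap client.query() so that every callback is invoked exactly once,
 * even if the connection dies first and the queued query black-holes.
 */
function makeGuardedQuery(client)
{
        var pending = [];

        function failAll(err) {
                var entries = pending.splice(0, pending.length);
                entries.forEach(function (entry) {
                        entry.callback(err);
                });
        }

        client.on('error', failAll);
        client.on('end', function () {
                failAll(new Error('Connection terminated'));
        });

        return (function query(text, callback) {
                var entry = { callback: callback };
                pending.push(entry);
                client.query(text, function (err, result) {
                        var i = pending.indexOf(entry);
                        if (i === -1)
                                return; /* already failed via failAll */
                        pending.splice(i, 1);
                        callback(err, result);
                });
        });
}
```

The wrapper doesn't fix the queue itself; it just guarantees that a queued callback eventually fires with an error once the client reports the connection is gone.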

@brianc

brianc commented Jun 22, 2016

Yeah, I totally agree with you. It is non-idiomatic. Interestingly, this lib actually predates most node idioms, which is why there's no test framework other than executing files and calling assert on things. The queue was designed before there were widely understood patterns of async handling in node. I highly recommend not using the queue unless you're doing simple little shell scripts or something, where you crash on error on purpose. Also, I updated the documentation to remove references to it... 'cause... yeah, it's not a good design. 😬

As far as throwing if one query is issued before another completes: that's a pretty big backwards-incompatible change with not much upside, so I haven't done it. I feel your pain though, trust me; maintaining bad design decisions isn't always the most fun, and the occasional confusion they cause is painful to me.

All that being said I'm open to a PR that causes other queries to actually properly cause error callbacks to fire - it never hurts to improve what we have. 😄

@dynajoe

dynajoe commented Mar 9, 2017

This seems very similar to a situation I'm experiencing. How should one opt-out of the query queueing mechanism? I'm concerned that over time my processes will run out of memory with callbacks that have not been satisfied.

@dynajoe

dynajoe commented Mar 10, 2017

@brianc What do you think about pulsing the query queue regularly and, if the underlying stream has ended, failing all of the remaining queued queries with a connection-ended error? It's unfortunate to be able to enqueue a query when the underlying stream has been destroyed.

@charmander charmander added the bug label Mar 10, 2017
@dynajoe

dynajoe commented Mar 11, 2017

For those interested in a workaround:

dynajoe/tinypg@03188a5

Just periodically check to see if the stream was destroyed. I may have been able to do this with events and an initial check upon query. However, this is a good start to preventing our applications from stalling.
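
The periodic check might look roughly like this (a hedged sketch: `client.connection.stream` is a pg internal, not public API, so this can break across versions, and the helper names are illustrative):

```javascript
/* Returns true if the client's underlying socket has been destroyed. */
function streamDestroyed(client)
{
        var stream = client.connection && client.connection.stream;
        return (!!(stream && stream.destroyed));
}

/*
 * Poll the client every intervalMs; once the stream is gone, invoke
 * onDead so the caller can fail its own pending queries.
 */
function watchClient(client, onDead, intervalMs)
{
        var timer = setInterval(function () {
                if (streamDestroyed(client)) {
                        clearInterval(timer);
                        onDead(new Error('Connection terminated'));
                }
        }, intervalMs || 1000);
        return (timer);
}
```

In the `onDead` callback, the application would fail whatever queries it has tracked as outstanding, rather than waiting on callbacks that will never fire.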
