
Connection timing out on aws lambda #1180


Closed
Andrew-Max opened this issue Dec 12, 2016 · 23 comments

Comments

@Andrew-Max

Andrew-Max commented Dec 12, 2016

I am trying to connect to a Postgres database from an AWS Lambda function using the code below:

var dbConfig = { host: {endpoint}, user: {dbuser}, database: {dbname}, password: {password}, port: {port} };

var pg = require('pg');
var client = new pg.Client(dbConfig);

exports.handler = function(event, context, callback) {
  client.connect(function (err) {
      client.end();
      context.succeed(context);
  });
};

If I run this via Node locally, it works fine and returns within a second or two. In Lambda, it times out even when allotted up to 60 seconds.

The Lambda timeout error message is:

{
  "errorMessage": "2016-12-12T17:18:18.617Z f8406e80-c08e-11e6-a0d3-8bc89c563312 Task timed out after 60.00 seconds"
}

The database I am trying to reach is on Amazon RDS, and I am using the endpoint they provided as the host in the config.

Any thoughts on why this might be failing to connect?

EDIT: I upped the Lambda execution timeout to 3 minutes.

It finally failed after 128 seconds with:

{
  "errorMessage": "connect ETIMEDOUT 35.165.66.116:5432",
  "errorType": "Error",
  "stackTrace": [
    "Object.exports._errnoException (util.js:870:11)",
    "exports._exceptionWithHostPort (util.js:893:20)",
    "TCPConnectWrap.afterConnect [as oncomplete] (net.js:1062:14)"
  ]
}
@vitaly-t
Contributor

You should research this against AWS Lambda; it is not relevant to this library.

Starting point: http://stackoverflow.com/questions/37232827/aws-lambda-function-stops-working-after-timed-out-error

@hassankhan

hassankhan commented Dec 13, 2016

@vitaly-t Actually it is relevant to the library, and it's getting a bit annoying hearing that it isn't. All Node Postgres libraries assume that they will be running on long-lived processes, whereas on Lambda that is explicitly NOT the case.

@Andrew-Max You're having this problem because the connection opened by pg-promise doesn't get closed, and you will have to do it yourself:

var pgp = require('pg-promise')();
var db = pgp(dbConfig);

db.any(query)
  .then((response) => {
    pgp.end();
    return response;
  })
  .catch((error) => {
    pgp.end();
    throw error;
  });

Sorry, I don't mean to be rude; it's just that your package is one of the few libraries of decent quality on npm, and it was super frustrating because it worked in a dev environment but not on Lambda.

@charmander
Collaborator

@hassankhan, what you’re describing sounds like a task timing out because an active pooled connection kept the Node process running, despite the task otherwise succeeding. However, as you can see from @Andrew-Max’s second error above with an increased Lambda timeout, the connection never succeeds at all. They’re also using pg.Client directly, without a pool, and closing the client properly.

Actually it is relevant to the library, and it's getting a bit annoying hearing that it isn't. All Node Postgres libraries assume that they will be running on long-lived processes, whereas on Lambda that is explicitly NOT the case.

This library doesn’t make assumptions about process lifetime. Anyway, considering the above, I think @vitaly-t is right; the issue doesn’t have anything to do with pg. The database server just isn’t reachable.

@sehrope
Contributor

sehrope commented Dec 13, 2016

If it times out when running in a Lambda, then it's most likely a networking issue. By default, the security permissions on AWS would not allow inbound connections to your database (assuming it's RDS) from the Lambda. Look into the security groups for your Lambda and database.

FYI, the timeout happens while waiting for the TCP socket to get established. AWS drops the network packets, so from the perspective of the Lambda it just keeps trying to establish the socket without knowing that the error is permanent. If you lower the socket timeout, you'll see the error sooner.
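
For example, a minimal sketch of failing fast, assuming your pg version supports connectionTimeoutMillis in the client config (the connection values here are placeholders):

const pg = require('pg');

const client = new pg.Client({
  host: 'your-rds-endpoint',     // placeholder
  user: 'dbuser',                // placeholder
  database: 'dbname',            // placeholder
  password: 'password',          // placeholder
  port: 5432,
  connectionTimeoutMillis: 5000  // give up after 5 seconds instead of waiting on the OS default
});

client.connect(function (err) {
  if (err) {
    console.error('connection failed:', err.message);
  }
  client.end();
});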

@hassankhan

hassankhan commented Dec 14, 2016

@charmander Does the connection pooling not qualify as making assumptions about process lifetime? It keeps itself alive and thus the process continues to run unless explicitly closed.

You guys are right, in this case it's more likely a networking issue, but my point still remains: these libraries assume I want to pool the connection when I don't care about that at all.

@charmander
Collaborator

@hassankhan

  • This library includes pg-pool for convenience, but it’s opt-in. The example seen above:

    var dbConfig = { host: {endpoint}, user: {dbuser}, database: {dbname}, password: {password}, port: {port} };
    
    var pg = require('pg');
    var client = new pg.Client(dbConfig);
    
    exports.handler = function(event, context, callback) {
      client.connect(function (err) {
          client.end();
          context.succeed(context);
      });
    };

    does not create a pool.

  • Pools can be useful even in short-lived processes.

  • In pretty much any situation, you should still release the pool; there’s no* way for the library to know when you’re done with it. (Just because it’s common for people to let SIGINT exit a web server process without cleaning up doesn’t mean it’s right.)
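
For illustration, a minimal sketch of releasing the pool explicitly when you're done with it (the config and query are placeholders, not taken from the report above):

const pg = require('pg');

const pool = new pg.Pool({ /* connection config */ });

pool.query('SELECT 1')
  .then((result) => {
    console.log(result.rows);
  })
  .catch((err) => {
    console.error(err);
  })
  .then(() => {
    // The library can't know when you're finished, so end the pool yourself.
    return pool.end();
  });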

@hassankhan

hassankhan commented Dec 14, 2016

Thanks @charmander, really appreciate the information. I was not aware that you should be manually releasing the pool.

Just to pick your brain, could you give me an example of using pools in short-lived processes?

@thewillhuang

The API is here: https://github.com/brianc/node-pg-pool#plays-nice-with-asyncawait
If you pass true to release, it also destroys the client, so your Lambda process won't keep it alive. Though if you have heavy traffic, regardless of whether you are using a pool or not, you might exhaust your database's memory due to the number of concurrent clients your server needs to handle.

client.release(true);
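
For context, a hedged sketch of what that might look like inside a handler, assuming pg's bundled pool and async/await support in the runtime (config and query are placeholders):

const pg = require('pg');
const pool = new pg.Pool({ /* connection config */ });

exports.handler = async function (event, context) {
  const client = await pool.connect();
  try {
    const result = await client.query('SELECT now()');
    return result.rows[0];
  } finally {
    // Passing true destroys the client instead of returning it to the pool,
    // so the Lambda container isn't left holding an idle connection.
    client.release(true);
  }
};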

@lostcolony

lostcolony commented Jan 23, 2017

I ran into something akin to this, and figured I'd comment after getting it worked out, rather than file a separate ticket.

I had to connect from a Lambda running inside a VPC. If I ran the Lambda on a subnet that could not reach the wide-open internet, the client.connect call would time out. If I ran the Lambda on a subnet that had internet access, the connect call worked fine.

I don't know the reason, and am not digging into it as I don't have time (if any maintainers of the code want to weigh in, I'd be interested, though). But basically, it looks like part of the client.connect call requires being able to talk to the outside internet. If that's a known requirement, it'd be nice to document it.

@shellandbull

shellandbull commented Mar 22, 2017

I'm running into the same issue. My PG database is accessible from the Lambda, as I can connect.

However, it looks like node-pg keeps the process running even after the connection has been terminated? Please advise. My code is:

  start() {
    return new Promise((resolve, reject) => {
      const client = new PG.Client(env.db.pg.url);
      client.connect((error) => {
        if (error) {
          reject(error);
        } else {
          this.client = client;
          resolve(this);
        }
      });
    });
  }
  stop() {
    return new Promise((resolve, reject) => {
      this.client.end((error) => {
        if (error) {
          reject(error);
        } else {
          resolve(null);
        }
      });
    });
  }

EDIT: This was solved for me. My app was creating a connection somewhere else that wasn't being closed.

@hassankhan

hassankhan commented Mar 22, 2017

I used client.destroy() immediately after executing the query to completely destroy the pool, hope that helps.

@charmander
Collaborator

However, it looks like node-pg keeps the process running even after the connection has been terminated?

Does that mean you’re able to run queries?

I used client.destroy() immediately after executing the query to completely destroy the pool, hope that helps.

If you do this, there’s no point in creating a pool to begin with. Use a single client.

@hassankhan

hassankhan commented Mar 22, 2017

Just to clarify, I'm not using this library directly, I'm using knex with pg. Sorry for the noise.

@yoaquim

yoaquim commented Jun 20, 2017

Just posting what worked for me:

As @lostcolony mentioned, some bit of code needs public access, so I added an ALL egress rule to my Lambda's security group.

Tried a lot of stuff, and this finally did the trick.

@jfaraklit

It's about your access to RDS from the Lambda; you may want to check these things first, based on your setup:

  • RDS is publicly accessible and your Lambda has internet access
  • If your RDS is in a VPC, then your Lambda needs to have access to that VPC or be in it
  • Each time Lambda containers are created, they run in a subnet; make sure your subnet in the VPC has access to the internet or to other VPC resources.

@joaojeronimo

So, I just ran into this, and my understanding is that even though your Lambda calls the final callback and you expect it to respond, some custom code in the Lambda runtime checks whether there are any handles (like these open database connections) in the event loop before assuming your function is "done".

With the context object passed into the handler function, setting context.callbackWaitsForEmptyEventLoop = false makes everything behave as expected for me. The pool is left at the top scope with open connections, and the functions "finish".
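
A minimal sketch of that pattern, assuming a module-scope pool (config and query are placeholders):

const pg = require('pg');
const pool = new pg.Pool({ /* connection config */ });

exports.handler = function (event, context, callback) {
  // Don't wait for the event loop to drain; the pool's open connections
  // would otherwise keep the invocation alive until it times out.
  context.callbackWaitsForEmptyEventLoop = false;

  pool.query('SELECT 1', function (err, result) {
    if (err) {
      return callback(err);
    }
    callback(null, result.rows);
  });
};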

@yeedle

yeedle commented Feb 7, 2018

Ran into the same issue, and after a lot of digging figured out that I had a combination of two issues. First, the event loop isn't empty unless you end the pool, which kind of defeats the purpose of using a pool, as @charmander said. The solution to this is to add context.callbackWaitsForEmptyEventLoop = false or to use a client and destroy it before the callback. The second issue is that the Lambda needs access to the RDS instance's VPC. Weirdly enough, it can "connect" even without having access, but it gets blocked on queries. Thanks everyone in this thread for sharing their info.

@arichiardi

Hello folks, I am running into this issue and I'd like to understand why context.callbackWaitsForEmptyEventLoop = false solves it. If it is true that a new Client does not create a pool, then once .end() is called, things should be all good.

Also, I see no .destroy() other than this on the client.

Am I missing something?

@quangv

quangv commented Jul 3, 2018

I am running into this issue as well. I am not sure if I'm setting up my VPC correctly.

It seems strange to me that running the code locally works but on Lambda it does not. So I am not sure if it's a VPC issue, because logically I would think a VPC would block a remote IP rather than a Lambda that's in the cloud next door...

I am curious if others are having this issue, or is this mostly solved. Can anyone provide a working example?

Thanks in advance.


Edited: Finally got it working. It was indeed a VPC issue.

// Generated by CoffeeScript 2.2.4
var Client;

({Client} = require('pg'));

exports.handler = async function(event, context) {
  var client;
  client = new Client();
  await client.connect();
  return 'connected';
};

Also callback style:

// Generated by CoffeeScript 2.2.4
var Client;

({Client} = require('pg'));

exports.handler = function(event, context, callback) {
  var client;
  //context.callbackWaitsForEmptyEventLoop = false
  client = new Client();
  return client.connect(function(err) {
    client.end();
    if (err) {
      return callback(err);
    } else {
      return callback(null, 'connected');
    }
  });
};

The way my VPC is set up (though there might be a better way):

  • Lambda & RDS in the same VPC, security group & subnets
  • Had to manually add an inbound rule to the security group; see screenshot. (This was the last thing I had to do before it worked.)


https://blog.shikisoft.com/running-aws-lambda-in-vpc-accessing-rds/ - more in depth tutorial how to set up VPC.

@arichiardi

I can confirm the above VPC setup works for me as well; however, I still see the Lambda stuck if I do not add callbackWaitsForEmptyEventLoop.

@shellandbull

shellandbull commented Jul 4, 2018

@arichiardi I think that's because, depending on your app code, you might be in a place where the Lambda execution finishes but the pg library hasn't ended the connection properly (which is emitted as an event), so the connection stays open, leaving you in a weird state.

@deftomat

Unfortunately, it looks like this library schedules a timeout to keep the connection alive, so the event loop will never be empty and the Lambda will never return, because callbackWaitsForEmptyEventLoop is true.

Also, setting callbackWaitsForEmptyEventLoop to false is not a solution, because then you need to place await on every async operation, as AWS could freeze the Lambda in the middle of that operation. For example, when you want to put data into the database but the result is not necessary for you and you want to do something else in the meantime. Without setting it to true, you need to remember every promise and wait for its resolution manually 😞

@sehrope
Contributor

sehrope commented Aug 14, 2019

Also, setting callbackWaitsForEmptyEventLoop to false is not a solution, because then you need to place await on every async operation, as AWS could freeze the Lambda in the middle of that operation. For example, when you want to put data into the database but the result is not necessary for you and you want to do something else in the meantime.

IIUC, AWS will not freeze the Lambda if you haven't returned yet. It only freezes when it thinks it's idle.

You can do multiple things concurrently, but if the work for your Lambda involves some DB interaction, you can't pretend you completed early if it's not done. There's no guarantee your Lambda would continue running.

Without setting it to true, you need to remember every promise and wait for its resolution manually

You have to do that anyway if you actually care about success or failure. That's not even specific to Lambdas. At the very least, you should be handling and ignoring rejections, as unhandled rejections could crash your process in the future:

$ node -e 'Promise.reject(new Error())'
(node:9978) UnhandledPromiseRejectionWarning: Error
    at [eval]:1:16
    at ContextifyScript.Script.runInThisContext (vm.js:50:33)
    at Object.runInThisContext (vm.js:139:38)
    at Object.<anonymous> ([eval]-wrapper:6:22)
    at Module._compile (module.js:653:30)
    at evalScript (bootstrap_node.js:479:27)
    at startup (bootstrap_node.js:180:9)
    at bootstrap_node.js:625:3
(node:9978) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:9978) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

Best to embrace the async nature of Node.js, await any promises along the way, and have your final handler return a Promise that resolves or rejects with the result.

The only pg-specific advice here is to not use a Pool, as it'll get broken if the Lambda is frozen/thawed due to any outstanding TCP connections being dropped. Use a Client directly and do await client.end() before you indicate that your Lambda is complete. Or if you want something Pool-like, try something like #1938.
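
Putting that together, a hedged sketch of the shape being suggested here: a fresh Client per invocation, ended before the handler resolves, assuming a pg version where connect() and end() return promises when called without a callback (config and query are placeholders):

const { Client } = require('pg');

exports.handler = async function (event, context) {
  const client = new Client({ /* connection config */ });
  await client.connect();
  try {
    const result = await client.query('SELECT now()');
    return result.rows[0];
  } finally {
    // End the connection before signalling completion, so nothing is left
    // dangling when the Lambda container is frozen.
    await client.end();
  }
};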
