Cloud jobs and functions do not behave the same way for async operations #4995


Closed
danepowell opened this issue Aug 19, 2018 · 5 comments

@danepowell

danepowell commented Aug 19, 2018

Issue Description

When I define Cloud jobs as async functions and make calls within those jobs using await, Node.js doesn't actually wait for those calls to resolve; the function terminates prematurely.

Note that Cloud functions work just fine: all of my await calls properly resolve before the function terminates.

Looking through FunctionsRouter.js, it seems like jobs and functions get called using very different control structures. The fact that the exact same code behaves so differently in jobs vs functions leads me to believe that there is some sort of bug in how Parse Server is dispatching cloud jobs.

I am running these calls on AWS Lambda, which is aggressive about killing functions, but in theory this could be a problem anywhere.

I first ran into this on Parse Server 2.8.2 and hoped that 3.0.0 would fix it with its improved support for async Cloud operations, but the problem persists.
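
For illustration, here's a minimal sketch of the pattern I mean (the class name and query are made up; this is not the snippet from the linked question):

```js
// Both handlers use the exact same async/await pattern.
Parse.Cloud.define("countRecords", async (request) => {
  // As a Cloud function, this await resolves before the response is sent.
  const results = await new Parse.Query("Record").find({ useMasterKey: true });
  return results.length;
});

Parse.Cloud.job("countRecordsJob", async (request) => {
  // As a Cloud job on Lambda, execution appears to stop before this resolves.
  const results = await new Parse.Query("Record").find({ useMasterKey: true });
  request.message(`Found ${results.length} records`);
});
```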

Steps to reproduce

Take a look at my code snippet here: https://stackoverflow.com/questions/51832572/trouble-with-async-and-parse

Duplicate that snippet so it runs as a Cloud function rather than a Cloud job. Observe that the function's await calls resolve, while the job's terminate prematurely.

Environment Setup

  • Server
    • parse-server version: 3.0.0
    • AWS Lambda / Node.js 8.10
@flovilmart
Contributor

Closing as working as expected.

@flovilmart
Contributor

For the sake of completeness, here's the answer to the original question:

I've explained why it isn't a bug: it's by design and on purpose, so you can run things that take longer than the httpRequest timeout.

If you have a look at the code here you'll see that the handleCloudJob() function goes to great lengths to isolate the job execution from the request life cycle. This is BY DESIGN and NOT A BUG.

Now, let's see how it works in lambda:

  1. Request comes in.
  2. handleCloudJob() is called.
  3. The job is set as running.
  4. We call process.nextTick() to enqueue the job's actual work.
  5. We return the job status ID.
  6. The response is sent with the job ID.
  7. The job's work starts, enqueued by process.nextTick.
    ... > time passes
  8. The job finishes.
  9. The job status gets updated.
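
For illustration, a rough sketch of that flow (createJobStatus and updateJobStatus are made-up helpers standing in for the job-status bookkeeping; this is not the actual FunctionsRouter code):

```js
// Simplified sketch of the dispatch described above, not the real implementation.
function handleCloudJob(jobHandler, request) {
  // Steps 3-4: mark the job as running, then enqueue the actual work.
  const jobStatusId = createJobStatus({ status: "running" });

  process.nextTick(async () => {
    try {
      // Step 7 onward: the long-running work happens outside the request lifecycle.
      await jobHandler(request);
      updateJobStatus(jobStatusId, { status: "succeeded" }); // step 9
    } catch (error) {
      updateJobStatus(jobStatusId, { status: "failed", message: error.message });
    }
  });

  // Steps 5-6: respond immediately with the job status ID;
  // the HTTP response never waits on the job itself.
  return { jobStatusId };
}
```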

Now imagine for a minute we didn't do that:

Say that instead of sending the response at step 6, we were awaiting on the job, like you suggest.

What would happen is:

  • The timeout for the socket would probably be reached (30s), as jobs are 'long'
  • The connection would be closed.
  • It would be impossible to get the job ID.

Now, what happens in your Lambda environment:

  • The enqueuing is done at step 4.
  • At step 6, the response is sent to the client.
  • The Lambda environment is destroyed, taking any still-pending work with it.

Which is the expected behaviour.

Can it be better? Perhaps, but that requires a totally different execution environment for jobs: basically a worker that awaits messages and executes long-running functions in the background (so not front-end lambdas).
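
Something along these lines, as a rough sketch (receiveNextJobMessage is a hypothetical stand-in for whatever queue or polling mechanism you'd use):

```js
// Rough sketch of a dedicated job worker: a long-lived process that pulls job
// requests off a queue and runs them outside of any HTTP request lifecycle.
async function runJobWorker(jobs) {
  while (true) {
    const message = await receiveNextJobMessage(); // hypothetical: blocks until a job is requested
    const handler = jobs[message.jobName];
    if (!handler) {
      console.warn(`Unknown job: ${message.jobName}`);
      continue;
    }
    try {
      await handler(message.params); // long-running work is fine here
    } catch (error) {
      console.error(`Job ${message.jobName} failed:`, error);
    }
  }
}
```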

In the Lambda FAQs it's stated that the default timeout is 3s and can be extended to a maximum of 300s. So by default Lambda isn't suited to running jobs.

While I understand your frustration and that it doesn't make sense to 'you', it's all by design.

Even if we swapped the implementation in parse-server, it wouldn't make Lambda suited for running jobs.

@danepowell
Author

danepowell commented Aug 20, 2018

Okay thanks, I understand now why it's by design that Cloud jobs and functions behave differently. I guess I was just confused by all of the recent talk about making things work better with async; to me this implied that you should now be able to use await in Cloud jobs. Maybe it's worth documenting that that's not the case.

In this case, I'll try changing this to a Cloud function and optimizing it to run more quickly in smaller batches (I've set my Lambda timeout much higher than the default 3 seconds anyway).
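
Roughly what I have in mind (a sketch with made-up class and parameter names):

```js
// Rough sketch: a Cloud function that processes one small batch per invocation,
// so each call stays well under the Lambda timeout. "Record" and batchSize are
// made-up names for illustration.
Parse.Cloud.define("processBatch", async (request) => {
  const batchSize = request.params.batchSize || 50;
  const query = new Parse.Query("Record");
  query.equalTo("processed", false);
  query.limit(batchSize);

  const records = await query.find({ useMasterKey: true });
  records.forEach((record) => record.set("processed", true));
  await Parse.Object.saveAll(records, { useMasterKey: true });

  // The caller keeps invoking this until it returns 0.
  return records.length;
});
```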

@flovilmart
Contributor

to me this implied that you should now be able to use await in Cloud jobs

Jobs by definition are not something you wait upon. They execute and update a status in the DB. This hasn't changed from the previous iteration. The only thing that changed is the way jobs notify completion or error, which is now 100% compliant with the way async functions work: no more success/error callbacks.
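
A minimal sketch of that difference (the job name and doCleanup are made-up placeholders):

```js
// Parse Server 2.x style: completion was signalled through the status object.
Parse.Cloud.job("nightlyCleanup", (request, status) => {
  doCleanup()
    .then(() => status.success("cleanup done"))
    .catch((error) => status.error(error.message));
});

// Parse Server 3.x style: an async function (or a returned promise).
// Resolving marks the job as succeeded, throwing marks it as failed.
Parse.Cloud.job("nightlyCleanup", async (request) => {
  await doCleanup();
  request.message("cleanup done");
});
```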

I'll try changing this to a Cloud function and optimizing it to run more quickly in smaller batches

You should probably look at an execution environment that safely lets you run long-lived executions, instead of trying to cram everything into a function.

@danepowell
Author

Yeah, I totally appreciate that I'm running in a non-standard environment here (Lambda). It's not by choice, unfortunately. I need to make this work on AWS, and Lambda seems like the least bad option (I have no desire to manage an entire EC2 stack just to run a Parse Server).
