Lambda Node 22 coldstart latency regression #6914

Open
4 tasks done
perpil opened this issue Mar 2, 2025 · 3 comments
Assignees
Labels
bug This issue is a bug. p2 This is a standard priority issue potential-regression Marking this issue as a potential regression to be checked by team member

Comments


perpil commented Mar 2, 2025

Checkboxes for prior research

Describe the bug

When I updated to Node 22 on Lambda using the AWS JavaScript SDK V3 last year, it added at least 50 ms to my coldstart times. For a while I assumed it was due to the edge caches not being warm, but recently the lambda perf benchmark was updated with Node 22 and showed no appreciable difference in coldstart times between Node 20 and Node 22. I investigated further, and the regression appears only when using the AWS JavaScript SDK V3. On Node 22, the SDK's loading of the http request bits adds 50 ms to the coldstart even if they aren't used. Similar to #6144, adding some lazy loading logic on the http request and agent should keep the coldstart performance in line with Node 20.

Regression Issue

  • Select this option if this issue appears to be a regression.

SDK version number

@aws-sdk/client-sts 3.750

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

Node 22.11

Reproduction Steps

I've created a repro with instructions here.

Observed Behavior

Node 22 exhibits an additional 50 ms of coldstart time over Node 20 in the Lambda environment.

Here is a snippet from the README in the repro:

  1. Runs 1 and 2 show the baseline without the AWS SDK.
  2. Runs 3 and 4 show the difference between Node 20 and Node 22 with the AWS SDK.
  3. Runs 5 and 6 show the difference between Node 20 and Node 22 with a patch that removes http from the AWS SDK.

These are samples from my runs, but the results were repeatable.

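| run | node version | @initDuration (ms) | notes |
|-----|--------------|--------------------|-------|
| 1 | 20.18 | 142.79 | No AWS V3 SDK |
| 2 | 22.11 | 142.26 | No AWS V3 SDK |
| 3 | 20.18 | 204.23 | With AWS V3 SDK |
| 4 | 22.11 | 255.26 | With AWS V3 SDK |
| 5 | 20.18 | 200.42 | With patched AWS V3 SDK |
| 6 | 22.11 | 203.62 | With patched AWS V3 SDK |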
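
The Node 22 minus Node 20 deltas implied by these sample runs can be worked out directly. The following is a small sketch: the `runs` array only transcribes the six sample values listed above, and the function name `deltasByConfig` is made up for illustration.

```javascript
// Sample init durations (ms) transcribed from the runs listed above.
const runs = [
  { run: 1, node: "20.18", initDuration: 142.79, notes: "No AWS V3 SDK" },
  { run: 2, node: "22.11", initDuration: 142.26, notes: "No AWS V3 SDK" },
  { run: 3, node: "20.18", initDuration: 204.23, notes: "With AWS V3 SDK" },
  { run: 4, node: "22.11", initDuration: 255.26, notes: "With AWS V3 SDK" },
  { run: 5, node: "20.18", initDuration: 200.42, notes: "With patched AWS V3 SDK" },
  { run: 6, node: "22.11", initDuration: 203.62, notes: "With patched AWS V3 SDK" },
];

// Hypothetical helper: pairs the Node 20 and Node 22 run of each configuration
// and returns the difference (Node 22 minus Node 20), rounded to 2 decimals.
function deltasByConfig(allRuns) {
  const byNotes = new Map();
  for (const r of allRuns) {
    const entry = byNotes.get(r.notes) ?? {};
    entry[r.node.startsWith("22") ? "node22" : "node20"] = r.initDuration;
    byNotes.set(r.notes, entry);
  }
  return [...byNotes].map(([notes, { node20, node22 }]) => ({
    notes,
    deltaMs: Number((node22 - node20).toFixed(2)),
  }));
}

console.table(deltasByConfig(runs));
```

For these samples the pair with the unpatched SDK differs by 51.03 ms, while the baseline and patched pairs differ by -0.53 ms and 3.20 ms respectively.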

Expected Behavior

Similar coldstart performance between Node 20 and Node 22.

Possible Solution

Lazy load the http request and agent bits.
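
As a rough illustration of what lazy loading could look like, here is a minimal sketch that defers loading `node:http` until something actually needs it. This is not the SDK's code: `loadHttp` and `getAgent` are hypothetical names, and the agent options are only example values.

```javascript
// Hypothetical sketch, not the SDK's implementation.
// The module is loaded on first use instead of at init time.
let httpModulePromise;

function loadHttp() {
  // import() runs only on the first call; later calls reuse the same promise.
  httpModulePromise ??= import("node:http");
  return httpModulePromise;
}

let agentPromise;

function getAgent() {
  // The agent is created once, and only after the module has been loaded.
  agentPromise ??= loadHttp().then(
    ({ Agent }) => new Agent({ keepAlive: true, maxSockets: 50 })
  );
  return agentPromise;
}

// A handler that never calls getAgent() never pays for loading node:http.
const handler = async (event, context) => {
  return { statusCode: 200, body: { requestId: context.awsRequestId } };
};
```

With this shape the load cost moves from the init phase to the first invocation that makes a plain http request, so it trades coldstart time for first-request latency on that path.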

Additional Information/Context

No response

@perpil perpil added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Mar 2, 2025
@github-actions github-actions bot added the potential-regression Marking this issue as a potential regression to be checked by team member label Mar 4, 2025
@aBurmeseDev aBurmeseDev self-assigned this Mar 5, 2025
@aBurmeseDev aBurmeseDev added investigating Issue is being investigated and/or work is in progress to resolve the issue. and removed needs-triage This issue or PR still needs to be triaged. labels Mar 5, 2025
Contributor

aBurmeseDev commented Mar 10, 2025

Hi @perpil - thanks for reporting and for your patience while I investigate. Based on my benchmarks, I was able to confirm a difference in coldstart times of ~15-20 ms between Node 20 and 22. I also benchmarked http vs https on the two Node versions and noticed similar performance. As you mentioned, we might consider adding lazy loading logic on the http request and agent to bring coldstart times in line with Node 20. If you find anything else that might be useful, please feel free to add it. Thanks again!

Benchmarks

  • ap-south-1
$ AwsSdkLambdaJsBenchmark> yarn benchmark
╔════════════════════════════════════════════╤════════════════════╤════════╤════════╤════════╗
║                                            │ metric             │ p50    │ p90    │ stdDev ║
╟────────────────────────────────────────────┼────────────────────┼────────┼────────┼────────╢
║ [node 22.11.0, x86_64, 128 MB,             │ init_duration (ms) │ 375.13 │ 401.87 │ 12.11  ║
║ ap-south-1]: Code (esm) with sts v3.758.0  │                    │        │        │        ║
║ (2.27 MB)                                  │                    │        │        │        ║
╚════════════════════════════════════════════╧════════════════════╧════════╧════════╧════════╝

$ AwsSdkLambdaJsBenchmark> yarn benchmark --node-versions 20
╔════════════════════════════════════════════╤════════════════════╤════════╤════════╤════════╗
║                                            │ metric             │ p50    │ p90    │ stdDev ║
╟────────────────────────────────────────────┼────────────────────┼────────┼────────┼────────╢
║ [node 20.18.0, x86_64, 128 MB,             │ init_duration (ms) │ 366.34 │ 403.38 │ 16.2   ║
║ ap-south-1]: Code (esm) with sts v3.758.0  │                    │        │        │        ║
║ (2.27 MB)                                  │                    │        │        │        ║
╚════════════════════════════════════════════╧════════════════════╧════════╧════════╧════════╝
  • eu-west-1
$ AwsSdkLambdaJsBenchmark> yarn benchmark
╔════════════════════════════════════════════╤════════════════════╤════════╤════════╤════════╗
║                                            │ metric             │ p50    │ p90    │ stdDev ║
╟────────────────────────────────────────────┼────────────────────┼────────┼────────┼────────╢
║ [node 22.11.0, x86_64, 128 MB, eu-west-1]: │ init_duration (ms) │ 407.18 │ 497.17 │ 35.76  ║
║ Code (esm) with sts v3.758.0 (2.27 MB)     │                    │        │        │        ║
╚════════════════════════════════════════════╧════════════════════╧════════╧════════╧════════╝
$ AwsSdkLambdaJsBenchmark> yarn benchmark --node-versions 20
╔════════════════════════════════════════════╤════════════════════╤════════╤════════╤════════╗
║                                            │ metric             │ p50    │ p90    │ stdDev ║
╟────────────────────────────────────────────┼────────────────────┼────────┼────────┼────────╢
║ [node 20.18.0, x86_64, 128 MB, eu-west-1]: │ init_duration (ms) │ 393.56 │ 466.05 │ 28.13  ║
║ Code (esm) with sts v3.758.0 (2.27 MB)     │                    │        │        │        ║
╚════════════════════════════════════════════╧════════════════════╧════════╧════════╧════════╝
  • eu-west-2
$ AwsSdkLambdaJsBenchmark> yarn benchmark
╔════════════════════════════════════════════╤════════════════════╤════════╤════════╤════════╗
║                                            │ metric             │ p50    │ p90    │ stdDev ║
╟────────────────────────────────────────────┼────────────────────┼────────┼────────┼────────╢
║ [node 22.11.0, x86_64, 128 MB, eu-west-2]: │ init_duration (ms) │ 372.52 │ 384.89 │ 8.13   ║
║ Code (esm) with sts v3.758.0 (2.27 MB)     │                    │        │        │        ║
╚════════════════════════════════════════════╧════════════════════╧════════╧════════╧════════╝
$ AwsSdkLambdaJsBenchmark> yarn benchmark --node-versions 20
╔════════════════════════════════════════════╤════════════════════╤════════╤════════╤════════╗
║                                            │ metric             │ p50    │ p90    │ stdDev ║
╟────────────────────────────────────────────┼────────────────────┼────────┼────────┼────────╢
║ [node 20.18.0, x86_64, 128 MB, eu-west-2]: │ init_duration (ms) │ 356.89 │ 372.92 │ 10.57  ║
║ Code (esm) with sts v3.758.0 (2.27 MB)     │                    │        │        │        ║
╚════════════════════════════════════════════╧════════════════════╧════════╧════════╧════════╝
AwsSdkLambdaJsBenchmark nvm use 20
Now using node v20.18.3 (npm v10.8.2)
AwsSdkLambdaJsBenchmark node benchmark.mjs 
┌──────────────────────┬──────────┬──────────┬──────────┐
│ (index)              │ Avg (ms) │ Max (ms) │ Min (ms) │
├──────────────────────┼──────────┼──────────┼──────────┤
│ Full Stack Load Time │ '6.745'  │ '65.768' │ '0.061'  │
└──────────────────────┴──────────┴──────────┴──────────┘
Now using node v22.8.0 (npm v10.8.2)
AwsSdkLambdaJsBenchmark node benchmark.mjs 
┌──────────────────────┬──────────┬──────────┬──────────┐
│ (index)              │ Avg (ms) │ Max (ms) │ Min (ms) │
├──────────────────────┼──────────┼──────────┼──────────┤
│ Full Stack Load Time │ '7.899'  │ '77.318' │ '0.070'  │
└──────────────────────┴──────────┴──────────┴──────────┘

@aBurmeseDev aBurmeseDev added p2 This is a standard priority issue response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed investigating Issue is being investigated and/or work is in progress to resolve the issue. labels Mar 10, 2025
Author

perpil commented Mar 10, 2025

Thanks for the followup @aBurmeseDev

I'm surprised there is such a large discrepancy between your benchmark and mine. I'm seeing very different numbers in overall initDuration (~200 ms vs ~350+ ms), code size (315K vs 2.27 MB) and delta between node 20 and 22 (50 ms vs 15-20ms).

You can see my configuration here. I bundle, use esm, use 1767MB ram, arm64, remove unnecessary credentials providers, and run in us-east-2.

Are you able to share more about your benchmark? 50 ms on 200 ms is a lot more interesting than 20 ms on 350 ms, and you may not be replicating what I'm reporting. If you haven't, can you try running my repro? At least one other person saw the same 50 ms delta between Node 20 and 22 when using the v3 JavaScript SDK.
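
To make the comparison concrete, the two deltas can be expressed as a share of the Node 20 coldstart. This is a small sketch using the rounded figures quoted in this comment; `relativeOverheadPercent` is a made-up helper name.

```javascript
// Regression expressed as a percentage of the baseline cold start,
// rounded to one decimal place.
function relativeOverheadPercent(deltaMs, baselineMs) {
  return Number(((deltaMs / baselineMs) * 100).toFixed(1));
}

console.log(relativeOverheadPercent(50, 200)); // 25
console.log(relativeOverheadPercent(20, 350)); // 5.7
```

On those rounded numbers, the regression is roughly a quarter of the coldstart in one setup and under 6% in the other.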

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Mar 11, 2025
Author

perpil commented Mar 25, 2025

As an experiment, I decided to see what would happen if I didn't use the AWS JavaScript SDK and just instantiated an http and an https agent.

import { Agent as hAgent, request as hRequest } from "http";
import { Agent as hsAgent, request as hsRequest } from "https";

const httpAgent = new hAgent({ keepAlive: true, maxSockets: 50 });
const httpsAgent = new hsAgent({ keepAlive: true, maxSockets: 50 });

export const handler = async (event, context) => {
  return {
    statusCode: 200,
    body: {
      requestId: context.awsRequestId,
    },
  };
};

On Node 20, the coldstart time is 157 ms, and on Node 22 it is 224 ms. This was done with arm64/128MB/us-east-2.

Although some of the performance hit goes away if you only instantiate the https agent, this repro seems to point the finger at Node.js or the Lambda Node runtime. I've reached out to the Lambda team to see if they have any thoughts on this.

Update 3/26:

This is a minimal repro. Importing request at all is what causes it:

import { request } from "http";

export const handler = async (event, context) => {
  return {
    statusCode: 200,
    body: {
      requestId: context.awsRequestId,
    },
  };
};

Although there is potential to lazy load the request class in the SDK, it might be worth waiting to see whether this is Node.js or Lambda related first. If it is an issue with either of those, no action may be necessary in the SDK.
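
One way to probe the Node.js side outside Lambda might be to time the first import of the module on each Node version. The sketch below is only an assumption about how such a check could look: `timeImport` is a hypothetical helper, and a local measurement will not reproduce the Lambda init environment, so the numbers are not comparable to @initDuration.

```javascript
// Hypothetical helper: measures how long the first dynamic import of a module
// takes in the current process. Later imports of the same specifier are served
// from the module cache and will be much faster.
async function timeImport(specifier) {
  const start = performance.now();
  await import(specifier);
  return performance.now() - start;
}

timeImport("node:http").then((ms) => {
  console.log(`first import of node:http took ${ms.toFixed(2)} ms on ${process.version}`);
});
```

Running the same file under Node 20 and Node 22 gives a rough local comparison of the import cost alone.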
