Understanding and Optimizing AWS Lambda Cold Starts: A Practical Guide

Lambda cold starts can significantly impact the performance of serverless applications. In this guide, we’ll explore practical strategies to minimize cold start latency, using real implementation examples and performance metrics.

Understanding Cold Starts

A cold start occurs when AWS Lambda needs to initialize a new execution environment for your function. This includes:

  • Downloading your function code
  • Starting a new container
  • Loading the runtime environment
  • Initializing your function code and dependencies
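You can observe this lifecycle from inside the function itself. Below is a minimal sketch (not tied to this article's config): code at module scope runs once per execution environment during the init phase, so a module-level flag is true only for the first, cold invocation.

```typescript
// Module-scope code runs once per execution environment (the init phase),
// so this flag distinguishes the cold first invocation from warm ones.
let coldStart = true;

const handler = async (): Promise<{ coldStart: boolean }> => {
  const wasCold = coldStart;
  coldStart = false; // every later invocation in this environment is warm
  if (wasCold) {
    console.log('Cold start: new execution environment initialized');
  }
  return { coldStart: wasCold };
};

// Demo: the first call reports cold, the second reports warm.
handler().then((r1) =>
  handler().then((r2) => console.log(r1.coldStart, r2.coldStart)) // true false
);
```

Logging this flag is a cheap way to count cold starts per environment before reaching for CloudWatch metrics.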

Real-World Performance Impact

Based on our testing in the ap-south-1 region, here are typical cold start metrics:

Without Optimization

  • Cold start average time: 500-1000ms
  • Subsequent invocations: 100-200ms

With Optimization (Provisioned Concurrency/Scheduled Warming)

  • First invocation: 100-250ms
  • Subsequent invocations: 100-200ms (comparable to the warm path without optimization)

Note: Results may vary based on region, function complexity, and other factors.
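To reproduce numbers like these against your own deployment, a small timing helper is enough. In this sketch, `fakeInvoke` is a stand-in async task; in practice you would pass a fetch of your deployed endpoint and compare the first (likely cold) hit against later ones.

```typescript
// Time any async call and return both the result and elapsed milliseconds.
const timeIt = async <T>(fn: () => Promise<T>): Promise<{ ms: number; result: T }> => {
  const start = performance.now();
  const result = await fn();
  return { ms: performance.now() - start, result };
};

// Stand-in async task that resolves after ~50 ms; swap in a fetch of your
// function URL to measure real cold vs. warm latency.
const fakeInvoke = () =>
  new Promise<string>((resolve) => setTimeout(() => resolve('ok'), 50));

timeIt(fakeInvoke).then(({ ms, result }) => console.log(result, `${Math.round(ms)}ms`));
```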

Implementation Example

Let’s look at a practical implementation using the Serverless Framework. This setup demonstrates two different approaches to handling cold starts:

  1. Function A: Uses provisioned concurrency
  2. Function B: Uses scheduled warming

Serverless Configuration

service: cold-start
frameworkVersion: "3"
useDotenv: true

provider:
  name: aws
  runtime: nodejs20.x
  stage: ${opt:stage, 'dev'}
  region: ap-south-1
  environment:
    REGION: ap-south-1
    STAGE: ${self:provider.stage}
  iam:
    role:
      statements:
        - Effect: "Allow"
          Action:
            - "logs:CreateLogGroup"
            - "logs:CreateLogStream"
            - "logs:PutLogEvents"
          Resource: "arn:aws:logs:*:*:*"
        - Effect: "Allow"
          Action:
            - "lambda:PublishVersion"
            - "lambda:PutFunctionConcurrency"
          Resource: "arn:aws:lambda:ap-south-1:*:function:cold-function-a"

functions:
  cold-function-a:
    handler: src/handlers/function_a.handler
    timeout: 20
    memorySize: 256
    provisionedConcurrency: 10
    events:
      - http:
          path: /func-a
          method: get
          cors: true

  cold-function-b:
    handler: src/handlers/function_b.handler
    timeout: 20
    memorySize: 256
    events:
      - http:
          path: /func-b
          method: get
          cors: true
      - schedule:
          rate: rate(1 minute)
          enabled: true

plugins:
  - serverless-esbuild
  - serverless-offline

custom:
  esbuild:
    bundle: true
    minify: true
    sourcemap: true
    exclude: ["@aws-sdk/*"] # SDK v3 ships with the nodejs20.x runtime
    target: "node20"
    platform: "node"
    concurrency: 10

Lambda Handler Implementation

Here’s a Lambda handler that serves both API Gateway requests and scheduled warming events:

import {
  APIGatewayProxyEvent,
  APIGatewayProxyResult,
  ScheduledEvent,
} from 'aws-lambda';

const headers = {
  'Content-Type': 'application/json',
  'Access-Control-Allow-Origin': '*',
  'Access-Control-Allow-Credentials': 'true',
  'Access-Control-Allow-Headers': '*',
  'Access-Control-Allow-Methods': '*',
};

export const handler = async (
  event: APIGatewayProxyEvent | ScheduledEvent
): Promise<APIGatewayProxyResult> => {
  // Ignore scheduled warm-up events (EventBridge sets `source` to 'aws.events')
  if ('source' in event && event.source === 'aws.events') {
    console.log('Warm-up event received. Ignoring.');
    return {
      statusCode: 200,
      headers,
      body: JSON.stringify({ message: 'Warm-up event received.' }),
    };
  }

  return {
    statusCode: 200,
    headers,
    body: JSON.stringify({
      message: 'Function B executed successfully.',
    }),
  };
};

Optimization Strategies Explained

1. Provisioned Concurrency (Function A)

provisionedConcurrency: 10

This configuration ensures that 10 execution environments are always kept warm and ready to serve requests.

Important Consideration: Provisioned concurrency reserves a portion of your account’s total concurrency limit. This means other functions in the same AWS account and region will have fewer resources available for scaling.

2. Scheduled Warming (Function B)

events:
  - schedule:
      rate: rate(1 minute)
      enabled: true

This approach keeps the function warm by invoking it on a schedule. It is usually cheaper than provisioned concurrency, but it comes with trade-offs:

  • A small ongoing cost from the regular warming invocations
  • No guarantee of immediate availability
  • Cold starts can still occur when concurrent traffic exceeds the number of warmed environments (a single scheduled ping typically keeps only one environment warm)
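To see why scheduled warming is cheap, here is a back-of-envelope estimate for the rate(1 minute) schedule above, using AWS's published on-demand prices (roughly $0.20 per million requests and $0.0000166667 per GB-second; check current pricing for your region):

```typescript
// Rough monthly cost of warming one 256 MB function once per minute.
const invocationsPerMonth = 60 * 24 * 30; // one ping per minute ≈ 43,200
const avgWarmupDurationSec = 0.1;         // warm-up path returns in ~100 ms
const memoryGb = 256 / 1024;              // 256 MB function

const requestCost = (invocationsPerMonth / 1_000_000) * 0.2;
const computeCost =
  invocationsPerMonth * avgWarmupDurationSec * memoryGb * 0.0000166667;
const monthlyCost = requestCost + computeCost;

console.log(`~$${monthlyCost.toFixed(3)} per month`); // roughly $0.027
```

A few cents per month versus a fixed hourly charge per provisioned environment is the core of the cost trade-off.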

3. Build Optimization

custom:
  esbuild:
    bundle: true
    minify: true
    sourcemap: true
    exclude: ["@aws-sdk/*"] # SDK v3 ships with the nodejs20.x runtime
    target: "node20"
    platform: "node"
    concurrency: 10

The configuration uses esbuild to:

  • Bundle dependencies for faster loading
  • Minify code to reduce package size
  • Exclude the AWS SDK to reduce package size (the nodejs20.x runtime includes AWS SDK for JavaScript v3, so `@aws-sdk/*` packages don’t need to be bundled)

Best Practices

  1. Choose the Right Strategy

    • Use provisioned concurrency for consistent, high-traffic workloads
    • Use scheduled warming for cost-sensitive, moderate-traffic applications
    • Consider using both strategies for different functions based on their requirements
  2. Code Optimization

    • Keep initialization code outside the handler
    • Minimize dependencies
    • Use a current, efficient runtime (Node.js 20.x in our example)
  3. Resource Configuration

    • Set appropriate memory allocation (256MB in our example)
    • Configure realistic timeouts (20 seconds in our example)
    • Use bundling and minification
  4. Monitoring

    • Log cold start events
    • Track initialization times
    • Monitor cost impact of warming strategies
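One easy way to track initialization times is to read them off the REPORT line that Lambda writes to CloudWatch Logs for every invocation; the Init Duration field appears only on cold starts. A small parsing sketch (the sample log line is illustrative):

```typescript
// Extract the Init Duration from a Lambda REPORT log line.
// The field is present only when the invocation was a cold start.
const parseInitDuration = (reportLine: string): number | null => {
  const match = reportLine.match(/Init Duration: ([\d.]+) ms/);
  return match ? parseFloat(match[1]) : null;
};

const coldLine =
  'REPORT RequestId: abc123\tDuration: 12.34 ms\tBilled Duration: 13 ms\t' +
  'Memory Size: 256 MB\tMax Memory Used: 70 MB\tInit Duration: 312.45 ms';

console.log(parseInitDuration(coldLine)); // 312.45
console.log(parseInitDuration('REPORT RequestId: def456\tDuration: 10.00 ms')); // null (warm)
```

Running this over exported logs (or in a CloudWatch Logs Insights query on the same field) gives you a cold start rate and an init-duration distribution per function.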

Cost Considerations

  1. Provisioned Concurrency

    • Fixed cost based on the number of provisioned instances
    • Higher predictability but potentially higher cost
  2. Scheduled Warming

    • Cost based on invocation frequency
    • More economical but less predictable
    • Additional costs from CloudWatch Events

Remember to:

  • Monitor your application’s performance
  • Adjust warming strategies based on traffic patterns
  • Balance cost and performance requirements
  • Regularly review and optimize your implementation

The example implementation provided here is a starting point; adapt it to your specific requirements.

Additional Optimization Strategies

Using Serverless Warmup Plugin

The serverless-plugin-warmup is a powerful tool that helps manage cold starts by automatically keeping your functions warm. Here’s how to implement it:

  1. First, install the plugin:

npm install --save-dev serverless-plugin-warmup

  2. Add it to your serverless.yml:
plugins:
  - serverless-plugin-warmup
  - serverless-esbuild
  - serverless-offline

custom:
  warmup:
    default:
      enabled: prod # only keep functions warm in the prod stage
      events:
        - schedule: rate(5 minutes)
      concurrency: 3 # number of concurrent warm-up invocations
      prewarm: true # warm up once right after deployment
      payload:
        source: serverless-plugin-warmup

functions:
  cold-function-a:
    handler: src/handlers/function_a.handler
    timeout: 20
    memorySize: 256
    warmup:
      default:
        enabled: true
  3. Update your Lambda handler to recognize warmup events:
import { APIGatewayProxyEvent } from 'aws-lambda';

interface WarmupEvent {
  source: string;
}

export const handler = async (event: APIGatewayProxyEvent | WarmupEvent) => {
  // Check for a warmup event (the plugin's default payload sets this source)
  if ('source' in event && event.source === 'serverless-plugin-warmup') {
    console.log('WarmUp - Lambda is warm!');
    return {
      statusCode: 200,
      body: JSON.stringify({ message: 'Warm and ready!' }),
    };
  }

  // Normal handler logic
  return {
    statusCode: 200,
    body: JSON.stringify({ message: 'Function executed successfully' }),
  };
};

Code Initialization Best Practices

Here’s an optimized example that combines proper initialization with warmup handling:

import { APIGatewayProxyEvent } from 'aws-lambda';
// AWS SDK v3 clients (included in the nodejs20.x runtime)
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, GetCommand } from '@aws-sdk/lib-dynamodb';

interface WarmupEvent {
  source: string;
}

// Global initialization outside the handler: runs once per execution
// environment, during the init phase
const db = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Reusable connection/cache (createConnection and initializeCache are
// placeholders for your own setup code)
let globalCache: unknown = null;
let dbConnection: unknown = null;

const initializeResources = async () => {
  if (!dbConnection) {
    dbConnection = await createConnection();
  }
  if (!globalCache) {
    globalCache = await initializeCache();
  }
};

export const handler = async (event: APIGatewayProxyEvent | WarmupEvent) => {
  // Handle warmup: initialize resources so real requests find them ready
  if ('source' in event && event.source === 'serverless-plugin-warmup') {
    await initializeResources();
    return {
      statusCode: 200,
      body: JSON.stringify({ message: 'Warmed up and initialized' }),
    };
  }

  // Resources are already initialized; handle the actual request
  try {
    const result = await db.send(new GetCommand({ /* TableName, Key */ }));
    return {
      statusCode: 200,
      body: JSON.stringify(result.Item ?? {}),
    };
  } catch (error) {
    return {
      statusCode: 500,
      body: JSON.stringify({ error: 'Internal server error' }),
    };
  }
};

Warmup Strategy Comparison

Here’s a comparison of the different warming strategies discussed:

  1. Serverless Warmup Plugin

    • Pros:
      • Easy to implement and configure
      • Flexible scheduling options
      • Supports concurrent warming
      • Minimal code changes required
    • Cons:
      • Additional plugin dependency
      • Slightly increases deployment package size
  2. Provisioned Concurrency

    • Pros:
      • Guaranteed performance
      • No code changes required
      • Predictable pricing
    • Cons:
      • Higher cost
      • Less flexible scaling
  3. Custom Scheduled Events

    • Pros:
      • Complete control over implementation
      • No additional dependencies
    • Cons:
      • More complex to implement
      • Requires more maintenance

Conclusion

Cold starts are a manageable challenge in serverless architectures. By implementing appropriate warming strategies and following best practices, you can significantly reduce their impact on your application’s performance. The choice between provisioned concurrency and scheduled warming depends on your specific requirements for performance, cost, and reliability.
