Managing OpenSearch Index with a Lambda-Backed CloudFormation Custom Resource

Introduction

As modern applications scale, the ability to manage and automate infrastructure efficiently becomes essential. OpenSearch, Amazon’s open-source fork of Elasticsearch, is a powerful tool for search and analytics. However, managing indexes (creating them, deleting them, or updating their mappings) quickly becomes complex and error-prone when done manually.

In this post, we’ll explore how to automate OpenSearch index management using AWS Lambda, the Serverless Framework, and a custom CloudFormation resource. We’ll use the post-deployment Serverless template as a foundation for deploying our automation seamlessly.


:building_construction: Architecture Overview

Here’s a high-level view of what we’re building:

  • A Lambda function: Executes custom logic to create or update OpenSearch indexes.

  • Custom CloudFormation resource: Triggers the Lambda post-deployment.

  • Serverless Framework: Manages infrastructure as code and deployment lifecycle.

  • IAM permissions: Securely grant the Lambda access to required AWS services.


:gear: The Serverless Template

The template provided includes the following critical parts:

1. Lambda Function: manage-opensearch-index

This function lives in handlers/manage_indexes.js and is triggered via a CloudFormation custom resource post-deployment.

serverless.yaml

functions:  
  manage-opensearch-index:  
    handler: handlers/manage_indexes.handler  
    layers:  
      - { Ref: LAMBDALAYERTENANTNODE }

:light_bulb: Place your OpenSearch logic here: create indices, define mappings, or seed initial data.


2. Custom Resource for Post-Deployment Execution

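This CloudFormation snippet ensures that the Lambda function is invoked automatically when the custom resource is created. On later deployments, CloudFormation invokes it again only if the resource’s properties change (for example CustomProperty1), and once more when the resource is deleted.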

serverless.yaml

resources:  
  Resources:  
    CustomResourceLambdaInvokePermission:  
      Type: AWS::Lambda::Permission  
      Properties:  
        FunctionName:   
          Fn::GetAtt:
            - ManageDashopensearchDashindexLambdaFunction  
            - Arn  
        Action: lambda:InvokeFunction  
        Principal: cloudformation.amazonaws.com
    TriggerManageDashOpensearchCustomResource:  
      Type: AWS::CloudFormation::CustomResource  
      Properties:  
        ServiceToken:   
          Fn::GetAtt:  
            - ManageDashopensearchDashindexLambdaFunction  
            - Arn  
        CustomProperty1: "value1"  
        CustomProperty2: "value2"  
      DependsOn:  
        - CustomResourceLambdaInvokePermission  
        - ManageDashopensearchDashindexLambdaFunction

  • CustomResourceLambdaInvokePermission: Grants CloudFormation permission to invoke the Lambda function by allowing the cloudformation.amazonaws.com principal to perform lambda:InvokeFunction on it.
  • TriggerManageDashOpensearchCustomResource: Declares a custom resource whose ServiceToken is the function’s ARN, so CloudFormation invokes the function whenever it creates, updates, or deletes this resource.
  • DependsOn: Enforces creation order by making the custom resource wait until both the invoke permission and the Lambda function exist.
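
To make the trigger more concrete, here is a rough sketch of the request the function receives from CloudFormation. The field names follow the custom resource request format; every value below is a placeholder, and summarizeRequest is a hypothetical helper that is not part of the template.

```javascript
// Placeholder request, shaped like the event CloudFormation sends to the function.
const sampleEvent = {
    RequestType: 'Create', // 'Create', 'Update', or 'Delete'
    ResponseURL: 'https://example.com/presigned-url',
    StackId: 'arn:aws:cloudformation:us-east-1:123456789012:stack/example/guid',
    RequestId: 'unique-request-id',
    ResourceType: 'AWS::CloudFormation::CustomResource',
    LogicalResourceId: 'TriggerManageDashOpensearchCustomResource',
    ResourceProperties: {
        ServiceToken: 'arn:aws:lambda:us-east-1:123456789012:function:example',
        CustomProperty1: 'value1',
        CustomProperty2: 'value2',
    },
}

// Hypothetical helper: separates the user-defined properties from ServiceToken.
function summarizeRequest(event) {
    const { ServiceToken, ...customProperties } = event.ResourceProperties || {}
    return {
        requestType: event.RequestType,
        logicalId: event.LogicalResourceId,
        customProperties,
    }
}

console.log(summarizeRequest(sampleEvent))
```

CustomProperty1 and CustomProperty2 arrive under ResourceProperties, so they are a convenient place to pass configuration to the function.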

3. IAM Permissions

The Lambda function’s role grants broad access to services such as OpenSearch Serverless (aoss:*), API Gateway (execute-api:*), Lambda, DynamoDB, SES, and more. For a production setup, scope these down to least privilege.

serverless.yaml

provider:
  iam:
    role:
      statements:
        - Effect: "Allow"
          Action:
            - execute-api:*
            - lambda:*
            - aoss:*
            # Add opensearch-specific permissions here if needed
          Resource: "*"

:clipboard: OpenSearch Index Schema Definition

Before diving into the Lambda implementation, let’s examine the schema definition for our facility activity logs index. It captures user interactions, system events, and metadata for auditing and analytics.

entities/opensearch/facility_activity_log.js

/**  
 * @description  
 * OpenSearch index mapping and settings for facility activity logs.  
 * Captures user interactions, system events, and metadata for auditing and analytics.  
 * Includes configurations for indexing behavior, shard allocation, and refresh intervals.  
 */ 
module.exports.facilityActivityLog = {     
    name: 'facility-activity-logs',     
    properties: {         
        event_id: { type: 'keyword', index: true },         
        timestamp: { type: 'date', format: 'strict_date_optional_time||epoch_millis' },         
        event_type: { type: 'keyword', index: true },         
        risk_level: { type: 'keyword', index: true },         
        risk_score: { type: 'integer' },         
        user_id: {             
            type: 'text',             
            fields: {                 
                keyword: { type: 'keyword', ignore_above: 256 },             
            },         
        },         
        user_name: {             
            type: 'text',             
            fields: {                 
                keyword: { type: 'keyword', ignore_above: 256 },             
            },         
        },         
        user_group: {             
            type: 'text',             
            fields: {                 
                keyword: { type: 'keyword', ignore_above: 256 },             
            },         
        },         
        user_email_address: {             
            type: 'text',             
            fields: {                 
                keyword: { type: 'keyword', ignore_above: 256 },             
            },         
        },         
        patient_id: {             
            type: 'text',             
            fields: {                 
                keyword: { type: 'keyword', ignore_above: 256 },             
            },         
        },         
    },     
    settings: {         
        index: {             
            number_of_shards: 1,             
            number_of_replicas: 0,             
            refresh_interval: '30s',             
            max_result_window: 50000,         
        },     
    }, 
}
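
As an illustration of why the user fields are mapped as text with a keyword sub-field, here is a minimal sketch of a search body that filters on an exact user ID and a time range. buildUserActivityQuery and its default values are hypothetical and not part of the original project.

```javascript
// Hypothetical helper: builds a search body for one user's recent activity.
function buildUserActivityQuery(userId, since = 'now-24h') {
    return {
        size: 50,
        sort: [{ timestamp: 'desc' }],
        query: {
            bool: {
                filter: [
                    // Exact match uses the keyword sub-field, not the analyzed text field.
                    { term: { 'user_id.keyword': userId } },
                    // Date math such as 'now-24h' is resolved by OpenSearch at query time.
                    { range: { timestamp: { gte: since } } },
                ],
            },
        },
    }
}

console.log(JSON.stringify(buildUserActivityQuery('user-123'), null, 2))
```

The returned object could be passed as the body of client.search({ index: 'facility-activity-logs', body }) once the index exists.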

:brain: Lambda Logic

Here’s a basic example of what handlers/manage_indexes.js might look like:

const response = require('cfn-response')  
const { defaultProvider } = require('@aws-sdk/credential-provider-node')  
const { Client } = require('@opensearch-project/opensearch')  
const { AwsSigv4Signer } = require('@opensearch-project/opensearch/aws')

const { facilityActivityLog } = require('../entities/opensearch/facility_activity_log')

const client = new Client({  
    ...AwsSigv4Signer({  
        region: process.env.REGION, // AWS region  
        service: 'aoss',
        getCredentials: () => {  
            const credentialsProvider = defaultProvider()  
            return credentialsProvider()  
        },  
    }),  
    node: process.env.ACTIVITY_COLLECTION_ENDPOINT, 
})

/*
 * Creates the facility activity log index with its settings and mappings if it
 * does not exist yet; otherwise updates the mappings of the existing index.
 */
async function createFacilityActivityIndex() {  
    const exists = await client.indices.exists({ index: facilityActivityLog.name })  
    if (!exists.body) {  
        const create_index_response = await client.indices.create({  
            index: facilityActivityLog.name,  
            body: {  
                settings: facilityActivityLog.settings,  
                mappings: {  
                    properties: facilityActivityLog.properties,  
                },  
            },  
        })  
        return create_index_response  
    }  
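    // The put-mapping API expects `properties` at the top level of the body,
    // without the `mappings` wrapper used by the create-index API.
    const put_mapping_response = await client.indices.putMapping({
        index: facilityActivityLog.name,
        body: {
            properties: facilityActivityLog.properties,
        },
    })
    return put_mapping_response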
}

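/**
 * Handler for the Lambda-backed custom resource.
 * Creates or updates the index for facility activity logs, then reports the result to CloudFormation.
 * @param {Object} event - The CloudFormation custom resource request (RequestType, ResponseURL, ...).
 * @param {Object} context - The Lambda context, passed through to cfn-response.
 * @returns {Promise} - The outcome is reported to CloudFormation through response.send.
 */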
module.exports.handler = async (event, context) => {  
    const responseData = {  
        Message: 'Static response from Lambda-backed custom resource',  
        Timestamp: new Date().toISOString(),  
    }  
    try {  
        console.debug(JSON.stringify(event))  
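        // Leave the index untouched when the custom resource itself is being deleted.
        if (event.RequestType !== 'Delete') {
            await createFacilityActivityIndex()
        }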
        return response.send(event, context, response.SUCCESS, responseData, event.LogicalResourceId);  
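    } catch (error) {
        // Log the cause so the failure can be diagnosed in CloudWatch.
        console.error('Failed to manage OpenSearch index', error)
        // cfn-response exports FAILED (not FAIL) as its failure status.
        return response.send(event, context, response.FAILED, responseData, event.LogicalResourceId)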
    }  
}

Handling CloudFormation Signaling with cfn-response

When using a Lambda-backed custom resource in CloudFormation, CloudFormation waits for a signal from the Lambda function to know whether the operation succeeded or failed. This is where the cfn-response module (or its equivalent logic) comes into play.

Why cfn-response?

CloudFormation doesn’t inherently understand the result of a Lambda function execution. Instead, your Lambda must explicitly send a response to the pre-signed S3 URL provided in event.ResponseURL. This tells CloudFormation whether to continue or roll back the stack; if no response arrives, the stack operation waits until the custom resource times out.
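
For readers who prefer to see the signaling step spelled out, below is a minimal sketch of the equivalent logic using only Node’s built-in https module. The field names in the body follow the custom resource response format; buildCfnResponseBody and sendCfnResponse are hypothetical helpers, not the cfn-response API. Returning a promise matters in an async handler: depending on the cfn-response version in use, send may not return one, in which case the handler can finish before the request completes.

```javascript
const https = require('https')

// Builds the JSON document CloudFormation expects at event.ResponseURL.
function buildCfnResponseBody(event, context, status, data = {}, physicalResourceId) {
    return JSON.stringify({
        Status: status, // 'SUCCESS' or 'FAILED'
        Reason: `See CloudWatch log stream: ${context.logStreamName}`,
        PhysicalResourceId: physicalResourceId || context.logStreamName,
        StackId: event.StackId,
        RequestId: event.RequestId,
        LogicalResourceId: event.LogicalResourceId,
        Data: data,
    })
}

// Hypothetical helper: PUTs the body and resolves only after the upload has been answered.
function sendCfnResponse(event, context, status, data, physicalResourceId) {
    const body = buildCfnResponseBody(event, context, status, data, physicalResourceId)
    return new Promise((resolve, reject) => {
        const request = https.request(
            event.ResponseURL,
            {
                method: 'PUT',
                headers: { 'content-type': '', 'content-length': Buffer.byteLength(body) },
            },
            (res) => {
                res.resume() // drain the response so the 'end' event fires
                res.on('end', () => resolve(res.statusCode))
            },
        )
        request.on('error', reject)
        request.end(body)
    })
}
```

Inside the handler this would be used as await sendCfnResponse(event, context, 'SUCCESS', responseData, event.LogicalResourceId).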

:white_check_mark: Benefits

  • Fully automated: No manual index setup needed.

  • Repeatable: Ensures consistency across environments.

  • Post-deployment safe: Runs only after the Lambda function and its invoke permission have been created.

  • Modular: Easy to extend or adapt with custom mappings or seed data.


:rocket: Next Steps

  1. Add OpenSearch credentials to AWS Parameter Store or Secrets Manager.

  2. Scope down IAM permissions.

  3. Implement retry logic or CloudWatch alerts for production readiness.

  4. Version-control your index mappings for auditability.
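
As a starting point for the retry logic in step 3, here is a small sketch of a retry wrapper with exponential backoff. withRetry, its option names, and its defaults are assumptions for illustration; the total wait must stay well below the Lambda timeout.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// Hypothetical helper: retries an async operation, doubling the delay after each failure.
async function withRetry(operation, { attempts = 3, baseDelayMs = 200 } = {}) {
    let lastError
    for (let attempt = 1; attempt <= attempts; attempt++) {
        try {
            return await operation(attempt)
        } catch (error) {
            lastError = error
            if (attempt < attempts) {
                // Waits 200 ms, 400 ms, 800 ms, ... with the default base delay.
                await sleep(baseDelayMs * 2 ** (attempt - 1))
            }
        }
    }
    // Every attempt failed: surface the last error to the caller.
    throw lastError
}

// Possible usage inside the handler:
//   await withRetry(() => createFacilityActivityIndex(), { attempts: 3 })
```

Because the error is rethrown after the final attempt, the handler’s existing catch block still reports the failure to CloudFormation.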


:pushpin: Conclusion

Automating OpenSearch index management using the Serverless Framework and AWS Lambda eliminates manual configuration errors while ensuring consistent deployment of complex schemas across all environments.
