Introduction
As modern applications scale, the ability to efficiently manage and automate infrastructure becomes essential. OpenSearch, Amazon’s open-source fork of Elasticsearch, is a powerful tool for search and analytics. However, managing indexes (creating them, deleting them, or updating their mappings) quickly becomes complex and error-prone when done manually.
In this post, we’ll explore how to automate OpenSearch index management using AWS Lambda, the Serverless Framework, and a custom CloudFormation resource. We’ll use the post-deployment Serverless template as a foundation for deploying our automation seamlessly.
Architecture Overview
Here’s a high-level view of what we’re building:
- A Lambda function: Executes custom logic to create or update OpenSearch indexes.
- Custom CloudFormation resource: Triggers the Lambda post-deployment.
- Serverless Framework: Manages infrastructure as code and deployment lifecycle.
- IAM permissions: Securely grant the Lambda access to required AWS services.
The Serverless Template
The template provided includes the following critical parts:
1. Lambda Function: manage-opensearch-index
This function lives in handlers/manage_indexes.js and is triggered via a CloudFormation custom resource post-deployment.
serverless.yaml
functions:
  manage-opensearch-index:
    handler: handlers/manage_indexes.handler
    layers:
      - { Ref: LAMBDALAYERTENANTNODE }
Place your OpenSearch logic here: create indices, define mappings, or seed initial data.
2. Custom Resource for Post-Deployment Execution
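This CloudFormation snippet ensures that the Lambda function is invoked automatically when the custom resource is created. On later deployments, CloudFormation invokes it again only when the custom resource’s properties change.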
serverless.yaml
resources:
  Resources:
    CustomResourceLambdaInvokePermission:
      Type: AWS::Lambda::Permission
      Properties:
        FunctionName:
          Fn::GetAtt:
            - ManageDashopensearchDashindexLambdaFunction
            - Arn
        Action: lambda:InvokeFunction
        Principal: cloudformation.amazonaws.com
    TriggerManageDashOpensearchCustomResource:
      Type: AWS::CloudFormation::CustomResource
      Properties:
        ServiceToken:
          Fn::GetAtt:
            - ManageDashopensearchDashindexLambdaFunction
            - Arn
        CustomProperty1: "value1"
        CustomProperty2: "value2"
      DependsOn:
        - CustomResourceLambdaInvokePermission
        - ManageDashopensearchDashindexLambdaFunction
- CustomResourceLambdaInvokePermission: Grants CloudFormation permission to invoke the Lambda function by allowing the cloudformation.amazonaws.com principal to execute lambda:InvokeFunction on the specified function.
- TriggerManageDashOpensearchCustomResource: Creates a custom CloudFormation resource that automatically triggers the Lambda function during stack deployment, using the function’s ARN as the ServiceToken.
- DependsOn Dependencies: Ensures proper resource creation order by making the custom resource wait until both the invoke permission and the Lambda function have been created.
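CloudFormation invokes the function behind a custom resource for each lifecycle event and sets event.RequestType to Create, Update, or Delete. The sketch below shows one way the handler could branch on that value. It assumes index work should run only on Create and Update, and that Delete should simply be acknowledged so stack removal is not blocked. The helper name planIndexAction and the action labels are hypothetical, not part of the template.

```javascript
// Hypothetical helper: decide what the custom-resource Lambda should do
// for a given CloudFormation lifecycle event.
function planIndexAction(event) {
  switch (event.RequestType) {
    case 'Create':
    case 'Update':
      // Stack created or updated: make sure the index and mappings exist.
      return 'ensure-index'
    case 'Delete':
      // Stack is being removed: acknowledge without touching the index.
      return 'skip'
    default:
      throw new Error(`Unsupported RequestType: ${event.RequestType}`)
  }
}

console.log(planIndexAction({ RequestType: 'Create' })) // ensure-index
console.log(planIndexAction({ RequestType: 'Delete' })) // skip
```

Whatever the branch, the function still has to report success or failure back to CloudFormation, including for Delete.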
3. IAM Permissions
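The Lambda role in this template is broad: it allows wildcard actions such as execute-api:*, lambda:*, and aoss:* (OpenSearch Serverless) on every resource, and the full template extends this to services like DynamoDB and SES. For a production setup, scope these down to least privilege.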
serverless.yaml
provider:
  iam:
    role:
      statements:
        - Effect: "Allow"
          Action:
            - execute-api:*
            - lambda:*
            - aoss:*
            # Add opensearch-specific permissions here if needed
          Resource: "*"
OpenSearch Index Schema Definition
Before diving into the Lambda implementation, let’s examine the schema definition for our facility activity logs index. It captures user interactions, system events, and metadata for auditing and analytics.
entities/opensearch/facility_activity_log.js
/**
 * @description
 * OpenSearch index mapping and settings for facility activity logs.
 * Captures user interactions, system events, and metadata for auditing and analytics.
 * Includes configurations for indexing behavior, shard allocation, and refresh intervals.
 */
module.exports.facilityActivityLog = {
  name: 'facility-activity-logs',
  properties: {
    event_id: { type: 'keyword', index: true },
    timestamp: { type: 'date', format: 'strict_date_optional_time||epoch_millis' },
    event_type: { type: 'keyword', index: true },
    risk_level: { type: 'keyword', index: true },
    risk_score: { type: 'integer' },
    user_id: {
      type: 'text',
      fields: {
        keyword: { type: 'keyword', ignore_above: 256 },
      },
    },
    user_name: {
      type: 'text',
      fields: {
        keyword: { type: 'keyword', ignore_above: 256 },
      },
    },
    user_group: {
      type: 'text',
      fields: {
        keyword: { type: 'keyword', ignore_above: 256 },
      },
    },
    user_email_address: {
      type: 'text',
      fields: {
        keyword: { type: 'keyword', ignore_above: 256 },
      },
    },
    patient_id: {
      type: 'text',
      fields: {
        keyword: { type: 'keyword', ignore_above: 256 },
      },
    },
  },
  settings: {
    index: {
      number_of_shards: 1,
      number_of_replicas: 0,
      refresh_interval: '30s',
      max_result_window: 50000,
    },
  },
}
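Because OpenSearch maps undeclared fields dynamically by default, a document with a misspelled or unexpected field can quietly widen the index mapping. The following sketch checks a sample document against the declared fields before indexing. It uses a trimmed copy of the mapping so that it runs on its own; the helper findUnmappedFields and the sample document (including ip_address) are hypothetical and only cover top-level fields.

```javascript
// Trimmed, hypothetical copy of the mapping, for illustration only.
const properties = {
  event_id: { type: 'keyword' },
  timestamp: { type: 'date', format: 'strict_date_optional_time||epoch_millis' },
  event_type: { type: 'keyword' },
  risk_score: { type: 'integer' },
  user_id: { type: 'text', fields: { keyword: { type: 'keyword', ignore_above: 256 } } },
}

// Return the top-level document fields that the mapping does not declare.
function findUnmappedFields(doc, mappingProperties) {
  return Object.keys(doc).filter(
    (field) => !Object.prototype.hasOwnProperty.call(mappingProperties, field)
  )
}

const sampleDoc = {
  event_id: 'evt-001',
  timestamp: '2024-01-15T10:30:00Z',
  event_type: 'LOGIN',
  risk_score: 12,
  user_id: 'user-42',
  ip_address: '10.0.0.1', // not declared in the mapping above
}

console.log(findUnmappedFields(sampleDoc, properties)) // [ 'ip_address' ]
```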
Lambda Logic
Here’s a basic example of what handlers/manage_indexes.js might look like:
const response = require('cfn-response')
const { defaultProvider } = require('@aws-sdk/credential-provider-node')
const { Client } = require('@opensearch-project/opensearch')
const { AwsSigv4Signer } = require('@opensearch-project/opensearch/aws')
const { facilityActivityLog } = require('../entities/opensearch/facility_activity_log')

// OpenSearch client that signs requests with SigV4 using the Lambda role's credentials.
const client = new Client({
  ...AwsSigv4Signer({
    region: process.env.REGION, // AWS region
    service: 'aoss', // OpenSearch Serverless
    getCredentials: () => {
      const credentialsProvider = defaultProvider()
      return credentialsProvider()
    },
  }),
  node: process.env.ACTIVITY_COLLECTION_ENDPOINT,
})

/**
 * Creates the facility activity log index if it does not exist;
 * otherwise applies the current field mappings to the existing index.
 */
async function createFacilityActivityIndex() {
  const exists = await client.indices.exists({ index: facilityActivityLog.name })
  if (!exists.body) {
    const create_index_response = await client.indices.create({
      index: facilityActivityLog.name,
      body: {
        settings: facilityActivityLog.settings,
        mappings: {
          properties: facilityActivityLog.properties,
        },
      },
    })
    return create_index_response
  }
  // The put-mapping API takes `properties` at the top level of the body,
  // not nested under `mappings` as in the create-index request.
  const put_mapping_response = await client.indices.putMapping({
    index: facilityActivityLog.name,
    body: {
      properties: facilityActivityLog.properties,
    },
  })
  return put_mapping_response
}

/**
 * Handler to manage OpenSearch indexes post-deployment.
 * Creates or updates the index for facility activity logs.
 * @param {Object} event - The CloudFormation custom resource event.
 * @param {Object} context - The Lambda context object.
 * @returns {Object} - The response object with status and message.
 */
module.exports.handler = async (event, context) => {
  const responseData = {
    Message: 'Static response from Lambda-backed custom resource',
    Timestamp: new Date().toISOString(),
  }
  // Caveat: send() in the cfn-response package is callback-based and does not
  // return a promise; verify the response is delivered before this async handler ends.
  try {
    console.debug(JSON.stringify(event))
    await createFacilityActivityIndex()
    return response.send(event, context, response.SUCCESS, responseData, event.LogicalResourceId)
  } catch (error) {
    // Log the failure so the cause is visible in CloudWatch Logs.
    console.error('Failed to manage OpenSearch index', error)
    // cfn-response exports SUCCESS and FAILED; there is no FAIL constant.
    return response.send(event, context, response.FAILED, responseData, event.LogicalResourceId)
  }
}
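One limitation of the update path is worth planning for: the put-mapping API can add new fields, but in general it cannot change the type of a field that already exists. A minimal sketch of a pre-check follows, assuming the current properties have already been fetched from the index (for example with the client’s indices.getMapping call). The helper diffMappings and the sample mappings are hypothetical.

```javascript
// Compare the mapping currently on the index with the desired one and report
// which fields are new and which would require an incompatible type change.
function diffMappings(existingProps, desiredProps) {
  const added = []
  const conflicting = []
  for (const [field, desired] of Object.entries(desiredProps)) {
    if (!Object.prototype.hasOwnProperty.call(existingProps, field)) {
      added.push(field)
    } else if (existingProps[field].type !== desired.type) {
      conflicting.push(field)
    }
  }
  return { added, conflicting }
}

const existing = { event_id: { type: 'keyword' }, risk_score: { type: 'integer' } }
const desired = {
  event_id: { type: 'keyword' },
  risk_score: { type: 'keyword' }, // type change: cannot be applied in place
  patient_id: { type: 'text' }, // new field: safe to add
}

console.log(diffMappings(existing, desired))
// { added: [ 'patient_id' ], conflicting: [ 'risk_score' ] }
```

If conflicting is non-empty, failing the deployment early with a clear message is usually easier to debug than a rejected put-mapping request; changing a field type typically means creating a new index and reindexing.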
Handling CloudFormation Signaling with cfn-response
When using a Lambda-backed custom resource in CloudFormation, CloudFormation waits for a signal from the Lambda function to know whether the operation succeeded or failed. This is where the cfn-response module (or its equivalent logic) comes into play.
Why cfn-response?
CloudFormation doesn’t inherently understand the result of a Lambda function execution. Instead, your Lambda must send a response to the pre-signed S3 URL provided in event.ResponseURL. This tells CloudFormation whether to continue or roll back the stack; if no response arrives, the stack operation waits until it times out.
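To make that signaling concrete, here is a sketch of the "equivalent logic" using only Node’s built-in https module. It returns a promise, so an async handler can await delivery of the response before finishing. The function names buildCfnResponseBody and sendCfnResponse are hypothetical, the sample event values are placeholders, and falling back to the logical ID as the physical resource ID is an assumption of this sketch.

```javascript
const https = require('node:https')

// Build the JSON document CloudFormation expects from a custom resource.
function buildCfnResponseBody(event, context, status, data, reason) {
  return JSON.stringify({
    Status: status, // 'SUCCESS' or 'FAILED'
    Reason: reason || `See CloudWatch log stream: ${context.logStreamName}`,
    PhysicalResourceId: event.PhysicalResourceId || event.LogicalResourceId,
    StackId: event.StackId,
    RequestId: event.RequestId,
    LogicalResourceId: event.LogicalResourceId,
    Data: data,
  })
}

// PUT the response to the pre-signed URL; resolves with the HTTP status code.
function sendCfnResponse(event, context, status, data, reason) {
  const body = buildCfnResponseBody(event, context, status, data, reason)
  const url = new URL(event.ResponseURL)
  const options = {
    hostname: url.hostname,
    port: 443,
    path: url.pathname + url.search,
    method: 'PUT',
    headers: { 'content-type': '', 'content-length': Buffer.byteLength(body) },
  }
  return new Promise((resolve, reject) => {
    const request = https.request(options, (res) => {
      res.on('end', () => resolve(res.statusCode))
      res.resume() // drain the response so 'end' fires and the socket is released
    })
    request.on('error', reject)
    request.write(body)
    request.end()
  })
}

// Inspect the payload for a placeholder event (no network call is made here).
const sampleEvent = {
  ResponseURL: 'https://example.com/placeholder',
  StackId: 'sample-stack-id',
  RequestId: 'sample-request-id',
  LogicalResourceId: 'TriggerManageDashOpensearchCustomResource',
}
console.log(buildCfnResponseBody(sampleEvent, { logStreamName: 'sample-stream' }, 'SUCCESS', { Message: 'ok' }))
```

Inside a handler this would be used as `await sendCfnResponse(event, context, 'SUCCESS', responseData)`, with 'FAILED' and a reason string in the catch branch.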
Benefits
- Fully automated: No manual index setup needed.
- Repeatable: Ensures consistency across environments.
- Post-deployment safe: Only runs once stack is fully provisioned.
- Modular: Easy to extend or adapt with custom mappings or seed data.
Next Steps
- Add OpenSearch credentials to AWS Parameter Store or Secrets Manager.
- Scope down IAM permissions.
- Implement retry logic or CloudWatch alerts for production readiness.
- Version-control your index mappings for auditability.
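For the retry item, one possible starting point is a small wrapper with exponential backoff around the index call. This is a sketch under assumptions: the names backoffDelay and withRetry, the default attempt count, and the delay values are all hypothetical, and the flaky operation stands in for a real OpenSearch request.

```javascript
// Delay before retry number `attempt` (1-based), doubling each time up to maxMs.
function backoffDelay(attempt, baseMs, maxMs) {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1))
}

// Run `operation` until it succeeds or maxAttempts is reached,
// then rethrow the last error.
async function withRetry(operation, { maxAttempts = 3, baseMs = 200, maxMs = 5000 } = {}) {
  let lastError
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      return await operation()
    } catch (error) {
      lastError = error
      if (attempt < maxAttempts) {
        const delay = backoffDelay(attempt, baseMs, maxMs)
        await new Promise((resolve) => setTimeout(resolve, delay))
      }
    }
  }
  throw lastError
}

// Simulated operation that fails twice before succeeding.
let calls = 0
const flaky = async () => {
  calls += 1
  if (calls < 3) throw new Error('temporary failure')
  return 'index ready'
}

withRetry(flaky, { baseMs: 10 }).then((result) => console.log(result, 'after', calls, 'calls'))
```

In the handler, the index call could be wrapped as `await withRetry(() => createFacilityActivityIndex())`. Keep the total retry time well under the Lambda timeout so the function can still report failure to CloudFormation.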
Conclusion
Automating OpenSearch index management using the Serverless Framework and AWS Lambda eliminates manual configuration errors while ensuring consistent deployment of complex schemas across all environments.