Lambda Fleet Monitoring with OpenSearch: Real-Time Insights at Scale

Yaroslav Naumenko

Do you manage multiple AWS accounts with countless Lambda functions and struggle to monitor them all? The Lambda Fleet Monitoring Solution is a fully automated, cross-account approach that tracks real-time metrics (invocations, errors, duration, and even cold starts) and funnels them into an OpenSearch cluster for analysis and visualization.

In this article, we’ll walk through this solution’s architecture, features, and setup. To dive deeper into the code and additional details, check out the opensearch-monitoring GitHub repository.

Why This Matters

As serverless adoption grows, monitoring Lambda metrics becomes increasingly challenging, especially if you have multiple AWS accounts.

With the Lambda Fleet Monitoring Solution, you gain:

  • Visibility into every function’s performance and execution patterns.
  • Centralized dashboards for easier troubleshooting.
  • Scalability that covers as many AWS accounts as you need.

High-Level Architecture

Key Components

  1. Amazon EventBridge: Schedules the monitoring Lambda to run on a configurable interval.
  2. Monitoring Lambda: Assumes roles in other AWS accounts to gather CloudWatch metrics and push them to OpenSearch.
  3. OpenSearch: Serves as the data store for all metrics.
  4. OpenSearch Dashboards: Provides out-of-the-box (and customizable) visualization tools.
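
The flow through these components can be sketched in Python. This is a minimal illustration and not the repository's actual code: the role name `LambdaFleetMonitoringRole`, the helper names, and the document field names are all assumptions. The AWS clients are passed in as arguments, so the functions work with real boto3 clients (`boto3.client("sts")`, `boto3.client("cloudwatch")`) or with test doubles.

```python
import json
from datetime import datetime, timedelta, timezone

# Standard metrics in the AWS/Lambda CloudWatch namespace and the statistic to request.
METRICS = {"Invocations": "Sum", "Errors": "Sum", "Duration": "Average"}


def assume_monitoring_role(sts, account_id, role_name="LambdaFleetMonitoringRole"):
    """Return temporary credentials for a monitored account.

    The default role name is hypothetical; use whatever role you created.
    """
    response = sts.assume_role(
        RoleArn=f"arn:aws:iam::{account_id}:role/{role_name}",
        RoleSessionName="lambda-fleet-monitoring",
    )
    return response["Credentials"]


def build_metric_queries(function_name, period_seconds=300):
    """Build one GetMetricData query per metric for a single function."""
    return [
        {
            "Id": f"m{i}",  # query IDs must start with a lowercase letter
            "Label": metric,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Lambda",
                    "MetricName": metric,
                    "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
                },
                "Period": period_seconds,
                "Stat": stat,
            },
        }
        for i, (metric, stat) in enumerate(METRICS.items())
    ]


def collect_metrics(cloudwatch, account_id, function_name, now=None):
    """Fetch the newest datapoint per metric and shape it as one document."""
    now = now or datetime.now(timezone.utc)
    response = cloudwatch.get_metric_data(
        MetricDataQueries=build_metric_queries(function_name),
        StartTime=now - timedelta(minutes=5),
        EndTime=now,
    )
    doc = {
        "account_id": account_id,
        "function_name": function_name,
        "@timestamp": now.isoformat(),
    }
    for result in response["MetricDataResults"]:
        values = result["Values"]  # newest first with the default scan order
        doc[result["Label"].lower()] = values[0] if values else 0.0
    return doc


def to_bulk_body(index, docs):
    """Serialize documents as newline-delimited JSON for the OpenSearch _bulk API."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"
```

In a real handler, the credentials returned by `assume_monitoring_role` would be used to create a CloudWatch client for each monitored account, and the bulk body would be sent to the OpenSearch domain's `_bulk` endpoint as a signed request.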

Core Features

  • Cross-Account Monitoring: Leverage IAM roles to gather data from multiple AWS accounts.
  • Real-Time Metrics: Track invocation rates, error counts, memory usage, duration statistics, cold starts, etc.
  • Custom Dashboards: Quickly visualize performance trends and identify anomalies.
  • Automated Setup: Terraform automates resource creation, so minimal manual configuration is required.
  • Customizable Alerts: Integrate with AWS services or third-party tools for alerting on critical thresholds.
  • Memory & Timeout Insights: Optimize Lambda performance and costs based on usage patterns.

Metrics You’ll See

  1. Invocation Count
  2. Error Rates
  3. Duration Statistics
  4. Memory Utilization
  5. Cold Start Frequency
  6. Timeout Proximity
  7. Runtime Distribution
  8. Cost Metrics
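
Some of these metrics are likely derived from raw values rather than read directly. The sketch below shows one plausible way to compute three of them. The formulas and function names are illustrative assumptions and not taken from the repository. The cost estimate relies only on the fact that Lambda bills per request plus per GB-second of duration; prices are passed in by the caller because they vary by region and architecture.

```python
def error_rate(invocations, errors):
    """Fraction of invocations that ended in an error (0.0 when idle)."""
    return errors / invocations if invocations else 0.0


def timeout_proximity(max_duration_ms, timeout_seconds):
    """How close the slowest run came to the configured timeout (1.0 = at the limit)."""
    return max_duration_ms / (timeout_seconds * 1000)


def estimated_cost(invocations, avg_duration_ms, memory_mb,
                   price_per_gb_second, price_per_request):
    """Rough compute cost: GB-seconds consumed plus the per-request charge.

    Both prices are caller-supplied assumptions; look up current pricing
    for your region and architecture.
    """
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * price_per_gb_second + invocations * price_per_request
```

A function whose timeout proximity stays near 1.0 is a candidate for a longer timeout or a performance fix, while one that stays far below it may have a timeout set higher than it needs.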

Prerequisites

To get started, ensure you have:

  • AWS CLI configured with the right permissions.
  • Terraform v1.5.0+ installed.
  • Python 3.9+ installed.
  • Cross-account IAM roles set up in each AWS account you wish to monitor.
  • Permission to create:
    • Lambda functions
    • OpenSearch domains
    • IAM roles and policies
    • EventBridge rules (formerly CloudWatch Events)
    • S3 buckets
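
For the cross-account IAM roles, each monitored account needs a role that the monitoring Lambda's execution role is allowed to assume. Here is a hedged sketch of the two policy documents such a role might carry, built as Python dictionaries. The account ID, role name, and exact action list are assumptions for illustration; check the repository for the permissions it actually requires.

```python
import json


def build_trust_policy(monitoring_account_id, monitoring_role_name):
    """Trust policy letting the monitoring Lambda's role assume this role."""
    principal = f"arn:aws:iam::{monitoring_account_id}:role/{monitoring_role_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_read_only_policy():
    """Read-only permissions for listing functions and reading their metrics."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "cloudwatch:GetMetricData",
                    "cloudwatch:ListMetrics",
                    "lambda:ListFunctions",
                    "lambda:GetFunctionConfiguration",
                ],
                "Resource": "*",
            }
        ],
    }


def to_policy_json(policy):
    """Render a policy dictionary as JSON suitable for IAM."""
    return json.dumps(policy, indent=2)
```

Calling `to_policy_json(build_trust_policy("111122223333", "lambda-fleet-monitor"))` produces a document you could paste into the role's trust relationship; both arguments there are placeholder values.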

Quick Start Installation

1. Clone the Repository

git clone https://github.com/cloudon-one/opensearch-monitoring
cd opensearch-monitoring/lambda/terraform

2. Configure Variables

In a terraform.tfvars file, define your settings:

aws_region                       = "us-west-1"
monitored_accounts               = ["123456789012", "098765432109"]
opensearch_master_user_password  = "your-secure-password"
opensearch_instance_type         = "t3.small.search"
opensearch_instance_count        = 1
opensearch_volume_size           = 10
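
Two of these values are easy to get wrong. Account IDs must stay quoted strings so that leading zeros survive, and, to my understanding, OpenSearch Service expects a master user password of at least 8 characters with an uppercase letter, a lowercase letter, a digit, and a special character. The following pre-flight check is a sketch under those assumptions; the helper names are hypothetical and not part of the repository.

```python
import re
import string


def is_valid_account_id(value):
    """AWS account IDs are 12-digit strings; integers would drop leading zeros."""
    return isinstance(value, str) and re.fullmatch(r"[0-9]{12}", value) is not None


def password_problems(password):
    """List the ways a password misses the assumed master-user password rules."""
    problems = []
    if len(password) < 8:
        problems.append("shorter than 8 characters")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("no digit")
    if not any(c in string.punctuation for c in password):
        problems.append("no special character")
    return problems
```

Under these rules, the placeholder "your-secure-password" would be rejected for lacking an uppercase letter and a digit, so replace it with a real secret. It is also worth keeping terraform.tfvars out of version control.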

3. Initialize Terraform

terraform init

4. Plan & Apply

terraform plan
terraform apply

This will provision the OpenSearch domain, monitoring Lambda, IAM roles, and other necessary resources.

Securing Your Setup

1. Regular Rotation

  • Rotate access keys and review roles periodically.

2. Access Logging

  • Enable CloudTrail logging for all AWS API activities.

3. Least Privilege

  • Minimize permissions where possible and remove unused policies.

4. Organization Controls

  • Use AWS Organizations Service Control Policies (SCPs) for additional governance.

Wrapping Up

The Lambda Fleet Monitoring Solution offers a scalable way to track and analyze the performance of all your AWS Lambda functions, regardless of how many accounts you manage. By combining real-time CloudWatch metrics with the visualization power of OpenSearch, it helps you stay on top of function behavior, performance trends, and potential cost optimizations.

For a deeper dive, including best practices, troubleshooting tips, and advanced configuration options, head to the opensearch-monitoring GitHub repository and explore the documentation. Feel free to fork, submit issues, or contribute enhancements!

Have thoughts or questions?

Comment below or open an issue on GitHub to share your ideas.

Happy monitoring!

Yaroslav Naumenko

Cloud Infrastructure Architect specializing in PCI/HIPAA/FedRAMP compliant solutions at scale. Over a decade building on AWS & GCP.
