Increase control and reduce noise in your AWS logs using Datadog Observability Pipelines

By Ahmed Ahmed and Jesse Mack

Published: February 21, 2025

Today’s SRE and security operations center (SOC) teams often find themselves overwhelmed by the sheer volume and variety of logs generated by critical AWS services such as VPC Flow Logs, AWS WAF, and Amazon CloudFront. While these logs can be valuable for detecting and investigating security threats, as well as troubleshooting issues in your environment, managing them at scale can be challenging and costly. High egress costs, complicated ingestion workflows, and repetitive or irrelevant data can hinder organizations’ ability to extract meaningful insights from AWS logs.

Datadog Observability Pipelines enables teams to take control of their log volumes, processing, and routing, with integrations that help them build pipelines across a range of log sources and destinations, including several key AWS services. In this blog post, we’ll explore how you can use Observability Pipelines to:

Aggregate and process logs from Amazon S3, Amazon Data Firehose, and AWS Lambda

Extract actionable insights from AWS WAF, CloudFront, and VPC Flow Logs

Aggregate and process logs from Amazon S3, Amazon Data Firehose, and AWS Lambda

Amazon S3, Amazon Data Firehose, and AWS Lambda are common pathways for collecting logs from various AWS services, including CloudFront, AWS WAF, and Amazon VPC. While these logs provide valuable data, they can be noisy and difficult to process. Integrating them with Observability Pipelines simplifies collection, parsing, and routing, helping you quickly and easily increase the value you’re getting from your logs.

Collect logs from AWS Lambda

AWS Lambda is a serverless compute service that runs code in response to events and emits logs for warnings, errors, and other critical information. By default, AWS Lambda functions write their logs to Amazon CloudWatch Logs.

To capture these logs with Observability Pipelines, you can set up the Datadog Forwarder to subscribe to your CloudWatch Logs data. Then, in Observability Pipelines, select the HTTP/S Server pipeline source and provide the HTTP address of the Datadog Forwarder as the listener address (the network interface and port that Observability Pipelines listens on for traffic). Observability Pipelines will then process and route your logs according to your pipeline’s configuration.

Send logs from a variety of AWS sources to Observability Pipelines by configuring the Datadog Forwarder to subscribe to Amazon CloudWatch Logs
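
If you prefer to script the subscription step, the sketch below uses boto3 to grant CloudWatch Logs permission to invoke the Forwarder Lambda and to subscribe it to an application’s log group. The region, log group name, function name, and Forwarder ARN are placeholders, not values from this post; substitute your own.

```python
# A minimal sketch, assuming a hypothetical Forwarder deployed as
# "datadog-forwarder" in us-east-1. All names and ARNs are placeholders.
import boto3

REGION = "us-east-1"
FORWARDER_ARN = "arn:aws:lambda:us-east-1:123456789012:function:datadog-forwarder"  # placeholder

# Allow CloudWatch Logs to invoke the Forwarder Lambda.
lambda_client = boto3.client("lambda", region_name=REGION)
lambda_client.add_permission(
    FunctionName="datadog-forwarder",          # placeholder function name
    StatementId="cloudwatch-logs-invoke",
    Action="lambda:InvokeFunction",
    Principal="logs.amazonaws.com",
)

# Subscribe the Forwarder to the application's log group; an empty filter
# pattern forwards every log event.
logs = boto3.client("logs", region_name=REGION)
logs.put_subscription_filter(
    logGroupName="/aws/lambda/my-application-function",  # hypothetical log group
    filterName="datadog-forwarder-subscription",
    filterPattern="",
    destinationArn=FORWARDER_ARN,
)
```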

Stream your CloudFront logs using Amazon Data Firehose

Amazon Data Firehose is a fully managed service for delivering real-time streaming data, and it can receive logs from many AWS services, such as CloudWatch and CloudFront. Once these logs arrive in Amazon Data Firehose, you can forward them to Datadog Observability Pipelines, use the JSON parser and splitter to filter and extract individual logs, and then route them to their final destination.
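
To illustrate what the JSON parser and splitter accomplish, the sketch below decodes a simplified Firehose HTTP-endpoint delivery batch and splits it into individual JSON log events. The field names in the sample records are illustrative, not part of any real log format.

```python
# A conceptual sketch: Firehose delivers records as a batch of base64-encoded
# payloads, and each record is decoded and split into an individual log event,
# similar in spirit to what the JSON parser and splitter do in a pipeline.
import base64
import json

firehose_batch = {
    "requestId": "example-request-id",
    "timestamp": 1740096000000,
    "records": [
        {"data": base64.b64encode(b'{"source": "cloudfront", "status": 200}').decode()},
        {"data": base64.b64encode(b'{"source": "cloudfront", "status": 503}').decode()},
    ],
}

# Decode each record and parse it into a standalone JSON log event.
individual_logs = [
    json.loads(base64.b64decode(record["data"]))
    for record in firehose_batch["records"]
]

for log in individual_logs:
    print(log["source"], log["status"])
```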

Source logs from Amazon S3

Amazon S3 is a low-cost, highly scalable object storage service frequently used for log archival or bulk storage. Storing raw logs in S3 allows you to retain them for long-term compliance in a cost-effective manner.

Once you choose Amazon S3 as the source for a new pipeline, Observability Pipelines will start collecting these events and route and/or process them based on your selections. This enables you to send high-value logs to Datadog Log Management (or another destination) for analysis, while keeping low-priority logs stored in S3 buckets.
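
As one illustration of the plumbing behind an S3-based source, the sketch below uses boto3 to configure bucket event notifications to an SQS queue so that new log objects can be discovered as they arrive. The bucket name and queue ARN are placeholders, and this is only an assumption about a common setup; check the Observability Pipelines documentation for exactly what the Amazon S3 source expects.

```python
# A minimal sketch, assuming a hypothetical bucket and SQS queue used to signal
# the arrival of new log objects. Replace the placeholders with your own values.
import boto3

s3 = boto3.client("s3", region_name="us-east-1")

s3.put_bucket_notification_configuration(
    Bucket="my-log-archive-bucket",  # hypothetical bucket holding raw logs
    NotificationConfiguration={
        "QueueConfigurations": [
            {
                "QueueArn": "arn:aws:sqs:us-east-1:123456789012:op-s3-log-events",  # placeholder
                "Events": ["s3:ObjectCreated:*"],  # notify on every new log object
            }
        ]
    },
)
```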

Extract actionable insights from AWS WAF, CloudFront, and VPC Flow Logs

AWS services like WAF, CloudFront, and VPC Flow Logs produce high volumes of data in various formats. Datadog Observability Pipelines helps you process these logs, enrich them with context, and generate high-value metrics from them, so you can use these logs to better understand your environment’s performance.

VPC Flow Logs provide a record of IP traffic to and from network interfaces in your virtual private cloud (VPC). While these logs can be critical for security and network insights, they also tend to be noisy—they are generated at high volume, and not all logs contain critical information. With Observability Pipelines, you can use filters to make your log stream less noisy, enrich logs with GeoIP or hostname data, and generate custom metrics to help you spot suspicious connections.
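
The sketch below shows, in plain Python, the kind of filtering and metric generation described here: it drops ACCEPT records from flow logs in the default space-separated format and counts rejected connections per source address. In a real pipeline these steps run as Observability Pipelines processors; the sample records are illustrative.

```python
# A conceptual sketch of filtering noisy VPC Flow Logs and deriving a custom
# metric (rejected connections per source IP) from the remaining records.
from collections import Counter

flow_logs = [
    "2 123456789012 eni-0a1b2c3d 10.0.1.5 10.0.2.9 443 49152 6 10 840 1740096000 1740096060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d 198.51.100.7 10.0.2.9 22 49153 6 3 180 1740096000 1740096060 REJECT OK",
    "2 123456789012 eni-0a1b2c3d 198.51.100.7 10.0.2.9 3389 49154 6 3 180 1740096000 1740096060 REJECT OK",
]

rejects_by_source = Counter()
for line in flow_logs:
    fields = line.split()
    srcaddr, action = fields[3], fields[12]   # srcaddr and action in the default flow log format
    if action == "REJECT":                    # drop the noisy ACCEPT records, keep rejected traffic
        rejects_by_source[srcaddr] += 1

# A per-source rejection count is the kind of custom metric that can surface
# suspicious connection attempts.
print(rejects_by_source)  # Counter({'198.51.100.7': 2})
```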

CloudFront logs capture request paths, edge locations used, and metrics that provide insights into the performance of your content delivery network. By using Observability Pipelines to route and process your CloudFront logs, you can parse your log data for performance metrics, redact sensitive details, and archive or forward logs as needed, helping you derive more insight from your CloudFront logs and manage storage costs.
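
As a rough illustration of that parsing and redaction, the sketch below reads an abbreviated CloudFront standard access log (tab-separated W3C format), redacts the client IP, and extracts a latency value from the time-taken field. The sample lines use only a subset of the real field list and are illustrative.

```python
# A conceptual sketch: parse a (truncated) CloudFront standard access log using
# its #Fields header, redact the client IP, and surface a latency metric.
log_text = """#Version: 1.0
#Fields: date time x-edge-location c-ip cs-uri-stem sc-status time-taken
2025-02-21\t10:00:01\tIAD89-C1\t203.0.113.10\t/index.html\t200\t0.042
2025-02-21\t10:00:02\tIAD89-C1\t203.0.113.11\t/api/orders\t503\t1.318
"""

lines = log_text.splitlines()
field_names = lines[1].removeprefix("#Fields: ").split()

for line in lines[2:]:
    record = dict(zip(field_names, line.split("\t")))
    record["c-ip"] = "REDACTED"                       # redact the client IP before forwarding
    latency_ms = float(record["time-taken"]) * 1000   # time-taken is reported in seconds
    print(record["cs-uri-stem"], record["sc-status"], f"{latency_ms:.0f} ms")
```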

AWS WAF logs contain details on network traffic to your AWS resources, providing security teams with important insights that improve threat detection and response. You can use Observability Pipelines to automatically remove sensitive data from your logs, create custom metrics (like block rates) for security use cases, and forward only high-severity events to specialized security tools.
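
The sketch below illustrates those ideas on a few simplified WAF log records: it computes a block rate and selects only blocked requests for forwarding. The records use just the action and httpRequest fields, and the sample values are illustrative.

```python
# A conceptual sketch of deriving a block-rate metric from WAF logs and routing
# only blocked (higher-severity) events onward to a security destination.
import json

raw_waf_logs = [
    '{"action": "ALLOW", "httpRequest": {"clientIp": "192.0.2.10", "uri": "/login"}}',
    '{"action": "BLOCK", "httpRequest": {"clientIp": "198.51.100.7", "uri": "/admin"}}',
    '{"action": "ALLOW", "httpRequest": {"clientIp": "192.0.2.11", "uri": "/home"}}',
]

events = [json.loads(line) for line in raw_waf_logs]
blocked = [event for event in events if event["action"] == "BLOCK"]

block_rate = len(blocked) / len(events)    # custom metric: share of requests that were blocked
print(f"block rate: {block_rate:.0%}")     # block rate: 33%

# Only the blocked events would be forwarded to a SIEM or other security tool.
for event in blocked:
    print(event["httpRequest"]["clientIp"], event["httpRequest"]["uri"])
```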

Let’s say you’re the CISO at a fintech company that hosts its core services on AWS. Because of the industry you’re in, security, compliance, and auditability are all critical. Your organization sends data from VPC Flow Logs, AWS WAF, and CloudFront logs to Amazon Data Firehose for simple storage and basic auditing. You want to filter and enrich these logs and generate metrics, and you also want to send CloudFront logs to S3 buckets while routing WAF and VPC Flow Logs to Amazon Security Lake.

You can use Datadog Observability Pipelines to build a log pipeline that looks like this:

Pipeline that sends certain logs from Amazon Data Firehose to Amazon Security Lake and others to S3 Buckets

This pipeline enables you to gain the insights you need from your most important logs and manage your spend by archiving less critical data, while still keeping it available for potential audits.

Get more value from your AWS logs with Datadog Observability Pipelines

When dealing with AWS logs at scale, it’s vital to optimize both your process and budget. Datadog Observability Pipelines provides a centralized control plane to set up, manage, and optimize your log flows. In addition to ingesting data from Amazon S3, Amazon Data Firehose, and AWS Lambda, you can also apply transformations and send logs to your preferred destinations, whether that’s Datadog Log Management, a SIEM for security analytics, or a data lake for long-term storage. You can also use processors in Observability Pipelines to filter repetitive events and sample high-volume logs, helping you reduce costs and ensure that you retain only the most valuable data. With these capabilities, you can support enhanced analytics, improve your threat detection, and optimize your log management strategy without vendor lock-in.

To get started sending your logs from Amazon S3, Amazon Data Firehose, or AWS Lambda with Observability Pipelines, configure the Amazon S3, Amazon Data Firehose, and HTTP/S Server data sources. For more information, visit our documentation. If you’re new to Datadog, you can sign up for a 14-day free trial.