Generate metrics from your high-volume logs with Datadog Observability Pipelines

By Candace Shamieh and Pratik Parekh

Published: October 21, 2024

Logs are a rich source of information, providing you with the minute details you need to troubleshoot a specific issue or perform extensive historical analysis. But with billions of logs being generated from your infrastructure every day, it isn’t practical to sift through them all to derive actionable insights. Firewall, CDN, network activity, and load balancer logs are especially high volume, requiring storage solutions that can be expensive and difficult to scale. Though logs are one of the three pillars of observability, an overreliance on them isn’t conducive to a long-term, cost-effective observability strategy.

By extracting metrics from logs, you can keep the most important information readily available while minimizing the costs associated with log management. Generating metrics from logs enables you to reduce the volume of logs that you ingest, store, and route without compromising your ability to capture key insights or analyze historical trends.
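To make the pattern concrete, here's a minimal Python sketch of log-to-metric aggregation. This illustrates the concept only, not how Observability Pipelines is implemented or configured, and the Apache-style log layout is an assumption for the example:

```python
import re
from collections import Counter

# Hypothetical Apache-style access logs; the layout is assumed for illustration.
raw_logs = [
    '203.0.113.7 - - [21/Oct/2024:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '198.51.100.2 - - [21/Oct/2024:10:00:02 +0000] "GET /missing HTTP/1.1" 404 310',
    '203.0.113.7 - - [21/Oct/2024:10:00:03 +0000] "POST /api/orders HTTP/1.1" 500 87',
]

STATUS_RE = re.compile(r'" (\d{3}) ')  # HTTP status code follows the quoted request

def logs_to_metrics(lines):
    """Aggregate raw log lines into per-status-code request counts.

    The counts are what get shipped as metric points; the raw lines can
    then be dropped or archived, which is where the volume reduction
    comes from.
    """
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[f"http.requests.count{{status:{match.group(1)}}}"] += 1
    return counts

print(logs_to_metrics(raw_logs))
# Counter({'http.requests.count{status:200}': 1,
#          'http.requests.count{status:404}': 1,
#          'http.requests.count{status:500}': 1})
```

In a real pipeline, counts like these would be emitted periodically as timestamped metric points while the raw lines are dropped or routed onward.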

Datadog already enables you to extract metrics from ingested logs by using Log Pipelines, so you can retain logs selectively, track SLOs, and more. Now, Observability Pipelines can also extract metrics from your logs at the source. Whether your logs are ingested and stored in Datadog, Splunk, Elasticsearch, or any other third-party tool, Observability Pipelines can generate metrics from them in near real time, giving you the option to ship, drop, or store logs however you see fit.

In this post, we’ll discuss how Observability Pipelines enables you to:

- Retain fewer logs without sacrificing search and analytics capabilities
- Derive actionable insights from verbose logs without overburdening resources or reducing developer efficiency

Retain fewer logs without sacrificing search and analytics capabilities

When logs are constantly generated from a myriad of applications and infrastructure resources, gaining a complete understanding of your system can be challenging. To address this and give teams the flexibility to use the best-of-breed tools that fit their use cases, many organizations send their logs to multiple destinations for analysis or troubleshooting. But sending high-volume logs—such as Akamai or Cloudflare logs—to multiple destinations requires a significant amount of network bandwidth, forces you to incur egress costs, and may result in logs being stored in suboptimal locations.

Using Observability Pipelines to generate metrics from your high-volume logs can reduce the number of logs you retain in cost-prohibitive, frequent-access storage solutions. Once you generate metrics, you can drop the logs or store them in cost-effective, long-term storage solutions for retrieval on demand.
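The resulting drop-or-archive decision might look like the following sketch. The policy shown (archive error logs for later rehydration, drop the rest) and the destination labels are hypothetical, not Observability Pipelines defaults:

```python
def route_log(record: dict) -> str:
    """Decide where a log goes once its metrics have been extracted.

    Error logs are kept in low-cost archive storage for on-demand
    retrieval; routine logs are dropped entirely. The 5xx/error rule
    is an illustrative policy, not a Datadog default.
    """
    if record.get("status", 0) >= 500 or record.get("level") == "error":
        return "archive"  # e.g., an Amazon S3 bucket for later rehydration
    return "drop"         # metrics already captured; raw line not retained

print(route_log({"status": 502}))  # archive
print(route_log({"status": 200}))  # drop
```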

Whereas log retention periods can vary widely (depending on log type, use case, observability tool, length of contractual commitment, and other factors), the metrics that Observability Pipelines generates are retained in Datadog for 15 months. This longer retention period enables you to consistently leverage these log-based metrics for predictive forecasting and seasonality checks.

For example, let’s say you work in an organization that ingests billions of logs per day. Storing all of these logs in a frequent-access storage solution is cost prohibitive, but the logs contain information that is necessary for analysis, troubleshooting, and compliance. With Observability Pipelines, you can extract key metrics from these logs and collect them in Datadog, enabling you to track performance in real time, analyze trends, and correlate with other infrastructure and application metrics that you already monitor in Datadog. This allows you to drop logs or send them to an archive storage solution, resulting in cost savings.

View of a pipeline that generates metrics from logs routed from Splunk to both Amazon S3 and Splunk

Once your log-derived metrics are in Datadog, you can create metric-based alerts and visualize them using custom widgets in dashboards.
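For example, once a log-derived metric is flowing into Datadog, you can create a monitor on it programmatically with the official datadog-api-client Python package. In this sketch, the metric name cdn.requests.errors, the threshold, and the notification handle are all placeholders:

```python
# pip install datadog-api-client
# Assumes DD_API_KEY and DD_APP_KEY are set in the environment.
from datadog_api_client import ApiClient, Configuration
from datadog_api_client.v1.api.monitors_api import MonitorsApi
from datadog_api_client.v1.model.monitor import Monitor
from datadog_api_client.v1.model.monitor_type import MonitorType

# "cdn.requests.errors" is a hypothetical log-derived metric name.
body = Monitor(
    name="High CDN error rate (log-derived metric)",
    type=MonitorType.QUERY_ALERT,
    query="sum(last_5m):sum:cdn.requests.errors{*}.as_count() > 100",
    message="CDN errors exceeded the threshold. @slack-ops-alerts",
    tags=["team:edge", "source:observability-pipelines"],
)

with ApiClient(Configuration()) as api_client:
    monitor = MonitorsApi(api_client).create_monitor(body=body)
    print(monitor.id)
```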

Derive actionable insights from verbose logs without overburdening resources or reducing developer efficiency

Depending on your log management practices, verbose logs can prove useful for troubleshooting issues that occur in your production environment. But routing and storing verbose logs is likely to impact resource performance and developer efficiency. Observability Pipelines not only extracts metrics from your verbose logs but can also reduce log size before sending them to a destination. This enables you to decrease network and egress costs and prevents your resources from becoming overburdened.

Quickly deriving insights from verbose logs can be complex, depending on their structure, length, and the granularity of the embedded information. Extracting key metrics from them can reduce mean time to resolution (MTTR) by separating the most immediately useful information from complex log data.

View of the log-based VPC and CDN metrics generated by the pipeline

For example, CDN, firewall, network activity, and load balancer logs frequently include extensive amounts of information, such as multiple timestamps, request URLs, source and destination IP addresses, caching information, geolocation of requests, connection attempts, and more. As a result, these logs require a significant amount of network bandwidth to route from source to destination. Using Observability Pipelines, the extracted metrics might simply include a timestamp, geolocation of requests, the client IP, and any error codes. If you experience an issue with your environment that warrants an investigation, these metrics can provide enough preliminary information to quickly pinpoint the root cause.
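As a sketch of that reduction, here's what paring a verbose CDN record down to those dimensions could look like in Python. The field names are invented for illustration, since real CDN log schemas vary by vendor:

```python
# Field names are invented for illustration; real CDN logs vary by vendor.
verbose_cdn_log = {
    "timestamp": "2024-10-21T10:00:01Z",
    "request_url": "https://cdn.example.com/assets/app.js",
    "source_ip": "203.0.113.7",
    "destination_ip": "198.51.100.10",
    "cache_status": "MISS",
    "geo_country": "US",
    "tls_version": "1.3",
    "status_code": 504,
    # ...dozens more fields in a real record
}

# The handful of dimensions worth keeping for a first-pass investigation.
METRIC_DIMENSIONS = ("timestamp", "geo_country", "source_ip", "status_code")

def extract_metric_point(record: dict) -> dict:
    """Pare a verbose log record down to a compact metric point."""
    point = {k: record[k] for k in METRIC_DIMENSIONS if k in record}
    point["is_error"] = record.get("status_code", 0) >= 400
    return point

print(extract_metric_point(verbose_cdn_log))
# {'timestamp': '2024-10-21T10:00:01Z', 'geo_country': 'US',
#  'source_ip': '203.0.113.7', 'status_code': 504, 'is_error': True}
```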

Get started extracting metrics from your logs with Observability Pipelines

Datadog Observability Pipelines can generate metrics from your logs before they leave your environment, supporting your long-term observability strategy without compromising compliance or analytics capabilities. Extracting metrics from logs enables you to avoid unnecessary egress and network costs, maintain optimal performance, and take advantage of cost-efficient, long-term storage solutions as your log volumes scale.

For more information, visit our documentation. If you’re new to Datadog, you can sign up for a 14-day free trial.