Filter and correlate logs dynamically using subqueries

Authors: Jordan Obey, Sid Dhingra, Usman Khan

Published: April 1, 2024

Logs provide valuable information that can help you troubleshoot performance issues, track usage patterns, and conduct security audits. To help you derive actionable insights from log sources and conduct thorough investigations, Datadog Log Management provides an easy-to-use query editor that enables you to group logs into patterns with a single click or perform reference table lookups on the fly for in-depth analysis.

As your applications scale, you will inevitably face increasing volumes of logs distributed across several different services, environments, and regions. When data is fragmented this way, you must correlate logs from two or more sources to form a coherent understanding of what’s happening across your distributed ecosystem. Such correlation is often tedious: it typically requires running a log query, exporting the results, and then manually using those results in another query.

To make it easier to correlate logs from multiple sources, Datadog’s Log Explorer now offers subqueries. In this post, we’ll look at how filtering logs with subqueries can help organizations quickly investigate bugs for remediation, gauge the impact of a security breach, and identify the business impact of key users.

Investigate bugs for remediation

To start filtering logs with nested queries, simply click the “Add” button in the top right corner of the Log Explorer and select “Filter with Subquery.”

subquery_02.png

Let’s say you are on the engineering team of an online retailer and are investigating an issue where customers are able to check out items but their payments fail to process. Unfortunately, your transaction service does not emit logs for payment failures, so the only way to identify these errors is by looking for logs from your checkout service that don’t have a corresponding payment success log.

subquery_01.png

Subqueries help overcome this problem by allowing you to run a filter on your checkout service to surface all logs containing transaction_ids that are not present in the logs of your payment service. Now you can view all the transactions where there was a problem with processing payments and focus on identifying steps toward remediating the issue instead of hunting for relevant data.
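
To make the logic concrete, here is a minimal Python sketch of what a "not in" subquery filter computes, run over in-memory records rather than through Datadog. The field names (`service`, `transaction_id`, `status`) and the sample logs are hypothetical, and the helper `filter_not_in_subquery` is our own invention; it illustrates the semantics only and is not Datadog's query syntax or implementation.

```python
# Hypothetical log records; field names and values are illustrative assumptions.
logs = [
    {"service": "checkout", "transaction_id": "t1"},
    {"service": "checkout", "transaction_id": "t2"},
    {"service": "checkout", "transaction_id": "t3"},
    {"service": "payment", "transaction_id": "t1", "status": "success"},
    {"service": "payment", "transaction_id": "t3", "status": "success"},
]


def filter_not_in_subquery(logs, outer_service, inner_service, key):
    """Return outer_service logs whose `key` value never appears in inner_service logs."""
    # Inner query (the subquery): collect every key value seen in the inner service.
    inner_values = {
        log[key] for log in logs
        if log["service"] == inner_service and key in log
    }
    # Outer query: keep logs that carry the key and whose value is absent from that set.
    return [
        log for log in logs
        if log["service"] == outer_service
        and key in log
        and log[key] not in inner_values
    ]


failed_checkouts = filter_not_in_subquery(logs, "checkout", "payment", "transaction_id")
print([log["transaction_id"] for log in failed_checkouts])  # ['t2']
```

In this sample, only transaction `t2` has a checkout log without a matching payment log, so it is the one worth investigating.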

Gauge the impact of a security breach

If you are a security analyst running an investigation on a potential network threat, it’s important for you to know which devices and assets were accessed by malicious actors so you can understand the scope of the threat. You can run a subquery to identify malicious IPs based on a threat vector and then filter logs from assets within your network down to only those accessed by the malicious IPs. In the screenshot below, for example, we are filtering logs from a web-store service down to those that contain IP addresses outside of the network firewall’s allowed list.

subquery_03.png

This way, you can quickly gauge the breadth of an attack and then isolate and quarantine the impacted assets to prevent the attack from spreading further.
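
As a rough illustration of this workflow, the Python sketch below works on in-memory records: an inner query derives the set of allowed IPs, an outer filter keeps web-store logs from any other IP, and a final grouping lists which hosts each suspicious IP touched. The field names (`action`, `client_ip`, `host`), the sample data, and the idea of deriving the allowed list from firewall logs are all assumptions made for the example, not Datadog's data model.

```python
from collections import defaultdict

# Hypothetical records; IPs come from ranges reserved for documentation.
firewall_logs = [
    {"service": "firewall", "action": "allow", "client_ip": "192.0.2.10"},
    {"service": "firewall", "action": "allow", "client_ip": "192.0.2.11"},
]
web_store_logs = [
    {"service": "web-store", "host": "web-1", "client_ip": "192.0.2.10"},
    {"service": "web-store", "host": "web-2", "client_ip": "203.0.113.7"},
    {"service": "web-store", "host": "web-2", "client_ip": "198.51.100.23"},
    {"service": "web-store", "host": "web-3", "client_ip": "203.0.113.7"},
]

# Inner query: the set of IPs the firewall allows.
allowed_ips = {log["client_ip"] for log in firewall_logs if log["action"] == "allow"}

# Outer query: web-store logs whose client IP is outside the allowed set.
suspicious_logs = [log for log in web_store_logs if log["client_ip"] not in allowed_ips]

# Group by IP to see which hosts each suspicious address reached.
hosts_by_ip = defaultdict(set)
for log in suspicious_logs:
    hosts_by_ip[log["client_ip"]].add(log["host"])

impacted_hosts = sorted({log["host"] for log in suspicious_logs})
print(impacted_hosts)  # ['web-2', 'web-3']
```

The resulting host list is the candidate set of assets to isolate, and `hosts_by_ip` shows how widely each address spread.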

Identify key users

When running a web app such as an e-commerce site, identifying your top users is an important step in informing your targeted marketing efforts and optimizing the end-user experience. Let’s say that among the customers who regularly order items, you are only interested in those who have created an account. This information is spread across two sets of logs: your platform logs, which contain information about user accounts and logins, and logs from your third-party checkout service.

With Datadog, you can quickly surface relevant logs by running a subquery that first retrieves the users who most frequently check out items and then lists which of those users have accounts and log in most frequently. In just a few steps, you’ve gained insights that can help guide customer retention strategies and ensure you continue to satisfy users.

top_list_subquery.png
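
The two-step logic described above can be sketched in Python as a top-N aggregation feeding a membership filter. Everything here is hypothetical: the `user_id` and `event` fields, the sample users, and the cutoff of three top buyers are assumptions chosen for illustration, and the code stands in for the idea rather than for Datadog's query engine.

```python
from collections import Counter

# Hypothetical checkout logs: one record per completed checkout.
checkout_logs = [
    {"user_id": user}
    for user in ["ana", "ana", "ana", "ben", "ben", "cy", "dee", "dee", "dee", "dee"]
]
# Hypothetical platform logs: "dee" checks out as a guest and has no account activity.
platform_logs = [
    {"event": "login", "user_id": "ana"},
    {"event": "login", "user_id": "ana"},
    {"event": "login", "user_id": "ben"},
    {"event": "login", "user_id": "cy"},
    {"event": "signup", "user_id": "ana"},
]

# Inner query: the three users with the most checkouts.
checkout_counts = Counter(log["user_id"] for log in checkout_logs)
top_buyers = {user for user, _ in checkout_counts.most_common(3)}

# Outer query: login events restricted to those users, ranked by login count.
login_counts = Counter(
    log["user_id"]
    for log in platform_logs
    if log["event"] == "login" and log["user_id"] in top_buyers
)
print(login_counts.most_common())  # [('ana', 2), ('ben', 1)]
```

Here `dee` is the heaviest buyer but never appears in the platform logs, so the result surfaces only frequent buyers who also hold accounts.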

Filter logs by subqueries today

By using the results of one log query to filter the results of another, you can quickly surface the critical data needed to investigate incidents, troubleshoot bugs, identify key customers, or understand the blast radius of an attack. To learn more about subqueries and Datadog Log Management, please see our documentation. And if you aren’t already using Datadog, sign up today for a 14-day free trial.