As your organization’s log volumes increase and your architecture scales in size and complexity, your security teams may find it difficult to collect, monitor, and analyze logs for security risks. These teams need full visibility into their log data without significantly increasing costs. As they struggle to navigate the tradeoffs between cost and retention, security teams often buy multiple solutions or build custom tooling to identify threats and vulnerabilities. These solutions can be noisy, creating false positives that lead to alert fatigue, or narrowly scoped, leading to missed indicators of compromise (IoCs).
In addition, security operations center (SOC) teams often work with siloed data, since they commonly use preferred vendors for different security monitoring use cases—such as endpoint protection, firewalls, cloud security, antivirus and malware scanners, and more. This introduces cross-platform learning friction and makes it difficult for CISOs to avoid tool sprawl, stay within budget, and still support their teams’ threat detection needs.
Amazon Security Lake is a data lake purpose-built for security teams, where you can centralize security data from AWS environments as well as other sources, including SaaS providers, on-premises applications, and other cloud vendors. Datadog Observability Pipelines now integrates with Amazon Security Lake, giving you more control over how you aggregate, transform, and route data from your various logging sources into Security Lake, and over how you analyze that data. This integration, now available in Preview, helps security teams manage and analyze their security logs in a centralized location and flexibly scale their log routing and storage volumes while avoiding tool sprawl.
In this post, we’ll show you how to use Amazon Security Lake and Datadog Observability Pipelines to:
- Collect, route, and store logs at scale
- Standardize log data for security analysis
- Enrich and secure your logs to speed up investigations
- Easily adopt or migrate to different SIEM vendors
Collect, route, and store data at scale
A typical large organization’s IT infrastructure is spread across on-prem environments, public clouds, and geographic regions. Your DevOps and security teams may install and maintain applications (such as firewalls, CDNs, and network devices) that generate multiple terabytes of log data each day. To add to the complexity, data ownership is typically spread across applications and accounts, making the data less discoverable and harder to instrument with the right level of context and supporting detail, such as environment metadata, authentication credentials, and user-related details.
With Observability Pipelines, you can either use preconfigured templates or set up custom pipelines to collect, aggregate, and transform your logs in a central location. You can design pipelines to send these logs to downstream services and applications, including Amazon Security Lake, which can cost-effectively store petabytes of logs for historical analysis.
You can get started by deploying Observability Pipelines Workers in your on-prem environments in just a few clicks, while still managing and deploying them from a cloud-based control plane. Because each Observability Pipelines Worker operates independently and can be fronted with a simple load balancer, you can scale your log ingestion and routing based on need. This makes it easier to manage the higher volumes of logs that you may need to route to Amazon Security Lake as your infrastructure grows.
Standardize log data for security analysis
Security teams often use different vendors to monitor specific types of log data. For example, they may use Palo Alto Networks Firewall for network security, SentinelOne for endpoint threat detection, AWS IAM Access Analyzer to manage access policies, and more. Each of these tools generates logs in its own proprietary format, making it essential for security teams to normalize their data for analysis. But standardizing data for downstream security and analytics is expensive and resource-intensive, and it requires task-specific expertise. In addition, vendors often change their logging formats or introduce new metadata tags, which forces you to re-parse the logs so that your critical detection rules and dashboards can incorporate the updated format.
To improve DevSecOps efficiency and ensure high-quality detections, Amazon Security Lake stores data in the industry-standard Open Cybersecurity Schema Framework (OCSF). Observability Pipelines includes an OCSF processor that you can use to automatically transform logs into OCSF before routing them to Amazon Security Lake or other SIEM vendors, such as Splunk or Google Chronicle, so the data is readily available for building your detection rules.
With Observability Pipelines, Security Lake users no longer need to transfer or pre-process data in multiple locations for each vendor before it’s consumed for reporting, visualization, and analysis.
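To make the idea of normalization concrete, here is a minimal sketch of what mapping a vendor-specific firewall record into an OCSF-style Network Activity event might look like. It is an illustration only, not the logic of the Observability Pipelines OCSF processor: the input field names (`ts`, `src`, `sport`, `dst`, `dport`, `rule`), the vendor and product names, and the schema version are all hypothetical, and the numeric identifiers reflect our reading of the OCSF schema and should be checked against the published specification.

```python
from datetime import datetime, timezone

# OCSF identifiers for the Network Activity class, as we understand them.
# Verify these against the published OCSF schema before relying on them.
NETWORK_ACTIVITY_CLASS_UID = 4001
NETWORK_CATEGORY_UID = 4
ACTIVITY_TRAFFIC = 6

# Keys of the hypothetical vendor record that the mapping consumes.
MAPPED_KEYS = {"ts", "src", "sport", "dst", "dport"}


def to_epoch_ms(iso_timestamp):
    """Convert an ISO 8601 timestamp to epoch milliseconds."""
    parsed = datetime.fromisoformat(iso_timestamp)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() * 1000)


def firewall_to_ocsf(raw):
    """Map a hypothetical firewall record to an OCSF-style event."""
    return {
        "class_uid": NETWORK_ACTIVITY_CLASS_UID,
        "category_uid": NETWORK_CATEGORY_UID,
        "activity_id": ACTIVITY_TRAFFIC,
        # OCSF derives type_uid from the class and activity identifiers.
        "type_uid": NETWORK_ACTIVITY_CLASS_UID * 100 + ACTIVITY_TRAFFIC,
        "time": to_epoch_ms(raw["ts"]),
        "severity_id": 1,  # Informational
        "src_endpoint": {"ip": raw["src"], "port": int(raw["sport"])},
        "dst_endpoint": {"ip": raw["dst"], "port": int(raw["dport"])},
        "metadata": {
            "version": "1.1.0",  # assumed schema version
            "product": {"name": "example-firewall", "vendor_name": "ExampleCo"},
        },
        # Keep anything the mapping does not understand instead of dropping it.
        "unmapped": {k: v for k, v in raw.items() if k not in MAPPED_KEYS},
    }


sample = {
    "ts": "2024-05-01T12:00:00+00:00",
    "src": "10.0.0.5",
    "sport": "51514",
    "dst": "192.0.2.10",
    "dport": "443",
    "rule": "allow-web",
}
event = firewall_to_ocsf(sample)
print(event["type_uid"], event["dst_endpoint"])
```

Carrying unrecognized fields under `unmapped` means that a vendor adding a new metadata tag does not silently lose data; the mapping can be extended later without re-collecting logs.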
Enrich and secure your logs
Often, log data is unstructured and contains sensitive information such as API keys, personally identifiable information (PII), and payment card data. To stay compliant, security teams may need to redact this sensitive data out of their logs before routing them to third-party SaaS solutions, such as SIEMs or data lakes, for long-term retention.
Observability Pipelines operates in hybrid environments that include both cloud-based and on-prem resources, and through its integration with Datadog Sensitive Data Scanner, it can obfuscate data at the edge (i.e., in your on-prem infrastructure) before it’s sent to the cloud.
This enables your security teams to automatically parse their log data into key-value pairs and redact sensitive data on the stream consistently, independent of vendor technology limitations, before streaming it to Amazon Security Lake for further analysis. In addition, you can enrich the data with contextual information and GeoIP location. Enrichment tables enable you to maintain a list of suspicious or malicious IPs, low-reputation domains, and known bad actors, and you can tag log events originating from these IPs as indicators of attack (IoA).
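The parse, redact, and enrich flow described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how Sensitive Data Scanner or enrichment tables are implemented: the regular expressions are deliberately naive, the field names (`src_ip`, `user`, `msg`) and the `threat` tag are hypothetical, and the enrichment table is a plain in-memory dictionary keyed by IP address.

```python
import re
import shlex

# Deliberately simple patterns for illustration; production scanners use
# far more robust rules (for example, checksum validation for card numbers).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")

# Hypothetical enrichment table: known-bad IPs mapped to a reason.
THREAT_TABLE = {"203.0.113.7": "listed on internal blocklist"}


def parse_key_values(line):
    """Parse 'key=value key2="quoted value"' text into a dict."""
    pairs = (token.split("=", 1) for token in shlex.split(line))
    return {key: value for key, value in pairs}


def redact(value):
    """Replace email addresses and card-like digit runs with placeholders."""
    value = EMAIL.sub("[REDACTED_EMAIL]", value)
    return CARD.sub("[REDACTED_CARD]", value)


def process(line, threat_table):
    """Parse one log line, redact sensitive values, then tag known-bad IPs."""
    event = {key: redact(value) for key, value in parse_key_values(line).items()}
    reason = threat_table.get(event.get("src_ip", ""))
    if reason is not None:
        event["threat"] = {"indicator": "ioa", "reason": reason}
    return event


line = (
    'src_ip=203.0.113.7 user=alice@example.com '
    'msg="paid with 4111 1111 1111 1111 today"'
)
print(process(line, THREAT_TABLE))
```

Redacting before the enrichment lookup and before any network hop mirrors the point made above: sensitive values are removed while the data is still inside your own infrastructure.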
Adopt or migrate to tools easily
Observability Pipelines integrates with leading logging, SIEM, and cloud vendors. This helps reduce vendor lock-in, enabling your teams to choose solutions best suited for their specific use cases while retaining complete visibility into their logs. Similarly, Amazon Security Lake allows downstream applications to easily subscribe to the data that is stored in it. By using Observability Pipelines, you can route your logs to a variety of downstream services, including Security Lake and other common log destinations.
These flexible routing options give DevSecOps teams the freedom to select the best tools for their tasks—for example, they might choose a specialized solution for network or endpoint security—while still having a single pane of glass for log analysis and monitoring.
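As a rough mental model of this kind of fan-out routing, each destination can be thought of as a name paired with a filter, and every event is delivered to all destinations whose filter accepts it. The sketch below is purely illustrative: the destination names, the `severity_id` threshold, and the `source` field are hypothetical and do not correspond to actual Observability Pipelines configuration.

```python
# Each route pairs a hypothetical destination name with a predicate that
# decides whether a given event should be sent there.
ROUTES = [
    ("security_lake", lambda event: True),  # keep everything for history
    ("siem", lambda event: event.get("severity_id", 0) >= 4),
    ("network_tool", lambda event: event.get("source") == "firewall"),
]


def destinations_for(event, routes):
    """Return the names of all destinations whose predicate accepts the event."""
    return [name for name, accepts in routes if accepts(event)]


low_severity = {"source": "firewall", "severity_id": 1}
high_severity = {"source": "endpoint", "severity_id": 5}

print(destinations_for(low_severity, ROUTES))
print(destinations_for(high_severity, ROUTES))
```

Because the routes are data, adopting or retiring a vendor amounts to adding or removing one entry; the sources and processing steps upstream stay unchanged.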
Start using Datadog to send logs to Amazon Security Lake
To get started, you can set up the Amazon Security Lake integration, now available in Preview, in Observability Pipelines in three easy steps:
- Once you’ve set up your Security Lake instance in AWS, you can set Observability Pipelines as a custom source from the Security Lake UI or using the AWS APIs.
- Next, design your observability pipeline in Datadog by selecting your logs source, such as Syslog, an Amazon S3 bucket, AWS Data Firehose, or others.
- Select Amazon Security Lake as the destination for your logs.
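If you prefer to script the first step instead of using the Security Lake UI, one option is the AWS SDK. The sketch below only builds the request for the `create_custom_log_source` operation of the boto3 `securitylake` client and does not call AWS. The parameter names reflect our understanding of that API and should be confirmed against the boto3 documentation; every value shown (source name, event classes, role ARN, account ID, external ID) is a placeholder, not a value Datadog requires.

```python
def build_custom_source_request(
    source_name, event_classes, crawler_role_arn, provider_account_id, external_id
):
    """Assemble keyword arguments for registering a custom log source.

    The key names are assumed to match boto3's
    securitylake.create_custom_log_source; verify them before use.
    """
    return {
        "sourceName": source_name,
        "eventClasses": list(event_classes),
        "configuration": {
            # IAM role used by the crawler that catalogs the source's data.
            "crawlerConfiguration": {"roleArn": crawler_role_arn},
            # Identity that is allowed to write logs for this source.
            "providerIdentity": {
                "principal": provider_account_id,
                "externalId": external_id,
            },
        },
    }


# All values below are placeholders for illustration.
request = build_custom_source_request(
    source_name="example-pipeline-source",
    event_classes=["NETWORK_ACTIVITY", "AUTHENTICATION"],
    crawler_role_arn="arn:aws:iam::111122223333:role/ExampleCrawlerRole",
    provider_account_id="111122223333",
    external_id="example-external-id",
)

# With AWS credentials configured, the request could then be sent with:
#   import boto3
#   boto3.client("securitylake").create_custom_log_source(**request)
print(request["sourceName"], request["eventClasses"])
```

Keeping request construction separate from the SDK call makes the registration easy to review and to reuse across accounts or regions.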
When you select Amazon Security Lake as a destination, Observability Pipelines automatically adds the OCSF processor to your pipeline. Datadog will identify the log type and convert it into the Amazon Security Lake-compatible OCSF schema.
Within seconds, you will start seeing logs streaming to Amazon Security Lake in the desired OCSF format, along with Parquet conversion. Learn more about how you can use OCSF with Datadog in our dedicated blog post.
Datadog Observability Pipelines enables you to use the logging platform and security solutions of your choice, including Amazon Security Lake, so you can support enhanced analytics, improve your threat detections, and avoid vendor lock-in. You can get started with Observability Pipelines by reading our documentation. If you’re not a customer, you can get started today with a 14-day free trial.