Customer-facing applications request and process many types of sensitive data, such as API keys, credit card numbers, and email addresses. As your application scales in size and complexity, it becomes harder to keep track of this sensitive data moving across more services, increasing the risk of data leaks. Without proactive measures, you may unintentionally expose this sensitive data, violating the privacy of your customers, your organizational policies, compliance requirements, or industry regulations.
Datadog’s Sensitive Data Scanner continuously scans data at the time of ingestion in order to detect and then scrub or hash sensitive information based on out-of-the-box or custom rules. Now, in addition to logs, Sensitive Data Scanner is available for APM, RUM, and Events, expanding the scope of data you can monitor for leaks.
In this post, we’ll show you how to:
- Configure Sensitive Data Scanner
- Remove sensitive data from RUM sessions
- Obfuscate sensitive data from distributed traces
Configure the Sensitive Data Scanner
When configuring Sensitive Data Scanner, you can create scanning groups that determine the scope of data to monitor. You build a scanning group by specifying a query filter, and for each group you can select the Datadog products (e.g., APM, RUM, Logs, and Events) in which to enable the scanner. That way, you can create scanners specifically for backend services that you know you collect traces from or for applications that generate RUM sessions. You can also enable a single group to scan all your data, which centralizes governance across products without the need to re-deploy, re-build, or replicate policies.
Within a scanning group, you can set rules that define what constitutes sensitive data and thus what needs to be flagged and scrubbed. Datadog provides out-of-the-box rules for detecting common instances of sensitive data (e.g., Credit Cards, Emails, AWS Access Key ID Scanner, or Google API Key Scanner), which you can easily apply to a scanning group.
Or, you can configure custom rules that fit the needs of your application. For example, if you work for a healthcare provider, you may want to create a custom rule that scans for patient IDs to ensure your customers’ privacy if they enter sensitive data into a form on your site.
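Because a custom rule is typically driven by a regular expression, it can help to check a candidate pattern against sample strings before adding it to a scanning group. The sketch below does that locally in Python. The patient ID format (`PT-` followed by eight digits) and the helper name `find_patient_ids` are assumptions made for illustration, not a format or API defined by Datadog.

```python
import re

# Hypothetical patient ID format, assumed for illustration only:
# the prefix "PT-" followed by exactly eight digits.
PATIENT_ID_PATTERN = re.compile(r"\bPT-\d{8}\b")


def find_patient_ids(text: str) -> list[str]:
    """Return every substring of `text` that matches the assumed ID format."""
    return PATIENT_ID_PATTERN.findall(text)


samples = [
    "form submitted for patient PT-00412987",  # should match
    "order 12345678 shipped",                  # no prefix, should not match
    "reference PT-004129871",                  # nine digits, should not match
]
for sample in samples:
    print(sample, "->", find_patient_ids(sample))
```

Testing near misses, such as a bare number or an ID with one digit too many, is a quick way to see whether a pattern is too broad before it starts flagging production data.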
If Datadog detects data that matches an out-of-the-box rule within a log event, RUM event, or trace span, it will automatically tag it with the name of the rule (e.g., sensitive_data:american_express_credit_card) to make it easily searchable in Datadog. You can also add custom tags to rules to facilitate fast and easy searches. For example, if you add a high severity tag to a rule that looks for Social Security Numbers, you can easily use the query filter sensitive_data_severity:high to surface all relevant sensitive data in your logs, RUM events, or APM spans. Tagging will also make it easier to monitor which services are leaking sensitive data, which is useful information if your team performs regular compliance audits. For example, to check if any of your frontend applications are leaking sensitive data, you could filter your RUM events to show only those with the sensitive_data:* tag and then see which applications show up.
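As a conceptual model of this tagging behavior (not Datadog's implementation), the following Python sketch attaches a `sensitive_data:<rule name>` tag for each matching rule and then lists the applications that have at least one tagged event, which mirrors filtering on `sensitive_data:*`. The rule patterns are simplified, and the `email_address` rule name and the `application` and `message` event fields are hypothetical.

```python
import re

# Hypothetical rule set for illustration: rule name -> simplified pattern.
RULES = {
    "email_address": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "american_express_credit_card": re.compile(r"\b3[47]\d{13}\b"),
}


def tags_for(message: str) -> list[str]:
    """Return one `sensitive_data:<rule>` tag per rule that matches."""
    return [
        f"sensitive_data:{name}"
        for name, pattern in RULES.items()
        if pattern.search(message)
    ]


def leaking_applications(events: list[dict]) -> list[str]:
    """Return the sorted names of applications with at least one tagged event."""
    return sorted({e["application"] for e in events if tags_for(e["message"])})


events = [
    {"application": "storefront", "message": "checkout failed for card 378282246310005"},
    {"application": "storefront", "message": "page loaded in 120 ms"},
    {"application": "help-center", "message": "ticket opened by jane@example.com"},
    {"application": "blog", "message": "article viewed"},
]
print(leaking_applications(events))  # ['help-center', 'storefront']
```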
Remove sensitive data from RUM sessions
If you run an e-commerce site, Datadog RUM can provide invaluable insight into user experience so you can troubleshoot frontend issues. But it’s also possible for RUM sessions to capture sensitive data, such as console logs that contain user input. Now, you can leverage Sensitive Data Scanner to create scanning rules that will hash or scrub your data on ingestion. Sensitive data such as credit card numbers will then be redacted from your console logs by the time they reach your RUM sessions.
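To make the difference between scrubbing and hashing concrete, here is a minimal local sketch in Python. It is an illustration under stated assumptions, not how Sensitive Data Scanner is implemented: the card pattern is simplified, a Luhn checksum is used to cut down on false positives, and the function names and the `[REDACTED]` placeholder are hypothetical.

```python
import hashlib
import re

# Simplified pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(candidate: str) -> bool:
    """Check the Luhn checksum over the digits in `candidate`."""
    digits = [int(c) for c in candidate if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scrub_card_numbers(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace each Luhn-valid card number with a fixed placeholder."""
    return CARD_PATTERN.sub(
        lambda m: placeholder if luhn_valid(m.group()) else m.group(), text
    )


def hash_card_numbers(text: str) -> str:
    """Replace each Luhn-valid card number with a truncated SHA-256 digest.

    Unsalted SHA-256 is used only to keep the sketch short.
    """
    def _hash(m: re.Match) -> str:
        if not luhn_valid(m.group()):
            return m.group()
        digits = re.sub(r"\D", "", m.group())
        return hashlib.sha256(digits.encode()).hexdigest()[:16]

    return CARD_PATTERN.sub(_hash, text)


print(scrub_card_numbers("console.log: card 4111 1111 1111 1111 declined"))
# console.log: card [REDACTED] declined
```

Scrubbing discards the value entirely, while hashing replaces it with a stable token, so two events that contained the same card number can still be correlated without exposing the number itself.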
Once you’ve addressed the immediate risk of sensitive data leaking to your RUM sessions, you can see what user action triggered this leak. With this knowledge in hand, you can investigate the part of your code responsible for this action and remedy the issue at the source.
Obfuscate sensitive data from APM
When a customer makes a purchase from your e-commerce site, the request propagates across multiple services in your backend. These requests may include data such as credit card numbers that your application passes along to a payment processing service. If you use Datadog APM to trace these customer requests from the frontend through to the backend, these traces may include that sensitive data. This means that customer data is being exposed to your SREs if they need to check those traces to troubleshoot an issue.
To protect your customers’ privacy, you can obfuscate this sensitive data by configuring a Sensitive Data Scanner rule to monitor distributed traces for credit card numbers and redact any identified values.
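The effect of such a rule on a trace can be sketched as a recursive walk over a span's attributes that redacts anything resembling a card number. This is a simplified local model in Python: the span layout, the field names, and the `redact` helper are hypothetical, and the pattern is far looser than a production rule would be.

```python
import re
from typing import Any

# Simplified pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def redact(value: Any) -> Any:
    """Return a copy of `value` with card-like numbers in strings redacted.

    Dicts and lists are walked recursively; other types are returned as-is.
    """
    if isinstance(value, str):
        return CARD_PATTERN.sub("[REDACTED]", value)
    if isinstance(value, dict):
        return {key: redact(item) for key, item in value.items()}
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


# Hypothetical span shape, assumed for illustration only.
span = {
    "service": "payment-processor",
    "resource": "POST /charge",
    "meta": {"http.request.body": '{"card": "4111111111111111", "amount": 42}'},
}

clean = redact(span)
print(clean["meta"]["http.request.body"])
# {"card": "[REDACTED]", "amount": 42}
```

Because `redact` builds new containers instead of mutating the input, the original span is left untouched, which makes the before and after easy to compare when testing a rule.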
Extend your compliance strategy to cover APM and RUM
For every request to your application, there is the potential for logs, traces, and RUM sessions to include sensitive data. Now, Sensitive Data Scanner enables you to identify leaks across all your services in a single pane of glass. This greatly eases the task of upholding the privacy of your customers and adhering to compliance regulations, as building out your own scrubbing solution at the service level would be tedious and time-consuming. Because new code can introduce additional leakage points, the process of rooting out leaks is an ongoing one. Sensitive Data Scanner makes it easy to automatically surface potential leaks before they affect your business.
The Sensitive Data Scanner for RUM and APM complements Datadog’s host of governance features such as Sensitive Data Scanner for logs, Audit Trail, and RBAC controls.
If you’re new to Datadog, sign up for a 14-day free trial, and leverage the Sensitive Data Scanner for APM and RUM to expedite your investigations.