How Grainger Optimized RUM Costs by Removing Unwanted Bot Traffic | Datadog

How Grainger optimized RUM costs by removing unwanted bot traffic

Author Anand Mattah Chenna Kesavalu
Performance Engineer, Grainger

Published: October 30, 2024

On average, nearly half of all traffic to e-commerce websites comes from bots, which significantly impacts performance metrics and inflates monitoring costs. While some bots serve useful purposes, such as search engine indexing or price comparison, many others skew analytics and inflate RUM session counts. These automated visitors can distort key performance indicators, making it challenging to get an accurate read on customer behavior. Furthermore, the RUM sessions consumed by bots can lead to unnecessary expenses. Understanding and mitigating the influence of bots is critical for maintaining accurate insights and controlling costs in today’s digital landscape, and grainger.com is no exception.

In this post, we will cover the single-user actions to look out for when assessing bot traffic and go over the steps we took to remove bots from our RUM sessions.

Understanding bot vs. real user traffic patterns

Many single-user actions come from bots that are programmed to scrape web pages for content or data. These automated scripts mimic real user behavior by visiting the site and loading pages, but they don’t engage in meaningful interactions. Instead, they quickly extract information and leave, adding little to no value to user experience analysis. Even so, these bot activities inflate traffic metrics and consume valuable RUM sessions, leading to skewed data and increased monitoring costs.

Single-page user traffic activity.
630,000 out of 1.1 million RUM sessions are only one-page hits, signaling heavy bot traffic.

Single-user actions that attract unwanted traffic include:

  • Social media clicks, such as users who click on a link from social media platforms like Facebook, Twitter, or LinkedIn and visit your site briefly.
  • Ad clicks, or users who arrive through paid advertisements, such as Google Ads or display ads, but leave after viewing just one page.
  • Email campaign clicks, such as recipients of marketing emails who click through to the site but don’t engage further.
  • Search engine results, such as visitors who find your site through a search engine and click on the result, but quickly leave without exploring further.
  • Referral traffic, or users who come from external links on other websites but only view a single page.
  • Event promotion clicks, such as attendees of online events or webinars who click through a promotional link but do not continue exploring.
  • Coupon or offer clicks, such as users attracted by a specific coupon or offer who visit the site to claim it but leave immediately afterward.
  • Bot traffic, or automated visitors that are not true users but may appear as one-time visitors, quickly leaving the site after a single page view.
  • News or article clicks, such as readers who visit your site to view a specific article or news piece but don’t engage further.

Implementing a targeted strategy to filter unwanted traffic from RUM sessions

There are a few strategies you can use to avoid collecting RUM sessions and session replays for these types of traffic. The first is to simply avoid injecting the RUM SDK for overt bot traffic. Next, you can choose to ignore all one-page sessions, or to keep only a sample of them, for example by allowing only 20 percent of single-page sessions to be sent to Datadog. To do this, inject the RUM SDK at the start of the session with the configuration option trackingConsent: 'not-granted'. This way the SDK is loaded but does not collect the session. Then, once you confirm that the visitor is not a bot or that the session spans multiple pages, call the setTrackingConsent('granted') method in the RUM SDK to begin recording the session.

window.DD_RUM.onReady(function() {
    window.DD_RUM.init({
        ...,
        trackingConsent: 'not-granted'
    });
});

// Example of starting the RUM SDK to capture the session
acceptCookieBannerButton.addEventListener('click', function() {
    window.DD_RUM.onReady(function() {
        window.DD_RUM.setTrackingConsent('granted');
    });
});

Single-page user traffic over the last 30 days.
All of the roughly 700,000 one-page visits occurred over the last 30 days.
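
The sampling idea described above can be sketched as a small decision function. This is a minimal sketch under stated assumptions, not our production logic: the helper name shouldGrantConsent, its parameters, and the isBot and pagesViewed inputs are hypothetical. Only trackingConsent and setTrackingConsent come from the RUM SDK as used in the snippets above.

```javascript
// Hypothetical helper: decide whether a session that was initialized with
// trackingConsent: 'not-granted' should start being collected.
// randomValue is passed in (0 <= randomValue < 1) so the sample is testable.
function shouldGrantConsent(session, singlePageSampleRate, randomValue) {
  if (session.isBot) {
    return false; // never collect overt bot traffic
  }
  if (session.pagesViewed > 1) {
    return true; // multi-page sessions are always collected
  }
  // Single-page session: keep only a sample (for example, 20 percent).
  return randomValue * 100 < singlePageSampleRate;
}

// Browser-only wiring; skipped when the RUM SDK is not present.
if (typeof window !== 'undefined' && window.DD_RUM) {
  // Assumed inputs: how isBot and pagesViewed are obtained is site-specific.
  var session = { isBot: false, pagesViewed: 1 };
  if (shouldGrantConsent(session, 20, Math.random())) {
    window.DD_RUM.onReady(function() {
      window.DD_RUM.setTrackingConsent('granted');
    });
  }
}
```

Because the random draw here happens on each page load, a real implementation would likely need to persist the decision for the duration of the session so that it is not re-rolled on every page.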

For our approach, we implemented a bot manager from our Content Delivery Network (CDN) provider that sets a signal identifying each visitor as Human or Bot. We also used a browser local storage value that tracks the page view count for the session, requiring com.tagmanager.reactor.core.visitorTracking.pagesViewed > 1, along with a check that the referrer domain is grainger.com, which cut down on traffic pollution from single-page hits or other subdomains. To apply these methods, we used a tag manager that injects the Datadog RUM SDK only when all of the following checks pass: bot signal, referrer domain, and page view count.
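
The three checks can also be expressed as a standalone predicate, which makes them easier to reason about and test outside the tag manager. The following is a minimal sketch, not the rule we deployed: the function name shouldInjectRum and its parameters are hypothetical, and parsing the referrer hostname (instead of a substring match) is an assumption made here so that lookalike domains are rejected.

```javascript
// Hypothetical predicate mirroring the three checks: bot signal,
// session page view count, and referrer domain.
function shouldInjectRum(botSignal, pagesViewed, referrer, siteDomain) {
  if (botSignal === 'Y') {
    return false; // the CDN bot manager flagged this visitor as a bot
  }
  if (!(Number(pagesViewed) > 1)) {
    return false; // first (or only) page view of the session
  }
  var host;
  try {
    host = new URL(referrer).hostname; // throws on an empty or invalid referrer
  } catch (e) {
    return false;
  }
  // Assumption: accept the site domain and its subdomains, nothing else.
  return host === siteDomain || host.endsWith('.' + siteDomain);
}

// Browser-only usage example; the bot signal source is site-specific,
// so a placeholder value is used here.
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
  var count = window.localStorage.getItem(
    'com.tagmanager.reactor.core.visitorTracking.pagesViewed'
  );
  var inject = shouldInjectRum('N', count, document.referrer, 'example.com');
  console.log('Inject RUM SDK:', inject);
}
```

Comparing hostnames avoids a pitfall of substring checks, which would also match a referrer such as https://example.com.evil.net/ or one that merely carries the domain in its query string.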

Method for not injecting RUM SDK based on bot patterns.
conditions: [{
        modulePath: "somepath/datadog_sdk_inject.js",
        settings: {
            source: function() {
                if ("Y" != _satellite.getVar("CDN: Bot Signal") &&
                    _satellite.getVar("User Identification: Visitor Behavior: Session Page View Count") > 1 &&
                    document.referrer.indexOf("example.com") > -1)
                    return !0
            }
        }
    }],
    actions: [{
            modulePath: "somepath/datadog_sdk_inject.js",
            settings: {
                global: !1,
                source: "// CDN async script 

                (function(h, o, u, n, d) {
                    h = h[d] = h[d] || {
                        q: [],
                        onReady: function(c) {
                            h.q.push(c)
                        }
                    }
                    d = o.createElement(u);
                    d.async = 1;
                    d.src = n
                    n = o.getElementsByTagName(u)[0];
                    n.parentNode.insertBefore(d, n)
                })(window, document, 'script', 'https://www.datadoghq-browser-agent.com/us1/v5/datadog-rum.js', 'DD_RUM');
                window.DD_RUM.onReady(function() {
                        window.DD_RUM.init({
                                clientToken: 'sometoken',
                                applicationId: 'someid',
                                site: 'datadoghq.com',
                                service: 'example.com',
                                env: 'env',
                                version: 'some_version',
                                sessionSampleRate: 100,
                                sessionReplaySampleRate: 100,
                                trackUserInteractions: true,
                                trackResources: true,
                                trackLongTasks: true,
                                trackFrustrations: true,
                                defaultPrivacyLevel: 'mask',
                                compressIntakeRequests: true,
                                allowedTracingUrls: [\"https://example.com\"],
                                enableExperimentalFeatures: ['clickmap', 'feature_flags'],
                        });
                });
                window.DD_RUM.onReady(function() {
                        // Function to retrieve cookies 
                        function retrieveCookies() {
                            var cookies = document.cookie;
                            // console.log(\"cookie:\" + cookies) 
                            // Split the cookies string into individual cookies 
                            var cookieArray = cookies.split('; ');
                            var cookieObject = {};
                            // Convert the array of cookie strings into an object 
                            cookieArray.forEach(function(cookie) {
                                var parts = cookie.split('=');
                                cookieObject[parts[0]] = parts[1];
                            });
                            // Set the cookie object as a context property 
                            window.DD_RUM.setGlobalContextProperty('cookies', cookieObject);
                        }
                        // Call the function to retrieve cookies asynchronously after a short delay to ensure that the document is fully loaded 
                        setTimeout(retrieveCookies, 1000); // Adjust the delay as needed 
                    });
                    ", 
                    language: "javascript"
                }
            },
Eliminating potential bot traffic from recorded sessions.

Overall, we found that RUM session counts were over 1.3 million per day, and more than 50 percent of them were bots and single-page visits. By removing single-page visitors and bots, we reduced the total RUM session count to 500,000 per day. By identifying and excluding one-time visitors, bots, and other low-value traffic sources, we significantly reduced unnecessary data collection and costs. This approach allowed us to focus on meaningful user interactions, enhancing the accuracy of our insights while optimizing resource utilization.
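
For readers who want to run the same comparison on their own numbers, the reduction in session volume can be computed as below. This is only an arithmetic sketch using the rounded daily figures quoted above; the helper name percentReduction is hypothetical, and a drop in session volume does not necessarily translate one-to-one into billed cost.

```javascript
// Percent reduction between two daily session counts.
function percentReduction(before, after) {
  return ((before - after) / before) * 100;
}

// Rounded figures from the text: about 1.3 million sessions per day
// before filtering and 500,000 per day after.
var reduction = percentReduction(1300000, 500000);
console.log(reduction.toFixed(1) + ' percent fewer sessions per day');
```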

Cut down on your RUM costs with Datadog

By targeting our efforts, we achieved a remarkable 50 percent cost savings on RUM expenses, optimizing our budget while also improving the accuracy of our user behavior analysis. Our strategic approach has been both financially and operationally beneficial, cutting down on distorted key performance indicators, unnecessary expenses, and skewed analytics. This allows us to focus our resources on what truly matters—enhancing the experiences of real users.

To learn more about Datadog RUM and Session Replay, visit the documentation.