The Need: Monitoring To Scale Effortlessly Alongside a Growing Enterprise
After successfully running the digital component of Obama for America’s 2012 Presidential campaign, Blue State Digital (BSD) shifted their focus to further expanding their service offerings and the diversity of their client base. “We are the web, email, and fundraising platform that powers presidential campaigns, disaster response, and global brands,” said Joel Barciauskas, Director of Platform at BSD. “We provide input-forms, signup forms, and contribution forms on the front end, and the tools for building those forms on the back end.”
Over the course of several years leading up to the Obama campaign, BSD had built a complex stack with multiple tiers of web services, databases, and load balancers that relied on varied systems including Linux, PHP, MySQL, RabbitMQ, and more. Additionally, BSD was in the process of migrating sections of their infrastructure to Amazon Web Services (AWS) to further support rapid infrastructure growth.
According to Barciauskas, these rapid growth plans included rolling out a “next generation” action tracking service that required “higher volume [of requests] than what we traditionally tracked.” For example, BSD had begun gathering information on how users found and interacted with pages, resulting in significantly more data points per mail click. Barciauskas and his team were using the open source tools StatsD and Graphite to monitor BSD’s infrastructure. However, as BSD moved to a more dynamic cloud environment that included automated server provisioning, manually updating server counts, instrumentation, and alerts was beginning to take up a lot of time and overhead.
Barciauskas realized that BSD needed to shift to a monitoring tool that would easily integrate with their existing technical setup and scale effortlessly alongside their infrastructure. This would allow BSD to move its attention away from building monitoring tools and back to creating and enhancing applications.
“Datadog was attractive because all of the integration tools that allowed for ‘out of the box’ integration, even with the more esoteric metrics from HAProxy and ElasticSearch.”
Joel Barciauskas
Director of Platform, Blue State Digital
Automated Setup for Server Monitoring with Datadog
After evaluating other monitoring-as-a-service providers, Barciauskas concluded that Datadog was the right fit for BSD. A major reason was that Datadog offered a Chef cookbook and handler that let BSD deploy Datadog to their servers in an automated fashion, unlike other solutions whose Chef cookbooks did little more than run a curl command. While most monitoring systems met Barciauskas’s requirements for “push driven metric collection into an external service for which we didn’t have to manage any servers,” only Datadog required no custom work during setup.
Infrastructure Monitoring that Integrates Easily
Datadog provided Barciauskas with the push-driven metric collection tools that BSD needed, in a solution that integrated effortlessly with other systems. “Datadog was attractive because all of the integration tools that allowed for ‘out of the box’ integration, even with the more esoteric metrics from HAProxy and ElasticSearch,” stated Barciauskas. Additionally, Datadog’s integration with StatsD allowed Barciauskas to continue gathering metrics from a system that his team understood well. “Being able to use an interface that we were familiar with, instead of a custom API, made us realize that we weren’t going to be locked down into gathering metrics through Datadog,” said Barciauskas.
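To make the appeal of a familiar interface concrete, the following is a minimal sketch of what emitting a metric over the StatsD wire format can look like. It is an illustration, not BSD’s code: the metric names, tag values, and the helper functions `format_metric` and `send_metric` are hypothetical, and the local agent address (UDP port 8125, the conventional StatsD default) is an assumption. The trailing `#tag` segment is the DogStatsD extension to the format; a plain StatsD server may not accept it.

```python
import socket


def format_metric(name, value, metric_type, tags=None, sample_rate=None):
    """Build one datagram: name:value|type[|@rate][|#tag1,tag2].

    metric_type is a StatsD type code such as "c" (counter),
    "g" (gauge), or "ms" (timer).
    """
    parts = [f"{name}:{value}", metric_type]
    if sample_rate is not None:
        parts.append(f"@{sample_rate}")
    if tags:
        # The "#..." suffix is the DogStatsD tag extension,
        # not part of the original StatsD format.
        parts.append("#" + ",".join(tags))
    return "|".join(parts)


def send_metric(datagram, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send to a local agent (address is an assumption)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric: one signup form submission.
    print(format_metric("signup_form.submitted", 1, "c",
                        tags=["env:prod", "form:signup"]))
```

Because the payload is a short text datagram rather than a vendor-specific API call, the same instrumentation can in principle be pointed at a different StatsD-compatible backend, which is the portability Barciauskas describes.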
Monitoring that Scales alongside Rapidly Growing Infrastructure
BSD regularly had to increase the number of servers they managed in order to keep up with the growing volume of data that their clients were sending. Before Datadog, Barciauskas had to manually write custom checks for Nagios, which took up valuable time. With Datadog, however, “the flexibility of Datadog’s alert building tool allows someone who is not familiar with Nagios to write more robust, manageable, understandable alerts very quickly,” said Barciauskas. Datadog’s ability to bring additional servers online with little additional setup was equally valuable to BSD. The time savings are concrete: an individual alert that would have taken BSD an hour of coding can now be completed in five minutes.
Making Infrastructure Data Easy to Query and Manipulate
Every week, BSD’s operations team conducts an on-call review where they go through all the events that occurred the previous week. “We use Datadog’s Timeline View because it is much easier to navigate. It also links to the underlying PagerDuty incident so that we can just go directly in there when we need more detail,” said Barciauskas. Tagging was another feature that Barciauskas and his team became fond of, since it allowed them to easily manipulate data by grouping and aggregating metrics along any dimension they wanted. Additionally, tags were not tied to specific instances, allowing BSD’s alerts to come online as soon as a server was launched without any additional configuration. This saved time for Barciauskas’s team, which could then be spent on BSD’s other technical needs.
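The idea of grouping and aggregating by tag can be sketched in a few lines. This is a conceptual illustration only, not Datadog’s implementation or query language: the function `sum_by_tag`, the tag keys, and the sample values are all hypothetical.

```python
from collections import defaultdict


def sum_by_tag(points, tag_key):
    """Sum metric values, grouped by the value of one tag key.

    points is an iterable of (value, tags) pairs, where tags is a dict
    such as {"role": "web", "region": "us-east"}.
    """
    totals = defaultdict(float)
    for value, tags in points:
        # Points missing the tag are collected under "untagged".
        totals[tags.get(tag_key, "untagged")] += value
    return dict(totals)


# Hypothetical request counts reported by three servers.
POINTS = [
    (120.0, {"role": "web", "region": "us-east", "host": "web-01"}),
    (80.0, {"role": "web", "region": "us-west", "host": "web-02"}),
    (30.0, {"role": "db", "region": "us-east", "host": "db-01"}),
]

if __name__ == "__main__":
    print(sum_by_tag(POINTS, "role"))    # same data, sliced by role
    print(sum_by_tag(POINTS, "region"))  # same data, sliced by region
```

Since the grouping keys on tags such as `role` rather than on a host name, a newly launched server carrying the same tags falls into the existing group automatically. That is the property that lets alerts apply to new servers without extra configuration.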
Moving Towards a Social “DevOps” Collaboration Mode
Barciauskas is continually experimenting with new ways to enhance his team’s productivity. One feature set that he believes could transform his team’s working dynamics is the social messaging aspect of Datadog. Conversations in Datadog are linked to the events that occur, so responses and past incidents are easily recorded. According to Barciauskas, “the social potential is huge.” His team can easily track what happened during past incidents to solve problems quickly, allowing them to focus on BSD’s growth.
“The flexibility of Datadog’s alert building tool allows someone who is not familiar with Nagios to write more robust, manageable, understandable alerts very quickly.”
Joel Barciauskas
Director of Platform, Blue State Digital