Seeking improved visibility and control across a complex infrastructure
AGNC specializes in investing in mortgage-backed securities and other mortgage-related investments. Though publicly traded with multiple business units, including a self-clearing broker-dealer, it employs only about 50 people. “We’re a very nimble team, but it’s a sophisticated organization with complex systems architecture,” says Christopher Erhorn, Senior Vice President and Chief Technology Officer.
AGNC runs an extensive internal codebase that executes hundreds of jobs daily to facilitate operations. Given the nature of its business, the company must meet strict compliance and security requirements. It cannot afford extended downtime, so minimizing mean time to detect (MTTD) and mean time to repair (MTTR) is crucial.
Initially, AGNC used multiple disparate tools for monitoring and security, which made it hard to diagnose problems quickly. Erhorn and his team sought a more comprehensive solution that would provide better visibility and control across the entire IT infrastructure while reducing costs. They aimed to consolidate tools, improve remediation times, and scale with growth while maintaining a lean IT team.
Adopting a new comprehensive observability and security platform
Erhorn had heard about Datadog but became more familiar with the product after visiting Datadog’s booth at AWS re:Invent 2019. Impressed by the demo, he launched a free Datadog trial soon after. Within a few hours, Datadog was set up and in use. “Once we installed the agents into the test environment, the metrics were there, and I could start customizing immediately,” he says.
Following a successful trial, AGNC adopted Datadog as its new observability platform. “We started using it with part of the infrastructure, then expanded as we grew more comfortable,” adds Erhorn. “We grew organically with the system and turned off other tools as we progressed.”
At first, AGNC continued to use a different product for log management, but it required significant specialization, and its data was siloed from the monitoring data in Datadog. “We felt like we were duplicating a lot of work,” says Erhorn. “We had to build many bespoke rules and maintain a complex in-house architecture requiring consultants.”
The company eventually replaced this tool with Datadog Log Management. “Datadog Log Management was easier to use and it reduced our infrastructure complexity since it was cloud-based,” adds Erhorn.
Around the same time, AGNC began migrating to AWS, and it used Datadog Cloud Security Posture Management (CSPM) to ensure the AWS environment was secure and built to the highest standards. “We adopted CSPM early and responded to the signals as we built,” says Erhorn. “By leveraging CSPM, we got it right on the first try.”
AGNC also implemented Datadog Cloud SIEM to replace its Security Operations Center (SOC). “Every security signal reviewed by our team is centrally captured and reviewed, ensuring compliance and minimizing the risk of missed issues,” says Erhorn. “Since we had already implemented the other Datadog features, it was just a matter of examining the pre-built rules in the SIEM, expanding on them, and customizing them.”
Datadog assists with reporting as well. Every quarter, Erhorn uses data from Datadog, such as Service Level Objectives (SLOs), to present cybersecurity and management metrics to the board and executive management team. This process typically takes just a few hours, thanks to the streamlined reporting capabilities within the platform.
Reducing troubleshooting time and costs
By consolidating monitoring into a single platform, AGNC has gained higher data granularity, improved visibility, and shorter issue resolution times. “We spend very little time diagnosing problems today because we have the data we need,” says Erhorn. “When an issue arises, we can find the root cause and troubleshoot quickly.”
Datadog has enabled AGNC to reduce the number of tools it relies on for monitoring and security. “Instead of looking at many different systems each day, I look at a few custom dashboards that contain everything I need to review,” notes Erhorn.
The breadth and depth of Datadog integrations mean AGNC engineers don’t have to build or maintain their own. A good example is the company’s use of Datadog’s Netskope integration. Rather than logging into Netskope daily, Erhorn set up the integration through the Datadog Marketplace. Within hours, he could build custom visualizations of Netskope events.
Erhorn also appreciated the support from Crest Data, an Advanced-tier Datadog Technology Partner that built the Netskope integration and offered assistance after installation. “I always check the Integrations and Marketplace pages to ensure we’re not wasting time building our own integrations,” he says. “We’d rather use or buy maintained Datadog integrations to save time and money. It’s a big value-add.”
Maintaining a lean IT team
Datadog has enabled AGNC to maintain a lean IT team despite its growth. “If we were still operating under the old model, we would have needed more people. Datadog has allowed us to stay the same size because we don’t have to spend a lot of time supporting our existing infrastructure,” says Erhorn. “We always want to focus our IT resources on differentiators for our business. Leveraging Datadog to manage our infrastructure means we can focus these resources on business needs, not routine infrastructure management.”
Ultimately, AGNC’s partnership with Datadog has helped Erhorn and his team stay on top of their IT environment. “Datadog is what we use to observe our entire IT environment. It is the system,” he adds. “To me, Datadog is essential.”