Leveraging the power of an integrated solution
While searching for a new observability solution, Tilkin conducted a limited trial of Datadog Log Management and serverless infrastructure monitoring. He also tested open source observability platforms, but they lacked the integrations he needed. Once he saw how Datadog Application Performance Monitoring (APM) could help, Tilkin made a 180-degree shift to focus on Datadog. “When I added APM and saw how easily I could correlate between APM and traces and logs, I realized this was a great tool,” Tilkin says. “This is not like anything else on the market.”
Complyt’s R&D team then created Datadog dashboards and encouraged company members to monitor them daily. “The product’s health should be important to everybody, not only the R&D team,” says Tilkin. “We are here to serve our customers. We utilize Datadog for this, and it’s very helpful.”
As he dove deeper into Datadog’s capabilities and saw how its products work together seamlessly, Tilkin added Application Security Management (ASM) to identify services exposed to application attacks. “That was an eye-opener for me,” he says. “Once I enabled ASM, I started to receive alerts about hundreds of attack attempts on our microservices. I realized people could access parts of our sales tax microservices that were not supposed to be accessible.”
That realization led Tilkin to rethink his system architecture in AWS, isolating subnets to prevent unauthorized access to microservices. “Datadog helped me understand what was exposed to the public which shouldn’t be, and helped us reconfigure our network topology,” Tilkin adds. “After the reconfiguration, the attacks stopped.”
Finally, Tilkin added Datadog Cloud Cost Management (CCM) when he saw cloud costs grow and didn’t understand why. “With Datadog, we were able to integrate cost data from AWS and build a visualization that helped give us a clear idea of where we were spending and where there was no usage,” he says.
“If you want to make your product better, more secure, and faster and spend less money, Datadog is a great tool.”
That cost visibility allowed Tilkin’s team to rightsize clusters they were paying for but barely using. “In an hour, we cut our total AWS costs by 40 percent,” he says. “When you have a tool that’s very fast, integrates with your cloud provider, and lets you understand where you spend your money, it’s very easy to dig deep into utilization of your compute resources.”
Going forward, Tilkin and his team will continue using Datadog to break down organizational silos around security and cost management and to create a culture of security and cost accountability. They’ll also rely on APM to improve the performance of their flagship application and meet their service level objectives (SLOs). For example, the company previously used an external service provider to calculate customer sales tax rates. A transaction that used the external service took roughly 700 to 800 ms to process, but when the volume of transactions increased, response time spiked to 2 to 3 seconds. The APM flame graph allowed the team to visualize the time spent on each resource. They found that as the traffic load increased, all the resources scaled well except for the external service. In response, they wrote their own service, which brought their average processing time down to 150 ms. It also helped them reach their SLO of 50 requests, each processed in 350 ms on average.
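The latency comparison above can be illustrated with a minimal Python sketch, assuming the SLO is evaluated as a simple mean over a sample of request latencies. The function name `meets_latency_slo` and the sample values are hypothetical, chosen only to fall within the ranges quoted in the text; they are not measured data from Complyt, and this is not how Datadog itself evaluates SLOs.

```python
from statistics import mean

# Average-latency target quoted in the case study (350 ms).
SLO_AVG_MS = 350


def meets_latency_slo(latencies_ms, target_ms=SLO_AVG_MS):
    """Return True when the mean latency of the sample is within the target."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    return mean(latencies_ms) <= target_ms


# Hypothetical samples, for illustration only.
external_under_load_ms = [2000, 2500, 3000]  # within the 2 to 3 second range
in_house_ms = [120, 150, 180]                # centered on the 150 ms average

print(meets_latency_slo(external_under_load_ms))  # False
print(meets_latency_slo(in_house_ms))             # True
```

A mean is the simplest reading of "350 ms on average"; a production SLO might instead be defined on a percentile or on a ratio of good requests, which this sketch does not cover.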