Toyota Deploys at Scale Faster and More Securely by Monitoring AWS With Datadog | Datadog

TOYOTA

Case study

Toyota deploys at scale faster and more securely by monitoring AWS with Datadog

Transportation

48,000+

IN NORTH AMERICA

Plano, TX

About Toyota

Toyota (NYSE:TM) has been a part of the cultural fabric in the US for more than 65 years, and is committed to advancing sustainable, next-generation mobility through its Toyota and Lexus brands, plus nearly 1,500 dealerships.

“It's not just about what Datadog can do; it’s also about how Datadog is helping us integrate and monitor multiple services. That was a big factor in choosing Datadog.”

Kishore Jonnalagedda
Director of Engineering
Toyota Motor North America (TMNA)
Why Datadog?
  • Provides full visibility into the health and performance of each layer of TMNA’s environment
  • Easily integrates with and monitors AWS-hosted apps and other key technologies
  • Brings data—regardless of their source—into a single unified platform to help teams quickly gain context and troubleshoot faster
  • Offers intuitive dashboards to visualize site reliability engineering practices, service level objectives (SLOs), AWS over-capacity, and more
Challenge

TMNA lacked a consistent monitoring tool, which created inefficiencies and reliability concerns.

Key Results
↓ 96% MTTD

From about 6 hours to 15 minutes on average

20X Faster

New developers and contractors onboard in 3-4 days instead of 8-12 weeks

Quicker delivery

Teams now ship projects in weeks instead of quarterly

Inconsistent monitoring creates inefficiencies

Toyota Motor North America (TMNA) is the operating subsidiary of the Toyota Motor Corporation in the United States, Canada, Mexico, and Puerto Rico. TMNA works to create high-quality vehicles and find innovative ways to advance society with cutting-edge automotive technology.

TMNA began using Amazon Web Services (AWS) in 2015. As it did so, it also wanted to simplify and standardize application development in the cloud and improve time to market. In response, Kishore Jonnalagedda, director of engineering, led the TMNA cloud platform team in building an internal, self-service development platform called Chofer using Backstage running on AWS.

Today, Chofer gives TMNA developers the tools they need to deploy modern applications that use AWS services across the organization, and it facilitates faster, more secure application deployments at scale. However, the team lacked a consistent monitoring tool, which created reliability concerns. Some developers used open source tools, others used log management tools, and some didn't use anything. As a result, team members often spent multiple hours trying to get to the bottom of an outage because they didn’t know what to look for or where to look.

“Some applications support critical aspects of our business; if they go down, we can lose revenue in the order of millions,” says Jonnalagedda. “When we say our mission-critical applications are highly available, we need a mechanism to support that statement.”

With 1,600 total applications (300 in the cloud) and more than 100 teams, that was a challenging task. On top of gaining unified visibility, the cloud platform team also sought to improve mean time to detection (MTTD) and ensure they could meet SLAs for 99.9 percent uptime while simultaneously reducing costs and helping engineers become more efficient.
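
To make the 99.9 percent target concrete, the short sketch below converts an availability percentage into the downtime it permits. The function name and the 30-day and 365-day windows are illustrative assumptions, not figures from TMNA.

```python
def allowed_downtime_minutes(availability_percent: float, window_days: float) -> float:
    """Return the minutes of downtime permitted in a window at a given availability."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Window lengths are assumptions chosen for illustration.
    for days in (30, 365):
        budget = allowed_downtime_minutes(99.9, days)
        print(f"{days}-day window: {budget:.1f} minutes of downtime allowed")
```

At 99.9 percent, a 30-day window leaves roughly 43 minutes of downtime, which is why detection time matters so much for meeting the target.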


Gaining centralized visibility into its AWS environment

Jonnalagedda and his team began looking for an observability solution that could provide full visibility into the health and performance of each layer of TMNA’s environment at a glance, in a single pane of glass.

Ultimately, TMNA achieved that by ingesting data from its AWS services into Datadog. It can now maintain visibility into its cloud-hosted apps running on Amazon EC2, Amazon RDS, Amazon EKS, and others, all in one place. “It’s monitoring, logging, and traceability—the complete observability stack,” says Jonnalagedda.

Toyota also needed to monitor different parts of its tech stack. Datadog gave Toyota the visibility it needed with its 800+ integrations with key technologies, including support and out-of-the-box dashboards for over 100 AWS services.

“I love that Datadog recognizes non-traditional databases,” adds Jonnalagedda. “Some of our big data teams use databases that aren’t RDS. Datadog instantly recognizes them and starts monitoring them. It's not just about what Datadog can do; it’s also about how Datadog helps us integrate and monitor multiple services. That was a big factor in choosing Datadog.”

Datadog’s dashboards helped TMNA develop applications with more transparency, bringing metrics and logs into one place—regardless of their source—and helping the team quickly gain context and troubleshoot problems faster. These visualizations were also an easy way for the organization to look at site reliability engineering practices, visualize service level objectives (SLOs), and manage AWS over-capacity.

Improve speed and reliability

TMNA has saved $10 million over two years using Chofer. Part of those savings can be attributed to using Datadog to monitor its underlying infrastructure, supporting services, applications, and security data in a single observability platform. Datadog helps TMNA teams free up time they’d typically spend managing infrastructure or observability so they can spend more time on feature delivery.

With these time savings, teams now ship projects in weeks instead of quarterly. In addition, since new hires can easily make sense of TMNA’s distributed architecture with Datadog’s centralized platform, onboarding developers and contractors now takes as little as three to four days instead of the eight to twelve weeks previously required.

“Datadog provides us the visibility to find weak links so we can educate teams and fix them.”
Kishore Jonnalagedda, Director of Engineering, Toyota Motor North America

Finally, Datadog helps Jonnalagedda’s team reduce MTTD. “MTTD is reduced from about six hours to 15 minutes in a large-scale system,” says Jonnalagedda.
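
The 96 percent MTTD figure quoted in the key results follows from those two numbers. A minimal sketch of the arithmetic, using a hypothetical helper name:

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


mttd_before_minutes = 6 * 60  # "about six hours"
mttd_after_minutes = 15       # "15 minutes"

reduction = percent_reduction(mttd_before_minutes, mttd_after_minutes)
print(f"MTTD reduction: {reduction:.1f}%")
```

The result is about 95.8 percent, which rounds to the 96 percent shown in the key results.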

In another example, TMNA used Datadog’s services to help reduce the mean time to resolution (MTTR) from seven days to two hours in one of its manufacturing plants, avoiding hundreds of thousands of dollars in downtime costs. With Datadog, TMNA was able to establish a standard process for metrics and measurements and to reduce cross-team dependencies for issue resolution.

For Jonnalagedda, it all goes back to having the confidence to stand behind TMNA’s 99.9 percent uptime promise, which necessitates cross-team collaboration, alignment, and transparency across the entire organization. “Datadog provides us the visibility to find weak links so we can educate teams and fix them,” he says. “If something doesn't work in the cloud, we can ask the team for their Datadog dashboard, look in the logs to see which integration is broken and fix it. It’s a much faster turnaround time,” adds Jonnalagedda.

Since introducing Datadog at TMNA, Jonnalagedda’s team has helped build a robust observability culture that empowers application owners to take control of their monitoring. Additionally, TMNA can break down silos between different groups by offering a single source of truth for their data. “Bringing Datadog to the team is helping us better support our applications,” says Jonnalagedda. “Adoption is much cleaner. Datadog is a no-brainer.”
