Last month, members of the Datadog community convened in Seattle for our customer summit. There, they discussed new developments in monitoring dynamic infrastructure and applications, learned about the latest updates to the Datadog platform, and shared tips, tools, and techniques from their own experiences.
This year, the Datadog Summit included a series of technical talks, hands-on workshops, and small-group Q&A sessions. We discussed new capabilities and language support for distributed tracing and APM; introduced automated browser tests; and debuted advanced logging features, like nested processing pipelines and automated correlation between logs and traces. In addition, Datadog users and partners shared their own diverse experiences, such as why application-level metrics remain important even as infrastructure technologies become increasingly advanced, and how to take the anxiety out of deploying new features.
For more, you can watch some of the product announcements and talks from the customer summit below, or see the full playlist here.
APM and distributed tracing
In this talk, Datadog product manager Priyanshi Gupta simulates a typical on-call workflow, pivoting between features like the Service Map, App Analytics, and flame graphs to detect and debug errors. In addition, Priyanshi announces some exciting additions to Datadog’s APM and distributed tracing offerings, from support for new languages (PHP and .NET) to our brand-new runtime metrics feature, which surfaces detailed data from your application’s runtime environment.
Advances in log management
Datadog product manager Stephen Lechner unveils a number of new log management features, including a way to unify logs and traces more tightly than ever. Datadog’s tracing libraries can now inject trace IDs into the logs generated while processing a request, automatically deep-linking log events and traces to provide instant, request-level context. Stephen also introduces the nested pipelines feature, which allows teams to independently organize and manage parallel log pipelines without worrying about affecting their colleagues’ pipelines in the process.
How we made deploys less scary (Carta)
For developers and SREs, the feature release process is one of the most stressful parts of the job because of the potential for cascading effects. In this video, Adam Savitzky explains how Carta uses strategies and tools like feature flags, dark launches, and user bucketing to deliver new features to 700,000+ shareholders (and over 10,000 companies) while reducing the risk of introducing slowdowns or outages.
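The user-bucketing idea behind gradual rollouts can be sketched in a few lines. This is an illustrative example, not Carta’s implementation: hashing the user ID together with the feature name gives each feature an independent, stable bucket assignment, so the same user always gets the same answer, and raising the rollout percentage only adds users rather than reshuffling them.

```python
import hashlib

def in_rollout(user_id: str, feature: str, percent: float) -> bool:
    """Deterministically decide whether a user is in a feature's rollout.

    percent is expressed on a 0-100 scale. Hashing "feature:user_id"
    keeps bucket assignments stable across calls and independent
    across features.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 10000  # uniform bucket in 0..9999
    return bucket < percent * 100

# Roll a hypothetical "new-dashboard" feature out to 10% of users:
enabled = in_rollout("user-42", "new-dashboard", 10)
```

Dark launches build on the same mechanism: the flag gates whether the new code path runs (or whether its output is shown), so a problem found at 10% never reaches the other 90%.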
Application first, not infrastructure first (AWS)
Despite all the attention paid to changes in infrastructure, such as going from VMs to containers or instances to functions, many of the performance-monitoring fundamentals remain the same. In this talk, Abby Fuller of AWS shares some helpful tips and tools for monitoring important application-level metrics even as your infrastructure evolves. Towards that end, she also explains the critical role of a service mesh in facilitating application-level communications and networking, independent of whatever infrastructure platform you use.
Achieving huge performance wins with Datadog (Rover)
As a popular matchmaking service that connects pet care providers with pet owners, Rover relies on a small SRE team to support a complex application and a rapid release cycle. In this talk, Rover’s Alex Landau explains how the SRE team customized their Datadog dashboards, focusing on a selection of granular metrics in order to guide their performance monitoring and troubleshooting efforts. In addition, Alex and his team built an observability toolkit to empower developers to find and fix issues in pre-production, preventing smaller problems from escalating.
Tracking SLIs and SLOs
In order to improve visibility into the status of your SLOs and SLIs, Datadog recently released (in public beta) a new monitor uptime and SLO widget. As Datadog product manager Meghan Jordan explains, this new feature tracks your SLO performance over time, visualizes your remaining error budget, and helps you better understand where your performance stands in relation to your SLOs.
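The error-budget arithmetic underlying this kind of tracking is simple to work through. The snippet below is an illustrative sketch (not Datadog’s implementation): for an availability SLO, the error budget is the downtime the target permits over the evaluation window, and the remaining budget is what is left after subtracting downtime actually incurred.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Total allowed downtime, in minutes, for an availability SLO
    over a rolling window. slo_target is a fraction, e.g. 0.999."""
    total_minutes = window_days * 24 * 60
    return (1 - slo_target) * total_minutes

def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget

# A 99.9% availability SLO over a 30-day window allows
# (1 - 0.999) * 30 * 24 * 60 = 43.2 minutes of downtime.
```

Visualizing the remaining fraction over the window is what lets a team see at a glance how much room they have left before the SLO is at risk.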
We’re grateful to everyone who shared their wisdom at Summit, and we hope that you find the talks engaging and informative. If you’d like to see more, please take a look at our full playlist of videos here.