Datadog’s more than 800 integrations collect monitoring data from across your entire stack, giving you full visibility into the health and performance of your applications and infrastructure. Alerts are a crucial part of any monitoring workflow, as they draw your attention to problems in your system before they affect your users. But whether you’re migrating to a new environment or integrating a new technology into your stack, it’s not always clear what data you should alert on. For example, which metrics matter most in your Kubernetes cluster? How do you define your alerting thresholds? How should you tag your monitors?
We’re pleased to announce Recommended Monitors: a suite of curated, customizable alert queries and thresholds for key infrastructure technologies such as Consul, Kubernetes, and Kafka. Recommended Monitors are preconfigured based on the expertise of our many technology partners, as well as our own experience and the experience of thousands of our customers. With Recommended Monitors, you’ll be able to start alerting on key monitoring data from your environment within minutes, so you can focus your attention on growing your business without worrying about undetected problems in your ecosystem.
Enable Recommended Monitors with just a few clicks
Recommended Monitors are available out of the box, so you can get started immediately after you’ve finished installing the Datadog Agent and adding your integrations. Simply navigate to the “New Monitor” tab from the “Monitors” dropdown in the sidebar and then select “Recommended Monitors.” From there, you’ll be able to browse all available Recommended Monitors for your installed integrations—and filter them by integration name.
Each Recommended Monitor comes equipped with a default query, predetermined alerting and warning thresholds, and relevant tags—all of which adhere to the best alerting practices for the integration. You can also easily customize Recommended Monitors to suit your particular needs.
For example, the screenshot above shows a Recommended Monitor for Kubernetes that tracks how much disk space has been used per node over the last 10 minutes. By default, you will be warned if disk space usage exceeds 85%, and you will receive an alert if it exceeds 88%. But if your application relies on a rapid, high-volume data flow, you may want to adjust these thresholds in order to ensure that your team has time to address the underlying issue before it becomes critical. You can continue to fine-tune the thresholds even after you’ve added the monitor to your workflow.
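To make the threshold adjustment concrete, here is a minimal Python sketch of how such a monitor might be represented as data: a query string plus warning and critical thresholds. It sends nothing to any API. The metric name `example.node.disk_used_pct`, the `node` grouping, and the helper names `build_disk_monitor` and `classify` are hypothetical placeholders chosen for illustration, not the contents of the actual Recommended Monitor, and the sketch assumes usage is reported as a percentage.

```python
from typing import Any, Dict


def build_disk_monitor(warning: float = 85.0, critical: float = 88.0) -> Dict[str, Any]:
    """Build a monitor definition as a plain dict (nothing is sent to any API)."""
    if not 0 < warning < critical <= 100:
        raise ValueError("expected 0 < warning < critical <= 100")
    # Hypothetical metric name and grouping tag, used only for illustration.
    query = (
        "avg(last_10m):avg:example.node.disk_used_pct{*} "
        f"by {{node}} > {critical:g}"
    )
    return {
        "name": "Disk space usage is high on node {{node.name}}",
        "type": "metric alert",
        "query": query,
        "options": {"thresholds": {"warning": warning, "critical": critical}},
    }


def classify(usage_pct: float, monitor: Dict[str, Any]) -> str:
    """Return the state the thresholds imply for a single usage value."""
    thresholds = monitor["options"]["thresholds"]
    if usage_pct > thresholds["critical"]:
        return "alert"
    if usage_pct > thresholds["warning"]:
        return "warn"
    return "ok"


# Thresholds from the example in the text: warn above 85%, alert above 88%.
default_monitor = build_disk_monitor()
# Hypothetical lower thresholds for a team that wants earlier notice.
early_monitor = build_disk_monitor(warning=75.0, critical=85.0)
```

With the default thresholds, a node at 80% usage stays in the "ok" state, while the same reading already produces a warning under the lowered thresholds. That is the kind of extra lead time a high-volume workload might need.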
As a final step, you can add notification channels to any monitor with the help of our built-in integrations with communication tools such as Slack and PagerDuty to ensure your team is able to respond quickly if and when a problem arises.
Enact alerting best practices for the technologies you rely on
Recommended Monitors are the product of the hard-won lessons we’ve learned over the years and the expertise of our trusted technology partners. We maintain two public GitHub repositories for our integrations, where our partners and community members can submit monitors they’ve created. These submissions are subject to a thorough review process by our integrations team to ensure that every Recommended Monitor is preventative, contextual, and actionable. This means you can trust that the monitors we recommend have been both created and vetted by the people who know these technologies best.
Each monitor has warning thresholds that are generous enough to give the responding engineer time to resolve the problem but conservative enough to prevent alert fatigue. Each monitor also includes a descriptive title and notification message, as well as appropriate tags, so that whoever is responding to the alert has the necessary context to conduct an investigation. Finally, we recommend that alerts triggered by Recommended Monitors provide links to resources such as runbooks and service management consoles to ensure that the next steps are clear.
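As an illustration of the context described above, the following Python sketch assembles a notification message containing a short summary, a runbook link, and notification handles. The `@slack-...` and `@pagerduty-...` handle style and the `{{#is_alert}}` and `{{#is_warning}}` conditional blocks are modeled on Datadog's notification template syntax. The channel and service names, the runbook URL, the tags, and the helper `build_message` are hypothetical placeholders.

```python
from typing import List


def build_message(summary: str, runbook_url: str,
                  alert_handles: List[str], warning_handles: List[str]) -> str:
    """Compose a notification message that gives the responder context and next steps."""
    lines = [
        summary,
        f"Runbook: {runbook_url}",
        # Notify the paging handles on alerts only; warnings go to a quieter channel.
        "{{#is_alert}}" + " ".join(alert_handles) + "{{/is_alert}}",
        "{{#is_warning}}" + " ".join(warning_handles) + "{{/is_warning}}",
    ]
    return "\n".join(lines)


message = build_message(
    summary="Disk space usage is above the threshold on {{node.name}}.",
    runbook_url="https://example.com/runbooks/disk-space",  # placeholder URL
    alert_handles=["@pagerduty-example-service", "@slack-example-oncall"],
    warning_handles=["@slack-example-oncall"],
)

# Placeholder tags that tell the responder which integration and team own the alert.
tags = ["integration:kubernetes", "team:example-platform"]
```

Routing warnings and alerts to different handles is one way to keep warnings visible without paging anyone, which fits the goal of limiting alert fatigue.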
We’ve already added Recommended Monitors submitted by our partners for companies and technologies such as:
- Apache
- Confluent Platform
- CoreDNS
- Elasticsearch
- HAProxy
- HashiCorp
- IIS
- Istio
- Jenkins
- Kafka
- MongoDB
- MySQL
- Nginx
- PostgreSQL
- RabbitMQ
- Red Hat Ceph Storage
- Redis Labs
- SQL Server
- Scylla
- Signal Sciences
- Tomcat
- Vault
We’re working with other partners to add even more Recommended Monitors soon.
Get started with Recommended Monitors
With Recommended Monitors, you’ll have out-of-the-box access to the best monitoring practices for the services in your system, so you can incorporate actionable alerts into your monitoring workflow within minutes of installing your integrations. And because every alert triggered by a Recommended Monitor comes with a wealth of contextual data and detailed next steps, you’ll have all the information you need to conduct a comprehensive investigation.
Recommended Monitors are now generally available. Check out our documentation for more information about getting started. If you’re a technology partner and you’d like to learn more about submitting a Recommended Monitor, check out the learning materials available within the Partner Portal.
New to Datadog? Get started with a 14-day free trial.