Generative AI Monitoring
Monitor your foundation model usage, API performance, and error rates with runtime metrics and logs.
The Essential Monitoring and Security Platform for the Cloud Age
Datadog brings together end-to-end traces, metrics, and logs to make your applications, infrastructure, and third-party services entirely observable.
Next-generation ML Monitoring
Monitor your entire machine learning stack with Datadog.
AWS Trainium & Inferentia
Monitor and optimize deep learning workloads running on AWS AI chips.
OpenAI
Monitor token consumption, API performance, and more.
NVIDIA DCGM Exporter
Gather metrics from NVIDIA’s discrete GPUs, essential to parallel computing.
Loved & Trusted by Thousands
ML Monitoring Resources
Learn how Datadog can help you monitor your entire AI stack.