Fairwinds Insights is Kubernetes governance and security software that enables DevOps teams to monitor and prevent configuration problems in their infrastructure and applications. Not only does Fairwinds simplify Kubernetes complexity, but it also reduces risk by surfacing security and reliability issues in your Kubernetes clusters.
The Fairwinds Insights integration is now available in the Datadog Marketplace. By unifying Fairwinds Insights recommendations with Kubernetes metrics, logs, APM, and RUM data in Datadog, you can get end-to-end visibility into your clusters and the applications they’re running. With this integration, you get a host of essential insights from Fairwinds—including action items related to new deployments, the estimated cost of your workloads, remediation guidance, links to reference resources, and more—all within a customizable, out-of-the-box dashboard. Any changes made in Datadog (such as resolving or assigning an action item) will automatically be reflected in the Fairwinds Insights platform.
In this post, we’ll walk through how you can use the Fairwinds Insights integration to continuously monitor the security of your Kubernetes clusters and optimize costs without sacrificing reliability.
Continuous Kubernetes security monitoring
Rooting out Kubernetes misconfigurations is critical for protecting the security of your applications, but it can take time and resources away from developing new features. Fairwinds Insights helps you reduce MTTR by scanning your Kubernetes clusters, manifests, and Helm charts for container vulnerabilities and configuration errors, which Fairwinds lists as action items.
Now, these action items also appear in the Fairwinds Insights dashboard in Datadog. For example, you might see an access control action item warning you that a container should be configured with a read-only root filesystem. In the description of the issue, Fairwinds recommends this change because it prevents anyone from modifying files in the container, which reduces the risk of a user adding or deleting code to create an exploit. Fairwinds also provides remediation guidance that follows best practices for fixing the vulnerability.
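To make the read-only root filesystem check concrete, here is a minimal sketch of how such a check could work. It is an illustration only, not Fairwinds' implementation: the function name and the sample manifest are hypothetical, while the field names (`securityContext.readOnlyRootFilesystem`, `containers`, `initContainers`) come from the Kubernetes Pod spec.

```python
def containers_with_writable_root(pod_manifest):
    """Return names of containers that do not set readOnlyRootFilesystem to True."""
    spec = pod_manifest.get("spec", {})
    flagged = []
    # Check init containers and regular containers alike.
    for key in ("initContainers", "containers"):
        for container in spec.get(key, []):
            security_context = container.get("securityContext") or {}
            if security_context.get("readOnlyRootFilesystem") is not True:
                flagged.append(container["name"])
    return flagged


# Hypothetical manifest, used only for illustration.
example_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "example/app:1.0",
                "securityContext": {"readOnlyRootFilesystem": True},
            },
            # No securityContext at all, so the root filesystem stays writable.
            {"name": "sidecar", "image": "example/sidecar:1.0"},
        ]
    },
}

print(containers_with_writable_root(example_pod))  # ['sidecar']
```

In this sample, only `sidecar` is flagged, because it leaves the root filesystem writable by default.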
From the out-of-the-box dashboard, you can quickly triage the most urgent problems, mark any Fairwinds action item as resolved, or assign an issue to another member of the team. Fairwinds Insights integrates with Jira and GitHub so you can ensure that feedback regarding Kubernetes cluster misconfigurations and vulnerabilities is sent to the teams who are responsible for that infrastructure.
Additionally, if your team is using infrastructure-as-code software (e.g., Terraform), Fairwinds Insights can help ensure that your resources are provisioned in a way that enforces the policies you’ve set up. Guardrails like these help you ship apps faster and more safely.
Kubernetes cost optimization
Requests and limits specify, respectively, the amount of resources (e.g., CPU and memory) reserved for a container and the maximum amount it can use. Teams sometimes overprovision resources in order to ensure their application performs well, but this can get expensive. A recent Datadog report found that 49 percent of Kubernetes workloads use less than 30 percent of their requested CPU. Fairwinds analyzes the resource usage of your workloads and makes recommendations for your requests and limits in order to help you save money without sacrificing reliability.
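As a rough sketch of the arithmetic behind the "less than 30 percent of requested CPU" figure, the snippet below compares observed CPU usage with the CPU request and flags workloads under that threshold. The workload names and usage numbers are hypothetical, and the helper functions are illustrative rather than part of any Fairwinds or Datadog API. Real recommendations would also weigh usage over time and memory.

```python
def parse_cpu(quantity):
    """Convert a Kubernetes CPU quantity such as '500m' or '2' to cores."""
    quantity = str(quantity)
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def cpu_utilization(requested, used_cores):
    """Fraction of the requested CPU that the workload actually uses."""
    return used_cores / parse_cpu(requested)


def overprovisioned(workloads, threshold=0.30):
    """Return names of workloads using less than `threshold` of their CPU request."""
    return [
        name
        for name, (requested, used_cores) in workloads.items()
        if cpu_utilization(requested, used_cores) < threshold
    ]


# Hypothetical workloads: name -> (CPU request, observed average usage in cores).
workloads = {
    "checkout": ("1", 0.20),    # uses 20% of its request
    "search": ("500m", 0.40),   # uses 80% of its request
    "reports": ("2", 0.50),     # uses 25% of its request
}

print(overprovisioned(workloads))  # ['checkout', 'reports']
```

Here `checkout` and `reports` are candidates for a lower CPU request, while `search` is already using most of what it reserves.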
Fairwinds also provides actionable remediation guidance for cost optimization with reference documentation and example snippets of proper configuration that you can copy/paste into your manifests. As you implement Fairwinds’ cost-optimization suggestions, you can check APM and RUM data to ensure that your changes don’t inadvertently increase latency in your application or degrade your user experience.
Or, you can pivot the other way, from an APM alert to Fairwinds Insights. For example, if you get notified about a mysterious spike in latency on one of your services, Fairwinds Insights can help you investigate the issue. In the screenshot below, the Fairwinds Insights action items widget in a Datadog dashboard informs you that a CPU request has not been set for this service. Configuring requests is considered a best practice because it tells Kubernetes how much compute to reserve for each container, which helps ensure that your applications have enough resources to run.
Drilling down reveals that because the CPU request was not set, the node could not provide enough CPU for its pods to handle a spike in traffic. This misconfiguration eventually caused the node, and all the pods running on it, to crash. With fewer pods available to handle requests, application latency spiked. Setting a CPU request would have prevented the pod from being scheduled on a node whose CPU was already committed to other pods. Ensuring that pods are scheduled on nodes that can accommodate their resource requirements is a best practice.
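The scheduling behavior described above can be sketched with a deliberately simplified model. It assumes that a pod fits on a node when the sum of CPU requests already placed there, plus the new pod's request, stays within the node's allocatable CPU, and that a pod with no request counts as zero. The real Kubernetes scheduler also considers memory, other resources, and placement constraints. The function names and numbers are hypothetical.

```python
def to_millicores(quantity):
    """Convert a CPU quantity such as '250m' or '1' to millicores."""
    quantity = str(quantity)
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def fits(node_allocatable, scheduled_requests, new_request):
    """Check whether a pod's CPU request fits in the node's unreserved CPU.

    A pod with no CPU request (None) is counted as requesting 0 CPU.
    """
    reserved = sum(to_millicores(r) for r in scheduled_requests if r is not None)
    needed = to_millicores(new_request) if new_request is not None else 0
    return reserved + needed <= to_millicores(node_allocatable)


# Hypothetical node with 2 cores allocatable and 1750m already requested.
node_cpu = "2"
already_scheduled = ["1", "750m"]

# With a 500m request, the pod does not fit and must go elsewhere.
print(fits(node_cpu, already_scheduled, "500m"))  # False

# With no request, the same pod appears to need nothing and is admitted.
print(fits(node_cpu, already_scheduled, None))  # True
```

In this model, the pod without a request is admitted to a node that is already nearly full, which is the situation that left no headroom during the traffic spike.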
Start monitoring Kubernetes with Fairwinds Insights and Datadog
The Fairwinds Insights integration is available for purchase in the Datadog Marketplace, giving you the visibility and control you need to run more efficient, reliable, and secure applications on Kubernetes. To learn more, see our documentation. If you’re new to Datadog, sign up for a 14-day free trial.
The ability to promote branded monitoring tools in the Datadog Marketplace is one of the benefits of membership in the Datadog Partner Network. If you’re interested in developing an integration or application for the Datadog Marketplace, contact us at marketplace@datadog.com.