For over a decade, Amazon RDS (Relational Database Service) has been a popular managed database service companies have used to set up, operate, and scale databases for web and mobile applications. However, modern, high-scale applications can have upward of tens of thousands of clients. This means that maintaining direct connections between the database and application can quickly begin to consume more resources than the query executions themselves. This can slow down an application’s ability to scale and result in maintenance issues and higher costs, especially when paying per database instance.
To solve this problem, AWS launched RDS Proxy to act as an intermediary between the application and database. By connecting an application to RDS Proxy instead of directly to the database, you can pool and share connections from application clients rather than create new ones for every query, improving database efficiency and application scalability. RDS Proxy also strengthens security by providing access through Identity and Access Management (IAM) instead of encoding credentials within the application.
Datadog’s Amazon RDS Proxy integration and new out-of-the-box dashboard give you visibility into client connections and query throughput so you can quickly troubleshoot connection errors, identify when to scale connections to meet thresholds, and ensure optimal proxy performance and connection reusability.
In this post, we’ll walk through how Datadog can help you monitor RDS Proxy by:
- Identifying and troubleshooting connection errors
- Highlighting when to scale the number of connections
- Ensuring connection performance and reusability
Identifying and troubleshooting connection errors
When using RDS Proxy, there are two types of connections: client-to-proxy and proxy-to-database. Identifying issues with each type of connection can help you determine where problems are occurring and troubleshoot the cause. Monitoring and alerting on aws.rds.proxy.client_connections_setup_failed_auth will notify you of the number of connection failures between the client and the proxy caused by misconfigured authentication, a TLS issue, or another error. An increase in aws.rds.proxy.database_connections_setup_failed, on the other hand, indicates that the proxy is failing to connect to the database. This can point to problems with the IAM policy, security groups, or other authentication difficulties.
Target groups are the RDS DB instances or Aurora DB clusters that your proxy can connect to. Another way to identify a proxy’s connection issues is by monitoring aws.rds.proxy.availability_percentage, which provides the percentage of time for which the target groups associated with the proxy are available to route connections to applications. This is useful to identify any downtime in the database instances and clusters connecting to the proxy.
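As a rough illustration of how these three signals can be read together, here is a minimal Python sketch. The ProxySample class, the diagnose function, and the 99.0 availability floor are hypothetical names and values chosen for this example; they are not part of Datadog or AWS, and in practice you would typically express the same conditions as monitors.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProxySample:
    """One reading of the connection-health metrics discussed above (hypothetical container)."""
    client_setup_failed_auth: float   # aws.rds.proxy.client_connections_setup_failed_auth
    database_setup_failed: float      # aws.rds.proxy.database_connections_setup_failed
    availability_percentage: float    # aws.rds.proxy.availability_percentage


def diagnose(sample: ProxySample, min_availability: float = 99.0) -> List[str]:
    """Return the areas that look unhealthy for this sample.

    The min_availability default is an assumed threshold, not a recommendation.
    """
    findings = []
    if sample.client_setup_failed_auth > 0:
        # Failures between the application client and the proxy.
        findings.append("client-to-proxy")
    if sample.database_setup_failed > 0:
        # Failures between the proxy and the database.
        findings.append("proxy-to-database")
    if sample.availability_percentage < min_availability:
        # Target group was unavailable for part of the window.
        findings.append("target-availability")
    return findings


if __name__ == "__main__":
    print(diagnose(ProxySample(3, 0, 100.0)))
    print(diagnose(ProxySample(0, 7, 92.5)))
```

Keeping the checks separate mirrors the point above: knowing which side of the proxy is failing narrows down where to look first.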
Highlighting when to scale the number of connections
When monitoring database connections, you will want to make sure that the current number of database connections (aws.rds.proxy.database_connections) does not reach the maximum number of database connections allowed (aws.rds.proxy.max_database_connections_allowed). The maximum number of concurrent database connections allowed varies by the database engine used, the memory allocation for the database instance type, and your specific RDS Proxy configuration. If the number of database connections frequently reaches this limit, you may need to scale the database instance or increase the max_connections parameter in RDS Proxy.
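To make "frequently reaches the limit" concrete, the comparison can be sketched in Python as a utilization ratio evaluated over a window of samples. The function names, the 0.8 utilization threshold, and the 0.5 fraction below are assumptions for illustration only; pick values that match your own workload.

```python
from typing import Iterable, Tuple


def connection_utilization(current: float, max_allowed: float) -> float:
    """Fraction of the allowed database connections currently in use.

    current     -> aws.rds.proxy.database_connections
    max_allowed -> aws.rds.proxy.max_database_connections_allowed
    """
    if max_allowed <= 0:
        raise ValueError("max_allowed must be positive")
    return current / max_allowed


def needs_scaling(
    samples: Iterable[Tuple[float, float]],
    threshold: float = 0.8,      # assumed utilization threshold
    min_fraction: float = 0.5,   # assumed share of samples that must exceed it
) -> bool:
    """Return True if enough (current, max_allowed) samples sit at or above the threshold."""
    samples = list(samples)
    if not samples:
        return False
    over = sum(
        1 for current, max_allowed in samples
        if connection_utilization(current, max_allowed) >= threshold
    )
    return over / len(samples) >= min_fraction


if __name__ == "__main__":
    window = [(90, 100), (85, 100), (10, 100)]
    print(needs_scaling(window))
```

Looking at a window rather than a single reading avoids reacting to one short spike.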
Ensuring connection performance and reusability
A proxy’s performance is, in part, determined by how efficiently it is able to reuse connections after each transaction in a session; this transaction-level reuse is called multiplexing. However, RDS Proxy can sometimes deem a connection unfit for reuse and put it in a pinned state when a session state change occurs that isn’t applicable to other sessions.
Datadog can help you identify an increase in pinned connections by setting alerts on aws.rds.proxy.database_connections_currently_session_pinned and notifying you when it crosses a threshold or exceeds a ratio relative to the number of borrowed connections (aws.rds.proxy.database_connections_currently_borrowed). You can also ensure multiplexing is working as expected in a proxy by monitoring the ratio of connection requests (aws.rds.proxy.database_connection_requests) to query requests (aws.rds.proxy.query_requests), which should ideally stay low. A consistently high ratio could be an indication of excessive pinning (see AWS docs on how to avoid pinning). By keeping pinning to a minimum, you can help reduce the resources wasted on creating new connections.
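The two ratios described above can be sketched in a few lines of Python. This is only an illustration: multiplexing_report and both default limits (0.2 and 0.1) are hypothetical and should be tuned against your own baseline.

```python
from typing import Dict, Union


def safe_ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when there is no activity to compare against."""
    return numerator / denominator if denominator > 0 else 0.0


def multiplexing_report(
    pinned: float,               # aws.rds.proxy.database_connections_currently_session_pinned
    borrowed: float,             # aws.rds.proxy.database_connections_currently_borrowed
    connection_requests: float,  # aws.rds.proxy.database_connection_requests
    query_requests: float,       # aws.rds.proxy.query_requests
    max_pinned_ratio: float = 0.2,    # assumed limit, not an AWS or Datadog default
    max_request_ratio: float = 0.1,   # assumed limit, not an AWS or Datadog default
) -> Dict[str, Union[float, bool]]:
    """Compute both ratios and flag the proxy if either exceeds its limit."""
    pinned_ratio = safe_ratio(pinned, borrowed)
    request_ratio = safe_ratio(connection_requests, query_requests)
    return {
        "pinned_ratio": pinned_ratio,
        "request_ratio": request_ratio,
        "excessive_pinning": (
            pinned_ratio > max_pinned_ratio or request_ratio > max_request_ratio
        ),
    }


if __name__ == "__main__":
    print(multiplexing_report(pinned=5, borrowed=10,
                              connection_requests=50, query_requests=1000))
```

A low connection-request-to-query ratio means many queries are being served per database connection, which is what multiplexing is meant to achieve.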
Beyond monitoring connection reuse, it’s important to track query throughput and latency to ensure that the proxy and database are responding to application requests effectively. Metrics like aws.rds.proxy.query_database_response_latency and aws.rds.proxy.query_response_latency relative to the overall number of queries (aws.rds.proxy.query_requests) can further help identify increased latency issues within the application and database.
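One way to read the two latency metrics together is to separate the time spent in the database from the time added by the proxy. The Python sketch below assumes that query_response_latency covers the full request as seen by the proxy (including database time) and that both values use the same unit; check the metric descriptions before relying on this. The latency_breakdown function is a hypothetical helper.

```python
from typing import Tuple


def latency_breakdown(
    query_response_latency: float,           # aws.rds.proxy.query_response_latency
    query_database_response_latency: float,  # aws.rds.proxy.query_database_response_latency
) -> Tuple[float, float]:
    """Return (proxy_overhead, database_share).

    proxy_overhead: latency not accounted for by the database, clamped at 0.
    database_share: fraction of total latency spent in the database, capped at 1.
    Assumes the total latency includes the database latency (see lead-in).
    """
    overhead = max(query_response_latency - query_database_response_latency, 0.0)
    if query_response_latency > 0:
        share = min(query_database_response_latency / query_response_latency, 1.0)
    else:
        share = 0.0
    return overhead, share


if __name__ == "__main__":
    overhead, share = latency_breakdown(1200.0, 900.0)
    print(f"proxy overhead: {overhead}, database share: {share:.0%}")
```

If the database share stays high while total latency climbs, the database is the more likely place to investigate; a growing overhead points toward the proxy instead.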
Get started
With Datadog’s Amazon RDS Proxy integration and new out-of-the-box dashboard, you can monitor key metrics from your proxies alongside the rest of your AWS infrastructure and more than 500 other services and technologies. Troubleshoot database connection errors to prevent failed queries, understand when to scale connections up or down to meet thresholds, and ensure proxy performance and connection reusability—all from a single pane of glass. See our documentation for RDS Proxy to get started. If you’re not already a Datadog customer, sign up for a 14-day free trial.