Configuring connection validation queries for AWS RDS Proxy

This guide is part of Connection Acquisition Timeout Strategies. AWS RDS Proxy relies on connection health checks and lightweight validation to detect and discard stale database connections before routing client traffic. Misconfigured validation parameters cause Connection is not valid errors, increased latency, and unexpected pool exhaustion. This guide provides exact remediation steps, CLI configurations, and validation commands to stabilize your connection lifecycle.

Key Objectives:

Identify stale connection symptoms via RDS Proxy CloudWatch metrics
Configure pool settings and init_query for lightweight validation
Align validation intervals with RDS Proxy idle timeout thresholds
Verify configuration using AWS CLI and live connection tests

Understanding RDS Proxy Validation Mechanics

RDS Proxy intercepts client requests and validates backend database connections before routing traffic. Default behavior relies on TCP keepalives at the transport layer. TCP keepalives only verify network reachability, not logical database state.

Explicit SQL validation becomes important when handling logical state changes, read replica promotions, or cluster failovers. RDS Proxy does not expose a dedicated --validation-query CLI flag. Instead, validation behavior is controlled through connection pool configuration parameters (MaxConnectionsPercent, MaxIdleConnectionsPercent, ConnectionBorrowTimeout) and optionally through InitQuery, which executes a SQL statement when each new backend connection is established.

When designing pool behavior, understanding the underlying Pool Architecture & Algorithm Fundamentals helps align validation frequency with connection acquisition patterns. Proper alignment prevents thread starvation during high-concurrency spikes.

Diagnosing Stale Connection Errors

Stale connections manifest as abrupt query failures and connection pool exhaustion. Monitor CloudWatch metrics to isolate validation bottlenecks. Track DatabaseConnectionsCurrentlyBorrowed and ClientConnectionsCurrentlyBorrowed for divergence.

Application logs will surface explicit errors. Look for ERROR: connection is not valid or FATAL: terminating connection due to administrator command. Correlate these spikes with RDS failover events or transient network partitions.

Metric / Log Pattern	Threshold / Indicator	Action Required
`DatabaseConnectionsCurrentlyBorrowed`	Sustained > 85% of `MaxConnectionsPercent`	Increase pool size or reduce validation frequency
`ClientConnectionsCurrentlyBorrowed`	Rapid drop to 0 after failover	Verify application pool `pool_recycle` is set below proxy idle timeout
App Log: `connection is not valid`	> 5 errors/minute	Check proxy target group health; verify `pool_pre_ping` or equivalent

High acquisition latency during validation cycles often requires tuning Connection Acquisition Timeout Strategies to prevent client-side timeouts.

Configuring RDS Proxy Connection Pool Parameters via AWS CLI

Apply connection pool configuration directly through the AWS CLI using modify-db-proxy-target-group. This controls how RDS Proxy manages backend connections.

aws rds modify-db-proxy-target-group \
  --db-proxy-name my-proxy \
  --target-group-name default \
  --connection-pool-config '{
    "MaxConnectionsPercent": 80,
    "MaxIdleConnectionsPercent": 50,
    "ConnectionBorrowTimeout": 120,
    "SessionPinningFilters": ["EXCLUDE_VARIABLE_SETS"],
    "InitQuery": "SET TIME ZONE UTC"
  }'

InitQuery runs once per new backend connection; it can initialize session variables but is not a per-checkout validation query. For per-checkout health checks, configure pool_pre_ping=True (SQLAlchemy) or equivalent in your application driver. Always pair proxy configuration with IdleClientTimeout to recycle unused connections proactively.

Validating Configuration & Running Live Tests

Verify applied settings using describe-db-proxy-target-groups. Confirm pool configuration fields match your intended values.

aws rds describe-db-proxy-target-groups \
  --db-proxy-name my-proxy \
  --query "TargetGroups[].ConnectionPoolConfig"

Execute live routing tests to confirm connection establishment. Use IAM auth tokens to simulate production traffic patterns.

PGPASSWORD=$TOKEN psql \
  -h my-proxy.proxy-abc123.us-east-1.rds.amazonaws.com \
  -U admin \
  -d appdb \
  -c "SELECT 1 AS validation_check;"

Monitor query execution latency. If validation overhead exceeds 50ms, investigate backend resource contention or network path degradation.

Tuning Validation Intervals & Timeout Alignment

Ensure IdleClientTimeout strictly exceeds your application pool’s idle_timeout / idleTimeoutMillis to prevent premature recycling on the proxy side.

Parameter	Safe Range	Production Recommendation
`IdleClientTimeout`	300s – 1800s	900s (15m) for standard API services
`ConnectionBorrowTimeout`	30s – 300s	120s to absorb transient validation spikes
`MaxIdleConnectionsPercent`	20% – 70%	50% to balance memory footprint and reuse
App `pool_recycle`	75% of `IdleClientTimeout`	Recycle before proxy drops idle connections

Disable aggressive per-checkout validation during high-throughput batch jobs. Excessive checks during bulk inserts or data migrations consume unnecessary backend IOPS. Adjust pool parameters dynamically via CLI or infrastructure templates during maintenance windows.

Configuration Reference

Infrastructure-as-code ensures consistent validation and pool sizing across environments. The following Terraform block enforces MySQL-compatible proxy configuration with strict idle recycling.

resource "aws_db_proxy" "main" {
  name                   = "app-proxy"
  engine_family          = "MYSQL"
  idle_client_timeout    = 1800
  require_tls            = true
  role_arn               = aws_iam_role.rds_proxy.arn
  vpc_subnet_ids         = aws_subnet.private[*].id

  auth {
    auth_scheme = "SECRETS"
    secret_arn  = aws_secretsmanager_secret.db_creds.arn
    iam_auth    = "DISABLED"
  }
}

resource "aws_db_proxy_default_target_group" "main" {
  db_proxy_name = aws_db_proxy.main.name

  connection_pool_config {
    connection_borrow_timeout    = 120
    max_connections_percent      = 90
    max_idle_connections_percent = 50
    session_pinning_filters      = ["EXCLUDE_VARIABLE_SETS"]
  }
}

Deploy this configuration alongside automated drift detection. Validate pool metrics post-deployment to confirm parameter inheritance.

Common Mistakes

Using heavy queries as InitQuery: Resource-intensive SQL in InitQuery increases new-connection latency. Keep it to lightweight session initialization such as SET search_path or SET TIME ZONE.
Setting proxy IdleClientTimeout shorter than app pool idleTimeoutMillis: The proxy drops connections that the application pool still considers live, causing ECONNRESET errors.
Ignoring IAM auth token rotation: RDS Proxy authentication can fail if the IAM token expires mid-cycle. Use short-lived tokens and rotate them before expiry.

FAQ

Does RDS Proxy support per-checkout SQL validation queries?

Not natively. RDS Proxy performs TCP-level health checks and uses InitQuery for session initialization on new backend connections. Per-checkout SQL validation (equivalent to pool_pre_ping) must be implemented in the application driver.

How do I verify RDS Proxy pool configuration is applied correctly?

Run aws rds describe-db-proxy-target-groups --db-proxy-name <name> --query "TargetGroups[].ConnectionPoolConfig" and confirm MaxConnectionsPercent, MaxIdleConnectionsPercent, and ConnectionBorrowTimeout match your intended values.

What happens if the proxy cannot borrow a backend connection within ConnectionBorrowTimeout?

The proxy returns a connection error to the client. The client pool will typically surface this as a connection timeout or acquisition failure. Increase ConnectionBorrowTimeout or reduce MaxConnectionsPercent to free up headroom.

Connection Acquisition Timeout Strategies — the parent topic on aligning client wait thresholds with pool and proxy behavior.
Tuning Connection Acquisition Timeout Under Burst Load — sizing ConnectionBorrowTimeout and client timeouts for traffic surges.
Pool Architecture & Algorithm Fundamentals — allocator design and lifecycle context behind validation frequency.
Detecting Connection Pool Saturation — spotting borrowed-connection divergence before it becomes exhaustion.