AWS Aurora Connection Scaling

This guide is part of Cloud Database Connection Management and focuses on the connection-ceiling mechanics that make Aurora behave differently from a self-managed PostgreSQL or MySQL instance. Aurora derives max_connections from the instance class memory, splits traffic across writer and reader endpoints, and, on Serverless v2, moves that ceiling continuously as Aurora Capacity Units (ACU) scale. Pool sizing that ignores any one of these three facts will either starve under load or overrun the engine and have connections rejected with FATAL: remaining connection slots are reserved. This document covers the formulas, the endpoint topology, failover draining behavior, and the read-pool separation pattern that keeps the writer healthy.

The operational hazard is that connection limits on Aurora are not a static number you set once. They float with the instance class on provisioned clusters and with the current ACU value on Serverless v2. Every client pool — HikariCP, node-postgres, SQLAlchemy, database/sql — multiplies its maximumPoolSize by the number of application instances, and the sum of those pools competes for one shared, memory-derived ceiling. Getting this right means computing the ceiling from the instance class, reserving headroom for failover and superuser sessions, and routing read traffic away from the writer.

[Figure: Aurora endpoint topology with pool ceilings. An application write pool (maximumPoolSize = 15 x N app instances) connects through the cluster writer endpoint (DNS) to the writer instance; a read pool (maximumPoolSize = 25 x N app instances) connects through the reader endpoint, which round-robins across Reader 1 and Reader 2. All three instances are db.r6g.large with a ceiling of about 1660 connections each.]
Read and write pools target separate endpoints; each instance enforces its own memory-derived ceiling.

Key operational takeaways:

  • Aurora PostgreSQL derives max_connections from LEAST({DBInstanceClassMemory/9531392}, 5000) — it is a function of instance class memory, not a value you set freely.
  • The reader endpoint round-robins new connections across replicas; long-lived pooled connections pin to whichever replica answered first, so balance is per-connection, not per-query.
  • Aurora Serverless v2 recomputes the connection ceiling from current ACU memory; the floor you must size against is min ACU, not max.
  • Failover promotes a reader and updates the writer endpoint’s DNS record to point at it; every connection in the writer pool is severed and must be re-established, so size for the reconnection storm.
  • Route read traffic to a separate pool on the reader endpoint to keep writer connection slots available for transactions and failover headroom.

Foundational mechanics

Aurora does not let you set max_connections to an arbitrary value. The default parameter is an expression evaluated against the instance’s memory. For Aurora PostgreSQL the formula is:

LEAST({DBInstanceClassMemory/9531392}, 5000)

DBInstanceClassMemory is the instance’s memory in bytes minus the memory Aurora reserves for the engine. Dividing by 9531392 (roughly 9.1 MiB) yields the connection budget, capped at 5000. A db.r6g.large (16 GiB) lands near 1660 connections; a db.r6g.xlarge (32 GiB) roughly doubles that. Aurora MySQL also derives its default from instance memory, but through a different formula, so do not reuse the PostgreSQL divisor there. The practical consequence: when you scale the instance class up or down, the connection ceiling moves with it, and any client pool sized against the old ceiling is now either wasteful or dangerous.
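As a rough illustration of the formula, the Python sketch below evaluates it for a given memory value. The function and constant names are assumptions made for this example; Aurora evaluates the expression itself. Because DBInstanceClassMemory is smaller than the nominal instance memory, feeding in the nominal figure gives an upper bound rather than the real ceiling.

```python
AURORA_PG_BYTES_PER_CONNECTION = 9_531_392  # divisor in the default formula
AURORA_PG_CONNECTION_CAP = 5_000            # cap in the default formula


def aurora_pg_max_connections(db_instance_class_memory: int) -> int:
    """Evaluate LEAST({DBInstanceClassMemory/9531392}, 5000) for a byte count.

    Integer division is an assumption of this sketch; the point is the
    shape of the formula, not the exact rounding Aurora applies.
    """
    if db_instance_class_memory < 0:
        raise ValueError("memory must be non-negative")
    uncapped = db_instance_class_memory // AURORA_PG_BYTES_PER_CONNECTION
    return min(uncapped, AURORA_PG_CONNECTION_CAP)


GIB = 2**30

# Nominal memory overestimates: the engine reserves part of it, which is
# why a 16 GiB class lands nearer 1660 than the value computed here.
nominal_16_gib = aurora_pg_max_connections(16 * GIB)

# At 64 GiB the division exceeds 5000, so the cap applies.
nominal_64_gib = aurora_pg_max_connections(64 * GIB)
```

The reliable way to learn the effective value is still to read it from the running instance with SHOW max_connections; the sketch is only useful for planning before an instance class change.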

You can override the default by setting a literal max_connections in a DB cluster parameter group, but that is a footgun on Aurora. Setting it higher than the memory formula allows lets clients open connections the engine cannot back with working memory, and each PostgreSQL backend process consumes work_mem-scaled RAM. The safer pattern is to leave the formula in place and constrain demand at the pool layer instead.

Connections are not free on the engine side. Each PostgreSQL backend is a separate OS process with its own memory footprint, scaling with work_mem and the size of cached plans and catalog data. A cluster sitting at 90% of its connection ceiling is not merely “busy” — it is one traffic spike or one failover away from rejecting the connections that real transactions need. This is exactly why a front-end proxy that multiplexes many client connections onto few backend connections — covered in AWS RDS Proxy Connection Pooling — is often the right answer for high-fan-out or serverless application tiers that would otherwise blow past the ceiling.

The other reason the formula matters operationally: it interacts with how application frameworks decide pool size. A naive default such as “one connection per request thread” assumes the database can absorb whatever the application tier throws at it. On Aurora that assumption is bounded by a number you can compute in advance. If you run a thread-per-request server with 200 worker threads per instance across 10 instances, the application is implicitly asking for up to 2000 connections. That fits a db.r6g.xlarge (roughly 3325) but overruns a db.r6g.large (roughly 1660), a db.t4g.medium, and Serverless v2 at its floor. Computing the ceiling first, then sizing pools to fit it, inverts the usual order and is the only safe approach on a memory-derived limit.

Writer and reader endpoints. An Aurora cluster exposes two DNS names. The cluster (writer) endpoint always resolves to the current primary instance. The reader endpoint load-balances new connections across the available read replicas using DNS round-robin with a short TTL. Critically, the balancing happens at connection time, not query time: once a pooled connection is established to a replica, it stays on that replica for its entire lifetime. A connection pool that opens 25 long-lived connections through the reader endpoint will scatter them across replicas at startup and then hold them. If one replica is added later, existing pools do not rebalance onto it until connections recycle.

This per-connection pinning has a subtle consequence for steady-state balance. Because each application instance establishes its read pool independently, and DNS round-robin is observed per resolution rather than globally coordinated, two replicas can end up with markedly different connection counts even though the reader endpoint is “balancing.” A short DNS TTL and a moderate maxLifetime are the levers that keep distribution roughly even over time: as connections age out and reconnect, fresh DNS resolutions redistribute them. If you set maxLifetime very high to minimize reconnect overhead, you also freeze whatever skew existed at startup. The same recycle behavior is what lets a newly promoted or newly added replica receive traffic without an application restart. Treat maxLifetime on the reader pool as a balance-and-topology knob, not just a connection-hygiene one.

Custom endpoints are a third option worth knowing. You can define a named endpoint that targets a specific subset of instances — for example, routing analytical read traffic to two larger replicas while transactional reads use a separate custom endpoint pointed at smaller ones. Each custom endpoint is still subject to the per-instance ceiling of whichever instances it contains, so the sizing inequality applies per endpoint, per instance.

Precision sizing & timeout orchestration

The governing constraint is a single inequality. The sum of every pool’s maximumPoolSize across every application instance, plus reserved superuser and replication slots, must stay below the per-instance ceiling with failover headroom to spare:

(write_pool_size + read_pool_size) x app_instances + reserved < ceiling x safety_factor

Use a safety_factor of about 0.7 so a failover-driven reconnection storm and Aurora’s own internal connections do not push you over. The table below shows representative ceilings and a conservative pool budget. Pair this with connectionTimeout tuning from Connection Acquisition Timeout Strategies so that when demand exceeds supply, pools fail fast instead of queueing threads indefinitely.
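The inequality can be checked mechanically before a deployment. The Python sketch below is illustrative only: the function names, the reserved value of 10, and the example fleet sizes are assumptions for this example, not values taken from Aurora or from any pool library.

```python
def total_demand(write_pool: int, read_pool: int,
                 app_instances: int, reserved: int) -> int:
    """Left-hand side: the most connections the whole fleet can open."""
    return (write_pool + read_pool) * app_instances + reserved


def fits_ceiling(write_pool: int, read_pool: int, app_instances: int,
                 reserved: int, ceiling: int,
                 safety_factor: float = 0.7) -> bool:
    """True when demand stays strictly below ceiling x safety_factor."""
    demand = total_demand(write_pool, read_pool, app_instances, reserved)
    return demand < ceiling * safety_factor


def max_app_instances(write_pool: int, read_pool: int, reserved: int,
                      ceiling: int, safety_factor: float = 0.7) -> int:
    """Largest instance count for which fits_ceiling() still holds."""
    per_instance = write_pool + read_pool
    budget = ceiling * safety_factor - reserved
    if per_instance <= 0 or budget <= 0:
        return 0
    count = int(budget // per_instance)
    # The inequality is strict, so back off if demand would equal the budget.
    while count > 0 and not fits_ceiling(write_pool, read_pool, count,
                                         reserved, ceiling, safety_factor):
        count -= 1
    return count


# Hypothetical fleet: pools of 15 and 25 against a ceiling of about 1660.
instances_allowed = max_app_instances(15, 25, reserved=10, ceiling=1660)
```

Under these assumptions the 15/25 split leaves room for 28 application instances on a db.r6g.large, while a thread-per-request tier asking for 200 connections on each of 10 instances fails the same check.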

| Instance class | Memory | Approx. max_connections | Suggested total pool budget (0.7x) | Notes |
| --- | --- | --- | --- | --- |
| db.t4g.medium | 4 GiB | ~410 | ~285 | Burstable; avoid for steady high-connection load. |
| db.r6g.large | 16 GiB | ~1660 | ~1160 | Common baseline writer. |
| db.r6g.xlarge | 32 GiB | ~3325 | ~2325 | Headroom for many app instances. |
| db.r6g.2xlarge | 64 GiB | ~5000 (capped) | ~3500 | Formula exceeds cap; ceiling is 5000. |
| Serverless v2 @ 2 ACU min | ~4 GiB | ~410 (at floor) | ~285 | Ceiling rises with ACU; size against the floor. |

Timeout orchestration matters more on Aurora than on a static instance because the supply side moves. Set connectionTimeout (HikariCP) or the equivalent acquire timeout to a few seconds so a brief ceiling overrun surfaces as a fast application error rather than a silent thread pile-up. Set maxLifetime below 30 minutes so connections recycle and pick up newly added replicas and post-failover topology. Keep maxLifetime shorter than any idle or connection time limit enforced by the database or by network infrastructure between the application and the cluster, so the pool retires connections before something else drops them.

There is a deliberate asymmetry between the writer and reader pools that the sizing table only hints at. The writer pool should be the tighter of the two for three reasons: writes are usually a minority of total queries, writer slots are the ones that matter during a failover reconnection storm, and the writer is the single point whose exhaustion blocks all transactions. The reader pool can be larger and more elastic because a saturated reader degrades read latency rather than halting writes, and because that load is spread across multiple replicas. As a rule of thumb, start the writer pool at roughly one-third the combined budget and let the reader pool take the remaining two-thirds, then adjust from observed DatabaseConnections per endpoint. The minimumIdle setting deserves equal attention: a high idle floor multiplied across many application instances can hold a large fraction of the ceiling open even when traffic is near zero, which is wasteful on provisioned Aurora and actively harmful on Serverless v2 where it prevents scale-down.
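The one-third rule of thumb and the cost of an idle floor are both simple arithmetic. The sketch below shows them in Python; the function names and the sample numbers are assumptions for illustration, and the split is only a starting point to adjust from observed DatabaseConnections.

```python
def split_pool_budget(per_instance_budget: int) -> tuple[int, int]:
    """Start the writer at about one-third of the budget; readers get the rest."""
    if per_instance_budget < 1:
        raise ValueError("budget must be at least 1 connection")
    writer = max(1, per_instance_budget // 3)
    reader = per_instance_budget - writer
    return writer, reader


def idle_floor_fraction(minimum_idle: int, app_instances: int,
                        ceiling: int) -> float:
    """Share of one database instance's ceiling held open at zero traffic.

    Applies to a single pool type (writer or reader) and the database
    instance it targets; spreading readers over replicas lowers the
    per-replica share.
    """
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    return (minimum_idle * app_instances) / ceiling


# Hypothetical numbers: 40 connections per application instance to divide,
# and minimumIdle = 10 on 30 instances against a ceiling of about 410.
writer_size, reader_size = split_pool_budget(40)
idle_share = idle_floor_fraction(minimum_idle=10, app_instances=30, ceiling=410)
```

In this example the idle floor alone holds roughly 73% of a 410-connection ceiling open, which is the kind of figure that blocks scale-down on Serverless v2.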

Production configuration examples

A read/write split on Aurora is two pools pointed at two endpoints. The write pool targets the cluster endpoint; the read pool targets the reader endpoint and is sized larger because read traffic usually dominates.

# Spring Boot: two HikariCP pools, writer + reader
spring:
  datasource:
    write:
      jdbc-url: jdbc:postgresql://mycluster.cluster-abc123.us-east-1.rds.amazonaws.com:5432/app
      hikari:
        maximum-pool-size: 15      # writer slots are precious; keep tight
        connection-timeout: 4000   # fail fast on ceiling overrun
        max-lifetime: 1500000      # 25 min: recycle below failover horizon
        pool-name: aurora-writer
    read:
      jdbc-url: jdbc:postgresql://mycluster.cluster-ro-abc123.us-east-1.rds.amazonaws.com:5432/app
      hikari:
        maximum-pool-size: 25      # reads dominate; spread across replicas
        connection-timeout: 4000
        max-lifetime: 1500000
        read-only: true
        pool-name: aurora-reader

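The cluster-ro- segment in the second hostname marks the reader endpoint. Marking the reader pool read-only makes HikariCP call setReadOnly(true) on each connection; depending on driver version and settings, the PostgreSQL JDBC driver then either sends SET SESSION CHARACTERISTICS AS TRANSACTION READ ONLY or opens its transactions as read-only. Either way it guards against accidental writes routed to a replica (Aurora rejects writes on replicas regardless, but failing early with an explicit read-only error is cleaner).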

For Serverless v2, the same split applies, but the writer pool must be sized against the min ACU memory, not the max. A pool sized for the ceiling at 16 ACU can have its connections rejected by the engine at 2 ACU. The dedicated guide Tuning Aurora Serverless v2 Connection Limits works through the ACU-to-ceiling arithmetic with a numeric example.

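// node-postgres reader pool against the Aurora reader endpoint
const { Pool } = require('pg');
const readerPool = new Pool({
  host: 'mycluster.cluster-ro-abc123.us-east-1.rds.amazonaws.com',
  port: 5432,
  database: 'app',                // user and password read from PGUSER / PGPASSWORD
  max: 20,                        // per process; multiply by container replica count
  idleTimeoutMillis: 30000,       // close clients that sit idle for 30 s
  connectionTimeoutMillis: 4000,  // fail fast when no connection is available
});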

Diagnostics & telemetry

The authoritative count lives on the engine. pg_stat_activity shows every backend, its state, and which application opened it:

SELECT usename, application_name, state, count(*)
FROM pg_stat_activity
GROUP BY usename, application_name, state
ORDER BY count DESC;

Compare the live count against the ceiling:

SELECT current_setting('max_connections')::int AS ceiling,
       count(*) AS in_use,
       round(100.0 * count(*) / current_setting('max_connections')::int, 1) AS pct
FROM pg_stat_activity
WHERE backend_type = 'client backend';  -- skip background processes listed in the view

On the AWS side, the CloudWatch DatabaseConnections metric is per-instance, so watch it on both the writer and each reader. A writer climbing toward its ceiling while readers sit idle is the classic signal that read traffic is not being split off. The general signals for when a pool is about to hit the wall are covered in Detecting Connection Pool Saturation; on Aurora the leading indicator is DatabaseConnections divided by the memory-derived ceiling crossing roughly 80%, combined with rising connectionTimeout errors in the application.
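The 80% indicator is easy to encode once DatabaseConnections samples have been fetched by whatever monitoring tooling is in use. The following Python sketch operates on a plain list of samples; the function names and the requirement of three consecutive samples are assumptions added here to avoid alerting on a single spike.

```python
from typing import Sequence


def saturation_ratio(database_connections: float, ceiling: int) -> float:
    """DatabaseConnections divided by the memory-derived ceiling."""
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    return database_connections / ceiling


def is_saturating(samples: Sequence[float], ceiling: int,
                  threshold: float = 0.8, sustained_points: int = 3) -> bool:
    """True when the latest `sustained_points` samples all meet the threshold."""
    if sustained_points < 1:
        raise ValueError("sustained_points must be at least 1")
    if len(samples) < sustained_points:
        return False  # not enough data to call it sustained
    recent = samples[-sustained_points:]
    return all(saturation_ratio(s, ceiling) >= threshold for s in recent)


# Hypothetical per-minute samples for a writer with a ceiling of about 1660.
climbing = [1000, 1350, 1400, 1420]
spiky = [1000, 1350, 1200, 1420]
```

Here the climbing series would raise the signal and the spiky one would not; pairing the result with the application's connectionTimeout error rate gives the combined indicator described above.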

ServerlessDatabaseCapacity (current ACU) and ACUUtilization are the companion metrics on Serverless v2. Plot DatabaseConnections against ServerlessDatabaseCapacity: a connection count that flatlines while ACU is still scaling up means the pool ceiling, not the engine, is the bottleneck.

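Two more queries earn their place in a runbook. To see how read connections are distributed across replicas, which is useful when read-pool balance looks skewed, run the following against each replica’s instance endpoint. inet_server_addr() returns the address of the instance serving the current session, and pg_stat_activity lists only that instance’s backends, so each run yields one row for one replica: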

SELECT inet_server_addr() AS replica, count(*)
FROM pg_stat_activity
GROUP BY 1;

To catch connections that are open but doing nothing (the silent consumers of ceiling slots), filter for long-idle backends:

SELECT pid, application_name, state,
       now() - state_change AS idle_for
FROM pg_stat_activity
WHERE state = 'idle'
  AND now() - state_change > interval '5 minutes'
ORDER BY idle_for DESC;

A large population of long-idle connections usually points to a minimumIdle set too high or an application holding connections across request boundaries. Either way it is consuming ceiling headroom that failover and superuser sessions need.

Integration & proxy compatibility

When the application tier is large or bursty — many Lambda functions, many ECS tasks, or a serverless web tier — the cleanest way to honor the Aurora ceiling is to put a multiplexing proxy in front of it. RDS Proxy maintains a warm pool of backend connections and lends them to clients per transaction, so 2000 client connections can ride on a few hundred backend connections. See AWS RDS Proxy Connection Pooling for setup and the session-pinning pitfalls that erode multiplexing efficiency.

RDS Proxy understands Aurora endpoints natively and follows failover, re-pointing borrowed connections at the new writer without the client re-resolving DNS. If you run your own PgBouncer instead, point one instance at the cluster endpoint for the transaction pool and a second instance at the reader endpoint for reads; the transaction-vs-statement-mode trade-offs in PgBouncer Transaction vs Statement Pooling apply unchanged on Aurora. Note that proxies do not raise the ceiling: they let you serve more clients under the same ceiling by sharing backend connections.

A proxy’s own connection limit must itself respect the Aurora ceiling. RDS Proxy exposes MaxConnectionsPercent and MaxIdleConnectionsPercent settings that bound how much of the target’s max_connections it will consume. On a fixed-size provisioned instance you can set these against a known ceiling. On Serverless v2 the ceiling moves, so the percentage should be conservative enough to stay safe at the floor — a proxy that grabs 90% of the ceiling at 16 ACU will overrun catastrophically when the cluster scales back to its minimum. Treat the proxy as one more consumer in the sizing inequality, not as an exemption from it.

Session pinning is the failure mode that quietly erodes a proxy’s value on Aurora. When a client uses a session-level feature the proxy cannot safely share — certain SET statements, advisory locks, temporary tables, or some prepared-statement patterns — the proxy pins that backend connection to the client for the session’s duration, collapsing multiplexing back toward one-to-one. The remediation is to avoid session-scoped state where possible and to keep transactions short. Where pinning is unavoidable, account for it by sizing the backend pool closer to the client connection count. The framework-level lifecycle hooks that release connections promptly, discussed in Connection Acquisition Timeout Strategies, are what keep multiplexing efficient end to end.
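To make the cost of pinning concrete, the Python sketch below uses a deliberately simplified model: pinned sessions each hold one backend connection, and the remaining clients share backends at a fixed ratio. The function name, the pinned_fraction parameter, and the clients_per_backend ratio are hypothetical and are not RDS Proxy settings.

```python
import math


def backend_connections_needed(client_connections: int,
                               pinned_fraction: float,
                               clients_per_backend: int) -> int:
    """Estimate backend connections under a simple pinning model.

    pinned_fraction     -- share of client sessions that end up pinned (0..1)
    clients_per_backend -- how many unpinned clients share one backend
    """
    if client_connections < 0:
        raise ValueError("client_connections must be non-negative")
    if not 0.0 <= pinned_fraction <= 1.0:
        raise ValueError("pinned_fraction must be between 0 and 1")
    if clients_per_backend < 1:
        raise ValueError("clients_per_backend must be at least 1")
    pinned = math.ceil(client_connections * pinned_fraction)
    shared = math.ceil((client_connections - pinned) / clients_per_backend)
    return pinned + shared


# Hypothetical tier: 2000 clients, ten unpinned clients per backend.
no_pinning = backend_connections_needed(2000, 0.0, 10)
quarter_pinned = backend_connections_needed(2000, 0.25, 10)
all_pinned = backend_connections_needed(2000, 1.0, 10)
```

Under this model a quarter of sessions pinning more than triples the backend requirement (200 to 650), and full pinning removes the multiplexing benefit entirely.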

Common failure patterns & remediation

| Symptom | Root cause | Exact fix | Validation command |
| --- | --- | --- | --- |
| FATAL: remaining connection slots are reserved for non-replication superuser connections | Total pool sizes exceeded the memory-derived ceiling | Reduce maximumPoolSize per instance or add RDS Proxy; recompute against the 0.7 safety factor | SELECT count(*), current_setting('max_connections') FROM pg_stat_activity; |
| Writer near ceiling, readers idle | No read/write split; all traffic on cluster endpoint | Add a second pool on the cluster-ro- reader endpoint, marked read-only | Compare CloudWatch DatabaseConnections on writer vs readers |
| Connection storm and timeouts right after a failover | Writer pool re-establishes every connection simultaneously when DNS flips | Lower writer maximumPoolSize, stagger reconnects, front with RDS Proxy | Correlate connectionTimeout errors with the RDS failover event timestamp |
| Serverless v2 rejects connections at low traffic | Pool sized against max ACU; ceiling collapsed at min ACU | Size writer pool against min-ACU memory or raise the min ACU floor | SELECT count(*) FROM pg_stat_activity; while ServerlessDatabaseCapacity is at minimum |
| New replica receives no traffic | Long-lived pooled connections pinned to old replicas | Lower maxLifetime so connections recycle and rebalance via reader-endpoint DNS | SELECT inet_server_addr(), count(*) FROM pg_stat_activity GROUP BY 1; per replica |

Failover and connection draining. When Aurora fails over, it promotes one of the read replicas to writer, then updates the cluster endpoint DNS to point at the new primary. Existing connections in the writer pool still point at the old primary; they are severed, not migrated. Every client must detect the broken connection, evict it, and reconnect through the now-updated endpoint. A pool with a large maximumPoolSize and aggressive minimumIdle will try to refill all those connections at once, hammering the freshly promoted writer just as the cluster is also absorbing the loss of the replica that was promoted out of the read pool. Mitigate with a smaller writer pool, jittered reconnection, and RDS Proxy, which holds backend connections through the failover and shields clients from the DNS flip.
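Jittered reconnection can be as simple as randomizing the wait before each retry so that clients do not all return in the same instant. The Python sketch below uses the common full-jitter exponential backoff; it is one reasonable scheme rather than anything Aurora or a specific pool library prescribes, and the function name and default timings are assumptions.

```python
import random
from typing import Optional


def reconnect_delay(attempt: int, base_seconds: float = 0.2,
                    cap_seconds: float = 10.0,
                    rng: Optional[random.Random] = None) -> float:
    """Full-jitter backoff: a uniform delay in [0, min(cap, base * 2**attempt)].

    attempt -- zero-based retry counter for one client
    rng     -- optional random.Random, injectable for reproducible tests
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    source = rng if rng is not None else random.Random()
    upper = min(cap_seconds, base_seconds * (2 ** attempt))
    return source.uniform(0.0, upper)


# Each client sleeps for its own randomized delay before reconnecting,
# spreading the refill of the writer pool over time.
example_rng = random.Random(42)
example_delays = [reconnect_delay(a, rng=example_rng) for a in range(8)]
```

A smaller writer pool reduces how many connections need this treatment in the first place, so the two mitigations work together.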