GCP Cloud SQL Connection Pooling
This guide is part of Cloud Database Connection Management, and it focuses on the connection topology that trips up most teams the first time they run Postgres or MySQL on Google Cloud: there is a hard max_connections ceiling baked into the machine tier, the official connectors secure the link but do not pool it, and the only thing standing between your workload and FATAL: remaining connection slots are reserved is the client-side pool you size yourself. Cloud SQL is a managed instance, not an elastic connection fabric — every application replica still opens real backend processes against a finite instance, and getting the math right between your pool count and the tier ceiling is the whole game.
The confusion comes from the word “proxy.” The Cloud SQL Auth Proxy and the language connectors look like RDS Proxy or PgBouncer, but they are transport security layers, not multiplexers. They establish an mTLS tunnel and resolve IAM identity; they pass every client connection straight through to the instance one-for-one. If you treat the connector as a pooler, you will exhaust the tier and never understand why. This guide separates the two roles cleanly.
Key operational takeaways:
- The Cloud SQL Auth Proxy and language connectors handle mTLS and IAM but pass connections 1:1; they are not poolers, so you always need a client pool.
- max_connections is governed by the machine tier (memory) and is the true ceiling; size Σ(pool_size × replicas) to stay under it with headroom.
- Enable cloudsql.iam_authentication to authenticate with IAM principals instead of static passwords; it changes auth, not pooling.
- Private IP removes public-internet exposure and egress, but the connection-count math is identical to public IP.
- To collapse many app connections onto few backends, run PgBouncer (transaction mode) as a sidecar in front of the Auth Proxy; the connector does not do this for you.
Foundational mechanics
A Cloud SQL instance is a managed Postgres or MySQL server running on a fixed machine type. Every connection a client opens becomes a real backend (a postgres backend process or a MySQL thread) holding memory. The server enforces a max_connections limit, and the instance reserves a handful of slots for superusers and internal maintenance (superuser_reserved_connections on Postgres). When the in-use count plus reserved slots reaches the limit, the next client receives FATAL: remaining connection slots are reserved for non-replication superuser connections on Postgres, or Too many connections on MySQL. This is identical to self-managed Postgres or MySQL: the managed wrapper adds no elasticity.
Connecting to that instance securely is where Google’s tooling comes in, and it offers three distinct paths.
The Cloud SQL Auth Proxy is a standalone binary or sidecar. It opens a local Unix socket or TCP listener, and when your application connects to that local endpoint, the proxy establishes an mTLS tunnel to the instance using short-lived, automatically rotated certificates. It can also resolve IAM identity for cloudsql.iam_authentication. Crucially, it forwards every inbound local connection as a separate outbound connection to the instance. Ten clients on the local socket means ten backend processes on the instance.
The language connectors (the Java cloud-sql-jdbc-socket-factory, the Python cloud-sql-python-connector, and the Go cloud-sql-go-connector) do the same TLS/IAM work but in-process, as a socket factory or dialer plugged into your normal driver. There is no separate process. The Java connector plugs into the JDBC driver's socketFactory property, which you can set through HikariCP's data source properties; the Python connector returns a DB-API connection from a function you hand to SQLAlchemy's creator; the Go connector supplies a dialer that pgx or the MySQL driver uses to obtain its net.Conn. In all three cases the connector replaces the transport, and your existing pool (HikariCP, SQLAlchemy QueuePool, database/sql) sits on top unchanged. The connector still opens one instance connection per pool connection.
Direct IP skips Google’s tooling entirely. You connect to the instance’s public or private IP with standard TLS and password (or IAM) auth, managing certificates yourself or using sslmode=require. This is the only path with no connector overhead, but you lose automatic certificate rotation and the convenience of IAM-based connection authorization. Direct public IP also requires authorized-networks allowlisting on the instance, which is operationally brittle in autoscaling environments where egress IPs change; most teams use direct connection only over private IP within a VPC.
The choice between these paths is mostly an operational one, summarized below. None of them changes the pooling story — they differ only in transport, certificate management, and deployment shape.
| Path | TLS / cert rotation | IAM auth support | Deployment shape | When to choose |
|---|---|---|---|---|
| Language connector | automatic, in-process | yes | library in your app | greenfield Java/Python/Go services |
| Auth Proxy sidecar | automatic, separate process | yes | sidecar / local listener | polyglot, legacy drivers, no connector available |
| Direct private IP | manual / sslmode | yes (token) | no extra component | minimal moving parts inside a VPC |
The single most important sentence in this guide: the connector does not pool. It secures and routes a connection; it never multiplexes one backend process across multiple client sessions the way AWS RDS Proxy Connection Pooling or PgBouncer does. Therefore the entire burden of keeping the backend count sane falls on your client-side pool configuration.
There is a second mechanism that interacts with all three paths: IAM database authentication. Setting the cloudsql.iam_authentication instance flag and enableIamAuth/enable_iam_auth on the connector tells Cloud SQL to authenticate the connecting principal as an IAM user or service account, presenting a short-lived OAuth2 token instead of a static password. The connector fetches and refreshes that token automatically. This is purely an authentication change: it does not alter pooling, connection counts, or the tier ceiling. It does mean you must grant the principal roles/cloudsql.client (to connect through the connector) and roles/cloudsql.instanceUser (to log in to the database), and create a corresponding IAM database user. It also means tokens expire, so connections must be recycled before token expiry (another reason to keep maxLifetime/pool_recycle below an hour). Teams frequently enable IAM auth and then mistake token-expiry disconnects for pool exhaustion; the two are unrelated, and the diagnostics section below separates them.
Precision sizing & timeout orchestration
The instance ceiling is set by max_connections, which Cloud SQL derives from the machine tier's memory at creation. You can raise it with the max_connections database flag, but raising it beyond what the tier's RAM supports causes memory pressure and OOM risk, because each Postgres backend carries a baseline memory cost and can allocate up to work_mem for every sort or hash operation it runs. The defaults scale roughly with memory as follows (Postgres; verify your instance with SHOW max_connections):
| Machine tier (vCPU / memory) | Approx. default max_connections | Practical app-pool budget (after reserves + headroom) |
|---|---|---|
| Shared-core (1 vCPU / 0.6–1.7 GB) | 25 – 50 | 15 – 35 |
| 1 vCPU / 3.75 GB | ~100 | ~80 |
| 2 vCPU / 8 GB | ~200 | ~170 |
| 4 vCPU / 16 GB | ~400 | ~350 |
| 8 vCPU / 32 GB | ~800 | ~700 |
| 16 vCPU / 64 GB+ | ~1000+ (flag-capped) | ~900 |
Treat these as starting points; the authoritative value is whatever SHOW max_connections reports on your instance. The sizing rule is a fan-in budget:
```text
  Σ (per_replica_pool_size × replica_count)
+ superuser_reserved_connections
+ admin/migration/monitoring sessions
≤ instance max_connections × 0.85    (keep ~15% headroom)
```
Worked example: a 4 vCPU / 16 GB instance with max_connections = 400, running 12 application replicas. Reserve 3 superuser slots, budget 15 connections for migrations and a monitoring agent, and keep 15% headroom (60 slots). That leaves 400 − 3 − 15 − 60 = 322 for the app, and 322 / 12 ≈ 26.8, so each replica can hold at most 26 connections. Set each replica's pool to 20 (rounding down further for safety, leaving slack for autoscaling spikes). This is the same discipline used when Tuning Aurora Serverless v2 Connection Limits: the ceiling differs, the arithmetic does not.
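The arithmetic above can be sketched as a small helper. This is a minimal illustration, not part of any Cloud SQL tooling: the function name `per_replica_pool_budget` and its parameters are hypothetical, and the defaults simply mirror the worked example (3 reserved slots, 15 admin sessions, 15% headroom).

```python
import math


def per_replica_pool_budget(
    max_connections: int,
    replicas: int,
    superuser_reserved: int = 3,
    admin_sessions: int = 15,
    headroom_percent: int = 15,
) -> int:
    """Return the largest per-replica pool size that fits the fan-in budget."""
    if replicas <= 0:
        raise ValueError("replicas must be positive")
    # Round headroom up so the budget errs on the safe side.
    headroom = math.ceil(max_connections * headroom_percent / 100)
    app_budget = max_connections - superuser_reserved - admin_sessions - headroom
    if app_budget <= 0:
        raise ValueError("no connections left for the application")
    # Floor division: a fractional connection cannot be opened.
    return app_budget // replicas


# The worked example: 400 - 3 - 15 - 60 = 322, shared by 12 replicas.
upper_bound = per_replica_pool_budget(max_connections=400, replicas=12)
```

The result is an upper bound, not a target; choosing a smaller pool (20 in the example) keeps slack for replicas added by autoscaling.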
Two ceilings deserve emphasis when sizing. First, the shared-core tiers are dangerously easy to exhaust: a single replica with a default framework pool of 10–20 connections can saturate a shared-core instance by itself, and these tiers are common in staging environments where the symptom then surfaces under load tests rather than in production. Second, read replicas have their own independent max_connections ceiling — routing read traffic to a replica does not borrow from the primary’s budget, but it does mean you must run the fan-in arithmetic separately for the replica fleet. Cloud SQL high-availability failover swaps the standby in under the same instance name, so the ceiling is unchanged across a failover, but the brief reconnect storm during failover is exactly why the 15% headroom matters: every replica’s pool re-establishes its connections at once.
Timeouts must be orchestrated around the connector hop. Because the connector adds a TLS handshake on each new backend connection, set connect timeouts generously (5–10 s) but keep acquisition timeouts tight so a saturated pool fails fast rather than queueing threads. Set maxLifetime/pool_recycle below any idle disconnect to force rotation onto fresh certificates, and — when IAM auth is enabled — below the token lifetime so connections rotate before their auth token expires.
| Parameter (cross-driver) | Cloud SQL guidance | Reason |
|---|---|---|
| pool size (maximumPoolSize, pool_size, SetMaxOpenConns) | derive from fan-in budget above | one backend per pooled conn |
| connect/socket timeout | 5,000 – 10,000 ms | covers mTLS handshake via connector |
| acquisition timeout (connectionTimeout, pool_timeout) | 2,000 – 5,000 ms | fail fast on exhaustion, surface backpressure |
| max lifetime (maxLifetime, pool_recycle) | 1,200,000 – 1,800,000 ms | rotate before idle disconnects, refresh certs |
| idle timeout | 300,000 – 600,000 ms | release excess backends during quiet periods |
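The ordering constraints in the table can be checked mechanically before a config ships. The sketch below is one hypothetical way to do that: `PoolTimeouts` and `check_timeouts` are illustrative names, the thresholds are taken from the table, and the one-hour token lifetime is an assumption based on the "below an hour" guidance earlier in this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolTimeouts:
    connect_timeout_ms: int
    acquisition_timeout_ms: int
    max_lifetime_ms: int
    idle_timeout_ms: int


def check_timeouts(t: PoolTimeouts, token_lifetime_ms: int = 3_600_000) -> list[str]:
    """Return warnings; an empty list means the config matches the guidance."""
    warnings = []
    if t.connect_timeout_ms < 5_000:
        warnings.append("connect timeout may not cover the mTLS handshake")
    if t.acquisition_timeout_ms > 5_000:
        warnings.append("acquisition timeout too loose to fail fast")
    if t.max_lifetime_ms >= token_lifetime_ms:
        warnings.append("max lifetime must stay below the IAM token lifetime")
    if t.idle_timeout_ms >= t.max_lifetime_ms:
        warnings.append("idle timeout should be shorter than max lifetime")
    return warnings


recommended = PoolTimeouts(
    connect_timeout_ms=5_000,
    acquisition_timeout_ms=3_000,
    max_lifetime_ms=1_500_000,
    idle_timeout_ms=600_000,
)
```

Running a check like this in CI catches the common mistake of copying a 30-second acquisition timeout from a framework default.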
Production configuration examples
Python connector + SQLAlchemy (used heavily by FastAPI services). The connector provides the creator; the SQLAlchemy pool does the actual pooling. See FastAPI SQLAlchemy Pool Configuration for request-scoping the resulting sessions.
```python
from google.cloud.sql.connector import Connector, IPTypes
import sqlalchemy

connector = Connector(ip_type=IPTypes.PRIVATE)  # private IP, no public egress


def getconn():
    # The connector handles mTLS + IAM. It does NOT pool; SQLAlchemy below does.
    return connector.connect(
        "project:region:instance",
        "pg8000",
        user="svc@project.iam",  # IAM principal (cloudsql.iam_authentication)
        db="appdb",
        enable_iam_auth=True,
    )


engine = sqlalchemy.create_engine(
    "postgresql+pg8000://",
    creator=getconn,
    pool_size=20,        # per-replica budget from the fan-in formula
    max_overflow=0,      # hard cap: never exceed pool_size against the tier ceiling
    pool_timeout=5,      # seconds; fail fast when the 20 are in use
    pool_recycle=1500,   # seconds; rotate connections (and certs) before idle cutoff
    pool_pre_ping=True,  # validate on checkout; stale connections are replaced via creator
)
```
max_overflow=0 is deliberate: overflow connections are exactly how a replica fleet silently blows past max_connections during a traffic spike. A hard cap keeps the per-replica contribution to the fan-in budget predictable.
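To make the overflow risk concrete, the sketch below computes the worst-case backend count for a fleet. The helper name `worst_case_backends` is hypothetical; the figures reuse the earlier worked example (12 replicas, a 322-connection app budget) and SQLAlchemy's default max_overflow of 10.

```python
def worst_case_backends(pool_size: int, max_overflow: int, replicas: int) -> int:
    """Each replica can open up to pool_size + max_overflow backends under load."""
    return (pool_size + max_overflow) * replicas


APP_BUDGET = 322  # from the worked example: 400 - 3 - 15 - 60

with_default_overflow = worst_case_backends(pool_size=20, max_overflow=10, replicas=12)
with_hard_cap = worst_case_backends(pool_size=20, max_overflow=0, replicas=12)

# 360 backends exceeds the 322 budget; 240 stays inside it.
overflow_breaks_budget = with_default_overflow > APP_BUDGET
hard_cap_fits_budget = with_hard_cap <= APP_BUDGET
```

Leaving max_overflow at its default turns a pool that looks safely sized at 20 into one that can consume 30 slots per replica exactly when traffic peaks.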
Java connector + HikariCP. The connector is wired in as a socket factory; HikariCP remains the pool. The sizing principles mirror the HikariCP Configuration Deep Dive.
```properties
spring.datasource.url=jdbc:postgresql:///appdb
spring.datasource.username=svc@project.iam
spring.datasource.hikari.data-source-properties.socketFactory=com.google.cloud.sql.postgres.SocketFactory
spring.datasource.hikari.data-source-properties.cloudSqlInstance=project:region:instance
spring.datasource.hikari.data-source-properties.enableIamAuth=true
spring.datasource.hikari.data-source-properties.ipTypes=PRIVATE
spring.datasource.hikari.maximum-pool-size=20
spring.datasource.hikari.connection-timeout=5000
spring.datasource.hikari.max-lifetime=1500000
```
Auth Proxy sidecar. When you cannot embed a connector (legacy driver, or a polyglot deployment), run the proxy as a sidecar and point the app at 127.0.0.1. The proxy’s own --max-connections flag is a safety valve — see Configuring Cloud SQL Auth Proxy Connection Limits for sizing it correctly.
```yaml
- name: cloud-sql-proxy
  image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:latest
  args:
    - "--private-ip"
    - "--auto-iam-authn"
    - "--max-connections=25"   # ceiling on inbound conns this proxy forwards
    - "project:region:instance"
```
Diagnostics & telemetry
Watch the instance-side connection count first; it is the only number that matches the ceiling. On Postgres:
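```sql
-- Total in-use connections against the instance ceiling
SELECT count(*) AS in_use,
       current_setting('max_connections')::int AS ceiling
FROM pg_stat_activity;

-- Which users and applications hold the connections
SELECT usename, application_name, count(*)
FROM pg_stat_activity
GROUP BY usename, application_name
ORDER BY 3 DESC;
```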
In Cloud Monitoring, the metric cloudsql.googleapis.com/database/postgresql/num_backends (or database/network/connections) plotted against max_connections is your saturation signal. Alert when in-use exceeds 80% of the ceiling. On the client side, expose pool metrics: HikariCP’s ActiveConnections/PendingThreads via JMX, SQLAlchemy’s engine.pool.status(), or db.Stats() in Go (InUse, WaitCount, WaitDuration). A rising WaitCount/PendingThreads with the instance count pinned at the ceiling is unambiguous tier exhaustion — raise the tier or add a real pooler, not the per-replica pool size.
The Auth Proxy emits its own structured logs; a spike in refused inbound connections there indicates you hit the proxy’s --max-connections valve before the instance ceiling, which is by design.
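The signals described in this section combine into a simple triage rule. The following is a hedged sketch rather than a monitoring integration: `classify_saturation` and its return labels are hypothetical, the 80% threshold comes from the alerting guidance above, and treating "waiters while the instance has room" as a client-pool problem is an interpretation, not something Cloud SQL reports.

```python
def classify_saturation(
    in_use: int,
    ceiling: int,
    pool_waiters: int,
    proxy_refusals: int = 0,
    alert_fraction: float = 0.80,
) -> str:
    """Map instance, client-pool, and proxy signals to a likely cause."""
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    near_ceiling = in_use >= alert_fraction * ceiling
    if proxy_refusals > 0 and not near_ceiling:
        # The proxy's --max-connections valve tripped before the instance filled.
        return "proxy_limit"
    if pool_waiters > 0 and near_ceiling:
        # Clients are queueing and the instance has no room left to give.
        return "tier_exhaustion"
    if pool_waiters > 0:
        # Clients are queueing, but the instance still has free slots.
        return "client_pool_saturated"
    if near_ceiling:
        return "approaching_ceiling"
    return "healthy"
```

Only the "tier_exhaustion" outcome calls for a larger tier or a real pooler; "client_pool_saturated" usually points at slow queries or an undersized per-replica pool instead.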
Integration & proxy compatibility
Because the connector does not multiplex, the way to serve more application connections than the tier allows is to put a real pooler between the application and the connector. The common topology is PgBouncer in transaction mode as a sidecar, with PgBouncer connecting outward through the Auth Proxy:
```text
app pool (small) → PgBouncer (transaction mode) → Auth Proxy (mTLS/IAM) → Cloud SQL
```
Here PgBouncer collapses hundreds of short-lived app connections onto a small set of backend processes, and the Auth Proxy just secures PgBouncer's outbound link. This is the Cloud SQL equivalent of the trade-offs covered in PgBouncer Transaction vs Statement Pooling: transaction mode breaks session-level features (advisory locks, SET, plain server-side prepared statements) but maximizes backend reuse. If your workload is session-heavy, you are stuck with the 1:1 model and must scale the tier.
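A rough comparison shows why the sidecar topology helps. This sketch assumes one PgBouncer sidecar per application replica and counts only PgBouncer's `default_pool_size` (its per database/user limit on server connections); it ignores `reserve_pool_size`. The function names and the example figures are hypothetical.

```python
def backends_without_pooler(replicas: int, app_pool_size: int) -> int:
    """1:1 model: every pooled client connection is a backend process."""
    return replicas * app_pool_size


def backends_with_pgbouncer(
    sidecars: int, default_pool_size: int, db_user_pairs: int = 1
) -> int:
    """Transaction mode: backends are capped per sidecar and per database/user pair."""
    return sidecars * default_pool_size * db_user_pairs


# 12 replicas that each want 100 client connections.
direct = backends_without_pooler(replicas=12, app_pool_size=100)
pooled = backends_with_pgbouncer(sidecars=12, default_pool_size=20)
```

The pooled figure is what belongs in the fan-in budget: with PgBouncer in the path, the backend count is set by the sidecar's pool size, not by the application's.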
Cloud SQL has no direct managed equivalent to RDS Proxy sitting in front of every instance. If you want a multiplexing layer on GCP, the options are self-hosted PgBouncer or Cloud SQL's built-in connection pooling feature where it is available for your instance; otherwise the connector-plus-client-pool pattern is the supported baseline. The practical consequence: on AWS you can often offload pool sizing to the proxy, but on Cloud SQL the discipline of computing the fan-in budget and capping per-replica pools is non-negotiable, because nothing else does it for you.
The other compatibility note concerns prepared statements and transaction-mode pooling. If you do introduce PgBouncer in transaction mode in front of the Auth Proxy, the Java and Python drivers must be configured to avoid server-side named prepared statements (or use protocol-level handling that PgBouncer supports), exactly as in any PgBouncer transaction deployment. The connector is transparent to this: it neither helps nor hinders prepared statement behavior, since it operates purely at the transport layer. Plan the PgBouncer interaction the same way you would for self-hosted Postgres.
Common failure patterns & remediation
| Symptom | Root cause | Exact fix | Validation command |
|---|---|---|---|
| FATAL: remaining connection slots are reserved | Σ(pool×replicas) exceeds max_connections | Reduce per-replica pool or raise tier; cap max_overflow=0 | SELECT count(*) FROM pg_stat_activity; |
| Connections climb after each deploy | overlapping replicas during rolling update double the pool count | Budget for maxSurge; lower pool size to absorb the surge window | Watch num_backends during rollout |
| Intermittent connection drops at idle | maxLifetime/pool_recycle above the idle/cert cutoff | Set recycle to 1,200–1,800 s; enable pool_pre_ping | SELECT state, count(*) FROM pg_stat_activity GROUP BY 1; |
| Proxy refuses new connections, instance not full | proxy --max-connections reached first | Raise the proxy flag or add proxy replicas | proxy logs: “max connections reached” |
| enableIamAuth fails with permission error | principal lacks cloudsql.instances.connect or cloudsql.instances.login (missing Cloud SQL IAM role) | Grant roles/cloudsql.client and roles/cloudsql.instanceUser + IAM DB user | gcloud sql users list --instance=INSTANCE |
| High latency on first query per connection | mTLS handshake counted in acquisition timeout | Raise connect timeout, keep acquisition timeout tight, pre-warm minimum-idle | client pool creation-time metric |
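The rolling-update row deserves its own arithmetic, because old and new replicas hold pools at the same time. The sketch below is illustrative only: the function names are hypothetical, the 25% default mirrors the Kubernetes Deployment default for maxSurge (which rounds up), and the 322-connection budget is the one from the earlier worked example.

```python
import math


def peak_replicas(replicas: int, max_surge_percent: int = 25) -> int:
    """Replicas that can exist at once during a rolling update."""
    return replicas + math.ceil(replicas * max_surge_percent / 100)


def surge_safe_pool_size(
    app_budget: int, replicas: int, max_surge_percent: int = 25
) -> int:
    """Largest per-replica pool that still fits the budget mid-rollout."""
    return app_budget // peak_replicas(replicas, max_surge_percent)


# 12 replicas with 25% surge peak at 15 pods; a 100% surge doubles the fleet.
default_surge_pool = surge_safe_pool_size(app_budget=322, replicas=12)
full_surge_pool = surge_safe_pool_size(app_budget=322, replicas=12, max_surge_percent=100)
```

Under these assumptions the steady-state limit of 26 per replica drops to 21 with the default surge and to 13 if the rollout doubles the fleet, which is why a pool of 20 survives a default rollout but not a blue/green style cutover.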
Related
- Cloud Database Connection Management: the parent overview of managed-database connection limits across providers.
- Configuring Cloud SQL Auth Proxy Connection Limits: sizing --max-connections and the tier ceiling for the Auth Proxy.
- AWS RDS Proxy Connection Pooling: the managed multiplexing layer Cloud SQL lacks, for contrast.
- FastAPI SQLAlchemy Pool Configuration: sizing the client pool that sits on top of the connector.