GCP Cloud SQL Connection Pooling

This guide is part of Cloud Database Connection Management, and it focuses on the connection topology that trips up most teams the first time they run Postgres or MySQL on Google Cloud: there is a hard max_connections ceiling baked into the machine tier, the official connectors secure the link but do not pool it, and the only thing standing between your workload and FATAL: remaining connection slots are reserved is the client-side pool you size yourself. Cloud SQL is a managed instance, not an elastic connection fabric — every application replica still opens real backend processes against a finite instance, and getting the math right between your pool count and the tier ceiling is the whole game.

The confusion comes from the word “proxy.” The Cloud SQL Auth Proxy and the language connectors look like RDS Proxy or PgBouncer, but they are transport security layers, not multiplexers. They establish an mTLS tunnel and resolve IAM identity; they pass every client connection straight through to the instance one-for-one. If you treat the connector as a pooler, you will exhaust the tier and never understand why. This guide separates the two roles cleanly.

The connector secures and routes; the instance tier caps the total. Your client pool is the only knob that controls fan-in.

Key operational takeaways:

The Cloud SQL Auth Proxy and language connectors handle mTLS and IAM but pass connections 1:1 — they are not poolers, so you always need a client pool.
max_connections is governed by the machine tier (memory) and is the true ceiling; size Σ(pool_size × replicas) to stay under it with headroom.
Enable cloudsql.iam_authentication to authenticate with IAM principals instead of static passwords; it changes auth, not pooling.
Private IP removes public-internet exposure and egress, but the connection-count math is identical to public IP.
To collapse many app connections onto few backends, run PgBouncer (transaction mode) as a sidecar in front of the Auth Proxy; the connector does not do this for you.

Foundational mechanics

A Cloud SQL instance is a managed Postgres or MySQL server running on a fixed machine type. Every connection a client opens becomes a real backend process (a postgres backend or MySQL thread) holding memory. The server enforces a max_connections limit, and the instance reserves a handful of slots for superusers and internal maintenance (superuser_reserved_connections on Postgres). When the in-use count plus reserved slots reaches the limit, the next Postgres client receives FATAL: remaining connection slots are reserved for non-replication superuser connections (or FATAL: sorry, too many clients already once every slot is taken), and the next MySQL client receives ERROR 1040: Too many connections. This is identical to self-managed Postgres or MySQL; the managed wrapper adds no elasticity.

Connecting to that instance securely is where Google’s tooling comes in, and it offers three distinct paths.

The Cloud SQL Auth Proxy is a standalone binary or sidecar. It opens a local Unix socket or TCP listener, and when your application connects to that local endpoint, the proxy establishes an mTLS tunnel to the instance using short-lived, automatically rotated certificates. It can also resolve IAM identity for cloudsql.iam_authentication. Crucially, it forwards every inbound local connection as a separate outbound connection to the instance. Ten clients on the local socket means ten backend processes on the instance.

The language connectors (the Java cloud-sql-jdbc-socket-factory, the Python cloud-sql-python-connector, and the Go cloud-sql-go-connector) do the same TLS/IAM work but in-process, as a socket factory or dialer plugged into your normal driver. There is no separate process. The Java connector plugs into the JDBC driver through its socketFactory property, which HikariCP passes along as a data source property; the Python connector returns a DB-API connection you hand to SQLAlchemy’s creator; the Go connector supplies a dial function returning a net.Conn that you register with pgx or the MySQL driver. In all three cases the connector replaces the transport, and your existing pool (HikariCP, SQLAlchemy QueuePool, database/sql) sits on top unchanged. The connector still opens one instance connection per pool connection.

Direct IP skips Google’s tooling entirely. You connect to the instance’s public or private IP with standard TLS and password (or IAM) auth, managing certificates yourself or using sslmode=require. This is the only path with no connector overhead, but you lose automatic certificate rotation and the convenience of IAM-based connection authorization. Direct public IP also requires authorized-networks allowlisting on the instance, which is operationally brittle in autoscaling environments where egress IPs change; most teams use direct connection only over private IP within a VPC.

The choice between these paths is mostly an operational one, summarized below. None of them changes the pooling story — they differ only in transport, certificate management, and deployment shape.

Path	TLS / cert rotation	IAM auth support	Deployment shape	When to choose
Language connector	automatic, in-process	yes	library in your app	greenfield Java/Python/Go services
Auth Proxy sidecar	automatic, separate process	yes	sidecar / local listener	polyglot, legacy drivers, no connector available
Direct private IP	manual / `sslmode`	yes (token)	no extra component	minimal moving parts inside a VPC

The single most important sentence in this guide: the connector does not pool. It secures and routes a connection; it never multiplexes one backend process across multiple client sessions the way AWS RDS Proxy Connection Pooling or PgBouncer does. Therefore the entire burden of keeping the backend count sane falls on your client-side pool configuration.

There is a second mechanism that interacts with all three paths: IAM database authentication. Setting the cloudsql.iam_authentication instance flag and enableIamAuth/enable_iam_auth on the connector tells Cloud SQL to authenticate the connecting principal as an IAM user or service account, exchanging a short-lived OAuth2 token instead of a static password. The connector fetches and refreshes that token automatically. This is purely an authentication change: it does not alter pooling, connection counts, or the tier ceiling. It does mean you must grant the principal the roles/cloudsql.client role (to connect through the connector) and the roles/cloudsql.instanceUser role (to log in with IAM), and create a corresponding IAM database user. It also means tokens expire, so connections must be recycled before token expiry (another reason to keep maxLifetime/pool_recycle below an hour). Teams frequently enable IAM auth and then mistake token-expiry disconnects for pool exhaustion; the two are unrelated, and the diagnostics section below separates them.

Precision sizing & timeout orchestration

The instance ceiling is set by max_connections, which Cloud SQL derives from the machine tier’s memory at creation. You can raise it with the max_connections database flag, but raising it beyond what the tier’s RAM supports causes per-connection memory pressure and OOM risk, because each Postgres backend holds its own memory and can allocate up to work_mem for every sort or hash operation it runs. The defaults scale roughly with memory as follows (Postgres; verify your instance with SHOW max_connections):

Machine tier (vCPU / memory)	Approx. default `max_connections`	Practical app-pool budget (after reserves + headroom)
Shared-core (1 vCPU / 0.6–1.7 GB)	25 – 50	15 – 35
1 vCPU / 3.75 GB	~100	~80
2 vCPU / 8 GB	~200	~170
4 vCPU / 16 GB	~400	~350
8 vCPU / 32 GB	~800	~700
16 vCPU / 64 GB+	~1000+ (flag-capped)	~900

Treat these as starting points; the authoritative value is whatever SHOW max_connections reports on your instance. The sizing rule is a fan-in budget:

Σ (per_replica_pool_size × replica_count)
  + superuser_reserved_connections
  + admin/migration/monitoring sessions
  ≤ instance max_connections × 0.85   (keep ~15% headroom)

Worked example: a 4 vCPU / 16 GB instance with max_connections = 400, running 12 application replicas. Reserve 3 superuser slots, budget 15 connections for migrations and a monitoring agent, and keep 15% headroom (60 slots). That leaves 400 − 3 − 15 − 60 = 322 for the app, and 322 / 12 ≈ 26.8, so at most 26 connections per replica. Set each replica’s pool to 20 (round down for safety, leaving slack for autoscaling spikes). This is the same discipline used when Tuning Aurora Serverless v2 Connection Limits: the ceiling differs, the arithmetic does not.
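The fan-in budget above can be sketched as a small helper. This is a minimal illustration, not an official tool: the function name and its parameter names are hypothetical, and the defaults (3 reserved slots, 15 admin sessions, 15% headroom) are simply the values from the worked example.

```python
def per_replica_pool_budget(max_connections, replica_count,
                            superuser_reserved=3, admin_sessions=15,
                            headroom_pct=15):
    """Largest per-replica pool size that fits the fan-in budget.

    All names and defaults are illustrative; take max_connections from
    SHOW max_connections on the real instance.
    """
    if replica_count < 1:
        raise ValueError("replica_count must be at least 1")
    # Round the headroom up to whole slots, using integer math only.
    headroom_slots = (max_connections * headroom_pct + 99) // 100
    app_budget = (max_connections - superuser_reserved
                  - admin_sessions - headroom_slots)
    if app_budget < replica_count:
        raise ValueError("tier too small: under one connection per replica")
    # Floor division: rounding up would overshoot the budget.
    return app_budget // replica_count


if __name__ == "__main__":
    # The worked example: 400-connection instance, 12 replicas.
    print(per_replica_pool_budget(400, 12))
```

Treat the result as an upper bound; the worked example deliberately configures a smaller pool than the computed maximum to leave slack for autoscaling.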

Two ceilings deserve emphasis when sizing. First, the shared-core tiers are dangerously easy to exhaust: a single replica with a default framework pool of 10–20 connections can saturate a shared-core instance by itself, and these tiers are common in staging environments where the symptom then surfaces under load tests rather than in production. Second, read replicas have their own independent max_connections ceiling — routing read traffic to a replica does not borrow from the primary’s budget, but it does mean you must run the fan-in arithmetic separately for the replica fleet. Cloud SQL high-availability failover swaps the standby in under the same instance name, so the ceiling is unchanged across a failover, but the brief reconnect storm during failover is exactly why the 15% headroom matters: every replica’s pool re-establishes its connections at once.

Timeouts must be orchestrated around the connector hop. Because the connector adds a TLS handshake on each new backend connection, set connect timeouts generously (5–10 s) but keep acquisition timeouts tight so a saturated pool fails fast rather than queueing threads. Set maxLifetime/pool_recycle below any idle disconnect to force rotation onto fresh certificates, and — when IAM auth is enabled — below the token lifetime so connections rotate before their auth token expires.

Parameter (cross-driver)	Cloud SQL guidance	Reason
pool size (`maximumPoolSize`, `pool_size`, `SetMaxOpenConns`)	derive from fan-in budget above	one backend per pooled conn
connect/socket timeout	5,000 – 10,000 ms	covers mTLS handshake via connector
acquisition timeout (`connectionTimeout`, `pool_timeout`)	2,000 – 5,000 ms	fail fast on exhaustion, surface backpressure
max lifetime (`maxLifetime`, `pool_recycle`)	1,200,000 – 1,800,000 ms	rotate before idle disconnects, refresh certs
idle timeout	300,000 – 600,000 ms	release excess backends during quiet periods
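As a rough sketch, the relationships in the table can be checked before a configuration ships. The function and parameter names below are hypothetical, the numeric ranges are the ones from the table (converted to seconds), the one-hour token lifetime is an assumption based on the "below an hour" guidance in this guide, and the rule that idle timeout should sit below max lifetime is an added assumption rather than something Cloud SQL enforces.

```python
def check_pool_timeouts(connect_s, acquire_s, recycle_s, idle_s,
                        token_lifetime_s=3600):
    """Return a list of problems; an empty list means the values agree
    with the guidance table. Ranges and names are illustrative."""
    problems = []
    if not 5 <= connect_s <= 10:
        problems.append("connect timeout outside the 5-10 s range")
    if not 2 <= acquire_s <= 5:
        problems.append("acquisition timeout outside the 2-5 s range")
    if recycle_s >= token_lifetime_s:
        # Assumed token lifetime; connections should rotate before it.
        problems.append("max lifetime not below the IAM token lifetime")
    if idle_s >= recycle_s:
        # Assumption: idle connections should be released before the
        # lifetime-based recycle would have retired them anyway.
        problems.append("idle timeout not below max lifetime")
    return problems


if __name__ == "__main__":
    print(check_pool_timeouts(connect_s=10, acquire_s=5,
                              recycle_s=1500, idle_s=600))
    print(check_pool_timeouts(connect_s=10, acquire_s=30,
                              recycle_s=7200, idle_s=600))
```

The first call uses values inside every range and reports no problems; the second flags the loose acquisition timeout and the over-long lifetime.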

Production configuration examples

Python connector + SQLAlchemy (used heavily by FastAPI services). The connector provides the creator; the SQLAlchemy pool does the actual pooling. See FastAPI SQLAlchemy Pool Configuration for request-scoping the resulting sessions.

from google.cloud.sql.connector import Connector, IPTypes
import sqlalchemy

connector = Connector(ip_type=IPTypes.PRIVATE)  # private IP, no public egress

def getconn():
    # The connector handles mTLS + IAM. It does NOT pool — SQLAlchemy below does.
    return connector.connect(
        "project:region:instance",
        "pg8000",
        user="svc@project.iam",          # IAM principal (cloudsql.iam_authentication)
        db="appdb",
        enable_iam_auth=True,
    )

engine = sqlalchemy.create_engine(
    "postgresql+pg8000://",
    creator=getconn,
    pool_size=20,            # per-replica budget from the fan-in formula
    max_overflow=0,          # hard cap: never exceed pool_size against the tier ceiling
    pool_timeout=5,          # fail fast when the 20 are in use
    pool_recycle=1500,       # rotate connections (and certs) before idle cutoff
    pool_pre_ping=True,      # validate on checkout; SQLAlchemy replaces stale connections
)

max_overflow=0 is deliberate: overflow connections are exactly how a replica fleet silently blows past max_connections during a traffic spike. A hard cap keeps the per-replica contribution to the fan-in budget predictable.

Java connector + HikariCP. The connector is wired in as a socket factory; HikariCP remains the pool. The sizing principles mirror the HikariCP Configuration Deep Dive.

spring.datasource.url=jdbc:postgresql:///appdb
spring.datasource.username=svc@project.iam
spring.datasource.hikari.data-source-properties.socketFactory=com.google.cloud.sql.postgres.SocketFactory
spring.datasource.hikari.data-source-properties.cloudSqlInstance=project:region:instance
spring.datasource.hikari.data-source-properties.enableIamAuth=true
spring.datasource.hikari.data-source-properties.ipTypes=PRIVATE
spring.datasource.hikari.maximum-pool-size=20
spring.datasource.hikari.connection-timeout=5000
spring.datasource.hikari.max-lifetime=1500000

Auth Proxy sidecar. When you cannot embed a connector (legacy driver, or a polyglot deployment), run the proxy as a sidecar and point the app at 127.0.0.1. The proxy’s own --max-connections flag is a safety valve — see Configuring Cloud SQL Auth Proxy Connection Limits for sizing it correctly.

- name: cloud-sql-proxy
  image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:latest  # pin a specific version tag in production
  args:
    - "--private-ip"
    - "--auto-iam-authn"
    - "--max-connections=25"          # ceiling on inbound conns this proxy forwards
    - "project:region:instance"

Diagnostics & telemetry

Watch the instance-side connection count first; it is the only number that matches the ceiling. On Postgres:

SELECT count(*) AS in_use, current_setting('max_connections')::int AS ceiling
FROM pg_stat_activity;

SELECT usename, application_name, count(*)
FROM pg_stat_activity
GROUP BY usename, application_name
ORDER BY 3 DESC;

In Cloud Monitoring, the metric cloudsql.googleapis.com/database/postgresql/num_backends (or database/network/connections) plotted against max_connections is your saturation signal. Alert when in-use exceeds 80% of the ceiling. On the client side, expose pool metrics: HikariCP’s ActiveConnections/PendingThreads via JMX, SQLAlchemy’s engine.pool.status(), or db.Stats() in Go (InUse, WaitCount, WaitDuration). A rising WaitCount/PendingThreads with the instance count pinned at the ceiling is unambiguous tier exhaustion — raise the tier or add a real pooler, not the per-replica pool size.
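The decision logic in the paragraph above can be sketched as a pure function that combines the instance-side count with a client-side waiter count. The function name, the label strings, and the "local-pool-saturated" branch are illustrative assumptions; the 80% threshold is the alert level suggested above.

```python
def classify_saturation(in_use, ceiling, pool_waiters, alert_pct=80):
    """Classify a sample of (instance connections, ceiling, pool waiters).

    in_use and ceiling come from the instance (pg_stat_activity or
    num_backends); pool_waiters is a client metric such as HikariCP
    PendingThreads. Labels are hypothetical.
    """
    if ceiling <= 0:
        raise ValueError("ceiling must be positive")
    # Integer comparison avoids float rounding at the threshold.
    near_ceiling = in_use * 100 > ceiling * alert_pct
    if near_ceiling and pool_waiters > 0:
        return "tier-exhaustion"        # raise the tier or add a pooler
    if near_ceiling:
        return "approaching-ceiling"    # alert, nothing queued yet
    if pool_waiters > 0:
        # Assumption: waiters with spare instance capacity point at the
        # local pool (too small, or held by slow queries).
        return "local-pool-saturated"
    return "ok"


if __name__ == "__main__":
    print(classify_saturation(in_use=390, ceiling=400, pool_waiters=12))
```

Keeping the instance-side and client-side signals separate is the point: only the first branch justifies a larger tier or a real pooler.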

The Auth Proxy emits its own structured logs; a spike in refused inbound connections there indicates you hit the proxy’s --max-connections valve before the instance ceiling, which is by design.

Integration & proxy compatibility

Because the connector does not multiplex, the way to serve more application connections than the tier allows is to put a real pooler in front of the connector, between the application and the Auth Proxy. The common topology is PgBouncer in transaction mode as a sidecar, with PgBouncer connecting outward through the Auth Proxy:

app pool (small) → PgBouncer (transaction mode) → Auth Proxy (mTLS/IAM) → Cloud SQL

Here PgBouncer collapses hundreds of short-lived app connections onto a small set of backend processes, and the connector just secures PgBouncer’s outbound link. This is the Cloud SQL equivalent of the trade-offs covered in PgBouncer Transaction vs Statement Pooling — transaction mode breaks session-level features (advisory locks, SET, plain server-side prepared statements) but maximizes backend reuse. If your workload is session-heavy, you are stuck with the 1:1 model and must scale the tier.
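To make the multiplexing effect concrete, here is a minimal arithmetic sketch, assuming one PgBouncer sidecar per pod. The function and parameter names are hypothetical; default_pool_size refers to the PgBouncer setting of that name, which caps server connections per database/user pair.

```python
def pgbouncer_fan_in(pods, client_conns_per_pod, default_pool_size,
                     db_user_pairs=1):
    """Compare client-side and backend-side connection counts.

    Assumes one PgBouncer sidecar per pod, each holding at most
    default_pool_size server connections per (database, user) pair.
    """
    client_conns = pods * client_conns_per_pod
    backend_conns = pods * default_pool_size * db_user_pairs
    return {
        "client_conns": client_conns,
        "backend_conns": backend_conns,
        "ratio": client_conns / backend_conns,
    }


if __name__ == "__main__":
    # 12 pods, 100 app connections each, 10 server connections per sidecar.
    print(pgbouncer_fan_in(pods=12, client_conns_per_pod=100,
                           default_pool_size=10))
```

In this illustration 1,200 client connections land on 120 backend processes. The backend figure is the one that must still fit the fan-in budget, because the sidecars themselves connect to the instance one-for-one.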

Cloud SQL has no managed equivalent to RDS Proxy. If you want a multiplexing layer on GCP, the closest options are self-hosted PgBouncer (which you operate yourself) or Cloud SQL’s built-in connection pooling preview where available; otherwise the connector-plus-client-pool pattern is the supported baseline. The practical consequence: on AWS you can often offload pool sizing to the proxy, but on Cloud SQL the discipline of computing the fan-in budget and capping per-replica pools is non-negotiable, because nothing else does it for you.

The other compatibility note concerns prepared statements and transaction-mode pooling. If you do introduce PgBouncer in transaction mode in front of the connector, the Java and Python drivers must be configured to avoid server-side named prepared statements (or use protocol-level handling that PgBouncer supports), exactly as in any PgBouncer transaction deployment. The connector is transparent to this: it neither helps nor hinders prepared statement behavior, since it operates purely at the transport layer. Plan the PgBouncer interaction the same way you would for self-hosted Postgres.

Common failure patterns & remediation

Symptom	Root cause	Exact fix	Validation command
`FATAL: remaining connection slots are reserved`	`Σ(pool×replicas)` exceeds `max_connections`	Reduce per-replica pool or raise tier; cap `max_overflow=0`	`SELECT count(*) FROM pg_stat_activity;`
Connections climb after each deploy	overlapping replicas during rolling update double the pool count	Budget for `maxSurge`; lower pool size to absorb the surge window	Watch `num_backends` during rollout
Intermittent connection drops at idle	`maxLifetime`/`pool_recycle` above the idle/cert cutoff	Set recycle to 1,200–1,800 s; enable `pool_pre_ping`	`SELECT state, count(*) FROM pg_stat_activity GROUP BY 1;`
Proxy refuses new connections, instance not full	proxy `--max-connections` reached first	Raise the proxy flag or add proxy replicas	proxy logs: “max connections reached”
`enableIamAuth` fails with permission error	principal lacks `cloudsql.instances.connect` or `cloudsql.instances.login`	Grant `roles/cloudsql.client` and `roles/cloudsql.instanceUser`; create the IAM DB user	`gcloud sql users list --instance=INSTANCE`
High latency on first query per connection	mTLS handshake counted in acquisition timeout	Raise connect timeout, keep acquisition timeout tight, pre-warm `minimum-idle`	client pool creation-time metric
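The overflow and rolling-update rows above share one piece of arithmetic: the peak backend count is set by the largest number of pods alive at once and the largest pool each may open. A minimal sketch follows, with a hypothetical function name and the assumption that max_surge is an absolute count of extra pods rather than a percentage.

```python
def peak_backends(replicas, pool_size, max_overflow=0, max_surge=0):
    """Worst-case instance connections opened by the application fleet.

    Assumes every pod fills its pool (including overflow) and that
    max_surge extra pods overlap with the old ones during a rollout.
    """
    return (replicas + max_surge) * (pool_size + max_overflow)


if __name__ == "__main__":
    app_budget = 322  # from the worked sizing example in this guide
    for overflow in (0, 10):
        peak = peak_backends(replicas=12, pool_size=20,
                             max_overflow=overflow, max_surge=3)
        print(overflow, peak, "fits" if peak <= app_budget else "exceeds")
```

With 12 replicas, a pool of 20, and a surge of 3 pods, the peak is 300 and fits the 322-connection budget; allowing an overflow of 10 pushes it to 450, past the instance's 400-connection ceiling.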

Cloud Database Connection Management — the parent overview of managed-database connection limits across providers.
Configuring Cloud SQL Auth Proxy Connection Limits — sizing --max-connections and the tier ceiling for the Auth Proxy.
AWS RDS Proxy Connection Pooling — the managed multiplexing layer Cloud SQL lacks, for contrast.
FastAPI SQLAlchemy Pool Configuration — sizing the client pool that sits on top of the connector.