Why Patroni is the Right Choice for Cloud-Native PostgreSQL HA
Patroni uses a distributed configuration store (etcd, Consul, or ZooKeeper) for leader election and exposes a REST API for cluster management — making it the gold standard for automated PostgreSQL failover in Kubernetes and cloud environments.
What is Patroni?
Patroni is an open-source PostgreSQL high-availability template written in Python and maintained by Zalando. It wraps a PostgreSQL instance with a daemon that talks to a Distributed Configuration Store (etcd, Consul, ZooKeeper, or Kubernetes API) to elect a single primary, automatically promote a replica on primary failure, and prevent split-brain through DCS-backed locking.
First released in 2015, Patroni has become the de facto standard for PostgreSQL HA in cloud and Kubernetes environments. It underpins the Zalando Postgres Operator and Crunchy Data PGO, while CloudNativePG offers comparable HA without running the Patroni daemon. Compared to repmgr (its main alternative), Patroni makes automatic failover the default behavior rather than an optional extra daemon (repmgrd), and it decides who is primary through a DCS lock rather than node-to-node checks. Compared to Stolon, it has a larger community and operator ecosystem. Typical failover time is 30–45 seconds with default settings.
How Patroni Works
Patroni wraps PostgreSQL with a Python daemon that continuously communicates with a Distributed Configuration Store (DCS). The DCS holds the cluster state — which node is primary, the current timeline, and configuration. This eliminates split-brain by making leader election a distributed consensus problem.
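The lock-with-expiry idea behind DCS leader election can be sketched in a few lines. This is a toy, in-memory model and not Patroni's code: `ToyDCS`, `acquire_or_renew`, and the node names are hypothetical, and a real DCS performs the check-and-set atomically across a quorum of servers.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ToyDCS:
    """In-memory stand-in for a DCS: a single leader key with a TTL."""

    ttl: float = 30.0
    clock: Callable[[], float] = time.monotonic
    holder: Optional[str] = None
    expires_at: float = 0.0

    def leader(self) -> Optional[str]:
        """Return the current lock holder, or None once the lock has expired."""
        if self.holder is not None and self.clock() >= self.expires_at:
            self.holder = None  # TTL elapsed without a renewal
        return self.holder

    def acquire_or_renew(self, node: str) -> bool:
        """Take the lock if it is free, or extend it if `node` already holds it."""
        current = self.leader()
        if current is None or current == node:
            self.holder = node
            self.expires_at = self.clock() + self.ttl
            return True
        return False  # another node is leader; stay a replica


def demo() -> List[bool]:
    """Walk through a failover using a fake clock instead of real time."""
    now = [0.0]
    dcs = ToyDCS(ttl=30.0, clock=lambda: now[0])
    results = [dcs.acquire_or_renew("node-a")]      # lock is free: node-a leads
    results.append(dcs.acquire_or_renew("node-b"))  # lock held by node-a: refused
    now[0] = 31.0                                   # node-a stopped renewing
    results.append(dcs.acquire_or_renew("node-b"))  # lock expired: node-b leads
    return results
```

Because only the lock holder may run as primary, two nodes can never both believe they are the leader as long as the store itself stays consistent.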
DCS-backed Leader Election
Uses etcd, Consul, or ZooKeeper for distributed consensus. No more split-brain scenarios — only the node that holds the DCS lock is primary.
Automatic Failover in Seconds
When the primary fails, its leader lock expires after the configurable TTL (default 30s) and Patroni promotes the most up-to-date eligible replica. No manual intervention needed.
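How the timing settings interact can be made concrete with a little arithmetic. The sketch below uses Patroni's documented defaults (ttl 30, loop_wait 10, retry_timeout 10) and the documented guideline that loop_wait + 2 × retry_timeout should not exceed ttl. The `HaTimings` class and the upper-bound formula are illustrative assumptions: the estimate ignores promotion time and any faster reaction from replicas watching the leader key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HaTimings:
    """Hypothetical helper holding the three main Patroni timing settings."""

    ttl: int = 30            # seconds the leader lock survives without renewal
    loop_wait: int = 10      # seconds between HA loop iterations
    retry_timeout: int = 10  # seconds allowed for DCS / PostgreSQL retries

    def is_consistent(self) -> bool:
        """Check the guideline loop_wait + 2 * retry_timeout <= ttl."""
        return self.loop_wait + 2 * self.retry_timeout <= self.ttl

    def detection_upper_bound(self) -> int:
        """Rough worst case before a replica notices the expired lock.

        The lock can linger for up to `ttl` after the last renewal, and a
        replica may need up to one more `loop_wait` to wake up and see it.
        """
        return self.ttl + self.loop_wait


defaults = HaTimings()
aggressive = HaTimings(ttl=20, loop_wait=10, retry_timeout=10)
```

With the defaults, `defaults.detection_upper_bound()` gives 40 seconds, which is in line with the 30–45 second failover window quoted earlier. The `aggressive` profile fails `is_consistent()` because 10 + 2 × 10 exceeds a ttl of 20.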
REST API & patronictl CLI
Manage switchover, promote, reinitialize, and pause cluster operations via REST API or the patronictl CLI — integrates with Kubernetes operators.
Fencing & Watchdog
Hardware or software watchdog support (e.g. /dev/watchdog) resets a node whose Patroni process hangs or can no longer keep its leader lock, so a stale primary cannot keep accepting writes. This is critical for data safety.
Configuration Management
Patroni stores PostgreSQL configuration in DCS so all cluster members use consistent settings. No config drift between nodes.
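The effect of keeping one shared configuration document can be illustrated with a merge function. This is a sketch of the general idea (nested keys merged, a null value removing a key), written under the assumption that it approximates how a configuration patch behaves. `patch_config` is a hypothetical helper, not Patroni's implementation.

```python
from typing import Any, Dict


def patch_config(current: Dict[str, Any], patch: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict with `patch` merged into `current`.

    Nested dicts are merged recursively; a value of None removes the key.
    The input dicts are left unmodified.
    """
    merged = dict(current)
    for key, value in patch.items():
        if value is None:
            merged.pop(key, None)  # null means "drop this setting"
        elif isinstance(value, dict):
            existing = merged.get(key)
            base = existing if isinstance(existing, dict) else {}
            merged[key] = patch_config(base, value)
        else:
            merged[key] = value
    return merged


# Hypothetical cluster-wide settings as they might be stored in the DCS.
shared = {
    "loop_wait": 10,
    "postgresql": {"parameters": {"max_connections": 100, "work_mem": "4MB"}},
}
updated = patch_config(
    shared,
    {"postgresql": {"parameters": {"max_connections": 200, "work_mem": None}}},
)
```

Every member reads the same merged document, which is what keeps settings from drifting between nodes.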
Kubernetes Operator Ready
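Patroni is the HA backend of the Zalando Postgres Operator and Crunchy Data PGO, which makes it the most widely deployed HA layer for K8s-managed PostgreSQL. CloudNativePG implements comparable failover logic itself, without the Patroni daemon.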
Patroni vs repmgr vs Stolon
All three tools solve PostgreSQL HA, but with very different approaches. Patroni wins for cloud/Kubernetes environments. repmgr wins for simplicity. Stolon is designed mainly around Kubernetes and has a smaller community.
| Feature | Patroni ✦ | repmgr | Stolon |
|---|---|---|---|
| Automatic failover | | | |
| Kubernetes / cloud-native | | | |
| REST API for cluster management | | | |
| DCS support (etcd / Consul / ZooKeeper) | | | |
| Leader election via DCS | | | |
| Minimal operational overhead | | | |
| Barman / pgBackRest integration | | | |
| Active community & Zalando backing | | | |
| Switchover / promote via CLI | | | |
| Custom callbacks on state change | | | |
| Works without Kubernetes | | | |
When to Choose Patroni (vs Alternatives)
Choose Patroni when…
- You run PostgreSQL on Kubernetes or any container platform
- You need fully automatic failover with zero manual steps
- You want a REST API to integrate cluster management into CI/CD
- Your team already uses etcd or Consul for service discovery
- You need multi-datacenter HA with leader election across DCs
- You use the Zalando Postgres Operator or Crunchy PGO, both of which bundle Patroni
Choose repmgr when…
- You want minimal dependencies, with no etcd/Consul required
- You prefer manual switchover with a simple CLI
- Your PostgreSQL runs on bare-metal or simple VMs
- You need tight Barman backup integration out of the box
- Your team is not familiar with distributed systems or DCS tooling
- You run low-traffic PostgreSQL where manual intervention is acceptable
Patroni on Kubernetes
On Kubernetes, Patroni can use the Kubernetes API itself as the DCS, so no separate etcd cluster is required. Cluster state lives in ConfigMap or Endpoints objects, and the leader lock is recorded as annotations on one of those objects. Two established operators bundle Patroni as their HA backend, and a third provides Patroni-style HA with its own implementation.
Zalando Postgres Operator
The original Patroni-based operator. Manages thousands of PostgreSQL clusters in production at Zalando. Best for teams that want minimal opinionation and direct YAML control of Patroni configuration.
Crunchy Data PGO
Patroni + pgBackRest + PgBouncer in one operator. Adds enterprise features: cross-cluster replication, multi-tenant deployments, and built-in monitoring.
CloudNativePG (EDB)
Newer operator that re-implements many Patroni-like behaviors natively. Best for teams already on EDB Postgres or who want CNCF-aligned tooling — Patroni-style HA without the Patroni daemon itself.
Production K8s pattern: 3-replica StatefulSet with PodDisruptionBudget = 1, pod anti-affinity on hostname, and HAProxy or PgBouncer in front of PostgreSQL, with write routing driven by the Patroni REST API health endpoint (/master, named /primary in recent releases, returns 200 only on the current primary).
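The health-check routing in this pattern boils down to "send writes to whichever member answers 200". A minimal sketch of that logic follows, assuming the REST API listens on port 8008. `find_primary`, `http_status`, and the pod names are hypothetical, and the path is a parameter because the text above uses /master while recent Patroni documentation names the same check /primary.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status of a GET, or 0 if the node is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 503 from a node that is not the primary
    except (urllib.error.URLError, OSError):
        return 0


def find_primary(
    hosts: Iterable[str],
    probe: Callable[[str], int] = http_status,
    port: int = 8008,
    path: str = "/primary",  # use "/master" on deployments that expose that name
) -> Optional[str]:
    """Return the first host whose health endpoint answers 200, else None."""
    for host in hosts:
        if probe(f"http://{host}:{port}{path}") == 200:
            return host
    return None


def fake_probe(url: str) -> int:
    """Stand-in for real HTTP calls: pretend pg-1 currently holds the lock."""
    return 200 if "//pg-1:" in url else 503


current_primary = find_primary(["pg-0", "pg-1", "pg-2"], probe=fake_probe)
```

A load balancer performs the same probe on a timer and moves traffic when the answer changes, which is why no client-side reconfiguration is needed after a failover.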
Patroni REST API: Key Endpoints
Patroni exposes an HTTP REST API on port 8008 by default. These endpoints are the standard way to drive cluster operations from CI/CD, load balancers, and observability stacks.
| Endpoint | Purpose | Returns |
|---|---|---|
| GET /cluster | Full cluster topology + state | JSON of all members, lag, roles |
| GET /master (named /primary in recent releases) | Health check for HAProxy / load balancers | 200 if primary, 503 otherwise |
| GET /replica | Health check for read-only routing | 200 if healthy streaming replica |
| GET /metrics | Prometheus-format metrics | timeline, lag, postmaster state |
| POST /switchover | Manual planned failover | Promotes specified replica |
| POST /restart | Restart PostgreSQL on the addressed node | Restarts that node only; a rolling restart means calling it on replicas first, then the primary |
| POST /reinitialize | Reclone a divergent replica | Triggers pg_basebackup |
| PATCH /config | Update PostgreSQL/Patroni config in DCS | Propagates to all members |
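POST and PATCH endpoints are protected by HTTP Basic authentication once restapi.authentication credentials are defined in patroni.yml. Without those credentials, anyone who can reach the port can call them. For production, also enable HTTPS via restapi.certfile and restapi.keyfile.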
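As an illustration of calling a write endpoint with credentials, the sketch below builds (but does not send) a switchover request using HTTP Basic authentication. The `leader` and `candidate` fields follow the documented switchover payload. The function name, host, member names, and credentials are placeholders chosen for the example.

```python
import base64
import json
import urllib.request


def build_switchover_request(
    base_url: str, leader: str, candidate: str, username: str, password: str
) -> urllib.request.Request:
    """Prepare a POST /switchover request with a Basic auth header."""
    body = json.dumps({"leader": leader, "candidate": candidate}).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/switchover",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder values; in practice the credentials come from a secret store.
request = build_switchover_request(
    "https://pg-0.example.internal:8008", "pg-0", "pg-1", "admin", "secret"
)
# To actually perform the switchover: urllib.request.urlopen(request, timeout=10)
```

Building the request separately from sending it keeps the example safe to run and makes the payload easy to inspect in a CI/CD dry run.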
Patroni Metrics & Observability
Patroni emits Prometheus-format metrics on /metrics. The six metrics below cover the most common PostgreSQL HA failure modes. Wire them into Grafana with alerts tuned to your ttl / loop_wait values.
- patroni_master: 1 if the node is currently primary, 0 otherwise. Alert on unexpected leader changes.
- patroni_replica: 1 if the node is a healthy streaming replica. Alert if a replica drops to 0 outside maintenance.
- patroni_postgres_running: 1 if the PostgreSQL postmaster process is up. Page on any 0: Patroni running while PostgreSQL is down is the worst state.
- patroni_xlog_replayed_location: WAL replay position. Compare it with the primary's patroni_xlog_location to compute replication lag for alerts.
- patroni_pending_restart: 1 if a config change is awaiting restart. Alert if it sits for more than 1 hour, which usually means a missed maintenance window.
- patroni_dcs_last_seen: Unix timestamp of the last successful DCS interaction. Alert if the current time minus this value exceeds 2× ttl, which indicates a DCS connectivity issue.
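The alert rules described for these metrics can be prototyped before they are written as Prometheus rules. The sketch below parses the text exposition format and applies three of the checks. The sample payload, label values, and function names are made up for illustration, and the parser assumes each sample line ends with the value (no trailing timestamp).

```python
from typing import Dict, List

# Hypothetical scrape output; real label sets and values will differ.
SAMPLE_METRICS = """\
# HELP patroni_postgres_running Value is 1 if Postgres is running, 0 otherwise.
# TYPE patroni_postgres_running gauge
patroni_postgres_running{scope="demo",name="pg-0"} 1
patroni_dcs_last_seen{scope="demo",name="pg-0"} 1000
patroni_pending_restart{scope="demo",name="pg-0"} 0
patroni_xlog_location{scope="demo",name="pg-0"} 5000
"""


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse Prometheus text format into {metric_name: value}, dropping labels."""
    values: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        name_and_labels, _, raw_value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]
        values[name] = float(raw_value)
    return values


def evaluate(metrics: Dict[str, float], now: float, ttl: int = 30) -> List[str]:
    """Return the names of the alert conditions that currently hold."""
    alerts: List[str] = []
    if metrics.get("patroni_postgres_running", 0) != 1:
        alerts.append("postgres_down")
    last_seen = metrics.get("patroni_dcs_last_seen")
    if last_seen is not None and now - last_seen > 2 * ttl:
        alerts.append("dcs_stale")
    if metrics.get("patroni_pending_restart", 0) == 1:
        alerts.append("pending_restart")  # the 1-hour delay belongs in the alert rule
    return alerts


def replay_lag_bytes(primary: Dict[str, float], replica: Dict[str, float]) -> float:
    """Primary write position minus replica replay position, floored at zero."""
    lag = primary["patroni_xlog_location"] - replica["patroni_xlog_replayed_location"]
    return max(0.0, lag)


node = parse_metrics(SAMPLE_METRICS)
```

Calling `evaluate(node, now=1100)` flags the DCS check because 100 seconds have passed since the last contact, which is more than 2 × 30. In production the same comparisons would normally live in Prometheus alerting rules rather than in application code.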
JusDB Patroni Implementation Service
We design and deploy production-grade Patroni clusters tuned to your infrastructure — bare-metal, VMs, EKS, GKE, or on-prem Kubernetes.
Architecture Design
3-node primary + replica topology, DCS sizing (etcd cluster vs single node), VIP or HAProxy/ProxySQL frontend layer design.
DCS Setup & Hardening
Deploy and secure etcd or Consul cluster, configure TTL and heartbeat intervals, test quorum loss scenarios.
Failover Testing & Runbooks
Simulate primary crash, network partition, and DCS unavailability. Document exact recovery steps and expected timelines.
Kubernetes Operator Integration
Configure Zalando PostgreSQL Operator or Crunchy PGO with Patroni as the HA backend, including PodDisruptionBudgets and pod anti-affinity rules.
Monitoring Integration
Patroni metrics exposed via /metrics endpoint, integrated into Prometheus + Grafana with alerting on leader changes, lag spikes, and DCS connectivity.
Ongoing Managed Operations
24/7 monitoring of cluster state, proactive replica lag alerts, coordinated version upgrades with zero downtime using Patroni's rolling restart.
Ready to implement Patroni?
Our PostgreSQL HA specialists will design, deploy, and test your Patroni cluster — with runbooks, monitoring, and 24/7 support.