Patroni with Consul DCS: PostgreSQL HA Without etcd

Configure Patroni to use Consul instead of etcd as its DCS for PostgreSQL HA. Covers Consul setup, patroni.yml, service discovery, HAProxy routing, and failover testing.

JusDB Team
March 5, 2026
Most organizations running Patroni default to etcd as their Distributed Configuration Store, yet many of those same organizations already operate Consul clusters for service mesh and microservice discovery. Running a separate etcd cluster solely for Patroni duplicates operational overhead when Consul is fully capable of serving as the DCS backend. Consul's native multi-datacenter federation, built-in health checks, and service registration make it a compelling alternative that consolidates your distributed systems tooling. This guide walks through a complete Patroni-on-Consul setup: from installing a three-node Consul cluster to configuring patroni.yml, validating failover, and routing client traffic through HAProxy.
TL;DR
  • Patroni supports Consul as a DCS backend via the consul: section in patroni.yml — no etcd required.
  • Set register_service: true and Patroni registers each node in Consul under a service named after the cluster scope (postgres-ha here), tagged primary or replica by role (master on older Patroni releases), for DNS/API-based discovery.
  • A three-node Consul server cluster handles both DCS duties and service registration; existing Consul clusters can serve double duty with ~10 writes/second per Patroni node.
  • HAProxy checks Patroni's REST API (port 8008) — GET / returns 200 only on the primary, GET /replica only on replicas — enabling zero-application-change client routing.
  • Always enable Consul ACLs in production and scope Patroni's token to the service/postgres-ha/ key prefix, session management (Patroni holds the leader lock through a Consul session), and service registration only.

Why Consul as DCS Instead of etcd?

Patroni abstracts its DCS layer cleanly — it supports etcd, Consul, ZooKeeper, Kubernetes, and Raft as backends. etcd is the default and works well, but it carries its own operational footprint: cluster bootstrapping, certificate rotation, compaction jobs, and snapshot management. If your infrastructure team already runs Consul for service mesh, service discovery, or configuration management, you likely already have a production-grade three-node Consul cluster with monitoring, runbooks, and on-call coverage baked in.

There are concrete reasons to choose Consul over etcd for Patroni:

  • Consolidated tooling. One distributed system instead of two. Your Consul cluster handles Patroni's leader election keys and your microservice registry simultaneously.
  • Native multi-datacenter support. Consul's WAN federation and datacenter-aware queries work out of the box. Stretching etcd across datacenters requires federation configurations that are non-trivial to operate safely.
  • Integrated service registration. When register_service: true is set, Patroni registers PostgreSQL primary and replica endpoints directly into Consul as a service named after the cluster scope, tagged by role. Applications and load balancers can resolve the current primary via Consul DNS (primary.postgres-ha.service.consul for the scope used in this guide) without a separate HAProxy tier if desired.
  • Health check integration. Consul's health check system complements Patroni's own REST API checks. The Consul UI gives operators instant visibility into which PostgreSQL node is primary, without running separate monitoring tools.
  • Familiar operational model. Teams already experienced with Consul's ACL system, gossip encryption, and TLS configuration will find the security hardening work familiar rather than a new discipline to learn.
Tip: Reuse your existing Consul cluster

If your organization already runs Consul for microservice discovery, reusing it for Patroni DCS avoids operating a second distributed system. Your existing Consul cluster (3+ servers) serves both purposes — just ensure its resource limits account for the additional Patroni key writes (~10 writes/second per node).

Setting Up Consul for Patroni

This section covers deploying a fresh three-node Consul cluster on Ubuntu. If you already have a running Consul cluster, skip to the Patroni configuration section — the only prerequisite is that your Consul cluster is healthy and your Patroni nodes can reach the Consul agent on port 8500.

Step 1: Install Consul on All Three Nodes

Use node IPs 192.168.33.11, 192.168.33.12, and 192.168.33.13. Run the following on each node:

bash
# Node IPs: 192.168.33.11, 192.168.33.12, 192.168.33.13

# Download Consul (Ubuntu)
wget https://releases.hashicorp.com/consul/1.18.0/consul_1.18.0_linux_amd64.zip
unzip consul_1.18.0_linux_amd64.zip
sudo mv consul /usr/local/bin/
sudo mkdir -p /etc/consul.d /var/lib/consul
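
Before moving the binary into place, it may be worth checking the download against the checksum file HashiCorp publishes next to each release. The sketch below assumes the zip and the SHA256SUMS file sit in the current directory; verify_consul_zip is a hypothetical helper name, and the GPG signature check of the sums file is not covered here.

```shell
# Sketch: verify a downloaded Consul zip against a SHA256SUMS file.
# verify_consul_zip is a hypothetical helper, not part of Consul.
verify_consul_zip() {
  zip_file="$1"    # e.g. consul_1.18.0_linux_amd64.zip
  sums_file="$2"   # e.g. consul_1.18.0_SHA256SUMS
  # Keep only the checksum line for this zip, then let sha256sum check it.
  # If no line matches, sha256sum gets empty input and fails, so the
  # function fails closed.
  grep " ${zip_file}\$" "$sums_file" | sha256sum -c -
}

# Usage (run in the download directory):
#   wget https://releases.hashicorp.com/consul/1.18.0/consul_1.18.0_SHA256SUMS
#   verify_consul_zip consul_1.18.0_linux_amd64.zip consul_1.18.0_SHA256SUMS
```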

Step 2: Configure Each Consul Server Node

Create the Consul configuration file on each node, adjusting bind_addr to each node's IP. The configuration below is for node 1 (192.168.33.11):

bash
# Consul server config for node 1 (written with sudo tee because /etc/consul.d is owned by root)
sudo tee /etc/consul.d/consul.hcl > /dev/null << 'EOF'
datacenter = "dc1"
data_dir = "/var/lib/consul"
server = true
bootstrap_expect = 3
bind_addr = "192.168.33.11"
client_addr = "0.0.0.0"
retry_join = ["192.168.33.11", "192.168.33.12", "192.168.33.13"]
ui_config { enabled = true }
EOF

Repeat on nodes 2 and 3, changing bind_addr to 192.168.33.12 and 192.168.33.13 respectively. The retry_join list is identical on all nodes.
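
Since only bind_addr differs between nodes, one way to avoid copy-and-paste mistakes is to generate the file from a small function. This is a sketch under the same assumptions as the config above (datacenter dc1, the three example IPs); render_consul_hcl is a hypothetical helper name.

```shell
# Sketch: print the Consul server config for one node; only bind_addr varies.
# render_consul_hcl is a hypothetical helper, not part of Consul.
render_consul_hcl() {
  bind_ip="$1"
  cat << EOF
datacenter = "dc1"
data_dir = "/var/lib/consul"
server = true
bootstrap_expect = 3
bind_addr = "${bind_ip}"
client_addr = "0.0.0.0"
retry_join = ["192.168.33.11", "192.168.33.12", "192.168.33.13"]
ui_config { enabled = true }
EOF
}

# Usage on node 2:
#   render_consul_hcl 192.168.33.12 | sudo tee /etc/consul.d/consul.hcl > /dev/null
#   consul validate /etc/consul.d/
```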

Step 3: Enable and Start Consul

bash
# systemd service (assumes a consul.service unit already exists; the zip install above does not create one)
sudo systemctl enable consul && sudo systemctl start consul

# Verify cluster
consul members

A healthy consul members output lists all three nodes with status alive and type server. It does not show which node is the leader; run consul operator raft list-peers to confirm that exactly one node holds the leader state. Wait for the cluster to elect a leader before proceeding. This typically takes 5–10 seconds after all three nodes start.
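
The wait for a leader can be scripted instead of eyeballed. The sketch below polls the local agent's /v1/status/leader endpoint, which returns a JSON string such as "192.168.33.11:8300" once a leader exists and an empty string ("") before that. The helper names and the 30-attempt default are assumptions for illustration.

```shell
# Sketch: decide whether a /v1/status/leader response names a leader.
# has_leader and wait_for_leader are hypothetical helper names.
has_leader() {
  case "$1" in
    '"'*:*'"') return 0 ;;   # quoted host:port, e.g. "192.168.33.11:8300"
    *) return 1 ;;           # "" or an empty/failed response
  esac
}

# Poll the local agent once per second until a leader appears
# or the attempts run out.
wait_for_leader() {
  attempts="${1:-30}"
  while [ "$attempts" -gt 0 ]; do
    resp=$(curl -s http://127.0.0.1:8500/v1/status/leader || true)
    if has_leader "$resp"; then
      echo "leader: $resp"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 1
  done
  echo "no leader elected" >&2
  return 1
}

# Usage: wait_for_leader 30 && consul members
```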

Warning: Enable Consul ACLs in Production

By default Consul ACLs are disabled, so every request is allowed without authentication. In production, enable ACLs with a default policy of deny and create a dedicated token for Patroni with minimal permissions: write access to the service/postgres-ha/ key prefix, session write (Patroni holds the leader lock through a Consul session), and service write for the registered service. Never run Consul in production without ACLs. An open Consul cluster exposes leader election keys, the service catalog, and the KV store to any network-reachable client.
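
As a starting point, a policy along these lines could back Patroni's token. It is a sketch, not a vetted production policy: it assumes the namespace /service/ and scope postgres-ha from this guide, assumes the service is registered under the scope name, and grants session write on every node for simplicity (you may prefer to narrow session_prefix to your PostgreSQL node names). patroni_acl_policy is a hypothetical helper name.

```shell
# Sketch: print a minimal ACL policy for Patroni's Consul token.
# patroni_acl_policy is a hypothetical helper, not part of Consul or Patroni.
patroni_acl_policy() {
  scope="${1:-postgres-ha}"
  cat << EOF
# Patroni keeps cluster state under <namespace>/<scope>/ in the KV store.
key_prefix "service/${scope}/" {
  policy = "write"
}
# Patroni holds the leader lock through a Consul session.
session_prefix "" {
  policy = "write"
}
# Only needed when register_service is true.
service "${scope}" {
  policy = "write"
}
EOF
}

# Usage, once ACLs are enabled and bootstrapped:
#   patroni_acl_policy > patroni-policy.hcl
#   consul acl policy create -name patroni -rules @patroni-policy.hcl
#   consul acl token create -description "Patroni DCS" -policy-name patroni
```

The token printed by the last command is the value that goes into the token: field of the consul: section in patroni.yml.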

Configuring Patroni with Consul Backend

With the Consul cluster running, configure Patroni to use it as the DCS. The critical change from a standard etcd-backed config is replacing the etcd: section with a consul: section. Everything else — bootstrap, pg_hba, postgresql block — remains structurally identical.

patroni.yml — Full Configuration for Node 1

yaml
# /etc/patroni/patroni.yml (using Consul instead of etcd)
scope: postgres-ha
namespace: /service/
name: pg-node-1

restapi:
  listen: 192.168.33.11:8008
  connect_address: 192.168.33.11:8008

consul:
  host: 127.0.0.1:8500
  # For Consul ACL token (recommended in production):
  # token: your-consul-acl-token
  register_service: true      # Register PG as Consul service
  service_check_interval: 5s  # Health check interval

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        max_wal_senders: 5
        max_replication_slots: 5
        wal_log_hints: "on"

  initdb:
    - encoding: UTF8
    - data-checksums

  pg_hba:
    - host replication replicator 192.168.33.0/24 scram-sha-256
    - host all all 0.0.0.0/0 scram-sha-256

postgresql:
  listen: 192.168.33.11:5432
  connect_address: 192.168.33.11:5432
  data_dir: /var/lib/postgresql/17/main
  bin_dir: /usr/lib/postgresql/17/bin
  authentication:
    replication:
      username: replicator
      password: replicator_password
    superuser:
      username: postgres
      password: postgres_password

On nodes 2 and 3, update name (pg-node-2, pg-node-3), restapi.listen, restapi.connect_address, postgresql.listen, and postgresql.connect_address to each node's IP. The consul.host points to 127.0.0.1:8500 because a Consul agent runs locally on every PostgreSQL node — Patroni communicates with the local agent, and the agent handles cluster communication.
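
If you keep node 1's file as the reference copy, the per-node variants can be derived mechanically. The sketch below is one possible approach using sed; it assumes the node 1 values shown above (pg-node-1, 192.168.33.11) and only rewrites addresses that are followed by a port, so the pg_hba subnet and the local Consul address stay untouched. derive_node_config is a hypothetical helper name.

```shell
# Sketch: derive another node's patroni.yml from node 1's file.
# Reads node 1's config on stdin and writes the new config to stdout.
# derive_node_config is a hypothetical helper, not part of Patroni.
derive_node_config() {
  new_name="$1"   # e.g. pg-node-2
  new_ip="$2"     # e.g. 192.168.33.12
  # Rewrite the top-level name and every "192.168.33.11:<port>" address
  # (restapi and postgresql listen/connect_address).
  sed -e "s/^name: pg-node-1\$/name: ${new_name}/" \
      -e "s/192\.168\.33\.11:/${new_ip}:/g"
}

# Usage:
#   derive_node_config pg-node-2 192.168.33.12 < patroni-node1.yml > patroni-node2.yml
```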

Consul Service Registration — What register_service Does

When register_service: true is configured, Patroni registers each running node with its local Consul agent under a single service named after the cluster scope (postgres-ha in this guide) and keeps a role tag on it up to date as the cluster state changes:

  • primary tag: carried only by the current primary node (older Patroni releases use master). The attached Consul health check polls Patroni's REST API.
  • replica tag: carried by replica nodes. Used for read-scaling traffic.

The tagged service can be queried via DNS or the HTTP API without maintaining a separate HAProxy configuration:

bash
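# Consul DNS: resolve primary (tag.service form; use master.postgres-ha on older Patroni releases)
dig @127.0.0.1 -p 8600 primary.postgres-ha.service.consul SRV

# Consul API: get current primary (URL quoted so the shell does not interpret ? and &)
curl 'http://localhost:8500/v1/health/service/postgres-ha?passing&tag=primary'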

The DNS approach is particularly powerful for applications that support SRV records — the application resolves the current primary address dynamically without any intermediary load balancer.
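
For scripts that cannot resolve SRV records themselves, the dig output can be turned into a host:port pair. The sketch below assumes the service is registered under the cluster scope postgres-ha with a primary tag, and that dig +short prints SRV answers as "priority weight port target". srv_to_hostport is a hypothetical helper name.

```shell
# Sketch: convert "dig +short ... SRV" lines into host:port pairs.
# Each SRV answer has four fields: priority weight port target.
# srv_to_hostport is a hypothetical helper name.
srv_to_hostport() {
  awk 'NF == 4 { sub(/\.$/, "", $4); print $4 ":" $3 }'
}

# Usage:
#   dig +short @127.0.0.1 -p 8600 primary.postgres-ha.service.consul SRV | srv_to_hostport
```

The target is a name inside the .consul domain, so the caller still needs Consul DNS to resolve it to an address.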

Testing Failover and Switchover

Before relying on automated failover in production, validate that it works correctly in a test environment. Patroni provides patronictl for all cluster management operations.

bash
# Check cluster status
patronictl -c /etc/patroni/patroni.yml list

# Manual switchover (graceful — zero data loss)
patronictl -c /etc/patroni/patroni.yml switchover postgres-ha

# Manual failover to a named candidate (for testing failover behavior)
patronictl -c /etc/patroni/patroni.yml failover postgres-ha --candidate pg-node-2

# Restart a node
patronictl -c /etc/patroni/patroni.yml restart postgres-ha pg-node-1

# Reinitialize a lagging replica
patronictl -c /etc/patroni/patroni.yml reinit postgres-ha pg-node-2

The distinction between switchover and failover matters. A switchover is a planned, graceful handover: Patroni shuts the primary down cleanly and promotes a replica that has received all of its WAL, so no committed data is lost. A failover is a forced promotion used when the primary is unreachable; for automatic failover, a replica is only eligible if its lag behind the last known primary position is within maximum_lag_on_failover (1048576 bytes, or 1 MiB, in the config above), which roughly bounds the data loss. When testing, always validate that Consul's service catalog updates correctly after each operation: the primary tag on the postgres-ha service should move to the new primary within a few seconds, paced by service_check_interval (5 seconds here).
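
To make the lag threshold concrete, the arithmetic can be reproduced by hand. A PostgreSQL LSN such as 0/3000060 is two hexadecimal halves, and its byte position is high * 2^32 + low. The sketch below is an illustration of that comparison, not Patroni's own code; the helper names are hypothetical, and the LSNs would come from pg_current_wal_lsn() on the primary and pg_last_wal_replay_lsn() on a replica.

```shell
# Sketch: convert a PostgreSQL LSN (e.g. 0/3000060) to a byte position.
# lsn_to_bytes and within_failover_lag are hypothetical helper names.
lsn_to_bytes() {
  hi="${1%%/*}"   # hex digits before the slash (upper 32 bits)
  lo="${1##*/}"   # hex digits after the slash (lower 32 bits)
  echo $(( (0x$hi * 4294967296) + 0x$lo ))
}

# Succeeds when the replica is within max_lag bytes of the primary position.
# max_lag defaults to 1048576, the maximum_lag_on_failover used in this guide.
within_failover_lag() {
  primary_lsn="$1"
  replica_lsn="$2"
  max_lag="${3:-1048576}"
  lag=$(( $(lsn_to_bytes "$primary_lsn") - $(lsn_to_bytes "$replica_lsn") ))
  [ "$lag" -le "$max_lag" ]
}

# Usage:
#   within_failover_lag 0/3100000 0/3000000 && echo "eligible for promotion"
```

In SQL, pg_wal_lsn_diff() performs the same subtraction directly.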

After a failover, the old primary rejoins as a replica once it is reachable again. With use_pg_rewind: true in the bootstrap DCS config, Patroni attempts pg_rewind first, which avoids a full base backup in most cases. Use reinit only if pg_rewind fails to reattach the node automatically.

Consul vs etcd vs ZooKeeper for Patroni DCS

Choose your DCS based on what your team already operates and what your availability requirements demand across datacenters.

| Feature | Consul | etcd | ZooKeeper |
|---|---|---|---|
| Protocol | HTTP/DNS | gRPC | Zab |
| Multi-datacenter | Native | Federation (complex) | No |
| Service discovery | Yes | No | No |
| Patroni support | Yes | Yes (default) | Yes |
| Operational complexity | Medium | Low | High |
| Already deployed in orgs | Common (service mesh) | Kubernetes clusters | Legacy Java shops |
| TLS setup | Easy | Moderate | Complex |

etcd remains the simplest choice if you already operate an etcd cluster. On Kubernetes, note that the control-plane etcd is usually not exposed to workloads; Patroni can use the Kubernetes API itself as its DCS there, so no separate store is needed. ZooKeeper is only worth considering when PostgreSQL is embedded in an existing platform (Kafka, HBase) that already operates ZooKeeper. Consul shines when the infrastructure team already runs it for service mesh, making it the lowest-friction choice for non-Kubernetes deployments.

HAProxy Integration for Client Routing

Consul's service registration works well for service-mesh-aware applications, but many applications connect via a standard PostgreSQL connection string. HAProxy bridges this gap: it uses Patroni's REST API health checks to route write traffic exclusively to the primary and distribute read traffic across replicas — with no application code changes required.

ini
# /etc/haproxy/haproxy.cfg
global
  log /dev/log local0
  maxconn 100

defaults
  log global
  mode tcp
  retries 2
  timeout client 30m
  timeout connect 4s
  timeout server 30m
  timeout check 5s

# Primary (write) connection
listen postgres_primary
  bind *:5432
  option httpchk
  http-check expect status 200
  default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
  server pg-node-1 192.168.33.11:5432 maxconn 100 check port 8008
  server pg-node-2 192.168.33.12:5432 maxconn 100 check port 8008
  server pg-node-3 192.168.33.13:5432 maxconn 100 check port 8008

# Replica (read) connections
listen postgres_replicas
  bind *:5433
  balance roundrobin
  option httpchk GET /replica
  http-check expect status 200
  default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
  server pg-node-1 192.168.33.11:5432 maxconn 100 check port 8008
  server pg-node-2 192.168.33.12:5432 maxconn 100 check port 8008
  server pg-node-3 192.168.33.13:5432 maxconn 100 check port 8008

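The HAProxy health check mechanism works as follows: Patroni's REST API on port 8008 answers a request for / with HTTP 200 only when the node is the primary (leader), and a request for /replica with HTTP 200 only when the node is a healthy replica; otherwise it returns 503. The bare option httpchk on the primary listener sends OPTIONS /, which Patroni answers with the same status code as GET /. HAProxy polls these endpoints every 3 seconds (inter 3s) and only forwards client connections to nodes that pass. After a promotion, HAProxy marks the new primary up after two consecutive passing checks (rise 2), roughly 6 seconds. For an unplanned failover, add the time Patroni needs to notice the failed leader, which is bounded by ttl (30 seconds in the config above).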

The on-marked-down shutdown-sessions directive closes existing connections to a node the moment HAProxy marks it down, preventing applications from hanging on a dead primary. Applications should implement connection retry logic (most PostgreSQL drivers support this natively) to reconnect cleanly after a failover event.
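
The same check HAProxy performs can be reproduced from a shell, which is handy when validating a failover test. The sketch below probes GET / on each node's REST API and prints the first node that answers 200; the helper names and the 2-second timeout are assumptions for illustration.

```shell
# Sketch: find the current primary the way HAProxy's primary check does.
# All function names here are hypothetical helpers.

# 200 on GET / means the node is the primary; anything else means it is not.
is_primary_status() {
  [ "$1" = "200" ]
}

# Probe one node and print the HTTP status code (000 if unreachable).
probe_node() {
  curl -s -o /dev/null -m 2 -w '%{http_code}' "http://$1:8008/" || true
}

# Print the IP of the first node whose probe reports primary.
find_primary() {
  for ip in "$@"; do
    if is_primary_status "$(probe_node "$ip")"; then
      echo "$ip"
      return 0
    fi
  done
  return 1
}

# Usage:
#   find_primary 192.168.33.11 192.168.33.12 192.168.33.13
```

Running it before and after a switchover should print two different addresses.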

Tip: Dual-layer routing

You can use both HAProxy and Consul DNS simultaneously. HAProxy handles application connections from legacy systems using static connection strings, while service-mesh-aware microservices resolve primary.postgres-ha.service.consul directly. Both methods reflect the same Patroni cluster state.

Key Takeaways

  • Consul replaces etcd entirely as Patroni's DCS — replace the etcd: block with a consul: block in patroni.yml. No other structural changes to the Patroni configuration are required.
  • register_service: true automatically maintains a Consul service named after the cluster scope (postgres-ha), tagged primary or replica by role, enabling DNS-based primary discovery without HAProxy for compatible applications.
  • ACLs are non-negotiable in production. Consul's default allow-all policy must be replaced with ACL tokens scoped to the minimum required permissions for Patroni's key namespace and service registration.
  • HAProxy's Patroni REST API checks (port 8008) provide reliable primary/replica routing without application changes — a critical feature for migrating existing applications to a Patroni HA cluster.
  • switchover vs failover: use patronictl switchover for planned maintenance (zero data loss), and understand that automated failover may have bounded data loss determined by maximum_lag_on_failover.
  • Existing Consul clusters can absorb Patroni's DCS load (~10 writes/second per node) without dedicated infrastructure, making this the operationally cheapest path for teams already running Consul.

Working with JusDB on PostgreSQL High Availability

Setting up Patroni with Consul DCS involves a number of moving parts: Consul cluster health, Patroni configuration correctness, HAProxy check tuning, ACL policy scoping, and failover validation. Getting each layer right before it matters in production is the difference between a five-minute failover and a multi-hour incident.

JusDB specializes in PostgreSQL high availability architecture and operations. Whether you are migrating an existing single-node PostgreSQL deployment to a Patroni cluster, evaluating DCS backends, or hardening an existing cluster for production readiness, the JusDB team brings direct experience with Patroni, Consul, etcd, and HAProxy configurations across multiple production environments.

Reach out to discuss your PostgreSQL HA requirements or explore the services available for Patroni cluster design, deployment, and ongoing management.
