Database Performance

PgPool-II for PostgreSQL: Installation, Configuration, and Connection Pooling

PgPool-II provides connection pooling, read/write splitting, and automatic failover for PostgreSQL — but its extensive configuration can be overwhelming. Here's a practical guide to getting PgPool-II running in production.

JusDB Team
January 20, 2022
9 min read
143 views

When PostgreSQL starts returning FATAL: sorry, too many clients already, the first instinct is to raise max_connections — but that trades one problem for another, consuming more memory per process and forcing a server restart. Connection pooling solves the root cause, and while PgBouncer is the right answer for high-concurrency OLTP, teams running read replicas, needing automatic failover, or requiring load balancing across multiple PostgreSQL nodes need something more capable. PgPool-II is a middleware layer that provides connection pooling, read/write splitting, streaming replication load balancing, and watchdog-based HA — all from a single daemon sitting in front of your PostgreSQL cluster. Its configuration surface is substantial, and every production deployment has the same war story: pgpool.conf edited by guesswork, health checks misconfigured, load balancing miscounted. This guide covers what actually matters.

TL;DR
  • PgPool-II pools connections and routes reads to replicas — PgBouncer only pools connections. Choose PgPool-II when you need both in one component.
  • num_init_children sets the maximum concurrent client connections; max_pool sets cached server connections per child. Their product must stay within PostgreSQL's max_connections budget.
  • Health checks (health_check_period, health_check_max_retries) must be tuned or PgPool-II will aggressively take backends offline on momentary network hiccups.
  • Watchdog provides PgPool-II HA with a virtual IP — without it, PgPool-II itself is a single point of failure in front of your HA PostgreSQL cluster.
  • Query load balancing requires load_balance_mode = on and read queries must be explicitly routable (no open transactions, no temporary tables, SELECT only).

What PgPool-II Does That PgBouncer Doesn't

PgBouncer's scope is narrow and deliberately so: it multiplexes client connections onto a smaller pool of real PostgreSQL server connections. It does nothing else. PgPool-II's scope is fundamentally broader, and that scope is both its strength and the source of its configuration complexity.

Connection Pooling

Like PgBouncer, PgPool-II caches server connections and reuses them across client sessions. PgPool-II supports three pooling modes: connection pooling (connections cached and reused between transactions), replication mode (PgPool-II sends write queries to all backends simultaneously for synchronous replication), and raw mode (no pooling, used with external replication). For modern setups using PostgreSQL streaming replication, pool_mode = streaming_replication is the correct choice.

Read/Write Splitting and Load Balancing

PgPool-II inspects query text at the protocol level and routes SELECT statements to standby replicas while sending writes to the primary. You do not modify application code — PgPool-II parses every statement and makes the routing decision. With multiple standbys configured, it distributes read load across them based on configurable weights.

Automatic Failover and Watchdog

When PgPool-II detects that a backend PostgreSQL node has failed (via health checks), it removes it from the pool. You configure a failover_command script that PgPool-II executes to promote a standby — typically a script that calls Patroni's REST API or pg_ctl promote. The watchdog subsystem extends this to PgPool-II itself: multiple PgPool-II nodes elect a leader that holds a virtual IP, so the failure of one PgPool-II instance does not interrupt client connections.

Warning

PgPool-II is significantly more complex to operate than PgBouncer. If your only requirement is connection pooling for a single PostgreSQL instance or a Patroni cluster fronted by HAProxy, PgBouncer does that with a fraction of the configuration surface and zero risk of incorrect query routing. Reach for PgPool-II when you specifically need its load balancing or multi-backend orchestration capabilities.

Tip: When PgPool-II Is the Right Choice
  • You have one or more streaming standbys and want automatic read query distribution without application changes.
  • You need a single middleware layer that handles both connection pooling and failover orchestration.
  • Your team is already running PgPool-II and needs to tune an existing deployment rather than migrate to a simpler stack.
  • You require online query load balancing weighted by replica capacity (e.g., a replica with more RAM gets more read traffic).

Installation

APT (Debian / Ubuntu)

The PostgreSQL Global Development Group (PGDG) APT repository includes PgPool-II packages for all major PgPool-II versions. Install from PGDG rather than the OS distribution repository to get current releases.

bash
# Add PGDG APT repository if not already present
sudo apt install -y curl ca-certificates gnupg
curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/pgdg.gpg
echo "deb https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" \
  | sudo tee /etc/apt/sources.list.d/pgdg.list

sudo apt update

# Install PgPool-II 4.5 with PostgreSQL 16 extensions
sudo apt install -y pgpool2 pgpool2-dev postgresql-16-pgpool2

# Verify installation
pgpool --version
# pgpool-II version 4.5.x (akamagari)

RPM (RHEL / Rocky Linux / Amazon Linux 2023)

bash
# RHEL 9 / Rocky Linux 9
sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-9-x86_64/pgdg-redhat-repo-latest.noarch.rpm
sudo dnf -qy module disable postgresql

sudo dnf install -y pgpool-II-pg16 pgpool-II-pg16-extensions

# Start and enable
sudo systemctl enable pgpool
sudo systemctl start pgpool

Compiling from Source

Source compilation is required when you need to link against a non-standard PostgreSQL installation or need a version not yet in the package repositories. Download the tarball from pgpool.net and compile against your PostgreSQL development headers.

bash
# Install build dependencies (Debian/Ubuntu)
sudo apt install -y build-essential libpq-dev libssl-dev libpam-dev

# Download and extract
wget https://www.pgpool.net/download.php?f=pgpool-II-4.5.3.tar.gz -O pgpool-II-4.5.3.tar.gz
tar xzf pgpool-II-4.5.3.tar.gz
cd pgpool-II-4.5.3

# Configure with PostgreSQL prefix and SSL support
./configure \
  --prefix=/usr/local/pgpool \
  --with-pgsql=/usr/pgsql-16 \
  --with-openssl

make -j$(nproc)
sudo make install

# Install pgpool_recovery extension into PostgreSQL
cd src/sql/pgpool-recovery
make
sudo make install

After source installation, create the configuration directory and copy sample files: cp /usr/local/pgpool/etc/pgpool.conf.sample /usr/local/pgpool/etc/pgpool.conf.


Core pgpool.conf Configuration

PgPool-II's primary configuration file is /etc/pgpool2/pgpool.conf (package installs) or /usr/local/pgpool/etc/pgpool.conf (source). It contains over 200 parameters; this section covers the ones that materially affect behavior and are most commonly misconfigured.

Listener and Process Settings

ini
# ─── Network ────────────────────────────────────────────────────────────────
listen_addresses = '*'          # Or a specific IP: '10.0.1.5'
port = 5432                     # PgPool-II listener port (match what apps expect)
socket_dir = '/var/run/pgpool'  # Unix socket directory

# ─── Process Model ──────────────────────────────────────────────────────────
# num_init_children: number of pre-forked child processes = max concurrent clients
# Each child handles exactly ONE client connection at a time.
# This is your hard ceiling on client connections.
num_init_children = 200

# max_pool: cached server connections per child process, per backend
# A child with max_pool=4 can reuse up to 4 server connections (one per user/db combo)
# Total server connections ≤ num_init_children × max_pool
# That total MUST be less than PostgreSQL max_connections on each backend
max_pool = 4

# child_life_time: recycle child processes after this many idle seconds (0 = never)
child_life_time = 300

# child_max_connections: recycle child after serving this many connections (0 = never)
# Prevents memory growth from long-lived children
child_max_connections = 1000

# connection_life_time: close cached server connections idle longer than this (seconds)
connection_life_time = 600

# client_idle_limit: disconnect clients idle longer than this (seconds, 0 = disabled)
client_idle_limit = 0

Authentication

ini
# Authentication method for client connections to PgPool-II
# Options: trust, reject, md5, scram-sha-256, cert, pam
enable_pool_hba = on
pool_hba_filename = '/etc/pgpool2/pool_hba.conf'

# PgPool-II's internal password file for backend authentication
# Format mirrors PostgreSQL's pg_hba.conf entries
pool_passwd = '/etc/pgpool2/pool_passwd'

# Allow PgPool-II to authenticate to backends without storing passwords locally.
# PgPool-II forwards client credentials to the backend (requires backend pg_hba.conf
# to trust the PgPool-II host, or use SCRAM passthrough).
allow_clear_text_frontend_auth = off  # Keep off unless required for legacy clients

Pool Mode

ini
# backend_clustering_mode controls the overall clustering behavior:
#   streaming_replication — for PostgreSQL streaming replication clusters (most common)
#   logical_replication   — for logical replication setups
#   slony                 — for Slony-I replication
#   raw                   — pass-through with no pooling logic
#   native_replication    — PgPool-II manages replication itself (legacy, avoid)
backend_clustering_mode = 'streaming_replication'

# connection_cache: enable server connection caching (pooling)
# Set to off only for debugging — always on in production
connection_cache = on
Tip

For a cluster with 3 PostgreSQL nodes (1 primary + 2 standbys) and num_init_children = 150, max_pool = 4: your maximum server connections per backend is 150 × 4 = 600. Ensure PostgreSQL's max_connections is set to at least 650 (600 + 50 for superuser/admin connections) on every node, including standbys that may be promoted to primary.


Backend PostgreSQL Server Configuration

PgPool-II's [pgpool] section defines backend servers using numbered parameter groups. Each backend is identified by an integer suffix (0, 1, 2, ...). At minimum, you configure the host, port, weight, data directory, and application name for each backend.

Backend Definitions

ini
# ─── Backend 0: Primary ──────────────────────────────────────────────────────
backend_hostname0 = '10.0.1.10'
backend_port0 = 5432
backend_weight0 = 1             # Weight for load balancing (writes always go to primary)
backend_data_directory0 = '/var/lib/postgresql/16/main'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_application_name0 = 'pgpool_primary'

# ─── Backend 1: Standby Replica 1 ────────────────────────────────────────────
backend_hostname1 = '10.0.1.11'
backend_port1 = 5432
backend_weight1 = 1             # Equal weight with backend 2 for read distribution
backend_data_directory1 = '/var/lib/postgresql/16/main'
backend_flag1 = 'ALLOW_TO_FAILOVER'
backend_application_name1 = 'pgpool_standby1'

# ─── Backend 2: Standby Replica 2 (higher-spec server, gets more read traffic) ──
backend_hostname2 = '10.0.1.12'
backend_port2 = 5432
backend_weight2 = 2             # This replica gets 2x more read traffic than backend 1
backend_data_directory2 = '/var/lib/postgresql/16/main'
backend_flag2 = 'ALLOW_TO_FAILOVER'
backend_application_name2 = 'pgpool_standby2'

Backend Flags

The backend_flag parameter controls PgPool-II's behavior toward each backend. ALLOW_TO_FAILOVER permits PgPool-II to remove the backend from the pool on failure and trigger the failover command. DISALLOW_TO_FAILOVER keeps the backend in the configuration even if it fails — useful for backends that are temporarily offline for maintenance. ALWAYS_PRIMARY marks a backend as the primary in streaming replication mode, bypassing primary detection logic (use with caution).

Failover Command

ini
# Script executed when a backend node fails. PgPool-II passes substitution variables:
#   %d = failed node ID
#   %h = failed node hostname
#   %p = failed node port
#   %D = failed node data directory
#   %m = new primary node ID (after promotion)
#   %H = new primary hostname
#   %M = old primary node ID

failover_command = '/etc/pgpool2/scripts/failover.sh %d %h %p %D %m %H %M'
follow_primary_command = '/etc/pgpool2/scripts/follow_primary.sh %d %h %p %D %m %H %P %r %R'
bash
#!/bin/bash
# /etc/pgpool2/scripts/failover.sh
# Minimal failover script — promote a Patroni-managed standby

FAILED_NODE_ID=$1
FAILED_HOST=$2
NEW_PRIMARY_ID=$5
NEW_PRIMARY_HOST=$6

logger -t pgpool_failover "Node ${FAILED_NODE_ID} (${FAILED_HOST}) failed. Promoting node ${NEW_PRIMARY_ID} (${NEW_PRIMARY_HOST})"

# Trigger Patroni failover via REST API
curl -s -XPATCH http://${NEW_PRIMARY_HOST}:8008/switchover \
  -H "Content-Type: application/json" \
  -d '{"leader": "'"${FAILED_HOST}"'"}'

exit 0

Health Check and Watchdog Configuration

Backend Health Checks

PgPool-II periodically connects to each backend and issues a simple query to verify availability. Misconfigured health check timing is the most common cause of spurious backend detachment — a network hiccup causes a false failure, PgPool-II fires the failover command, and a healthy standby gets promoted unnecessarily.

ini
# ─── Health Check ────────────────────────────────────────────────────────────
health_check_period = 10          # Seconds between health checks per backend
health_check_timeout = 20         # Seconds to wait for health check response
health_check_user = 'pgpool_hc'   # Dedicated low-privilege user for health checks
health_check_password = ''        # Leave empty to use pool_passwd file
health_check_database = 'postgres'

# Retry logic: how many consecutive failures before declaring backend down
health_check_max_retries = 3
health_check_retry_delay = 1      # Seconds between retries

# Per-backend health check overrides (useful for cross-AZ latency)
health_check_period0 = 10
health_check_period1 = 10
health_check_period2 = 10
sql
-- Create the health check user on each PostgreSQL backend
-- Minimal permissions: connect only
CREATE ROLE pgpool_hc WITH LOGIN PASSWORD 'hc_secure_password' CONNECTION LIMIT 5;
GRANT CONNECT ON DATABASE postgres TO pgpool_hc;
Warning

Setting health_check_max_retries = 0 means a single failed health check immediately detaches the backend and triggers the failover command. In environments with any network jitter, this causes false failovers. Always set health_check_max_retries to at least 3, and tune health_check_retry_delay based on your acceptable failover detection window.

Watchdog for PgPool-II High Availability

The watchdog subsystem allows multiple PgPool-II nodes to form a cluster. One node holds the virtual IP (VIP) and handles all client traffic. If it fails, a standby PgPool-II node detects the failure, takes the VIP, and resumes service. Watchdog communication uses heartbeat packets and an arbitration mechanism to prevent split-brain.

ini
# ─── Watchdog ────────────────────────────────────────────────────────────────
use_watchdog = on

# Virtual IP address that clients connect to
delegate_ip = '10.0.1.100'

# Network interface on which to assign the VIP
if_up_cmd = '/usr/bin/sudo /sbin/ip addr add $_IP_$/24 dev eth0 label eth0:0'
if_down_cmd = '/usr/bin/sudo /sbin/ip addr del $_IP_$/24 dev eth0'
arping_cmd = '/usr/bin/sudo /usr/sbin/arping -U $_IP_$ -w 1 -I eth0'

# This node's watchdog listening address (the IP of this PgPool-II host)
wd_hostname = '10.0.1.20'    # Change per node
wd_port = 9000
wd_priority = 1              # Higher value = preferred as leader (set differently per node)

# ─── Peer watchdog nodes ─────────────────────────────────────────────────────
other_pgpool_hostname0 = '10.0.1.21'
other_pgpool_port0 = 5432
other_wd_port0 = 9000

# Heartbeat settings
wd_heartbeat_keepalive = 2    # Seconds between heartbeat packets
wd_heartbeat_deadtime = 30    # Seconds of missed heartbeats before peer declared dead
Tip

The watchdog VIP requires ip addr add / arping to run as root. Configure /etc/sudoers to allow the postgres user (or whichever user runs PgPool-II) to execute only those specific commands without a password. Never run PgPool-II as root to work around this requirement.


Load Balancing and Read/Write Split Setup

Enabling Load Balancing

ini
# ─── Load Balancing ───────────────────────────────────────────────────────────
load_balance_mode = on

# Distribute read queries across all backends by weight.
# Writes always go to the primary (backend with smallest ID in streaming_replication mode).

# Queries sent to primary even in load balance mode:
# - Any DML (INSERT/UPDATE/DELETE/MERGE)
# - DDL (CREATE/ALTER/DROP)
# - Queries inside explicit transactions
# - Temporary table queries
# - Queries using functions in black_function_list

# Allow PgPool-II to send SELECT queries to standbys even when
# there are active write connections (safe for replication lag-tolerant reads)
ignore_leading_white_space = on

# Statement-level load balancing: each SELECT is independently load-balanced
statement_level_load_balance = on

Controlling Which Queries Are Load-Balanced

ini
# white_function_list: functions that return stable/immutable results —
# calls to these are safe to route to standbys
white_function_list = 'currval,lastval,nextval,setval'

# black_function_list: functions that modify state or read session variables —
# calls to these always go to primary
black_function_list = 'nextval,setval,get_current_timestamp'

# black_query_pattern_list: regex patterns — matching queries always go to primary
# Useful for application code that uses custom session variables before reads
black_query_pattern_list = 'SET LOCAL.*;SELECT pg_advisory_lock.*'

# database_redirect_preference_list: per-database routing overrides
# Route all analytics DB reads to backend 2 regardless of load balancing
database_redirect_preference_list = 'analytics:standby2'

Verifying Read Routing

sql
-- Connect through PgPool-II and check which backend your query runs on
SHOW pool_nodes;

-- Output:
-- node_id | hostname   | port | status | lb_weight | role    | select_cnt
-- --------+------------+------+--------+-----------+---------+-----------
-- 0       | 10.0.1.10  | 5432 | up     | 0.333     | primary | 412
-- 1       | 10.0.1.11  | 5432 | up     | 0.333     | standby | 3891
-- 2       | 10.0.1.12  | 5432 | up     | 0.333     | standby | 3944

-- Confirm current backend for this session
SELECT inet_server_addr(), pg_is_in_recovery();

In the SHOW pool_nodes output, watch select_cnt across standbys. Roughly equal counts confirm balanced distribution. A heavily imbalanced distribution usually means queries are landing on the primary because they are inside transactions or match entries in black_query_pattern_list.


PgPool-II vs PgBouncer

Feature PgPool-II 4.5 PgBouncer 1.22
Connection pooling Yes (per-child cached connections) Yes (session / transaction / statement)
Transaction pooling mode Limited (connection_cache) Full, high-density
Read/write splitting Yes (query-level routing) No
Load balancing across replicas Yes (weighted, configurable) No
Automatic backend failover Yes (failover_command + watchdog) No
HA for the proxy itself Yes (watchdog VIP) Manual (HAProxy or keepalived)
Prepared statements (extended query) Supported with protocol-level handling Requires disable in driver (transaction mode)
SCRAM-SHA-256 auth passthrough Yes (4.3+) Yes (1.21+)
Operational complexity High (pgpool.conf, scripts, watchdog) Low (single .ini file)
Connection density (clients:servers) Moderate (~10:1 typical) High (100:1+ in transaction mode)
Best for HA clusters needing load balancing High-concurrency OLTP, simple pooling
Latency overhead Higher (query parsing for routing) Minimal (~0.1ms passthrough)
Warning

PgPool-II's connection density is lower than PgBouncer's because its pooling model ties a child process to a client connection rather than multiplexing many clients through one server connection in transaction mode. For workloads with tens of thousands of concurrent application connections, PgBouncer in transaction pooling mode will outperform PgPool-II significantly. Consider a layered architecture: PgBouncer for connection multiplexing, HAProxy for replica routing, and Patroni for failover — rather than asking PgPool-II to do everything.

Key Takeaways
  • PgPool-II combines connection pooling, read/write splitting, and automatic failover in one daemon — reach for it when you need all three, not just pooling.
  • Size carefully: num_init_children × max_pool must not exceed the PostgreSQL max_connections limit on any backend node, including standbys that may be promoted.
  • Set health_check_max_retries to at least 3 and validate health check timing against your network latency before going to production — single-check-failure detachment causes unnecessary failovers.
  • Enable watchdog with a virtual IP whenever running PgPool-II in production; a single PgPool-II process in front of a HA PostgreSQL cluster is itself a single point of failure.
  • Monitor SHOW pool_nodes select_cnt distribution and status for each backend — an unexpected down status or unbalanced reads indicates a misconfiguration or replication lag condition.
  • For maximum connection density on OLTP workloads, PgBouncer in transaction pooling mode outperforms PgPool-II — consider each tool for its strengths rather than forcing PgPool-II to cover all scenarios.

Working with JusDB on PostgreSQL Connection Management

JusDB deploys and configures PostgreSQL connection pooling for engineering teams — PgBouncer for high-concurrency OLTP, PgPool-II for HA with load balancing, and HAProxy for multi-standby routing. Our DBAs match the right proxy to your workload.

Explore JusDB PostgreSQL Management →  |  Talk to a DBA

Related reading:

Share this article