At 02:17 on a Tuesday morning, a payments platform lost their primary PostgreSQL 15 cluster to a catastrophic storage failure on a three-node bare-metal setup in their Frankfurt data center. The on-call DBA had a base backup from 36 hours ago, no WAL archive, no streaming replica, and an RTO target nobody had ever formally written down. By the time they finished a full pg_dump restore from that stale backup, they had lost 36 hours of transaction data, issued a regulatory disclosure, and triggered a customer churn event that cost them far more than the hardware. The infrastructure team had assumed their backup strategy was solid. They had never tested a restore. This guide is the playbook they needed before that night — not after.
- Define RTO and RPO in writing before choosing any backup strategy; these two numbers determine every other technical decision in your DR plan.
- Use
pg_basebackupor pgBackRest for physical backups — they are orders of magnitude faster to restore thanpg_dumpfor large databases. - Enable WAL archiving (
archive_mode = on,archive_command) immediately; without it, Point-in-Time Recovery (PITR) is impossible. - Streaming replication with
synchronous_standby_namesachieves zero RPO at the cost of write latency; async replication gives lower latency with a small RPO window. - A backup that has never been tested is not a backup — automate restore verification weekly, not quarterly.
- Alert on backup freshness: if the last successful backup is older than 24 hours, page someone before the next failure, not after.
RTO vs RPO: The Two Numbers That Drive Every Decision
Every other technical decision in your PostgreSQL DR strategy is downstream of two metrics: Recovery Time Objective (RTO) and Recovery Point Objective (RPO). Confusing them — or leaving them undefined — is the most expensive mistake a DBA can make.
RTO is the maximum acceptable time from failure detection to the database being fully operational. RTO is a business question: how long can the application be unavailable before the cost becomes intolerable? A payments processor may have an RTO of 5 minutes. An internal analytics database may tolerate 4 hours.
RPO is the maximum acceptable data loss, measured in time. RPO answers: if disaster strikes right now, how much committed data are you allowed to lose? An RPO of zero means no committed transaction can be lost — ever. An RPO of 1 hour means losing up to 60 minutes of writes is acceptable.
Set both numbers in writing, with stakeholder sign-off, before designing your backup architecture. The table below maps common RTO/RPO targets to the minimum required PostgreSQL capabilities:
| RTO Target | RPO Target | Minimum Required Strategy |
|---|---|---|
| < 5 minutes | Zero | Synchronous streaming replication + automatic failover (Patroni, repmgr) |
| 5 – 30 minutes | < 5 minutes | Async streaming replication + WAL archiving + pre-promoted standby |
| 30 minutes – 4 hours | < 1 hour | Physical base backup + WAL archiving for PITR |
| > 4 hours | > 1 hour | Daily pg_dump + WAL archiving (minimum viable; not recommended for production) |
Never design a backup strategy for a database before you have written down your RTO and RPO targets and had them approved by your engineering leadership and, where relevant, your compliance team. Skipping this step means you will optimize for technical elegance rather than business requirements, and you will discover the mismatch at the worst possible moment.
Backup Types: Physical vs Logical
pg_dump and pg_dumpall: Logical Backups
pg_dump produces a logical export of one database. It is SQL-statement-level: it reads every row from every table and generates INSERT statements, COPY blocks, or a custom binary format. It is schema-aware, cross-version compatible, and can restore individual tables or schemas. It is also the wrong choice as your primary backup mechanism for any database over ~50 GB, because restore time scales linearly with data volume and generates enormous WAL on the target during restore.
# Custom format backup — best for single-database logical backups
pg_dump \
--host=localhost \
--port=5432 \
--username=postgres \
--format=custom \
--compress=9 \
--jobs=4 \
--file=/backups/mydb_$(date +%Y%m%d_%H%M%S).dump \
mydb
# Restore a custom-format dump
pg_restore \
--host=localhost \
--port=5432 \
--username=postgres \
--dbname=mydb \
--jobs=4 \
--verbose \
/backups/mydb_20250115_020000.dump
# Cluster-wide backup including roles and tablespaces
pg_dumpall \
--host=localhost \
--username=postgres \
--file=/backups/cluster_globals_$(date +%Y%m%d).sql \
--globals-onlyUse pg_dump for: schema migrations, cross-version upgrades, selective table exports, development database seeding, and as a supplemental backup for critical individual tables. Do not rely on it as your sole DR mechanism for databases above 10 GB.
pg_basebackup: Physical Backups
pg_basebackup creates a binary copy of the entire PostgreSQL data directory using the streaming replication protocol. Restore is a filesystem copy — no SQL parsing, no row-by-row insertion. A 500 GB database that takes 6 hours to restore from pg_dump can be restored from a physical backup in under 30 minutes on fast storage.
# Physical backup with WAL segments included (self-contained backup)
pg_basebackup \
--host=localhost \
--port=5432 \
--username=replication_user \
--pgdata=/backups/base/$(date +%Y%m%d) \
--format=tar \
--gzip \
--compress=9 \
--wal-method=stream \
--checkpoint=fast \
--progress \
--verbose
# Required: replication_user must have REPLICATION privilege
# CREATE ROLE replication_user REPLICATION LOGIN PASSWORD 'strongpassword';Use --wal-method=stream rather than --wal-method=fetch on pg_basebackup. The stream method opens a second connection to stream WAL concurrently with the base backup, ensuring the resulting backup is immediately usable for PITR without gaps — even if you have no WAL archive configured yet. The fetch method retrieves WAL at the end of the backup and can miss WAL segments if they are recycled before the backup completes.
pgBackRest: Enterprise-Grade Backup Management
pgBackRest is the production-grade backup solution used by the majority of serious PostgreSQL deployments today. It adds delta backup, parallel compression, remote backup over SSH or S3, retention policy management, backup verification, and PITR orchestration on top of the raw physical backup mechanism.
# /etc/pgbackrest/pgbackrest.conf
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=4
repo1-retention-diff=14
repo1-cipher-type=aes-256-cbc
repo1-cipher-pass=your-strong-passphrase
# S3 remote repository (alternative to local)
# repo1-type=s3
# repo1-s3-bucket=your-backup-bucket
# repo1-s3-region=us-east-1
# repo1-s3-endpoint=s3.amazonaws.com
# repo1-path=/pgbackrest
compress-type=lz4
compress-level=6
process-max=4
log-level-console=info
log-level-file=detail
[mydb]
pg1-path=/var/lib/postgresql/15/main
pg1-port=5432
pg1-socket-path=/var/run/postgresql# Initial stanza creation (run once)
pgbackrest --stanza=mydb stanza-create
# Full backup
pgbackrest --stanza=mydb --type=full backup
# Differential backup (changes since last full)
pgbackrest --stanza=mydb --type=diff backup
# Incremental backup (changes since last backup of any type)
pgbackrest --stanza=mydb --type=incr backup
# List all backups
pgbackrest --stanza=mydb infoBarman: Backup and Recovery Manager
Barman (Backup and Recovery Manager) from EnterpriseDB is the other widely deployed enterprise backup tool. It specializes in streaming WAL via pg_receivewal for near-zero RPO, centralized backup server management for multiple PostgreSQL instances, and tightly integrated monitoring. Barman is the right choice when you need a dedicated backup server managing 10+ PostgreSQL clusters with centralized retention and alerting.
WAL Archiving for Point-in-Time Recovery
Write-Ahead Log (WAL) archiving is the foundation of PITR. Without WAL archiving, you can only restore to the exact moment your last base backup completed — not to any arbitrary point in time. Every committed transaction generates WAL, and archiving those WAL segments lets you replay the exact sequence of changes up to any second within your retention window.
# postgresql.conf — minimum WAL archiving configuration
wal_level = replica # Must be replica or logical for archiving
archive_mode = on # Enable archiving (requires restart)
archive_command = 'test ! -f /wal_archive/%f && cp %p /wal_archive/%f'
archive_timeout = 60 # Force WAL segment switch every 60 seconds
# Limits maximum RPO on idle databases
# For S3 archiving (using pgBackRest or WAL-G)
# archive_command = 'pgbackrest --stanza=mydb archive-push %p'
# archive_command = 'wal-g wal-push %p'The archive_command is a shell command that PostgreSQL calls for each completed WAL segment. The placeholders %p (full path to the WAL file) and %f (WAL filename only) are substituted automatically. The command must exit with status 0 only on success — if it exits non-zero, PostgreSQL retries indefinitely and the WAL segment is not removed from pg_wal.
# postgresql.conf — restore_command for PITR on the recovery target
# (This goes on the restored instance, not the primary)
restore_command = 'cp /wal_archive/%f %p'
# For pgBackRest:
# restore_command = 'pgbackrest --stanza=mydb archive-get %f %p'
# For WAL-G on S3:
# restore_command = 'wal-g wal-fetch %f %p'Test your archive_command immediately after enabling it. A misconfigured archive command that silently fails will cause WAL segments to accumulate in pg_wal/ until the filesystem fills up and PostgreSQL panics. Monitor the pg_stat_archiver view: SELECT * FROM pg_stat_archiver; and alert if failed_count is increasing or last_archived_time is stale.
PITR Walkthrough: Restoring to a Point in Time
This is the full PITR procedure for PostgreSQL 14+, from a physical base backup and WAL archive, targeting a specific timestamp.
# Step 1: Stop the failed/target PostgreSQL instance
systemctl stop postgresql@15-main
# Step 2: Clear the existing data directory (or use a new server)
rm -rf /var/lib/postgresql/15/main/*
# Step 3: Restore the base backup
# Option A: From pg_basebackup tar archive
tar -xzf /backups/base/20250222/base.tar.gz \
-C /var/lib/postgresql/15/main/
# Option B: Using pgBackRest
pgbackrest --stanza=mydb \
--type=time \
"--target=2025-02-22 03:45:00" \
--target-action=promote \
restore
# Step 4 (if restoring manually from pg_basebackup):
# Create recovery signal file — tells PostgreSQL to enter recovery mode
touch /var/lib/postgresql/15/main/recovery.signal
# Step 5: Configure recovery target in postgresql.auto.conf
# (PostgreSQL 12+: recovery.conf is gone; use postgresql.auto.conf or ALTER SYSTEM)
cat >> /var/lib/postgresql/15/main/postgresql.auto.conf << 'EOF'
restore_command = 'cp /wal_archive/%f %p'
recovery_target_time = '2025-02-22 03:45:00 UTC'
recovery_target_action = 'promote'
recovery_target_inclusive = true
EOF
# Step 6: Fix ownership
chown -R postgres:postgres /var/lib/postgresql/15/main/
# Step 7: Start PostgreSQL — it will enter recovery mode, replay WAL,
# stop at the target time, and promote to primary
systemctl start postgresql@15-main
# Step 8: Tail the logs and watch recovery progress
tail -f /var/log/postgresql/postgresql-15-main.logDuring recovery, PostgreSQL logs each WAL segment as it is applied. When it reaches the recovery_target_time, it logs recovery stopping before commit of transaction ... at <timestamp> and then, with recovery_target_action = 'promote', promotes the instance to a writable primary. Verify the recovery endpoint with SELECT pg_is_in_recovery(); — it should return false after promotion.
Tablespace and Multi-Volume Restore Considerations
If your PostgreSQL cluster uses non-default tablespaces (common for separating hot indexes onto NVMe from cold tables on HDD), the physical backup contains symlinks in pg_tblspc/ pointing to the original tablespace directories. On restore to a different server, those paths must exist and be writable before PostgreSQL starts.
# List tablespace symlinks in the backup
ls -la /var/lib/postgresql/15/main/pg_tblspc/
# Create the target directories on the new server
mkdir -p /mnt/nvme/pg_indexes_tblspc
mkdir -p /mnt/hdd/pg_archive_tblspc
chown postgres:postgres /mnt/nvme/pg_indexes_tblspc /mnt/hdd/pg_archive_tblspc
# If paths differ from the original, update symlinks before starting
# pgBackRest handles this with tablespace-map in the restore command:
pgbackrest --stanza=mydb \
--type=time \
"--target=2025-02-22 03:45:00" \
"--tablespace-map=16388=/mnt/nvme/pg_indexes_tblspc" \
"--tablespace-map=16389=/mnt/hdd/pg_archive_tblspc" \
restoreStreaming Replication for High Availability and Zero RPO
A warm standby via streaming replication is the most operationally efficient way to achieve low RTO for most organizations. The standby continuously replays WAL from the primary. In asynchronous mode it provides near-zero RPO (typically 0–500ms of lag under normal load). In synchronous mode it provides exactly zero RPO, because the primary does not acknowledge a commit until the standby has confirmed WAL receipt.
# On the PRIMARY: postgresql.conf
wal_level = replica
max_wal_senders = 5
wal_keep_size = 2048 # Keep 2 GB of WAL for lagging standbys
hot_standby = on
# Zero RPO: synchronous replication to at least one standby
# The primary will not acknowledge commits until 'standby1' confirms WAL receipt
synchronous_standby_names = 'FIRST 1 (standby1, standby2)'
# pg_hba.conf — allow replication connections
# host replication replication_user 192.168.1.0/24 scram-sha-256# Bootstrap a streaming standby from the primary
pg_basebackup \
--host=primary.db.internal \
--port=5432 \
--username=replication_user \
--pgdata=/var/lib/postgresql/15/main \
--wal-method=stream \
--checkpoint=fast \
--write-recovery-conf \ # Writes standby.signal + primary_conninfo automatically
--progressUse synchronous_commit = remote_apply instead of the default remote_write for the strongest durability guarantee. With remote_write, the standby has received the WAL but may not have applied it to shared buffers yet. With remote_apply, the commit is not confirmed until the standby has applied the WAL and the changes are visible on the standby — true zero data loss with zero read-your-writes gap during failover.
Testing DR: Automated Restore Verification
The only meaningful test of a backup strategy is a successful restore. A backup that has never been restored is a hypothesis, not a guarantee. Automate restore testing on a weekly schedule on a separate validation host. Treat a failed restore test as a Severity 1 incident.
#!/bin/bash
# dr-test.sh — weekly automated restore verification script
set -euo pipefail
STANZA="mydb"
RESTORE_HOST="dr-test-host.internal"
RESTORE_PORT=5433
RESTORE_DATA="/var/lib/postgresql/15/dr-test"
LOG="/var/log/dr-test/$(date +%Y%m%d).log"
TARGET_TIME="$(date -u -d '2 hours ago' '+%Y-%m-%d %H:%M:%S') UTC"
echo "[$(date -u)] Starting DR restore test" | tee -a "$LOG"
# Stop and clean restore target
ssh postgres@"$RESTORE_HOST" "pg_ctlcluster 15 dr-test stop || true"
ssh postgres@"$RESTORE_HOST" "rm -rf ${RESTORE_DATA}/*"
# Restore with pgBackRest to 2 hours ago
pgbackrest \
--stanza="$STANZA" \
--pg1-host="$RESTORE_HOST" \
--pg1-path="$RESTORE_DATA" \
--pg1-port="$RESTORE_PORT" \
--type=time \
"--target=${TARGET_TIME}" \
--target-action=promote \
restore 2>&1 | tee -a "$LOG"
# Start the restored instance
ssh postgres@"$RESTORE_HOST" "pg_ctlcluster 15 dr-test start"
# Wait for recovery to complete
sleep 30
# Validation queries
RESULT=$(psql \
--host="$RESTORE_HOST" \
--port="$RESTORE_PORT" \
--username=postgres \
--dbname=mydb \
--tuples-only \
--command="SELECT COUNT(*) FROM orders WHERE created_at < NOW() - interval '2 hours';")
echo "[$(date -u)] Validation row count: ${RESULT}" | tee -a "$LOG"
if [[ -z "$RESULT" || "$RESULT" -eq 0 ]]; then
echo "[$(date -u)] RESTORE TEST FAILED — row count is zero or query failed" | tee -a "$LOG"
# Page on-call via PagerDuty / OpsGenie
curl -s -X POST https://events.pagerduty.com/v2/enqueue \
-H "Content-Type: application/json" \
-d "{\"routing_key\":\"$PD_ROUTING_KEY\",\"event_action\":\"trigger\",\"payload\":{\"summary\":\"PostgreSQL DR restore test FAILED\",\"severity\":\"critical\"}}"
exit 1
fi
echo "[$(date -u)] RESTORE TEST PASSED" | tee -a "$LOG"Cloud DR: RDS, Aurora, and S3-Based WAL Archiving
For teams running PostgreSQL on AWS, native cloud DR capabilities significantly reduce operational burden — but they require deliberate configuration to meet custom RTO/RPO targets.
RDS Automated Backups take a daily storage-level snapshot and continuously stream transaction logs to S3. The retention window is configurable from 1 to 35 days. PITR to any second within the retention window is available from the console or AWS CLI:
# Restore RDS PostgreSQL instance to a specific point in time
aws rds restore-db-instance-to-point-in-time \
--source-db-instance-identifier mydb-prod \
--target-db-instance-identifier mydb-pitr-restore \
--restore-time 2025-02-22T03:45:00Z \
--db-instance-class db.r6g.2xlarge \
--multi-az \
--region us-east-1Aurora Global Database replicates data across up to five AWS regions with a storage-level replication lag of typically under 1 second. For global RPO requirements, it is the most operationally simple option. Regional failover (promoting a secondary region to primary) completes in under 1 minute with Aurora Global Database — a level of RTO that previously required complex multi-region Patroni clusters.
S3-Based WAL Archiving with WAL-G is the standard pattern for self-managed PostgreSQL on EC2 or Kubernetes when you need cost-effective off-site WAL archiving without a dedicated backup server:
# Environment variables for WAL-G with S3
export WALG_S3_PREFIX=s3://your-backup-bucket/postgres/mydb
export AWS_REGION=us-east-1
export WALG_COMPRESSION_METHOD=lz4
export WALG_DELTA_MAX_STEPS=6
# postgresql.conf
# archive_command = 'wal-g wal-push %p'
# restore_command = 'wal-g wal-fetch %f %p'
# Take a full backup
wal-g backup-push /var/lib/postgresql/15/main
# List available backups
wal-g backup-list
# Restore to PITR
WALG_DOWNLOAD_CONCURRENCY=8 \
wal-g backup-fetch /var/lib/postgresql/15/main LATESTMonitoring Backup Freshness
A backup strategy without freshness monitoring is incomplete. You must alert proactively when backups are stale — before a failure exposes the gap.
-- Query pg_stat_archiver for WAL archive health
SELECT
archived_count,
last_archived_wal,
last_archived_time,
failed_count,
last_failed_wal,
last_failed_time,
EXTRACT(EPOCH FROM (now() - last_archived_time)) / 3600 AS hours_since_last_archive
FROM pg_stat_archiver;
-- Alert condition: archive lag > 1 hour
SELECT CASE
WHEN EXTRACT(EPOCH FROM (now() - last_archived_time)) > 3600
THEN 'ALERT: WAL archive is stale'
ELSE 'OK'
END AS archive_status
FROM pg_stat_archiver;# Prometheus alerting rule (prometheus/alerts/postgresql.yml)
# Alert if last pgBackRest backup is older than 25 hours
- alert: PostgreSQLBackupStale
expr: |
(time() - pgbackrest_backup_timestamp_stop_seconds{type="full"}) > 90000
for: 5m
labels:
severity: critical
annotations:
summary: "PostgreSQL backup is stale ({{ $labels.stanza }})"
description: "Last successful full backup completed {{ $value | humanizeDuration }} ago. Maximum threshold is 25 hours."
- alert: PostgreSQLWALArchiveFailing
expr: pg_stat_archiver_failed_count > 0
for: 10m
labels:
severity: warning
annotations:
summary: "PostgreSQL WAL archiving is failing"
description: "{{ $value }} WAL segments have failed to archive on {{ $labels.instance }}."DR Runbook Template
Every PostgreSQL deployment needs a written DR runbook — a step-by-step procedure that a sleep-deprived on-call engineer can execute under pressure. Below is the canonical template. Populate it for your environment before you need it.
═══════════════════════════════════════════════════════
POSTGRESQL DISASTER RECOVERY RUNBOOK
Last tested: [DATE] | Tested by: [NAME]
RTO target: [X minutes] | RPO target: [Y minutes]
═══════════════════════════════════════════════════════
PHASE 1: DETECTION & ASSESSMENT (target: < 5 min)
──────────────────────────────────────────────────
[ ] Confirm outage: psql -h PRIMARY_HOST -c "SELECT 1;"
[ ] Check PostgreSQL logs: journalctl -u postgresql@15-main -n 200
[ ] Check system health: df -h, free -m, dmesg | tail -50
[ ] Determine failure type:
- Software crash → attempt restart first
- Corruption/hardware → proceed to restore
[ ] Notify: Slack #incidents, page DBA lead
[ ] Open incident ticket and begin timeline log
PHASE 2: FAILOVER DECISION (target: < 10 min)
──────────────────────────────────────────────
[ ] Is a streaming standby available?
YES → promote standby (Phase 2A)
NO → restore from backup (Phase 2B)
Phase 2A — Standby Promotion:
[ ] Confirm standby is healthy:
psql -h STANDBY_HOST -c "SELECT pg_is_in_recovery();"
[ ] Check replication lag before promotion:
psql -h STANDBY_HOST -c "SELECT now() - pg_last_xact_replay_timestamp();"
[ ] Promote the standby:
pg_ctl promote -D /var/lib/postgresql/15/main
# OR via Patroni: patronictl failover --master PRIMARY --candidate STANDBY
[ ] Update DNS / load balancer to point to new primary
[ ] Verify application connectivity
Phase 2B — Restore from Backup:
[ ] Identify latest usable backup:
pgbackrest --stanza=mydb info
[ ] Determine recovery target time (use incident start - 5 min buffer)
[ ] Proceed to Phase 3
PHASE 3: RESTORE STEPS (time varies by database size)
──────────────────────────────────────────────────────
[ ] Provision restore target host if original is unrecoverable
[ ] Stop PostgreSQL on restore target (if running)
[ ] Execute pgBackRest restore:
pgbackrest --stanza=mydb \
--type=time \
"--target=RECOVERY_TARGET_TIMESTAMP" \
--target-action=promote \
restore
[ ] Start PostgreSQL and monitor recovery progress:
tail -f /var/log/postgresql/postgresql-15-main.log
[ ] Confirm promotion: SELECT pg_is_in_recovery(); → false
PHASE 4: VALIDATION (target: < 15 min)
────────────────────────────────────────
[ ] Row count spot-checks on critical tables
[ ] Verify latest expected transaction exists
[ ] Run application smoke tests
[ ] Check for sequence gaps or constraint violations
[ ] Confirm application is writing new data successfully
PHASE 5: POST-INCIDENT (within 24 hours)
─────────────────────────────────────────
[ ] Root cause analysis documented
[ ] Recovery timeline documented (detection → restore complete)
[ ] Gap against RTO/RPO targets identified
[ ] Backup strategy updated if deficiencies found
[ ] Runbook updated with lessons learned
[ ] Stakeholder summary sent
═══════════════════════════════════════════════════════Keep your DR runbook in a location that is accessible when your primary systems are down — a printed copy in your office, a document in a separate cloud account, or a page in a wiki that does not depend on the same infrastructure as your database. A runbook stored only in Confluence on the same AWS account that just lost its database is not useful during a DR event.
- Define RTO and RPO in writing before touching any backup tooling — these numbers determine every architectural decision that follows.
- Physical backups (
pg_basebackup, pgBackRest, Barman) are dramatically faster to restore than logical backups (pg_dump) for any database over 10 GB. Use logical backups as a supplement, not a primary DR mechanism. - WAL archiving is the prerequisite for PITR. Enable
archive_mode = onand validate yourarchive_commandis actually working by checkingpg_stat_archiverdaily. - Synchronous replication (
synchronous_standby_names) achieves zero RPO at the cost of write latency. For most workloads, async replication with a WAL archive provides the right balance of performance and data safety. - Test your restore. Weekly. Automate it. A failed restore test is a Severity 1 incident, because it means your DR plan is broken before a disaster proved it.
- Alert when your last successful backup is older than 24 hours. Alert when WAL archiving is failing. Monitor
pg_stat_archiverand your backup tool's own metrics. Backup freshness must be an active monitoring concern, not a periodic audit. - Maintain a written, tested DR runbook that lives outside your primary infrastructure and can be executed by an on-call engineer who did not design the backup strategy.
Building and validating a PostgreSQL disaster recovery strategy is one of the highest-leverage investments a DBA team can make — until you need it, it is invisible; when you need it, nothing else matters. If your current backup architecture has gaps, untested assumptions, or undefined RTO/RPO targets, JusDB's PostgreSQL consulting team can audit your existing setup, design a backup and replication strategy matched to your business requirements, automate restore testing, and build the runbook your team will rely on when the lights go out. Reach out at jusdb.com before the next incident, not during it.