Database SRE

PostgreSQL Point-in-Time Recovery: WAL Archiving, pg_basebackup, and pgBackRest

PostgreSQL PITR lets you recover your database to any point in time — down to the second — by replaying WAL segments on top of a base backup. Here's how to set it up correctly.

JusDB Team
January 9, 2023
10 min read
216 views

At 2:47 AM on a Tuesday in January 2024, a senior engineer at a fintech startup ran what he thought was a targeted cleanup script against the staging database. The connection string in his terminal was wrong — it pointed to production. DROP TABLE transactions; executed in under a second, taking 2.3 million transaction records with it. Cascading foreign key deletes wiped three dependent tables. The on-call DBA had a pgBackRest-managed WAL archive and a base backup from 11 PM the night before. Recovery to the point 90 seconds before the drop completed in 40 minutes. Without PITR, the alternative was restoring from the previous night's snapshot and losing 3 hours and 47 minutes of live transaction data — an unacceptable outcome for a payment processor.

PostgreSQL's Point-in-Time Recovery (PITR) is the mechanism that made that 40-minute recovery possible. It works by replaying Write-Ahead Log (WAL) segments on top of a base backup, stopping at any moment in time you specify. When configured correctly, PITR gives you the ability to recover your database to within seconds of any event — an accidental DROP, a botched migration, silent data corruption, or a ransomware attack that encrypted your primary storage.

This guide covers everything you need to implement production-grade PostgreSQL PITR: WAL archiving configuration, taking base backups, executing a recovery, choosing the right recovery target, using pgBackRest for production workloads, and — critically — testing that your recovery actually works before you need it.

TL;DR
  • PITR works by replaying WAL segments on top of a base backup. You need both: a base backup plus a continuous WAL archive.
  • Set wal_level = replica and archive_mode = on in postgresql.conf before anything else. These require a restart.
  • Use pg_basebackup for simple base backups; use pgBackRest for production — it adds compression, parallelism, incremental backups, and retention management.
  • PostgreSQL 12+ merged recovery.conf into postgresql.conf. Create a recovery.signal file in the data directory to trigger recovery mode.
  • Always specify recovery_target_action = promote — without it, the server pauses at the recovery target indefinitely.
  • Test your recovery monthly in an isolated environment. A backup you have never restored from is not a backup.
  • With pgBackRest parallel restore, RTO for a 500 GB database is typically under 30 minutes on modern storage.

How PostgreSQL PITR Works

PostgreSQL writes every change to the Write-Ahead Log before applying it to data files. This is not optional — it is fundamental to crash recovery, replication, and PITR. Every INSERT, UPDATE, DELETE, CREATE TABLE, and DROP TABLE generates WAL records. WAL records are organized into 16 MB segment files, named with a 24-character hexadecimal sequence like 000000010000000000000001.

PITR works in three stages:

  1. Base backup: A consistent snapshot of the entire data directory, taken while the server is running. PostgreSQL uses a checkpoint mechanism to ensure the backup is internally consistent.
  2. WAL archive: Every completed WAL segment is copied to a separate location (local disk, S3, GCS, etc.) as it is finished. This archive forms a continuous log of all changes since the base backup was taken.
  3. Recovery: To restore, you extract the base backup to a new data directory, configure a restore_command that tells PostgreSQL how to fetch WAL segments from the archive, set a recovery target (point in time, transaction ID, or named restore point), and start the server. PostgreSQL replays WAL segments sequentially until it reaches the target, then promotes to a writable primary.

The key insight is that the WAL contains the complete history of every change. If you archive all WAL segments between two base backups without gaps, you can recover to any microsecond within that window. A single missing WAL segment breaks the chain and limits recovery to the last segment before the gap.

Critical

WAL archiving and PITR protect against logical errors (accidental deletes, bad migrations) and storage failures. They do not replace high availability. A streaming replica provides automatic failover for infrastructure failures; PITR provides time-travel recovery for data errors. You need both in production.

Prerequisites: Configuring WAL Archiving

Before you can perform PITR, PostgreSQL must be configured to archive WAL segments. This requires changes to postgresql.conf and a restart. Plan this into a maintenance window for existing production servers.

postgresql.conf settings

Three parameters control WAL archiving. Open postgresql.conf (location varies; find it with SHOW config_file; in psql) and set:

bash
# Minimum WAL level required for archiving and streaming replication
# Options: minimal, replica, logical
# 'replica' is sufficient for PITR; 'logical' adds overhead you don't need
wal_level = replica

# Enable archiving — PostgreSQL will call archive_command for each completed segment
archive_mode = on

# The shell command PostgreSQL runs to archive each WAL segment
# %p = full path to the WAL file to archive
# %f = WAL filename only
# The command must return exit code 0 only on success
archive_command = 'cp %p /mnt/wal_archive/%f'

# How long to wait before archiving a partially-filled WAL segment
# Lower values reduce RPO; higher values reduce archive I/O
archive_timeout = 60   # seconds — archive at most every 60s even if WAL isn't full
Gotcha

The archive_command must be idempotent: if the same segment is archived twice (which PostgreSQL will retry on failure), the second call must succeed. The cp example above is idempotent because it overwrites the destination. A command that fails if the file already exists will cause archiving to stall.

After editing, apply the changes:

bash
# Test configuration syntax without restarting
pg_ctlcluster 16 main configtest

# Reload works for archive_command changes, but NOT for wal_level or archive_mode
# Those require a full restart
sudo systemctl restart postgresql@16-main

# Verify archiving is active
psql -c "SHOW archive_mode;"
psql -c "SHOW wal_level;"

# Force a WAL switch and check archive_status
psql -c "SELECT pg_switch_wal();"
ls /mnt/wal_archive/

Choosing a WAL archive destination

The archive destination is the most important infrastructure decision for PITR. The archive must survive whatever destroys your primary — if your primary and archive are on the same storage volume, you lose both in a disk failure.

Local directory (simplest, least durable):

bash
archive_command = 'cp %p /mnt/wal_archive/%f'
# /mnt/wal_archive should be a separate physical volume or NFS mount

Amazon S3 (recommended for AWS-hosted PostgreSQL):

bash
# Using AWS CLI — ensure the postgres OS user has IAM permissions to s3:PutObject
archive_command = 'aws s3 cp %p s3://my-pg-wal-archive/wal/%f'

# With pgBackRest (preferred — handles retries, compression, verification)
archive_command = 'pgbackrest --stanza=mydb archive-push %p'

Google Cloud Storage:

bash
archive_command = 'gsutil cp %p gs://my-pg-wal-archive/wal/%f'
Tip

Use pgBackRest's archive-push as your archive_command even if you are archiving to a local directory. pgBackRest compresses WAL segments, verifies their integrity with checksums, handles retries transparently, and produces structured logs that make it easy to detect archiving gaps. The operational overhead of setting up pgBackRest pays for itself the first time you need to debug a stalled archive.

Taking a Base Backup with pg_basebackup

A base backup is the starting point for any recovery. PostgreSQL ships pg_basebackup as the standard tool for creating base backups while the server is running and accepting connections.

Basic pg_basebackup command

Create a dedicated replication user for backups — do not use a superuser for this in production:

sql
-- Run as superuser in psql
CREATE ROLE replicator WITH LOGIN REPLICATION PASSWORD 'strong_password_here';

-- Allow connection from backup host (or localhost)
-- Edit pg_hba.conf to add:
-- host  replication  replicator  127.0.0.1/32  scram-sha-256

Then take the base backup:

bash
# -h : host
# -U : replication user
# -D : destination directory for the backup
# -Ft: tar format (one tar file per tablespace; default.tar is the main cluster)
# -z : gzip compress the tar files
# -Xs: stream WAL during the backup (ensures backup is self-consistent without gaps)
# -P : show progress during backup

pg_basebackup \
  -h localhost \
  -U replicator \
  -D /backup/base \
  -Ft \
  -z \
  -Xs \
  -P

# Output:
# 23428/23428 kB (100%), 1/1 tablespace
# ls /backup/base/
# base.tar.gz  pg_wal.tar.gz
Gotcha

The -Xs flag streams WAL during the backup. Without it, you must ensure that all WAL generated during the backup period is available in your archive before you can use this backup for PITR. With -Xs, the backup is self-contained for the duration of the backup itself, but you still need your ongoing WAL archive for point-in-time recovery after the backup completes.

Verifying backup integrity

bash
# Check the backup label to confirm start/stop WAL positions
tar -xzf /backup/base/base.tar.gz -O backup_label
# START WAL LOCATION: 0/3000028 (file 000000010000000000000003)
# CHECKPOINT LOCATION: 0/3000060
# BACKUP METHOD: streamed
# BACKUP FROM: primary
# START TIME: 2024-01-15 03:00:01 UTC
# LABEL: pg_basebackup base backup

# Verify the archive has segments from the backup start LSN forward
ls /mnt/wal_archive/ | head -5

# PostgreSQL 15+ supports backup manifest verification
pg_verifybackup /backup/base/
# backup successfully verified

Performing Point-in-Time Recovery

This procedure recovers a PostgreSQL cluster to a specific point in time. Execute all steps on the recovery target server — a separate host from the production primary if possible.

Step 1: Stop PostgreSQL

bash
# Stop the existing PostgreSQL instance (if any)
sudo systemctl stop postgresql@16-main

# Verify it is stopped
pg_lsclusters

Step 2: Restore the base backup

bash
# Move or rename the existing data directory
sudo mv /var/lib/postgresql/16/main /var/lib/postgresql/16/main.broken

# Create a fresh data directory
sudo mkdir -p /var/lib/postgresql/16/main
sudo chown postgres:postgres /var/lib/postgresql/16/main
sudo chmod 700 /var/lib/postgresql/16/main

# Extract the base backup as the postgres user
sudo -u postgres bash -c "
  cd /var/lib/postgresql/16/main
  tar -xzf /backup/base/base.tar.gz
  tar -xzf /backup/base/pg_wal.tar.gz -C pg_wal/
"

# Verify the data directory structure
ls -la /var/lib/postgresql/16/main/
# PG_VERSION  base  global  pg_hba.conf  pg_wal  postgresql.conf  ...

Step 3: Configure recovery target

PostgreSQL 12 and later merged recovery.conf into postgresql.conf. You no longer create a separate recovery.conf file. Instead, add recovery parameters directly to postgresql.conf and create a signal file to activate recovery mode.

bash
# Add recovery configuration to postgresql.conf
sudo -u postgres tee -a /var/lib/postgresql/16/main/postgresql.conf <<'EOF'

# ── PITR Recovery Configuration ─────────────────────────────────────────────
# Command PostgreSQL uses to fetch WAL segments from the archive
# %f = WAL filename, %p = destination path
restore_command = 'cp /mnt/wal_archive/%f %p'

# Recover to this specific point in time (UTC)
# PostgreSQL will replay WAL up to but not past this timestamp
recovery_target_time = '2024-01-15 14:30:00'

# What to do when the recovery target is reached:
# 'promote'  = make the server writable (most common for disaster recovery)
# 'pause'    = pause at target for inspection (useful for forensics)
# 'shutdown' = shut down cleanly at target
recovery_target_action = promote

# Optional: include/exclude transactions that committed exactly AT the target time
# 'true' (default) = include transactions that committed at the target time
recovery_target_inclusive = true
EOF

# Create recovery.signal — this tells PostgreSQL to enter recovery mode
# Without this file, PostgreSQL ignores the recovery parameters above
sudo -u postgres touch /var/lib/postgresql/16/main/recovery.signal
Critical

If you are using S3 or GCS as your WAL archive, your restore_command must be able to run as the postgres OS user. Test the command manually before starting recovery: sudo -u postgres cp /mnt/wal_archive/000000010000000000000001 /tmp/test_wal. A broken restore_command causes recovery to stall with cryptic log messages.

Step 4: Start and verify

bash
# Start PostgreSQL — it will enter recovery mode
sudo systemctl start postgresql@16-main

# Tail the logs to watch WAL replay progress
sudo tail -f /var/log/postgresql/postgresql-16-main.log

# Expected log output:
# LOG:  starting point-in-time recovery to 2024-01-15 14:30:00+00
# LOG:  restored log file "000000010000000000000003" from archive
# LOG:  redo starts at 0/3000028
# LOG:  consistent recovery state reached at 0/3000100
# LOG:  recovery stopping before commit of transaction 12345678, time 2024-01-15 14:30:01+00
# LOG:  pausing at the end of recovery
# LOG:  database system is ready to accept read-only connections
# LOG:  selected new timeline ID: 2
# LOG:  archive recovery complete

# After promotion, verify the data
psql -c "SELECT now();"
psql -c "SELECT count(*) FROM transactions;"

# Confirm recovery.signal was removed (PostgreSQL removes it on successful promotion)
ls /var/lib/postgresql/16/main/recovery.signal
# ls: cannot access '.../recovery.signal': No such file or directory

Recovery Target Options

PostgreSQL supports four recovery target types. Use the one that best matches the evidence you have about when the data error occurred.

Time-based recovery — the most common. Use when you know approximately when the bad event happened:

bash
# Recover to a specific timestamp (UTC recommended to avoid timezone ambiguity)
recovery_target_time = '2024-01-15 14:30:00'

# With timezone
recovery_target_time = '2024-01-15 14:30:00+00'

Transaction ID-based recovery — use when you know the exact transaction that caused the damage. Requires querying pg_waldump to identify the XID:

bash
# Find the transaction ID that dropped the table
pg_waldump -p /mnt/wal_archive -n 100 000000010000000000000010 | grep "DROP TABLE"
# rmgr: Heap        len (rec/tot):     59/    59, tx:     12345701, ...

# Recover to just before that transaction
recovery_target_xid = '12345700'    # one transaction before the bad one
recovery_target_inclusive = false   # do not apply the target XID itself

LSN-based recovery — most precise. Use when you have the exact WAL Log Sequence Number from monitoring or logs:

bash
# Recover to a specific LSN (Log Sequence Number)
recovery_target_lsn = '0/14ECFA8'

# Find LSNs with pg_waldump
pg_waldump -p /mnt/wal_archive -s 0/14000000 -e 0/15000000 000000010000000000000014

Named restore point — use when your application creates named checkpoints before risky operations:

sql
-- Application code creates a restore point before a migration
SELECT pg_create_restore_point('before_v2_3_migration');
-- Returns: 0/1234ABCD
bash
# Recover to the named restore point
recovery_target_name = 'before_v2_3_migration'
Tip

Integrate pg_create_restore_point() into your deployment pipeline. Before every schema migration, create a named restore point. This turns "recover to before the migration" from a forensic investigation into a single configuration line.

Using pgBackRest for Production PITR

pg_basebackup is a good starting point, but it has significant limitations for production use: it does not support incremental backups, has no built-in retention management, does not parallelize restore operations, and provides no WAL archiving management. For databases over 100 GB, or for any team with an RTO measured in minutes rather than hours, pgBackRest is the right tool.

pgBackRest config

Install pgBackRest and create the configuration file:

bash
# Install on Debian/Ubuntu
sudo apt install pgbackrest

# Create the configuration file
sudo tee /etc/pgbackrest/pgbackrest.conf <<'EOF'
[global]
# Repository location — use S3 for production
repo1-type=s3
repo1-s3-bucket=my-pg-backups
repo1-s3-endpoint=s3.amazonaws.com
repo1-s3-region=us-east-1
repo1-s3-key=AKIAIOSFODNN7EXAMPLE
repo1-s3-key-secret=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

# Enable compression
repo1-retention-full=2           # keep 2 full backups
repo1-retention-diff=7           # keep 7 differential backups
compress-type=lz4                # fast compression, good ratio
compress-level=1

# Process level logging
log-level-console=info
log-level-file=detail
log-path=/var/log/pgbackrest

[mydb]
pg1-path=/var/lib/postgresql/16/main
pg1-port=5432
pg1-user=postgres
EOF

# Initialize the stanza (repository metadata for this PostgreSQL cluster)
sudo -u postgres pgbackrest --stanza=mydb stanza-create

# Update postgresql.conf to use pgBackRest for archiving
# archive_command = 'pgbackrest --stanza=mydb archive-push %p'
# archive_cleanup_command = 'pgbackrest --stanza=mydb archive-cleanup %p'

Retention policies

bash
# Take a full backup
pgbackrest --stanza=mydb backup --type=full

# Take a differential backup (changes since last full)
pgbackrest --stanza=mydb backup --type=diff

# Take an incremental backup (changes since last backup of any type)
pgbackrest --stanza=mydb backup --type=incr

# List all backups with sizes, timestamps, and WAL ranges
pgbackrest --stanza=mydb info

# Output example:
# stanza: mydb
#     status: ok
#     full backup: 20240115-030001F
#         timestamp start/stop: 2024-01-15 03:00:01 / 2024-01-15 03:12:44
#         wal start/stop: 000000010000000000000003 / 000000010000000000000005
#         database size: 48.3GB, database backup size: 48.3GB
#     diff backup: 20240115-030001F_20240115-120001D
#         timestamp start/stop: 2024-01-15 12:00:01 / 2024-01-15 12:02:11
#         database backup size: 2.1GB

# Expire old backups per retention policy
pgbackrest --stanza=mydb expire

Parallel backup/restore

bash
# Restore to a specific point in time using pgBackRest
# --target-action=promote promotes the instance after reaching the target
# --process-max=4 uses 4 parallel processes for restore (matches your CPU count)
pgbackrest --stanza=mydb restore \
  --target="2024-01-15 14:30:00" \
  --target-action=promote \
  --target-exclusive=false \
  --process-max=4 \
  --delta   # only restore changed files (useful for large databases)

# Verify the restore
pgbackrest --stanza=mydb restore --dry-run \
  --target="2024-01-15 14:30:00"
Tip

Set --process-max to the number of disks in your storage array, not your CPU count. pgBackRest parallelism is I/O-bound. On a 4-disk RAID configuration, --process-max=4 will saturate the array and deliver maximum throughput. On a single NVMe SSD, --process-max=2 is typically sufficient.

Testing Your Recovery

The most common reason PITR fails in a real disaster is that it has never been tested. Archive gaps, broken restore_command scripts, expired IAM credentials, missing WAL segments — none of these problems are visible until you attempt a restore. By then, you are under pressure.

Schedule a monthly recovery test. The test should be fully automated and run in an isolated environment that mirrors production as closely as possible.

bash
#!/bin/bash
# recovery-test.sh — run monthly in CI or as a cron job
set -euo pipefail

STANZA="mydb"
TEST_DIR="/tmp/pg_recovery_test"
TEST_PORT=5435
LOG_FILE="/var/log/pgbackrest/recovery-test-$(date +%Y%m%d).log"
TARGET_TIME=$(date -u -d '1 hour ago' '+%Y-%m-%d %H:%M:%S')

echo "Starting recovery test at $(date -u)" | tee "$LOG_FILE"
echo "Recovery target: $TARGET_TIME" | tee -a "$LOG_FILE"

# Clean up previous test
rm -rf "$TEST_DIR"
mkdir -p "$TEST_DIR"

# Restore to test directory
pgbackrest --stanza="$STANZA" \
  --pg1-path="$TEST_DIR" \
  --pg1-port="$TEST_PORT" \
  --target="$TARGET_TIME" \
  --target-action=promote \
  --process-max=4 \
  restore 2>&1 | tee -a "$LOG_FILE"

# Start the recovered instance on a non-production port
pg_ctl -D "$TEST_DIR" -o "-p $TEST_PORT" -l "$LOG_FILE" start
sleep 5

# Validate: check row counts against known baselines
TRANSACTION_COUNT=$(psql -p "$TEST_PORT" -t -c "SELECT count(*) FROM transactions;" | tr -d ' ')
echo "Transaction count after recovery: $TRANSACTION_COUNT" | tee -a "$LOG_FILE"

if [ "$TRANSACTION_COUNT" -lt "1000000" ]; then
  echo "FAIL: Transaction count below expected minimum" | tee -a "$LOG_FILE"
  pg_ctl -D "$TEST_DIR" stop
  exit 1
fi

echo "PASS: Recovery test completed successfully" | tee -a "$LOG_FILE"
pg_ctl -D "$TEST_DIR" stop
rm -rf "$TEST_DIR"
Gotcha

Do not test recovery on your production host using the production data directory. A misconfigured recovery test can overwrite your live data directory. Always restore to an isolated host or a separate directory with a non-production port number.

RTO and RPO: Setting Realistic Targets

Recovery Time Objective (RTO) is how long recovery takes. Recovery Point Objective (RPO) is how much data you can afford to lose. Both are contractual commitments, not aspirational targets — size your infrastructure to meet them.

Database Size Backup Tool Storage Realistic RTO RPO with archive_timeout=60s
Under 50 GB pg_basebackup Local SSD 10–20 minutes ~60 seconds
50 GB – 500 GB pgBackRest parallel S3 / NVMe 20–45 minutes ~60 seconds
500 GB – 2 TB pgBackRest parallel S3 + local staging 45–90 minutes ~60 seconds
Over 2 TB pgBackRest + delta restore S3 + high-throughput NVMe 2–6 hours ~60 seconds

Several factors drive RTO variation more than database size alone:

  • Network bandwidth between archive and recovery host: A 500 GB restore over a 1 Gbps link takes a minimum of 67 minutes just for data transfer. A 10 Gbps link cuts that to under 7 minutes before WAL replay.
  • WAL volume to replay: If the recovery target is 8 hours after the base backup, PostgreSQL must replay 8 hours of WAL. High-write databases generate substantial WAL — this can add tens of minutes to recovery time.
  • pgBackRest delta restore: If the recovery target server already has most of the data (e.g., a warm standby), --delta mode only transfers changed blocks. This can reduce restore time by 80–90% for recent failures.
  • Parallelism: pgBackRest with --process-max=8 on a well-provisioned server can achieve throughputs of 2–4 GB/s from S3, making a 1 TB restore feasible in under 10 minutes of data transfer.

For RPO, the critical parameter is archive_timeout. At the default (disabled or very long), a low-traffic database might not fill a WAL segment for hours, meaning you could lose hours of data even with archiving enabled. Set archive_timeout = 60 to ensure WAL is archived at least every 60 seconds regardless of traffic. For stricter RPO requirements (under 5 seconds), configure streaming replication to a standby — WAL streaming provides near-zero RPO for infrastructure failures.

Critical

Measure your actual restore throughput during monthly recovery tests and compare it against your RTO SLA. If restoring 300 GB takes 55 minutes and your RTO is 1 hour, you have almost no margin for unexpected WAL replay time. Either improve throughput (faster storage, more parallelism) or widen your RTO commitment to an honest number.

Key Takeaways
  • PITR requires two components: a base backup and a continuous WAL archive. A gap in either breaks recovery at that point.
  • Set wal_level = replica and archive_mode = on first — these require a PostgreSQL restart and must be in place before you need recovery.
  • PostgreSQL 12+ no longer uses recovery.conf. Use recovery.signal plus parameters in postgresql.conf to trigger recovery mode.
  • Always set recovery_target_action = promote unless you specifically need to pause the server for forensic inspection at the recovery target.
  • pgBackRest is the production standard over bare pg_basebackup: it adds compression, parallel restore, incremental backups, S3/GCS integration, and WAL archiving management in a single tool.
  • Use named restore points (pg_create_restore_point()) before risky deployments to make recovery target selection trivial.
  • Set archive_timeout = 60 on low-traffic databases or your effective RPO could be measured in hours rather than seconds.
  • Test your recovery every month on an isolated host. The test should verify data integrity, not just that the restore command exited 0.
  • pgBackRest parallel restore achieves RTOs under 30 minutes for databases under 500 GB on modern NVMe-backed storage with adequate network bandwidth.

Working with JusDB on PostgreSQL Backup

JusDB implements and manages production-grade PostgreSQL backup strategies — including WAL archiving, pgBackRest configuration, automated recovery testing, and RTO/RPO validation. When data loss happens, your recovery window is measured in minutes, not hours.

Explore JusDB PostgreSQL Services →  |  Talk to a DBA

Related reading:

Share this article