At 2:47 AM on a Tuesday, a DBA at a mid-sized fintech company got paged: their primary PostgreSQL 15 instance had gone read-only. The root cause was not a query gone wrong or a rogue migration — it was a 500 GB pg_wal directory that had silently consumed every available gigabyte on the data volume. An inactive logical replication slot, left behind after a failed migration test six weeks earlier, had been holding every WAL segment generated since then. The instance had been slowly accumulating WAL files at a rate of about 80 GB per week, and nobody had a dashboard watching it. The on-call engineer spent three hours recovering because they were not sure which slots were safe to drop, what archiving had already completed, and whether dropping the slot could corrupt the replica. This guide is the runbook they wish they had.
- WAL files are PostgreSQL's write-ahead log: they underpin crash recovery, streaming replication, and point-in-time recovery (PITR). PostgreSQL never deletes them until it is certain they are no longer needed.
- The primary levers for WAL retention are wal_keep_size (PostgreSQL 13+), wal_keep_segments (PostgreSQL 12 and older), min_wal_size, and max_wal_size.
- Inactive replication slots are the most common cause of uncontrolled WAL accumulation: they override every other retention limit and will fill your disk.
- Monitor pg_replication_slots, pg_ls_waldir(), and pg_stat_replication continuously; alert before you hit 70–75% of the WAL volume.
- Safe cleanup paths: pg_drop_replication_slot() for orphaned slots, pg_archivecleanup for archived WAL segments, and tuning wal_keep_size to a realistic standby lag window.
What WAL Files Are and Why PostgreSQL Keeps Them
The Write-Ahead Log is the foundation of PostgreSQL's durability guarantee. Before any data page is modified on disk, PostgreSQL writes a record of the change to WAL. On crash, PostgreSQL replays WAL from the last checkpoint forward, bringing the data files back to a consistent state. This is the first reason WAL files must be retained: crash recovery.
The second reason is streaming replication. Standby servers replicate by consuming the primary's WAL stream. If a standby falls behind — because of network latency, a maintenance window, or a long-running query on the standby — the primary must retain WAL segments that the standby has not yet consumed. If those segments are recycled before the standby catches up, the standby is forcibly disconnected and must be rebuilt from a base backup.
The third reason is point-in-time recovery (PITR). When WAL archiving is enabled, every WAL segment is copied to an archive location (S3, GCS, NFS) before it is recycled on the primary. A restore operation replays a base backup followed by archived WAL segments up to any target timestamp. The archive must be complete and contiguous — a single missing segment makes recovery impossible past that gap.
WAL files live in $PGDATA/pg_wal. Each segment is exactly 16 MB by default (configurable at initdb time via --wal-segsize). File names are 24-character hexadecimal strings encoding the timeline ID and log sequence number, for example 000000010000001A000000B3. PostgreSQL recycles (renames) segments rather than deleting and recreating them where possible, but the directory size grows when demand outpaces recycling.
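To make the naming scheme concrete, here is a minimal Python sketch that maps an LSN to the segment file that contains it. It assumes the default 16 MB segment size; the helper names parse_lsn and wal_file_name are illustrative and not part of any PostgreSQL client library. On a live server, pg_walfile_name() gives the authoritative answer.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default --wal-segsize


def parse_lsn(lsn: str) -> int:
    """Convert an LSN in 'X/Y' hexadecimal notation to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_file_name(timeline: int, lsn: str, segment_size: int = WAL_SEGMENT_SIZE) -> str:
    """Return the 24-character name of the segment file containing the given LSN."""
    segment_number = parse_lsn(lsn) // segment_size
    segments_per_log = 0x100000000 // segment_size  # 256 with 16 MB segments
    return "%08X%08X%08X" % (
        timeline,
        segment_number // segments_per_log,
        segment_number % segments_per_log,
    )


# Timeline 1, LSN 1A/B3000000 falls in the example segment named above
print(wal_file_name(1, "1A/B3000000"))
```

The three 8-digit groups are the timeline ID, the high 32 bits of the position, and the segment index within that 4 GB range.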
WAL Retention Parameters
wal_keep_size (PostgreSQL 13 and Later)
Introduced in PostgreSQL 13 as a replacement for wal_keep_segments, wal_keep_size specifies the minimum amount of past WAL that the primary retains for standbys that are not using replication slots. The value is in megabytes by default but accepts GB suffixes.
-- Check current setting
SHOW wal_keep_size;

-- Set in postgresql.conf (no restart required, reload is sufficient)
-- wal_keep_size = 2GB

-- Apply without restart
SELECT pg_reload_conf();

A reasonable starting value for most production setups is 1–2 GB. Setting it too high wastes disk space; setting it too low risks disconnecting a standby that falls slightly behind during a write burst. If you use replication slots (covered below), wal_keep_size is largely redundant because slots retain WAL independently, but it remains a useful backstop for physical standbys configured without slots.
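One way to sanity-check a value is to multiply your WAL generation rate by the longest standby lag you want to tolerate. The sketch below does that arithmetic in Python. The function name, the 1.5x headroom factor, and the two-hour lag window are assumptions chosen for illustration, not recommendations from the PostgreSQL documentation.

```python
import math


def suggest_wal_keep_size_mb(wal_mb_per_hour: float,
                             lag_window_hours: float,
                             headroom: float = 1.5) -> int:
    """Estimate wal_keep_size in MB, rounded up to a whole 16 MB segment."""
    needed_mb = wal_mb_per_hour * lag_window_hours * headroom
    return math.ceil(needed_mb / 16) * 16


# 80 GB of WAL per week (the rate from the incident above) expressed in MB per hour
rate_mb_per_hour = 80 * 1024 / (7 * 24)

# Tolerate a standby that is up to two hours behind (hypothetical window)
print(suggest_wal_keep_size_mb(rate_mb_per_hour, lag_window_hours=2))
```

With those inputs the estimate comes out at roughly 1.4 GB, inside the 1–2 GB starting range. Measure your own WAL rate before relying on a number like this.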
wal_keep_segments (PostgreSQL 12 and Older)
The predecessor to wal_keep_size. It specifies the number of 16 MB WAL segments to retain. On PostgreSQL 12 and below:
-- Each segment is 16 MB; 128 segments = 2 GB
-- In postgresql.conf:
-- wal_keep_segments = 128
SHOW wal_keep_segments;

If you are still running PostgreSQL 12 or earlier, plan your upgrade path. PostgreSQL 12 reached end-of-life in November 2024.
min_wal_size and max_wal_size
These parameters control how PostgreSQL manages the WAL directory size in relation to checkpoints — they do not directly control how long WAL is retained for replication or recovery, but they set the floor and ceiling for WAL recycling behavior.
-- Defaults
SHOW min_wal_size; -- 80MB
SHOW max_wal_size; -- 1GB

-- View current checkpoint statistics
-- (PostgreSQL 16 and older; PostgreSQL 17 moved these counters to pg_stat_checkpointer)
-- The cast to numeric is required because round(double precision, integer) does not exist
SELECT checkpoints_timed, checkpoints_req, buffers_checkpoint,
       round((checkpoint_write_time / 1000.0)::numeric, 2) AS write_secs,
       round((checkpoint_sync_time / 1000.0)::numeric, 2) AS sync_secs
FROM pg_stat_bgwriter;

min_wal_size is the amount of WAL file space that PostgreSQL keeps for recycling rather than deleting after a checkpoint. max_wal_size is a soft limit: when the WAL written since the last checkpoint approaches it, PostgreSQL triggers a checkpoint even if checkpoint_timeout has not elapsed. On write-heavy systems, raise max_wal_size to reduce checkpoint frequency. A common production value is 4–8 GB.
If checkpoints_req in pg_stat_bgwriter is consistently higher than checkpoints_timed, max_wal_size is too low for your write rate. Increasing it reduces I/O pressure and improves throughput at the cost of slightly longer crash recovery time.
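The rule of thumb above can be turned into a small check. This is a sketch in Python that takes the two counters as plain integers; the function names and the 0.5 threshold (requested checkpoints outnumbering timed ones) are illustrative assumptions.

```python
def requested_checkpoint_ratio(checkpoints_timed: int, checkpoints_req: int) -> float:
    """Fraction of all checkpoints that were forced by WAL volume rather than by the timer."""
    total = checkpoints_timed + checkpoints_req
    return checkpoints_req / total if total else 0.0


def max_wal_size_looks_low(checkpoints_timed: int, checkpoints_req: int,
                           threshold: float = 0.5) -> bool:
    """True when requested checkpoints outnumber timed ones."""
    return requested_checkpoint_ratio(checkpoints_timed, checkpoints_req) > threshold


# Hypothetical counter values read from the statistics view
print(max_wal_size_looks_low(checkpoints_timed=120, checkpoints_req=480))
```

Because the counters are cumulative, compare deltas between two samples rather than lifetime totals if the workload has changed since the last statistics reset.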
Replication Slots and WAL Accumulation
Replication slots are the most operationally dangerous WAL retention mechanism because, by default, they have no upper bound. A slot retains every WAL segment generated since the slot's restart_lsn, regardless of wal_keep_size or max_wal_size; only max_slot_wal_keep_size (covered below) caps it. If the consumer of a slot stops reading, the WAL directory grows until the disk fills.
-- List all replication slots and their WAL retention footprint
SELECT
    slot_name,
    slot_type,
    active,
    active_pid,
    restart_lsn,
    confirmed_flush_lsn,
    pg_size_pretty(
        pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
    ) AS retained_wal_size
FROM pg_replication_slots
-- Sort on the numeric byte count; sorting on the pg_size_pretty() text would order '9 GB' above '10 GB'
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC NULLS LAST;

The active column tells you whether a consumer process is currently attached. An inactive slot (active = false) that has a large retained_wal_size is a disk bomb waiting to go off. The confirmed_flush_lsn column is specific to logical replication slots: it marks the LSN up to which the consumer has acknowledged receipt. Physical slots use restart_lsn only.
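The retained size is simply the byte distance between two LSNs. The Python sketch below reproduces what pg_wal_lsn_diff() computes and ranks slots by that number rather than by a formatted string. The slot names and LSN values are invented for illustration.

```python
def parse_lsn(lsn: str) -> int:
    """Convert an LSN in 'X/Y' hexadecimal notation to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lsn_diff(newer: str, older: str) -> int:
    """Byte distance between two LSNs, mirroring pg_wal_lsn_diff(newer, older)."""
    return parse_lsn(newer) - parse_lsn(older)


def rank_slots_by_retained_wal(current_lsn: str, slots: dict) -> list:
    """Return (slot_name, retained_bytes) pairs, largest first."""
    retained = {name: lsn_diff(current_lsn, restart) for name, restart in slots.items()}
    return sorted(retained.items(), key=lambda item: item[1], reverse=True)


# Hypothetical slots and positions
slots = {"standby_b": "31/F0000000", "debezium_orders": "2F/10000000"}
for name, retained_bytes in rank_slots_by_retained_wal("32/00000000", slots):
    print(name, retained_bytes)
```

Sorting on the raw byte count matters: formatted sizes such as "9 GB" and "10 GB" do not sort correctly as text.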
Never leave an inactive replication slot on a production primary without a monitoring alert. A slot created for a logical replication subscriber that was decommissioned, a pglogical or Debezium connector that lost its connection, or a pg_upgrade slot that was never cleaned up will accumulate WAL indefinitely. Set max_slot_wal_keep_size (PostgreSQL 13+) as a safety valve:
-- In postgresql.conf: limit WAL retained per slot to 10 GB
-- max_slot_wal_keep_size = 10GB

-- Check current setting (the default, -1, means unlimited)
SHOW max_slot_wal_keep_size;

When a slot exceeds this limit, PostgreSQL invalidates the slot at the next checkpoint rather than letting the disk fill. The consumer must then be re-initialized (a fresh base backup for a physical standby, a new initial sync for a logical subscriber), but the primary remains healthy.
WAL Archiving: archive_mode, archive_command, and archive_cleanup_command
WAL archiving runs in parallel with replication and is the foundation of PITR. When archive_mode = on, PostgreSQL calls archive_command after each WAL segment is closed. The segment is retained in pg_wal until the archive command exits with status 0, confirming the copy succeeded.
# In postgresql.conf
archive_mode = on
archive_command = 'aws s3 cp %p s3://my-wal-archive/%f'
# %p = path of the WAL segment to archive (relative to the data directory)
# %f = file name only (used as the S3 object key)

The archive command must be idempotent: PostgreSQL may call it again for the same segment after a crash or a failed attempt. A common pattern is to check for existence before copying and to report success only if the existing copy is identical; the PostgreSQL documentation advises against overwriting a pre-existing archive file with different contents. On S3, re-running aws s3 cp for the same segment re-uploads identical bytes, so it is safe to call repeatedly.
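For a filesystem archive, the check-before-copy pattern might look like the following Python sketch. It is a simplified illustration, not a production archiver: the function name is hypothetical, it does not fsync the copied file or its directory, and tools such as WAL-G and pgBackRest handle these details for you.

```python
import filecmp
import os
import shutil
import sys


def archive_segment(source_path: str, archive_dir: str) -> int:
    """Copy one WAL segment into archive_dir and return a shell-style exit status."""
    target = os.path.join(archive_dir, os.path.basename(source_path))
    if os.path.exists(target):
        # Already archived: succeed only when the existing copy is byte-identical
        return 0 if filecmp.cmp(source_path, target, shallow=False) else 1
    # Copy under a temporary name, then rename, so readers never see a partial segment
    temp_target = target + ".tmp"
    shutil.copyfile(source_path, temp_target)
    os.replace(temp_target, target)
    return 0


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: script.py <path from %p> <archive directory>
    sys.exit(archive_segment(sys.argv[1], sys.argv[2]))
```

Saved under a path of your choosing, it could be wired in as archive_command = 'python3 /path/to/script.py %p /mnt/wal-archive'. A nonzero exit status tells PostgreSQL to keep the segment in pg_wal and retry later.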
For continuous archiving with WAL-G or pgBackRest (the two most widely used tools in production):
# WAL-G example
archive_command = 'wal-g wal-push %p'
restore_command = 'wal-g wal-fetch %f %p'

# pgBackRest example
archive_command = 'pgbackrest --stanza=main archive-push %p'
restore_command = 'pgbackrest --stanza=main archive-get %f %p'

After a base backup is completed and old PITR windows have expired, archived WAL segments older than the oldest needed recovery point can be removed. pg_archivecleanup is the built-in tool for this:
# Remove archived WAL segments older than a specific segment
# Typically called from archive_cleanup_command in recovery configuration
pg_archivecleanup /mnt/wal-archive 000000010000001A000000C0

# On the standby, in postgresql.conf (PostgreSQL 12+; recovery.conf on older versions):
archive_cleanup_command = 'pg_archivecleanup /mnt/wal-archive %r'
# %r = the file containing the last valid restart point, i.e. the oldest WAL segment still needed for recovery

On standbys running in hot standby mode, archive_cleanup_command runs at each restartpoint, removing segments older than what the standby still needs for crash recovery.
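The selection rule is easy to preview before running the real tool. The Python sketch below approximates it under stated assumptions: it considers only plain 24-character segment names, and it compares names from the ninth character onward so that the timeline prefix is ignored, which is how I understand pg_archivecleanup to behave. The real tool also handles .partial files and optional extensions, so use its -n dry-run flag for the authoritative list.

```python
import re

SEGMENT_NAME = re.compile(r"^[0-9A-F]{24}$")


def removable_segments(archive_files, oldest_kept):
    """Return archived segment names that sort before oldest_kept, ignoring the timeline."""
    cutoff = oldest_kept[8:]
    return sorted(
        name for name in archive_files
        if SEGMENT_NAME.match(name) and name[8:] < cutoff
    )


# Hypothetical archive listing
listing = [
    "000000010000001A000000BE",
    "000000010000001A000000BF",
    "000000010000001A000000C0",
    "000000010000001A000000C1",
    "00000002.history",
]
print(removable_segments(listing, "000000010000001A000000C0"))
```

The segment named as the cutoff is kept, along with everything newer and any file that is not a plain segment.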
Inspecting the pg_wal Directory
Direct filesystem inspection of $PGDATA/pg_wal requires OS-level access. For in-database monitoring, use pg_ls_waldir(), available from PostgreSQL 10 onward (requires superuser or pg_monitor role):
-- Total WAL size and segment count
SELECT
    count(*) AS segment_count,
    pg_size_pretty(sum(size)) AS total_wal_size,
    min(modification) AS oldest_segment_modified,
    max(modification) AS newest_segment_modified
FROM pg_ls_waldir();

-- List the 10 oldest segments (useful during disk exhaustion triage)
SELECT name, size, modification
FROM pg_ls_waldir()
ORDER BY modification ASC
LIMIT 10;

-- Count only regular WAL segments (excludes .history, .backup, and .partial files)
SELECT count(*), pg_size_pretty(sum(size))
FROM pg_ls_waldir()
WHERE name ~ '^[0-9A-F]{24}$';

From the shell, if you have filesystem access:
# Count and size
ls -l "$PGDATA/pg_wal" | grep -c '^-'
du -sh "$PGDATA/pg_wal"

# Watch WAL growth in real time (useful during incident triage)
watch -n 5 'du -sh "$PGDATA/pg_wal" && ls "$PGDATA/pg_wal" | wc -l'

Monitoring WAL: System Views and Position Tracking
A complete WAL monitoring setup covers three layers: current WAL position, replication lag, and slot health.
Current WAL Position
-- Current write-ahead log position on the primary
SELECT pg_current_wal_lsn();

-- Convert an LSN to a WAL filename and byte offset within that file
SELECT * FROM pg_walfile_name_offset(pg_current_wal_lsn());
-- Returns: file_name (24-char hex), file_offset (bytes into the segment)

-- Compute the distance between two LSN values in bytes
SELECT pg_wal_lsn_diff('0/3A000000', '0/38000000');
-- Returns: 33554432 (32 MB = 2 WAL segments)

Replication Lag
-- Full replication status including write, flush, and apply lag
SELECT
    application_name,
    state,
    sent_lsn,
    write_lsn,
    flush_lsn,
    replay_lsn,
    pg_size_pretty(pg_wal_lsn_diff(sent_lsn, replay_lsn)) AS total_lag_bytes,
    write_lag,
    flush_lag,
    replay_lag,
    sync_state
FROM pg_stat_replication
-- Sort on the numeric byte count rather than the formatted text
ORDER BY pg_wal_lsn_diff(sent_lsn, replay_lsn) DESC NULLS LAST;

The write_lag, flush_lag, and replay_lag interval columns (available from PostgreSQL 10) break down exactly where the standby is losing ground. A large replay_lag with a small write_lag means WAL is arriving but not being applied, most often because a long-running query on the standby conflicts with replayed changes (vacuum cleanup records or an exclusive lock) and replay waits until max_standby_streaming_delay expires.
Slot Health
-- Full slot health check: everything you need for an alert query
SELECT
    slot_name,
    slot_type,
    database,
    active,
    active_pid,
    xmin,
    catalog_xmin,
    restart_lsn,
    confirmed_flush_lsn,
    wal_status,    -- PostgreSQL 13+: reserved, extended, unreserved, lost
    safe_wal_size, -- PostgreSQL 13+: bytes of WAL remaining before slot is at risk
    two_phase      -- PostgreSQL 14+
FROM pg_replication_slots
ORDER BY (safe_wal_size IS NULL) DESC, safe_wal_size ASC;

The wal_status column is critical for alerting. reserved is the healthy state. extended means the slot retains more than max_wal_size, but its WAL is still being kept. unreserved means the slot has exceeded max_slot_wal_keep_size and its oldest WAL will be removed at the next checkpoint unless the consumer catches up first. lost means required WAL has already been removed, the slot is unusable, and the subscriber must be rebuilt. Alert on unreserved, and on a shrinking safe_wal_size, so you can intervene before the slot is lost.

Add this query to your monitoring system as a scheduled check. Fire a critical alert when any slot has active = false and retains more than 5 GB of WAL (the retained_wal_size value computed in the replication slots query earlier), or when wal_status is unreserved or lost. This catches orphaned slots before they cause an outage.
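The alert rule can be expressed as a small function that your monitoring job applies to each row. This Python sketch assumes you have already fetched active, the retained byte count, and wal_status for a slot; the function name and reason strings are illustrative.

```python
FIVE_GB = 5 * 1024 ** 3  # threshold from the alert rule above


def slot_alert_reasons(active, retained_bytes, wal_status, limit_bytes=FIVE_GB):
    """Return the reasons this slot should raise a critical alert (empty list if healthy)."""
    reasons = []
    # retained_bytes is None when the slot has no restart_lsn yet
    if not active and retained_bytes is not None and retained_bytes > limit_bytes:
        reasons.append("inactive slot retaining more than the limit")
    # wal_status is None on servers older than PostgreSQL 13
    if wal_status in ("unreserved", "lost"):
        reasons.append("wal_status is " + wal_status)
    return reasons


# Hypothetical row: an abandoned slot holding 12 GB
print(slot_alert_reasons(active=False, retained_bytes=12 * 1024 ** 3, wal_status="extended"))
```

Returning reasons rather than a boolean keeps the page message self-explanatory for whoever is on call.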
Safe WAL Cleanup: Dropping Slots and Archiving Hygiene
Dropping an Orphaned Replication Slot
Before dropping a slot, confirm the subscriber is genuinely decommissioned. On a logical replication slot, check whether any subscription on the subscriber side still references it:
-- On the subscriber: list active subscriptions
SELECT subname, subconninfo, subenabled, subslotname
FROM pg_subscription;

-- On the primary: drop a slot that has no active consumer
-- This releases the slot's hold on all WAL it was retaining
SELECT pg_drop_replication_slot('slot_name_here');

-- Verify it is gone
SELECT slot_name FROM pg_replication_slots WHERE slot_name = 'slot_name_here';

The WAL retained by the slot becomes eligible for removal as soon as the slot is dropped, but nothing is reclaimed until the next checkpoint. At that checkpoint PostgreSQL recycles (renames) the segments it expects to reuse and deletes the rest, so pg_wal shrinks back toward its normal size, provided no other slot, wal_keep_size setting, or archiving backlog still needs those segments. During an incident you can issue CHECKPOINT manually rather than waiting for the next scheduled one.
Archiving Cleanup with pg_archivecleanup
# Dry-run to see which segments would be removed
pg_archivecleanup -n /mnt/wal-archive 000000010000001A000000C0

# Execute the cleanup
pg_archivecleanup /mnt/wal-archive 000000010000001A000000C0

# With WAL-G, enforce retention by backup count: keep the 14 most recent
# backups (this is a count, not a number of days) and the WAL they need
wal-g delete retain FIND_FULL 14 --confirm

# With pgBackRest (applies the retention settings configured for the stanza)
pgbackrest --stanza=main expire

Disk Alert Thresholds and WAL Exhaustion Runbook
WAL disk exhaustion is one of the fastest-moving incidents in PostgreSQL operations. Once the volume fills, the primary stops accepting writes within seconds, and a self-managed server will typically PANIC and stay down until space is freed. The following thresholds and runbook steps should be part of every production DBA's playbook.
Alert Thresholds
- Warning at 60% of WAL volume: Investigate which slots or archiving backlog are driving growth. No immediate action required but schedule remediation.
- Critical at 75% of WAL volume: Take immediate action. Drop orphaned slots, flush archiving backlog, or expand the volume.
- Emergency at 90%: The instance may go read-only within minutes. Execute the runbook immediately.
Do not share the WAL volume (pg_wal) with the main data directory (base/) on production systems. WAL exhaustion on a shared volume takes down the entire instance, not just WAL writes, and leaves no room for data files, temporary files, or logs during recovery. Isolate pg_wal on its own volume, either by passing --waldir to initdb or by making pg_wal a symbolic link to a separate mount point.
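The thresholds above map directly onto a classifier that a monitoring job can call with the volume's used and total bytes. This is a minimal Python sketch; the function name and the returned labels are illustrative.

```python
def classify_wal_volume(used_bytes: int, capacity_bytes: int) -> str:
    """Map WAL volume usage to the alert levels defined above."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    percent_used = 100.0 * used_bytes / capacity_bytes
    if percent_used >= 90:
        return "emergency"
    if percent_used >= 75:
        return "critical"
    if percent_used >= 60:
        return "warning"
    return "ok"


# Hypothetical 500 GB volume with 410 GB used
print(classify_wal_volume(410 * 1024 ** 3, 500 * 1024 ** 3))
```

On Linux the inputs could come from shutil.disk_usage() on the pg_wal mount point, which is only meaningful once pg_wal lives on its own volume.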
WAL Disk Exhaustion Runbook
- Identify the cause immediately. Run the slot health query from the monitoring section. Check pg_ls_waldir() for total size and oldest segment age. If archiving is enabled, check whether archive_command is failing (pg_stat_archiver).

-- Step 1: Quick triage
SELECT last_failed_wal, last_failed_time, last_archived_wal, last_archived_time
FROM pg_stat_archiver;

SELECT slot_name, active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC NULLS LAST;

- Drop confirmed orphaned slots. If a slot is inactive and the subscriber does not exist, drop it with pg_drop_replication_slot(). If unsure, contact the owning team before dropping.
- Fix the archive backlog. If pg_stat_archiver shows archive failures, diagnose and fix the archive_command (permissions, connectivity, quota). PostgreSQL will automatically retry failed segments once the command works again.
- Temporarily reduce wal_keep_size. If the disk is near-full and archiving is healthy, temporarily lower wal_keep_size to 0 (requires reload), which allows PostgreSQL to recycle WAL more aggressively past the current standby positions. Standbys without slots that fall behind will then have to catch up from the archive or be rebuilt.
- Expand the volume if necessary. On cloud instances, a live volume resize (AWS EBS, GCP persistent disk) typically does not require an instance restart and can buy immediate breathing room while root cause is addressed.
- Post-incident: add the monitoring queries above to your observability stack. WAL exhaustion incidents are almost always preventable with 24-hour advance notice from a proper alert.
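Advance notice comes from projecting growth, not just from static thresholds. The Python sketch below estimates hours until the WAL volume fills from two usage samples. It assumes linear growth, which real workloads only approximate, and the function name is illustrative.

```python
def hours_until_full(used_gb_then: float, used_gb_now: float,
                     hours_between: float, capacity_gb: float):
    """Linear projection of hours until the volume is full, or None if usage is not growing."""
    growth_gb_per_hour = (used_gb_now - used_gb_then) / hours_between
    if growth_gb_per_hour <= 0:
        return None
    return (capacity_gb - used_gb_now) / growth_gb_per_hour


# Growth of 80 GB over one week (the rate from the incident above) on a
# hypothetical 500 GB volume that is currently at 480 GB
remaining = hours_until_full(400, 480, hours_between=7 * 24, capacity_gb=500)
print(round(remaining, 1))
```

In this example the projection is about 42 hours, so an alert that fires when the estimate drops below 48 hours would have paged someone well before the disk filled.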
- WAL files serve three distinct purposes (crash recovery, streaming replication, and PITR archiving), and PostgreSQL will not recycle a segment until all three consumers have confirmed they no longer need it.
- wal_keep_size (PostgreSQL 13+) is the primary parameter for controlling how much past WAL the primary retains for standbys that are not using slots. A value of 1–2 GB covers most standby lag windows without excessive disk use.
- Replication slots override wal_keep_size and max_wal_size. An inactive slot will accumulate WAL indefinitely. Set max_slot_wal_keep_size as a safety valve and monitor pg_replication_slots.wal_status for unreserved or lost states.
- pg_ls_waldir(), pg_stat_replication, and pg_replication_slots give you complete visibility into WAL accumulation without needing OS filesystem access.
- pg_walfile_name_offset() and pg_current_wal_lsn() let you track the exact WAL position and convert LSNs to segment filenames for archiving triage.
- Isolate the WAL volume from the data volume in production. Alert at 60% capacity, act at 75%, and have a written runbook ready for 90%.
- After an orphaned slot is dropped or an archiving backlog is cleared, WAL is reclaimed at the next checkpoint, not instantly: PostgreSQL recycles the segments it expects to reuse and deletes the rest.
Managing WAL retention correctly is one of the highest-leverage operational tasks for a PostgreSQL DBA — get it wrong and you are recovering from a 3 AM outage; get it right and your standbys stay in sync, your PITR window is always intact, and your disk graphs are flat. If you want expert help designing WAL retention policies, setting up monitoring, or migrating to a managed PostgreSQL environment where WAL management is handled for you, JusDB works with engineering teams on exactly these problems — from slot hygiene audits to full WAL archiving pipeline buildouts.