PostgreSQL VACUUM Tuning: A Comprehensive Guide
PostgreSQL VACUUM Tuning: A Comprehensive Guide
By JusDB — Database SRE & Consulting
Understanding VACUUM
VACUUM is PostgreSQL’s housekeeping mechanism that reclaims space from dead tuples and keeps visibility information up to date. In PostgreSQL’s MVCC design, updates/deletes don’t immediately remove old row versions; they’re left behind until VACUUM cleans them up. Without regular vacuuming you’ll see table and index bloat, slower queries, and—most critically—risk transaction ID wraparound.
Why VACUUM is Critical
- Storage Management: Frees space occupied by dead tuples and mitigates table/index bloat.
- Performance: Smaller relations = fewer I/O operations and better cache behavior.
- Transaction ID Safety: Regular freezing prevents dangerous wraparound events.
- Planner Accuracy: Paired with ANALYZE, it keeps statistics fresh for good plans.
Types of VACUUM Operations
Manual VACUUM
-- Basic VACUUM on a table
VACUUM table_name;
-- VACUUM with ANALYZE to refresh planner stats
VACUUM ANALYZE table_name;
-- VACUUM all tables in the current DB
VACUUM;
VACUUM FULL
-- Reclaims maximum space by rewriting the table; requires exclusive lock
VACUUM FULL table_name;
VACUUM FULL
rewrites the table, requires ACCESS EXCLUSIVE lock, and uses extra disk space during the rewrite. Reserve for maintenance windows or use pg_repack
to minimize blocking.Autovacuum (Recommended)
The autovacuum daemon automatically decides when to VACUUM and ANALYZE based on table activity. For most production systems, tuning autovacuum is the safest, lowest-friction strategy.
Key Configuration Parameters
Global Autovacuum Settings
-- Enable/disable autovacuum launcher (default: on)
autovacuum = on
-- Max concurrent autovacuum workers (default: 3)
autovacuum_max_workers = 3
-- Sleep time between autovacuum cycles (default: 60s)
autovacuum_naptime = '60s'
autovacuum_work_mem
.Thresholds & Scale Factors
-- Trigger when (threshold + scale_factor * reltuples) dead tuples appear
autovacuum_vacuum_threshold = 50
autovacuum_vacuum_scale_factor = 0.2
-- ANALYZE triggers
autovacuum_analyze_threshold = 50
autovacuum_analyze_scale_factor = 0.1
-- Since PG13: insert-triggered autovacuum thresholds (for insert-heavy tables)
autovacuum_vacuum_insert_threshold = 1000
autovacuum_vacuum_insert_scale_factor = 0.2
Performance & Cost-Based Throttling
-- Maintenance memory (default ~64MB); safe to raise for faster vacuum
maintenance_work_mem = '256MB'
-- Per autovacuum worker memory (default -1 = use maintenance_work_mem)
autovacuum_work_mem = '-1'
-- Cost-based vacuum (manual vacuum): default delay 0ms, limit 200
vacuum_cost_delay = '0ms'
vacuum_cost_limit = 200
-- Cost-based vacuum (autovacuum): default delay 2ms (PG12+), limit -1 (=vacuum_cost_limit)
autovacuum_vacuum_cost_delay = '2ms'
autovacuum_vacuum_cost_limit = -1
autovacuum_vacuum_cost_delay
default is 2ms (it used to be 20ms in older versions). Lower delay = faster autovacuum but more I/O pressure. Tune with care.Monitoring VACUUM Performance
Per-Table Activity
SELECT
schemaname, tablename,
n_tup_ins, n_tup_upd, n_tup_del,
n_dead_tup,
last_vacuum, last_autovacuum,
vacuum_count, autovacuum_count
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
Live Progress of Running VACUUMs
SELECT
pid,
datname,
relid::regclass AS table_name,
phase,
heap_blks_total,
heap_blks_scanned,
heap_blks_vacuumed,
index_vacuum_count,
max_dead_tuples,
num_dead_tuples
FROM pg_stat_progress_vacuum;
Quick-and-Dirty Bloat Signal
SELECT
schemaname,
tablename,
n_live_tup,
n_dead_tup,
ROUND((n_dead_tup::float / NULLIF(n_live_tup + n_dead_tup, 0)) * 100, 2) AS dead_tuple_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 0
ORDER BY dead_tuple_pct DESC;
Tuning Strategies
High-Write Workloads
autovacuum_naptime = '30s'
autovacuum_vacuum_scale_factor = 0.10
autovacuum_analyze_scale_factor = 0.05
autovacuum_max_workers = 6
maintenance_work_mem = '1GB' -- be mindful of total memory with many workers
Large Tables (>10GB)
ALTER TABLE large_table SET (
autovacuum_vacuum_scale_factor = 0.05,
autovacuum_analyze_scale_factor = 0.02,
autovacuum_vacuum_cost_delay = '10ms'
);
Insert-Heavy (Append-Only) Tables
ALTER TABLE events SET (
autovacuum_vacuum_insert_threshold = 1000,
autovacuum_vacuum_insert_scale_factor = 0.02, -- smaller to keep visibility map fresh
autovacuum_analyze_scale_factor = 0.05
);
The goal is to keep pages “all-visible” to enable index-only scans while controlling overhead.
Read-Heavy Systems
autovacuum_naptime = '300s'
autovacuum_vacuum_scale_factor = 0.40
autovacuum_analyze_scale_factor = 0.20
Memory-Constrained Hosts
maintenance_work_mem = '64MB'
autovacuum_work_mem = '128MB'
autovacuum_max_workers = 2
Common Issues & Solutions
1) Autovacuum Takes Too Long / Can’t Keep Up
-- Give VACUUM more memory (cluster-wide)
ALTER SYSTEM SET maintenance_work_mem = '1GB';
SELECT pg_reload_conf();
-- Make autovacuum pause less per cycle (cluster-wide)
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '2ms'; -- or 0–2ms if you can handle I/O
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000; -- higher budget
-- Per-table overrides for hotspots
ALTER TABLE problematic_table SET (autovacuum_vacuum_cost_delay = '0ms');
VACUUM
is running—useful during long maintenance jobs.2) Small Tables Vacuumed Too Often
ALTER TABLE small_table SET (
autovacuum_vacuum_threshold = 1000,
autovacuum_analyze_threshold = 500
);
3) Transaction ID Wraparound Warnings
autovacuum_freeze_max_age
excessively. Keep defaults unless you fully understand the trade-offs. Instead:-- Check DB age
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
-- Emergency action (on affected tables)
VACUUM (FREEZE, VERBOSE) table_name; -- or VACUUM FREEZE; cautiously
-- Cluster-wide maintenance window
VACUUM (FREEZE);
Autovacuum will still run wraparound protection vacuums even if autovacuum is otherwise disabled—don’t rely on that as a strategy; tune and monitor proactively.
4) Index Bloat Causing Slow Plans
-- Find low-usage or suspicious indexes
SELECT schemaname, relname AS tablename, indexrelname AS indexname,
idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
-- Rebuild
REINDEX INDEX index_name; -- or: REINDEX TABLE table_name;
If you need to reclaim table space with minimal blocking, prefer pg_repack over VACUUM FULL
.
Configuration Management
-- Persist configuration safely
ALTER SYSTEM SET autovacuum_naptime = '30s';
SELECT pg_reload_conf();
ALTER SYSTEM writes to postgresql.auto.conf
; many parameters apply on reload, some need restart. Avoid manual edits to postgresql.auto.conf
.
Per-Table Customization
-- High-churn OLTP table
ALTER TABLE transactions SET (
autovacuum_vacuum_scale_factor = 0.05,
autovacuum_analyze_scale_factor = 0.02,
autovacuum_naptime = '15s'
);
-- Append-only log table (only if you schedule manual VACUUM/ANALYZE to prevent wraparound)
ALTER TABLE logs SET (autovacuum_enabled = off);
Maintenance Windows
# Example nightly job
vacuumdb -d mydb --analyze-in-stages --all --jobs=4
Monitoring & Alerting
CREATE OR REPLACE VIEW jusdb_vacuum_monitor AS
SELECT
schemaname,
tablename,
n_live_tup,
n_dead_tup,
CASE WHEN n_live_tup + n_dead_tup > 0
THEN ROUND(100.0 * n_dead_tup / (n_live_tup + n_dead_tup), 2)
ELSE 0 END AS dead_tuple_pct,
last_autovacuum,
autovacuum_count
FROM pg_stat_user_tables
WHERE n_dead_tup > 100
ORDER BY dead_tuple_pct DESC;
Performance Testing
- Test parameter changes in staging with production-like write patterns.
- Measure impact on both vacuum duration and end-user query latencies.
Emergency Playbook (Wraparound)
- Identify oldest databases/tables by
age(datfrozenxid)
. - Run
VACUUM (FREEZE, VERBOSE)
on the hottest offenders. - Reduce application write pressure temporarily.
- Post-mortem: tune thresholds/scale factors, especially on big/update-heavy tables.