Your PostgreSQL database has been running smoothly for months, but disk usage keeps climbing and queries are getting slower — even though you're running VACUUM religiously. The culprit is almost certainly table bloat: dead tuples and fragmented pages that VACUUM marks as reusable but never actually returns to the OS. Left unchecked, bloat inflates table size by 2–5x, degrades index efficiency, and eventually forces an emergency VACUUM FULL that locks your tables for hours. This post walks through exactly how to measure bloat, why standard VACUUM falls short, and how to use pg_repack to reclaim disk space with zero downtime.
- Table bloat is dead tuple space that VACUUM marks free but doesn't return to the OS.
- Use
pgstattupleto measure real bloat percentages per table and index. VACUUM FULLreclaims space but holds anACCESS EXCLUSIVElock — no reads or writes during the operation.pg_repackrebuilds tables and indexes online with only brief lock windows at the very start and end.- Know the limitations:
pg_repackdoes not support tables with exclusion constraints.
What Is Table Bloat?
PostgreSQL uses MVCC (Multi-Version Concurrency Control) to handle concurrent reads and writes without locking. Every UPDATE writes a new row version and marks the old one dead; every DELETE marks the existing row dead without immediately removing it. This is intentional — long-running transactions may still need to see those old versions.
VACUUM's job is to mark dead tuples as reusable so future INSERTs can occupy that space. The critical word is reusable: the space stays allocated to the table's data file on disk. Only VACUUM FULL physically rewrites the table file and returns unused pages to the operating system. When a table sees frequent updates and deletes, dead space accumulates faster than new inserts can reuse it — this is bloat.
Index bloat follows the same pattern. Every dead heap tuple still has a corresponding index entry until VACUUM cleans it, and index pages themselves fragment over time as entries are inserted and deleted across the B-tree structure. A bloated index is slower to scan and consumes more RAM in shared_buffers.
Autovacuum alone is not a cure for heavily written tables. If your write throughput outpaces autovacuum workers, bloat will grow continuously. Tune autovacuum_vacuum_cost_delay and autovacuum_vacuum_scale_factor aggressively for high-churn tables before relying on any reclaim strategy.
Measuring Bloat with pgstattuple
Before you reclaim anything, measure. PostgreSQL ships with the pgstattuple extension, which inspects page-level tuple visibility to give you accurate bloat numbers — far more reliable than the commonly cited approximation queries built on pg_class statistics.
Enable the extension once per database:
CREATE EXTENSION IF NOT EXISTS pgstattuple;Then query a specific table:
SELECT
table_len,
tuple_count,
tuple_len,
dead_tuple_count,
dead_tuple_len,
round((dead_tuple_len::numeric / NULLIF(table_len, 0)) * 100, 2) AS dead_pct,
free_space,
round((free_space::numeric / NULLIF(table_len, 0)) * 100, 2) AS free_pct
FROM pgstattuple('public.orders');A table where dead_pct + free_pct exceeds 20–30% is a strong candidate for reclamation. To scan all user tables in one shot (be aware this performs a full sequential scan of every table and will be slow on large databases):
SELECT
schemaname,
tablename,
(pgstattuple(schemaname || '.' || tablename)).dead_tuple_len AS dead_bytes,
(pgstattuple(schemaname || '.' || tablename)).table_len AS total_bytes,
round(
(pgstattuple(schemaname || '.' || tablename)).dead_tuple_len::numeric
/ NULLIF((pgstattuple(schemaname || '.' || tablename)).table_len, 0) * 100,
2
) AS dead_pct
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY dead_bytes DESC
LIMIT 20;For index bloat, use pgstatindex:
SELECT
index_size,
leaf_fragmentation,
avg_leaf_density
FROM pgstatindex('orders_customer_id_idx');A leaf_fragmentation above 30% or an avg_leaf_density below 60% indicates an index worth rebuilding.
Save bloat measurements to a table before and after any reclaim operation. A simple INSERT INTO bloat_log SELECT now(), ... FROM pgstattuple(...) gives you a before/after comparison you can share with your team and reference during the next review cycle.
Why VACUUM FULL Isn't Enough
VACUUM FULL does exactly what you want from a space-reclamation standpoint: it rewrites the entire table into a new heap file, drops the old one, and rebuilds all indexes from scratch. Free space goes back to the OS. The problem is the lock it requires.
VACUUM FULL acquires an ACCESS EXCLUSIVE lock on the table for its entire duration. Every read, write, and even SELECT queues behind it. On a 50 GB table this can mean two to four hours of total unavailability — unacceptable for any production workload with SLA requirements.
CLUSTER has the same lock behavior with the added benefit of physically reordering rows by an index, which can dramatically improve sequential scan performance. But the lock story is identical: ACCESS EXCLUSIVE for the full duration.
Comparing the three approaches:
| Method | Lock Held | Returns Space to OS | Rebuilds Indexes | Downtime Required |
|---|---|---|---|---|
| VACUUM | ShareUpdateExclusive | No | No | None |
| VACUUM FULL | AccessExclusive (full run) | Yes | Yes | Full duration |
| CLUSTER | AccessExclusive (full run) | Yes | Yes | Full duration |
| pg_repack | AccessExclusive (seconds) | Yes | Yes | None (brief lock at end) |
pg_repack: Online Space Reclamation
How pg_repack Works
pg_repack uses a combination of triggers, a shadow table, and logical replication to rebuild your table while it remains fully online. The rough sequence is:
- Acquires a brief
ACCESS SHARElock to install a trigger that logs changes to a queue table. - Copies the live table into a new shadow table in a dedicated schema.
- Builds all indexes on the shadow table concurrently.
- Applies queued changes (inserts, updates, deletes that happened during the copy) to the shadow table.
- Acquires an
ACCESS EXCLUSIVElock — typically for only a few seconds — swaps the shadow table into place, and drops the original.
The result is a compacted, defragmented table and indexes, with the lock window measured in seconds rather than hours.
Installation
On Debian/Ubuntu:
sudo apt-get install postgresql-17-repackOn RHEL/Rocky Linux via PGDG:
sudo dnf install pg_repack_17Then install the extension in each target database:
CREATE EXTENSION pg_repack;Verify the version:
pg_repack --versionThe pg_repack client binary version must match the installed extension version exactly. A mismatch produces a cryptic error at runtime. Always install both from the same package source.
Repacking a Single Table
The most common operation: reclaim bloat from one heavily-written table.
pg_repack \
--host=localhost \
--port=5432 \
--username=postgres \
--dbname=myapp \
--table=public.orders \
--echoThe --echo flag prints each SQL statement as it executes, useful for confirming progress in a terminal session.
Protecting Long-Running Transactions with --no-kill-backend
By default, pg_repack will terminate backends that hold conflicting locks when it needs the final ACCESS EXCLUSIVE lock for the swap. In most production environments you want to avoid killing user sessions. Use --no-kill-backend combined with --wait-timeout instead:
pg_repack \
--dbname=myapp \
--table=public.orders \
--no-kill-backend \
--wait-timeout=60 \
--echoWith this configuration, pg_repack will wait up to 60 seconds for conflicting locks to clear. If they don't, it aborts without touching the table — safe to retry later rather than disrupting active sessions.
Index-Only Repack
If heap bloat is acceptable but your indexes are severely fragmented, you can rebuild indexes alone. This is faster and carries a smaller lock window than a full table repack:
pg_repack \
--dbname=myapp \
--table=public.orders \
--only-indexes \
--echoTo target a single specific index:
pg_repack \
--dbname=myapp \
--index=public.orders_customer_id_idx \
--echoRepacking an Entire Database
To process all eligible tables in a database at once, omit the --table flag. Add --jobs to parallelize across multiple tables simultaneously (use with care on busy systems):
pg_repack \
--dbname=myapp \
--jobs=2 \
--no-kill-backend \
--wait-timeout=30 \
--echoMonitoring Progress
During a long repack, query the repack schema tables in a separate session to track progress:
-- See active repack operations and their shadow tables
SELECT schemaname, tablename, n_live_tup
FROM pg_stat_user_tables
WHERE schemaname = 'repack';
-- Watch the queue drain (rows = pending changes to apply)
SELECT count(*) FROM repack.log_; Alternatively, watch pg_stat_activity for the repack backend:
SELECT pid, state, wait_event_type, wait_event, query_start, query
FROM pg_stat_activity
WHERE application_name ILIKE '%repack%';Comparing Bloat Before and After
Capture a snapshot before starting:
CREATE TABLE IF NOT EXISTS bloat_snapshots (
captured_at timestamptz DEFAULT now(),
table_name text,
total_bytes bigint,
dead_bytes bigint,
dead_pct numeric
);
INSERT INTO bloat_snapshots (table_name, total_bytes, dead_bytes, dead_pct)
SELECT
'public.orders',
table_len,
dead_tuple_len,
round(dead_tuple_len::numeric / NULLIF(table_len, 0) * 100, 2)
FROM pgstattuple('public.orders');Run the repack, then insert another row with the updated stats. Compare:
SELECT
captured_at,
table_name,
pg_size_pretty(total_bytes) AS total_size,
pg_size_pretty(dead_bytes) AS dead_size,
dead_pct
FROM bloat_snapshots
ORDER BY captured_at;A successful repack on a heavily bloated table typically reduces total_bytes by 30–60% and drops dead_pct to near zero.
Best Practices
- Schedule during low-traffic periods. pg_repack is online, but the copy phase does generate significant I/O. The final lock swap happens faster when fewer transactions are open.
- Ensure enough disk space. pg_repack creates a full copy of the table while the original still exists. You need at least 1.5x the current table size free during the operation.
- Test on a replica first. Run
pgstattupleon a read replica to estimate bloat without impacting the primary. Validate your repack commands against a staging database before running on production. - Tune autovacuum after reclaiming. Once bloat is cleared, adjust
autovacuum_vacuum_scale_factordownward for high-churn tables so bloat doesn't return at the same pace. - Monitor replication lag. The shadow table copy generates WAL. On streaming replication setups, watch replica lag during a large repack and pause if it grows beyond your tolerance.
pg_repack does not support tables with exclusion constraints. If your table uses EXCLUDE USING, pg_repack will refuse to process it and exit with an error. Your only alternatives are VACUUM FULL (with downtime) or partitioning the table and rebuilding partitions individually. Check for exclusion constraints with: SELECT conname FROM pg_constraint WHERE contype = 'x' AND conrelid = 'public.orders'::regclass;
Key Takeaways
- Table and index bloat silently degrade query performance and inflate storage costs — measure with
pgstattuplebefore acting. - VACUUM marks space as reusable but never returns it to the OS; only a full rewrite does that.
VACUUM FULLandCLUSTERboth require hours-longACCESS EXCLUSIVElocks, making them impractical for production tables under live load.pg_repackrebuilds tables and indexes online with only a few seconds of locking at the final swap — no planned downtime required.- Use
--no-kill-backendand--wait-timeoutin production to avoid disrupting active sessions. - For index-only bloat,
--only-indexesis faster and less I/O-intensive than a full table repack. - Tables with exclusion constraints cannot be repacked; plan alternative strategies for those early.
- Always confirm free disk space of at least 1.5x the table size before starting a repack.
Let JusDB Handle This For You
Identifying bloat, scheduling safe maintenance windows, tuning autovacuum, and validating repack runs on production systems takes hours of careful DBA work. At JusDB, our managed PostgreSQL service includes continuous bloat monitoring, automated pg_repack scheduling, and expert review of your autovacuum configuration — so your tables stay lean without anyone losing sleep over lock windows.
Talk to a JusDB engineer about a free database health audit and find out exactly where your bloat is hiding.