PostgreSQL Table Bloat and pg_repack

Your PostgreSQL database has been running smoothly for months, but disk usage keeps climbing and queries are getting slower — even though you're running VACUUM religiously. The culprit is almost certainly table bloat: dead tuples and fragmented pages that VACUUM marks as reusable but never actually returns to the OS. Left unchecked, bloat inflates table size by 2–5x, degrades index efficiency, and eventually forces an emergency VACUUM FULL that locks your tables for hours. This post walks through exactly how to measure bloat, why standard VACUUM falls short, and how to use pg_repack to reclaim disk space with zero downtime.

TL;DR

Table bloat is dead tuple space that VACUUM marks free but doesn't return to the OS.
Use pgstattuple to measure real bloat percentages per table and index.
VACUUM FULL reclaims space but holds an ACCESS EXCLUSIVE lock — no reads or writes during the operation.
pg_repack rebuilds tables and indexes online with only brief lock windows at the very start and end.
Know the limitations: pg_repack does not support tables with exclusion constraints.

What Is Table Bloat?

PostgreSQL uses MVCC (Multi-Version Concurrency Control) to handle concurrent reads and writes without locking. Every UPDATE writes a new row version and marks the old one dead; every DELETE marks the existing row dead without immediately removing it. This is intentional — long-running transactions may still need to see those old versions.

VACUUM's job is to mark dead tuples as reusable so future INSERTs can occupy that space. The critical word is reusable: the space stays allocated to the table's data file on disk. Only VACUUM FULL physically rewrites the table file and returns unused pages to the operating system. When a table sees frequent updates and deletes, dead space accumulates faster than new inserts can reuse it — this is bloat.

Index bloat follows the same pattern. Every dead heap tuple still has a corresponding index entry until VACUUM cleans it, and index pages themselves fragment over time as entries are inserted and deleted across the B-tree structure. A bloated index is slower to scan and consumes more RAM in shared_buffers.

Warning

Autovacuum alone is not a cure for heavily written tables. If your write throughput outpaces autovacuum workers, bloat will grow continuously. Tune autovacuum_vacuum_cost_delay and autovacuum_vacuum_scale_factor aggressively for high-churn tables before relying on any reclaim strategy.

Measuring Bloat with pgstattuple

Before you reclaim anything, measure. PostgreSQL ships with the pgstattuple extension, which inspects page-level tuple visibility to give you accurate bloat numbers — far more reliable than the commonly cited approximation queries built on pg_class statistics.

Enable the extension once per database:

sql

CREATE EXTENSION IF NOT EXISTS pgstattuple;

Then query a specific table:

sql

SELECT
    table_len,
    tuple_count,
    tuple_len,
    dead_tuple_count,
    dead_tuple_len,
    round((dead_tuple_len::numeric / NULLIF(table_len, 0)) * 100, 2) AS dead_pct,
    free_space,
    round((free_space::numeric / NULLIF(table_len, 0)) * 100, 2) AS free_pct
FROM pgstattuple('public.orders');

A table where dead_pct + free_pct exceeds 20–30% is a strong candidate for reclamation. To scan all user tables in one shot (be aware this performs a full sequential scan of every table and will be slow on large databases):

sql

SELECT
    schemaname,
    tablename,
    (pgstattuple(schemaname || '.' || tablename)).dead_tuple_len AS dead_bytes,
    (pgstattuple(schemaname || '.' || tablename)).table_len AS total_bytes,
    round(
        (pgstattuple(schemaname || '.' || tablename)).dead_tuple_len::numeric
        / NULLIF((pgstattuple(schemaname || '.' || tablename)).table_len, 0) * 100,
        2
    ) AS dead_pct
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY dead_bytes DESC
LIMIT 20;

For index bloat, use pgstatindex:

sql

SELECT
    index_size,
    leaf_fragmentation,
    avg_leaf_density
FROM pgstatindex('orders_customer_id_idx');

A leaf_fragmentation above 30% or an avg_leaf_density below 60% indicates an index worth rebuilding.

Tip

Save bloat measurements to a table before and after any reclaim operation. A simple INSERT INTO bloat_log SELECT now(), ... FROM pgstattuple(...) gives you a before/after comparison you can share with your team and reference during the next review cycle.

Why VACUUM FULL Isn't Enough

VACUUM FULL does exactly what you want from a space-reclamation standpoint: it rewrites the entire table into a new heap file, drops the old one, and rebuilds all indexes from scratch. Free space goes back to the OS. The problem is the lock it requires.

VACUUM FULL acquires an ACCESS EXCLUSIVE lock on the table for its entire duration. Every read, write, and even SELECT queues behind it. On a 50 GB table this can mean two to four hours of total unavailability — unacceptable for any production workload with SLA requirements.

CLUSTER has the same lock behavior with the added benefit of physically reordering rows by an index, which can dramatically improve sequential scan performance. But the lock story is identical: ACCESS EXCLUSIVE for the full duration.

Comparing the three approaches:

Method	Lock Held	Returns Space to OS	Rebuilds Indexes	Downtime Required
VACUUM	ShareUpdateExclusive	No	No	None
VACUUM FULL	AccessExclusive (full run)	Yes	Yes	Full duration
CLUSTER	AccessExclusive (full run)	Yes	Yes	Full duration
pg_repack	AccessExclusive (seconds)	Yes	Yes	None (brief lock at end)

pg_repack: Online Space Reclamation

How pg_repack Works

pg_repack uses a combination of triggers, a shadow table, and logical replication to rebuild your table while it remains fully online. The rough sequence is:

Acquires a brief ACCESS SHARE lock to install a trigger that logs changes to a queue table.
Copies the live table into a new shadow table in a dedicated schema.
Builds all indexes on the shadow table concurrently.
Applies queued changes (inserts, updates, deletes that happened during the copy) to the shadow table.
Acquires an ACCESS EXCLUSIVE lock — typically for only a few seconds — swaps the shadow table into place, and drops the original.

The result is a compacted, defragmented table and indexes, with the lock window measured in seconds rather than hours.

Installation

On Debian/Ubuntu:

bash

sudo apt-get install postgresql-17-repack

On RHEL/Rocky Linux via PGDG:

bash

sudo dnf install pg_repack_17

Then install the extension in each target database:

sql

CREATE EXTENSION pg_repack;

Verify the version:

bash

pg_repack --version

Warning

The pg_repack client binary version must match the installed extension version exactly. A mismatch produces a cryptic error at runtime. Always install both from the same package source.

Repacking a Single Table

The most common operation: reclaim bloat from one heavily-written table.

bash

pg_repack \
  --host=localhost \
  --port=5432 \
  --username=postgres \
  --dbname=myapp \
  --table=public.orders \
  --echo

The --echo flag prints each SQL statement as it executes, useful for confirming progress in a terminal session.

Protecting Long-Running Transactions with --no-kill-backend

By default, pg_repack will terminate backends that hold conflicting locks when it needs the final ACCESS EXCLUSIVE lock for the swap. In most production environments you want to avoid killing user sessions. Use --no-kill-backend combined with --wait-timeout instead:

bash

pg_repack \
  --dbname=myapp \
  --table=public.orders \
  --no-kill-backend \
  --wait-timeout=60 \
  --echo

With this configuration, pg_repack will wait up to 60 seconds for conflicting locks to clear. If they don't, it aborts without touching the table — safe to retry later rather than disrupting active sessions.

Index-Only Repack

If heap bloat is acceptable but your indexes are severely fragmented, you can rebuild indexes alone. This is faster and carries a smaller lock window than a full table repack:

bash

pg_repack \
  --dbname=myapp \
  --table=public.orders \
  --only-indexes \
  --echo

To target a single specific index:

bash

pg_repack \
  --dbname=myapp \
  --index=public.orders_customer_id_idx \
  --echo

Repacking an Entire Database

To process all eligible tables in a database at once, omit the --table flag. Add --jobs to parallelize across multiple tables simultaneously (use with care on busy systems):

bash

pg_repack \
  --dbname=myapp \
  --jobs=2 \
  --no-kill-backend \
  --wait-timeout=30 \
  --echo

Monitoring Progress

During a long repack, query the repack schema tables in a separate session to track progress:

sql

-- See active repack operations and their shadow tables
SELECT schemaname, tablename, n_live_tup
FROM pg_stat_user_tables
WHERE schemaname = 'repack';

-- Watch the queue drain (rows = pending changes to apply)
SELECT count(*) FROM repack.log_<oid>;

Alternatively, watch pg_stat_activity for the repack backend:

sql

SELECT pid, state, wait_event_type, wait_event, query_start, query
FROM pg_stat_activity
WHERE application_name ILIKE '%repack%';

Comparing Bloat Before and After

Capture a snapshot before starting:

sql

CREATE TABLE IF NOT EXISTS bloat_snapshots (
    captured_at   timestamptz DEFAULT now(),
    table_name    text,
    total_bytes   bigint,
    dead_bytes    bigint,
    dead_pct      numeric
);

INSERT INTO bloat_snapshots (table_name, total_bytes, dead_bytes, dead_pct)
SELECT
    'public.orders',
    table_len,
    dead_tuple_len,
    round(dead_tuple_len::numeric / NULLIF(table_len, 0) * 100, 2)
FROM pgstattuple('public.orders');

Run the repack, then insert another row with the updated stats. Compare:

sql

SELECT
    captured_at,
    table_name,
    pg_size_pretty(total_bytes) AS total_size,
    pg_size_pretty(dead_bytes) AS dead_size,
    dead_pct
FROM bloat_snapshots
ORDER BY captured_at;

A successful repack on a heavily bloated table typically reduces total_bytes by 30–60% and drops dead_pct to near zero.

Best Practices

Schedule during low-traffic periods. pg_repack is online, but the copy phase does generate significant I/O. The final lock swap happens faster when fewer transactions are open.
Ensure enough disk space. pg_repack creates a full copy of the table while the original still exists. You need at least 1.5x the current table size free during the operation.
Test on a replica first. Run pgstattuple on a read replica to estimate bloat without impacting the primary. Validate your repack commands against a staging database before running on production.
Tune autovacuum after reclaiming. Once bloat is cleared, adjust autovacuum_vacuum_scale_factor downward for high-churn tables so bloat doesn't return at the same pace.
Monitor replication lag. The shadow table copy generates WAL. On streaming replication setups, watch replica lag during a large repack and pause if it grows beyond your tolerance.

Warning

pg_repack does not support tables with exclusion constraints. If your table uses EXCLUDE USING, pg_repack will refuse to process it and exit with an error. Your only alternatives are VACUUM FULL (with downtime) or partitioning the table and rebuilding partitions individually. Check for exclusion constraints with: SELECT conname FROM pg_constraint WHERE contype = 'x' AND conrelid = 'public.orders'::regclass;

Key Takeaways

Key Takeaways

Table and index bloat silently degrade query performance and inflate storage costs — measure with pgstattuple before acting.
VACUUM marks space as reusable but never returns it to the OS; only a full rewrite does that.
VACUUM FULL and CLUSTER both require hours-long ACCESS EXCLUSIVE locks, making them impractical for production tables under live load.
pg_repack rebuilds tables and indexes online with only a few seconds of locking at the final swap — no planned downtime required.
Use --no-kill-backend and --wait-timeout in production to avoid disrupting active sessions.
For index-only bloat, --only-indexes is faster and less I/O-intensive than a full table repack.
Tables with exclusion constraints cannot be repacked; plan alternative strategies for those early.
Always confirm free disk space of at least 1.5x the table size before starting a repack.

Let JusDB Handle This For You

Identifying bloat, scheduling safe maintenance windows, tuning autovacuum, and validating repack runs on production systems takes hours of careful DBA work. At JusDB, our managed PostgreSQL service includes continuous bloat monitoring, automated pg_repack scheduling, and expert review of your autovacuum configuration — so your tables stay lean without anyone losing sleep over lock windows.

Talk to a JusDB engineer about a free database health audit and find out exactly where your bloat is hiding.

PostgreSQL Table Bloat and pg_repack: Reclaim Disk Without Downtime

What Is Table Bloat?

Measuring Bloat with pgstattuple

Why VACUUM FULL Isn't Enough

pg_repack: Online Space Reclamation

How pg_repack Works

Installation

Repacking a Single Table

Protecting Long-Running Transactions with --no-kill-backend

Index-Only Repack

Repacking an Entire Database

Monitoring Progress

Comparing Bloat Before and After

Best Practices

Key Takeaways

Let JusDB Handle This For You

Share this article

Keep reading

SQL Server Wait Stats: A Diagnostic Playbook for Slow Queries

InnoDB Architecture Explained (2026): Buffer Pool, Redo Log & Production Tuning

MySQL 8.4 Parallel DDL: innodb_parallel_read_threads & innodb_ddl_threads Tuning

Need Expert Help?

Database Performance Optimization

PostgreSQL Consulting

PostgreSQL Migration

PostgreSQL Support

PostgreSQL High Availability

PostgreSQL Cloud Migration