Every PostgreSQL DBA has encountered it without fully understanding it: a table whose total size dwarfs its visible main-table size, queries that slow down when filtering on a jsonb column, or an autovacuum job that runs on a table you have never heard of. The culprit in all three cases is usually TOAST — The Oversized-Attribute Storage Technique — PostgreSQL's built-in mechanism for handling values that are too large to fit on a standard 8 KB data page. TOAST is invisible by design, but when it misbehaves it can cause significant write amplification, bloat, and query latency. Understanding exactly how it works, how to diagnose problems it introduces, and how to tune it is an essential skill for anyone running PostgreSQL in production.
- PostgreSQL's default page size is 8 KB; when a row's stored size exceeds roughly 2 KB, large variable-length columns are automatically moved to a separate TOAST table (
pg_toast.pg_toast_<table_oid>). - Four storage strategies control how each column is handled: PLAIN, EXTENDED (default for text/jsonb), EXTERNAL, and MAIN — choosing the right one can meaningfully reduce CPU and I/O overhead.
- PostgreSQL 14+ supports LZ4 compression for TOAST, which is dramatically faster than the legacy pglz compressor and cuts write CPU significantly for large text and JSONB columns.
- Updating any column in a row that has TOASTed values rewrites all TOAST entries for that row — even unchanged ones — causing write amplification that is easy to miss in profiling.
- TOAST tables accumulate dead tuples independently and must be monitored and vacuumed just like regular tables.
What is TOAST in PostgreSQL?
PostgreSQL organizes data into fixed-size pages, each exactly 8 KB by default. A single row — its header, all fixed-width columns, and all variable-length column data — must fit within that page for the row to be stored inline. Variable-length types such as text, bytea, jsonb, xml, arrays, and hstore can be arbitrarily large, far exceeding that 8 KB limit.
To handle this, PostgreSQL uses TOAST: The Oversized-Attribute Storage Technique. The threshold is not the full 8 KB page size — because a page contains metadata and can hold multiple rows, PostgreSQL begins considering TOAST when a single row's data exceeds roughly 2 KB (2,000 bytes). When that threshold is crossed, PostgreSQL applies a two-step process: first it attempts to compress the oversized column value; if it is still too large after compression, it moves the value entirely out of the main table and into a dedicated TOAST table.
Every table that contains at least one TOAST-eligible column automatically gets a companion TOAST table. The naming convention is deterministic:
-- Find the TOAST table for any given user table
SELECT 'pg_toast.' || t.relname AS toast_table,
pg_size_pretty(pg_relation_size('pg_toast.' || t.relname)) AS toast_size,
pg_size_pretty(pg_relation_size(m.oid)) AS main_size
FROM pg_class m
JOIN pg_class t ON t.oid = m.reltoastrelid
WHERE m.relname = 'your_table';The main table row stores a TOAST pointer — a small fixed-size reference — in place of the actual value. When PostgreSQL needs the value, it follows that pointer to the TOAST table, reassembles any chunked data, and decompresses it if necessary. From the application's perspective this is entirely transparent: queries read and write data as if nothing special has happened.
TOAST applies to any column whose data type is variable-length and whose storage strategy is not PLAIN. The eligible types include text, varchar, bytea, jsonb, json, xml, hstore, and array types. Fixed-width types like int, bigint, boolean, and timestamp are never TOASTed.
How TOAST Works
TOAST Storage Strategies
PostgreSQL exposes four storage strategies, settable per column. Choosing the right one for your access pattern is the first lever for TOAST optimization.
| Strategy | Code | Compression | Out-of-line storage | Best for |
|---|---|---|---|---|
| PLAIN | p | No | No | Small fixed-width types; cannot TOAST at all |
| EXTENDED | x | Yes (first) | Yes (if still large) | Default for text, jsonb; best general-purpose choice |
| EXTERNAL | e | No | Yes | Binary data (bytea) where compression is wasteful or pre-compressed content |
| MAIN | m | Yes (first) | Yes (last resort only) | Columns where keeping data inline matters for performance; avoids TOAST heap fetch |
-- Check current TOAST strategy per column
SELECT attname, attstorage,
CASE attstorage
WHEN 'p' THEN 'PLAIN'
WHEN 'e' THEN 'EXTENDED'
WHEN 'x' THEN 'EXTERNAL'
WHEN 'm' THEN 'MAIN'
END AS strategy
FROM pg_attribute
WHERE attrelid = 'your_table'::regclass
AND attnum > 0
AND NOT attisdropped;
-- Change TOAST strategy for specific columns
ALTER TABLE documents ALTER COLUMN content SET STORAGE EXTERNAL;
ALTER TABLE events ALTER COLUMN payload SET STORAGE EXTENDED;TOAST Tables
When a value is moved out-of-line, PostgreSQL chunks it into segments of up to 2 KB each and stores each chunk as a separate row in the TOAST table. The TOAST table has three columns: chunk_id (an OID identifying which value the chunk belongs to), chunk_seq (ordering within the value), and chunk_data (the actual bytes). Reassembly on read happens automatically in the executor.
TOAST tables live in the pg_toast schema and are inaccessible to normal SQL without explicitly setting search_path. They have their own indexes (on chunk_id, chunk_seq) and their own visibility map, free space map, and autovacuum tracking entries.
Compression
For columns using EXTENDED or MAIN strategies, PostgreSQL compresses the value before deciding whether to move it out-of-line. Historically, the only available compressor was pglz (a variant of LZ77). PostgreSQL 14 introduced support for LZ4, which compresses and decompresses significantly faster than pglz at a modest reduction in compression ratio — typically the right trade-off for database workloads where CPU is more constrained than storage.
-- Set LZ4 compression on a column (PostgreSQL 14+)
ALTER TABLE documents ALTER COLUMN content SET COMPRESSION lz4;
-- Verify compression method per column
SELECT attname, attcompression
FROM pg_attribute
WHERE attrelid = 'documents'::regclass
AND attnum > 0
AND NOT attisdropped;EXTENDED to EXTERNAL storage — this avoids double-compression overhead and can reduce write CPU by up to 40%. Use LZ4 on human-readable text and structured JSONB that PostgreSQL compresses well.Diagnosing TOAST Bloat and Performance Issues
Identifying Tables with High TOAST Overhead
The first diagnostic step is finding which tables have significant TOAST overhead relative to their main-table size. A large gap between pg_relation_size (main table only) and pg_total_relation_size (main + TOAST + indexes) indicates heavy TOAST usage.
-- Tables with significant TOAST overhead
SELECT
schemaname,
tablename,
pg_size_pretty(pg_relation_size(schemaname||'.'||tablename)) AS main_size,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)
- pg_relation_size(schemaname||'.'||tablename)) AS toast_and_index_size,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS total_size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC
LIMIT 20;Inspecting the TOAST Table Directly
Once you have identified a candidate table, inspect its TOAST table for bloat indicators. Dead tuples in a TOAST table mean that old versions of large column values are accumulating and consuming space without being reclaimed.
-- Check TOAST table dead tuple count and size
SELECT
relname AS toast_table,
n_dead_tup,
n_live_tup,
pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_stat_user_tables
WHERE relname LIKE 'pg_toast_%'
ORDER BY n_dead_tup DESC;Measuring Column-Level TOAST Pressure
To understand how many rows in a specific table are actually being TOASTed and what the average value size is, query the column directly:
-- Average and maximum TOAST column size, count of rows exceeding threshold
SELECT
avg(length(content::text)) AS avg_content_bytes,
max(length(content::text)) AS max_content_bytes,
count(*) FILTER (WHERE length(content::text) > 2048) AS toasted_rows
FROM documents;SELECT list but is referenced in a WHERE clause or ORDER BY. Avoid using large TOASTed columns as filter predicates. Always add a dedicated, indexed column for any searchable attribute — never filter directly on a large jsonb or text column without a supporting index that avoids the TOAST fetch.Optimizing TOAST Storage Strategy
Step 1: Audit Current Strategies
Before making any changes, audit the current storage strategy for every column in your largest tables using the query from the strategies section above. Identify columns that are:
- Using
EXTENDEDbut storing pre-compressed or binary data (considerEXTERNAL) - Using
EXTENDEDbut containing text that PostgreSQL compresses poorly (considerEXTERNALor LZ4) - Using
EXTERNALbut storing highly compressible structured text (considerEXTENDEDwith LZ4) - Candidates for the write-amplification optimization described in Step 3
Step 2: Apply LZ4 Compression on PostgreSQL 14+
For any column using EXTENDED or MAIN strategy that holds human-readable text or structured JSONB, switch to LZ4. The change takes effect for newly written values; existing TOASTed values retain their original compression method until they are rewritten by an UPDATE.
-- Apply LZ4 to multiple columns
ALTER TABLE documents ALTER COLUMN content SET COMPRESSION lz4;
ALTER TABLE events ALTER COLUMN payload SET COMPRESSION lz4;
-- Force existing rows to be rewritten with the new compression
-- (only if you need the change to apply immediately; this is a table rewrite)
VACUUM FULL documents;
-- Or use pg_repack for a non-blocking rewrite:
-- pg_repack -t documentsStep 3: Eliminate Write Amplification with Table Splitting
This is the most impactful optimization and the one most commonly overlooked. When PostgreSQL performs an UPDATE on any column in a row that has TOASTed values, it rewrites all TOAST chunks for that row — even columns that did not change. A table that mixes frequently-updated status columns with a large infrequently-updated JSONB payload is a write-amplification trap.
-- Anti-pattern: large JSONB alongside frequently updated columns
CREATE TABLE events (
id BIGINT PRIMARY KEY,
user_id INT, -- updated frequently (e.g., reassignment)
status TEXT, -- updated frequently (e.g., state machine transitions)
raw_payload JSONB -- large, written once, never updated
);
-- Better: isolate the large, stable column into a 1:1 child table
CREATE TABLE events (
id BIGINT PRIMARY KEY,
user_id INT,
status TEXT
);
CREATE TABLE event_payloads (
event_id BIGINT PRIMARY KEY REFERENCES events(id),
raw_payload JSONB
);With the split design, updates to user_id or status do not touch event_payloads at all. The TOAST table for event_payloads accumulates dead tuples only when the payload itself is rewritten — which in this case is never. Write amplification is eliminated.
Step 4: Set EXTERNAL for Pre-compressed Binary Columns
If a bytea column stores data that is already compressed at the application layer (gzip, zstd, etc.), the default EXTENDED strategy will attempt to compress it again, burning CPU for negligible or negative gain. Set it to EXTERNAL:
# Verify the column stores incompressible data before switching
# Then alter the storage strategy
psql -d mydb -c "ALTER TABLE uploads ALTER COLUMN file_data SET STORAGE EXTERNAL;"TOAST and Autovacuum
TOAST tables are not exempt from bloat — they accumulate dead tuples every time a row with TOASTed values is updated or deleted. Each UPDATE that touches a TOASTed row creates new TOAST chunks and marks the old ones dead. Each DELETE marks all TOAST chunks for that row dead. Until autovacuum reclaims those dead tuples, they consume space and degrade sequential scan performance on the TOAST table.
A critical and frequently misunderstood detail: autovacuum tracks and vacuums TOAST tables independently from their parent tables. If you disable autovacuum on a parent table (via autovacuum_enabled = false in its storage parameters), the TOAST table continues to autovacuum on its own schedule. Conversely, if your TOAST table's autovacuum thresholds are set too high, it may lag behind even when the parent table is healthy.
Monitor TOAST table autovacuum health with:
-- TOAST table vacuum stats
SELECT
relname,
last_autovacuum,
last_autoanalyze,
n_dead_tup,
n_live_tup,
autovacuum_count,
pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_stat_user_tables
WHERE relname LIKE 'pg_toast_%'
ORDER BY n_dead_tup DESC
LIMIT 20;n_dead_tup count that is not decreasing between autovacuum cycles indicates that autovacuum cannot keep up — either because the vacuum is being blocked by long-running transactions holding snapshots, or because the autovacuum worker is underpowered for the write rate. In either case, the TOAST table will grow unbounded. Check pg_stat_activity for long-running transactions and consider increasing autovacuum_vacuum_cost_delay or adding autovacuum workers.If you need to force a TOAST table vacuum manually, you must reference it by its system name:
-- Manually vacuum a specific TOAST table
-- First find its name:
SELECT 'pg_toast.' || t.relname AS toast_table
FROM pg_class m
JOIN pg_class t ON t.oid = m.reltoastrelid
WHERE m.relname = 'documents';
-- Then vacuum it directly:
VACUUM VERBOSE pg_toast.pg_toast_12345;Key Takeaways
- TOAST is automatic but not free. PostgreSQL moves large column values out of the main table transparently, but every TOAST fetch is a separate heap read — and that cost adds up at scale.
- Choose your storage strategy deliberately.
EXTENDEDis correct for most text and JSONB. UseEXTERNALfor pre-compressed binary data. UseMAINsparingly for columns where keeping data inline outweighs the compression overhead. - Switch to LZ4 on PostgreSQL 14+. It is substantially faster than pglz for both compression and decompression, with only a modest reduction in compression ratio. Apply it to all large text and JSONB columns that are not storing pre-compressed data.
- Write amplification is the hidden cost. Mixing frequently updated columns with large stable columns in the same table causes every UPDATE to rewrite all TOAST chunks for that row. Split the table.
- Monitor TOAST tables for bloat. They accumulate dead tuples independently and need autovacuum attention. If a parent table has high churn, its TOAST table may be the primary source of bloat and I/O overhead.
- Never filter on raw TOASTed columns. Always index a searchable derived attribute rather than using a TOASTed
jsonbortextcolumn directly inWHEREpredicates.
Working with JusDB on PostgreSQL Storage Optimization
TOAST is one of the layers where seemingly minor schema decisions have outsized production consequences. The difference between a table that autovacuums cleanly and one that accumulates TOAST bloat at 10 GB per week is often a single schema pattern — wide tables mixing hot and cold columns, or a missing storage strategy change on a binary column. These issues are rarely visible in application-level metrics until they are already causing real latency or disk pressure.
At JusDB, our PostgreSQL engineering practice covers the full storage stack: schema design reviews that identify TOAST write-amplification risks before they reach production, storage strategy audits on existing tables, LZ4 compression migrations using non-blocking tooling like pg_repack, and autovacuum tuning specific to TOAST-heavy workloads. If your PostgreSQL instance has tables where total size is many multiples of main-table size, or where autovacuum is perpetually running on TOAST tables without catching up, these are tractable problems with well-defined solutions.
Contact JusDB to schedule a PostgreSQL storage health review.
Related Posts
- PostgreSQL Autovacuum Tuning — Deep-dive into autovacuum configuration, cost parameters, and bloat prevention strategies.
- PostgreSQL Table Bloat and pg_repack — How to reclaim bloated table and index space without taking downtime.
- PostgreSQL JSONB vs JSON — Performance characteristics, indexing strategies, and when to choose each type.