EXPLAIN ANALYZE Is Your Most Important Tool
Every slow PostgreSQL query tells its story through EXPLAIN ANALYZE. The challenge is reading that story. This guide walks through real query plans, explains what each node means, and shows you how to identify the exact bottleneck.
The Basic Syntax
-- Basic: show estimated plan (no execution)
EXPLAIN SELECT * FROM orders WHERE user_id = 42;
-- Full: execute and show actual stats
EXPLAIN (ANALYZE, BUFFERS, VERBOSE, FORMAT TEXT)
SELECT * FROM orders WHERE user_id = 42;Always use BUFFERS — it shows cache hits vs disk reads, which is often the real story.
Reading the Plan Tree
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id, u.email, SUM(oi.price)
FROM orders o
JOIN users u ON u.id = o.user_id
JOIN order_items oi ON oi.order_id = o.id
WHERE o.created_at > '2025-01-01'
GROUP BY o.id, u.email;
-- Sample output:
HashAggregate (cost=45231.82..45891.82 rows=66000 width=48)
(actual time=2341.5..2389.2 rows=58392 loops=1)
Buffers: shared hit=12043 read=8291
-> Hash Join (cost=8231.00..41631.82 rows=720000 width=40)
(actual time=145.3..1987.4 rows=720000 loops=1)
Hash Cond: (oi.order_id = o.id)
-> Seq Scan on order_items oi (cost=0..15000 rows=1000000 width=16)
(actual time=0.1..456.2 rows=1000000 loops=1)
Buffers: shared hit=5412 read=7801
-> Hash (cost=7231.00..7231.00 rows=80000 width=32)
(actual time=142.1..142.1 rows=58392 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 4096kB
-> Hash Join (cost=1231.00..7231.00 rows=80000 width=32)
-> Index Scan using orders_created_at_idx ...Key Metrics to Analyze
1. Rows Estimated vs Actual
-- Bad: planner estimated 100 rows, got 50,000
Seq Scan on big_table (cost=0..1000 rows=100 width=200)
(actual time=0.1..8432.1 rows=50000 loops=1)
-- Fix: update statistics
ANALYZE big_table;
-- Or increase statistics target for skewed columns
ALTER TABLE big_table ALTER COLUMN status SET STATISTICS 500;
ANALYZE big_table;2. Buffers: Shared Read vs Hit
-- 'read' = disk I/O, 'hit' = from shared_buffers cache
Buffers: shared hit=100 read=50000 -- BAD: lots of disk reads
Buffers: shared hit=50100 read=0 -- GOOD: fully cached
-- If read is high, either:
-- a) Increase shared_buffers / use larger instance
-- b) Add an index to reduce pages read
-- c) Investigate if a full table scan is happening3. Nested Loop with Many Loops
-- Nested Loop (actual time=0.1..9823.4 rows=58392 loops=1)
-- -> Seq Scan on users (actual time=0.1..234.1 rows=1 loops=58392)
-- 'loops=58392' means the inner scan ran 58,392 times!
-- Total inner time: 234.1ms * 58392 = ~13.7 seconds
-- Fix: add index on the join column
CREATE INDEX ON users (id);Common Plan Nodes
| Node | What it does | When it's a problem |
|---|---|---|
| Seq Scan | Full table scan | On large tables when selectivity is high |
| Index Scan | B-tree index + heap fetch | Heap fetches expensive if rows are scattered |
| Index Only Scan | Index covers all needed columns | Rarely a problem — this is good |
| Bitmap Heap Scan | Batch heap fetches after bitmap index scan | High rows can be slow |
| Hash Join | Build hash table on smaller input | When hash table spills to disk |
| Nested Loop | Iterate outer, probe inner | When loops * inner_time is large |
| Merge Join | Merge two sorted inputs | When sort is expensive |
Using auto_explain
-- Log plans for slow queries automatically
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '100ms';
SET auto_explain.log_analyze = true;
SET auto_explain.log_buffers = true;
-- In postgresql.conf (for all sessions):
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = 500 -- ms
auto_explain.log_analyze = on
auto_explain.log_buffers = on
auto_explain.log_format = jsonInterpreting Cost Numbers
Costs are in arbitrary units (default: 1 unit = reading 1 page sequentially). They are estimates, not milliseconds. What matters is the relative cost between plan nodes and whether estimates match actuals.
-- Cost format: (startup cost..total cost)
-- startup: time to return first row
-- total: time to return all rows
Seq Scan (cost=0.00..18584.00 ...)
-- ^ ^
-- startup totalKey Takeaways
- Always use
BUFFERSto see disk I/O vs cache hits - Large discrepancies between estimated and actual rows = stale statistics
- High
loopson inner nodes in Nested Loop = missing index on join column - High
shared read= data not cached = I/O bottleneck - Use
auto_explainto capture plans of slow queries in production