Slow query logs tell you which queries are slow — but not why. When a query that ran in 50ms suddenly takes 4 seconds, you need the actual execution plan from that specific moment, not an average aggregated over thousands of runs. auto_explain fills exactly that gap: it automatically captures and logs the full EXPLAIN output for any query that exceeds your defined threshold. No manual intervention, no reproducing the condition in a staging environment — just the plan, right there in your PostgreSQL logs.
auto_explainis a PostgreSQL contrib module that automatically logs query execution plans for slow queries without manual EXPLAIN calls.- Enable globally via
shared_preload_libraries(requires restart) or per-session withLOAD 'auto_explain'(no restart needed). - Set
auto_explain.log_analyze = onto capture actual row counts and timing per plan node — at the cost of 10–30% execution overhead for logged queries. - On PostgreSQL 13+, use
auto_explain.sample_rateto log only a fraction of qualifying queries and reduce log volume on high-traffic databases. - Pair with
pg_stat_statements(aggregate trends) and pgBadger (log reports) for a complete slow query diagnostic workflow.
What is auto_explain?
auto_explain is a contrib module bundled with PostgreSQL that hooks into the executor and logs the EXPLAIN output for any statement whose execution time exceeds a configured threshold. It ships with every standard PostgreSQL installation — you do not need to compile anything or install an external package. A simple LOAD command or a line in postgresql.conf activates it.
The key distinction from every other slow-query tool is that auto_explain captures the actual plan for each individual execution. pg_stat_statements tells you that query X has run 10,000 times with an average latency of 800ms. That is useful. But when query X starts spiking to 5,000ms on Monday mornings, you need to know what plan the planner chose during those specific executions. Did it switch from an index scan to a sequential scan? Did a nested loop replace a hash join? auto_explain answers those questions in real time, without you having to reproduce the workload.
This makes auto_explain indispensable for two categories of problems:
- Plan regressions: A statistics refresh, a new autovacuum run, or a PostgreSQL minor upgrade causes the planner to choose a different — worse — plan. The query was fast yesterday; it is slow today. The plan that was chosen during those slow executions is exactly what you need to see.
- Parameter-sensitive plans: Queries with highly skewed data distributions can get radically different plans depending on the parameter value bound at execution time.
auto_explainlogs the plan alongside the query text, so you can correlate specific parameter values with bad plan choices.
Enabling and Configuring auto_explain
Method 1 — Global load via shared_preload_libraries (requires restart)
For persistent, database-wide coverage, add auto_explain to shared_preload_libraries in postgresql.conf. This loads the module at server startup and applies it to every session automatically.
# postgresql.conf — requires PostgreSQL service restart
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = 1000 # Log queries taking > 1 second
auto_explain.log_analyze = on # Include actual row counts (costs CPU)
auto_explain.log_buffers = on # Include buffer hit/miss stats
auto_explain.log_timing = on # Include per-node timing
auto_explain.log_triggers = off # Include trigger execution stats
auto_explain.log_verbose = off # Keep off in production (too verbose)
auto_explain.log_format = text # or json, xml, yaml
auto_explain.log_nested_statements = offAfter editing postgresql.conf, restart PostgreSQL:
# systemd-based systems
sudo systemctl restart postgresql
# Verify the module loaded
psql -c "SHOW shared_preload_libraries;"Method 2 — Per-session load (no restart needed)
When you are debugging a specific issue and cannot restart the server, load auto_explain for your session only. This is ideal for targeted investigations in production or staging without touching the server-wide configuration.
-- Load for this session only — no restart required
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 500;
SET auto_explain.log_analyze = ON;
-- Run the slow query — plan will be logged to PostgreSQL log
SELECT * FROM orders WHERE customer_id = 12345 ORDER BY created_at DESC LIMIT 100;The plan appears in the PostgreSQL server log (typically /var/log/postgresql/ or wherever log_directory points). The session ends, and the module is unloaded — no persistent change to the server.
Method 3 — RDS PostgreSQL
On Amazon RDS for PostgreSQL, shared_preload_libraries is managed through Parameter Groups and the available libraries are restricted. Use the per-session method instead:
-- On RDS PostgreSQL — per-session load is the practical approach
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '1s';
SET auto_explain.log_analyze = true;
-- Slow query plans will appear in RDS Enhanced Monitoring logs
-- or in the PostgreSQL error log exported to CloudWatchKey Configuration Parameters
| Parameter | Default | Recommended (Production) | Description |
|---|---|---|---|
log_min_duration |
-1 (disabled) | 1000ms | Threshold in milliseconds; queries longer than this get their plan logged |
log_analyze |
off | on | Includes actual row counts and timing per plan node (adds 10–30% overhead) |
log_buffers |
off | on | Shows shared/local buffer hits, reads, writes per node |
log_timing |
on | on | Per-node wall clock timing (requires log_analyze) |
log_triggers |
off | off | Include trigger execution statistics in the plan output |
log_format |
text | text | Plan output format: text, json, xml, or yaml |
log_nested_statements |
off | off | Logs plans for queries executed inside PL/pgSQL functions |
sample_rate |
1.0 | 0.1–1.0 | Fraction of qualifying queries to log (PostgreSQL 13+); reduces log volume |
auto_explain.log_analyze = on executes the query and measures actual rows and timing at each plan node. This adds 10–30% execution overhead for the logged queries. On high-QPS databases, use auto_explain.sample_rate = 0.01 (1%) or log_analyze = off to minimize impact.
Using sample_rate on high-traffic databases (PostgreSQL 13+)
# Log only 10% of qualifying slow queries — reduces log volume by 90%
auto_explain.sample_rate = 0.1
auto_explain.log_min_duration = 500 # 500ms threshold
auto_explain.log_analyze = on
auto_explain.log_buffers = onThe sample_rate parameter is applied probabilistically per query execution. At 0.1, roughly 1 in 10 slow queries will have its plan logged. This is a practical compromise between diagnostic coverage and log storage costs.
Set
auto_explain.log_min_duration 2–3x higher than your log_min_duration_statement so that auto_explain only captures plans for the very worst offenders, while pgBadger or pg_stat_statements handles the broader slow query landscape. For example: log_min_duration_statement = 500ms and auto_explain.log_min_duration = 1500ms.
Reading auto_explain Output in Logs
When auto_explain fires, the log entry contains the query text and the full plan tree. Here is a realistic example showing a problematic query that performs two sequential scans and a hash join on large tables:
2026-03-05 10:42:17 UTC [12345] LOG: duration: 2341.827 ms plan:
Query Text: SELECT o.*, u.email FROM orders o JOIN users u ON u.id = o.user_id WHERE o.status = 'pending' AND o.created_at > '2026-02-03'
Hash Join (cost=1847.53..45231.76 rows=12847 width=156) (actual time=89.432..2341.123 rows=12847 loops=1)
Hash Cond: (o.user_id = u.id)
Buffers: shared hit=2847 read=38923
-> Seq Scan on orders o (cost=0.00..38234.00 rows=12847 width=89) (actual time=0.023..1823.456 rows=12847 loops=1)
Filter: ((status = 'pending') AND (created_at > '2026-02-03'))
Rows Removed by Filter: 487153
Buffers: shared hit=847 read=38076
-> Hash (cost=1234.00..1234.00 rows=50000 width=67) (actual time=87.123..87.123 rows=50000 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 4096kB
Buffers: shared hit=2000 read=847
-> Seq Scan on users u (cost=0.00..1234.00 rows=50000 width=67) (actual time=0.018..45.234 rows=50000 loops=1)Reading this plan reveals several actionable signals immediately:
- Seq Scan on orders with 487,153 rows removed by filter: The planner is reading the entire table and discarding 97% of rows. A composite index on
(status, created_at)would eliminate this. - 38,923 shared buffer reads on orders: Nearly all data is coming from disk (or OS page cache miss). The table is too large to fit in
shared_buffersfor this query. - 2,341ms total duration with 1,823ms spent in the orders Seq Scan: 78% of the query time is in that one scan, confirming the index as the fix to pursue.
Using JSON format for structured log parsing
For environments that ship PostgreSQL logs into ELK, Datadog, or a custom log aggregator, the JSON plan format makes machine parsing straightforward:
# Use JSON for log parsing / ingestion into ELK / Datadog
auto_explain.log_format = json# Extract slow query plans from PostgreSQL log (text post-processing example)
grep -A 50 "auto_explain" /var/log/postgresql/postgresql.log | \
python3 -c "
import sys, json
for line in sys.stdin:
if 'plan:' in line:
# Extract and pretty-print JSON plan
pass
"Debugging PL/pgSQL stored procedures
# Enable to see plans for queries inside stored procedures
# Use only in dev/staging — log volume will be very high with log_min_duration = 0
auto_explain.log_nested_statements = on
auto_explain.log_min_duration = 0With log_nested_statements = on, every SQL statement executed inside a PL/pgSQL function, trigger, or DO block that exceeds the threshold will have its plan logged. This is invaluable when you have a stored procedure that runs fine on small data but degrades on production-scale inputs.
auto_explain vs pg_stat_statements vs pgBadger — When to Use Each
These three tools address overlapping but distinct questions. Using them together gives a complete picture of query performance:
| Tool | Captures | Actual plan? | Plan per execution? | Best for |
|---|---|---|---|---|
auto_explain |
Full query plan with actual stats | Yes (with log_analyze) |
Yes | Plan regression diagnosis, parameter-sensitive plans |
pg_stat_statements |
Aggregated execution statistics | No | No | Finding the top-N slowest query shapes overall |
| pgBadger | Log file analysis and reporting | No | No | Historical trends, per-hour reports, lock waits summary |
| EXPLAIN ANALYZE | Single query plan on demand | Yes | Yes | Interactive debugging when you can reproduce the query |
A typical workflow: pg_stat_statements identifies that SELECT ... FROM orders WHERE status = $1 is responsible for 40% of total query time this week. pgBadger confirms the spikes happen between 9am and 11am UTC. You then enable auto_explain with a threshold targeting those executions and capture the actual plan during the next spike window. Now you have the plan — and you can fix it.
Production Tuning and Log Management
Running auto_explain in production requires balancing diagnostic value against log volume and execution overhead. The following configuration represents a reasonable starting point for most workloads above 500 QPS:
# Production-balanced auto_explain configuration
shared_preload_libraries = 'pg_stat_statements,auto_explain'
# Only log the worst 1% of slow queries to limit I/O
auto_explain.log_min_duration = 2000 # 2 seconds — well above p99 baseline
auto_explain.log_analyze = on
auto_explain.log_buffers = on
auto_explain.log_timing = on
auto_explain.log_verbose = off # Never enable in production
auto_explain.log_format = text
auto_explain.sample_rate = 0.25 # Log 25% of qualifying queries
auto_explain.log_nested_statements = offKey decisions to revisit over time:
- Threshold tuning: Start at 2–3x your observed p99 latency. If your p99 is 300ms, set the threshold at 800ms–1000ms. Adjust down only after confirming log volume is manageable.
- Log rotation: Plan lines can be hundreds of lines long. A single hour of activity with
log_min_duration = 100can produce gigabytes of logs. Ensurelogrotateis configured with appropriate size and frequency limits. - log_analyze in practice: The overhead is real but bounded — it only applies to queries that are already slow (above the threshold). A 20% overhead on a 2-second query is 400ms extra. For the diagnostic value, this is usually acceptable when
sample_rateis kept low. - Pairing with pg_stat_statements: Both modules can coexist in
shared_preload_libraries. They complement each other without interference.
Key Takeaways
- auto_explain captures what no other tool does: the actual query plan for a specific slow execution, logged automatically without manual intervention.
- Load per-session with
LOAD 'auto_explain'when you need to debug without restarting PostgreSQL — critical for production investigations on RDS or live systems. - Always pair
log_analyze = onwith a meaningful threshold: the overhead is proportional to logged queries, not all queries, so a high threshold keeps cost manageable. - Use
sample_rateon PostgreSQL 13+ to cap log volume on high-QPS databases without sacrificing visibility into plan regressions. - auto_explain, pg_stat_statements, and pgBadger are complementary: use pg_stat_statements and pgBadger to identify problem queries, then auto_explain to diagnose the plan at execution time.
- For JSON log pipelines, set
auto_explain.log_format = jsonto make plan ingestion into ELK, Datadog, or custom tooling straightforward.
Working with JusDB on PostgreSQL Query Analysis
Configuring auto_explain is straightforward. Interpreting the plan output at scale — correlating plan changes with deployment events, statistics refreshes, or workload shifts — is where deep PostgreSQL expertise pays off. At JusDB, our PostgreSQL engagements routinely involve exactly this kind of plan regression analysis: establishing performance baselines, enabling targeted diagnostic instrumentation, and translating raw plan output into concrete index, schema, or configuration changes.
If you are dealing with intermittent slow queries that are hard to reproduce, parameter-sensitive plan instability, or a PostgreSQL upgrade that changed planner behavior, we can help you work through it systematically. Get in touch to discuss your PostgreSQL environment.