Database Performance

PostgreSQL Performance Tuning: shared_buffers, work_mem, and Memory Configuration

A complete guide to PostgreSQL memory configuration and performance tuning, covering shared_buffers, work_mem, effective_cache_size, checkpoint tuning, and autovacuum settings that matter in production.

JusDB Team
July 12, 2022
12 min read
155 views

In March last year, a FinTech startup running PostgreSQL 15 on a 16 GB RAM RDS instance reached out to JusDB with a familiar complaint: their dashboard queries were timing out at peak load, and their on-call engineer had been woken up twice in a week by autovacuum locking critical tables. A quick review of their postgresql.conf revealed the database was still running almost entirely on installation defaults — shared_buffers at 128 MB, work_mem at 4 MB, and autovacuum scale factors tuned for a test laptop, not a production OLTP workload. Their 16 GB server was behaving like a 512 MB one.

After a two-hour configuration session, without touching a line of application code, dashboard query p99 dropped from 4.2 seconds to 380 milliseconds. The autovacuum incidents stopped entirely. The on-call rotation went quiet.

PostgreSQL ships with conservative defaults because it must run on hardware ranging from a Raspberry Pi to a 768 GB bare-metal server. Those defaults keep the database from crashing on constrained hardware, but they will leave substantial performance on the table on any production server. Understanding which knobs matter — and how they interact — is the difference between a database that limps and one that flies.

This guide covers the parameters that move the needle on real OLTP workloads: buffer management, WAL tuning, autovacuum, planner cost parameters, and connection overhead. Everything is grounded in concrete values for a 16 GB RAM server.

TL;DR
  • Set shared_buffers to 25% of total RAM (4 GB on a 16 GB server) and effective_cache_size to 75% (12 GB) so the planner knows how much OS page cache is available.
  • Size work_mem carefully: it is allocated per sort/hash operation per connection, so max_connections * work_mem can exceed RAM if set carelessly. Start at 32–64 MB and raise only for sessions that need it.
  • Set checkpoint_completion_target = 0.9 and increase max_wal_size to 2–4 GB to spread checkpoint I/O and avoid write spikes.
  • Tighten autovacuum thresholds for large tables: the default 20% scale factor is catastrophic for a 50-million-row table, triggering vacuum only after 10 million dead tuples accumulate.
  • Set random_page_cost = 1.1 on SSD storage; the default of 4.0 causes the planner to avoid index scans that would actually be fast.
  • Enable pg_stat_statements to identify the queries actually driving load before tuning anything else.

Background

PostgreSQL's architecture is built around shared memory, write-ahead logging, and a background worker ecosystem. Knowing which subsystem each parameter controls helps you understand why defaults are conservative and what the real risk is when you raise them.

The shared buffer pool (shared_buffers) is PostgreSQL's in-process cache. Data pages are read from disk into shared buffers and served from there on subsequent accesses. The OS also caches file data in its own page cache. PostgreSQL is designed to lean on both layers, which is why effective_cache_size (which tells the planner how large the combined cache is) matters as much as shared_buffers itself.

Every data modification goes through the write-ahead log before it is applied to heap pages. WAL settings control how aggressively PostgreSQL batches and flushes log entries. Checkpoints periodically flush dirty shared buffer pages to disk; tuning checkpoint behavior is the primary lever for smoothing out write I/O spikes under heavy write workloads.

Autovacuum is not optional housekeeping — it is the mechanism that reclaims dead tuple space and prevents transaction ID wraparound. A misconfigured autovacuum is one of the most common root causes of production PostgreSQL incidents, and it almost always traces back to default thresholds that were never updated as the table grew.


Key Parameters Explained

Shared Memory and Cache Parameters

shared_buffers is the most impactful single parameter for read-heavy workloads. It controls the size of PostgreSQL's shared buffer pool, the in-memory cache that all server processes share. Every read from the database goes through shared buffers first; a cache hit here costs nanoseconds, while a disk read costs microseconds to milliseconds.

The canonical starting point is 25% of total system RAM. On a 16 GB server, set this to 4 GB. You will see recommendations ranging from 15% to 40%; 25% is the safe midpoint that leaves enough RAM for the OS page cache (which PostgreSQL also benefits from), active connections, and work_mem allocations.

Warning: Do not set shared_buffers above 40% of RAM on systems with heavy write workloads. A very large shared buffer pool increases checkpoint I/O because more dirty pages must be flushed at each checkpoint interval. Balance shared_buffers with appropriate checkpoint tuning.

effective_cache_size does not allocate any memory. It is a hint to the query planner telling it how much data it can expect to find in the combined shared buffers plus OS page cache. A higher value makes the planner more willing to use index scans (which need random I/O that is fast if data is cached). Set this to approximately 75% of total RAM — 12 GB on a 16 GB server.

maintenance_work_mem controls memory allocated for maintenance operations: VACUUM, CREATE INDEX, ALTER TABLE ADD FOREIGN KEY, and cluster operations. Unlike work_mem, only a small number of maintenance operations run concurrently, so you can set this higher. 256 MB to 1 GB is reasonable for most servers. A higher value makes index builds and vacuum operations significantly faster.

work_mem and Connection Overhead

work_mem is the most dangerous parameter to raise blindly. It controls the memory allocated to each sort operation and hash table inside a single query. A single complex query can use multiple sort/hash operations simultaneously. With 100 active connections, each running a query with three sort nodes, PostgreSQL could allocate 100 × 3 × work_mem bytes of RAM simultaneously.

Important: The formula that matters is: (max_connections × max_sort_operations_per_query × work_mem) ≤ available RAM. With max_connections = 200 and work_mem = 256MB, your theoretical RAM ceiling is 51 GB — which will cause OOM kills on a 16 GB server under load. Raise work_mem incrementally and monitor actual memory usage with pg_stat_activity.

A practical starting point for a 16 GB OLTP server is work_mem = 32MB. For analytics-heavy sessions that run large sorts, set it per-session with SET work_mem = '256MB' rather than globally. For servers using a connection pooler (PgBouncer), lower max_connections aggressively — 100 to 200 server-side connections through a pooler serving 1,000 application threads is both safe and correct. This dramatically reduces the risk of work_mem memory exhaustion.

WAL and Checkpoint Parameters

wal_buffers controls the shared memory buffer for WAL data that has been generated but not yet written to the WAL files on disk. The default (-1, meaning 1/32 of shared_buffers, capped at 64 MB) is acceptable for most workloads. On write-heavy servers, explicitly setting wal_buffers = 64MB ensures the auto-calculation gives a useful value.

checkpoint_completion_target controls what fraction of the checkpoint interval PostgreSQL uses to spread out dirty page writes. The default is 0.9 in PostgreSQL 14+ (it was 0.5 in older versions). This means PostgreSQL spreads checkpoint writes over 90% of the interval between checkpoints, smoothing I/O. If you are on an older version with the default at 0.5, raising it to 0.9 is a high-priority change.

max_wal_size sets the soft upper limit on WAL size on disk between checkpoints. The default is 1 GB. On write-heavy servers, raising this to 2–4 GB reduces checkpoint frequency, which reduces write amplification and I/O spikes. The trade-off is a longer crash recovery time if the server restarts uncleanly — acceptable for most OLTP workloads, but worth reviewing for systems with strict RTO requirements.

Tip: Watch the buffers_checkpoint and buffers_clean columns in pg_stat_bgwriter. If buffers_checkpoint is dramatically higher than buffers_clean, your checkpoints are doing most of the dirty work and you should increase max_wal_size to space them out more.

Configuration Guide

The following is a recommended postgresql.conf configuration for a 16 GB RAM OLTP server running on SSD storage. These are starting points — validate with monitoring before treating them as final.

# Memory
shared_buffers          = 4GB          # 25% of 16GB RAM
effective_cache_size    = 12GB         # 75% of 16GB RAM
work_mem                = 32MB         # Per sort/hash; monitor for OOM
maintenance_work_mem    = 512MB        # For VACUUM, CREATE INDEX

# WAL and Checkpoints
wal_buffers             = 64MB
checkpoint_completion_target = 0.9
max_wal_size            = 2GB
min_wal_size            = 512MB

# Planner Cost Parameters (SSD)
random_page_cost        = 1.1          # SSD: near-sequential cost. HDD default: 4.0
effective_io_concurrency = 200         # SSD: high concurrency. HDD: 2

# Connections
max_connections         = 100          # Use PgBouncer in front; keep this low

# Statistics
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max  = 10000
pg_stat_statements.track = all

After editing postgresql.conf, reload for most parameters (SELECT pg_reload_conf();) or restart for parameters marked as requiring a restart (shared_buffers, max_connections, shared_preload_libraries).

To check which parameters require a restart versus a reload:

sql
SELECT name, setting, unit, context
FROM pg_settings
WHERE context IN ('postmaster', 'sighup')
  AND name IN (
    'shared_buffers', 'work_mem', 'maintenance_work_mem',
    'effective_cache_size', 'max_connections', 'checkpoint_completion_target',
    'max_wal_size', 'random_page_cost', 'effective_io_concurrency'
  )
ORDER BY context, name;

The following table compares PostgreSQL defaults against recommended values for a 16 GB RAM OLTP server on SSD:

Parameter Default Recommended (16 GB SSD) Notes
shared_buffers 128 MB 4 GB 25% of RAM; requires restart
effective_cache_size 4 GB 12 GB Planner hint only; no memory allocated
work_mem 4 MB 32 MB Per sort/hash op; multiply by concurrency
maintenance_work_mem 64 MB 512 MB For VACUUM and index builds
wal_buffers -1 (auto) 64 MB Explicit; auto often resolves similarly
checkpoint_completion_target 0.9 (PG14+) 0.9 Was 0.5 pre-PG14; update if on older version
max_wal_size 1 GB 2 GB Raise to reduce checkpoint frequency
random_page_cost 4.0 1.1 Critical for SSD; planner avoids indexes at 4.0
effective_io_concurrency 1 200 SSD: set high; HDD: set 2
max_connections 100 100 Use PgBouncer; do not raise blindly

Autovacuum Tuning

Autovacuum is the background process that reclaims space from dead tuples (rows updated or deleted but not yet removed), updates table statistics, and prevents transaction ID wraparound — a hard limit that causes PostgreSQL to refuse all writes. It is not optional, and it is not a performance tax; it is the mechanism that keeps the database healthy under write load.

The default configuration is calibrated for small databases. On large OLTP tables, it causes two failure modes: autovacuum runs too infrequently (allowing bloat and stale statistics to accumulate) or it runs too aggressively (locking tables or consuming I/O at the wrong time). Both are fixable with the right parameters.

Core Autovacuum Thresholds

autovacuum_vacuum_threshold (default: 50) is the minimum number of dead tuples in a table before autovacuum will consider running. autovacuum_vacuum_scale_factor (default: 0.2) is the fraction of the total table size that must be dead tuples before autovacuum triggers.

The trigger condition is: dead_tuples > autovacuum_vacuum_threshold + (autovacuum_vacuum_scale_factor × table_rows)

For a table with 50 million rows, the default scale factor means autovacuum triggers only after 10 million dead tuples accumulate. By that point, table bloat is significant, query plans are stale, and heap scans are reading megabytes of dead data on every query.

Warning: The 20% scale factor default is appropriate for tables with tens of thousands of rows. For tables with millions of rows, reduce the scale factor to 1–2% or use per-table storage parameters. A single misconfigured large table can corrupt query planner estimates for the entire database.

Apply tighter autovacuum settings to specific high-churn tables using storage parameters, rather than changing the global defaults (which affects all tables including small ones):

sql
-- Apply aggressive autovacuum to a large, high-churn OLTP table
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor     = 0.01,  -- Trigger at 1% dead tuples
    autovacuum_vacuum_threshold        = 1000,  -- Or after 1,000 dead tuples minimum
    autovacuum_analyze_scale_factor    = 0.005, -- Update stats at 0.5% row changes
    autovacuum_analyze_threshold       = 500,
    autovacuum_vacuum_cost_delay       = 2      -- ms; lower = more aggressive I/O
);

Autovacuum Workers and Parallelism

autovacuum_max_workers (default: 3) controls how many autovacuum worker processes can run simultaneously. On a server with dozens of active tables experiencing concurrent write load, three workers can become a bottleneck. Raising to 5–6 workers on a server with sufficient I/O headroom prevents vacuum lag from accumulating across multiple tables simultaneously.

autovacuum_vacuum_cost_delay (default: 2 ms in PG13+, was 20 ms in older versions) introduces intentional sleep between vacuum I/O operations to throttle its impact on active queries. If vacuum is running too slowly on high-churn tables, reducing this value (or setting it to 0 for specific critical tables) allows vacuum to run faster at the cost of more I/O competition.

# Global autovacuum settings in postgresql.conf
autovacuum_max_workers          = 5
autovacuum_naptime              = 30s       # How often autovacuum launcher checks tables
autovacuum_vacuum_scale_factor  = 0.05     # 5% for moderate-size tables globally
autovacuum_vacuum_threshold     = 500
autovacuum_analyze_scale_factor = 0.02
autovacuum_analyze_threshold    = 250
autovacuum_vacuum_cost_delay    = 2ms
autovacuum_vacuum_cost_limit    = 400      # Default 200; higher = more throughput
Tip: Check which tables autovacuum is spending the most time on using pg_stat_user_tables. Tables with high n_dead_tup relative to n_live_tup are candidates for per-table autovacuum tuning or a manual VACUUM ANALYZE.
sql
-- Find tables with excessive dead tuple accumulation
SELECT
    schemaname,
    relname,
    n_live_tup,
    n_dead_tup,
    ROUND(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 2) AS dead_pct,
    last_vacuum,
    last_autovacuum,
    last_analyze,
    last_autoanalyze
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC
LIMIT 20;

Monitoring Performance

Configuration changes without monitoring are guesswork. The most important tool for query-level analysis in PostgreSQL is pg_stat_statements, which must be loaded via shared_preload_libraries and tracks cumulative statistics for every distinct query the server has executed.

sql
-- Top 10 queries by total execution time since last reset
SELECT
    LEFT(query, 80)                              AS query_snippet,
    calls,
    ROUND(total_exec_time::numeric, 2)           AS total_ms,
    ROUND(mean_exec_time::numeric, 2)            AS mean_ms,
    ROUND(stddev_exec_time::numeric, 2)          AS stddev_ms,
    ROUND((total_exec_time / SUM(total_exec_time) OVER ()) * 100, 2) AS pct_total_time,
    rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

Use the pct_total_time column to prioritize: a query running 10,000 times per day at 1 ms each is often more impactful than a query running twice a day at 500 ms. Focus tuning efforts on the queries consuming the largest share of cumulative execution time.

For checkpoint and bgwriter I/O health:

sql
-- Checkpoint and bgwriter statistics
SELECT
    checkpoints_timed,
    checkpoints_req,                                     -- Forced checkpoints: should be low
    ROUND(checkpoint_write_time / 1000.0 / 60, 2)       AS checkpoint_write_min,
    ROUND(checkpoint_sync_time / 1000.0 / 60, 2)        AS checkpoint_sync_min,
    buffers_checkpoint,
    buffers_clean,
    maxwritten_clean,                                    -- bgwriter was throttled if > 0
    buffers_backend,                                     -- Backend had to write directly: bad
    buffers_alloc,
    stats_reset
FROM pg_stat_bgwriter;

A high checkpoints_req value relative to checkpoints_timed indicates WAL is filling faster than max_wal_size allows — increase max_wal_size. A non-zero maxwritten_clean indicates the bgwriter hit its write cap — consider increasing bgwriter_lru_maxpages. Non-trivial buffers_backend means backends are writing dirty pages directly because the bgwriter is not keeping up — a sign of sustained write pressure that may require I/O or WAL tuning.

For connection and memory pressure, monitor active sessions and their state:

sql
-- Connection state summary: how many connections and what they're doing
SELECT
    state,
    wait_event_type,
    wait_event,
    COUNT(*) AS count
FROM pg_stat_activity
WHERE pid <> pg_backend_pid()
GROUP BY state, wait_event_type, wait_event
ORDER BY count DESC;
Tip: For a full observability stack, JusDB recommends pairing pg_stat_statements with Prometheus (via postgres_exporter), Grafana dashboards, and — for managed environments — PMM (Percona Monitoring and Management), which ships with PostgreSQL-specific dashboards for vacuum lag, bloat, query performance, and replication health. See our guide on Database Monitoring with Prometheus, Grafana, and PMM.

Key Takeaways

Key Takeaways
  • Set shared_buffers to 25% of RAM and effective_cache_size to 75% of RAM. These two parameters together have the largest single impact on query planner decisions and cache hit rates.
  • Treat work_mem as a per-operation, per-connection value. Multiply it by your connection count and expected query complexity before raising it. Use PgBouncer to reduce server-side max_connections and give yourself more headroom.
  • Raise max_wal_size to 2–4 GB and confirm checkpoint_completion_target = 0.9 to spread checkpoint I/O. Monitor checkpoints_req in pg_stat_bgwriter — forced checkpoints mean your WAL ceiling is too low.
  • Set random_page_cost = 1.1 on any SSD-backed server immediately. The default of 4.0 causes the planner to choose sequential scans over index scans that would be substantially faster on solid-state storage.
  • Tighten autovacuum scale factors for large tables — either globally or per-table via ALTER TABLE ... SET (...). The 20% default is appropriate for small tables and actively harmful for tables with millions of rows.
  • Enable pg_stat_statements before tuning anything. It tells you which queries to optimize, surfaces regressions after configuration changes, and is the single most valuable observability tool available inside PostgreSQL.

Working with JusDB on PostgreSQL

PostgreSQL performance tuning is not a one-time configuration exercise. As your data volume grows, query patterns shift, and hardware changes, the optimal configuration drifts. Teams that perform well over time do so because they have monitoring in place that surfaces regressions early and a clear mental model of how the parameters interact.

JusDB's PostgreSQL consulting team works with engineering teams to establish that foundation: a properly tuned baseline configuration, per-table autovacuum policies, a pg_stat_statements-backed query review process, and dashboards that make configuration drift visible before it becomes an incident.

Whether you are tuning a single RDS instance or designing a multi-region PostgreSQL architecture for a high-growth product, we can help you get there without the trial-and-error learning curve.

PostgreSQL Consulting Services Talk to a PostgreSQL Expert

Share this article