Database SRE

AWS RDS PostgreSQL Cost Optimization

Optimize AWS RDS PostgreSQL costs without sacrificing performance. Learn instance sizing, storage class selection, backup strategies, and Multi-AZ optimization.

JusDB Team
November 8, 2022
5 min read
4798 views

We ran a cost audit on an e-commerce client's AWS bill last quarter and found they were paying $14,200/month for RDS PostgreSQL. After a week of changes — none of which touched the application code — they were at $7,900/month. The biggest wins: switching from a db.r6g to db.r7g Graviton (same RAM, 15% cheaper), converting from io1 to gp3 with matched IOPS (saved $1,800/month on storage alone), and buying 1-year Reserved Instances for the production cluster.

This guide is the checklist we use when auditing RDS PostgreSQL costs. No fluff — just the decisions that move the number, in priority order.

TL;DR checklist
  • Right-size on Graviton (db.t4g / db.m7g / db.r7g) before anything else.
  • Pick the cheapest HA that fits RTO/RPO: Single-AZ → Multi-AZ DB Instance (2 nodes) → Multi-AZ DB Cluster (3 nodes).
  • Use gp3 storage; provision only the IOPS/throughput you need; enable Storage autoscaling.
  • Tune autovacuum early to avoid bloat (wasted storage + I/O).
  • Stop non-prod nightly; schedule automatic stop/start.
  • Buy Reserved Instances for steady prod (1–3 years).
  • Trim snapshots & keep log retention sane (CloudWatch costs add up).
  • Use pgBouncer or RDS Proxy to prevent connection storms from forcing oversized instances.
  • Monitor replication slots & long transactions—both can inflate WAL/storage costs.

1) Right-size the instance (start with Graviton)

  • Prefer Graviton families (db.t4g, db.m7g, db.r7g) for better price/perf; use t for light, spiky dev/test only.
  • Match memory to workload: PostgreSQL benefits from RAM (shared buffers + OS cache). If queries spill, try more RAM before more vCPUs.
  • Re-evaluate size after fixing slow queries and bloat; many clusters run 1–2 sizes too big due to avoidable I/O.

2) Choose the right HA option (don’t overpay)

  • Single-AZ: cheapest, no automatic failover—fine for dev/test and low-impact internal tools.
  • Multi-AZ DB Instance: primary + standby in another AZ (roughly 2× instance hours); automatic failover.
  • Multi-AZ DB Cluster: 1 writer + 2 readers (3 instances) with faster failover; higher cost, higher availability/perf.

Tip: Reserve Multi-AZ/Cluster for prod OLTP. Many staging and internal apps don’t need it.

3) Storage: use gp3 and size IOPS deliberately

  • gp3 fits most workloads; decouples capacity and performance.
  • Set explicit IOPS/throughput based on metrics; avoid overprovisioning “just in case”.
  • io1/io2 only when you have sustained high I/O that gp3 can’t satisfy.
  • Enable storage autoscaling to expand capacity without emergency downtime.

4) Backups, snapshots, and WAL

  • Automated backup storage is typically free up to DB size in-region; manual snapshots & longer retention are billed.
  • Prune old manual snapshots; copy cross-region only for DR requirements.
  • Watch WAL accumulation: stalled logical slots or long transactions keep WAL around → higher storage & I/O.

5) Autovacuum & bloat control (direct cost impact)

  • Tune autovacuum_vacuum_scale_factor / autovacuum_analyze_scale_factor per table (lower for hot tables) to prevent bloat that wastes storage and IOPS.
  • Avoid long transactions—they block vacuum progress and inflate table/WAL size.
  • Use pg_repack (extension supported on RDS) to reclaim space without long locks when tables already bloated.
  • Partition very large, write-heavy tables to keep vacuum/maintenance localized and cheaper.

6) Stop idle environments

  • For dev/stage/sandbox: stop RDS nightly/weekends. You still pay for storage/backups, but not instance hours.
  • Automate stop/start via EventBridge + Lambda or SSM Maintenance Windows.

7) Commit for steady prod: Reserved Instances

  • Buy RDS Reserved Instances (1 or 3 years) for 24×7 prod to capture meaningful discounts over On-Demand.
  • Mix Convertible and Standard to balance flexibility and savings.

8) Observability costs (keep them lean)

  • Performance Insights: 7-day retention is free; longer retention costs per vCPU—set the minimum that suits your investigations.
  • CloudWatch Logs: keep log_min_duration_statement high enough for signal; set sane retention to cap costs.

9) Connection management: pgBouncer vs RDS Proxy

  • pgBouncer (session pooling) is lightweight and inexpensive; deploy on ECS/EC2 for app tiers that open many short-lived connections.
  • RDS Proxy adds managed pooling/failover help and can be worth it if it lets you run a smaller DB or stabilizes serverless/Lambda spikes.

10) Data transfer & topology

  • Co-locate app and DB in the same AZ to avoid cross-AZ data transfer charges.
  • Use read replicas only when they offset primary load; turn off unused ones.

PostgreSQL-specific quick wins

ActionEffortTypical impactWhen to do it
Switch to GravitonLow–MedLower $/perfMaintenance window or blue/green
Move gp2/PIOPS → gp3LowLower storage $Most OLTP workloads
Enable storage autoscalingLowAvoid emergency resizesImmediately
Tune autovacuum on hot tablesLowCut IOPS & storage bloatWeek 1
Reclaim bloat with pg_repackMedLower storage $After metrics show bloat
Partition largest write-heavy tablesMedCheaper maintenanceWhen tables exceed tens of GB
Stop non-prod nightly/weekendsLow~50–70% instance $ off-hoursDev/test/sandbox
Buy RIs for prodLowDeep discountStable 24×7 workloads
Cap PI/CloudWatch retentionLowLower observability $Immediately
Add pgBouncer / RDS ProxyMedSmaller DB possibleConnection storms, serverless
Monitor logical slots & WALLowPrevent runaway storageAlways-on check
  1. Measure a week: CPU, RAM, read/write IOPS, throughput, connections, slow queries, bloat (pg_stat), WAL growth, autovacuum activity.
  2. Right-size on Graviton; switch storage to gp3 with explicit IOPS/throughput; enable storage autoscaling.
  3. Tune autovacuum for hot tables; fix long transactions; monitor logical replication slots; repack if already bloated.
  4. Trim snapshots & set CloudWatch/PI retention; schedule stop/start for all non-prod.
  5. Buy RIs for steady prod once stable; consider pgBouncer or RDS Proxy if connection churn is the bottleneck.

If you want a hands-on audit of your RDS PostgreSQL fleet — we'll identify the savings, run the changes safely, and report on the outcome — reach out here. We can usually turn a cost audit around in 48 hours and have implementation running within a week.

Related reading: EBS gp2 vs gp3 Guide | AWS RDS MySQL Cost Optimization | PostgreSQL VACUUM Tuning

Further Reading

For more in-depth information, check out these authoritative resources:

Explore more database insights from JusDB:

Share this article

JusDB Team

Official JusDB content team