Aurora Global Database Architecture
Aurora Global Database differs fundamentally from standard MySQL/PostgreSQL replication: replication happens at the storage layer rather than at the SQL or WAL level. Aurora ships storage volume changes (redo log records) directly between regions, with typical replication lag under 1 second.
Structure:
- Primary region: Read/write Aurora cluster (up to 15 read replicas)
- Secondary regions: Up to 5 read-only Aurora clusters (up to 16 read replicas each)
- Each region keeps its own copy of the cluster volume; Aurora replicates storage-level changes between regions, not individual rows
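Once a global database exists, this layout can be inspected from the CLI. The following is a minimal sketch, assuming a configured AWS CLI; `list_global_members` is a hypothetical helper name and the identifier passed to it is a placeholder.

```shell
# Print one line per member cluster: its ARN and whether it is the writer.
# The function body runs in a subshell, so its variables stay local.
list_global_members() (
  global_id="$1"   # global cluster identifier (placeholder value in the usage note)
  aws rds describe-global-clusters \
    --global-cluster-identifier "$global_id" \
    --query 'GlobalClusters[0].GlobalClusterMembers[].[DBClusterArn,IsWriter]' \
    --output text
)
```

Calling `list_global_members my-global-db` should list the primary cluster with `IsWriter` set to true and each secondary with it set to false.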
Setting Up Global Database
# AWS CLI: Create global database from existing cluster
aws rds create-global-cluster \
--global-cluster-identifier my-global-db \
--source-db-cluster-identifier arn:aws:rds:us-east-1:123:cluster:primary-cluster
# Add a secondary region
aws rds create-db-cluster \
--db-cluster-identifier secondary-cluster \
--global-cluster-identifier my-global-db \
--engine aurora-postgresql \
--engine-version 15.4 \
--region eu-west-1
Write Forwarding
With write forwarding enabled, applications connected to a secondary region can execute writes — Aurora transparently forwards them to the primary region. This simplifies application architecture for global apps.
# Enable write forwarding on secondary cluster
aws rds modify-db-cluster \
--db-cluster-identifier secondary-cluster \
--enable-global-write-forwarding \
--region eu-west-1
Write forwarding limitations:
- Higher latency for writes (round-trip to primary region)
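- Not every statement can be forwarded; Aurora MySQL, for example, rejects XA transactions, LOAD DATA INFILE, and some DDL
- A read issued right after a forwarded write can return stale data until the change replicates back to the secondary; replication lag under 1 second keeps that window short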
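Before routing writes through a secondary, it may be worth confirming that forwarding is actually active there. A minimal sketch, assuming a configured AWS CLI; `write_forwarding_status` is a hypothetical helper name, and the cluster and region in the usage note are the placeholders used elsewhere in this article.

```shell
# Print the write forwarding state of a secondary cluster,
# as reported in the GlobalWriteForwardingStatus field of describe-db-clusters.
# The function body runs in a subshell, so its variables stay local.
write_forwarding_status() (
  cluster_id="$1"   # secondary cluster identifier
  region="$2"       # region that hosts the secondary cluster
  aws rds describe-db-clusters \
    --db-cluster-identifier "$cluster_id" \
    --region "$region" \
    --query 'DBClusters[0].GlobalWriteForwardingStatus' \
    --output text
)
```

For example, `write_forwarding_status secondary-cluster eu-west-1` prints the current state; anything other than `enabled` suggests writes sent to that cluster will fail rather than be forwarded.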
Planned vs Unplanned Failover
Planned Failover (RTO: 1-2 minutes)
# Graceful promotion of a secondary region to primary
aws rds failover-global-cluster \
--global-cluster-identifier my-global-db \
--target-db-cluster-identifier arn:aws:rds:eu-west-1:123:cluster:secondary-cluster
# Steps Aurora performs:
# 1. Blocks new writes on primary
# 2. Flushes remaining replication lag to secondary
# 3. Promotes secondary to primary
# 4. Reconfigures replication topology
# 5. Old primary becomes secondary
Unplanned Failover (RTO: ~1 minute, potential RPO)
# If primary region is completely unavailable:
# Detach the secondary from the global cluster and promote it
aws rds remove-from-global-cluster \
--global-cluster-identifier my-global-db \
--db-cluster-identifier arn:aws:rds:eu-west-1:123:cluster:secondary-cluster \
--region eu-west-1
# The detached cluster auto-promotes to a standalone writable cluster
# RPO = replication lag at time of failure (typically < 1 second)
Monitoring Global Database
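One way to read replication lag from outside the database is CloudWatch. The sketch below is a hedged example, assuming a configured AWS CLI; `max_replication_lag_ms` is a hypothetical helper name, and the dimension set is an assumption: some accounts publish this metric with a `SourceRegion` dimension alongside `DBClusterIdentifier`, so verify with `aws cloudwatch list-metrics` first.

```shell
# Print the maximum AuroraGlobalDBReplicationLag (milliseconds) seen over
# the last N minutes for one secondary cluster.
# The function body runs in a subshell, so its variables stay local.
max_replication_lag_ms() (
  cluster_id="$1"          # secondary cluster identifier
  region="$2"              # region that hosts the secondary cluster
  source_region="${3:-}"   # optional: primary region, if the metric carries SourceRegion
  minutes="${4:-5}"        # look-back window in minutes

  # Build the dimension list in the positional parameters (POSIX-friendly).
  set -- "Name=DBClusterIdentifier,Value=${cluster_id}"
  if [ -n "$source_region" ]; then
    set -- "$@" "Name=SourceRegion,Value=${source_region}"
  fi

  # The AWS CLI accepts Unix epoch seconds for timestamp parameters.
  now="$(date -u +%s)"
  start=$(( now - minutes * 60 ))

  aws cloudwatch get-metric-statistics \
    --namespace AWS/RDS \
    --metric-name AuroraGlobalDBReplicationLag \
    --dimensions "$@" \
    --start-time "$start" \
    --end-time "$now" \
    --period 60 \
    --statistics Maximum \
    --region "$region" \
    --query 'max(Datapoints[].Maximum)' \
    --output text
)
```

For example, `max_replication_lag_ms secondary-cluster eu-west-1 us-east-1` would print a value to compare against the 1000 ms target; an output of `None` means no datapoints matched, which usually points to a wrong dimension set.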
-- Check replication lag from primary
SELECT
server_id,
session_id,
aws_region,
visibility_lag_in_msec,
durable_lsn,
highest_lsn_rcvd
FROM aurora_global_db_instance_status();
-- CloudWatch metrics to monitor:
-- AuroraGlobalDBReplicationLag (target: < 1000 ms)
-- AuroraGlobalDBDataTransferBytes (egress cost)
-- AuroraGlobalDBProgressLag (during failover)
Cost Considerations
| Cost Component | Notes |
|---|---|
| Secondary cluster instances | Same price as primary region instances |
| Cross-region replication I/O | About $0.20 per million replicated write I/Os (varies by region; check current pricing) |
| Storage | Billed per region (storage is replicated, not shared) |
| Data transfer | Standard cross-region transfer rates apply to traffic leaving the primary region |
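To get a feel for how replication charges scale with write volume, a rough estimate can be scripted. This is a minimal sketch; `estimate_monthly_replication_cost` is a hypothetical helper, and both the daily volume and the unit price are inputs you supply in whatever unit your bill uses, not values taken from AWS pricing.

```shell
# Estimate a monthly replication charge as: units per day * price per unit * days.
# The function body runs in a subshell, so its variables stay local.
estimate_monthly_replication_cost() (
  units_per_day="$1"    # billable units replicated per day (same unit as the price)
  price_per_unit="$2"   # price per billable unit, in dollars
  days="${3:-30}"       # days in the billing period, default 30
  awk -v u="$units_per_day" -v p="$price_per_unit" -v d="$days" \
    'BEGIN { printf "%.2f\n", u * p * d }'
)
```

For example, `estimate_monthly_replication_cost 50 0.20` prints `300.00`: 50 billable units a day at $0.20 each over 30 days.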
Key Takeaways
- Storage-level replication typically achieves < 1 second lag, where logical replication can fall minutes behind
- Write forwarding enables transparent writes from any region — but adds latency
- Planned failover is fast and zero-RPO; unplanned has minimal RPO (~1s)
- Monitor AuroraGlobalDBReplicationLag continuously
- Cross-region replication I/O costs can be significant at high write volumes