- ▸ Sentinel quorum problems - a 2-node Sentinel deployment with only 1 healthy node can't promote anything; you discovered that min-replicas-to-write wasn't tuned correctly only when production failed over.
- ▸ Replica promoted with stale data - a replica that had fallen behind was elected primary on failover because min-replicas-max-lag wasn't enforced, silently dropping the most recent writes.
- ▸ Replica failover taking 30s+ - Sentinel takes 10s to detect, 5s to vote, and 15s to reconfigure replicas; the app sees 30+ seconds of timeouts during what should be a graceful failover.
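The quorum and timing arithmetic behind the first and third problems can be sketched in a few lines of Python. This is a minimal illustration, not a JusDB tool: the function names are hypothetical, the phase durations are the figures quoted above, and the rule it encodes is that Sentinel needs both the configured quorum and a majority of all known Sentinels before a failover can be authorized.

```python
def majority(total_sentinels: int) -> int:
    """Smallest number of Sentinels that forms a strict majority."""
    return total_sentinels // 2 + 1


def can_authorize_failover(total_sentinels: int, healthy_sentinels: int, quorum: int) -> bool:
    """A failover needs `quorum` Sentinels to agree the primary is down
    AND a majority of all known Sentinels to authorize the failover leader."""
    needed = max(quorum, majority(total_sentinels))
    return healthy_sentinels >= needed


# Phase durations taken from the 30s+ failover example above (assumed, not measured).
FAILOVER_PHASES_SECONDS = {"detect": 10, "vote": 5, "reconfigure_replicas": 15}


def failover_budget_seconds(phases: dict) -> int:
    """Total time the application may see timeouts during a failover."""
    return sum(phases.values())


if __name__ == "__main__":
    # (total Sentinels, healthy Sentinels, configured quorum)
    for total, healthy, quorum in [(2, 1, 1), (3, 2, 2), (5, 3, 2)]:
        ok = can_authorize_failover(total, healthy, quorum)
        print(f"{healthy}/{total} Sentinels healthy, quorum={quorum}: failover possible = {ok}")
    print("failover budget (s):", failover_budget_seconds(FAILOVER_PHASES_SECONDS))
```

The 2-node case fails even with a quorum of 1 because one Sentinel is not a majority of two, which is why an odd number of Sentinels (three or more) is the usual starting point.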
JusDB HA consultants own the failover playbook + 15-minute incident SLA. Book an HA architecture review →
Valkey High Availability
In short: Valkey high availability (single-primary + replicas + Sentinel) involves quorum-based Sentinel deployment across AZs, sub-second replica lag monitoring, split-brain prevention via min-replicas-to-write and failover-timeout tuning, and automated failover in 15-30 seconds - plus cross-region async replicas and RDB snapshots for disaster recovery beyond local HA.
Production Sentinel quorum design, replica lag monitoring, split-brain prevention, and 15-30 second automated failover SLAs. For horizontal multi-shard scaling, see Valkey Cluster.
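As a rough illustration of replica lag monitoring and the min-replicas-to-write guard, the sketch below parses `INFO replication` output and checks whether the primary would still accept writes. The sample text, IP addresses, and helper names are made up for the example, and the `slaveN:` line format is assumed to match what your Valkey version reports; treat it as a starting point rather than a monitoring implementation.

```python
# Hypothetical INFO replication output from a primary with two replicas.
SAMPLE_INFO = """\
role:master
connected_slaves:2
slave0:ip=10.0.0.2,port=6379,state=online,offset=1000,lag=0
slave1:ip=10.0.0.3,port=6379,state=online,offset=400,lag=14
master_repl_offset:1000
"""


def parse_replicas(info_text: str) -> list:
    """Extract state, offset, and lag for each `slaveN:` line."""
    replicas = []
    for line in info_text.splitlines():
        key, sep, value = line.partition(":")
        # Keep only keys of the form slave0, slave1, ...
        if not sep or not key.startswith("slave") or not key[5:].isdigit():
            continue
        fields = dict(pair.split("=", 1) for pair in value.split(","))
        replicas.append({
            "state": fields["state"],
            "offset": int(fields["offset"]),
            "lag": int(fields["lag"]),  # seconds since the replica last acknowledged
        })
    return replicas


def good_replicas(replicas: list, max_lag: int) -> int:
    """Count online replicas whose lag is within min-replicas-max-lag."""
    return sum(1 for r in replicas if r["state"] == "online" and r["lag"] <= max_lag)


def primary_accepts_writes(replicas: list, min_replicas_to_write: int, max_lag: int) -> bool:
    """Mirror the min-replicas-to-write / min-replicas-max-lag rule."""
    return good_replicas(replicas, max_lag) >= min_replicas_to_write


if __name__ == "__main__":
    replicas = parse_replicas(SAMPLE_INFO)
    for needed in (1, 2):
        ok = primary_accepts_writes(replicas, min_replicas_to_write=needed, max_lag=10)
        print(f"min-replicas-to-write={needed}, max-lag=10s: writes accepted = {ok}")
```

With these sample values only one replica is within the 10-second lag bound, so requiring two healthy replicas would make the primary refuse writes instead of accepting data that could be lost on failover.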
Production HA capabilities
A typical Valkey HA deployment
The shape we deploy by default unless something in the workload pushes us to cluster mode.
HA FAQ
Related Valkey Services
Explore more ways our Valkey experts can help with your database infrastructure.