Valkey arrived as a drop-in Redis replacement after the license change, but most teams that migrated simply lifted their existing redis.conf and called it done. That works fine until traffic doubles, a memory spike evicts the wrong keys at 2 AM, or a cluster rebalance stalls and latency spikes to hundreds of milliseconds across every shard. The difference between a Valkey deployment that merely works and one that performs reliably under production pressure comes down to three areas: memory management, persistence strategy, and cluster configuration. This guide covers all three, with the exact parameters and the reasoning behind every decision.
Valkey is not Redis — it has diverged on internal optimizations and roadmap — but it inherits the same operational DNA, which means the same classes of production problems appear, and the same diagnostic tools apply. The commands and config parameters here are tested against Valkey 7.2 and 8.x.
- Set maxmemory explicitly: without it, Valkey will consume all available RAM and trigger OOM kills.
- Use allkeys-lru for pure cache workloads, volatile-lru when some keys must survive eviction, and allkeys-random only for uniform-access patterns where recency is irrelevant.
- For persistence: AOF with appendfsync everysec gives the best durability-to-performance trade-off. RDB snapshots are fast for backups but can lose up to save-interval seconds of data on a crash.
- In cluster mode, tune cluster-node-timeout to avoid false failovers on network hiccups, and set cluster-require-full-coverage no for workloads that tolerate partial availability.
- Use valkey-cli --latency-history to track latency spikes over time and catch slowdowns before users report them.
Memory Policies and maxmemory Configuration
Memory is the most operationally dangerous axis in Valkey. An unconfigured instance running on a 32 GB host will happily consume all 32 GB, at which point the Linux OOM killer arrives and terminates the process with no warning and no persistence flush. The first line in every production Valkey config should be a hard memory cap.
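As a rough illustration of the sizing rule of thumb this guide recommends (cap Valkey at roughly 75–80% of host RAM), the following Python sketch turns a host size into a maxmemory value in bytes. The helper name suggest_maxmemory_bytes and its 0.75 default are illustrative assumptions, not part of Valkey.

```python
def suggest_maxmemory_bytes(host_ram_bytes: int, fraction: float = 0.75) -> int:
    """Return a maxmemory cap as a fraction of host RAM (hypothetical helper)."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return int(host_ram_bytes * fraction)


GIB = 1024 ** 3

# 32 GB host, as in the scenario described above
cap = suggest_maxmemory_bytes(32 * GIB)

# maxmemory accepts a plain byte count
print(f"CONFIG SET maxmemory {cap}")
```

The remaining 20–25% is the headroom for the OS, fork copy-on-write pages, and replication buffers; hosts that also run other services will need a lower fraction.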
# Set a hard memory ceiling — use 75-80% of total host RAM
# to leave headroom for the OS, replication buffers, and AOF rewrite
CONFIG SET maxmemory 8gb
# Verify the current memory configuration
CONFIG GET maxmemory
CONFIG GET maxmemory-policy
INFO memory

Once maxmemory is set, you must also set maxmemory-policy to tell Valkey what to do when the limit is reached. The default policy, noeviction, returns errors to the client on write commands when memory is full. That is rarely what you want. Here are the three policies that cover the vast majority of production use cases:
allkeys-lru
Use when: Valkey is a pure cache — every key is reconstructible from a backing store, no key is special, and you want the best cache-hit ratio. The LRU (Least Recently Used) algorithm evicts the key that was accessed least recently across the entire keyspace.
CONFIG SET maxmemory-policy allkeys-lru

This is the correct choice for session caches, rendered page caches, API response caches, and rate limiter counters whose windows are short enough that the counters expire before eviction pressure reaches them. Under allkeys-lru, every key competes for memory on equal terms: TTLs play no part in eviction order, although keys with a TTL are still removed when they expire.
Valkey uses an approximated LRU algorithm by default: it samples maxmemory-samples (default 5) random keys and evicts the least recently used of that sample. Increase maxmemory-samples to 10 for a closer approximation of true LRU at a small CPU cost. The improvement in eviction quality is measurable on skewed access patterns.
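The effect of the sample size can be seen in a small simulation. This is a simplified sketch of sampled LRU, not Valkey's actual implementation (which, among other things, keeps a pool of candidates between eviction rounds); the function name and the synthetic keyspace are assumptions made for illustration.

```python
import random


def evict_approx_lru(last_access: dict, samples: int = 5, rng=random):
    """Sample `samples` random keys and return the least recently used of them."""
    keys = list(last_access)
    candidates = rng.sample(keys, min(samples, len(keys)))
    return min(candidates, key=last_access.__getitem__)


# Hypothetical keyspace: key -> last access time (larger means more recent)
last_access = {f"key:{i}": i for i in range(1000)}
rng = random.Random(42)

# Average "age rank" of the evicted key for two sample sizes (lower is closer to true LRU)
small = [last_access[evict_approx_lru(last_access, 5, rng)] for _ in range(2000)]
large = [last_access[evict_approx_lru(last_access, 10, rng)] for _ in range(2000)]

print("samples=5  mean victim rank:", sum(small) / len(small))
print("samples=10 mean victim rank:", sum(large) / len(large))
```

With a larger sample the victim is, on average, closer to the genuinely oldest key, which is the improvement that raising maxmemory-samples buys.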
volatile-lru
Use when: your keyspace is mixed — some keys have TTLs set (ephemeral cache entries) and others do not (durable application state that must survive eviction). volatile-lru applies LRU eviction only to keys that have an expiry set, leaving keys without a TTL untouched.
CONFIG SET maxmemory-policy volatile-lru

A classic example is a Valkey instance shared between a session store (each session key has a TTL of 24 hours) and a feature-flag cache (flag values are written once with no TTL and must always be available). Under volatile-lru, memory pressure evicts the least recently used session keys while the feature-flag keys remain intact. If no keys with TTLs exist and memory is full, Valkey returns OOM errors for writes (the same behavior as noeviction), so ensure enough of your keyspace carries TTLs.
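The difference between the allkeys-* and volatile-* families comes down to which keys are eligible for eviction at all. A minimal sketch of that eligibility rule, using a hypothetical helper and the session/feature-flag example above (the key names are made up):

```python
def eviction_candidates(ttl_by_key: dict, policy: str) -> list:
    """Return the keys a policy may evict. A TTL of None means no expiry is set."""
    if policy.startswith("allkeys-"):
        return list(ttl_by_key)
    if policy.startswith("volatile-"):
        return [key for key, ttl in ttl_by_key.items() if ttl is not None]
    return []  # noeviction: nothing is eligible, writes fail instead


keys = {
    "session:abc": 86400,        # 24-hour TTL
    "session:def": 86400,
    "flag:new-checkout": None,   # no TTL, must survive memory pressure
}

print(eviction_candidates(keys, "volatile-lru"))
print(eviction_candidates(keys, "allkeys-lru"))
```

An empty candidate list under a volatile-* policy corresponds to the OOM-on-write case described above.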
allkeys-random
Use when: your access pattern is uniform across the full keyspace and recency carries no signal. This is rare in practice — most workloads have hot keys — but it is appropriate for evenly distributed sampling data, time-bucketed metrics where every bucket is equally likely to be read, or any workload where you have external evidence that access frequency is flat.
CONFIG SET maxmemory-policy allkeys-random

Do not use allkeys-random on workloads with hot keys. If 20% of your keys handle 80% of reads, random eviction will frequently discard hot keys that LRU would have preserved, spiking your cache-miss rate and hammering your database backend. When in doubt, default to allkeys-lru.
The remaining policies — volatile-random, volatile-ttl, allkeys-lfu, and volatile-lfu — are variations on the same themes. LFU (Least Frequently Used) is worth evaluating if your access pattern is highly skewed and temporal recency is less important than cumulative popularity: set allkeys-lfu and tune lfu-log-factor (default 10) and lfu-decay-time (default 1 minute) for your specific pattern.
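To get a feel for what lfu-log-factor does, the sketch below models a probabilistic logarithmic counter of the kind Redis-lineage servers document for LFU. The constants (an initial counter value of 5, an 8-bit ceiling of 255) and the exact probability formula are assumptions for illustration and should be checked against the Valkey source before relying on them.

```python
import random

LFU_INIT_VAL = 5    # assumed starting counter for a new key
COUNTER_MAX = 255   # assumed 8-bit counter ceiling


def lfu_log_incr(counter: int, lfu_log_factor: int = 10, rng=random.random) -> int:
    """Increment the counter with a probability that shrinks as the counter grows."""
    if counter >= COUNTER_MAX:
        return COUNTER_MAX
    baseval = max(counter - LFU_INIT_VAL, 0)
    probability = 1.0 / (baseval * lfu_log_factor + 1)
    return counter + 1 if rng() < probability else counter


def hits_to_counter(hits: int, factor: int, seed: int = 1) -> int:
    """Simulate `hits` accesses to one key and return its final counter."""
    rng = random.Random(seed).random
    counter = LFU_INIT_VAL
    for _ in range(hits):
        counter = lfu_log_incr(counter, factor, rng)
    return counter


for factor in (1, 10, 100):
    print(f"lfu-log-factor={factor}: counter after 100000 hits =",
          hits_to_counter(100_000, factor))
```

A higher factor makes the counter saturate more slowly, so it can still distinguish very hot keys from merely warm ones; a lower factor resolves differences among rarely accessed keys.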
Persistence Tuning (AOF vs RDB for Valkey)
Persistence is a trade-off between durability, write throughput, and recovery time. Valkey supports three configurations: RDB snapshots only, AOF only, or both simultaneously. Most production deployments should choose deliberately rather than accepting the defaults.
RDB (Redis Database File) Snapshots
RDB persistence writes a point-in-time binary snapshot of the entire dataset to disk using a background fork. The fork is copy-on-write and does not block reads, but on a write-heavy dataset the copied pages can push the memory footprint toward double its normal size while the snapshot runs. The default save configuration is:
# Default RDB save schedule (in valkey.conf)
# Snapshot if at least 1 key changed in the last hour
save 3600 1
# Snapshot if at least 100 keys changed in the last 5 minutes
save 300 100
# Snapshot if at least 10,000 keys changed in the last minute
save 60 10000
# Disable RDB entirely (for pure AOF deployments)
save ""
# Force an immediate RDB snapshot
BGSAVE

RDB is the right choice when recovery time matters more than recovery point. An RDB snapshot restores in seconds because Valkey loads a compact binary file rather than replaying a log. Use RDB for cache-only deployments where losing up to save-interval seconds of data on a crash is acceptable, because the backing store can reconstruct the cache.
RDB risk: A crash between snapshots loses all writes since the last successful BGSAVE. On a busy instance with the save 60 10000 trigger active, that window can be up to 60 seconds of writes.
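The save rules can be read as "snapshot once both the time and the change count of any rule are met". A simplified model of that check follows; the function name is hypothetical and the boundary handling is approximate, so treat it as a reasoning aid rather than a description of the server's exact logic.

```python
def snapshot_due(save_rules, seconds_since_last_save, changes_since_last_save):
    """Return True if any (seconds, changes) rule is satisfied."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_rules
    )


# The default schedule: save 3600 1, save 300 100, save 60 10000
DEFAULT_RULES = [(3600, 1), (300, 100), (60, 10000)]

# 45 s after the last save, 50,000 writes: no rule's time window has elapsed yet
print(snapshot_due(DEFAULT_RULES, 45, 50_000))   # False
# 61 s after the last save, 50,000 writes: the "60 10000" rule fires
print(snapshot_due(DEFAULT_RULES, 61, 50_000))   # True
# 10 min after the last save, only 20 writes: waits for the hourly rule
print(snapshot_due(DEFAULT_RULES, 600, 20))      # False
```

The first case is the exposure window described above: a crash at that moment loses all 50,000 writes.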
AOF (Append-Only File)
AOF logs every write command to a file in order. On restart, Valkey replays the AOF to reconstruct the dataset. The critical tuning parameter is appendfsync, which controls when AOF writes are flushed to disk:
# Enable AOF
CONFIG SET appendonly yes
# appendfsync options:
# always — fsync after every write command. Maximum durability, ~30-50% throughput penalty.
# everysec — fsync once per second (background thread). Lose at most ~1 second of data.
# Best trade-off for most production workloads.
# no — let the OS decide when to flush. Fastest, but can lose multiple seconds on crash.
CONFIG SET appendfsync everysec
# AOF rewrite: compact the AOF by rewriting it as the minimal set of commands
# to reconstruct current state. Triggered automatically when the AOF has grown
# 100% since the last rewrite and is at least 64 MB:
CONFIG SET auto-aof-rewrite-percentage 100
CONFIG SET auto-aof-rewrite-min-size 64mb

During an AOF rewrite, Valkey forks a child process to write the new AOF while the parent keeps accepting writes. On instances with a large dataset and a high write rate, the extra memory held during the rewrite (copy-on-write pages plus writes accumulated for the new file) can grow to several hundred megabytes. To smooth out disk I/O latency spikes during rewrites, keep aof-rewrite-incremental-fsync yes (the default), which has the rewrite child fsync in small increments (every 4 MB of generated data, per the comments in the stock valkey.conf) rather than in one large flush at the end.
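The two auto-rewrite settings combine as "large enough and grown enough". The following sketch models that decision under the assumption that growth is measured against the AOF size recorded after the previous rewrite; the function and parameter names are illustrative.

```python
MB = 1024 * 1024


def aof_rewrite_due(current_size: int, base_size: int,
                    percentage: int = 100, min_size: int = 64 * MB) -> bool:
    """Return True when the AOF is past min_size and has grown by `percentage`."""
    if percentage == 0 or current_size <= min_size:
        return False   # a percentage of 0 disables automatic rewrites
    base = base_size if base_size > 0 else 1
    growth = (current_size * 100 / base) - 100
    return growth >= percentage


print(aof_rewrite_due(50 * MB, 10 * MB))     # False: still below the 64 MB floor
print(aof_rewrite_due(150 * MB, 100 * MB))   # False: grew only 50%
print(aof_rewrite_due(200 * MB, 100 * MB))   # True: doubled since the last rewrite
```

The minimum size exists so that a tiny AOF that doubles from, say, 1 MB to 2 MB does not trigger a fork.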
Running Both RDB and AOF
For workloads that need fast restart (RDB) and strong durability (AOF), run both simultaneously. Valkey uses the AOF for recovery on restart when appendonly yes is set, regardless of whether RDB is also enabled. Use RDB snapshots as offsite backups — copy the .rdb file to S3 after each BGSAVE.
# Recommended configuration for production persistence
appendonly yes
appendfsync everysec
# Start the AOF with an RDB preamble for faster rewrites and loads
aof-use-rdb-preamble yes
save 3600 1
save 300 100
save 60 10000
# Monitor persistence health
INFO persistence
# Fields to watch: aof_enabled, aof_rewrite_in_progress,
# aof_last_rewrite_time_sec, rdb_last_bgsave_status,
# rdb_last_bgsave_time_sec

Cluster Mode Performance
Valkey cluster distributes data across shards using 16,384 hash slots: each key maps to a slot via the CRC16 of the key modulo 16384. Each primary shard owns a set of slots and can have one or more replicas. The two cluster-mode parameters that most directly affect production reliability are cluster-node-timeout and cluster-require-full-coverage.
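For readers who want to see how a key lands on a slot, here is a small Python sketch of the slot calculation: CRC16 (the XMODEM variant, which Python's binascii.crc_hqx computes when seeded with 0) modulo 16384, with hash-tag handling for keys such as {user1000}.following. The function name is an assumption; the result can be compared against CLUSTER KEYSLOT on a live node.

```python
import binascii


def key_hash_slot(key: bytes) -> int:
    """Compute the cluster hash slot for a key, honoring {hash tags}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % 16384


print(key_hash_slot(b"foo"))

# Keys sharing a hash tag map to the same slot, so multi-key commands work on them
print(key_hash_slot(b"{user1000}.following") == key_hash_slot(b"{user1000}.followers"))
```

Hash tags are the usual way to keep related keys on one shard, at the cost of concentrating load if a single tag becomes very hot.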
cluster-node-timeout
cluster-node-timeout is the millisecond threshold after which a node that is not reachable is considered to have failed. When a primary is unreachable for this duration, the cluster promotes a replica to primary. Setting it too low causes false failovers on transient network hiccups. Setting it too high delays recovery from genuine failures.
# Default is 15000ms (15 seconds) — a safe starting point for most deployments
# In cloud environments with occasional network jitter, keep at 15000 or increase to 20000
cluster-node-timeout 15000
# For LAN environments with very reliable networking, you can lower to 5000-8000ms
# to get faster failover on genuine failures
cluster-node-timeout 8000
# Verify cluster state after configuration
valkey-cli cluster info
# Look for: cluster_state:ok, cluster_slots_assigned:16384,
# cluster_known_nodes, cluster_size

cluster-node-timeout also controls two related behaviors: the time before a replica initiates a failover vote (cluster-node-timeout * 2), and the time before a cluster client considers a slot migration stalled. Reducing it has a cascading effect; if in doubt, leave it at the default.
cluster-require-full-coverage
When cluster-require-full-coverage yes (the default), the cluster stops accepting queries for all slots, reads as well as writes, if any slot is left uncovered because a shard fails with no available replica. This is the safe setting for data stores where partial availability is worse than total unavailability: financial ledgers, inventory systems, anything where acting on an incomplete view of the data is dangerous.
# Stop serving ALL queries if any slot range becomes uncovered (default)
cluster-require-full-coverage yes
# Accept writes to available slots even if some slots are uncovered
# Appropriate for cache workloads where partial availability is better than none
cluster-require-full-coverage no

Setting cluster-require-full-coverage no on a persistent data store means commands targeting a failed shard return errors for those specific slots (they are not silently dropped) while commands to healthy shards succeed. This can create inconsistency if your application does not handle per-key errors. Only use no for pure cache workloads where every key is reconstructible.
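Handling per-key errors starts with telling the error types apart. The sketch below classifies raw error strings by their leading token; MOVED, ASK, and CLUSTERDOWN are standard cluster error prefixes, while the function name and the category labels are assumptions. Most cluster-aware client libraries already follow redirects on their own, so in practice the CLUSTERDOWN branch is the one application code needs to plan for.

```python
def classify_cluster_error(message: str) -> str:
    """Map a server error string to a coarse handling category."""
    prefix = message.split(" ", 1)[0]
    if prefix in ("MOVED", "ASK"):
        return "redirect"            # retry against the node named in the reply
    if prefix == "CLUSTERDOWN":
        return "slot-unavailable"    # no node serves this slot; fall back to the backing store
    return "other"


replies = (
    "MOVED 3999 127.0.0.1:6381",
    "CLUSTERDOWN Hash slot not served",
    "ERR unknown command",
)
for reply in replies:
    print(reply, "->", classify_cluster_error(reply))
```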
Cluster Rebalancing
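After adding or removing shards, slot rebalancing migrates hash slots between nodes. This is an I/O- and CPU-intensive operation. There is no server-side rebalance command; rebalancing is driven by valkey-cli --cluster rebalance, and a small pipeline size limits its impact. The related cluster-migration-barrier setting governs replica migration rather than slot migration. Both are shown here: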
# Minimum number of replicas a primary must keep before one of its
# replicas may migrate to a primary that has been left with no replicas
cluster-migration-barrier 1
# Manual rebalance — pipeline 10 keys at a time to reduce I/O burst
valkey-cli --cluster rebalance 127.0.0.1:7000 \
--cluster-pipeline 10 \
--cluster-threshold 0.01
# Check for slots still open in a migrating or importing state
valkey-cli --cluster check 127.0.0.1:7000

Latency Monitoring with valkey-cli --latency
Valkey latency problems fall into two categories: command execution latency (slow commands blocking the event loop) and network/system latency (OS scheduler jitter, I/O stalls, fork latency). The valkey-cli latency tools measure the latter — how long the server takes to respond to a round-trip ping — which captures infrastructure-level problems that SLOWLOG misses.
# Real-time latency monitoring: prints the current latency in milliseconds
# Hit Ctrl-C to stop
valkey-cli --latency -h 127.0.0.1 -p 6379
# Sample output:
# min: 0, max: 1, avg: 0.10 (1000 samples)
# --latency-history: samples latency in 15-second windows (change with -i)
# and prints one summary line per window. Use this to track latency spikes over time.
valkey-cli --latency-history -h 127.0.0.1 -p 6379
# Sample output (one line per 15-second window):
# min: 0, max: 2, avg: 0.12 (937 samples) -- 15.00 seconds range
# min: 0, max: 47, avg: 1.83 (891 samples) -- 15.00 seconds range ← spike
# min: 0, max: 1, avg: 0.11 (944 samples) -- 15.00 seconds range
# --latency-dist: ASCII art latency distribution (useful for quick visual triage)
valkey-cli --latency-dist -h 127.0.0.1 -p 6379

A one-time spike in --latency-history usually indicates a fork event (RDB snapshot or AOF rewrite), a slow client command, or an OS scheduler pause. A sustained elevation above your baseline indicates a systemic problem: disk I/O saturation, CPU contention, or network congestion.
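If the --latency-history output is captured to a file, spotting spike windows can be automated. A minimal sketch, assuming the line format shown in the sample output above; the 10 ms threshold and the function name are arbitrary choices for illustration.

```python
import re

# Matches lines such as: min: 0, max: 47, avg: 1.83 (891 samples) -- 15.00 seconds range
LINE_RE = re.compile(
    r"min: (?P<min>\d+), max: (?P<max>\d+), avg: (?P<avg>[\d.]+) \((?P<samples>\d+) samples\)"
)


def find_spikes(lines, max_ms_threshold=10):
    """Return (window_index, max_ms, avg_ms) for windows whose max exceeds the threshold."""
    spikes = []
    for index, line in enumerate(lines):
        match = LINE_RE.search(line)
        if match and int(match["max"]) > max_ms_threshold:
            spikes.append((index, int(match["max"]), float(match["avg"])))
    return spikes


sample = [
    "min: 0, max: 2, avg: 0.12 (937 samples) -- 15.00 seconds range",
    "min: 0, max: 47, avg: 1.83 (891 samples) -- 15.00 seconds range",
    "min: 0, max: 1, avg: 0.11 (944 samples) -- 15.00 seconds range",
]
print(find_spikes(sample))   # [(1, 47, 1.83)]
```

A single flagged window points at a one-off event; several consecutive flagged windows suggest the sustained case.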
Correlate --latency-history output with the server's internal latency event log:
# Enable the internal latency monitor (the threshold is in milliseconds)
# Flag any event taking over 10 ms
CONFIG SET latency-monitor-threshold 10
# View the latest latency events
LATENCY LATEST
# Sample output:
# event-name last-time latest-event max-latency
# aof-stat 10342567 34 67
# fork 10341982 89 134
# fast-command 10341700 12 28
# View history for a specific event type
LATENCY HISTORY fork
# Reset latency history
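LATENCY RESET

On Linux, fork-related latency spikes are very often caused by Transparent Huge Pages (THP). After a fork, memory is shared copy-on-write, and with THP enabled the first write to any part of a 2 MB huge page forces the kernel to copy the whole page instead of a single 4 KB page, which inflates both latency and memory use during snapshots and AOF rewrites. Disable THP on the host (as root): echo never > /sys/kernel/mm/transparent_hugepage/enabled. The setting does not survive a reboot, so apply it from your boot configuration as well. This single change eliminates fork latency spikes on most production Valkey hosts.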
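A host check for the THP setting is easy to script. The sysfs file lists every mode and marks the active one in square brackets (for example "always madvise [never]"); the sketch below extracts the active mode from that text. The function name is an assumption, and reading the actual file is left to the caller so the parsing can be tested anywhere.

```python
def active_thp_mode(sysfs_text: str) -> str:
    """Return the mode marked with [brackets] in the THP 'enabled' file contents."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no active mode marked with [brackets]")


print(active_thp_mode("always madvise [never]"))   # never
print(active_thp_mode("[always] madvise never"))   # always

# On a Linux host, the input would come from:
#   open("/sys/kernel/mm/transparent_hugepage/enabled").read()
```

Anything other than never on a Valkey host is worth flagging in a provisioning or health check.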
Key Configuration Parameters Reference
| Parameter | Default | Recommended (Production) | Notes |
|---|---|---|---|
| maxmemory | 0 (unlimited) | 75–80% of host RAM | Always set explicitly. Leave headroom for fork, replication buffers. |
| maxmemory-policy | noeviction | allkeys-lru for caches | Use volatile-lru for mixed keyspaces. Avoid noeviction in production caches. |
| maxmemory-samples | 5 | 10 | Higher value = better LRU approximation at a small CPU cost. |
| appendonly | no | yes (for persistence) | Required for AOF. Use aof-use-rdb-preamble yes alongside. |
| appendfsync | everysec | everysec | Best durability-to-throughput ratio. always safe but slow; no fast but risky. |
| auto-aof-rewrite-percentage | 100 | 100 | Trigger rewrite when AOF doubles in size. Lower to 50 on high write-rate instances. |
| cluster-node-timeout | 15000 ms | 15000–20000 ms | Increase in cloud environments with network jitter to avoid false failovers. |
| cluster-require-full-coverage | yes | no for cache; yes for stores | Set no only when partial availability is preferable to total write halt. |
| latency-monitor-threshold | 0 (disabled) | 10–25 ms | Enables LATENCY LATEST and LATENCY HISTORY commands. |
| hz | 10 | 15–20 | Controls background task frequency (key expiry, replication heartbeat). Higher improves expiry precision at a small CPU cost. |
| tcp-backlog | 511 | 511 | Also set net.core.somaxconn = 511 at OS level; kernel cap applies. |
- Always set maxmemory to 75–80% of available RAM. The default of unlimited is a production incident waiting to happen.
- Choose allkeys-lru for pure cache deployments, volatile-lru when some keys must survive memory pressure, and allkeys-random only with verified uniform access patterns.
- AOF with appendfsync everysec is the correct persistence default: at most one second of data loss on a crash, with write throughput close to RDB-only performance.
- Enable aof-use-rdb-preamble yes to get the restart speed of RDB combined with the durability of AOF rewriting.
- Set cluster-node-timeout to 15,000–20,000 ms in cloud environments to avoid false failovers caused by transient network jitter.
- Use cluster-require-full-coverage no for cache workloads that tolerate partial availability; keep it yes for persistent data stores.
- Run valkey-cli --latency-history continuously in production to detect latency spikes before users report them, and correlate spikes with LATENCY HISTORY fork to identify AOF rewrite or RDB snapshot interference.
- Disable Transparent Huge Pages on Linux hosts to eliminate the most common source of fork latency spikes.
Working with JusDB on Valkey
Getting memory policies, persistence, and cluster configuration right the first time requires hands-on production experience with Valkey under real load. The parameters in this guide will get you significantly further than defaults, but production Valkey deployments also need cluster-aware client configuration, replication lag monitoring, slot distribution analysis, and capacity planning tied to your actual key size distribution and TTL profile — not generic recommendations.
JusDB's database engineers have tuned Valkey and Redis deployments for e-commerce platforms, real-time analytics pipelines, and multi-region SaaS products. We cover initial configuration reviews, cluster health assessments, latency incident triage, and ongoing advisory for teams running Valkey at scale.