Your Redis instance just crashed. In the next sixty seconds, you will either restore to the exact state it was in moments ago, lose the last thirty seconds of writes, or — if you skipped persistence entirely — rebuild from scratch while your application throws errors. Which outcome you get was decided when you first configured persistence, probably during initial setup when it felt like a minor detail. Redis gives you three persistence strategies, each with different durability guarantees, recovery times, and performance costs, and the wrong choice for your workload will surface at the worst possible moment.
RDB (Redis Database) snapshots and AOF (Append-Only File) logging represent fundamentally different philosophies: periodic point-in-time backups versus a continuous log of write commands. A hybrid mode combines both (the RDB preamble arrived in Redis 4.0, and Redis 7 reworked the on-disk layout into a multi-part AOF), and understanding the trade-offs between all three is essential for any production Redis deployment. This post walks through how each mode works under the hood, what you actually lose in a failure, and how to make the right call for your workload.
- RDB takes periodic snapshots via a forked child process — fast restores, but you can lose minutes of data on crash.
- AOF logs every write command — near-zero data loss with appendfsync everysec, but slower restores and more disk I/O.
- AOF with appendfsync always gives you strong durability but cuts throughput significantly.
- Redis 7 hybrid mode (AOF-RDB) is the recommended default for most production workloads — fast restores with low data loss.
- Use redis-check-aof --fix to repair a corrupted AOF file before restart.
- If you disable both persistence modes, Redis is purely ephemeral — cache use cases only.
Persistence Options in Redis
Redis offers four persistence configurations:
- No persistence — all data is lost on restart. Suitable for pure caching tiers.
- RDB only — point-in-time snapshots on a schedule.
- AOF only — append-only log of every write command.
- RDB + AOF hybrid — combines both, using RDB format inside the AOF file (aof-use-rdb-preamble, available since Redis 4.0 and on by default since Redis 5).
Both RDB and AOF can be enabled simultaneously. When both are enabled and a restart occurs, Redis uses the AOF file for recovery because it is more likely to be complete. Understanding the internals of each mode helps you reason about the guarantees you actually have.
RDB Snapshots
RDB persistence writes the entire dataset to disk as a compact binary snapshot. The trigger can be manual (BGSAVE or SAVE) or automatic via the save configuration directive.
How BGSAVE Works
When a snapshot is triggered, Redis forks a child process. The parent continues serving requests while the child writes the snapshot to a temporary file. Once complete, the temporary file atomically replaces the previous dump.rdb. This fork-based approach means the parent is never blocked during the snapshot write — but the fork itself is not free.
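The write-to-a-temporary-file-then-rename step is what keeps the previous dump.rdb intact if the snapshot fails partway. The following is a minimal Python sketch of that general pattern, not Redis's own C implementation; the function name and the temporary file prefix are assumptions chosen for illustration.

```python
import os
import tempfile


def replace_snapshot_atomically(target_path: str, payload: bytes) -> None:
    """Write payload to a temp file in the same directory, then rename it over target_path."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(prefix="temp-", suffix=".rdb", dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk before the swap
        os.replace(tmp_path, target_path)  # readers see either the old or the new file
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process dies before os.replace runs, the old snapshot is still complete on disk; only the temporary file is left behind.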
# Trigger a background save manually
127.0.0.1:6379> BGSAVE
Background saving started
# Check the save status
127.0.0.1:6379> LASTSAVE
(integer) 1708601234
Configuring Automatic Snapshots
# redis.conf: save after N seconds if at least M keys changed
# (comments go on their own lines; redis.conf does not accept trailing comments)
# Save after 1 hour if at least 1 key changed
save 3600 1
# Save after 5 minutes if at least 100 keys changed
save 300 100
# Save after 1 minute if at least 10,000 keys changed
save 60 10000
# Path and filename
dir /var/lib/redis
dbfilename dump.rdb
# Compress the RDB file (recommended)
rdbcompression yes
# Checksum the RDB file on save and load
rdbchecksum yes
On Linux, Redis relies on copy-on-write semantics after fork, so memory usage can temporarily spike to nearly 2x the dataset size if writes are heavy during the snapshot. On a 10 GB dataset, this can mean up to 10 GB of additional RSS if most pages are dirtied. Monitor latest_fork_usec in INFO stats to track fork latency; the field is reported in microseconds, so values above 20,000 (20 ms) on a write-heavy instance indicate a problem.
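As a rough illustration of both checks, the sketch below parses the text returned by INFO stats and estimates peak memory during a snapshot. The helper names are hypothetical, the 20 ms threshold is the figure quoted above, and the memory model (dataset size plus the fraction of pages dirtied) is a simplifying assumption rather than how the kernel accounts for copy-on-write.

```python
def parse_info(raw: str) -> dict:
    """Turn the 'key:value' lines of an INFO reply into a dict, skipping section headers."""
    fields = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def fork_is_slow(info_stats: str, threshold_ms: float = 20.0) -> bool:
    """latest_fork_usec is in microseconds; compare it against a millisecond threshold."""
    usec = int(parse_info(info_stats)["latest_fork_usec"])
    return usec / 1000.0 > threshold_ms


def peak_memory_gb(dataset_gb: float, dirtied_fraction: float) -> float:
    """Estimate RSS during a snapshot if dirtied_fraction of pages get copied."""
    return dataset_gb * (1.0 + dirtied_fraction)
```

For the 10 GB example above, peak_memory_gb(10, 1.0) gives the 2x worst case, while a workload that dirties a quarter of its pages would need about 12.5 GB.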
Handling BGSAVE Failures
By default, Redis will stop accepting writes if a background save fails. This behaviour is controlled by stop-writes-on-bgsave-error yes. For high-availability setups where data availability trumps strict durability, you may set this to no, but be aware that you are silently dropping persistence guarantees.
# Stop accepting writes if RDB save fails (default: yes)
stop-writes-on-bgsave-error yes
AOF Logging
AOF persistence records every write command that modifies the dataset to an append-only log file. On restart, Redis replays this log to reconstruct the dataset. The key configuration lever is appendfsync, which controls when the kernel buffer is flushed to disk.
# redis.conf — AOF configuration
appendonly yes
appendfilename "appendonly.aof"
dir /var/lib/redis
# Flush policy: always | everysec | no
appendfsync everysec
appendfsync Modes Compared
The appendfsync setting is the single most important AOF tuning decision:
- always — fsync is called after every write command. Maximum durability: at most one command lost on crash. Throughput penalty is significant — expect 10–100x reduction in write throughput compared to no. Use only when you cannot tolerate any data loss.
- everysec (default and recommended) — fsync runs in a background thread once per second. You can lose at most ~1 second of writes. Minimal throughput impact in most workloads.
- no — Redis never calls fsync; the OS decides when to flush. Fastest option, but data loss window is determined by kernel buffer flush intervals — typically 30 seconds on Linux.
appendfsync always calls fsync synchronously on every write. On a spinning disk, a single fsync takes 4–10ms, capping throughput at 100–250 operations per second. Even on NVMe, the serialisation overhead limits you to a few thousand ops/sec. Benchmark your specific hardware before using this in production.
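The spinning-disk figures follow directly from the fsync latency. A minimal sketch of that arithmetic, assuming one synchronous fsync per write with no batching (the function name is illustrative):

```python
def max_serial_fsync_ops(fsync_latency_ms: float) -> float:
    """Upper bound on writes per second when each write waits for its own fsync."""
    if fsync_latency_ms <= 0:
        raise ValueError("fsync latency must be positive")
    return 1000.0 / fsync_latency_ms


for latency_ms in (4, 10):
    print(f"{latency_ms} ms per fsync -> {max_serial_fsync_ops(latency_ms):.0f} ops/sec")
```

This is a ceiling for strictly serial writes. Pipelined or concurrent clients can share a single fsync, so measured throughput may be higher, which is one more reason to benchmark on your own hardware.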
AOF Rewrite
The AOF file grows unbounded without compaction. AOF rewrite solves this by creating a new, minimal AOF that represents the current dataset state — equivalent to running BGSAVE but for the log format.
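Automatic rewrites are driven by two thresholds, auto-aof-rewrite-percentage and auto-aof-rewrite-min-size. The sketch below approximates the decision as growth relative to the size recorded after the last rewrite; it is a simplified model for reasoning about the settings, and the function and parameter names are assumptions.

```python
MB = 1024 * 1024


def should_auto_rewrite(current_bytes: int, base_bytes: int,
                        percentage: int = 100,
                        min_size_bytes: int = 64 * MB) -> bool:
    """Return True if the AOF has grown enough since the last rewrite to trigger another."""
    if percentage <= 0:
        return False  # a percentage of 0 disables automatic rewrites
    if current_bytes <= min_size_bytes:
        return False  # small files are never rewritten automatically
    base = base_bytes if base_bytes > 0 else 1
    growth_percent = (current_bytes * 100 // base) - 100
    return growth_percent >= percentage
```

With the defaults, a file that was 64 MB after the last rewrite triggers again once it reaches 128 MB, while a 100 MB file (about 56% growth) does not.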
# Trigger rewrite when AOF grows by 100% over the base size
auto-aof-rewrite-percentage 100
# Only trigger if the AOF is at least 64 MB
auto-aof-rewrite-min-size 64mb
# Trigger a rewrite manually
127.0.0.1:6379> BGREWRITEAOF
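Background append only file rewriting started
During an AOF rewrite, the child process generates heavy sequential I/O. If appendfsync everysec is also running, this can cause latency spikes. Setting no-appendfsync-on-rewrite yes suspends fsync while a rewrite is in progress, which reduces latency but makes durability behave like appendfsync no for the duration of the rewrite: the loss window can stretch to the kernel flush interval, up to about 30 seconds with default Linux settings. Accept that trade-off only if latency spikes during rewrites are the bigger problem for your workload.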
# Suppress fsync during AOF rewrite to reduce I/O contention
no-appendfsync-on-rewrite yes
RDB+AOF Hybrid Mode (Redis 7)
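Hybrid persistence stores an RDB-format snapshot at the beginning of the AOF, followed by the AOF log of commands that occurred after the snapshot. The option behind it, aof-use-rdb-preamble, has existed since Redis 4.0 and has defaulted to yes since Redis 5; Redis 7 keeps the same idea but splits the AOF into a base file plus incremental files inside an appendonlydir directory. This gives you fast load times (binary RDB format is far faster to parse than replaying thousands of commands) while preserving the low data-loss guarantees of AOF.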
# Enable hybrid persistence (requires appendonly yes)
appendonly yes
aof-use-rdb-preamble yes
When a rewrite is triggered, Redis writes the current dataset in RDB format to the start of the new AOF file and then appends subsequent commands in text format. On restart, Redis detects the RDB preamble and loads it directly, then replays only the commands appended after the snapshot — dramatically reducing restart time compared to a pure AOF file.
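One way to see which format a given file uses is to look at its first bytes: RDB payloads begin with the ASCII magic string REDIS, whereas a plain command log begins with a RESP array marker such as *2. The helper below is a small sketch of that check; the function name is hypothetical, and on Redis 7 the file to inspect would be the base file inside appendonlydir rather than a single appendonly.aof.

```python
def has_rdb_preamble(path: str) -> bool:
    """Return True if the file starts with the RDB magic string."""
    with open(path, "rb") as f:
        return f.read(5) == b"REDIS"
```

Running it against a copy of your own AOF after a rewrite is a quick sanity check that hybrid mode is actually in effect.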
For most production deployments — session stores, leaderboards, queues, rate limiters — hybrid mode with appendfsync everysec gives the best balance of durability, performance, and restart speed. Start here unless you have a specific reason to use a simpler mode.
Choosing the Right Persistence Mode
| Mode | Max Data Loss | Restart Speed | Disk Usage | Write Overhead | Best For |
|---|---|---|---|---|---|
| No persistence | 100% | Instant (empty) | None | None | Pure caches, ephemeral data |
| RDB only | Minutes (save interval) | Fast | Low (compressed binary) | Low (periodic fork) | Backups, analytics, tolerable loss |
| AOF (everysec) | ~1 second | Slow (log replay) | High (grows unbounded) | Low–Medium | Queues, counters, low-loss workloads |
| AOF (always) | At most 1 command | Slow | High | Very High | Financial, compliance, zero-loss required |
| Hybrid (RDB+AOF) | ~1 second | Fast (RDB preamble) | Medium | Low–Medium | Most production workloads (recommended) |
Recovery Scenarios
RDB Recovery
Recovery from RDB is straightforward: Redis reads and parses the binary dump.rdb file on startup. A 5 GB RDB file typically loads in 10–30 seconds depending on disk speed. If the file is corrupt, Redis will fail to start and log the error. Verify your RDB file with:
# Check RDB file integrity
redis-check-rdb /var/lib/redis/dump.rdb
AOF Recovery
AOF recovery replays every command in the log sequentially. A 20 GB AOF file on a moderately fast instance can take several minutes. This is the main operational reason to prefer hybrid mode — the RDB preamble eliminates the need to replay the full history.
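If your Redis instance crashed mid-write, the AOF file may be truncated or contain a partial write at the end. With the default aof-load-truncated yes, Redis loads everything up to the truncated tail, logs a warning, and starts. If that option is set to no, or the file is corrupted in the middle rather than at the end, Redis refuses to start. Check and repair the file with: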
# Check for AOF corruption
redis-check-aof /var/lib/redis/appendonly.aof
# Automatically truncate the partial write at the end (safe for crash recovery)
redis-check-aof --fix /var/lib/redis/appendonly.aof
--fix truncates the AOF at the first detected inconsistency. Always make a backup copy before running it. The commands after the truncation point are permanently discarded — for a crash scenario this is typically just the final partial write, but inspect the output carefully before confirming.
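To make the truncation point concrete, the sketch below walks a plain-text AOF, which stores each command as a RESP array, and reports the byte offset just after the last complete command. It is an illustration of the idea only, not a reimplementation of redis-check-aof: it assumes a pure command log with no RDB preamble and no annotation lines, and the function names are hypothetical.

```python
def _read_line(data: bytes, pos: int):
    """Return (line_without_crlf, position_after_crlf) or raise if the line is cut off."""
    end = data.find(b"\r\n", pos)
    if end == -1:
        raise ValueError("truncated line")
    return data[pos:end], end + 2


def _parse_command(data: bytes, pos: int) -> int:
    """Parse one '*N' array of '$len' bulk strings and return the offset after it."""
    line, pos = _read_line(data, pos)
    if not line.startswith(b"*"):
        raise ValueError("expected array header")
    for _ in range(int(line[1:])):
        line, pos = _read_line(data, pos)
        if not line.startswith(b"$"):
            raise ValueError("expected bulk string header")
        end = pos + int(line[1:])
        if data[end:end + 2] != b"\r\n":
            raise ValueError("truncated or malformed argument")
        pos = end + 2
    return pos


def last_complete_offset(data: bytes) -> int:
    """Offset of the end of the last fully written command; bytes after it are suspect."""
    pos = good = 0
    while pos < len(data):
        try:
            pos = _parse_command(data, pos)
        except ValueError:
            break
        good = pos
    return good
```

Comparing this offset with the file size on a backup copy gives a sense of how many bytes a repair would discard.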
Data Loss Window in Practice
With appendfsync everysec, the worst-case data loss is the writes that arrived in roughly the last second before the crash and had not yet been flushed. In practice, if the OS was healthy and the disk responsive, the actual loss is often zero, because the background fsync thread typically completes well within its one-second window. Treat the one-second figure as an upper bound under normal I/O conditions rather than the typical outcome; if the disk stalls and fsync falls behind, the window can grow beyond one second.
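With RDB only and a save 300 100 policy, you can lose up to five minutes of writes if the crash happens just before a scheduled save. If fewer than 100 keys changed in that interval, the rule does not fire at all, and the window stretches until a looser rule such as save 3600 1 triggers, or indefinitely if no other rule is configured. This is frequently underestimated in initial deployments.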
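The interaction between save rules and write rate can be estimated with a few lines of arithmetic. The sketch below assumes a constant write rate and ignores the time the snapshot itself takes, so treat the result as a lower bound on the real window; the function name is illustrative.

```python
import math


def worst_case_window_seconds(rules, changes_per_second: float) -> float:
    """Longest gap between snapshots for (seconds, changes) save rules at a steady write rate."""
    if changes_per_second <= 0:
        return math.inf  # no rule ever accumulates enough changes to fire
    # A rule fires once BOTH its time and change thresholds are met;
    # the first rule to fire sets the snapshot interval.
    return min(max(seconds, changes / changes_per_second)
               for seconds, changes in rules)


rules = [(3600, 1), (300, 100), (60, 10000)]
print(worst_case_window_seconds(rules, 1000))   # busy instance
print(worst_case_window_seconds(rules, 0.25))   # quiet instance
```

With the three rules from the earlier configuration, a busy instance snapshots roughly every minute, while an instance seeing one change every four seconds waits 400 seconds for the 300/100 rule to fire.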
Testing Your Recovery Path
# Simulate crash recovery in a test environment
# 1. Identify the current persistence files
redis-cli CONFIG GET dir
redis-cli CONFIG GET dbfilename
redis-cli CONFIG GET appendfilename
# 2. Check when the last save occurred
redis-cli LASTSAVE
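# 3. Force a clean shutdown, restart in the background, and time how long loading takes
# (redis-server normally runs in the foreground, so timing it directly would never return)
redis-cli SHUTDOWN SAVE
redis-server /etc/redis/redis.conf --daemonize yes
time sh -c 'until redis-cli PING 2>/dev/null | grep -q PONG; do sleep 0.1; done'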
# 4. Verify key count after restart
redis-cli DBSIZE
Schedule a quarterly recovery drill where you restore from your actual RDB or AOF files in a staging environment. The drill validates your backup integrity, gives you a real measurement of restart time, and ensures your runbooks reflect the actual recovery procedure — not the one you wrote eighteen months ago.
Key Takeaways
- RDB snapshots are fast to load but carry a data loss window equal to your snapshot interval — never assume it is shorter than your save configuration.
- BGSAVE forks a child process; on large datasets, monitor latest_fork_usec and ensure you have enough free memory for copy-on-write overhead.
- AOF with appendfsync everysec is the practical standard for low-loss persistence — roughly one second of worst-case data loss with minimal throughput impact.
- appendfsync always provides near-zero data loss but severely limits write throughput; only use it when you have measured the performance impact and explicitly accept the trade-off.
- no-appendfsync-on-rewrite yes reduces I/O contention during AOF rewrites at the cost of a wider data loss window during the rewrite period.
- Hybrid mode (AOF-RDB, aof-use-rdb-preamble yes) is the recommended production default in Redis 7 — it combines fast binary restarts with AOF's durability guarantees.
- Use redis-check-aof --fix to recover from a crash-truncated AOF file; always back up the file before running the fix.
- AOF recovery time grows with the number of logged commands that must be replayed, while RDB load time grows with dataset size and is much faster per MB (binary parse) — hybrid mode eliminates the worst-case AOF replay scenario.
- Set stop-writes-on-bgsave-error yes in production to get an immediate signal when RDB persistence breaks, rather than silently losing durability.
Optimize Your Redis Persistence with JusDB
Getting persistence right is one of the most consequential configuration decisions in a Redis deployment. The wrong setting is invisible until a failure makes it very visible. JusDB provides managed Redis instances with persistence pre-configured for your workload profile — whether that is a high-throughput cache that needs hybrid mode with tuned rewrite thresholds, or a compliance workload that requires appendfsync always with monitored write latency.
JusDB handles the ongoing operational burden: monitoring fork latency, tracking AOF growth, alerting on bgsave failures, and maintaining tested recovery runbooks. If you are evaluating your current persistence configuration or planning a new deployment, the JusDB team can audit your redis.conf and benchmark the performance impact of each mode against your actual workload patterns.