Valkey Cache Invalidation & Stampede Fix with Debezium CDC

A SaaS team running MySQL behind a Valkey caching layer notices something alarming during a flash sale: a single product-price UPDATE invalidates the cache key, 1,200 concurrent requests miss the cache simultaneously, and every one of them hammers MySQL with the same SELECT. The primary's CPU spikes to 97%, p99 latency goes from 4ms to 2.8 seconds, and the checkout queue stalls. The root cause isn't the database or the cache — it's the gap between them. Application-driven cache invalidation is inherently racy: you can't guarantee the cache is updated atomically with the database commit. Debezium's Change Data Capture (CDC) engine closes that gap by streaming row-level changes directly from MySQL's binlog to Valkey, eliminating stale reads and stampede conditions without touching application code.

TL;DR

Cache invalidation and cache stampede are two distinct but related problems — invalidation causes stale reads; stampede causes thundering-herd spikes on the database after invalidation.
Application-level DELETE-on-write invalidation is racy: the window between DB commit and cache eviction allows stale reads and concurrent recomputation.
Debezium's embedded engine captures MySQL binlog events and pushes them to Valkey in near-real-time (~50–200ms latency), removing application code from the invalidation path entirely.
Combine CDC-driven invalidation with probabilistic early expiration (XFetch) or mutex locking to eliminate stampede on high-traffic keys.
The Debezium 3.x AsyncEmbeddedEngine runs as a library inside your service — no Kafka cluster required for single-instance deployments.
This pattern works with MySQL, PostgreSQL, and MongoDB as CDC sources, with Valkey or Redis as the cache target.

What Is Cache Invalidation and Why Is It Hard?

Cache invalidation is the process of removing or updating a cached entry when the underlying source data changes. Phil Karlton's famous quip — "There are only two hard things in Computer Science: cache invalidation and naming things" — endures because the problem is fundamentally about distributed consistency. Your database is the source of truth; the cache is a stale copy by definition. The question is how stale you can tolerate and how much infrastructure you'll build to close the gap.

The standard application-level pattern is cache-aside (lazy-loading): the app checks the cache on read, queries the database on miss, and writes the result back to cache. On write, the app either deletes the cache key (invalidate-on-write) or writes the new value to both the database and cache (write-through). Both approaches have a race condition window:

text

Timeline of a stale-read race:

T1: App instance A commits UPDATE price=19.99 to MySQL
T2: App instance B reads cache → still sees price=24.99 (stale)
T3: App instance A issues DEL product:123 to Valkey
T4: App instance B writes price=24.99 back to cache (!!!)
T5: All subsequent reads see price=24.99 until TTL expires

Between T1 and T3 — typically 1–15ms in well-tuned systems, but potentially seconds under load — every read returns stale data. Worse, as shown at T4, a concurrent reader can re-populate the cache with the old value after the invalidation, creating a permanently stale entry until the TTL kicks in.

What Is a Cache Stampede?

A cache stampede (also called thundering herd) occurs when a popular cache key expires or is invalidated, and many concurrent requests simultaneously attempt to regenerate it by hitting the database. If 500 threads find the key missing at the same instant, all 500 issue the same expensive query. The database gets 500x the expected load for that key, response times spike, connection pools saturate, and the resulting timeouts cascade upstream.

Three factors determine stampede severity:

Factor	Low Risk	High Risk
Request concurrency per key	<10 req/s	>500 req/s
Query cost to regenerate	<5ms	>200ms (joins, aggregations)
TTL uniformity	Jittered TTLs	All keys expire at the same second

Three-phase stampede: healthy cache → single invalidation → 1,200 concurrent misses collapse the origin database.

Traditional mitigations — mutex locking, probabilistic early expiration, request coalescing — all operate at the application or cache layer. They reduce stampede impact but don't solve the root cause: the cache doesn't know when the database changed.

How Debezium CDC Solves Both Problems

Change Data Capture flips the invalidation model from push-from-app to pull-from-binlog. Instead of the application telling the cache "I changed this row," Debezium reads MySQL's binary log (the authoritative change stream) and emits structured events for every INSERT, UPDATE, and DELETE. A lightweight consumer receives these events and writes or evicts the corresponding Valkey keys.

Architecture Overview

Debezium tails MySQL's binlog and writes SET/DEL events to Valkey — the application's write path never touches cache invalidation logic.

This architecture provides three guarantees that application-level invalidation cannot:

No missed changes: Every committed transaction appears in the binlog — there is no code path that "forgets" to invalidate.
Ordering: Binlog events arrive in commit order, so the cache always converges to the latest state.
Decoupled latency: The application write path never touches cache invalidation logic — the CDC pipeline handles it asynchronously in ~50–200ms.

Why Valkey Instead of Redis?

Valkey is the Linux Foundation's community-driven fork of Redis, created after Redis Ltd. changed to a dual-license model in March 2024. Valkey 8.x is wire-compatible with Redis 7.2, supports the same data structures and commands, and is fully open-source under the BSD-3-Clause license. For new deployments that require an open-source cache, Valkey is the default choice — and AWS ElastiCache now defaults to Valkey as its engine.

Setting Up Debezium Embedded Engine with MySQL and Valkey

For single-instance deployments, Debezium's embedded engine runs as a library inside your JVM service — no Kafka cluster required. This is ideal for services that own their own cache layer and don't need cross-service event distribution.

Step 1: Add Maven Dependencies

xml

<properties>
    <debezium.version>3.1.0.Final</debezium.version>
</properties>

<dependencies>
    <!-- Debezium embedded engine (AsyncEmbeddedEngine in 3.x) -->
    <dependency>
        <groupId>io.debezium</groupId>
        <artifactId>debezium-api</artifactId>
        <version>${debezium.version}</version>
    </dependency>
    <dependency>
        <groupId>io.debezium</groupId>
        <artifactId>debezium-embedded</artifactId>
        <version>${debezium.version}</version>
    </dependency>
    <dependency>
        <groupId>io.debezium</groupId>
        <artifactId>debezium-connector-mysql</artifactId>
        <version>${debezium.version}</version>
    </dependency>

    <!-- Valkey/Redis client -->
    <dependency>
        <groupId>io.valkey</groupId>
        <artifactId>valkey-java</artifactId>
        <version>6.0.0</version>
    </dependency>
</dependencies>

Step 2: Enable MySQL Binlog for CDC

Debezium requires MySQL's row-based binary logging and a dedicated replication user:

sql

-- my.cnf (or SET GLOBAL for testing)
-- server-id        = 1
-- log_bin          = mysql-bin
-- binlog_format    = ROW
-- binlog_row_image = FULL
-- gtid_mode        = ON
-- enforce_gtid_consistency = ON

-- Verify binlog is active
SHOW VARIABLES LIKE 'log_bin';          -- ON
SHOW VARIABLES LIKE 'binlog_format';    -- ROW

-- Create a dedicated CDC user with replication privileges
CREATE USER 'debezium_cdc'@'%' IDENTIFIED BY 'strong_password_here';
GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT
  ON *.* TO 'debezium_cdc'@'%';
FLUSH PRIVILEGES;

Warning

Never use binlog_row_image = MINIMAL with Debezium — it omits unchanged columns from UPDATE events, breaking cache value reconstruction. Always use FULL.

Step 3: Configure and Start the Debezium Engine

java

import io.debezium.engine.ChangeEvent;
import io.debezium.engine.DebeziumEngine;
import io.debezium.engine.format.Json;

import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ValkeyCdcInvalidator {

    public static DebeziumEngine<ChangeEvent<String, String>> createEngine() {
        Properties props = new Properties();

        // Engine settings
        props.setProperty("name", "valkey-cache-invalidator");
        props.setProperty("connector.class",
            "io.debezium.connector.mysql.MySqlConnector");

        // Offset storage (file-based for single instance)
        props.setProperty("offset.storage",
            "org.apache.kafka.connect.storage.FileOffsetBackingStore");
        props.setProperty("offset.storage.file.filename",
            "/var/lib/debezium/offsets.dat");
        props.setProperty("offset.flush.interval.ms", "10000");

        // Schema history
        props.setProperty("schema.history.internal",
            "io.debezium.storage.file.history.FileSchemaHistory");
        props.setProperty("schema.history.internal.file.filename",
            "/var/lib/debezium/schema-history.dat");

        // MySQL connection
        props.setProperty("database.hostname", "mysql-primary.internal");
        props.setProperty("database.port", "3306");
        props.setProperty("database.user", "debezium_cdc");
        props.setProperty("database.password", "strong_password_here");
        props.setProperty("database.server.id", "85744");
        props.setProperty("topic.prefix", "ecommerce");

        // Capture only the tables you cache
        props.setProperty("table.include.list",
            "ecommerce.products,ecommerce.inventory,ecommerce.pricing");

        // Skip initial snapshot if cache is warm
        props.setProperty("snapshot.mode", "no_data");

        return DebeziumEngine.create(Json.class)
            .using(props)
            .notifying(new ValkeyCacheConsumer())
            .build();
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.execute(createEngine());
    }
}

Step 4: Write the Valkey Cache Consumer

java

import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import io.debezium.engine.ChangeEvent;
import io.debezium.engine.DebeziumEngine;
import io.valkey.JedisPooled;

import java.util.List;

public class ValkeyCacheConsumer implements
        DebeziumEngine.ChangeConsumer<ChangeEvent<String, String>> {

    private final JedisPooled valkey = new JedisPooled("valkey.internal", 6379);
    private static final int CACHE_TTL_SECONDS = 3600;

    @Override
    public void handleBatch(
            List<ChangeEvent<String, String>> records,
            DebeziumEngine.RecordCommitter<ChangeEvent<String, String>> committer)
            throws InterruptedException {

        for (ChangeEvent<String, String> record : records) {
            if (record.value() == null) {
                // Tombstone event (hard delete) — skip or handle
                committer.markProcessed(record);
                continue;
            }

            JsonObject payload = JsonParser.parseString(record.value())
                .getAsJsonObject().getAsJsonObject("payload");

            String operation = payload.get("op").getAsString();
            String table = payload.getAsJsonObject("source")
                .get("table").getAsString();

            switch (operation) {
                case "c": // INSERT — warm the cache proactively
                case "u": // UPDATE — refresh the cached value
                    JsonObject after = payload.getAsJsonObject("after");
                    String id = after.get("id").getAsString();
                    String cacheKey = table + ":" + id;

                    // Write the full row as JSON to Valkey
                    valkey.setex(cacheKey, CACHE_TTL_SECONDS,
                        after.toString());
                    break;

                case "d": // DELETE — evict from cache
                    JsonObject before = payload.getAsJsonObject("before");
                    String deletedId = before.get("id").getAsString();
                    valkey.del(table + ":" + deletedId);
                    break;
            }

            committer.markProcessed(record);
        }
        committer.markBatchFinished();
    }
}

Tip

Use setex (SET with TTL) instead of plain SET even for CDC-driven writes. The TTL acts as a safety net — if the CDC pipeline stalls, keys auto-expire rather than serving indefinitely stale data.

Preventing Cache Stampede with CDC + XFetch

CDC-driven invalidation eliminates stale reads, but it doesn't fully prevent stampede on ultra-hot keys. When Debezium pushes a new value to Valkey, the key is always populated — but there's still a ~50–200ms window where the old value was evicted and the new one hasn't arrived. For keys receiving 1,000+ reads/second, that window produces a stampede.

The solution is to combine CDC with probabilistic early expiration (XFetch):

XFetch Algorithm

python

# Pseudocode: XFetch with CDC-aware TTL
import random, math, time

def xfetch_get(valkey_client, key, recompute_fn, ttl=3600, beta=1.0):
    """
    Read from cache with probabilistic early recomputation.
    Beta controls eagerness: higher = more aggressive refresh.
    """
    raw = valkey_client.get(key)
    if raw is None:
        # Hard miss — single-caller recompute with mutex
        return recompute_with_lock(valkey_client, key, recompute_fn, ttl)

    value, delta, expiry = decode_xfetch(raw)  # stored as JSON envelope
    now = time.time()

    # Probabilistic early refresh: as we approach expiry,
    # the probability of triggering a background refresh increases
    gap = expiry - now
    threshold = delta * beta * -math.log(random.random())

    if gap < threshold:
        # This request "volunteers" to refresh — others keep using cached value
        refresh_in_background(valkey_client, key, recompute_fn, ttl)

    return value

Combining CDC + XFetch + Mutex

The optimal strategy uses all three layers together:

Layer	Handles	Mechanism
Debezium CDC	Data freshness	Binlog → Valkey push (primary invalidation)
XFetch	Hot-key protection	Probabilistic early refresh before TTL
Mutex lock	Hard-miss stampede	Only one caller recomputes on a full miss

Three-layer defense: XFetch absorbs TTL expiry, a mutex contains hard-miss stampede, and Debezium CDC keeps Valkey fresh so the first two layers rarely fire.

python

# Mutex-based recompute for hard cache misses
def recompute_with_lock(valkey_client, key, recompute_fn, ttl):
    lock_key = f"lock:{key}"
    acquired = valkey_client.set(lock_key, "1", nx=True, ex=10)

    if acquired:
        try:
            value = recompute_fn()
            store_xfetch(valkey_client, key, value, ttl)
            return value
        finally:
            valkey_client.delete(lock_key)
    else:
        # Another caller is recomputing — wait briefly, then retry
        time.sleep(0.05)
        return xfetch_get(valkey_client, key, recompute_fn, ttl)

Important

Set the mutex lock TTL (the ex=10 parameter) to at least 2x your worst-case recomputation time. If the lock holder crashes without releasing, the lock auto-expires and another caller can take over.

Kafka Connect Deployment for Multi-Service Architectures

The embedded engine works well for a single service, but if multiple services need the same CDC stream — or you need at-least-once delivery guarantees with replay — deploy Debezium as a Kafka Connect source connector.

Connector Configuration

json

{
  "name": "ecommerce-mysql-cdc",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql-primary.internal",
    "database.port": "3306",
    "database.user": "debezium_cdc",
    "database.password": "strong_password_here",
    "database.server.id": "85744",
    "topic.prefix": "ecommerce",
    "table.include.list": "ecommerce.products,ecommerce.inventory",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.ecommerce",
    "snapshot.mode": "no_data",
    "transforms": "route",
    "transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter",
    "transforms.route.regex": "([^.]+)\\.([^.]+)\\.([^.]+)",
    "transforms.route.replacement": "cdc.$3"
  }
}

Valkey Sink Consumer (Kafka Consumer Group)

java

// Kafka consumer that writes CDC events to Valkey
@KafkaListener(topics = "cdc.products", groupId = "valkey-cache-writer")
public void onProductChange(ConsumerRecord<String, String> record) {
    JsonObject event = JsonParser.parseString(record.value())
        .getAsJsonObject().getAsJsonObject("payload");

    String op = event.get("op").getAsString();
    if ("d".equals(op)) {
        String id = event.getAsJsonObject("before").get("id").getAsString();
        valkey.del("products:" + id);
    } else {
        JsonObject after = event.getAsJsonObject("after");
        String id = after.get("id").getAsString();
        valkey.setex("products:" + id, 3600, after.toString());
    }
}

Performance Tuning and Production Considerations

Latency Budget

The end-to-end latency from MySQL commit to Valkey write depends on three components:

Stage	Typical Latency	Tuning Lever
MySQL binlog flush	1–5ms	`sync_binlog=1` (don't change)
Debezium poll + transform	10–100ms	`poll.interval.ms`, batch size
Valkey write	0.1–1ms	Pipeline mode for batches
Total	~50–200ms

For most read-heavy workloads, 50–200ms of eventual consistency is acceptable. If you need stronger guarantees, consider write-through caching for the critical-path tables and CDC for everything else.

Offset Storage for Durability

In production, use Kafka-based offset storage or a database-backed store rather than file-based offsets. If the offset file is lost (container restart, disk failure), Debezium will re-snapshot the entire database:

java

// Use Kafka offset storage in production
props.setProperty("offset.storage",
    "org.apache.kafka.connect.storage.KafkaOffsetBackingStore");
props.setProperty("offset.storage.topic", "debezium-offsets");
props.setProperty("offset.storage.partitions", "1");
props.setProperty("offset.storage.replication.factor", "3");

Monitoring the Pipeline

sql

-- Monitor MySQL replication lag for the CDC slot
SHOW SLAVE STATUS\G
-- Key metrics: Seconds_Behind_Master, Relay_Log_Space

-- Valkey-side: check memory and key count
valkey-cli INFO memory
valkey-cli INFO keyspace
valkey-cli DBSIZE

Tip

Export Debezium JMX metrics (debezium.metrics.*) to Prometheus. Key alerts: MilliSecondsBehindSource > 5000 (CDC lag), NumberOfDisconnects > 0 (connection drops), and QueueRemainingCapacity < 100 (back-pressure).

Cache Invalidation Strategy Comparison

Strategy	Stale Reads	Stampede Risk	Infra Complexity	App Code Changes
TTL-only	Up to TTL duration	High	None	None
App-level invalidate-on-write	Race window (1–15ms)	High	None	Every write path
CDC + Debezium (this guide)	~50–200ms	Low (with XFetch)	Debezium engine	None
Write-through + CDC hybrid	None (for critical keys)	None	Debezium + app logic	Critical paths only

Key Takeaways

Use Debezium CDC to decouple cache invalidation from application code — the binlog is the single source of truth, not your app's write path.
Deploy the embedded AsyncEmbeddedEngine (Debezium 3.x) for single-service setups; use Kafka Connect when multiple consumers need the same change stream.
Always set binlog_format=ROW and binlog_row_image=FULL on MySQL — MINIMAL breaks cache value reconstruction.
Combine CDC push with XFetch probabilistic early refresh and mutex locking for a three-layer defense against stale reads and stampede.
Set a TTL on every CDC-written key as a safety net — if the pipeline stalls, keys expire rather than serving indefinitely stale data.
Monitor MilliSecondsBehindSource in Debezium JMX metrics and alert at >5 seconds to catch pipeline lag before it becomes user-visible.

Working with JusDB on Valkey and Debezium CDC

JusDB manages CDC pipelines and caching infrastructure for engineering teams who need production-grade cache consistency without the operational overhead. Our DBAs handle MySQL binlog configuration, Debezium connector deployment, Valkey cluster tuning, and 24/7 pipeline monitoring — so your team ships features instead of debugging stale cache entries at 2 AM.

Explore JusDB Valkey Consulting → | Explore JusDB Debezium Services → | Talk to a DBA

Related reading:

Solving Cache Invalidation and Stampede in Valkey with Debezium CDC

What Is Cache Invalidation and Why Is It Hard?

What Is a Cache Stampede?

How Debezium CDC Solves Both Problems

Architecture Overview

Why Valkey Instead of Redis?

Setting Up Debezium Embedded Engine with MySQL and Valkey

Step 1: Add Maven Dependencies

Step 2: Enable MySQL Binlog for CDC

Step 3: Configure and Start the Debezium Engine

Step 4: Write the Valkey Cache Consumer

Preventing Cache Stampede with CDC + XFetch

XFetch Algorithm

Combining CDC + XFetch + Mutex

Kafka Connect Deployment for Multi-Service Architectures

Connector Configuration

Valkey Sink Consumer (Kafka Consumer Group)

Performance Tuning and Production Considerations

Latency Budget

Offset Storage for Durability

Monitoring the Pipeline

Cache Invalidation Strategy Comparison

Working with JusDB on Valkey and Debezium CDC

Share this article

JusDB Team

Keep reading

Normalization vs Denormalization (2026): When Each Wins for Database Performance

Database Schema Design for Scalability (2026): 12 Patterns DBAs Use in Production

Multi-Tenancy Database Patterns: Schema, Row, and Database-Level Isolation

Need Expert Help?

Database Performance Optimization

Debezium CDC Services

Flink CDC Services

Apache SeaTunnel

MySQL Performance Tuning

MySQL Consulting