Database Engineering

Debezium 3.0: Real-Time CDC from PostgreSQL, MySQL, and MongoDB

Debezium 3.0 turns your database transaction log into a real-time event stream with full before/after row images. Learn to configure CDC for PostgreSQL WAL and MySQL binlog, avoid the offset storage trap, and monitor replication lag.

JusDB Team
October 28, 2025

Debezium 3.0 turns your database's transaction log into a real-time event stream. Every INSERT, UPDATE, and DELETE becomes a Kafka message with full before-and-after row images — without a single line of trigger code. Here's how to deploy Debezium for production CDC and avoid the configuration mistakes that cause data loss.

TL;DR
  • Debezium captures changes from the MySQL binlog, the PostgreSQL WAL, and MongoDB change streams without polling tables
  • Debezium 3.0 ships Debezium Server, a standalone runtime that needs no Kafka cluster, alongside improved exactly-once delivery
  • The offset storage topic is the single most critical piece of configuration: losing it means re-snapshotting every captured table
  • Use heartbeat events to keep offsets advancing on quiet tables and to surface replication lag before your WAL or binlog is purged

How Debezium CDC Works

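Debezium connects to your database as a replication client. For PostgreSQL, it reads WAL changes through a logical replication slot. For MySQL, it reads the binlog. For MongoDB, Debezium 2.0 and later read change streams, which are built on the oplog. Each change event is published to a Kafka topic named after the source table (<topic.prefix>.<schema>.<table> for PostgreSQL), keyed by the row's primary key, and carries the before and after row state. On PostgreSQL, the before image contains every column only when the table uses REPLICA IDENTITY FULL; with the default replica identity it is limited to primary key columns.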
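To make the event shape concrete, here is a minimal Python sketch that replays Debezium-style change events into an in-memory copy of a table. The op, before, and after fields follow Debezium's event envelope; the sample rows, the single-column primary key named id, and the apply_event helper are assumptions made for illustration, and the events are shown without the optional schema wrapper.

```python
import json
from typing import Optional


def apply_event(table: dict, key: dict, value: Optional[dict]) -> None:
    """Apply one change event to `table`, a dict keyed by primary key."""
    pk = key["id"]  # assumption: single-column primary key named "id"
    if value is None:
        # Tombstone record that follows a delete; nothing left to apply
        table.pop(pk, None)
        return
    op = value["op"]
    if op in ("c", "r", "u"):  # create, snapshot read, update
        table[pk] = value["after"]
    elif op == "d":  # delete: only "before" is populated
        table.pop(pk, None)


# Hypothetical (key, value) pairs as a consumer would read them from the topic
events = [
    ({"id": 1}, {"op": "c", "before": None, "after": {"id": 1, "status": "new"}}),
    ({"id": 2}, {"op": "c", "before": None, "after": {"id": 2, "status": "new"}}),
    ({"id": 1}, {"op": "u", "before": {"id": 1, "status": "new"},
                 "after": {"id": 1, "status": "shipped"}}),
    ({"id": 2}, {"op": "d", "before": {"id": 2, "status": "new"}, "after": None}),
    ({"id": 2}, None),
]

orders: dict = {}
for key, value in events:
    apply_event(orders, key, value)

print(json.dumps(orders))
```

Because every event is applied by primary key, replaying the same events twice leaves the table in the same state, which matters when a consumer restarts and re-reads part of the topic.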

PostgreSQL CDC with Debezium 3.0

json
{
  "name": "pg-orders-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "postgres.internal",
    "database.port": "5432",
    "database.user": "debezium",
    "database.password": "dbz_password",
    "database.dbname": "appdb",
    "topic.prefix": "pg.appdb",
    "table.include.list": "public.orders,public.products",
    "plugin.name": "pgoutput",
    "slot.name": "debezium_slot",
    "publication.name": "debezium_pub",
    "heartbeat.interval.ms": "10000",
    "heartbeat.action.query": "INSERT INTO public.heartbeat (id, ts) VALUES (1, NOW()) ON CONFLICT (id) DO UPDATE SET ts = NOW()",
    "snapshot.mode": "initial"
  }
}

Offset storage is configured on the Kafka Connect worker, not in the connector JSON, where these properties have no effect. Set offset.storage.topic=debezium_offsets and offset.flush.interval.ms=5000 in the worker's properties file (for example connect-distributed.properties). In distributed mode the worker always stores offsets in that Kafka topic; the offset.storage class property applies only to Debezium Server and the embedded engine.
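One way to submit a connector definition like the one above is the Kafka Connect REST API, where PUT /connectors/{name}/config creates the connector or updates it in place. The sketch below only builds the request; the worker address http://kafka-connect:8083 is taken from the monitoring example in this article, the config is shortened to a few keys, and build_register_request is a hypothetical helper name.

```python
import json
import urllib.request


def build_register_request(connect_url: str, name: str, config: dict) -> urllib.request.Request:
    """Build a PUT /connectors/{name}/config request (create or update)."""
    # This endpoint takes the flat config map, not the {"name", "config"} wrapper
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url=f"{connect_url.rstrip('/')}/connectors/{name}/config",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Shortened config for illustration; use the full "config" object in practice
config = {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "postgres.internal",
    "topic.prefix": "pg.appdb",
    "slot.name": "debezium_slot",
}

request = build_register_request("http://kafka-connect:8083", "pg-orders-connector", config)
print(request.get_method(), request.full_url)

# To submit it (requires a reachable Kafka Connect worker):
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Using PUT rather than POST /connectors keeps the call idempotent, so the same deployment script can run on every release.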

Set Up PostgreSQL for Debezium

sql
-- postgresql.conf
-- wal_level = logical  (required)
-- max_wal_senders = 10
-- max_replication_slots = 10

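-- Create replication user
CREATE USER debezium REPLICATION LOGIN PASSWORD 'dbz_password';
GRANT CONNECT ON DATABASE appdb TO debezium;
GRANT USAGE ON SCHEMA public TO debezium;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO debezium;

-- Heartbeat table written by heartbeat.action.query (needs write access)
CREATE TABLE IF NOT EXISTS public.heartbeat (id INT PRIMARY KEY, ts TIMESTAMPTZ NOT NULL);
GRANT SELECT, INSERT, UPDATE ON public.heartbeat TO debezium;

-- Create publication for specific tables
CREATE PUBLICATION debezium_pub FOR TABLE public.orders, public.products;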

-- Verify logical replication is enabled
SELECT name, setting FROM pg_settings WHERE name = 'wal_level';

MySQL CDC with Debezium 3.0

json
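{
  "name": "mysql-orders-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql.internal",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "dbz_password",
    "database.server.id": "184054",
    "topic.prefix": "mysql.appdb",
    "database.include.list": "appdb",
    "table.include.list": "appdb.orders,appdb.products",
    "include.schema.changes": "true",
    "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
    "schema.history.internal.kafka.topic": "schema-history.mysql.appdb",
    "snapshot.mode": "initial",
    "snapshot.locking.mode": "minimal",
    "gtid.source.includes": "",
    "binlog.buffer.size": "1048576"
  }
}

The two schema.history.internal.* properties are required by the MySQL connector, which persists DDL history in its own Kafka topic. The broker address kafka:9092 and the topic name shown here are placeholders; substitute your own.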
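It helps to know in advance which topics a connector will write to, for example when pre-creating topics or writing ACLs. Debezium names each data topic by joining topic.prefix with the database and table name, and when include.schema.changes is true the schema change topic is named after the prefix itself. The sketch below assumes table.include.list holds literal names rather than regular expressions; expected_topics is a hypothetical helper.

```python
from typing import List


def expected_topics(topic_prefix: str, table_include_list: str,
                    include_schema_changes: bool = True) -> List[str]:
    """List the topics a MySQL connector is expected to produce."""
    # Assumption: entries are literal "database.table" names, not regexes
    tables = [t.strip() for t in table_include_list.split(",") if t.strip()]
    topics = [f"{topic_prefix}.{table}" for table in tables]
    if include_schema_changes:
        # The schema change topic carries DDL events and is named after the prefix
        topics.insert(0, topic_prefix)
    return topics


topics = expected_topics("mysql.appdb", "appdb.orders,appdb.products")
for topic in topics:
    print(topic)
```

With the prefix used in this article the data topics repeat the database name (mysql.appdb.appdb.orders), so a shorter prefix may be easier to read.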

Debezium 3.0: What's New

Debezium Server (No Kafka Required)

properties
# debezium-server/conf/application.properties
# Route CDC events directly to HTTP, Redis, or Kinesis -- no Kafka cluster needed

debezium.source.connector.class=io.debezium.connector.postgresql.PostgresConnector
debezium.source.database.hostname=postgres.internal
debezium.source.database.port=5432
debezium.source.database.user=debezium
debezium.source.database.password=dbz_password
debezium.source.database.dbname=appdb
debezium.source.topic.prefix=pg.appdb
debezium.source.plugin.name=pgoutput

# Sink: send to HTTP endpoint
debezium.sink.type=http
debezium.sink.http.url=https://api.internal/cdc-events
debezium.sink.http.timeout.ms=5000

# Offset storage (local file for simple deployments)
debezium.source.offset.storage=org.apache.kafka.connect.storage.FileOffsetBackingStore
debezium.source.offset.storage.file.filename=/data/offsets.dat
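
On the receiving side, the HTTP sink needs an endpoint that accepts POST requests carrying the serialized change event. The following is a rough sketch of such a receiver using only the Python standard library. It assumes JSON-formatted events that may or may not include the schema/payload wrapper; summarize_event, CdcHandler, and the port are illustrative choices, not part of Debezium.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_event(body: bytes) -> str:
    """Return a one-line summary such as 'c on orders' for a change event."""
    event = json.loads(body)
    payload = event.get("payload", event)  # unwrap if the schema wrapper is present
    op = payload.get("op")
    table = (payload.get("source") or {}).get("table")
    return f"{op} on {table}"


class CdcHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            print(summarize_event(body))
            self.send_response(200)  # acknowledge so the sink can move on
        except (ValueError, AttributeError):
            self.send_response(400)  # body was not a JSON object
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the receiver; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), CdcHandler).serve_forever()
```

Call serve() to start listening. A production receiver should persist or forward the event before returning a success status, because the response is what allows Debezium Server to advance its offsets.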
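
Important

PostgreSQL replication slots retain WAL on disk while the consumer falls behind, so a paused or failed Debezium connector can fill your disk completely. Always monitor lag against pg_replication_slots.confirmed_flush_lsn, and set max_slot_wal_keep_size (PostgreSQL 13 and later) in postgresql.conf to cap WAL accumulation. Be aware of the trade-off: once a slot falls further behind than that cap, PostgreSQL invalidates it and the connector needs a new snapshot.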
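The lag itself is simple arithmetic on LSNs. A PostgreSQL LSN is printed as two hexadecimal numbers separated by a slash, the high and low 32 bits of a byte position in the WAL. This sketch converts two LSNs to a lag in bytes and compares it with the configured cap; the 50% warning ratio and the function names are arbitrary assumptions.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def slot_lag_bytes(current_lsn: str, confirmed_flush_lsn: str) -> int:
    """Bytes of WAL written since the slot's consumer last confirmed."""
    return parse_lsn(current_lsn) - parse_lsn(confirmed_flush_lsn)


def lag_alert(lag_bytes: int, max_slot_wal_keep_bytes: int, warn_ratio: float = 0.5) -> bool:
    """True once lag reaches warn_ratio of the max_slot_wal_keep_size cap."""
    return lag_bytes >= warn_ratio * max_slot_wal_keep_bytes


# Hypothetical values as returned by pg_current_wal_lsn() and pg_replication_slots
lag = slot_lag_bytes("0/3000000", "0/1000000")
print(lag, lag_alert(lag, 10 * 1024**3))
```

Alerting at a fraction of the cap leaves time to restart the connector before PostgreSQL invalidates the slot.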

Monitoring Debezium Health

bash
# Connector and task state come from the Kafka Connect REST API
curl -s http://kafka-connect:8083/connectors/pg-orders-connector/status | jq .

# Fields to check in the response:
# - .connector.state: RUNNING / PAUSED / FAILED
# - .tasks[0].state: RUNNING / FAILED
#
# Replication lag is exposed over JMX, not through the REST API:
# - MBean debezium.postgres:type=connector-metrics,context=streaming,server=pg.appdb
# - Attribute MilliSecondsBehindSource

# PostgreSQL: monitor replication slot lag (connection flags are placeholders)
psql -h postgres.internal -d appdb <<'SQL'
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes,
       active
FROM pg_replication_slots
WHERE slot_name = 'debezium_slot';
SQL
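
A connector can report RUNNING while one of its tasks has FAILED, so a health check has to look at both levels of the status response. The sketch below evaluates an already-parsed status document; the sample response is made up, and connector_problems is a hypothetical helper.

```python
from typing import List


def connector_problems(status: dict) -> List[str]:
    """Return human-readable problems found in a /connectors/{name}/status response."""
    problems = []
    connector_state = status.get("connector", {}).get("state")
    if connector_state != "RUNNING":
        problems.append(f"connector is {connector_state}")
    tasks = status.get("tasks", [])
    if not tasks:
        problems.append("connector has no tasks")
    for task in tasks:
        if task.get("state") != "RUNNING":
            problems.append(f"task {task.get('id')} is {task.get('state')}")
    return problems


# Hypothetical response: the connector looks healthy but its only task has failed
sample_status = {
    "name": "pg-orders-connector",
    "connector": {"state": "RUNNING", "worker_id": "10.0.0.5:8083"},
    "tasks": [{"id": 0, "state": "FAILED", "worker_id": "10.0.0.5:8083"}],
    "type": "source",
}

problems = connector_problems(sample_status)
print(problems or "healthy")
```

An empty list means both the connector and every task report RUNNING; anything else is worth an alert.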

Key Takeaways
  • Debezium Server removes the Kafka dependency for teams that need CDC delivered to HTTP, Redis, or Kinesis.
  • Stored offsets are the most critical state in the pipeline: losing the offset storage topic forces a full database snapshot, which can take hours.
  • Always set max_slot_wal_keep_size on PostgreSQL to prevent a paused Debezium connector from filling your disk with WAL.
  • Heartbeat events show that your CDC pipeline is alive even when the source database has no activity, which is essential for low-traffic tables.

Working with JusDB on CDC and Debezium

JusDB designs and deploys Change Data Capture pipelines using Debezium for teams building real-time analytics, search index synchronization, and event-driven architectures. We configure replication slots, offset storage, and monitoring to ensure your CDC pipeline handles database failovers without data loss.
