Debezium 3.0 turns your database's transaction log into a real-time event stream. Every INSERT, UPDATE, and DELETE becomes a Kafka message with full before-and-after row images — without a single line of trigger code. Here's how to deploy Debezium for production CDC and avoid the configuration mistakes that cause data loss.
- Debezium captures changes from the MySQL binlog, the PostgreSQL WAL, and MongoDB change streams without polling
- Debezium 3.0 includes Debezium Server (no Kafka required) and improved exactly-once delivery
- The offset storage topic is the single most critical configuration — losing it means re-snapshotting your entire table
- Use heartbeat events to detect replication lag before your WAL/binlog is purged
How Debezium CDC Works
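Debezium connects to your database as a replication client. For PostgreSQL, it uses a logical replication slot to receive WAL changes. For MySQL, it reads the binlog. For MongoDB, it reads change streams, which are backed by the oplog. Each change event is published to a Kafka topic named after the source table (<topic.prefix>.<schema>.<table>, for example pg.appdb.public.orders) and keyed by the row's primary key. The event value carries the before and after row state. On PostgreSQL, the before image contains every column only when the table is set to REPLICA IDENTITY FULL.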
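To make the event shape concrete, here is a minimal Python sketch that interprets one change event value. The before, after, op, and source fields follow Debezium's event envelope; the sample payload and the describe_change helper are made up for illustration, and the sketch assumes events are serialized as JSON.

```python
import json

# Debezium operation codes: c=create, u=update, d=delete, r=snapshot read
OPERATIONS = {"c": "INSERT", "u": "UPDATE", "d": "DELETE", "r": "SNAPSHOT"}


def describe_change(raw_value: str) -> dict:
    """Summarise one change event value (hypothetical helper)."""
    event = json.loads(raw_value)
    # With converter schemas enabled, the envelope sits under "payload".
    envelope = event.get("payload", event)
    before = envelope.get("before") or {}
    after = envelope.get("after") or {}
    # Columns whose value differs between the two row images
    changed = sorted(
        col for col in set(before) | set(after)
        if before.get(col) != after.get(col)
    )
    source = envelope.get("source", {})
    return {
        "operation": OPERATIONS.get(envelope["op"], envelope["op"]),
        "table": f'{source.get("schema")}.{source.get("table")}',
        "changed_columns": changed,
    }


# Made-up sample event for an UPDATE on public.orders
sample = json.dumps({
    "before": {"id": 42, "status": "pending", "total": 99.5},
    "after": {"id": 42, "status": "shipped", "total": 99.5},
    "source": {"schema": "public", "table": "orders"},
    "op": "u",
    "ts_ms": 1700000000000,
})

summary = describe_change(sample)
print(summary)
```

For a delete, after is null, so every column present in the before image is reported as changed.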
PostgreSQL CDC with Debezium 3.0
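Connector definitions such as the one in this section are submitted to the Kafka Connect REST API. As a minimal sketch, assuming a worker reachable at http://kafka-connect:8083 (the address used in the monitoring commands later in this article), the following Python builds a PUT /connectors/{name}/config request, which creates the connector or updates it if it already exists. The helper names are hypothetical and the connector definition is trimmed for brevity.

```python
import json
import urllib.parse
import urllib.request


def build_register_request(connect_url: str, connector: dict) -> urllib.request.Request:
    """Build a PUT /connectors/{name}/config request (create or update)."""
    name = urllib.parse.quote(connector["name"], safe="")
    # This endpoint expects only the inner "config" map, not the name/config wrapper.
    body = json.dumps(connector["config"]).encode("utf-8")
    return urllib.request.Request(
        url=f"{connect_url.rstrip('/')}/connectors/{name}/config",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def register(connect_url: str, connector: dict) -> int:
    """Send the request and return the HTTP status code."""
    request = build_register_request(connect_url, connector)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Trimmed-down connector definition, for illustration only
connector = {
    "name": "pg-orders-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "topic.prefix": "pg.appdb",
    },
}
request = build_register_request("http://kafka-connect:8083", connector)
print(request.get_method(), request.full_url)
# register("http://kafka-connect:8083", connector)  # needs a reachable worker
```

PUT is used here rather than POST /connectors so the same script can be re-run after a configuration change. The full configuration for the PostgreSQL connector follows.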
{
  "name": "pg-orders-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "postgres.internal",
    "database.port": "5432",
    "database.user": "debezium",
    "database.password": "dbz_password",
    "database.dbname": "appdb",
    "topic.prefix": "pg.appdb",
    "table.include.list": "public.orders,public.products",
    "plugin.name": "pgoutput",
    "slot.name": "debezium_slot",
    "publication.name": "debezium_pub",
    "heartbeat.interval.ms": "10000",
    "heartbeat.action.query": "INSERT INTO public.heartbeat (id, ts) VALUES (1, NOW()) ON CONFLICT (id) DO UPDATE SET ts = NOW()",
    "snapshot.mode": "initial"
  }
}
Offset storage is configured on the Kafka Connect worker, not in the connector JSON, where those keys are ignored. Set offset.storage.topic=debezium_offsets and offset.flush.interval.ms=5000 in the worker properties file (for example connect-distributed.properties). A distributed worker always stores offsets in Kafka, so no offset.storage class is needed.
Set Up PostgreSQL for Debezium
-- postgresql.conf
-- wal_level = logical (required)
-- max_wal_senders = 10
-- max_replication_slots = 10
-- Create replication user
CREATE USER debezium REPLICATION LOGIN PASSWORD 'dbz_password';
GRANT CONNECT ON DATABASE appdb TO debezium;
GRANT USAGE ON SCHEMA public TO debezium;
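GRANT SELECT ON ALL TABLES IN SCHEMA public TO debezium;
-- Heartbeat table written by heartbeat.action.query (the connector user needs write access)
CREATE TABLE IF NOT EXISTS public.heartbeat (id INT PRIMARY KEY, ts TIMESTAMPTZ NOT NULL);
GRANT SELECT, INSERT, UPDATE ON public.heartbeat TO debezium;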
-- Create publication for specific tables
CREATE PUBLICATION debezium_pub FOR TABLE orders, products;
-- Verify logical replication is enabled
SELECT name, setting FROM pg_settings WHERE name = 'wal_level';
MySQL CDC with Debezium 3.0
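{
  "name": "mysql-orders-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql.internal",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "dbz_password",
    "database.server.id": "184054",
    "topic.prefix": "mysql.appdb",
    "database.include.list": "appdb",
    "table.include.list": "appdb.orders,appdb.products",
    "include.schema.changes": "true",
    "snapshot.mode": "initial",
    "snapshot.locking.mode": "minimal",
    "gtid.source.includes": "",
    "binlog.buffer.size": "1048576",
    "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
    "schema.history.internal.kafka.topic": "schema-changes.appdb"
  }
}
The MySQL connector also requires the schema.history.internal.* settings so it can store and replay DDL history. The broker address kafka:9092 and the topic name schema-changes.appdb are placeholders; substitute your own.
Debezium 3.0: What's New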
Debezium Server (No Kafka Required)
# debezium-server/conf/application.properties
# Route CDC events directly to HTTP, Redis, or Kinesis -- no Kafka cluster needed
debezium.source.connector.class=io.debezium.connector.postgresql.PostgresConnector
debezium.source.database.hostname=postgres.internal
debezium.source.database.port=5432
debezium.source.database.user=debezium
debezium.source.database.password=dbz_password
debezium.source.database.dbname=appdb
debezium.source.topic.prefix=pg.appdb
debezium.source.plugin.name=pgoutput
# Sink: send to HTTP endpoint
debezium.sink.type=http
debezium.sink.http.url=https://api.internal/cdc-events
debezium.sink.http.timeout.ms=5000
# Offset storage (local file for simple deployments)
debezium.source.offset.storage=org.apache.kafka.connect.storage.FileOffsetBackingStore
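debezium.source.offset.storage.file.filename=/data/offsets.dat
PostgreSQL replication slots accumulate WAL on disk when the consumer falls behind. A paused Debezium connector can cause your disk to fill completely. Always monitor how far pg_replication_slots.confirmed_flush_lsn trails the current WAL position, and set max_slot_wal_keep_size (PostgreSQL 13 and later) in postgresql.conf to cap WAL accumulation. Once a slot exceeds that cap, PostgreSQL invalidates it and the connector has to re-snapshot, so alert well before the limit is reached.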
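As a rough sketch of that alerting logic, the Python below converts two LSN strings into byte positions, computes the gap that pg_wal_lsn_diff would report, and classifies it against a configured cap. The function names, the 80% warning ratio, and the sample LSN values are assumptions for illustration, not part of Debezium or PostgreSQL.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position."""
    high, low = lsn.split("/")
    # An LSN is a 64-bit position printed as two 32-bit hex halves.
    return (int(high, 16) << 32) | int(low, 16)


def slot_lag_bytes(current_lsn: str, confirmed_flush_lsn: str) -> int:
    """Bytes between the current WAL position and the slot's confirmed position."""
    return lsn_to_int(current_lsn) - lsn_to_int(confirmed_flush_lsn)


def lag_status(lag_bytes: int, keep_size_bytes: int, warn_ratio: float = 0.8) -> str:
    """Classify lag relative to max_slot_wal_keep_size (warn_ratio is an assumed threshold)."""
    if lag_bytes >= keep_size_bytes:
        return "CRITICAL"
    if lag_bytes >= warn_ratio * keep_size_bytes:
        return "WARNING"
    return "OK"


# Illustrative values: a 10 GiB cap and a slot that trails by 9 GiB
cap = 10 * 1024**3
lag = slot_lag_bytes("5/40000000", "3/0")
print(lag, lag_status(lag, cap))
```

In practice the two LSN values would come from pg_current_wal_lsn() and pg_replication_slots, and the cap from the max_slot_wal_keep_size setting.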
Monitoring Debezium Health
# Connector and task state are available from the Kafka Connect REST API
curl -s http://kafka-connect:8083/connectors/pg-orders-connector/status | jq .
# Key fields in the response:
# - connector.state: RUNNING / PAUSED / FAILED
# - tasks[0].state: RUNNING / FAILED
# Replication lag is exposed separately as a JMX metric on the Connect worker:
# - MilliSecondsBehindSource on debezium.postgres:type=connector-metrics,context=streaming,server=pg.appdb
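The status response is easy to check from a script. The sketch below, a hypothetical helper rather than anything shipped with Kafka Connect, inspects a parsed status document and lists what is wrong. The sample body mirrors the shape of the REST API response, with made-up values.

```python
import json


def connector_problems(status: dict) -> list:
    """Return human-readable problems found in a /status response."""
    problems = []
    connector_state = status.get("connector", {}).get("state")
    if connector_state != "RUNNING":
        problems.append(f"connector is {connector_state}")
    tasks = status.get("tasks", [])
    if not tasks:
        problems.append("no tasks assigned")
    for task in tasks:
        if task.get("state") != "RUNNING":
            problems.append(f"task {task.get('id')} is {task.get('state')}")
    return problems


# Sample body shaped like GET /connectors/<name>/status (values made up)
sample = json.loads("""
{
  "name": "pg-orders-connector",
  "connector": {"state": "RUNNING", "worker_id": "10.0.0.5:8083"},
  "tasks": [{"id": 0, "state": "FAILED", "worker_id": "10.0.0.5:8083", "trace": "..."}],
  "type": "source"
}
""")
problems = connector_problems(sample)
print(problems or "healthy")
```

Checking tasks separately matters because a connector can report RUNNING while its only task has failed and no events are flowing.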
# PostgreSQL: monitor replication slot lag
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes,
       active
FROM pg_replication_slots
WHERE slot_name = 'debezium_slot';
- Debezium 3.0's native server mode eliminates the Kafka dependency for teams that need CDC to HTTP, Redis, or Kinesis.
- The offset storage topic is the most critical component: losing it forces a full database snapshot, which can take hours.
- Always set max_slot_wal_keep_size on PostgreSQL to prevent a paused Debezium connector from filling your disk with WAL.
- Heartbeat events detect when your CDC pipeline is alive but the source database has no activity, which is essential for low-traffic tables.
Working with JusDB on CDC and Debezium
JusDB designs and deploys Change Data Capture pipelines using Debezium for teams building real-time analytics, search index synchronization, and event-driven architectures. We configure replication slots, offset storage, and monitoring to ensure your CDC pipeline handles database failovers without data loss.