MySQL

Diagnosing and Fixing MySQL Binlog Replication Failures

MySQL replication errors 1062 and 1032 stop your replica SQL thread silently. Learn to diagnose binlog failures, fix data drift with pt-table-sync, and configure GTID-safe replication.

JusDB Team
July 15, 2025
10 min read
135 views

Your replica SQL thread stopped at 3:47 AM with error 1062: Duplicate entry. By the time ops gets the alert, the replica is 40 minutes behind primary and your read traffic is hammering the primary directly. MySQL binlog replication failures are predictable — and mostly preventable with the right configuration and monitoring.

TL;DR
  • Most replication errors fall into three categories: duplicate keys, missing rows, and data type mismatches
  • GTID-based replication (MySQL 8.4) eliminates position-tracking bugs but requires clean setup
  • Use STOP REPLICA; SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START REPLICA; only as a last resort
  • pt-table-checksum + pt-table-sync are the right tools for data drift repair

Understanding MySQL Binlog Replication

MySQL replication works by having the primary write all data changes to a binary log (binlog). The replica's IO thread connects to the primary, reads the binlog, and writes events to its own relay log. The SQL thread reads the relay log and replays events against the replica's data.

Replication Formats

sql
-- Check current binlog format
SHOW VARIABLES LIKE 'binlog_format';

-- MySQL 8.0+ default and recommended:
SET GLOBAL binlog_format = 'ROW';
SET GLOBAL binlog_row_image = 'FULL';

Diagnosing Replication Failures

Step 1: Read the Error Clearly

sql
-- On the replica:
SHOW REPLICA STATUS\G

-- Key fields to check:
-- Replica_IO_Running: Yes/No
-- Replica_SQL_Running: Yes/No
-- Last_SQL_Error: error message
-- Seconds_Behind_Source: lag in seconds

Step 2: Identify the Diverged Row

bash
# Use mysqlbinlog to inspect the failing event
mysqlbinlog --base64-output=DECODE-ROWS -v \
  --start-position=POSITION \
  /var/lib/mysql/binlog.000042 | head -100

Common Error Patterns and Fixes

Error CodeError MessageRoot CauseFix
1062Duplicate entry for key PRIMARYRow exists on replica but not primaryDelete conflicting row on replica
1032Can't find record in tableRow was deleted on replica before UPDATE arrivedRe-insert the expected row or skip
1146Table doesn't existDDL ran on primary before replica was readyRe-run DDL on replica manually
1236Could not find first log filePrimary binlog purged before replica caught upRebuild replica from fresh dump

Fixing Replication Errors

Fix Error 1062: Duplicate Entry

sql
-- On the replica: identify the conflicting row
SELECT * FROM orders WHERE id = 12345;

-- Option 1: Delete the conflicting row and let replication re-insert it
DELETE FROM orders WHERE id = 12345;
START REPLICA;

-- Option 2: Skip ONE event (non-GTID mode only)
STOP REPLICA;
SET GLOBAL SLAVE_SKIP_COUNTER = 1;
START REPLICA;
Important

Never use SLAVE_SKIP_COUNTER in GTID mode — it's silently ignored and you'll skip the wrong transaction. In GTID mode, inject an empty transaction instead.

Fix Error 1062 in GTID Mode

sql
-- Inject a no-op transaction to skip the failing GTID
STOP REPLICA;
SET GTID_NEXT = 'abc123:100';
BEGIN; COMMIT;
SET GTID_NEXT = 'AUTOMATIC';
START REPLICA;

Repair Data Drift with pt-table-sync

bash
# Step 1: Find diverged tables with pt-table-checksum
pt-table-checksum \
  --user=checker --password=secret \
  --host=primary.db.internal \
  --replicate=percona.checksums \
  --databases=myapp

# Step 2: Sync diverged rows
pt-table-sync \
  --execute \
  --sync-to-master \
  --user=syncer --password=secret \
  h=replica.db.internal,D=myapp,t=orders
Warning

pt-table-sync --execute modifies data directly. Always run with --dry-run first and review the output before executing in production.

Preventing Replication Failures

ini
[mysqld]
gtid_mode = ON
enforce_gtid_consistency = ON
relay_log_recovery = ON
relay_log_info_repository = TABLE
master_info_repository = TABLE
binlog_format = ROW
binlog_row_image = FULL
read_only = ON
super_read_only = ON
Key Takeaways
  • Enable GTID mode from the start — it eliminates position-tracking errors and simplifies failover with tools like Orchestrator or MHA.
  • Use ROW-based binlog format with binlog_row_image=FULL to ensure replicas get complete before/after row images.
  • Run pt-table-checksum weekly to detect data drift before it causes replication errors.
  • Never skip errors with SLAVE_SKIP_COUNTER in GTID mode — inject empty transactions instead.

Working with JusDB on MySQL Replication

JusDB manages MySQL replication topologies for engineering teams who can't afford replication failures during peak traffic. Our DBAs configure GTID-based replication, set up automated drift detection, and provide 24/7 incident response.

Explore JusDB MySQL Management →  |  Talk to a DBA

Related reading:

Share this article

JusDB Team

Official JusDB content team