Database SRE

Integrating MySQL Maintenance with systemd: Services, Timers, and Scheduling

Most MySQL maintenance tasks still rely on cron, but systemd timers offer better logging, dependency management, and failure handling. Here's how to convert your MySQL maintenance scripts to systemd services.

JusDB Team
July 10, 2022
8 min read
147 views

The cron job had been failing silently for eleven days. Nobody noticed until an on-call engineer, mid-incident, discovered the slow query log had not been rotated in almost two weeks and the /var/log/mysql partition was at 94% capacity. The backup verification script had also stopped running on day three — a permission error on the output directory that cron dutifully swallowed without a single notification. In production, silent failure is indistinguishable from success until the moment it becomes catastrophic. On Linux systems running MySQL, systemd offers a fundamentally better execution model for maintenance tasks: structured logging, dependency-aware scheduling, failure detection, and restart policies that cron was never designed to provide.

This guide shows you how to replace fragile cron-based MySQL maintenance with proper systemd service units and timers — including slow query log analysis, backup verification, and pre-flight dependency checks that block execution until MySQL is actually ready.

TL;DR
  • systemd timers replace cron for MySQL maintenance with better logging (journalctl), dependency management (After=mysqld.service), and failure alerting via OnFailure=.
  • Every maintenance task gets two files: a .service unit (what to run) and a .timer unit (when to run it). This separation makes it easy to test tasks manually without waiting for the schedule.
  • Use Type=oneshot for maintenance scripts that run and exit. Use RemainAfterExit=yes if you need the unit to show as active after completion.
  • The official mysqld.service unit is your dependency anchor — add After=mysqld.service and Requires=mysqld.service to any maintenance unit that must run against a live database.
  • journalctl -u mysql-slow-query-analysis.service --since today gives you full structured output for every run — timestamps, exit codes, stderr, and stdout — with no log rotation to manage.
  • Use OnCalendar= for human-readable schedules and RandomizedDelaySec= to avoid thundering herd when multiple timers fire at midnight.

Why systemd Over cron for MySQL Tasks

cron has been scheduling Unix tasks since 1975. It is available everywhere, configuration is simple, and every DBA knows how it works. It also has no native failure detection, sends output nowhere useful by default, provides no dependency ordering, and its logging story is "check your mailbox if mail is configured." For database maintenance tasks that need to be auditable, reliable, and observable, these are not minor inconveniences — they are fundamental design limitations.

systemd Advantages Over cron for Database Maintenance
  • Structured logging: All stdout and stderr from every run is captured by journald automatically. Query runs, errors, and row counts are all accessible via journalctl with filtering by unit, time range, and priority — no custom log file management required.
  • Dependency ordering: A systemd service can declare After=mysqld.service and Requires=mysqld.service, which prevents the maintenance task from starting if MySQL is down, mid-restart, or still initializing. cron fires regardless.
  • Failure handling: OnFailure= lets you trigger a notification unit when a maintenance service exits non-zero. You can alert on a backup verification failure before the next morning standup — not three incidents later.
  • Manual testing without schedule changes: Run a timer's service immediately with systemctl start mysql-backup-verify.service. No need to edit crontab, set a fake time, or wait for the next scheduled window.
  • Resource controls: Use CPUQuota=, MemoryMax=, and IOWeight= to limit how much the maintenance task can steal from a production MySQL process running on the same host.
  • Audit trail: Every execution is recorded with a monotonic timestamp in journald. Compliance and incident reviews can confirm exactly when a backup ran, what it output, and whether it succeeded.

The tradeoff is a small amount of additional configuration. Each task requires two files instead of one crontab line. For any MySQL maintenance task that matters — backup verification, slow query analysis, health checks, log rotation — the operational improvement justifies it without question.

MySQL systemd Service Unit (the official mysqld.service)

Before creating custom units, understand how MySQL itself is managed by systemd. When installed from official MySQL APT/YUM repositories or from your distribution's package manager, MySQL ships with a unit file that controls the database server process. Inspect it directly:

text
# On RPM-based systems (RHEL, CentOS, Rocky Linux, Fedora)
systemctl cat mysqld.service

# On DEB-based systems (Debian, Ubuntu)
systemctl cat mysql.service

A typical MySQL 8.0 unit on RHEL/Rocky Linux looks like this:

text
[Unit]
Description=MySQL Server
Documentation=man:mysqld(8)
Documentation=http://dev.mysql.com/doc/refman/en/using-systemd.html
After=network.target
After=syslog.target

[Install]
WantedBy=multi-user.target
Alias=mysql.service

[Service]
User=mysql
Group=mysql
Type=notify
ExecStartPre=/usr/bin/mysqld_pre_systemd
ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS
EnvironmentFile=-/etc/sysconfig/mysql
LimitNOFILE=10000
Restart=on-failure
RestartPreventExitStatus=1
TimeoutSec=0
PrivateTmp=false

Two things matter here for your maintenance units. First, the unit name is mysqld.service on RPM systems and mysql.service on DEB systems — your maintenance units need to reference the correct name in their After= and Requires= directives. Second, the official unit uses Type=notify, which means systemd waits for MySQL to send a readiness notification before marking it as active. This is important: when your maintenance service declares After=mysqld.service, systemd will not start your task until MySQL has actually signaled ready, not merely started the process.

Check which name is active on your system:

text
# Check which unit is active
systemctl is-active mysqld.service 2>/dev/null || systemctl is-active mysql.service

# Show full status including PID and uptime
systemctl status mysqld.service

Creating a systemd Timer for MySQL Maintenance

Every scheduled MySQL maintenance task needs two unit files placed in /etc/systemd/system/: a service unit defining what runs, and a timer unit defining when it runs. The service unit name must match the timer unit name — mysql-slow-query-analysis.service pairs with mysql-slow-query-analysis.timer.

Here is the minimal structure for a maintenance service unit:

text
# /etc/systemd/system/mysql-maintenance-example.service

[Unit]
Description=MySQL Maintenance Example Task
After=mysqld.service
Requires=mysqld.service

[Service]
Type=oneshot
User=mysql
Group=mysql
ExecStart=/usr/local/bin/mysql-maintenance-example.sh
StandardOutput=journal
StandardError=journal
SyslogIdentifier=mysql-maintenance-example

# Prevent the task from consuming excessive CPU on a shared host
CPUQuota=25%
IOSchedulingClass=idle

And the corresponding timer unit:

text
# /etc/systemd/system/mysql-maintenance-example.timer

[Unit]
Description=Run MySQL Maintenance Example Task Daily at 02:00
Requires=mysql-maintenance-example.service

[Timer]
OnCalendar=*-*-* 02:00:00
RandomizedDelaySec=300
Persistent=true

[Install]
WantedBy=timers.target

Key configuration choices explained:

  • Type=oneshot is correct for maintenance scripts that run and exit. systemd tracks the service as active while the script runs, then moves it to inactive on exit.
  • Persistent=true in the timer means that if the system was off during the scheduled time (maintenance window, reboot), the timer fires immediately on next startup rather than waiting for the next occurrence. This is almost always correct for backup and log tasks.
  • RandomizedDelaySec=300 adds a random 0–300 second delay before firing. On servers running multiple maintenance timers at midnight, this prevents I/O storms.
  • OnCalendar= accepts human-readable formats: daily, weekly, Mon *-*-* 03:00:00, or full cron-equivalent expressions.

After creating both files, reload systemd and enable the timer:

text
systemctl daemon-reload
systemctl enable --now mysql-maintenance-example.timer

# Verify the timer is loaded and shows the next trigger time
systemctl list-timers --all | grep mysql

Example: Automated Slow Query Log Analysis

The MySQL slow query log is only useful if something processes it. Left unanalyzed, it grows without bound and provides no operational signal. The following setup runs pt-query-digest (from Percona Toolkit) nightly, writes a summary report, and alerts via systemd's OnFailure= mechanism if the analysis tool fails.

First, the analysis script at /usr/local/bin/mysql-slow-query-digest.sh:

text
#!/usr/bin/env bash
set -euo pipefail

SLOW_LOG="/var/log/mysql/mysql-slow.log"
REPORT_DIR="/var/lib/mysql-reports"
REPORT_FILE="${REPORT_DIR}/slow-query-$(date +%Y%m%d).txt"
DAYS_TO_KEEP=30

# Ensure the report directory exists
mkdir -p "${REPORT_DIR}"

# Verify the slow query log exists and is non-empty
if [[ ! -s "${SLOW_LOG}" ]]; then
    echo "Slow query log is empty or missing: ${SLOW_LOG}"
    exit 0
fi

echo "Running pt-query-digest on ${SLOW_LOG} at $(date -Iseconds)"

pt-query-digest \
    --limit=20 \
    --order-by=Query_time:sum \
    --report-format=header,query_report,profile \
    "${SLOW_LOG}" > "${REPORT_FILE}"

echo "Report written to ${REPORT_FILE}"
echo "Top queries by cumulative execution time:"
head -80 "${REPORT_FILE}"

# Rotate old reports
find "${REPORT_DIR}" -name "slow-query-*.txt" -mtime +${DAYS_TO_KEEP} -delete
echo "Rotated reports older than ${DAYS_TO_KEEP} days"
text
chmod +x /usr/local/bin/mysql-slow-query-digest.sh

The service unit:

text
# /etc/systemd/system/mysql-slow-query-analysis.service

[Unit]
Description=MySQL Slow Query Log Analysis (pt-query-digest)
Documentation=https://docs.percona.com/percona-toolkit/pt-query-digest.html
After=mysqld.service
Wants=mysqld.service

[Service]
Type=oneshot
User=mysql
Group=mysql
ExecStart=/usr/local/bin/mysql-slow-query-digest.sh
StandardOutput=journal
StandardError=journal
SyslogIdentifier=mysql-slow-query-analysis

# Constrain resource use during analysis on production hosts
CPUQuota=40%
MemoryMax=512M
IOSchedulingClass=idle
IOSchedulingPriority=7

# Alert on failure (requires mysql-maintenance-alert@.service to exist)
OnFailure=mysql-maintenance-alert@%n.service

The timer, set to run every night at 01:30 AM:

text
# /etc/systemd/system/mysql-slow-query-analysis.timer

[Unit]
Description=Nightly MySQL Slow Query Log Analysis
Requires=mysql-slow-query-analysis.service

[Timer]
OnCalendar=*-*-* 01:30:00
RandomizedDelaySec=120
Persistent=true

[Install]
WantedBy=timers.target

After deploying:

text
systemctl daemon-reload
systemctl enable --now mysql-slow-query-analysis.timer

# Run immediately to test without waiting for the scheduled time
systemctl start mysql-slow-query-analysis.service

# Watch output in real time
journalctl -u mysql-slow-query-analysis.service -f

Example: Daily Backup Verification with systemd

Running backups without verifying them is operational theater. This example creates a service that verifies the most recent mysqldump backup can be parsed without errors, checks the backup file is recent enough (not older than 26 hours), and logs a structured result that monitoring systems can scrape.

The verification script at /usr/local/bin/mysql-backup-verify.sh:

text
#!/usr/bin/env bash
set -euo pipefail

BACKUP_DIR="/var/backups/mysql"
MAX_AGE_HOURS=26
MYSQL_USER="backup_verify"
MYSQL_DEFAULTS_FILE="/etc/mysql/backup-verify.cnf"

echo "=== MySQL Backup Verification ==="
echo "Timestamp: $(date -Iseconds)"
echo "Backup directory: ${BACKUP_DIR}"

# Find the most recent backup file
LATEST_BACKUP=$(find "${BACKUP_DIR}" -name "*.sql.gz" -printf '%T@ %p\n' \
    | sort -n | tail -1 | awk '{print $2}')

if [[ -z "${LATEST_BACKUP}" ]]; then
    echo "ERROR: No backup files found in ${BACKUP_DIR}"
    exit 1
fi

echo "Latest backup: ${LATEST_BACKUP}"

# Check backup age
BACKUP_AGE_SECONDS=$(( $(date +%s) - $(stat -c %Y "${LATEST_BACKUP}") ))
BACKUP_AGE_HOURS=$(( BACKUP_AGE_SECONDS / 3600 ))
echo "Backup age: ${BACKUP_AGE_HOURS} hours"

if (( BACKUP_AGE_HOURS > MAX_AGE_HOURS )); then
    echo "ERROR: Backup is ${BACKUP_AGE_HOURS} hours old — exceeds limit of ${MAX_AGE_HOURS} hours"
    exit 2
fi

# Check backup file integrity (gzip test)
echo "Testing backup file integrity..."
if ! gzip --test "${LATEST_BACKUP}"; then
    echo "ERROR: Backup file failed gzip integrity check"
    exit 3
fi

# Parse SQL structure — look for the expected dump header and completion marker
echo "Validating SQL dump structure..."
DUMP_HEADER=$(zcat "${LATEST_BACKUP}" | head -5)
if ! echo "${DUMP_HEADER}" | grep -q "MySQL dump"; then
    echo "ERROR: Backup does not appear to be a valid mysqldump file"
    exit 4
fi

LAST_LINE=$(zcat "${LATEST_BACKUP}" | tail -3)
if ! echo "${LAST_LINE}" | grep -q "Dump completed"; then
    echo "ERROR: Backup does not contain completion marker — dump may be truncated"
    exit 5
fi

BACKUP_SIZE=$(du -sh "${LATEST_BACKUP}" | cut -f1)
echo "Backup size: ${BACKUP_SIZE}"
echo "STATUS: OK — backup is valid, complete, and within age threshold"
exit 0
text
chmod +x /usr/local/bin/mysql-backup-verify.sh

The service unit:

text
# /etc/systemd/system/mysql-backup-verify.service

[Unit]
Description=MySQL Daily Backup Verification
After=mysqld.service
Wants=mysqld.service

[Service]
Type=oneshot
User=mysql
Group=mysql
ExecStart=/usr/local/bin/mysql-backup-verify.sh
StandardOutput=journal
StandardError=journal
SyslogIdentifier=mysql-backup-verify

CPUQuota=20%
IOSchedulingClass=idle

OnFailure=mysql-maintenance-alert@%n.service
text
# /etc/systemd/system/mysql-backup-verify.timer

[Unit]
Description=Daily MySQL Backup Verification
Requires=mysql-backup-verify.service

[Timer]
OnCalendar=*-*-* 08:00:00
RandomizedDelaySec=600
Persistent=true

[Install]
WantedBy=timers.target

The 08:00 schedule is intentional: run verification after business hours begin so failures surface during working hours, not silently overnight.

Managing Dependencies — Wait for MySQL Before Running

The most common failure mode when converting cron tasks to systemd is incorrect dependency configuration. A maintenance script that connects to MySQL and runs immediately on boot will fail if it starts before MySQL has finished initializing. systemd provides precise control over this ordering.

Service Dependency Ordering: Requires= vs Wants= vs After=

These three directives are frequently confused and the distinction matters for MySQL maintenance tasks:

  • After=mysqld.service — Ordering only. The unit starts after MySQL, but if MySQL is not running, this unit will still attempt to start. Always use this.
  • Requires=mysqld.service — Hard dependency. If MySQL fails or is not active, this unit will not start and will fail immediately. Use this for tasks that genuinely cannot run without MySQL (e.g., query-based health checks).
  • Wants=mysqld.service — Soft dependency. The unit prefers MySQL to be running but will start anyway if MySQL is absent. Appropriate for tasks that degrade gracefully when MySQL is offline (e.g., log rotation that works on files only).

For most MySQL maintenance services, use After=mysqld.service combined with Wants=mysqld.service. Use Requires= only when you are certain the task is meaningless without an active database connection and you want a hard failure rather than a graceful exit.

For maintenance tasks that run a pre-flight connectivity check before doing any work, add an ExecStartPre= directive:

text
# /etc/systemd/system/mysql-health-check.service

[Unit]
Description=MySQL Connectivity Health Check
After=mysqld.service network.target
Wants=mysqld.service

[Service]
Type=oneshot
User=mysql
Group=mysql

# Pre-flight: wait up to 30 seconds for MySQL to accept connections
ExecStartPre=/bin/bash -c \
  'for i in $(seq 1 30); do \
     mysqladmin --defaults-file=/etc/mysql/health-check.cnf ping 2>/dev/null \
     && exit 0 || sleep 1; \
   done; echo "MySQL not reachable after 30 seconds"; exit 1'

ExecStart=/usr/local/bin/mysql-health-check.sh
StandardOutput=journal
StandardError=journal
SyslogIdentifier=mysql-health-check

For tasks that should only run when MySQL is fully healthy and replication is in sync (such as replica-side reporting queries), add a more detailed readiness check:

text
# Pre-flight check script: /usr/local/bin/mysql-replica-preflight.sh
#!/usr/bin/env bash
set -euo pipefail

DEFAULTS="/etc/mysql/maintenance.cnf"

# Verify basic connectivity
mysqladmin --defaults-file="${DEFAULTS}" ping --connect-timeout=5

# On a replica, verify replication is running and lag is acceptable
IO_RUNNING=$(mysql --defaults-file="${DEFAULTS}" -Nse \
    "SHOW REPLICA STATUS\G" 2>/dev/null \
    | grep "Replica_IO_Running" | awk '{print $2}')

SQL_RUNNING=$(mysql --defaults-file="${DEFAULTS}" -Nse \
    "SHOW REPLICA STATUS\G" 2>/dev/null \
    | grep "Replica_SQL_Running:" | awk '{print $2}')

LAG=$(mysql --defaults-file="${DEFAULTS}" -Nse \
    "SHOW REPLICA STATUS\G" 2>/dev/null \
    | grep "Seconds_Behind_Source" | awk '{print $2}')

if [[ "${IO_RUNNING}" != "Yes" || "${SQL_RUNNING}" != "Yes" ]]; then
    echo "ERROR: Replication is not running (IO=${IO_RUNNING}, SQL=${SQL_RUNNING})"
    exit 1
fi

if [[ "${LAG}" -gt 60 ]]; then
    echo "ERROR: Replication lag is ${LAG} seconds — exceeds 60 second threshold"
    exit 1
fi

echo "Pre-flight passed: replication healthy, lag ${LAG}s"

Monitoring systemd Units with journalctl

All output from systemd-managed services is captured by journald automatically. This replaces the need for custom log files, log rotation configurations, and grep pipelines across /var/log directories.

text
# View all output from today's slow query analysis run
journalctl -u mysql-slow-query-analysis.service --since today

# Follow output from a running maintenance service in real time
journalctl -u mysql-backup-verify.service -f

# Show only the last 5 runs of a service (use --output=short-precise for timestamps)
journalctl -u mysql-slow-query-analysis.service -n 200 \
    --output=short-precise | grep -E "^(Feb|Jan|Mar|-- |mysql)"

# Check exit status of the last run
systemctl show mysql-backup-verify.service \
    --property=ExecMainStatus,ActiveEnterTimestamp,InactiveEnterTimestamp

# List all MySQL maintenance timers and their next scheduled run
systemctl list-timers "mysql-*"

# Show detailed status including last result (success/failure)
systemctl status mysql-slow-query-analysis.service

For structured log queries across multiple MySQL maintenance units, use journalctl's JSON output with filtering:

text
# Show all failure-level messages from MySQL maintenance units in the last 7 days
journalctl --since "7 days ago" -p err \
    --identifier=mysql-slow-query-analysis \
    --identifier=mysql-backup-verify \
    --identifier=mysql-health-check \
    --output=json-pretty | \
    python3 -c "import sys,json; \
    [print(json.loads(l).get('MESSAGE','')) for l in sys.stdin]"

# Count successful vs failed runs for a specific unit this month
journalctl -u mysql-backup-verify.service --since "$(date +%Y-%m-01)" \
    | grep -c "STATUS: OK" && \
journalctl -u mysql-backup-verify.service --since "$(date +%Y-%m-01)" \
    | grep -c "ERROR"

To expose unit status to an external monitoring system (Prometheus, Nagios, Datadog), use systemctl is-failed in a check script:

text
#!/usr/bin/env bash
# /usr/local/bin/mysql-systemd-check.sh
# Returns 0 if all MySQL maintenance units are healthy, 1 if any failed

UNITS=(
    "mysql-slow-query-analysis.service"
    "mysql-backup-verify.service"
    "mysql-health-check.service"
)

FAILED=0
for unit in "${UNITS[@]}"; do
    if systemctl is-failed --quiet "${unit}"; then
        echo "FAILED: ${unit}"
        FAILED=1
    else
        RESULT=$(systemctl show "${unit}" --property=ExecMainStatus | cut -d= -f2)
        echo "OK: ${unit} (last exit: ${RESULT})"
    fi
done

exit ${FAILED}
Key Takeaways
  • systemd timers replace cron for MySQL maintenance tasks by providing structured logging via journalctl, dependency ordering with After= and Requires=, and failure detection through OnFailure= — capabilities cron was never designed to provide.
  • Every scheduled task requires two unit files: a .service unit defining the script and its runtime constraints, and a .timer unit defining the schedule. The service can be triggered manually with systemctl start without waiting for the timer, which makes testing straightforward.
  • Use Type=oneshot for all MySQL maintenance scripts that run and exit. Use Persistent=true in timer units so that tasks missed during downtime or maintenance windows run immediately on next startup.
  • Distinguish carefully between Requires= (hard dependency, fails if MySQL is absent), Wants= (soft dependency, starts anyway), and After= (ordering only). For most maintenance tasks, After=mysqld.service combined with Wants=mysqld.service is the correct combination.
  • Use ExecStartPre= with a connectivity retry loop for tasks that must confirm MySQL is accepting connections before doing any work — particularly relevant for services that start close to system boot time.
  • Apply CPUQuota=, MemoryMax=, and IOSchedulingClass=idle in all maintenance service units to prevent log analysis and backup verification from competing with production MySQL workloads on shared hosts.
  • All unit output is captured by journald without any additional configuration. Use journalctl -u <unit> --since today for operational queries and systemctl list-timers "mysql-*" to verify scheduled runs at a glance.

Working with JusDB on MySQL Operations Automation

JusDB automates MySQL maintenance for engineering teams — backup scheduling, log rotation, health checks, and alerting — using systemd for reliable, auditable execution with full logging.

Explore JusDB MySQL Management →  |  Talk to a DBA

Related reading:

Share this article