Database SRE

MySQL Monitoring with Prometheus and Grafana: Complete Setup Guide

mysqld_exporter exposes 300+ MySQL metrics for Prometheus. Learn to deploy it, configure essential alerting rules for connection saturation and replication lag, and build production dashboards in Grafana.

JusDB Team
January 6, 2025
8 min read
159 views

Your MySQL instance went down at 2 AM and nobody knew until users started complaining at 2:15 AM. A Prometheus + Grafana monitoring stack would have alerted you 20 minutes earlier when replication lag hit 60 seconds and connections maxed out. Here's how to build it in an afternoon.

TL;DR
  • mysqld_exporter exposes 300+ MySQL metrics as a Prometheus-scrapable endpoint
  • Five dashboards cover 95% of MySQL production incidents: connections, replication, InnoDB, queries, and slow log
  • Alert on: connections > 80% of max, replication lag > 30s, InnoDB buffer pool hit rate < 99%
  • Use Grafana dashboard ID 7362 (MySQL Overview) as your starting point

Setting Up mysqld_exporter

Step 1: Create the Monitoring User

sql
-- MySQL: create least-privilege monitoring user
CREATE USER 'exporter'@'localhost'
    IDENTIFIED BY 'strong_monitoring_password';

GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';
GRANT SELECT ON performance_schema.* TO 'exporter'@'localhost';
FLUSH PRIVILEGES;

Step 2: Deploy mysqld_exporter

bash
# Download and install
wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.15.1/mysqld_exporter-0.15.1.linux-amd64.tar.gz
tar -xzf mysqld_exporter-0.15.1.linux-amd64.tar.gz
mv mysqld_exporter-0.15.1.linux-amd64/mysqld_exporter /usr/local/bin/

# Create credentials file
cat > /etc/mysqld_exporter/.my.cnf < /etc/systemd/system/mysqld_exporter.service <

Step 3: Configure Prometheus Scrape

yaml
# prometheus.yml
scrape_configs:
  - job_name: mysql
    static_configs:
      - targets:
          - mysql-primary.internal:9104
          - mysql-replica1.internal:9104
          - mysql-replica2.internal:9104
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance

Essential Alerting Rules

yaml
# mysql-alerts.yml
groups:
  - name: mysql
    rules:

    # Connection saturation
    - alert: MySQLConnectionSaturation
      expr: |
        mysql_global_status_threads_connected /
        mysql_global_variables_max_connections > 0.8
      for: 2m
      labels: { severity: warning }
      annotations:
        summary: "MySQL connections at {{ $value | humanizePercentage }} of max"

    # Replication lag
    - alert: MySQLReplicationLag
      expr: mysql_slave_status_seconds_behind_master > 30
      for: 1m
      labels: { severity: critical }
      annotations:
        summary: "MySQL replica {{ $labels.instance }} is {{ $value }}s behind primary"

    # InnoDB buffer pool hit rate
    - alert: MySQLBufferPoolHitRateLow
      expr: |
        rate(mysql_global_status_innodb_buffer_pool_read_requests[5m]) /
        (rate(mysql_global_status_innodb_buffer_pool_read_requests[5m]) +
         rate(mysql_global_status_innodb_buffer_pool_reads[5m])) < 0.99
      for: 5m
      labels: { severity: warning }
      annotations:
        summary: "InnoDB buffer pool hit rate {{ $value | humanizePercentage }}"

    # Slow queries spike
    - alert: MySQLSlowQueriesSpike
      expr: rate(mysql_global_status_slow_queries[5m]) > 10
      for: 2m
      labels: { severity: warning }
      annotations:
        summary: "{{ $value }} slow queries/sec on {{ $labels.instance }}"

    # MySQL down
    - alert: MySQLDown
      expr: mysql_up == 0
      for: 30s
      labels: { severity: critical }
      annotations:
        summary: "MySQL instance {{ $labels.instance }} is DOWN"

Key Metrics to Dashboard

MetricPromQLHealthy Range
QPSrate(mysql_global_status_queries[1m])Workload-dependent
Connections usedmysql_global_status_threads_connected / mysql_global_variables_max_connections< 80%
Buffer pool hit rate1 - rate(reads[5m]) / rate(read_requests[5m])> 99%
Replication lagmysql_slave_status_seconds_behind_master< 5s
Aborted connectionsrate(mysql_global_status_aborted_connects[5m])< 1/s
Tip

Import Grafana dashboard ID 7362 (MySQL Overview by Percona) as your baseline. It includes all critical panels pre-built. Then add your custom alert annotations on top.

Key Takeaways
  • Deploy mysqld_exporter with a least-privilege exporter MySQL user — it needs PROCESS, REPLICATION CLIENT, and SELECT on performance_schema.
  • Alert on five signals: connection saturation, replication lag, buffer pool hit rate, slow query rate, and mysql_up.
  • Start with Grafana dashboard 7362 — it covers the 20 most important MySQL metrics without custom panel work.
  • Scrape all replicas separately — replication lag is only visible on the replica endpoint.

Working with JusDB on MySQL Monitoring

JusDB deploys Prometheus + Grafana monitoring stacks for MySQL deployments and configures alerting rules that catch problems before they become incidents. We build runbooks alongside every alert so on-call engineers know exactly what to do when a page fires.

Explore JusDB MySQL Services →  |  Talk to a DBA

Share this article

JusDB Team

Official JusDB content team