Analytics & OLAP

TiDB Explained: The Complete Guide to Distributed SQL and HTAP

Explore TiDB distributed SQL database for HTAP workloads. Learn TiKV/TiFlash architecture, MySQL compatibility, horizontal scaling, and deployment strategies.

JusDB Team
September 6, 2022
4 min read
4031 views

Most databases force you to choose: either run a transactional system (MySQL, PostgreSQL) and build a separate analytics pipeline, or run an analytical warehouse (Redshift, ClickHouse) and accept that it's not great for transactions. TiDB was designed to eliminate that choice. It's a distributed SQL database that handles both workloads on the same dataset — writes go in via TiKV (the OLTP storage engine), analytics run on TiFlash (the columnar replica) — without any ETL between them.

That's a genuinely useful architecture for teams that are tired of maintaining dual-database systems. This guide explains how TiDB works, when it's the right choice, and the operational realities of running it.

TL;DR
  • TiDB = MySQL-compatible distributed SQL with built-in horizontal scaling and HTAP
  • TiKV handles OLTP writes with Raft consensus; TiFlash replicates data in columnar format for analytics
  • Drop-in MySQL replacement for most applications — same protocol, same SQL dialect
  • Best fit: applications that have outgrown a single MySQL node but want SQL semantics and ACID transactions at scale
  • Not the right choice if your workload fits on a single MySQL node — the distributed overhead isn't worth it below ~1TB or 10,000 QPS

Architecture: Three Layers That Do Different Jobs

TiDB separates concerns cleanly across three layers:

TiDB Server — Stateless SQL Layer

TiDB Server processes SQL, optimizes query plans, and coordinates distributed transactions. It's stateless — you can add or remove TiDB Server nodes without data movement. All clients connect to TiDB Server using the standard MySQL protocol (port 4000 by default).

code
# Connect to TiDB exactly like MySQL
mysql -h tidb-server-host -P 4000 -u root

# TiDB speaks MySQL protocol — existing tools work:
# MySQL Workbench, DBeaver, Sequelize, SQLAlchemy, JDBC, etc.

PD (Placement Driver) — Cluster Brain

PD manages the cluster: it stores metadata, allocates timestamps for transactions (TiDB uses MVCC with global timestamps), and tells TiKV how to distribute data. PD runs as 3 or 5 nodes for HA using Raft consensus. It's the cluster's single source of truth for topology and transaction ordering.

code
# Check cluster health via pd-ctl
pd-ctl -u http://pd-host:2379 store      # view all TiKV stores
pd-ctl -u http://pd-host:2379 health     # cluster health
pd-ctl -u http://pd-host:2379 region 1   # inspect a specific region

TiKV — Distributed OLTP Storage

TiKV stores data as sorted key-value pairs, sharded into Regions (default 96MB each). Each Region has 3 replicas distributed across TiKV nodes using Raft consensus. Writes go to the Raft leader; reads can be served from followers. When a Region grows beyond the target size, PD automatically splits it and rebalances.

code
# Check Region distribution
pd-ctl -u http://pd-host:2379 region --jq '.regions | length'

# View store utilization
pd-ctl -u http://pd-host:2379 store --jq '.stores[] | {id, capacity, used_size, region_count}'

TiFlash — Columnar OLAP Replica

TiFlash is a columnar storage engine that replicates data from TiKV in real time using the Raft learner role. It doesn't participate in write consensus — it just receives committed data from TiKV as fast as possible. For analytical queries, the TiDB query optimizer automatically routes to TiFlash when it detects a full-table scan or aggregation that benefits from columnar storage.

code
-- Enable TiFlash replication for a table
ALTER TABLE orders SET TIFLASH REPLICA 1;

-- Check replication progress
SELECT TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS
FROM information_schema.TIFLASH_REPLICA
WHERE TABLE_SCHEMA = 'myapp';

-- Force a query to use TiFlash (optimizer usually handles this automatically)
SELECT /*+ READ_FROM_STORAGE(TIFLASH[orders]) */
  region, SUM(total), COUNT(*)
FROM orders
WHERE order_date > '2025-01-01'
GROUP BY region;

Transactions and Consistency

TiDB implements distributed ACID transactions using a two-phase commit protocol over TiKV, coordinated by global timestamps from PD. For users, this is mostly transparent — standard SQL transactions work as expected:

code
-- Standard transactions work exactly like MySQL
BEGIN;
UPDATE accounts SET balance = balance - 500 WHERE id = 1001;
UPDATE accounts SET balance = balance + 500 WHERE id = 1002;
COMMIT;

-- Snapshot isolation is the default (repeatable read)
-- For serializable isolation:
SET GLOBAL transaction_isolation = 'SERIALIZABLE';

-- Check transaction status
SELECT * FROM information_schema.CLUSTER_PROCESSLIST;

Large transaction limits

TiDB has limits on transaction size that don't exist in single-node MySQL:

code
-- Default limits (configurable in tidb.toml):
-- tidb_txn_entry_size_limit: 6MB per key-value pair
-- tidb_txn_total_size_limit: 100MB total per transaction

-- For bulk operations, use batch mode:
-- Option 1: BATCH DML (TiDB-specific extension)
BATCH ON id LIMIT 10000 UPDATE orders SET status = 'archived'
WHERE created_at < '2023-01-01';

-- Option 2: Application-level chunking
-- Break large DELETEs/UPDATEs into chunks of 10,000 rows

Horizontal Scaling

Adding capacity to TiDB is simpler than most distributed databases — you add nodes and let PD rebalance automatically:

code
# Scale out TiKV (add storage + compute)
tiup cluster scale-out prod-cluster scale-out.yaml

# scale-out.yaml:
tikv_servers:
  - host: 10.0.1.14
    port: 20160
    data_dir: /data/tikv
    config:
      server.labels:
        zone: us-east-1c

# Scale out TiDB Server (add query capacity without data movement)
tidb_servers:
  - host: 10.0.1.15
    port: 4000

# PD automatically rebalances regions across new TiKV nodes
# Progress visible in TiDB Dashboard: http://pd-host:2379/dashboard

You can also scale down by evicting a node:

code
# Remove a TiKV store safely (PD moves its regions first)
pd-ctl -u http://pd-host:2379 store delete 5  # store ID 5

# Wait for region migration to complete (can take minutes to hours depending on data size)
pd-ctl -u http://pd-host:2379 store 5 | jq '.store.state_name'
# When returns "Tombstone", the node is safe to remove

Monitoring

TiDB ships with a pre-built Grafana dashboard and a built-in web UI (TiDB Dashboard):

code
-- Key SQL queries for monitoring:

-- Slow queries (built-in slow query log table)
SELECT query, query_time, mem_max, disk_max
FROM information_schema.SLOW_QUERY
WHERE is_internal = 0
ORDER BY query_time DESC
LIMIT 20;

-- Active transactions and locks
SELECT * FROM information_schema.CLUSTER_PROCESSLIST
WHERE command != 'Sleep'
ORDER BY time DESC;

-- TiKV Region health (all should be 0)
SELECT type, count FROM information_schema.TIKV_REGION_PEERS
WHERE state != 'normal';

-- Statement summary (aggregated performance stats like Performance Schema)
SELECT schema_name, digest_text, exec_count, avg_latency, max_latency
FROM information_schema.STATEMENTS_SUMMARY
WHERE schema_name = 'myapp'
ORDER BY avg_latency DESC
LIMIT 20;

TiDB vs MySQL: When to Choose Which

Factor TiDB MySQL (single node or InnoDB Cluster)
Data scaleMulti-TB to PB (scales horizontally)Up to ~10TB practically (single node)
Write throughputHigh (scales with TiKV nodes)High on single node; GR adds overhead
Analytics on live dataNative (TiFlash)Needs separate ETL + analytics DB
MySQL compatibilityHigh (some DDL differences)Native
Operational complexityHigher (PD, TiKV, TiFlash all to manage)Lower (familiar single-engine model)
Transaction size limits100MB per transaction (configurable)Effectively unlimited
Best forOutgrown MySQL, need HTAP, no sharding complexityMost OLTP workloads up to ~10TB

Our honest take: if your MySQL instance is under 5TB and your QPS is under 10,000, stay on MySQL. The distributed overhead of TiDB — extra network hops for every transaction, operational complexity of managing PD + TiKV + TiFlash — isn't worth it at that scale. TiDB shines when you've genuinely hit the ceiling of what a single MySQL node (or InnoDB Cluster) can do and need horizontal scaling without abandoning SQL semantics.


MySQL Compatibility: What Works and What Doesn't

TiDB is highly compatible with MySQL but has differences worth knowing before migrating:

code
-- Works in TiDB (same as MySQL):
-- SELECT, INSERT, UPDATE, DELETE
-- JOINs, subqueries, window functions
-- Stored procedures, triggers (limited)
-- Standard DDL (CREATE TABLE, ALTER TABLE)
-- JSON functions, full-text search

-- Differences to check:
-- AUTO_INCREMENT: TiDB uses a different allocation strategy
--   (not strictly sequential — intentional for distributed performance)
SET @@tidb_auto_id_cache = 1;  -- set to 1 for strictly sequential (slower)

-- Foreign keys: supported in TiDB 6.6+ with some limitations
SHOW VARIABLES LIKE 'foreign_key_checks';

-- LOCK TABLES: not supported (use SELECT ... FOR UPDATE instead)

-- Temporary tables: supported in TiDB 5.3+

-- Check compatibility before migrating:
-- Run: tiup compat check --host mysql-host --port 3306 --user root

Working with JusDB on TiDB

TiDB migrations from MySQL require more planning than a typical version upgrade. The distributed transaction model, Region sizing, and TiFlash replica management all need to be designed for your specific workload — not just set to defaults.

We help teams evaluate whether TiDB is the right move for their scale, design the migration path from MySQL, and operate TiDB clusters in production. Our TiDB consulting includes pre-migration compatibility assessment, cluster sizing for your data volume and QPS, and post-migration performance tuning.

If you're hitting the ceiling of MySQL at scale and evaluating your options — TiDB, Aurora, CockroachDB, or sharded MySQL — talk to us about what makes sense for your workload.

Related reading: MySQL InnoDB Cluster Guide | StarRocks for OLAP | MySQL Performance Tuning

Share this article

JusDB Team

Official JusDB content team