ClickHouse is the fastest OLAP database available, capable of scanning billions of rows per second on commodity hardware. Here is how to set it up and use it for real-time analytics alongside your OLTP database.
ClickHouse vs Traditional Analytics
text
Benchmark: COUNT(*) on 1 billion rows
PostgreSQL: ~120 seconds (seq scan)
MySQL: ~180 seconds
ClickHouse: ~0.8 seconds (columnar, vectorized)Create a Table
sql
CREATE TABLE events (
event_date Date,
event_time DateTime,
user_id UInt64,
event_type LowCardinality(String),
page String,
session_id String
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, user_id, event_time);Ingest from MySQL via Materialized View
sql
-- Use MySQL table engine to read from MySQL
CREATE TABLE mysql_orders (
id UInt64, user_id UInt64, amount Float64, created_at DateTime
) ENGINE = MySQL('mysql-host:3306', 'myapp', 'orders', 'user', 'pass');
-- Materialize into ClickHouse MergeTree
INSERT INTO orders_analytics SELECT * FROM mysql_orders;Aggregating Functions
sql
-- Unique users per day (HyperLogLog approximation)
SELECT event_date, uniq(user_id) AS dau
FROM events
WHERE event_date >= today() - 30
GROUP BY event_date
ORDER BY event_date;
-- Quantiles (P50, P95, P99)
SELECT
quantile(0.5)(response_ms) AS p50,
quantile(0.95)(response_ms) AS p95,
quantile(0.99)(response_ms) AS p99
FROM api_logs
WHERE event_date = today();AggregatingMergeTree for Pre-Aggregation
sql
-- Pre-aggregate counts and sums for dashboard queries
CREATE MATERIALIZED VIEW daily_stats
ENGINE = AggregatingMergeTree()
ORDER BY (event_date, event_type)
AS SELECT
event_date,
event_type,
uniqState(user_id) AS unique_users,
countState() AS event_count
FROM events
GROUP BY event_date, event_type;
-- Query pre-aggregated data
SELECT event_date, event_type,
uniqMerge(unique_users) AS dau,
countMerge(event_count) AS events
FROM daily_stats
GROUP BY event_date, event_type
ORDER BY event_date DESC;Key Takeaways
- Use
LowCardinality(String)for columns with fewer than 10K distinct values — 3-5x faster - The ORDER BY in MergeTree is your primary sort key — choose it based on your most common filter columns
- AggregatingMergeTree materializes aggregates continuously, making dashboard queries instant
- ClickHouse reads only the columns a query needs — keep unused columns out of SELECT *
JusDB Can Help
ClickHouse can replace expensive data warehouse solutions for many analytics workloads. JusDB can design your ClickHouse schema and ingestion pipeline.