Schema design decisions made on day one follow you for years. A table with the wrong primary key strategy can make sharding impossible later. A poorly chosen data type means every query allocates more memory than it needs. A missing foreign key means your application has to enforce referential integrity in code — which it won't, consistently.
This guide covers the schema design decisions that matter most for scalability, with concrete guidance on what to do and — equally important — what common advice to ignore.
- Primary key strategy: use
BIGINT AUTO_INCREMENTfor MySQL InnoDB, not UUIDs (unless you're sharding across DBs) - Column widths:
VARCHAR(255)for unknown strings,VARCHAR(64)for email/usernames — don't oversize everything - Timestamps: always store in UTC, use
DATETIMEnotTIMESTAMPfor dates beyond 2038 - Soft deletes: a
deleted_atcolumn creates silent performance problems — understand the trade-off before using it - Indexes on foreign keys: MySQL does not automatically index FK columns, PostgreSQL does not either — add them explicitly
Primary Key Strategy: The Decision That Ages Poorly
The primary key choice in InnoDB (MySQL) affects more than uniqueness — it determines the physical storage order of the table (clustered index). Getting this wrong creates write hotspots, page fragmentation, and eventually makes the table hard to scale.
Auto-increment integers (recommended for most cases)
-- Good: sequential inserts = sequential writes = minimal page fragmentation CREATE TABLE orders ( id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY, customer_id INT UNSIGNED NOT NULL, created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP, status TINYINT NOT NULL DEFAULT 0, total DECIMAL(12,2) NOT NULL ); -- Use BIGINT (not INT) — you'll exceed 2 billion rows sooner than you think -- UNSIGNED doubles the range at no cost
UUIDs: when they help and when they hurt
-- BAD: random UUIDs as InnoDB primary keys CREATE TABLE events ( id CHAR(36) PRIMARY KEY DEFAULT (UUID()), -- random inserts = page fragmentation ... ); -- PROBLEM: each insert goes to a random page in the B-tree -- Result: buffer pool thrashing, 50-70% higher storage than BIGINT, -- dramatically worse write performance at scale -- BETTER: UUID v7 (time-ordered) if you need global uniqueness -- Available in MySQL 8.0.34+ via UUID_TO_BIN(UUID(), 1) CREATE TABLE events ( id BINARY(16) PRIMARY KEY DEFAULT (UUID_TO_BIN(UUID(), 1)), external_id VARCHAR(36) GENERATED ALWAYS AS (BIN_TO_UUID(id, 1)) VIRTUAL, ... ); -- Time-ordered UUIDs preserve locality — inserts are mostly sequential -- OR: use BIGINT for PK, add a UUID for external exposure CREATE TABLE events ( id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY, public_id CHAR(36) NOT NULL DEFAULT (UUID()) UNIQUE, -- expose this to clients ... );
See our deep dive: UUIDs as Primary Keys: How We Destroyed Database Performance — a real production case study.
PostgreSQL primary keys
-- PostgreSQL: use BIGSERIAL or GENERATED ALWAYS AS IDENTITY CREATE TABLE orders ( id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, customer_id BIGINT NOT NULL, created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); -- PostgreSQL 17+: use UUIDv7 for time-ordered global IDs CREATE TABLE events ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), -- UUIDv4 (avoid for hot tables) -- or in PG 17: id UUID PRIMARY KEY DEFAULT uuidv7() -- UUIDv7 (time-ordered, better) );
Data Type Selection: Every Byte Matters at Scale
Integers
-- Storage sizes and ranges: -- TINYINT: 1 byte, -128 to 127 (or 0-255 UNSIGNED) → status codes, boolean flags -- SMALLINT: 2 bytes, -32,768 to 32,767 → age, quantity (bounded) -- INT: 4 bytes, -2.1B to 2.1B → user IDs (medium scale) -- BIGINT: 8 bytes, -9.2 quintillion to 9.2Q → PKs, large counters, timestamps -- Common mistake: using INT for a PK then hitting 2.1 billion limit -- It happens. Use BIGINT for anything that might grow large. -- Boolean: use TINYINT(1) in MySQL (no native BOOLEAN, though the alias works) -- PostgreSQL: use native BOOLEAN status TINYINT(1) NOT NULL DEFAULT 0 -- MySQL active BOOLEAN NOT NULL DEFAULT TRUE -- PostgreSQL
Strings
-- VARCHAR vs CHAR: -- CHAR(n): fixed width, padded with spaces — use for fixed-format strings (country codes, codes) -- VARCHAR(n): variable width — use for most strings country_code CHAR(2) -- always exactly 2 chars: 'US', 'DE', 'JP' currency CHAR(3) -- always exactly 3: 'USD', 'EUR' email VARCHAR(254) -- RFC max is 254 chars username VARCHAR(64) -- practical limit description VARCHAR(1000) -- medium text content TEXT -- unbounded text (stored separately from row in InnoDB) -- Warning: VARCHAR(255) vs VARCHAR(256) matters in MySQL temp tables -- MySQL uses 2 bytes to store length for VARCHAR > 255 (vs 1 byte for <= 255) -- For sort buffers and temp tables: VARCHAR(255) is slightly more efficient -- TEXT vs VARCHAR for large content: -- TEXT columns are stored off-page in InnoDB — accessing them costs an extra I/O -- For frequently accessed short strings: keep in VARCHAR -- For document content, descriptions, markdown: TEXT is fine
Timestamps and dates
-- MySQL TIMESTAMP vs DATETIME: -- TIMESTAMP: stores UTC, converts to session timezone on read -- range: 1970-2038 (the Y2K38 problem is coming) -- DATETIME: stores literal datetime, no timezone conversion -- range: 1000-9999 -- Use DATETIME for application timestamps in MySQL: created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP -- Store everything in UTC at the application layer — don't rely on MySQL timezone conversion -- Set your connection timezone: SET time_zone = '+00:00'; -- PostgreSQL: use TIMESTAMPTZ (timestamp with time zone) for everything created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() -- For date-only values (no time component): birth_date DATE -- 3 bytes in MySQL, 4 bytes in PostgreSQL event_date DATE NOT NULL
Decimals and money
-- Never use FLOAT or DOUBLE for money — floating point errors -- 0.1 + 0.2 = 0.30000000000000004 in IEEE 754 -- Use DECIMAL for exact arithmetic: price DECIMAL(10, 2) -- up to 99,999,999.99 amount DECIMAL(15, 6) -- for currency with 6 decimal places (crypto etc.) -- Or store as integer cents/smallest unit (preferred for high-volume financial systems): price_cents INT UNSIGNED -- 100 = $1.00 (no floating point at all)
Normalization: When to Normalize, When to Denormalize
Normalization is good for data integrity. Denormalization is good for read performance. The right balance depends on your read/write ratio and query patterns.
3NF (normalized) — good for write-heavy transactional systems
-- Normalized order schema CREATE TABLE customers ( id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY, email VARCHAR(254) NOT NULL UNIQUE, name VARCHAR(128) NOT NULL ); CREATE TABLE orders ( id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY, customer_id BIGINT UNSIGNED NOT NULL, created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP, FOREIGN KEY (customer_id) REFERENCES customers(id) ); CREATE TABLE order_items ( id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY, order_id BIGINT UNSIGNED NOT NULL, product_id BIGINT UNSIGNED NOT NULL, quantity SMALLINT UNSIGNED NOT NULL, unit_price DECIMAL(10, 2) NOT NULL, -- snapshot price at time of order FOREIGN KEY (order_id) REFERENCES orders(id), FOREIGN KEY (product_id) REFERENCES products(id), INDEX idx_order (order_id) -- always index FK columns );
Denormalized — good for read-heavy analytics
-- Flattened order record (denormalized for reporting) CREATE TABLE order_facts ( order_id BIGINT UNSIGNED PRIMARY KEY, customer_id BIGINT UNSIGNED NOT NULL, customer_email VARCHAR(254) NOT NULL, -- duplicated from customers customer_name VARCHAR(128) NOT NULL, -- duplicated order_date DATE NOT NULL, total_amount DECIMAL(12, 2) NOT NULL, item_count SMALLINT UNSIGNED NOT NULL, status TINYINT NOT NULL, INDEX idx_customer (customer_id), INDEX idx_date (order_date), INDEX idx_status_date (status, order_date) ); -- No JOINs needed for reporting queries — much faster at scale -- Data duplication is the trade-off
Indexing Strategy: Add What You Need, Remove What You Don't
-- Rule 1: always index foreign key columns (MySQL does NOT do this automatically) CREATE TABLE order_items ( order_id BIGINT UNSIGNED NOT NULL, product_id BIGINT UNSIGNED NOT NULL, INDEX idx_order_id (order_id), -- explicit FK index INDEX idx_product_id (product_id) -- explicit FK index ); -- Rule 2: composite index column order matters -- For queries filtering on status and sorting by created_at: INDEX idx_status_date (status, created_at) -- equality first, range/sort second -- Rule 3: covering indexes eliminate table lookups -- This query can be served entirely from the index (no row access): SELECT order_id, status, created_at FROM orders WHERE customer_id = 1001; -- Add: INDEX idx_customer_covering (customer_id, status, created_at, order_id) -- Rule 4: indexes have write overhead — every index maintained on every write -- A table with 15 indexes on a write-heavy workload is paying a heavy price -- Audit unused indexes regularly: SELECT * FROM sys.schema_unused_indexes -- MySQL WHERE object_schema = 'myapp'; -- Rule 5: partial indexes for sparse conditions (PostgreSQL) CREATE INDEX idx_pending_orders ON orders (created_at) WHERE status = 'pending'; -- only indexes pending orders — much smaller index
Soft Deletes: The Hidden Performance Problem
Soft deletes (deleted_at IS NULL filter everywhere) are popular because they preserve history. But they create a silent, compounding performance problem that teams often don't notice until the table is large:
-- Soft delete pattern: ALTER TABLE users ADD COLUMN deleted_at DATETIME DEFAULT NULL; -- Every query now needs: WHERE deleted_at IS NULL -- Every index now has to include both active and deleted rows -- As deleted rows accumulate, every query scans more dead data -- VACUUM (PostgreSQL) and InnoDB purge still process these rows -- Table grows unboundedly even as "active" data stays constant -- Better approaches: -- Option 1: Archive table (move deleted rows to a separate table) INSERT INTO users_deleted SELECT * FROM users WHERE id = 1001; DELETE FROM users WHERE id = 1001; -- Option 2: Partial index (PostgreSQL — index only active rows) CREATE INDEX idx_active_users_email ON users (email) WHERE deleted_at IS NULL; -- This index only stores active rows — much smaller, much faster -- Option 3: Partition by status and DROP old partitions periodically -- Works well for event logs, audit tables, time-series data
Partitioning for Large Tables
-- MySQL range partitioning by date (good for time-series data)
CREATE TABLE events (
id BIGINT UNSIGNED AUTO_INCREMENT,
created_at DATETIME NOT NULL,
user_id BIGINT UNSIGNED NOT NULL,
event_type VARCHAR(64) NOT NULL,
payload JSON,
PRIMARY KEY (id, created_at) -- partition key must be in PK for MySQL
) PARTITION BY RANGE (YEAR(created_at) * 100 + MONTH(created_at)) (
PARTITION p202401 VALUES LESS THAN (202402),
PARTITION p202402 VALUES LESS THAN (202403),
PARTITION p202403 VALUES LESS THAN (202404),
PARTITION p_future VALUES LESS THAN MAXVALUE
);
-- Drop old partitions instantly (much faster than DELETE):
ALTER TABLE events DROP PARTITION p202401;
-- PostgreSQL declarative partitioning (cleaner syntax)
CREATE TABLE events (
id BIGSERIAL,
created_at TIMESTAMPTZ NOT NULL,
user_id BIGINT NOT NULL
) PARTITION BY RANGE (created_at);
CREATE TABLE events_2025_01 PARTITION OF events
FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
CREATE TABLE events_2025_02 PARTITION OF events
FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');
Schema Evolution: Migrations at Scale
-- Adding a column: safe (online DDL in MySQL 8+) ALTER TABLE orders ADD COLUMN notes TEXT DEFAULT NULL; -- Adding a NOT NULL column to a large table: dangerous without default -- MySQL will scan and update every row ALTER TABLE orders ADD COLUMN priority TINYINT NOT NULL DEFAULT 0; -- Use DEFAULT to avoid full table rewrite in MySQL 8.0+ -- Adding an index to a large table: use ALGORITHM=INPLACE ALTER TABLE orders ADD INDEX idx_status_date (status, created_at), ALGORITHM=INPLACE, LOCK=NONE; -- For very large tables (>100GB): use pt-online-schema-change or gh-ost -- These tools create a shadow table, sync data incrementally, then swap gh-ost --host=mysql-host --database=myapp --table=orders \ --alter="ADD INDEX idx_status_date (status, created_at)" \ --execute -- Always test DDL on a clone of production first: -- 1. Create snapshot of production -- 2. Run migration on snapshot, measure time -- 3. Plan maintenance window based on measured time -- 4. Run on production
Schema Review Checklist
| Check | What to verify |
|---|---|
| Primary keys | BIGINT (not INT), AUTO_INCREMENT or time-ordered UUIDs, not random UUIDs on hot tables |
| FK indexes | Every foreign key column has an explicit index |
| Column widths | No VARCHAR(255) on everything — size to actual expected data |
| Money columns | DECIMAL or integer cents — never FLOAT/DOUBLE |
| Timestamps | DATETIME (MySQL) or TIMESTAMPTZ (PostgreSQL), stored in UTC |
| Composite indexes | Equality columns before range/sort columns |
| Unused indexes | sys.schema_unused_indexes shows indexes never used since restart |
| Large table DDL | Plan uses ALGORITHM=INPLACE or gh-ost/pt-osc for tables > 10GB |
Working with JusDB
Schema design problems compound over time — a wrong primary key strategy at 100K rows is a minor issue; at 500M rows it's a migration project. We review schemas early in the development cycle as part of our database performance optimization service, and we help teams plan and execute large-table migrations safely using gh-ost and pt-online-schema-change.
If you're scaling an existing application and hitting schema-related performance problems, reach out for a schema review.
Related reading: UUIDs as Primary Keys: A Production Case Study | MySQL Indexes Deep Dive | PostgreSQL VACUUM Tuning