Database Architecture

Database Schema Design Fundamentals for Scalability

Learn the fundamental principles of database schema design for building scalable applications. This guide covers entity relationships, primary keys, foreign keys, and normalization concepts essential for high-performance databases.

JusDB Team
March 1, 2022
5 min read
2860 views

Schema design decisions made on day one follow you for years. A table with the wrong primary key strategy can make sharding impossible later. A poorly chosen data type means every query allocates more memory than it needs. A missing foreign key means your application has to enforce referential integrity in code — which it won't, consistently.

This guide covers the schema design decisions that matter most for scalability, with concrete guidance on what to do and — equally important — what common advice to ignore.

TL;DR — decisions with the highest long-term impact
  • Primary key strategy: use BIGINT AUTO_INCREMENT for MySQL InnoDB, not UUIDs (unless you're sharding across DBs)
  • Column widths: VARCHAR(255) for unknown strings, VARCHAR(64) for email/usernames — don't oversize everything
  • Timestamps: always store in UTC, use DATETIME not TIMESTAMP for dates beyond 2038
  • Soft deletes: a deleted_at column creates silent performance problems — understand the trade-off before using it
  • Indexes on foreign keys: MySQL does not automatically index FK columns, PostgreSQL does not either — add them explicitly

Primary Key Strategy: The Decision That Ages Poorly

The primary key choice in InnoDB (MySQL) affects more than uniqueness — it determines the physical storage order of the table (clustered index). Getting this wrong creates write hotspots, page fragmentation, and eventually makes the table hard to scale.

-- Good: sequential inserts = sequential writes = minimal page fragmentation
CREATE TABLE orders (
  id          BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  customer_id INT UNSIGNED NOT NULL,
  created_at  DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  status      TINYINT NOT NULL DEFAULT 0,
  total       DECIMAL(12,2) NOT NULL
);
-- Use BIGINT (not INT) — you'll exceed 2 billion rows sooner than you think
-- UNSIGNED doubles the range at no cost

UUIDs: when they help and when they hurt

-- BAD: random UUIDs as InnoDB primary keys
CREATE TABLE events (
  id  CHAR(36) PRIMARY KEY DEFAULT (UUID()),  -- random inserts = page fragmentation
  ...
);

-- PROBLEM: each insert goes to a random page in the B-tree
-- Result: buffer pool thrashing, 50-70% higher storage than BIGINT,
-- dramatically worse write performance at scale

-- BETTER: UUID v7 (time-ordered) if you need global uniqueness
-- Available in MySQL 8.0.34+ via UUID_TO_BIN(UUID(), 1)
CREATE TABLE events (
  id          BINARY(16) PRIMARY KEY DEFAULT (UUID_TO_BIN(UUID(), 1)),
  external_id VARCHAR(36) GENERATED ALWAYS AS (BIN_TO_UUID(id, 1)) VIRTUAL,
  ...
);
-- Time-ordered UUIDs preserve locality — inserts are mostly sequential

-- OR: use BIGINT for PK, add a UUID for external exposure
CREATE TABLE events (
  id          BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  public_id   CHAR(36) NOT NULL DEFAULT (UUID()) UNIQUE,  -- expose this to clients
  ...
);

See our deep dive: UUIDs as Primary Keys: How We Destroyed Database Performance — a real production case study.

PostgreSQL primary keys

-- PostgreSQL: use BIGSERIAL or GENERATED ALWAYS AS IDENTITY
CREATE TABLE orders (
  id          BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  customer_id BIGINT NOT NULL,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- PostgreSQL 17+: use UUIDv7 for time-ordered global IDs
CREATE TABLE events (
  id  UUID PRIMARY KEY DEFAULT gen_random_uuid(),  -- UUIDv4 (avoid for hot tables)
  -- or in PG 17:
  id  UUID PRIMARY KEY DEFAULT uuidv7()            -- UUIDv7 (time-ordered, better)
);

Data Type Selection: Every Byte Matters at Scale

Integers

-- Storage sizes and ranges:
-- TINYINT:   1 byte,  -128 to 127 (or 0-255 UNSIGNED)  → status codes, boolean flags
-- SMALLINT:  2 bytes, -32,768 to 32,767                 → age, quantity (bounded)
-- INT:       4 bytes, -2.1B to 2.1B                     → user IDs (medium scale)
-- BIGINT:    8 bytes, -9.2 quintillion to 9.2Q           → PKs, large counters, timestamps

-- Common mistake: using INT for a PK then hitting 2.1 billion limit
-- It happens. Use BIGINT for anything that might grow large.

-- Boolean: use TINYINT(1) in MySQL (no native BOOLEAN, though the alias works)
-- PostgreSQL: use native BOOLEAN
status TINYINT(1) NOT NULL DEFAULT 0   -- MySQL
active BOOLEAN NOT NULL DEFAULT TRUE   -- PostgreSQL

Strings

-- VARCHAR vs CHAR:
-- CHAR(n): fixed width, padded with spaces — use for fixed-format strings (country codes, codes)
-- VARCHAR(n): variable width — use for most strings

country_code CHAR(2)          -- always exactly 2 chars: 'US', 'DE', 'JP'
currency     CHAR(3)          -- always exactly 3: 'USD', 'EUR'
email        VARCHAR(254)     -- RFC max is 254 chars
username     VARCHAR(64)      -- practical limit
description  VARCHAR(1000)    -- medium text
content      TEXT             -- unbounded text (stored separately from row in InnoDB)

-- Warning: VARCHAR(255) vs VARCHAR(256) matters in MySQL temp tables
-- MySQL uses 2 bytes to store length for VARCHAR > 255 (vs 1 byte for <= 255)
-- For sort buffers and temp tables: VARCHAR(255) is slightly more efficient

-- TEXT vs VARCHAR for large content:
-- TEXT columns are stored off-page in InnoDB — accessing them costs an extra I/O
-- For frequently accessed short strings: keep in VARCHAR
-- For document content, descriptions, markdown: TEXT is fine

Timestamps and dates

-- MySQL TIMESTAMP vs DATETIME:
-- TIMESTAMP: stores UTC, converts to session timezone on read
--            range: 1970-2038 (the Y2K38 problem is coming)
-- DATETIME:  stores literal datetime, no timezone conversion
--            range: 1000-9999

-- Use DATETIME for application timestamps in MySQL:
created_at  DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
updated_at  DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP

-- Store everything in UTC at the application layer — don't rely on MySQL timezone conversion
-- Set your connection timezone: SET time_zone = '+00:00';

-- PostgreSQL: use TIMESTAMPTZ (timestamp with time zone) for everything
created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()

-- For date-only values (no time component):
birth_date  DATE             -- 3 bytes in MySQL, 4 bytes in PostgreSQL
event_date  DATE NOT NULL

Decimals and money

-- Never use FLOAT or DOUBLE for money — floating point errors
-- 0.1 + 0.2 = 0.30000000000000004 in IEEE 754

-- Use DECIMAL for exact arithmetic:
price       DECIMAL(10, 2)   -- up to 99,999,999.99
amount      DECIMAL(15, 6)   -- for currency with 6 decimal places (crypto etc.)

-- Or store as integer cents/smallest unit (preferred for high-volume financial systems):
price_cents INT UNSIGNED      -- 100 = $1.00 (no floating point at all)

Normalization: When to Normalize, When to Denormalize

Normalization is good for data integrity. Denormalization is good for read performance. The right balance depends on your read/write ratio and query patterns.

3NF (normalized) — good for write-heavy transactional systems

-- Normalized order schema
CREATE TABLE customers (
  id      BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  email   VARCHAR(254) NOT NULL UNIQUE,
  name    VARCHAR(128) NOT NULL
);

CREATE TABLE orders (
  id          BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  customer_id BIGINT UNSIGNED NOT NULL,
  created_at  DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (customer_id) REFERENCES customers(id)
);

CREATE TABLE order_items (
  id         BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  order_id   BIGINT UNSIGNED NOT NULL,
  product_id BIGINT UNSIGNED NOT NULL,
  quantity   SMALLINT UNSIGNED NOT NULL,
  unit_price DECIMAL(10, 2) NOT NULL,  -- snapshot price at time of order
  FOREIGN KEY (order_id) REFERENCES orders(id),
  FOREIGN KEY (product_id) REFERENCES products(id),
  INDEX idx_order (order_id)           -- always index FK columns
);

Denormalized — good for read-heavy analytics

-- Flattened order record (denormalized for reporting)
CREATE TABLE order_facts (
  order_id        BIGINT UNSIGNED PRIMARY KEY,
  customer_id     BIGINT UNSIGNED NOT NULL,
  customer_email  VARCHAR(254) NOT NULL,  -- duplicated from customers
  customer_name   VARCHAR(128) NOT NULL,  -- duplicated
  order_date      DATE NOT NULL,
  total_amount    DECIMAL(12, 2) NOT NULL,
  item_count      SMALLINT UNSIGNED NOT NULL,
  status          TINYINT NOT NULL,
  INDEX idx_customer (customer_id),
  INDEX idx_date (order_date),
  INDEX idx_status_date (status, order_date)
);
-- No JOINs needed for reporting queries — much faster at scale
-- Data duplication is the trade-off

Indexing Strategy: Add What You Need, Remove What You Don't

-- Rule 1: always index foreign key columns (MySQL does NOT do this automatically)
CREATE TABLE order_items (
  order_id   BIGINT UNSIGNED NOT NULL,
  product_id BIGINT UNSIGNED NOT NULL,
  INDEX idx_order_id (order_id),     -- explicit FK index
  INDEX idx_product_id (product_id)  -- explicit FK index
);

-- Rule 2: composite index column order matters
-- For queries filtering on status and sorting by created_at:
INDEX idx_status_date (status, created_at)  -- equality first, range/sort second

-- Rule 3: covering indexes eliminate table lookups
-- This query can be served entirely from the index (no row access):
SELECT order_id, status, created_at FROM orders WHERE customer_id = 1001;
-- Add: INDEX idx_customer_covering (customer_id, status, created_at, order_id)

-- Rule 4: indexes have write overhead — every index maintained on every write
-- A table with 15 indexes on a write-heavy workload is paying a heavy price
-- Audit unused indexes regularly:
SELECT * FROM sys.schema_unused_indexes  -- MySQL
WHERE object_schema = 'myapp';

-- Rule 5: partial indexes for sparse conditions (PostgreSQL)
CREATE INDEX idx_pending_orders ON orders (created_at)
WHERE status = 'pending';  -- only indexes pending orders — much smaller index

Soft Deletes: The Hidden Performance Problem

Soft deletes (deleted_at IS NULL filter everywhere) are popular because they preserve history. But they create a silent, compounding performance problem that teams often don't notice until the table is large:

-- Soft delete pattern:
ALTER TABLE users ADD COLUMN deleted_at DATETIME DEFAULT NULL;

-- Every query now needs: WHERE deleted_at IS NULL
-- Every index now has to include both active and deleted rows
-- As deleted rows accumulate, every query scans more dead data

-- VACUUM (PostgreSQL) and InnoDB purge still process these rows
-- Table grows unboundedly even as "active" data stays constant

-- Better approaches:

-- Option 1: Archive table (move deleted rows to a separate table)
INSERT INTO users_deleted SELECT * FROM users WHERE id = 1001;
DELETE FROM users WHERE id = 1001;

-- Option 2: Partial index (PostgreSQL — index only active rows)
CREATE INDEX idx_active_users_email ON users (email)
WHERE deleted_at IS NULL;
-- This index only stores active rows — much smaller, much faster

-- Option 3: Partition by status and DROP old partitions periodically
-- Works well for event logs, audit tables, time-series data

Partitioning for Large Tables

-- MySQL range partitioning by date (good for time-series data)
CREATE TABLE events (
  id          BIGINT UNSIGNED AUTO_INCREMENT,
  created_at  DATETIME NOT NULL,
  user_id     BIGINT UNSIGNED NOT NULL,
  event_type  VARCHAR(64) NOT NULL,
  payload     JSON,
  PRIMARY KEY (id, created_at)  -- partition key must be in PK for MySQL
) PARTITION BY RANGE (YEAR(created_at) * 100 + MONTH(created_at)) (
  PARTITION p202401 VALUES LESS THAN (202402),
  PARTITION p202402 VALUES LESS THAN (202403),
  PARTITION p202403 VALUES LESS THAN (202404),
  PARTITION p_future VALUES LESS THAN MAXVALUE
);

-- Drop old partitions instantly (much faster than DELETE):
ALTER TABLE events DROP PARTITION p202401;

-- PostgreSQL declarative partitioning (cleaner syntax)
CREATE TABLE events (
  id         BIGSERIAL,
  created_at TIMESTAMPTZ NOT NULL,
  user_id    BIGINT NOT NULL
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2025_01 PARTITION OF events
  FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');

CREATE TABLE events_2025_02 PARTITION OF events
  FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');

Schema Evolution: Migrations at Scale

-- Adding a column: safe (online DDL in MySQL 8+)
ALTER TABLE orders ADD COLUMN notes TEXT DEFAULT NULL;

-- Adding a NOT NULL column to a large table: dangerous without default
-- MySQL will scan and update every row
ALTER TABLE orders ADD COLUMN priority TINYINT NOT NULL DEFAULT 0;
-- Use DEFAULT to avoid full table rewrite in MySQL 8.0+

-- Adding an index to a large table: use ALGORITHM=INPLACE
ALTER TABLE orders
  ADD INDEX idx_status_date (status, created_at),
  ALGORITHM=INPLACE, LOCK=NONE;

-- For very large tables (>100GB): use pt-online-schema-change or gh-ost
-- These tools create a shadow table, sync data incrementally, then swap
gh-ost --host=mysql-host --database=myapp --table=orders \
  --alter="ADD INDEX idx_status_date (status, created_at)" \
  --execute

-- Always test DDL on a clone of production first:
-- 1. Create snapshot of production
-- 2. Run migration on snapshot, measure time
-- 3. Plan maintenance window based on measured time
-- 4. Run on production

Schema Review Checklist

CheckWhat to verify
Primary keysBIGINT (not INT), AUTO_INCREMENT or time-ordered UUIDs, not random UUIDs on hot tables
FK indexesEvery foreign key column has an explicit index
Column widthsNo VARCHAR(255) on everything — size to actual expected data
Money columnsDECIMAL or integer cents — never FLOAT/DOUBLE
TimestampsDATETIME (MySQL) or TIMESTAMPTZ (PostgreSQL), stored in UTC
Composite indexesEquality columns before range/sort columns
Unused indexessys.schema_unused_indexes shows indexes never used since restart
Large table DDLPlan uses ALGORITHM=INPLACE or gh-ost/pt-osc for tables > 10GB

Working with JusDB

Schema design problems compound over time — a wrong primary key strategy at 100K rows is a minor issue; at 500M rows it's a migration project. We review schemas early in the development cycle as part of our database performance optimization service, and we help teams plan and execute large-table migrations safely using gh-ost and pt-online-schema-change.

If you're scaling an existing application and hitting schema-related performance problems, reach out for a schema review.

Related reading: UUIDs as Primary Keys: A Production Case Study | MySQL Indexes Deep Dive | PostgreSQL VACUUM Tuning

Share this article

JusDB Team

Official JusDB content team