JusDB LogoJusDB
Services
AboutBlogAutopilotContactGet Started
JusDB

JusDB

Uncompromised database reliability engineered by experts. Trusted by startups to enterprises worldwide.

Services

  • Remote DBA
  • 24/7 Monitoring
  • Performance Tuning & Security Audit
  • Database Support & Services

Company

  • About Us
  • Careers
  • Contact
  • Blog

Contact

  • contact@jusdb.com
  • +91-9994791055
  • Trichy, Tamil Nadu, India

© 2025 JusDB, Inc. All rights reserved.

Privacy PolicyTerms of UseCookies PolicySecurity

Real-World Examples of Scalable Database Schemas

August 16, 2025
5 min read
0 views

Table of Contents

Real-World Examples of Scalable Database Schemas

Learning from Production Systems: How Industry Leaders Scale Their Databases

Introduction: Theory Meets Practice

Understanding theoretical principles of database design is important, but seeing how these principles are applied in real-world scenarios provides invaluable insights. This post presents several examples of database schemas designed for scalability across different domains, illustrating specific challenges and how they were addressed through thoughtful schema design.

Key Insight: Each domain has unique scaling challenges, but common patterns emerge: strategic partitioning, selective denormalization, hybrid storage models, and tiered data management.

E-Commerce Platform Schema

E-commerce platforms must handle high transaction volumes, seasonal traffic spikes, and ever-growing product catalogs while maintaining fast response times.

Core Schema Design

text
-- Core product information (frequently accessed)
CREATE TABLE products (
    product_id BIGINT PRIMARY KEY,
    name VARCHAR(200) NOT NULL,
    base_price DECIMAL(10,2) NOT NULL,
    category_id INT NOT NULL,
    inventory_count INT NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    FOREIGN KEY (category_id) REFERENCES categories(category_id)
) PARTITION BY RANGE (product_id);

-- Detailed product information (less frequently accessed)
CREATE TABLE product_details (
    product_id BIGINT PRIMARY KEY,
    description TEXT,
    specifications JSON,
    dimensions VARCHAR(100),
    weight DECIMAL(8,2),
    FOREIGN KEY (product_id) REFERENCES products(product_id)
);

-- User data sharded by user ID ranges
CREATE TABLE users_shard_1 (
    user_id BIGINT PRIMARY KEY CHECK (user_id BETWEEN 1 AND 1000000),
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Orders partitioned by date
CREATE TABLE orders_2024_q4 (
    order_id BIGINT PRIMARY KEY,
    user_id BIGINT NOT NULL,
    order_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    status VARCHAR(20) NOT NULL DEFAULT 'pending',
    total_amount DECIMAL(10,2) NOT NULL,
    CHECK (order_date >= '2024-10-01' AND order_date < '2025-01-01')
);

Scalability Features

1. Vertical Partitioning

Product details are separated from core product information to optimize frequently accessed data:

  • Hot data: Product name, price, inventory (small, fast access)
  • Cold data: Descriptions, specifications (larger, less frequent access)

2. Denormalized Search Table

text
-- Optimized for product search and browsing
CREATE TABLE product_search (
    product_id BIGINT PRIMARY KEY,
    name VARCHAR(200) NOT NULL,
    category_name VARCHAR(100) NOT NULL,
    category_path VARCHAR(255) NOT NULL,
    price DECIMAL(10,2) NOT NULL,
    average_rating DECIMAL(3,2),
    review_count INT DEFAULT 0,
    search_vector TSVECTOR,
    FOREIGN KEY (product_id) REFERENCES products(product_id)
);

CREATE INDEX idx_product_search_vector ON product_search USING GIN(search_vector);
CREATE INDEX idx_product_search_category ON product_search(category_name, price);

3. Time-Based Order Partitioning

text
-- Efficient querying of recent orders (90% of queries)
CREATE TABLE orders_current_month PARTITION OF orders
    FOR VALUES FROM ('2024-11-01') TO ('2024-12-01');

-- Archived orders for reporting
CREATE TABLE orders_archive_2024_q1 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');

4. Caching Strategy

  • Product catalog: Redis cache with 1-hour TTL
  • User sessions: In-memory session store
  • Shopping carts: Separate fast-access store (Redis)
  • Inventory counts: Eventually consistent with background synchronization

Performance Optimizations

Component Optimization Strategy Impact
Product Catalog Read replicas + CDN 99% cache hit rate
User Orders Sharding by user_id Linear scaling
Inventory Eventual consistency High write throughput
Search Denormalized + full-text Sub-100ms response

Social Media Platform Schema

Social media platforms must handle complex relationships between users, high write volumes for content creation, and efficient content delivery.

Core Schema Design

text
-- Users sharded by user_id
CREATE TABLE users_shard_1 (
    user_id BIGINT PRIMARY KEY CHECK (user_id % 100 = 1),
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    profile_picture_url VARCHAR(500),
    bio TEXT,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Posts kept with their authors (same shard)
CREATE TABLE posts_shard_1 (
    post_id BIGINT PRIMARY KEY,
    user_id BIGINT NOT NULL CHECK (user_id % 100 = 1),
    content TEXT NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP,
    privacy_level TINYINT NOT NULL DEFAULT 0,
    FOREIGN KEY (user_id) REFERENCES users_shard_1(user_id)
);

-- Polymorphic likes table for multiple content types
CREATE TABLE likes (
    like_id BIGINT PRIMARY KEY,
    user_id BIGINT NOT NULL,
    content_type TINYINT NOT NULL, -- 1=post, 2=comment, 3=photo
    content_id BIGINT NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    UNIQUE KEY unique_like (user_id, content_type, content_id)
);

-- Pre-computed counters for performance
CREATE TABLE post_counters (
    post_id BIGINT PRIMARY KEY,
    like_count INT NOT NULL DEFAULT 0,
    comment_count INT NOT NULL DEFAULT 0,
    share_count INT NOT NULL DEFAULT 0,
    view_count INT NOT NULL DEFAULT 0,
    last_updated TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (post_id) REFERENCES posts_shard_1(post_id)
);

Scalability Features

1. Content Sharding Strategy

text
-- Shard determination function
function determine_shard(user_id) {
    return "shard_" + (user_id % 100);
}

-- Keeps user's data together for timeline queries
-- Shard 1: Users 1, 101, 201... + their posts, comments, likes
-- Shard 2: Users 2, 102, 202... + their posts, comments, likes

2. Feed Materialization

text
-- Pre-computed feeds for active users
CREATE TABLE user_feeds (
    feed_id BIGINT PRIMARY KEY,
    user_id BIGINT NOT NULL,
    post_id BIGINT NOT NULL,
    author_id BIGINT NOT NULL,
    score FLOAT NOT NULL, -- relevance/engagement score
    created_at TIMESTAMP NOT NULL,
    UNIQUE KEY unique_feed_item (user_id, post_id),
    INDEX idx_user_timeline (user_id, score DESC)
);

-- Background service populates feeds based on:
-- - Following relationships
-- - Engagement history
-- - Content freshness
-- - Trending topics

3. Asynchronous Counter Updates

text
-- Avoid locks on popular content
-- Write like event to queue immediately
-- Update counters in background batches

-- Example batch counter update
UPDATE post_counters 
SET like_count = like_count + batch_increment,
    last_updated = NOW()
WHERE post_id IN (batch_post_ids);

4. Graph Database Integration

text
// Neo4j for complex relationship queries
// Friend-of-friend recommendations
MATCH (user:User {user_id: 12345})-[:FOLLOWS]->(friend)-[:FOLLOWS]->(fof)
WHERE NOT (user)-[:FOLLOWS]->(fof)
RETURN fof.user_id, count(friend) AS common_connections
ORDER BY common_connections DESC
LIMIT 10

Performance Strategies

  • Timeline generation: Pre-computed for active users, on-demand for inactive
  • Hot content caching: Popular posts cached with high TTL
  • Media handling: CDN + object storage, metadata in database
  • Real-time features: WebSocket connections + message queues

IoT Data Platform Schema

IoT platforms must handle massive volumes of time-series data from millions of devices while supporting both real-time monitoring and historical analysis.

Core Schema Design

text
-- Organization-based isolation
CREATE TABLE organizations (
    org_id BIGINT PRIMARY KEY,
    name VARCHAR(200) NOT NULL,
    plan_level VARCHAR(20) NOT NULL,
    max_devices INT NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Device metadata (relatively static)
CREATE TABLE devices (
    device_id UUID PRIMARY KEY,
    org_id BIGINT NOT NULL,
    name VARCHAR(100) NOT NULL,
    device_type SMALLINT NOT NULL,
    location GEOGRAPHY(POINT),
    status SMALLINT NOT NULL DEFAULT 1,
    last_connected TIMESTAMPTZ,
    FOREIGN KEY (org_id) REFERENCES organizations(org_id)
);

-- Time-series data (high volume, time-partitioned)
CREATE TABLE sensor_data (
    reading_id BIGSERIAL PRIMARY KEY,
    device_id UUID NOT NULL,
    sensor_type SMALLINT NOT NULL,
    timestamp TIMESTAMPTZ NOT NULL,
    value DOUBLE PRECISION NOT NULL,
    quality SMALLINT NOT NULL DEFAULT 100
) PARTITION BY RANGE (timestamp);

-- Daily partitions for efficient data management
CREATE TABLE sensor_data_2024_11_15 PARTITION OF sensor_data
    FOR VALUES FROM ('2024-11-15') TO ('2024-11-16');

Scalability Features

1. Hypertable Implementation (TimescaleDB)

text
-- Convert to hypertable with automatic partitioning
SELECT create_hypertable('sensor_data', 'timestamp',
    chunk_time_interval => INTERVAL '1 day');

-- Add space partitioning by device_id for parallel processing
SELECT add_dimension('sensor_data', 'device_id', number_partitions => 16);

-- Automatic data retention
SELECT add_retention_policy('sensor_data', INTERVAL '30 days');

2. Continuous Aggregation

text
-- Automatic rollup of time-series data
CREATE MATERIALIZED VIEW sensor_data_hourly
WITH (timescaledb.continuous) AS
SELECT
    device_id,
    sensor_type,
    time_bucket('1 hour', timestamp) AS hour,
    MIN(value) AS min_value,
    MAX(value) AS max_value,
    AVG(value) AS avg_value,
    COUNT(*) AS sample_count
FROM sensor_data
GROUP BY device_id, sensor_type, hour;

-- Automatic refresh policy
SELECT add_continuous_aggregate_policy('sensor_data_hourly',
    start_offset => INTERVAL '3 days',
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour');

3. Multi-Tenant Schema Isolation

text
-- Each organization gets its own schema for isolation
CREATE SCHEMA org_12345;

-- Organization-specific device tables
CREATE TABLE org_12345.devices (
    device_id UUID PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    device_type SMALLINT NOT NULL,
    location GEOGRAPHY(POINT),
    status SMALLINT NOT NULL
);

-- Benefits:
-- - Strong tenant isolation
-- - Independent scaling per organization
-- - Easier compliance and data residency

4. Tiered Storage Strategy

Data Age Storage Tier Access Pattern Retention
0-7 days SSD (Hot) Real-time queries Raw data
7-30 days SSD (Warm) Recent analysis Raw data
30-365 days HDD (Cold) Historical reports Hourly aggregates
1-5 years Object Storage Compliance/backup Daily aggregates

Performance Optimizations

  • Batch inserts: Buffer sensor readings and insert in batches of 1000+
  • Compression: Automatic compression for historical partitions
  • Parallel queries: Leverage multiple CPU cores for aggregations
  • Approximate queries: Use sampling for dashboard visualizations

SaaS Application Platform Schema

SaaS platforms must support multi-tenancy while providing isolation, customization, and scalability for each tenant.

Core Schema Design

text
-- Tenant management
CREATE TABLE tenants (
    tenant_id BIGINT PRIMARY KEY,
    name VARCHAR(200) NOT NULL,
    subdomain VARCHAR(50) UNIQUE NOT NULL,
    plan_type VARCHAR(20) NOT NULL,
    max_users INT NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Schema-based multi-tenancy for large tenants
CREATE SCHEMA tenant_12345;

-- Tenant-specific project table
CREATE TABLE tenant_12345.projects (
    project_id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    description TEXT,
    created_by INT NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Shared table approach for smaller tenants
CREATE TABLE shared_projects (
    project_id SERIAL PRIMARY KEY,
    tenant_id BIGINT NOT NULL,
    name VARCHAR(100) NOT NULL,
    description TEXT,
    created_by INT NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (tenant_id) REFERENCES tenants(tenant_id)
);

Scalability Features

1. Hybrid Multi-Tenancy Strategy

text
-- Function to determine tenant storage strategy
CREATE OR REPLACE FUNCTION get_tenant_table(tenant_id INT, table_name TEXT)
RETURNS TEXT AS $
DECLARE
    tenant_plan TEXT;
BEGIN
    SELECT plan_type INTO tenant_plan FROM tenants WHERE tenant_id = tenant_id;
    
    IF tenant_plan IN ('enterprise', 'premium') THEN
        RETURN 'tenant_' || tenant_id || '.' || table_name;
    ELSE
        RETURN 'shared.' || table_name;
    END IF;
END;
$ LANGUAGE plpgsql;

-- Enterprise tenants get dedicated schemas
-- Basic tenants share tables with row-level security

2. Extensible Custom Fields

text
-- Allow tenants to define custom fields without schema changes
CREATE TABLE custom_fields (
    field_id SERIAL PRIMARY KEY,
    tenant_id BIGINT NOT NULL,
    entity_type VARCHAR(50) NOT NULL, -- 'project', 'task', etc.
    field_name VARCHAR(100) NOT NULL,
    field_type VARCHAR(50) NOT NULL, -- 'text', 'number', 'date', etc.
    is_required BOOLEAN NOT NULL DEFAULT FALSE,
    default_value TEXT,
    UNIQUE (tenant_id, entity_type, field_name)
);

CREATE TABLE custom_field_values (
    value_id SERIAL PRIMARY KEY,
    field_id INT NOT NULL,
    entity_id INT NOT NULL,
    value TEXT,
    FOREIGN KEY (field_id) REFERENCES custom_fields(field_id)
);

3. Tenant-Based Sharding

text
-- Distribution strategy based on tenant size
Large Enterprise Tenants: Dedicated database instances
Medium Tenants: Dedicated schemas on shared instances  
Small Tenants: Shared tables with row-level security

-- Example shard allocation:
Shard 1: Enterprise tenants A, B, C (dedicated)
Shard 2: Medium tenants D-M (dedicated schemas)
Shard 3: Small tenants N-Z (shared tables)

Performance Considerations

  • Connection pooling: Tenant-specific connection pools for large tenants
  • Query optimization: Automatic tenant_id filter injection
  • Resource quotas: CPU and memory limits per tenant
  • Background jobs: Tenant-aware job scheduling with priority

Financial Transaction Processing Schema

Financial systems must handle high transaction volumes with absolute data integrity, comprehensive audit trails, and complex reporting requirements.

Core Schema Design

text
-- Account management
CREATE TABLE accounts (
    account_id BIGINT PRIMARY KEY,
    account_number VARCHAR(20) UNIQUE NOT NULL,
    account_type VARCHAR(20) NOT NULL,
    currency CHAR(3) NOT NULL,
    current_balance DECIMAL(18,2) NOT NULL DEFAULT 0,
    status VARCHAR(20) NOT NULL DEFAULT 'active',
    owner_id BIGINT NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Transactions partitioned by date
CREATE TABLE transactions (
    txn_id BIGINT PRIMARY KEY,
    from_account BIGINT,
    to_account BIGINT,
    amount DECIMAL(18,2) NOT NULL,
    currency CHAR(3) NOT NULL,
    txn_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    status VARCHAR(20) NOT NULL DEFAULT 'pending',
    reference VARCHAR(100),
    FOREIGN KEY (from_account) REFERENCES accounts(account_id),
    FOREIGN KEY (to_account) REFERENCES accounts(account_id)
) PARTITION BY RANGE (txn_date);

-- Double-entry ledger entries
CREATE TABLE ledger_entries (
    entry_id BIGSERIAL PRIMARY KEY,
    txn_id BIGINT NOT NULL,
    account_id BIGINT NOT NULL,
    debit_amount DECIMAL(18,2) NOT NULL DEFAULT 0,
    credit_amount DECIMAL(18,2) NOT NULL DEFAULT 0,
    entry_date TIMESTAMP NOT NULL,
    balance_after DECIMAL(18,2) NOT NULL,
    FOREIGN KEY (txn_id) REFERENCES transactions(txn_id),
    FOREIGN KEY (account_id) REFERENCES accounts(account_id),
    CHECK (debit_amount >= 0 AND credit_amount >= 0),
    CHECK (debit_amount = 0 OR credit_amount = 0)
);

Scalability Features

1. Immutable Audit Trail

text
-- Comprehensive audit log for compliance
CREATE TABLE audit_trail (
    audit_id BIGSERIAL PRIMARY KEY,
    entity_type VARCHAR(50) NOT NULL,
    entity_id BIGINT NOT NULL,
    action VARCHAR(20) NOT NULL,
    old_values JSONB,
    new_values JSONB,
    user_id BIGINT,
    ip_address INET,
    timestamp TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
) PARTITION BY RANGE (timestamp);

-- Triggers capture all changes automatically
CREATE OR REPLACE FUNCTION audit_changes() RETURNS TRIGGER AS $
BEGIN
    INSERT INTO audit_trail (entity_type, entity_id, action, old_values, new_values, user_id)
    VALUES (TG_TABLE_NAME, NEW.id, TG_OP, row_to_json(OLD), row_to_json(NEW), current_user_id());
    RETURN NEW;
END;
$ LANGUAGE plpgsql;

2. Pre-Calculated Balance Snapshots

text
-- Daily account balance snapshots for fast statement generation
CREATE TABLE account_balances (
    balance_id BIGSERIAL PRIMARY KEY,
    account_id BIGINT NOT NULL,
    as_of_date DATE NOT NULL,
    opening_balance DECIMAL(18,2) NOT NULL,
    closing_balance DECIMAL(18,2) NOT NULL,
    total_debits DECIMAL(18,2) NOT NULL,
    total_credits DECIMAL(18,2) NOT NULL,
    UNIQUE (account_id, as_of_date),
    FOREIGN KEY (account_id) REFERENCES accounts(account_id)
);

-- Background job calculates daily balances
-- Enables instant account statements without scanning all transactions

3. Sharding by Account Range

text
-- Accounts and transactions kept together
Shard 1: Account numbers 1000000-1999999 + their transactions
Shard 2: Account numbers 2000000-2999999 + their transactions
Shard 3: Account numbers 3000000-3999999 + their transactions

-- Minimizes cross-shard operations for:
-- - Account-to-account transfers
-- - Balance inquiries  
-- - Transaction history

Common Patterns Across All Examples

1. Domain-Specific Partitioning

  • E-commerce: By date (orders) and category (products)
  • Social Media: By user_id for data locality
  • IoT: By time for data lifecycle management
  • SaaS: By tenant for isolation
  • Financial: By date and account for compliance

2. Strategic Denormalization

  • Counters and aggregates: Avoid expensive calculations
  • Search indexes: Eliminate complex joins
  • Materialized views: Pre-compute common queries
  • Cached feeds: Improve user experience

3. Hybrid Storage Models

  • Relational + Cache: Redis for hot data
  • Relational + Search: Elasticsearch for full-text
  • Relational + Graph: Neo4j for relationships
  • Relational + Time-series: InfluxDB/TimescaleDB for metrics

4. Tiered Data Management

  • Hot data: Recent, frequently accessed (SSD, memory)
  • Warm data: Less frequent access (standard storage)
  • Cold data: Archive, compliance (object storage)

Key Takeaways

Universal Principles:

  1. Start with your access patterns: Design around how data is actually used
  2. Embrace appropriate complexity: Simple solutions for simple problems, sophisticated solutions for scale
  3. Plan for evolution: Build in flexibility for changing requirements
  4. Monitor and measure: Data-driven optimization decisions
  5. Balance trade-offs: Consistency vs. performance, simplicity vs. scalability

Each of these real-world examples demonstrates that there's no one-size-fits-all solution to database scalability. The key is understanding your specific domain challenges and applying the right combination of techniques to address them effectively.


This concludes our comprehensive series on database schema design for scalability. Apply these patterns thoughtfully to build systems that grow with your needs.

Share this article

Search
Newsletter

Get the latest database insights and expert tips delivered to your inbox.

Categories
Database PerformanceDevOpsMongoDBMySQLPostgreSQLRedis
Popular Tags
MySQL
PostgreSQL
MongoDB
Redis
Performance
Security
Migration
Backup
Cloud
AWS
Azure
Stay Connected

Subscribe to our RSS feed for instant updates.

RSS Feed