In the world of distributed systems and microservices architecture, UUIDs (Universally Unique Identifiers) have become the go-to solution for generating unique primary keys across multiple services. The appeal is obvious: generate globally unique identifiers without coordinating with a central database, simplify data merging across regions, and avoid exposing sequential business intelligence through your APIs.
But here's the uncomfortable truth we've witnessed across dozens of client deployments: UUIDs as primary keys can silently devastate your database performance, especially at scale.
The Allure of UUIDs
Before diving into the problems, let's understand why UUIDs became so popular:
Global Uniqueness: Generate IDs across distributed services without database round trips. Perfect for microservices architectures where multiple applications need to create records independently.
Data Merging Simplified: Multi-region deployments become easier when merging data from different databases doesn't require ID conflict resolution.
Security Through Obscurity: Sequential IDs like /users/12345 expose business metrics. UUIDs obscure this information.
These benefits are real, which makes the performance trade-offs easy to overlook—until your database starts grinding to a halt.
The Performance Catastrophe: Real Numbers
Let me share some eye-opening metrics from production systems we've analyzed:
Storage Overhead: The 4x to 20x Problem
A standard auto-incrementing integer (INT or BIGINT) consumes 32-64 bits per value. A UUID, even in optimized binary format (BINARY(16)), consumes 128 bits—4x larger. But most implementations use human-readable CHAR(36) format, consuming up to 288 bits, roughly 9x larger than a BIGINT.
In one client's 50-million-row table:
Integer primary key: ~240 MB index size
UUID (CHAR(36)) primary key: ~1.7 GB index size
That's a 7x increase in storage just for the primary key index. And remember, secondary indexes in InnoDB store the primary key value as a pointer, multiplying this overhead across all indexes.
The B+ Tree Balancing Nightmare
Here's where things get truly painful. MySQL and PostgreSQL store indexes in B+ tree structures, optimized for sequential insertions. When you insert sequential integers (1, 2, 3, 4...), the database simply appends new values to the rightmost leaf page of the tree. InnoDB fills pages to about 94% capacity before creating a new page.
With random UUIDs, chaos ensues:
Random Insertion Points: A UUID like
9b9b9b9b-9b9b-9b9b-9b9b-9b9b9b9b9b9bmight need to be inserted between existing values550e8400-e29b-41d4-a716-446655440000anda9a9a9a9-a9a9-a9a9-a9a9-a9a9a9a9a9a9Page Splitting: The database must split existing pages to make room, then rebalance the entire tree structure
Poor Page Utilization: Instead of 94% utilization, random UUIDs result in 50-70% page utilization, nearly doubling storage requirements
Constant Reorganization: Every insert potentially triggers rebalancing across multiple levels of the tree
On a high-volume OLTP system processing 10,000 inserts per second, this constant rebalancing creates significant I/O overhead and lock contention.
Real-World Performance Degradation
A recent client case study:
Before (Auto-increment INT):
INSERT operations: ~0.5ms average
SELECT on indexed column: ~0.1ms
Buffer pool hit ratio: 99.8%
After (UUID v4 as CHAR(36)):
INSERT operations: ~2.3ms average (460% slower)
SELECT on indexed column: ~0.8ms (700% slower)
Buffer pool hit ratio: 92.1% (7.7% more disk I/O)
Storage costs increased by 340%
Why UUIDs Kill Cache Efficiency
Modern databases rely heavily on caching frequently accessed pages in memory (InnoDB buffer pool, PostgreSQL shared buffers). With sequential keys, recently inserted rows cluster together in the same index pages. A query for recent records hits just a few cached pages.
With random UUIDs, temporal locality vanishes:
Records inserted at the same time scatter across the entire index
The entire index must stay in cache for good performance
Cache miss rates skyrocket as the table grows
More disk I/O means slower queries and higher latency
As one of our clients discovered, what worked fine with 1 million rows became unusably slow at 50 million rows—not because of scale, but because their working set exceeded available memory.
The String Comparison Tax
Beyond storage and indexing issues, UUIDs introduce computational overhead:
String operations are slower than integer operations, especially for:
WHEREclause comparisonsJOINoperations on foreign keysIndex lookups
Sorting and grouping
A query like:
WHERE id = '550e8400-e29b-41d4-a716-446655440000'
is significantly slower than:
WHERE id = 12345
At scale, across millions of queries per day, these microseconds compound into real performance degradation.
The JOIN Performance Penalty
In relational databases, foreign keys reference primary keys. When you use UUIDs as primary keys, your foreign keys also become UUIDs, propagating the performance problems:
SELECT o.order_id, c.customer_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date = '2024-01-01';
With UUID foreign keys, this join:
Scans larger indexes (4-9x more data)
Performs slower string comparisons
Requires more buffer pool memory
Generates more disk I/O
Multiply this across complex queries with multiple joins, and performance degrades rapidly.
Choosing the Right Integer Type: BIGINT vs BIGINT UNSIGNED
Before diving into solutions, an important note about integer primary keys: always use UNSIGNED for auto-incrementing primary keys.
Why UNSIGNED?
INT vs INT UNSIGNED:
INT: -2,147,483,648 to 2,147,483,647 (~2.1 billion positive values)INT UNSIGNED: 0 to 4,294,967,295 (~4.3 billion values)
BIGINT vs BIGINT UNSIGNED:
BIGINT: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807BIGINT UNSIGNED: 0 to 18,446,744,073,709,551,615 (18+ quintillion values)
Since auto-increment values are never negative, using UNSIGNED effectively doubles your capacity without any storage overhead—both use the same 8 bytes for BIGINT.
INT UNSIGNED vs BIGINT UNSIGNED: The Modern Choice
Historically, developers used INT (4 bytes) to save space over BIGINT (8 bytes). While this 50% reduction can save 10-20% on index sizes in heavily normalized databases, modern best practices favor BIGINT UNSIGNED:
Why BIGINT UNSIGNED is now standard:
Future-proofing: 4.3 billion rows might seem like a lot, but high-volume systems can hit this limit
No migration headaches: Migrating from
INTtoBIGINTon a billion-row table is painfulStorage is cheap: The performance difference is negligible on modern hardware
Industry standard: Rails 5.1+ (2017) made
BIGINTthe default for all primary keys
When to use INT UNSIGNED:
Tables guaranteed to never exceed 4.3 billion rows
Heavily normalized schemas with dozens of foreign keys where index size truly matters
Legacy applications where schema changes are prohibitively expensive
Our recommendation: Default to BIGINT UNSIGNED for all new projects unless you have specific constraints.
The Right Solution: Hybrid Approaches
The good news? You don't have to choose between performance and the benefits of UUIDs. Here are battle-tested approaches we recommend to clients:
Strategy 1: UUID as Secondary Key (Recommended)
CREATE TABLE users (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
uuid BINARY(16) UNIQUE NOT NULL DEFAULT (UUID_TO_BIN(UUID())),
email VARCHAR(255) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
INDEX idx_uuid (uuid)
) ENGINE=InnoDB;
Benefits:
Fast inserts and queries using integer primary key
Global uniqueness via UUID for API exposure and data merging
18+ quintillion possible IDs (you'll never run out)
Best of both worlds with minimal overhead
Usage pattern:
Internal queries use
idExternal API uses
uuidData sync between regions uses
uuid
Strategy 2: Ordered UUIDs (UUIDv7)
UUIDv7 incorporates timestamps in the most significant bits, providing natural ordering:
-- Using UUIDv7 (timestamp-based, ordered)
CREATE TABLE events (
id BINARY(16) PRIMARY KEY DEFAULT (UUID_TO_BIN(gen_uuid_v7())),
event_type VARCHAR(50),
payload JSON,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB;
Performance characteristics:
Much better than random UUIDs (v4)
Maintains sequential-like insertion patterns
Reduces page splitting and fragmentation
Still larger than integers (128 vs 64 bits)
Note: MySQL native UUIDv7 support is coming in future versions. Meanwhile, use user-defined functions or application-level generation.
Strategy 3: Composite Keys for Distributed Systems
For truly distributed scenarios:
CREATE TABLE distributed_events (
shard_id SMALLINT UNSIGNED NOT NULL,
sequence_id BIGINT UNSIGNED AUTO_INCREMENT,
uuid BINARY(16) UNIQUE NOT NULL,
event_data JSON,
PRIMARY KEY (shard_id, sequence_id)
) ENGINE=InnoDB;
This approach:
Maintains sequential ordering within shards
Scales horizontally across up to 65,535 shards
Provides UUIDs for cross-shard uniqueness
Optimizes for common query patterns
Migration Strategy: From UUID to Hybrid
If you're already using UUIDs, here's our recommended migration path:
Step 1: Add Integer Primary Key
ALTER TABLE users
ADD COLUMN id_new BIGINT UNSIGNED AUTO_INCREMENT UNIQUE,
ADD INDEX idx_id_new (id_new);
Step 2: Rename Columns
ALTER TABLE users
DROP PRIMARY KEY,
ADD PRIMARY KEY (id_new),
CHANGE COLUMN uuid uuid_external BINARY(16),
CHANGE COLUMN id_new id BIGINT UNSIGNED;
Step 3: Update Foreign Keys This requires careful planning and testing, potentially using dual-write patterns during transition.
Step 4: Update Application Layer
Internal queries use new integer
idExternal APIs continue using
uuid_externalImplement gradual rollout with monitoring
Storage Optimization: BINARY(16) vs CHAR(36)
If you must use UUIDs, the storage format matters significantly:
The Storage Breakdown
CHAR(36):
36 bytes with
CHARSET latin1orCHARSET ascii108-144 bytes with
CHARSET utf8mb4(default in MySQL 8.0+)2 bytes VARCHAR overhead
36 characters × 3-4 bytes per character
Human-readable in queries and dumps
Simple to work with in application code
BINARY(16):
16 bytes - the actual UUID size
55% smaller than
CHAR(36)with ASCII89% smaller than
CHAR(36)with utf8mb4Requires conversion functions (
UUID_TO_BIN(),BIN_TO_UUID())Not human-readable in raw queries
Performance Impact
The storage difference multiplies across:
Primary key index
All secondary indexes (which include the primary key in InnoDB)
Buffer pool memory utilization
Disk I/O for index operations
Example with 50 million rows:
CHAR(36)with utf8mb4: ~6.8 GB just for primary key indexBINARY(16): ~800 MB for primary key index8.5x difference in index size
Conversion Functions (MySQL 8.0+)
-- Create table with BINARY(16)
CREATE TABLE users (
id BINARY(16) PRIMARY KEY DEFAULT (UUID_TO_BIN(UUID())),
name VARCHAR(255) NOT NULL
) ENGINE=InnoDB;
-- Insert with conversion
INSERT INTO users (id, name)
VALUES (UUID_TO_BIN('550e8400-e29b-41d4-a716-446655440000'), 'John Doe');
-- Query with conversion
SELECT BIN_TO_UUID(id) AS id, name FROM users;
-- Where clause requires conversion
SELECT * FROM users WHERE id = UUID_TO_BIN('550e8400-e29b-41d4-a716-446655440000');
The Trade-offs
Use BINARY(16) when:
Working with large tables (millions of rows)
Performance and storage efficiency are critical
Your application layer can handle conversion
You're comfortable with conversion overhead in debugging
Use CHAR(36) when:
Small tables (<1 million rows) where overhead is negligible
Development/debugging convenience is prioritized
Your ORM doesn't handle binary UUIDs well
Legacy systems where changing would break integrations
Recommendation
For production systems at scale: Use BINARY(16) with proper conversion functions. The 55-89% storage savings compounds across indexes and dramatically improves cache efficiency.
For small applications or rapid prototyping: CHAR(36) is acceptable if the table will remain small. But plan migration paths if the application grows.
Best of both worlds: Use our recommended hybrid approach (BIGINT UNSIGNED primary key + BINARY(16) UUID as secondary unique key) where the UUID can be BINARY for storage efficiency while the primary key remains simple and performant.
Monitoring and Benchmarking
Track these metrics when evaluating UUID performance impact:
Storage Metrics:
Index size growth rate
Page fill factor (should be >85%)
Tablespace utilization
Performance Metrics:
Average query response time (especially JOINs)
INSERT throughput and latency
Buffer pool hit ratio
Disk I/O rates
Monitoring queries:
-- Index fragmentation check
SELECT
table_name,
index_name,
ROUND(data_length/1024/1024, 2) AS data_mb,
ROUND(index_length/1024/1024, 2) AS index_mb
FROM information_schema.tables
WHERE table_schema = 'your_database'
ORDER BY index_length DESC;
-- Page fill monitoring (requires sys schema)
SELECT
object_schema,
object_name,
index_name,
ROUND(avg_leaf_pages_per_segment, 2) AS avg_leaf_pages
FROM sys.x$schema_index_statistics
WHERE object_schema = 'your_database';
When UUIDs Make Sense
Despite the performance concerns, UUIDs are appropriate in specific scenarios:
Small to Medium Tables: Under 10 million rows, the performance impact is often negligible
Truly Distributed Writes: Systems with no central coordination point where multiple services must generate IDs independently
Append-Only Event Logs: When using UUIDv7 or similar ordered variants for event sourcing
Security-Critical Systems: Where exposing sequential IDs creates genuine security risks (combined with secondary UUID approach)
The Bottom Line
UUIDs as primary keys create a performance tax that compounds at scale. For most applications, especially those with:
High transaction volumes (>1000 inserts/sec)
Large datasets (>10 million rows)
Complex queries with multiple joins
Memory-constrained environments
The hybrid approach (auto-increment primary key + UUID secondary key) provides the best balance of performance and functionality.
JusDB's Recommendations
Based on our experience managing 100+ MySQL fleets in production:
Default to BIGINT UNSIGNED for auto-increment primary keys—never plain
BIGINTAdd UUIDs as secondary unique keys when global uniqueness is required
Use BINARY(16) for UUID storage in production systems at scale (use CHAR(36) only for small tables or development)
Use UUIDv7 if you must use UUIDs as primary keys
Monitor index fragmentation and page utilization metrics
Benchmark before migrating existing systems
Use INT UNSIGNED only when you're absolutely certain a table won't exceed 4.3 billion rows and need the space savings
Remember: premature optimization is the root of all evil, but so is premature pessimization. Choose your primary key strategy based on actual requirements, not assumptions.
Need help optimizing your database architecture? JusDB specializes in MySQL performance tuning, database migrations, and reliability engineering. Contact us at contact@jusdb.com for a free database health assessment.
Working with JusDB on Schema Performance
UUID primary key performance problems are one of the most common schema issues we see in production — and one of the easiest to design around upfront. We review primary key strategies as part of schema design work. If you're migrating from UUID to BIGINT or evaluating UUIDv7, reach out.
Related reading: Database Schema Design Fundamentals | MySQL Indexes Explained | Zero-Downtime Schema Migrations