Schema migrations are one of the most dangerous operations in a live PostgreSQL database — not because they corrupt data, but because a single poorly-timed ALTER TABLE can freeze your entire application for minutes. The lock that DDL acquires is not selective: it blocks every read and every write until it completes. If you are running a high-traffic service, that means a connection pile-up, timeouts, and an incident that looks exactly like a database outage even though nothing is broken. Understanding exactly which operations are safe and which are lethal in production is the difference between a smooth release and a 3 a.m. rollback.
- Most
ALTER TABLEstatements acquireACCESS EXCLUSIVE— the most aggressive PostgreSQL lock, blocking all reads and writes until the statement completes. - Always set
SET lock_timeout = '2s';before any DDL in production so a blocked migration fails fast instead of queuing and cascading into a connection pile-up. - Use
CREATE INDEX CONCURRENTLY,NOT VALIDconstraints, and the expand-contract pattern to make schema changes online. - PostgreSQL 11+ safely handles
ADD COLUMN ... DEFAULTin O(1); on PG10 and earlier it rewrites the entire table. pg_repackcan rebuild bloated tables and indexes entirely online — no table lock, no downtime.
Why ALTER TABLE Is Dangerous in Production
Running a schema change in development feels instant because the table is empty. In production the same statement can hold a lock for minutes on a table with hundreds of millions of rows. But it is not just about table size — even a fast DDL operation can cause an outage if it queues behind a long-running transaction and all subsequent connections stack up behind the queued DDL.
Lock Types in PostgreSQL
PostgreSQL has eight lock modes, from lightest to heaviest:
- ACCESS SHARE — acquired by
SELECT. Conflicts only withACCESS EXCLUSIVE. - ROW SHARE — acquired by
SELECT ... FOR UPDATE. - ROW EXCLUSIVE — acquired by
INSERT,UPDATE,DELETE. - SHARE UPDATE EXCLUSIVE — acquired by
VACUUM,ANALYZE,CREATE INDEX CONCURRENTLY, and someALTER TABLEforms. BlocksSHARE UPDATE EXCLUSIVEand heavier modes. - SHARE — acquired by
CREATE INDEX(non-concurrent). Blocks writes. - SHARE ROW EXCLUSIVE — acquired by
CREATE TRIGGER, some referential integrity checks. - EXCLUSIVE — rarely used directly by user SQL.
- ACCESS EXCLUSIVE — acquired by most
ALTER TABLEforms,DROP TABLE,TRUNCATE,REINDEX. Conflicts with every other lock mode — blocks all reads and writes.
What Blocks What
The practical rule is simple: most DDL you care about — adding a column with a default, changing a column type, adding a constraint — takes ACCESS EXCLUSIVE. That lock cannot be acquired until every open transaction on the table has finished. And while the DDL is waiting, every new SELECT and every new INSERT must wait behind it. A 10-second long-running report query holding a table lock means your DDL waits 10 seconds, and every application query that arrives during those 10 seconds also waits.
You can observe active blocking in real time with this query:
-- Check what's blocking/blocked right now
SELECT
pid, query, wait_event_type, wait_event,
pg_blocking_pids(pid) AS blocked_by
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;lock_timeout = '2s' before ALL schema migrations in production. Without it, a CREATE INDEX or ALTER TABLE that can't immediately acquire a lock will queue — and ALL subsequent queries will queue behind it, causing a connection pile-up that looks like a database outage.
-- Lock timeout: fail DDL rather than queue and block all connections
SET lock_timeout = '2s';
ALTER TABLE orders ADD COLUMN notes TEXT;
-- If lock can't be acquired in 2s, statement fails (no lingering queue)Safe DDL Operations (No Lock Required)
Several commonly-needed schema changes either acquire no significant lock or acquire only SHARE UPDATE EXCLUSIVE, which allows concurrent reads and writes to proceed.
-- ADD COLUMN without DEFAULT: safe in all PG versions (O(1), no table rewrite)
ALTER TABLE orders ADD COLUMN notes TEXT;
-- ADD COLUMN with DEFAULT: safe in PG11+ (O(1), stores default in catalog)
-- PG10 and earlier: this rewrites the entire table!
ALTER TABLE orders ADD COLUMN processing_fee DECIMAL(10,2) DEFAULT 0.00;
-- CREATE INDEX CONCURRENTLY: no table lock, allows reads/writes during build
CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status);
-- DROP INDEX CONCURRENTLY: no lock either
DROP INDEX CONCURRENTLY idx_orders_old;
-- ADD CONSTRAINT using NOT VALID (validate separately)
ALTER TABLE orders ADD CONSTRAINT fk_user
FOREIGN KEY (user_id) REFERENCES users(id) NOT VALID;
-- Later, during low-traffic period:
ALTER TABLE orders VALIDATE CONSTRAINT fk_user;CREATE INDEX CONCURRENTLY takes two passes over the table and builds the index while live traffic continues. It acquires only SHARE UPDATE EXCLUSIVE, which means reads and writes proceed normally the entire time. The trade-off: it takes roughly twice as long as a regular index build and can fail mid-way, leaving an invalid index behind that you need to drop and retry.
The NOT VALID flag on constraints is equally powerful. When you add a foreign key or check constraint with NOT VALID, PostgreSQL only validates new rows going forward — it does not scan existing rows. The subsequent VALIDATE CONSTRAINT does perform a table scan but acquires only SHARE UPDATE EXCLUSIVE, allowing concurrent writes. Combined, these two steps achieve the same end result as a standard constraint add, without ever blocking writes.
Risky DDL Operations and Their Safe Alternatives
The following operations are dangerous in production but each has a safe alternative that achieves the same goal without holding an ACCESS EXCLUSIVE lock for an extended period.
| Dangerous | Safe Alternative |
|---|---|
ALTER TABLE ADD COLUMN ... DEFAULT (PG10 and earlier) |
Upgrade to PG11+, or add column nullable, backfill in batches, then set default |
ALTER TABLE ALTER COLUMN TYPE |
Add new column, backfill in batches, sync with trigger, swap (expand-contract pattern) |
ALTER TABLE ADD CONSTRAINT (full validation) |
Use NOT VALID then VALIDATE CONSTRAINT separately |
CREATE INDEX (regular) |
CREATE INDEX CONCURRENTLY |
ALTER TABLE SET NOT NULL |
Add check constraint first with NOT VALID, validate it, then set NOT NULL |
ALTER TABLE ADD COLUMN ... DEFAULT rewrites the ENTIRE table. This can take hours on large tables and holds an ACCESS EXCLUSIVE lock the entire time. Always check your PostgreSQL version before running this in production. Use SELECT version(); to confirm.
Setting NOT NULL Without a Full Table Scan
Adding a NOT NULL constraint normally triggers a full sequential scan under ACCESS EXCLUSIVE. Here is the safe multi-step approach that avoids the full scan on older PostgreSQL versions:
-- Safe way to add NOT NULL constraint
-- For older versions, use a check constraint approach:
-- Step 1: Add check constraint (fast, not validated)
ALTER TABLE orders
ADD CONSTRAINT orders_user_id_not_null
CHECK (user_id IS NOT NULL) NOT VALID;
-- Step 2: Validate (does not hold ACCESS EXCLUSIVE)
ALTER TABLE orders VALIDATE CONSTRAINT orders_user_id_not_null;
-- Step 3: Drop check constraint, add NOT NULL
-- With the validated constraint in place, PG can skip the full table scan
ALTER TABLE orders ALTER COLUMN user_id SET NOT NULL;
ALTER TABLE orders DROP CONSTRAINT orders_user_id_not_null;pg_repack for Zero-Downtime Table Rewrites
pg_repack is a PostgreSQL extension that rebuilds tables and indexes online, reclaiming space from dead tuples — effectively what VACUUM FULL or CLUSTER does, but without holding an ACCESS EXCLUSIVE lock for the duration. It works by creating a new copy of the table in the background, applying changes via a trigger-based log, and then performing a brief lock-swap at the end.
The lock-swap at the end does acquire ACCESS EXCLUSIVE, but only for the time it takes to rename the tables — typically milliseconds, not minutes.
# Install pg_repack (works with PG10-17)
sudo apt-get install postgresql-17-repack
# Repack a bloated table without locking
pg_repack -h localhost -U postgres -d myapp -t orders
# Repack with index rebuild
pg_repack -h localhost -U postgres -d myapp -t orders --index idx_orders_status
# Repack only indexes (faster when table data is fine but indexes are bloated)
pg_repack -h localhost -U postgres -d myapp --only-indexes -t orderspg_repack is the right tool when:
- Your table has grown significantly through deletions or updates and
VACUUMcannot reclaim the space because of open long-running transactions pinning the oldest transaction horizon. - You need to change the physical storage order of a table (
CLUSTERequivalent) without downtime. - Index bloat has grown to the point where rebuild improves query performance significantly and you cannot take a maintenance window.
pgstattuple extension before deciding whether pg_repack is warranted. A table that is 30% dead tuples is a good candidate; a table at 5% is not worth the I/O cost of a full repack.
Adding Columns with Defaults in PostgreSQL 11+
PostgreSQL 11 introduced a significant optimization for ALTER TABLE ... ADD COLUMN ... DEFAULT. In PG10 and earlier, adding a column with a non-null default required rewriting every row in the table to store the new default value — an O(n) operation that held ACCESS EXCLUSIVE the entire time.
In PG11+, PostgreSQL stores the column default in the system catalog (pg_attrdef). When a row is read that was written before the column was added, PostgreSQL synthesizes the default value at read time. The table is not physically rewritten at all — the operation is O(1) regardless of table size.
-- PG11+: This is now O(1) — safe on tables of any size
ALTER TABLE orders ADD COLUMN processing_fee DECIMAL(10,2) DEFAULT 0.00 NOT NULL;
-- Verify your PostgreSQL version first
SELECT version();
-- Example output: PostgreSQL 16.2 on x86_64-pc-linux-gnu ...The caveat is NOT NULL: adding a column with DEFAULT and NOT NULL together is safe in PG11+. Adding NOT NULL to an existing nullable column without a default still triggers a table scan. The PG11 optimization applies specifically to the combination of a constant (non-volatile) default expression with the new column.
PostgreSQL 17 further improved SET NOT NULL to check only that no new rows can violate the constraint, delegating historical row validation to a previously validated check constraint — reducing lock hold time considerably on large tables.
Schema Migration Best Practices
Safe schema migrations at scale require more than knowing which SQL commands are safe. They require a repeatable process your team follows for every deployment.
1. Always set lock_timeout. Before every DDL statement in production, set SET lock_timeout = '2s';. If the lock cannot be acquired within 2 seconds, the statement fails with an error rather than waiting indefinitely and stacking up connections behind it.
2. Run migrations during low-traffic windows when possible. Even safe operations like CREATE INDEX CONCURRENTLY consume I/O. Running them during peak traffic competes with application queries. Schedule large backfill operations and index builds for off-peak periods.
3. Break large backfills into batches. When backfilling a new column across millions of rows, never run a single UPDATE orders SET new_col = ...; — that holds a transaction open for the entire duration and balloons WAL. Batch in chunks of 5,000–50,000 rows with a brief sleep between batches to avoid I/O saturation.
4. Test migrations on a production replica first. Restore a recent snapshot, run the migration against it, and measure actual wall-clock time and lock hold duration before applying to production.
5. Keep migrations reversible. In the expand-contract pattern, you have a natural rollback point at every step — you can always drop the new column or revert the application code before committing to the final swap. Design multi-step migrations so each step is independently deployable and independently reversible.
6. Monitor pg_stat_activity during the migration. Keep the lock-monitor query running in a separate session so you can immediately spot any unexpected blocking.
The Expand-Contract Pattern for Column Type Changes
Changing a column's type — for example from INTEGER to BIGINT, or from VARCHAR(50) to TEXT — is one of the most dangerous DDL operations because it rewrites every row in the table. The expand-contract pattern breaks this into safe, incremental steps:
-- Dangerous: ALTER TABLE orders ALTER COLUMN amount TYPE BIGINT;
-- Safe expand-contract approach:
-- Step 1: Add new column (O(1), no default needed)
ALTER TABLE orders ADD COLUMN amount_bigint BIGINT;
-- Step 2: Backfill in batches (non-locking, runs in the background)
DO $$
DECLARE batch_size INT := 10000; last_id BIGINT := 0;
BEGIN
LOOP
UPDATE orders SET amount_bigint = amount::BIGINT
WHERE id > last_id AND id <= last_id + batch_size AND amount_bigint IS NULL;
last_id := last_id + batch_size;
EXIT WHEN NOT FOUND;
PERFORM pg_sleep(0.1); -- small pause to avoid I/O saturation
END LOOP;
END;
$$;
-- Step 3: Add trigger to keep new column in sync during the deployment window
CREATE OR REPLACE FUNCTION sync_amount_bigint()
RETURNS TRIGGER AS $$
BEGIN NEW.amount_bigint := NEW.amount::BIGINT; RETURN NEW; END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER sync_amount BEFORE INSERT OR UPDATE ON orders
FOR EACH ROW EXECUTE FUNCTION sync_amount_bigint();
-- Step 4: Deploy application to write to and read from new column
-- Step 5: Drop old column and rename (single transaction — lock is brief)
BEGIN;
ALTER TABLE orders DROP COLUMN amount;
ALTER TABLE orders RENAME COLUMN amount_bigint TO amount;
DROP TRIGGER sync_amount ON orders;
COMMIT;The final swap in Step 5 does acquire ACCESS EXCLUSIVE, but the lock hold time is measured in milliseconds — just the time to update the catalog — because no data is being rewritten. The heavy work (reading and writing every row) happened in the batched backfill under no lock at all.
Key Takeaways
- ACCESS EXCLUSIVE blocks everything. Most
ALTER TABLEforms take this lock. On a busy table with long-running transactions, even a "fast" DDL can queue for a long time and cause a connection pile-up that looks like a full database outage. - lock_timeout is your safety net.
SET lock_timeout = '2s';before every DDL in production. A failed migration is recoverable; a frozen database is an incident. - PG11+ changes the calculus for ADD COLUMN. Adding a column with a constant default is O(1) in PG11+ and safe on tables of any size. On PG10 and earlier, it rewrites the table.
- Use CONCURRENTLY for indexes.
CREATE INDEX CONCURRENTLYtakes roughly 2x longer but never blocks reads or writes. There is almost no justification for a plainCREATE INDEXin production. - Expand-contract for type changes. Never use
ALTER COLUMN TYPEon a large table in production. Add new column, backfill in batches, sync with a trigger, then swap. - pg_repack for table bloat. When
VACUUMcannot reclaim dead space,pg_repackrebuilds the table online with only a millisecond-level lock at the final swap. - NOT VALID + VALIDATE is the safe constraint pattern. Split constraint additions into two steps: add with
NOT VALID(no scan), then validate separately (shared lock, no write blocking).
Working with JusDB on Zero-Downtime Deployments
Zero-downtime schema migrations require both knowledge of PostgreSQL internals and operational discipline — not just on the day of the migration, but in every step of your deployment pipeline. At JusDB, we help engineering teams build the processes, tooling, and guard-rails that make safe schema changes repeatable and systematic, not heroic one-off efforts.
We work with teams that are:
- Running PostgreSQL on self-managed EC2 or on RDS/Aurora and need to build a safe migration workflow from scratch.
- Preparing for a major PostgreSQL version upgrade that involves multiple schema changes across a large dataset.
- Dealing with accumulated table bloat where
VACUUMis no longer keeping up and need apg_repackstrategy that fits within SLA constraints. - Integrating database migrations into a CI/CD pipeline with automated lock-monitoring and rollback logic.
Our PostgreSQL consulting service covers schema migration audits, lock analysis, query optimization, and hands-on implementation of the patterns described in this post. If your team is about to run a high-stakes schema change on a production database, get in touch before you run it.
PostgreSQL Consulting Services | Talk to a PostgreSQL DBA
Related posts: