PostgreSQL

MySQL to PostgreSQL Migration: Schema, Data, and Application Code Guide

Migrate from MySQL to PostgreSQL using pgLoader. Covers schema differences, type casting, application code changes, validation, and zero-downtime migration strategies.

JusDB Team
March 5, 2026
11 min read
240 views

MySQL powered a generation of web applications, but its cracks show at scale: silent data truncation, weak JSON support, and a permissive default mode that swallows errors whole. PostgreSQL has become the default choice for engineering teams that need strict SQL compliance, advanced type support, and a replication model built for the modern cloud. Migrating is a one-time investment that pays dividends in reliability, developer ergonomics, and long-term infrastructure cost.

TL;DR
  • Use pgLoader for automated schema and data migration; it handles most MySQL-to-PostgreSQL type casting automatically.
  • Resolve schema differences first: AUTO_INCREMENTIDENTITY, TINYINT(1)BOOLEAN, DATETIMETIMESTAMPTZ.
  • pgLoader does not migrate stored procedures, triggers, or views — rewrite these in PL/pgSQL before cutover.
  • Enable strict SQL mode in MySQL for at least one week before migrating to surface silent data violations.
  • Validate row counts, data checksums, and query equivalence on a staging clone before touching production.

Why Migrate from MySQL to PostgreSQL?

The technical case for PostgreSQL has strengthened with every release. But the business case — especially at scale — is now equally compelling.

ACID compliance and MVCC. Both databases are ACID-compliant, but PostgreSQL’s Multi-Version Concurrency Control implementation is fundamentally different. In MySQL’s InnoDB, writers can block readers on certain operations. In PostgreSQL, readers never block writers and writers never block readers — each transaction sees a consistent snapshot of the database. For high-concurrency write workloads, this difference is measurable in wall-clock latency.

JSON/JSONB support. MySQL added JSON columns in 5.7, but the implementation is limited: you cannot index arbitrary nested paths efficiently, and JSON columns cannot be used in functional indexes. PostgreSQL’s JSONB type stores JSON in a decomposed binary format, supports GIN indexing on any nested key or array element, and has a rich operator set (@>, #>>, jsonb_path_query) that makes semi-structured queries genuinely fast.

Advanced native types. PostgreSQL ships with arrays, hstore (key-value), range types (tstzrange, int4range), native UUID, INET, and CIDR types. These let you enforce constraints and run queries directly on structured data without denormalizing into text columns or relying on application-layer parsing.

Logical replication. PostgreSQL’s publish/subscribe logical replication model lets you replicate individual tables, filter rows, and replicate between major versions. This is critical for zero-downtime major version upgrades and for feeding downstream analytical systems (Debezium, Kafka Connect, and most CDC tools have first-class PostgreSQL support).

Strict SQL standards compliance. MySQL’s default sql_mode silently truncates strings that exceed column length and accepts dates like 0000-00-00. PostgreSQL is strict by default and will raise an error rather than silently corrupt data. Teams frequently discover latent data bugs during migration that have been silently accumulating in MySQL for years.

Cost at scale. On AWS, Aurora PostgreSQL and Aurora MySQL are priced identically per ACU-hour, but PostgreSQL’s superior MVCC and connection pooling (via PgBouncer) typically allow higher throughput per instance size. At the upper end, Aurora PostgreSQL’s integration with Amazon RDS Proxy and its support for logical replication to Amazon Redshift can replace expensive ETL pipelines entirely.

Key Schema Differences to Resolve Before Migration

The most dangerous part of a MySQL-to-PostgreSQL migration is not the data movement — it is the schema incompatibilities that pgLoader handles silently with best-effort casts that may not match your application’s expectations. Review each of the following before running the migration tool.

1. AUTO_INCREMENT vs SERIAL / IDENTITY

MySQL’s AUTO_INCREMENT attribute has two PostgreSQL equivalents. The legacy approach uses the SERIAL pseudo-type (which creates a sequence under the hood). The modern, SQL-standard approach uses GENERATED ALWAYS AS IDENTITY.

sql
-- MySQL
CREATE TABLE users (
  id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100)
);

-- PostgreSQL equivalent (modern syntax)
CREATE TABLE users (
  id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  name VARCHAR(100)
);

Use GENERATED ALWAYS AS IDENTITY for new schemas. If you need to override the sequence (e.g., during data import), use GENERATED BY DEFAULT AS IDENTITY. pgLoader will use SERIAL by default; you can post-process the DDL to upgrade to IDENTITY columns.

2. TINYINT(1) as Boolean

MySQL has no native boolean type. The convention is TINYINT(1) with values 0 and 1. PostgreSQL has a proper BOOLEAN type that accepts TRUE/FALSE, 't'/'f', and 1/0.

sql
-- MySQL uses TINYINT(1) for booleans
is_active TINYINT(1) DEFAULT 0

-- PostgreSQL native BOOLEAN
is_active BOOLEAN DEFAULT FALSE

pgLoader’s tinyint-to-boolean cast handles this automatically, but verify any application code that compares these columns with integer arithmetic (WHERE is_active = 1 vs WHERE is_active IS TRUE).

3. DATETIME vs TIMESTAMP WITH TIME ZONE

MySQL’s DATETIME stores a calendar date and time with no timezone information. PostgreSQL’s TIMESTAMP is equivalent. However, in production you should almost always use TIMESTAMPTZ (alias for TIMESTAMP WITH TIME ZONE), which stores the value normalized to UTC and converts on display based on the session timezone.

sql
-- MySQL DATETIME has no timezone awareness
created_at DATETIME DEFAULT CURRENT_TIMESTAMP

-- PostgreSQL -- always use TIMESTAMPTZ in production
created_at TIMESTAMPTZ DEFAULT NOW()
Tip: Set timezone = 'UTC' in your postgresql.conf and standardize all application connections to UTC. This eliminates entire classes of bugs during DST transitions and cross-region deployments.

4. ENUM Handling

MySQL ENUMs are stored as integers internally and are notoriously difficult to alter (adding a value requires a full table rebuild in older versions). In PostgreSQL you have two better options: a CHECK constraint on a TEXT column (easy to alter) or a separate lookup table with a foreign key. Avoid using PostgreSQL’s native ENUM type for frequently-changing value sets — it has the same ALTER pain as MySQL.

sql
-- PostgreSQL: CHECK constraint approach (preferred for small, stable enums)
status TEXT NOT NULL CHECK (status IN ('pending', 'active', 'archived'))

-- PostgreSQL: lookup table approach (preferred for large or changing enums)
CREATE TABLE order_statuses (code TEXT PRIMARY KEY);
INSERT INTO order_statuses VALUES ('pending'), ('active'), ('archived');
ALTER TABLE orders ADD CONSTRAINT fk_status
  FOREIGN KEY (status) REFERENCES order_statuses(code);

5. Backtick Identifiers to Double Quotes

MySQL uses backticks to quote identifiers (`table_name`). PostgreSQL uses ANSI SQL double quotes ("table_name"). Most migration tools handle this, but check any raw SQL in your application code, stored procedures, and migration scripts.

6. UNSIGNED Integers

PostgreSQL has no unsigned integer types. The usual replacement is a CHECK (column >= 0) constraint, or moving to BIGINT if the full positive range of an unsigned 32-bit integer is needed.

sql
-- MySQL
quantity INT UNSIGNED NOT NULL DEFAULT 0

-- PostgreSQL
quantity INT NOT NULL DEFAULT 0 CHECK (quantity >= 0)

7. String Functions and Syntax Differences

Several MySQL-specific functions and syntax patterns need substitution in application queries and views:

  • IFNULL(a, b)COALESCE(a, b)
  • GROUP_CONCAT(col ORDER BY col SEPARATOR ',')STRING_AGG(col, ',' ORDER BY col)
  • LIMIT 10, 20 (offset-first syntax) → LIMIT 20 OFFSET 10
  • LIKE is case-insensitive in MySQL by default → use ILIKE in PostgreSQL for case-insensitive matching, or LIKE for case-sensitive

Migrating with pgLoader

pgLoader is an open-source migration tool written in Common Lisp that reads from MySQL (and other sources) and streams data directly into PostgreSQL with configurable type casting. For most schemas under 500 GB, a single pgLoader run is the fastest path to a working PostgreSQL clone.

Compared to alternatives: AWS DMS (Database Migration Service) is well-suited for heterogeneous migrations inside AWS infrastructure and supports ongoing replication, but requires careful endpoint configuration and can be expensive for large one-time migrations. ora2pg is purpose-built for Oracle-to-PostgreSQL and not the right tool here. For straightforward MySQL-to-PostgreSQL migrations on self-managed or non-AWS infrastructure, pgLoader is the most direct option.

Critical: MySQL’s default sql_mode silently truncates strings and accepts invalid dates. PostgreSQL is strict by default. Before migrating, run SET sql_mode='STRICT_ALL_TABLES,NO_ZERO_DATE' in MySQL for a week to surface all silent data violations before pgLoader encounters them. Fix each error at the source before proceeding.

Step 1: Install pgLoader

bash
# Ubuntu / Debian
sudo apt-get install pgloader

# macOS (Homebrew)
brew install pgloader

# Verify installation
pgloader --version

Step 2: Run a Basic Migration

For a simple schema with no stored procedures, the one-liner command is sufficient for an initial test run on a staging clone:

bash
pgloader mysql://mysqluser:pass@localhost/myapp \
         postgresql://pguser:pass@localhost/myapp

Step 3: Use a Configuration File for Production

Production migrations require explicit control over type casting, table exclusions, and memory settings. Create a .load configuration file:

ini
LOAD DATABASE
  FROM mysql://mysqluser:pass@mysql-host/myapp
  INTO postgresql://pguser:pass@pg-host/myapp

WITH include drop, create tables, create indexes,
     reset sequences, foreign keys

SET maintenance_work_mem to '512MB',
    work_mem to '128MB'

CAST type tinyint to boolean using tinyint-to-boolean,
     type datetime to timestamptz,
     type enum to text

EXCLUDING TABLE NAMES MATCHING 'temp_.*', 'cache_.*'
;
bash
pgloader migration.load
Warning: pgLoader handles most type conversions automatically but does NOT migrate MySQL stored procedures, triggers, or views. These must be manually rewritten in PL/pgSQL before going live. Audit your MySQL schema for these objects with SHOW PROCEDURE STATUS WHERE Db = 'myapp' and SHOW TRIGGERS FROM myapp before scheduling a cutover window.

Step 4: Monitor the Migration Run

pgLoader outputs a summary table at the end of each run showing rows loaded, errors, and timing per table. Any rows with casting errors are written to a reject file. Review pgloader.log and the reject file before validating row counts.

Application Code Changes

A successful data migration is only half the work. Application code that talked to MySQL must be updated to use PostgreSQL-compatible drivers, connection strings, and query patterns.

Connection Driver Changes

python
# MySQL (pymysql)
import pymysql
conn = pymysql.connect(host='localhost', db='myapp')

# PostgreSQL (psycopg2) -- synchronous
import psycopg2
conn = psycopg2.connect("host=localhost dbname=myapp user=pguser password=pass")

# PostgreSQL (asyncpg) -- async, high-performance
import asyncpg
conn = await asyncpg.connect('postgresql://pguser:pass@localhost/myapp')

For Django, update DATABASES['ENGINE'] from django.db.backends.mysql to django.db.backends.postgresql. For SQLAlchemy, change the connection URL scheme from mysql+pymysql:// to postgresql+psycopg2:// or postgresql+asyncpg://.

ORM Migration Updates

If you use Django migrations or Alembic, review auto-generated migration files after switching backends. Boolean fields previously backed by SmallIntegerField or IntegerField should be converted to BooleanField. Datetime fields should be marked as timezone-aware. Run makemigrations --check (Django) or alembic check after updating column type mappings.

Case Sensitivity and ILIKE

MySQL’s default collation is case-insensitive for LIKE and equality comparisons. PostgreSQL’s LIKE is case-sensitive. Replace case-insensitive searches with ILIKE, or add a functional index on lower(column) and match against lower(:param) for index-eligible case-insensitive lookups.

sql
-- MySQL (case-insensitive by default)
SELECT * FROM users WHERE email LIKE '%@example.com';

-- PostgreSQL (case-sensitive LIKE, case-insensitive ILIKE)
SELECT * FROM users WHERE email ILIKE '%@example.com';

-- Better: functional index for production use
CREATE INDEX idx_users_email_lower ON users (lower(email));
SELECT * FROM users WHERE lower(email) = lower('user@Example.com');

Transaction Isolation

MySQL’s default transaction isolation level is REPEATABLE READ. PostgreSQL’s default is READ COMMITTED. In PostgreSQL, even REPEATABLE READ is implemented via snapshots with no phantom reads, which is stronger than MySQL’s equivalent. Review any application code that explicitly sets isolation levels or relies on specific locking behavior (SELECT ... FOR UPDATE SKIP LOCKED is supported in both, but semantics differ).

Validating the Migration

Never cut over production traffic without completing all three layers of validation: row counts, data checksums, and query equivalence. Run these against a staging clone migrated from a recent production backup.

Row Count Validation

sql
-- PostgreSQL: approximate row counts from statistics (fast)
SELECT table_name, n_live_tup AS row_count
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;

-- Exact row counts -- run the same query on both MySQL and PostgreSQL
SELECT 'users' AS table_name, COUNT(*) FROM users
UNION ALL SELECT 'orders', COUNT(*) FROM orders
UNION ALL SELECT 'products', COUNT(*) FROM products;

Any table with a row count mismatch indicates either a pgLoader reject (check the reject file) or a filtering rule that excluded rows unexpectedly.

Data Checksums

Row counts confirm quantity but not content. Use an MD5 checksum over a sample of rows to verify data fidelity. Run equivalent queries on both databases — the hash should match if the data, ordering, and casting are identical.

sql
-- PostgreSQL: checksum a sample of data
SELECT md5(string_agg(id::text || name, ',' ORDER BY id))
FROM users LIMIT 10000;

-- Run the equivalent in MySQL and compare outputs
SELECT md5(group_concat(id, name ORDER BY id SEPARATOR ','))
FROM (SELECT id, name FROM users ORDER BY id LIMIT 10000) t;

Checksum mismatches most commonly arise from encoding differences (UTF-8 vs latin1), whitespace trimming behavior, or boolean casting (0/1 vs TRUE/FALSE string representations).

Query Equivalence

Run your application’s most critical read queries against both databases and compare result sets. Use your existing test suite if it covers database queries, or extract slow-query-log samples from MySQL and replay them against the PostgreSQL clone. Pay special attention to queries using ORDER BY without a deterministic sort key — PostgreSQL and MySQL can return ties in different orders.

Tip: Enable log_min_duration_statement = 0 in PostgreSQL during staging validation to capture all query plans. Use EXPLAIN (ANALYZE, BUFFERS) on any query that regresses in performance — missing indexes are the most common cause after migration.

Key Takeaways

Key Takeaways
  • Enable MySQL strict mode (STRICT_ALL_TABLES,NO_ZERO_DATE) in production for at least one week before migrating to surface silent data bugs that PostgreSQL will reject.
  • pgLoader automates schema DDL generation and data streaming with configurable type casts — use a .load config file for production migrations, not the one-liner command.
  • Stored procedures, triggers, and views are NOT migrated by pgLoader and must be manually rewritten in PL/pgSQL before cutover.
  • Application code changes are unavoidable: update connection drivers, replace MySQL-specific functions (IFNULLCOALESCE, GROUP_CONCATSTRING_AGG), and replace case-insensitive LIKE with ILIKE.
  • Validate with row counts, data checksums, and query equivalence on a staging clone before touching production. Never skip the checksum step.
  • PostgreSQL’s JSONB, array types, and logical replication model open up architectural patterns — CDC pipelines, analytical replication, connection multiplexing — that justify the migration cost many times over.

Working with JusDB on MySQL to PostgreSQL Migration

At JusDB, we have guided teams through MySQL-to-PostgreSQL migrations ranging from single-instance applications to multi-region Aurora clusters under sustained write loads. The pattern is always the same: the schema differences and application code changes are predictable and solvable; the surprises come from silent data violations accumulated over years of permissive MySQL defaults, and from stored procedure logic that no one on the current team fully understands.

Our migration engagements start with a schema and data audit before any migration tooling is run. We identify ENUM columns with values that have drifted, DATETIME columns with zero-dates, TINYINT columns that carry non-boolean semantics, and stored procedures that need to be rewritten or replaced with application-layer logic. This audit typically takes one to two days and prevents the migration weekend from turning into an incident.

We then run pgLoader against a staging clone, iterate on the casting rules, validate row counts and checksums, replay slow-query-log workloads, and identify any queries that need index additions or rewrites before the production cutover. For zero-downtime migrations, we configure logical replication from MySQL to PostgreSQL using AWS DMS in ongoing replication mode, keeping the PostgreSQL instance current until the final DNS cutover.

If your team is planning a MySQL to PostgreSQL migration — whether you’re moving off a self-managed instance, migrating from RDS MySQL to Aurora PostgreSQL, or consolidating a multi-database estate — our PostgreSQL managed services provide the engineering support and operational runbooks to do it safely. Get in touch to schedule a migration scoping call.

Related reading:

Share this article

JusDB Team

Official JusDB content team