Running queries across multiple databases used to mean writing ETL pipelines, scheduling cron jobs, and maintaining brittle sync logic that breaks the moment a schema changes. PostgreSQL Foreign Data Wrappers (FDWs) let you skip all of that — querying remote MySQL instances, flat CSV files, S3-hosted Parquet files, and even MongoDB collections as if they were ordinary local tables. The SQL interface stays the same, the data stays where it lives, and your application sees a single unified PostgreSQL connection. This post covers the four most production-relevant wrappers, their setup commands, and the performance realities you need to understand before shipping to production.
- FDWs let PostgreSQL query remote data sources (other Postgres instances, MySQL, CSV files, S3/Parquet) using standard SQL.
postgres_fdwsupports WHERE/JOIN/aggregate pushdown — use it for cross-Postgres federation.mysql_fdwbridges PostgreSQL and MySQL with read/write support.file_fdwexposes local CSV files as foreign tables with zero configuration.parquet_s3_fdwqueries Parquet files on S3 directly from PostgreSQL.- All FDWs carry network overhead — plan for it, and push filters to the remote server whenever possible.
- FDWs are not a replacement for proper replication; treat them as a federation and migration tool.
What Are Foreign Data Wrappers?
Foreign Data Wrappers are PostgreSQL's implementation of the SQL/MED (Management of External Data) standard, introduced in SQL:2003 and available in PostgreSQL since version 9.1. The architecture is straightforward: a FDW is a shared library that translates PostgreSQL's internal scan API into calls against an external data source. From the query planner's perspective, a foreign table looks like any other relation — it has a schema, statistics, and can participate in joins.
The core objects you will always create are:
- Extension — loads the FDW shared library.
- Foreign Server — defines the remote endpoint (host, port, database).
- User Mapping — maps a local PostgreSQL role to credentials on the remote server.
- Foreign Table — defines the local schema that maps to the remote table or file.
PostgreSQL ships two FDWs in its core distribution: postgres_fdw (for remote PostgreSQL servers) and file_fdw (for local files). Everything else — MySQL, MongoDB, S3, Redis — is provided by third-party extensions.
postgres_fdw: Federating PostgreSQL Instances
postgres_fdw is the most capable wrapper in the ecosystem. It connects to a remote PostgreSQL server over libpq and, critically, supports predicate pushdown, join pushdown, and aggregate pushdown — meaning the remote server does most of the work before sending rows back.
Installation and Setup
-- Step 1: Load the extension on the local server
CREATE EXTENSION postgres_fdw;
-- Step 2: Define the remote server
CREATE SERVER remote_pg
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'db2.internal', port '5432', dbname 'analytics');
-- Step 3: Map a local role to a remote user
CREATE USER MAPPING FOR reporting_user
SERVER remote_pg
OPTIONS (user 'analytics_reader', password 's3cr3t');
-- Step 4a: Import an entire remote schema at once
IMPORT FOREIGN SCHEMA public
FROM SERVER remote_pg
INTO local_analytics_schema;
-- Step 4b: Or define a single foreign table manually
CREATE FOREIGN TABLE orders_remote (
order_id bigint,
customer_id bigint,
total_cents integer,
created_at timestamptz
)
SERVER remote_pg
OPTIONS (schema_name 'public', table_name 'orders');IMPORT FOREIGN SCHEMA is the practical choice for any schema with more than a handful of tables — it introspects the remote catalog and creates all foreign table definitions automatically. Re-run it after remote schema changes to pick up new columns.
Pushdown Behavior
The single biggest performance lever in postgres_fdw is controlling what gets pushed to the remote server versus evaluated locally.
-- This WHERE clause is pushed to the remote server
-- Only matching rows are transferred over the network
SELECT order_id, total_cents
FROM orders_remote
WHERE created_at >= '2025-01-01'
AND total_cents > 10000;
-- Join pushdown: both tables on the same remote server,
-- the join executes remotely
SELECT o.order_id, c.email
FROM orders_remote o
JOIN customers_remote c ON o.customer_id = c.customer_id
WHERE o.created_at >= '2025-01-01';
-- Aggregate pushdown: GROUP BY and SUM run on the remote server
SELECT DATE_TRUNC('day', created_at) AS day,
SUM(total_cents) AS daily_revenue
FROM orders_remote
GROUP BY 1
ORDER BY 1;Enable use_remote_estimate = true on the server definition so PostgreSQL fetches remote row count statistics when planning queries. Without it, the planner uses a default estimate of 1000 rows and may choose poor join strategies.
ALTER SERVER remote_pg OPTIONS (ADD use_remote_estimate 'true');Functions that are not immutable or that depend on local session state (e.g., current_setting(), custom PL/pgSQL functions) will not be pushed down. The planner fetches all candidate rows and evaluates the function locally — potentially transferring millions of rows across the network for a filter that returns ten.
mysql_fdw: Bridging PostgreSQL and MySQL
mysql_fdw, maintained by EnterpriseDB, provides read and write access to MySQL and MariaDB tables. It is the most common FDW used during MySQL-to-PostgreSQL migrations because you can keep the MySQL instance running, query it live from PostgreSQL, and cut over at your own pace.
-- Install the extension (must be compiled against libmysqlclient)
CREATE EXTENSION mysql_fdw;
-- Define the MySQL server
CREATE SERVER mysql_crm
FOREIGN DATA WRAPPER mysql_fdw
OPTIONS (host '10.0.1.50', port '3306');
-- Map credentials
CREATE USER MAPPING FOR local_app_user
SERVER mysql_crm
OPTIONS (username 'crm_reader', password 'hunter2');
-- Create a foreign table mirroring the MySQL schema
CREATE FOREIGN TABLE contacts_mysql (
id int,
first_name varchar(100),
last_name varchar(100),
email varchar(255),
created_at datetime
)
SERVER mysql_crm
OPTIONS (dbname 'crm', table_name 'contacts');
-- Write back to MySQL (INSERT/UPDATE/DELETE are supported)
INSERT INTO contacts_mysql (id, first_name, last_name, email, created_at)
VALUES (9001, 'Ada', 'Lovelace', 'ada@example.com', NOW());MySQL's type system does not map perfectly to PostgreSQL. TINYINT(1) comes across as smallint, not boolean. DATETIME has no timezone, while PostgreSQL's timestamptz does. Always verify type mappings before assuming data integrity in migration scenarios.
file_fdw: Querying CSV Files with SQL
file_fdw ships with PostgreSQL and exposes files readable by the PostgreSQL server process as foreign tables. It is ideal for loading reference data, querying log exports, or giving analysts a SQL interface to CSV dumps without running an import.
CREATE EXTENSION file_fdw;
CREATE SERVER csv_files
FOREIGN DATA WRAPPER file_fdw;
-- The file must be readable by the postgres OS user
CREATE FOREIGN TABLE sales_export (
region text,
product_sku text,
quantity integer,
revenue numeric(12,2),
sale_date date
)
SERVER csv_files
OPTIONS (
filename '/var/lib/postgresql/exports/sales_2025.csv',
format 'csv',
header 'true',
delimiter ','
);
-- Query it like any other table
SELECT region, SUM(revenue) AS total_revenue
FROM sales_export
WHERE sale_date >= '2025-01-01'
GROUP BY region
ORDER BY total_revenue DESC;file_fdw is read-only. The file must reside on the PostgreSQL server's filesystem and must be accessible by the postgres OS user. It cannot read files from object storage directly — use parquet_s3_fdw or aws_s3 extensions for S3.
parquet_s3_fdw: Querying S3-Hosted Parquet Files
parquet_s3_fdw (the parquet-s3-fdw project) lets PostgreSQL query columnar Parquet files stored on Amazon S3, MinIO, or any S3-compatible object store. Because Parquet is columnar, the extension can perform column projection — only transferring the columns your query actually reads.
CREATE EXTENSION parquet_s3_fdw;
CREATE SERVER parquet_s3
FOREIGN DATA WRAPPER parquet_s3_fdw
OPTIONS (
use_minio 'false', -- set 'true' for MinIO
region 'us-east-1'
);
CREATE USER MAPPING FOR analytics_role
SERVER parquet_s3
OPTIONS (
access_key_id 'AKIAIOSFODNN7EXAMPLE',
secret_access_key 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'
);
CREATE FOREIGN TABLE events_parquet (
event_id bigint,
user_id bigint,
event_type text,
properties jsonb,
occurred_at timestamptz
)
SERVER parquet_s3
OPTIONS (filename 's3://data-lake-bucket/events/2025/*.parquet');
-- Column projection reduces data transfer significantly
SELECT user_id, COUNT(*) AS event_count
FROM events_parquet
WHERE occurred_at >= '2025-06-01'
AND event_type = 'purchase'
GROUP BY user_id
HAVING COUNT(*) > 5;Python-Based FDWs with Multicorn
When no compiled FDW exists for your target data source, Multicorn provides a Python API for writing custom wrappers. MongoDB, Redis, Elasticsearch, and REST APIs have all been implemented as Multicorn wrappers. The tradeoff is performance: Python function call overhead is significant compared to a compiled C extension. Use Multicorn for low-volume lookups and reference data, not for high-throughput analytical queries.
FDW Performance: What to Expect
FDWs are not a transparent performance layer. Every query against a foreign table incurs at minimum one network round trip to the remote server. Understanding where the overhead comes from lets you design around it.
Network and Serialization Overhead
Even with pushdown enabled, every result set crosses the network. A query returning 10 million rows from a remote table is slow — not because PostgreSQL is doing extra work, but because 10 million rows of serialized data must travel over the wire. Design FDW queries to return aggregates or narrow result sets.
Index Limitations
PostgreSQL cannot use indexes that exist on the remote server to satisfy local query plans. postgres_fdw pushes WHERE clauses to the remote server, which can then use its own indexes — but only for simple, pushdown-eligible predicates. Wrappers like file_fdw and parquet_s3_fdw have no index concept at all; they always perform full scans (though Parquet row group statistics can skip data blocks).
Avoid joining a large local table to a foreign table without a selective WHERE clause on the foreign side. PostgreSQL may choose a nested loop join that issues one remote query per local row — a classic N+1 pattern that can bring a database to its knees. Use EXPLAIN (ANALYZE, VERBOSE) to verify the join strategy before running in production.
Connection Pooling
postgres_fdw opens a persistent libpq connection per local backend, per remote server. On a busy application server with many short-lived connections, this multiplies the connection count on the remote server. Place a connection pooler (PgBouncer) in front of the remote server and keep keep_connections enabled on the server definition (the default since PostgreSQL 14).
-- Verify pushdown decisions for a query
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT region, SUM(total_cents)
FROM orders_remote
WHERE created_at >= '2025-01-01'
GROUP BY region;
-- Look for "Remote SQL" in the output to confirm pushdown:
-- Foreign Scan on orders_remote
-- Remote SQL: SELECT region, total_cents FROM public.orders
-- WHERE ((created_at >= '2025-01-01 00:00:00+00'))Use Cases and Limitations
Where FDWs Shine
- Data federation — Combining data from a production OLTP database with an analytical warehouse without running an ETL pipeline. Run the join in SQL and let pushdown do the heavy lifting.
- Migration phases — During a MySQL-to-PostgreSQL migration, keep the old MySQL instance as a foreign server. New code writes to PostgreSQL; legacy reads can still hit MySQL via
mysql_fdwuntil cutover. - Read-only replicas — Expose a read-only replica of a remote PostgreSQL instance as a foreign server, giving reporting users a SQL interface without granting direct access to the replica's credentials.
- Data lake queries — Query S3-hosted Parquet files from your PostgreSQL application without standing up a separate query engine like Athena or Trino.
Known Limitations
- FDWs do not participate in distributed transactions. A local transaction that writes to both a local table and a write-enabled foreign table has no two-phase commit unless you implement it manually.
- VACUUM and ANALYZE do not run on foreign tables. Statistics for the query planner must come from
use_remote_estimateor be set manually withALTER FOREIGN TABLE ... ALTER COLUMN ... SET STATISTICS. - Row-level security policies on local PostgreSQL do not apply to data returned from foreign servers — enforce access controls at the remote server level.
- Not all FDWs support
RETURNINGclauses,ON CONFLICT, orTRUNCATE.
Key Takeaways
- PostgreSQL FDWs expose remote data sources as local SQL tables using the SQL/MED standard — no ETL pipelines required.
- The setup pattern is always the same:
CREATE EXTENSION→CREATE SERVER→CREATE USER MAPPING→CREATE FOREIGN TABLE(orIMPORT FOREIGN SCHEMA). postgres_fdwsupports WHERE, JOIN, and aggregate pushdown — always verify withEXPLAIN VERBOSEand look for theRemote SQLline.mysql_fdwis write-enabled, making it the go-to choice for live MySQL-to-PostgreSQL migration phases.file_fdwis the fastest path to querying CSV exports with SQL — read-only, no external dependencies.parquet_s3_fdwenables columnar data lake queries directly from PostgreSQL without a separate query engine.- Network overhead is real — filter aggressively, return aggregates not raw rows, and monitor remote connection counts.
- FDWs are a federation and migration tool, not a replacement for streaming replication or a dedicated data warehouse.
Optimize Your PostgreSQL Deployment with JusDB
Setting up FDWs is the straightforward part. The harder challenges — right-sizing connection pools, tuning use_remote_estimate, ensuring pushdown actually fires for your workload, and monitoring cross-server query latency — require deep PostgreSQL expertise and proper observability tooling.
JusDB provides managed PostgreSQL with expert DBA support, covering FDW configuration, performance tuning, and ongoing query optimization. Whether you are federating multiple PostgreSQL instances, migrating off MySQL, or building a data lake query layer, our team has done it at scale.
Start your free trial at jusdb.com and get your PostgreSQL environment production-ready from day one.