What if you could know whether a new index would actually be used by the query planner — before spending hours building it? On a 500GB production table, a misguided CREATE INDEX is not just wasted time; it locks resources, consumes disk, and may never improve a single query. That is exactly the problem hypopg solves. It lets you test hypothetical indexes in PostgreSQL without writing a single byte to disk, giving you planner-level evidence to guide index decisions before you commit.
- hypopg is a PostgreSQL extension that creates in-memory "hypothetical" indexes — the planner can see and use them in
EXPLAIN, but they do not exist on disk. - Use
hypopg_create_index()to define a hypothetical index and immediately runEXPLAINto see whether the planner would switch from a Seq Scan to an Index Scan. hypopg_relation_size()estimates how large the real index would be, so you can compare a full index against a partial index before committing disk space.- Always use
EXPLAIN, neverEXPLAIN ANALYZE— hypothetical indexes cannot execute; ANALYZE will ignore them. - Hypothetical indexes are session-scoped and auto-drop on disconnect; call
hypopg_reset()to clear them manually within a session.
What is hypopg?
hypopg is an open-source PostgreSQL extension developed by Julien Rouhaud that injects virtual index definitions into the planner's catalog without writing anything to disk. When you call hypopg_create_index(), PostgreSQL records metadata describing the index — column list, index type, predicate for partial indexes — and exposes it to the query planner exactly as a real index would be exposed. The planner evaluates these entries during cost estimation, so an EXPLAIN on any query will reflect whether that hypothetical index would be chosen.
Because no pages are written to disk and no index scans can actually execute, the only output you get is a cost estimate. That is all you need. If the planner's estimated cost drops by two orders of magnitude after you add the hypothetical index, you have strong evidence that building the real index will pay off. If the planner ignores it, you know the index would be useless before spending hours on CREATE INDEX CONCURRENTLY.
hypopg supports B-tree, GiST, GIN, and BRIN index types, covering the vast majority of indexing scenarios in production PostgreSQL workloads. It works with PostgreSQL 9.2 and above, including the current PostgreSQL 17 release.
Installing hypopg
On Ubuntu or Debian, the simplest path is the official PostgreSQL apt repository package:
# Ubuntu/Debian (PostgreSQL 17)
sudo apt-get install postgresql-17-hypopg
# Or build from source
git clone https://github.com/HypoPG/hypopg
cd hypopg && make && sudo make installAfter the shared library is installed, enable the extension inside the target database:
-- Enable in database
CREATE EXTENSION hypopg;No postgresql.conf changes or server restarts are required. The extension is fully session-scoped: hypothetical indexes created in one session are invisible to all other sessions and are automatically dropped when the session terminates.
Creating and Testing Hypothetical Indexes
Step 1 — Create a hypothetical index
Pass any valid CREATE INDEX statement as a string to hypopg_create_index(). The function returns the internal indexrelid and a generated name:
-- Create a hypothetical index (not actually built on disk)
SELECT * FROM hypopg_create_index('CREATE INDEX ON orders(customer_id, created_at DESC)');
-- Returns: indexrelid | indexname
-- ─────────────────────────────────────────
-- 24601 | <24601>btree_orders_customer_id
-- Create a partial hypothetical index
SELECT * FROM hypopg_create_index(
'CREATE INDEX ON orders(status) WHERE status IN (''pending'', ''processing'')'
);
-- Create hypothetical index on expression
SELECT * FROM hypopg_create_index(
'CREATE INDEX ON users(lower(email))'
);
-- List all hypothetical indexes
SELECT * FROM hypopg_list_indexes();Step 2 — Run EXPLAIN before and after
EXPLAIN (not EXPLAIN ANALYZE) when testing hypothetical indexes. EXPLAIN ANALYZE actually executes the query and cannot use hypothetical indexes — it will revert to the real (non-indexed) plan while reporting the hypothetical plan's cost estimates. This is confusing and misleading.
-- Before hypothetical index
EXPLAIN SELECT * FROM orders WHERE customer_id = 12345 AND created_at > NOW() - INTERVAL '30 days';
-- Seq Scan on orders (cost=0.00..45231.00 rows=847 width=89)
-- Filter: ((customer_id = 12345) AND (created_at > ...))
-- After creating hypothetical index
SELECT * FROM hypopg_create_index('CREATE INDEX ON orders(customer_id, created_at DESC)');
EXPLAIN SELECT * FROM orders WHERE customer_id = 12345 AND created_at > NOW() - INTERVAL '30 days';
-- Index Scan using <24601>btree_orders_customer_id on orders (cost=0.56..89.23 rows=847 width=89)
-- Index Cond: ((customer_id = 12345) AND (created_at > ...))
-- Cost dropped from 45231 → 89: 508x improvementThe planner switched from a Seq Scan costing 45,231 to an Index Scan costing 89 — a 508x estimated improvement. That single EXPLAIN output is your green light to build the real index.
Step 3 — Estimate index size
Before committing disk space, use hypopg_relation_size() to get an estimated byte count for the real index:
-- hypopg can also estimate how large the real index would be
SELECT hypopg_relation_size(indexrelid) AS estimated_index_size_bytes,
pg_size_pretty(hypopg_relation_size(indexrelid)) AS human_readable
FROM hypopg_list_indexes();Step 4 — Compare full vs. partial indexes
One of the most practical uses of hypopg is comparing a full index against a partial index to find the smallest index that still gets used by the planner:
-- Full index
SELECT * FROM hypopg_create_index('CREATE INDEX ON events(type, created_at)');
EXPLAIN SELECT * FROM events WHERE type = 'click' AND created_at > NOW() - INTERVAL '1 day';
SELECT hypopg_relation_size(indexrelid) FROM hypopg_list_indexes();
SELECT hypopg_reset();
-- Partial index (smaller, faster for high-cardinality filter)
SELECT * FROM hypopg_create_index(
'CREATE INDEX ON events(created_at) WHERE type = ''click'''
);
EXPLAIN SELECT * FROM events WHERE type = 'click' AND created_at > NOW() - INTERVAL '1 day';
SELECT hypopg_relation_size(indexrelid) FROM hypopg_list_indexes();
SELECT hypopg_reset();
-- Compare: partial index is typically 70-90% smallerIf both plans show an Index Scan with similar cost, take the partial index. You get the same query plan improvement at a fraction of the disk footprint and maintenance overhead.
Real-World Workflow: Index Impact Analysis Before Deployment
The most valuable workflow combines pg_stat_statements with hypopg to build an evidence-based index proposal before touching production.
-- Step 1: Find slow queries from pg_stat_statements
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
WHERE mean_exec_time > 1000 -- queries averaging > 1 second
ORDER BY total_exec_time DESC
LIMIT 5;
-- Step 2: Run EXPLAIN on the slow query
EXPLAIN SELECT o.*, u.email
FROM orders o JOIN users u ON u.id = o.user_id
WHERE o.status = 'pending' AND o.created_at > NOW() - INTERVAL '7 days'
ORDER BY o.created_at DESC;
-- Shows Seq Scan (expensive)
-- Step 3: Create hypothetical indexes and test
SELECT * FROM hypopg_create_index('CREATE INDEX ON orders(status, created_at DESC)');
SELECT * FROM hypopg_create_index('CREATE INDEX ON orders(user_id) WHERE status = ''pending''');
EXPLAIN SELECT o.*, u.email
FROM orders o JOIN users u ON u.id = o.user_id
WHERE o.status = 'pending' AND o.created_at > NOW() - INTERVAL '7 days'
ORDER BY o.created_at DESC;
-- Now shows Index Scan
-- Step 4: Compare costs
-- Step 5: Drop hypothetical indexes
SELECT hypopg_drop_index(indexrelid) FROM hypopg_list_indexes();
-- Or drop all at once:
SELECT hypopg_reset();
-- Step 6: Create the winning index for real
CREATE INDEX CONCURRENTLY idx_orders_status_created
ON orders(status, created_at DESC);CREATE INDEX CONCURRENTLY on a 500GB table that takes 4 hours to build. Verify the planner will actually use your new index on the specific queries you care about before committing to the build time.
This workflow takes five minutes and produces hard numbers. You can present the EXPLAIN output as a before/after diff in a deployment proposal, giving reviewers confidence without requiring a staging environment with production-scale data.
hypopg vs EXPLAIN with Real Indexes — When Each Applies
hypopg is not a replacement for testing with real indexes. It is a fast first pass that filters out bad ideas before you invest real time. Here is when to use each approach:
| Scenario | Use hypopg | Use real index |
|---|---|---|
| Table is > 100GB and index build takes hours | Yes — validate first | Only after hypopg confirms |
| Comparing 5 candidate index designs quickly | Yes — iterate in seconds | Too slow for iteration |
| Verifying runtime performance improvement | No — no actual execution | Yes — measure wall clock time |
| Checking planner cost estimate direction | Yes — sufficient | Overkill |
| CI/CD pre-deploy index validation | Yes — low overhead | May be impractical |
| Confirming index is actually used at runtime | No | Yes — use EXPLAIN ANALYZE |
A healthy workflow uses both: hypopg to validate planner intent in minutes, then CREATE INDEX CONCURRENTLY followed by EXPLAIN ANALYZE on staging to confirm real execution behavior before production rollout.
Limitations
hypopg is purpose-built and transparent about its constraints. Understanding them prevents misuse:
- EXPLAIN only, not EXPLAIN ANALYZE. Hypothetical indexes exist only in planner metadata. Any query that actually executes — including EXPLAIN ANALYZE — cannot use them. The planner will produce a cost estimate reflecting the hypothetical index but ANALYZE will execute against the real table without it, producing meaningless timing data.
- No actual query execution. The only output is a cost estimate. You cannot measure real latency or throughput with hypothetical indexes. Use them for directional validation, not benchmarking.
- Supported index types: B-tree, GiST, GIN, BRIN. Hash indexes and extension-provided index types (e.g., bloom) are not supported.
- Session-scoped. All hypothetical indexes are dropped when the session ends. In pooled environments with short-lived connections, you may need to recreate them every time. Call
hypopg_reset()within a session to clear them manually between tests. - Statistics dependency. hypopg relies on
pg_statisticdata for row count estimates. On tables with stale statistics (post-bulk-load or post-significant-churn), planner estimates may be inaccurate even with real indexes. RunANALYZEbefore relying on hypopg results in those cases. - No write amplification simulation. hypopg does not model the overhead a real index would add to INSERT, UPDATE, and DELETE operations. Consider write workload separately when making the final decision.
Key Takeaways
- hypopg lets the PostgreSQL planner evaluate an index that does not exist on disk — giving you cost estimates in seconds rather than after a multi-hour index build.
- Use
hypopg_create_index()with any valid CREATE INDEX DDL, including partial indexes and expression indexes. The planner treats them as real candidates during EXPLAIN. - Always use plain
EXPLAIN, neverEXPLAIN ANALYZE— ANALYZE executes the query and bypasses hypothetical indexes entirely. hypopg_relation_size()estimates disk footprint before you commit, enabling direct comparisons between full and partial index designs.- The recommended workflow: identify slow queries via
pg_stat_statements→ test candidate indexes with hypopg → pick the winning design → build withCREATE INDEX CONCURRENTLY→ verify withEXPLAIN ANALYZEon staging. - Hypothetical indexes are session-scoped and invisible to other connections. Call
hypopg_reset()to clear them when done, or let them drop on disconnect.
Working with JusDB on PostgreSQL Index Optimization
Knowing that an index would help is the first step. Knowing which index design to pick — composite vs. partial, covering vs. filtered, B-tree vs. GIN — requires understanding your query patterns, data distribution, and write workload together. hypopg answers the planner question; the broader index strategy question takes deeper analysis.
At JusDB's PostgreSQL consulting practice, index optimization engagements typically start with a workload audit: mining pg_stat_statements for the highest-impact queries, profiling table statistics and bloat, and mapping index candidates to execution plans using exactly this kind of hypopg workflow. The result is an index proposal with documented evidence — cost comparisons, size estimates, and write overhead projections — that teams can review before any production DDL runs.
If your PostgreSQL cluster has slow queries, growing index bloat, or an upcoming schema migration involving large tables, get in touch. We can run a targeted index audit and deliver a prioritized set of changes with hypopg-backed validation for each one.
Related reading: