A fintech team running MongoDB on Atlas reached out after a routine sprint demo turned into a production incident. A single query on their transactions collection — filtered by accountId, sorted by createdAt descending, and bounded to the last 90 days — was taking 4.2 seconds at the 99th percentile under 800 concurrent users. The collection held 220 million documents. Their existing index was a single-field index on accountId. MongoDB was using it — IXSCAN confirmed in the explain plan — but it was still examining 1.4 million documents to return 50 rows. Three fields, one compound index, thirty minutes of work: query time dropped to 3ms. The existing single-field index was doing almost nothing useful, and it took an explain plan to prove it.
MongoDB's indexing model is rich and non-obvious. The difference between a query that scans 1.4 million documents and one that scans 50 is almost always an index design decision, not a hardware upgrade. This guide covers every index type available in MongoDB, how to design compound indexes correctly using the ESR rule, how to read explain output without guessing, and how to monitor index usage in production so you can catch dead indexes before they accumulate write overhead for no benefit.
- MongoDB supports seven index types: single-field, compound, multikey (arrays), text, geospatial, hashed, and wildcard. Each solves a specific access pattern; none is universally applicable.
- Design compound indexes using the ESR rule — Equality fields first, Sort fields second, Range fields last — to maximize index selectivity and avoid in-memory sorts.
- Read
explain("executionStats")output by comparingtotalDocsExaminedtonReturned; a ratio above 1:1 means the index is not selective enough. - Covered queries — where the index contains every field the query reads — avoid document fetches entirely and are the fastest possible query path.
- Partial indexes index only documents matching a filter expression, dramatically reducing index size and write overhead for workloads with natural document subsets.
- Index intersection rarely outperforms a well-designed compound index; rely on it only as a last resort when you cannot change the index set.
- Monitor index usage in production with
$indexStatsand drop indexes with zero or near-zeroaccesses.opscounts to reclaim write throughput.
MongoDB Index Types
Understanding which index type to apply to a given access pattern is the foundation of MongoDB performance work. Choosing the wrong type — or no index at all — forces MongoDB into a COLLSCAN, reading every document in the collection regardless of the query's selectivity.
Single-Field Indexes
Single-field indexes are the default and most common index type. They index one field across all documents in a collection and support equality queries, range queries, and sort operations on that field.
// Index on a top-level field
db.transactions.createIndex({ accountId: 1 }); // 1 = ascending, -1 = descending
// Index on an embedded document field using dot notation
db.orders.createIndex({ "shipping.country": 1 });
// Single-field indexes are direction-agnostic for equality and single-sort queries.
// { accountId: 1 } and { accountId: -1 } are equivalent for .find({ accountId: x })
// Direction matters only in compound indexes.Compound Indexes
Compound indexes index multiple fields in a defined order. Field order determines which queries the index can satisfy — a compound index on { accountId: 1, createdAt: -1 } supports queries filtering on accountId alone, but not queries filtering on createdAt alone. This is the leftmost prefix rule: any leftmost prefix of a compound index can be used independently.
// Supports: filter on status; filter on status + region; sort on status + region + createdAt
db.transactions.createIndex({ status: 1, region: 1, createdAt: -1 });
// Does NOT support: filter on region alone (not a leftmost prefix)
// Does NOT support: filter on createdAt alone (not a leftmost prefix)Multikey Indexes (Arrays)
When a field contains an array, MongoDB automatically creates a multikey index — one index entry per array element. This allows efficient queries like { tags: "payments" } against documents where tags is an array field.
db.products.createIndex({ tags: 1 });
// For a document { tags: ["payments", "recurring", "ach"] }, MongoDB creates
// three index entries: one for "payments", one for "recurring", one for "ach".
// Compound multikey indexes have one constraint: at most ONE array field per index.
// This is invalid if both categories and vendors are arrays:
db.products.createIndex({ categories: 1, vendors: 1 }); // Error if both are arraysMultikey indexes inflate index size and write overhead proportionally to array cardinality. A document with a 200-element array generates 200 index entries on insert, update, or delete. If array elements are large objects, consider indexing only a scalar field within each element rather than the full array.
Text Indexes
Text indexes tokenize string fields into stemmed terms and support $text queries. A collection may have at most one text index, but that index can cover multiple fields and assign per-field weights.
db.articles.createIndex(
{ title: "text", body: "text", summary: "text" },
{ weights: { title: 10, summary: 5, body: 1 }, default_language: "english" }
);
// Query using the text index
db.articles.find(
{ $text: { $search: "mongodb indexing performance" } },
{ score: { $meta: "textScore" } }
).sort({ score: { $meta: "textScore" } });Text indexes work well for basic search use cases but do not support phrase proximity, fuzzy matching, or relevance ranking beyond tf-idf weighting. For production search requirements, evaluate Atlas Search (Lucene-backed) or a dedicated search engine.
Geospatial Indexes
MongoDB provides two geospatial index types. 2dsphere handles GeoJSON geometry on a sphere (the Earth) and supports $near, $geoWithin, and $geoIntersects queries. 2d handles flat-plane Cartesian coordinates and is a legacy type you should avoid for new schemas.
// 2dsphere index on a GeoJSON Point field
db.locations.createIndex({ coordinates: "2dsphere" });
// Documents must store GeoJSON geometry in the indexed field
// { coordinates: { type: "Point", coordinates: [ -122.4194, 37.7749 ] } }
// Find locations within 5km of a point
db.locations.find({
coordinates: {
$near: {
$geometry: { type: "Point", coordinates: [ -122.4194, 37.7749 ] },
$maxDistance: 5000 // meters
}
}
});Hashed Indexes
Hashed indexes store the hash of a field's value rather than the value itself. They are used exclusively for hash-based sharding — enabling even data distribution across shards when the shard key has high cardinality or monotonically increasing values. They support only equality queries; range queries cannot use a hashed index.
// Create a hashed index on userId for hash-based sharding
db.events.createIndex({ userId: "hashed" });
// Enable hash-based sharding on this key
sh.shardCollection("mydb.events", { userId: "hashed" });Wildcard Indexes
Wildcard indexes, introduced in MongoDB 4.2, index all fields in a document or all fields within a sub-document path. They are useful for workloads with unpredictable or highly variable field names — such as user-defined attribute schemas or JSONB-style flexible documents.
// Index every field in the document
db.products.createIndex({ "$**": 1 });
// Index only fields within a specific sub-document
db.products.createIndex({ "attributes.$**": 1 });
// Exclude specific paths from a wildcard index
db.products.createIndex(
{ "$**": 1 },
{ wildcardProjection: { description: 0, rawBlob: 0 } }
);Wildcard indexes are not a substitute for compound indexes on known access patterns. They carry significant overhead — every field in every document generates index entries — and cannot support sort operations or covered queries. Use them only when field names are genuinely unknown at schema design time.
The ESR Rule: Designing Compound Indexes That Actually Work
The ESR rule is the most important compound index design principle in MongoDB. It determines the order of fields within a compound index to maximize index utilization across the full query shape: Equality fields first, Sort fields second, Range fields last.
Consider a query that filters orders by status = "pending", sorts by createdAt descending, and applies a range filter of amount >= 100:
db.orders.find({
status: "pending",
amount: { $gte: 100 }
}).sort({ createdAt: -1 });
// Correct ESR compound index:
db.orders.createIndex({ status: 1, createdAt: -1, amount: 1 });
// E: status (equality)
// S: createdAt (sort)
// R: amount (range)
// Wrong order — range before sort — forces an in-memory sort:
db.orders.createIndex({ status: 1, amount: 1, createdAt: -1 });The reason for this ordering is mechanical. Equality fields reduce the working set to a single contiguous range of the index. Placing sort fields immediately after the equality fields means the documents within that subset are already physically ordered for the sort operation — MongoDB can return them without a blocking sort stage. Range fields placed after sort fields allow MongoDB to apply the range filter within the already-sorted result, rather than disrupting the sort order by interleaving range predicates.
When your explain plan shows a SORT stage with "memUsage" in the executionStats, the query required an in-memory sort because the index field order did not match the sort requirement. This caps the query at 100 MB of sort memory by default and signals a compound index field order problem.
Reading Explain Plans: COLLSCAN vs IXSCAN
The explain plan is the primary diagnostic tool for index analysis. Run it with "executionStats" verbosity to see the actual document and key examination counts from a real query execution.
db.transactions.explain("executionStats").find({
accountId: "ACC-8821",
status: "settled",
createdAt: { $gte: ISODate("2024-10-01") }
}).sort({ createdAt: -1 }).limit(50);The key fields to examine in the output:
winningPlan.stage:COLLSCANmeans no index was used — MongoDB read every document.IXSCANmeans an index was used.totalDocsExamined: The number of documents fetched from the collection after the index scan. In a perfect covered query this is 0. Any nonzero value indicates a document fetch.totalKeysExamined: The number of index entries scanned. This should be close tonReturned. A high ratio oftotalKeysExaminedtonReturnedindicates a low-selectivity index — the index is scanning far more entries than it returns.nReturned: Documents returned to the application.executionTimeMillis: Wall-clock execution time for the query.
// A well-indexed query looks like this:
{
"executionStats": {
"nReturned": 50,
"executionTimeMillis": 3,
"totalKeysExamined": 50,
"totalDocsExamined": 0, // covered query — zero document fetches
"executionStages": {
"stage": "PROJECTION_COVERED",
"inputStage": { "stage": "IXSCAN" }
}
}
}
// A poorly indexed query looks like this:
{
"executionStats": {
"nReturned": 50,
"executionTimeMillis": 4180,
"totalKeysExamined": 1400000, // examined 1.4M index keys
"totalDocsExamined": 1400000, // fetched 1.4M documents
"executionStages": {
"stage": "SORT", // in-memory sort required
"inputStage": { "stage": "IXSCAN" }
}
}
}Index Selectivity and Cardinality
Index selectivity measures how effectively an index narrows the candidate document set. A high-selectivity index on a field like userId (millions of distinct values) eliminates almost all documents immediately. A low-selectivity index on a field like status (three possible values: "pending", "settled", "failed") eliminates only two-thirds of documents on average.
Always place high-cardinality, high-selectivity equality fields first in compound indexes. If you have a query filtering on both status and userId, the correct order for the ESR equality portion is { userId: 1, status: 1, ... } — not { status: 1, userId: 1, ... } — because userId is dramatically more selective.
Estimate field cardinality with db.collection.distinct("fieldName").length. Any field with fewer than ~20 distinct values is a low-selectivity candidate and should rarely be the leading field in a compound index unless it is the primary equality filter and is paired immediately with a high-selectivity field.
Covered Queries: Eliminating Document Fetches
A covered query is one where every field referenced in the query filter, projection, and sort is present in the index. MongoDB satisfies the query entirely from the index, without reading the underlying documents. This is the fastest possible query path because it avoids the document fetch, which is a random I/O operation against the collection's data files.
// Index includes: accountId, createdAt, amount, status
db.transactions.createIndex({ accountId: 1, createdAt: -1, amount: 1, status: 1 });
// This query is covered: all fields in filter, projection, and sort are in the index
db.transactions.find(
{ accountId: "ACC-8821", status: "settled" },
{ _id: 0, accountId: 1, createdAt: 1, amount: 1, status: 1 }
).sort({ createdAt: -1 });
// Covered queries show totalDocsExamined: 0 in explain output
// and the winning plan stage will be PROJECTION_COVERED or similar, not FETCHThree things break covered query eligibility: including _id in the projection when _id is not in the index (always exclude it explicitly with _id: 0 or add _id to the index), projecting any field not present in the index, or filtering on a field not present in the index.
Partial Indexes: Index Only What You Query
Partial indexes index only the subset of documents that match a filter expression. They are one of the most under-used MongoDB index features and can dramatically reduce index size and write overhead for collections with natural document subsets.
// A partial index on only "pending" orders — the subset that needs processing
// An index on all orders would include tens of millions of settled/failed documents
// that are never queried in hot paths
db.orders.createIndex(
{ createdAt: 1, merchantId: 1 },
{ partialFilterExpression: { status: "pending" } }
);
// This query uses the partial index:
db.orders.find({ status: "pending", merchantId: "M-44321" }).sort({ createdAt: 1 });
// This query does NOT use the partial index (filter does not guarantee status: "pending"):
db.orders.find({ merchantId: "M-44321" }).sort({ createdAt: 1 });
// MongoDB requires the query filter to be a superset of the partialFilterExpression
// to guarantee the index covers the full result set.Partial index filter expressions support $exists, $gt, $gte, $lt, $lte, $eq, $type, $and, and $or. They do not support $nor, $not, or $where.
Sparse Indexes vs Partial Indexes
Sparse indexes, the predecessor to partial indexes, only index documents where the indexed field exists and is non-null. They solve the specific problem of indexing optional fields without creating null-value index entries for documents that lack the field entirely.
// Sparse index: only indexes documents where deletedAt exists
db.users.createIndex({ deletedAt: 1 }, { sparse: true });
// Equivalent partial index (preferred — more explicit and flexible):
db.users.createIndex(
{ deletedAt: 1 },
{ partialFilterExpression: { deletedAt: { $exists: true } } }
);Prefer partial indexes over sparse indexes in all new work. Partial indexes are a strict superset — they can express everything a sparse index can plus arbitrary filter expressions. Sparse indexes also have a footgun: a query with { deletedAt: null } will not use the sparse index, which can surprise engineers expecting it to.
Index Builds: Background vs Rolling
Building an index on a large collection is a blocking operation concern in production environments. The historical { background: true } option in createIndex was deprecated in MongoDB 4.2 and has been fully removed. As of MongoDB 4.2, all index builds use an optimized hybrid approach that holds exclusive locks only briefly at the start and end of the build, allowing concurrent reads and writes during the main build phase.
// As of MongoDB 4.2+, background:true is ignored or unsupported.
// Index builds are inherently concurrent with read/write operations during the scan phase.
db.transactions.createIndex({ accountId: 1, createdAt: -1 });
// Monitor an in-progress index build
db.currentOp({ "command.createIndexes": { $exists: true } });
// In MongoDB 4.4+, use the $currentOp aggregation for better visibility
db.adminCommand({ currentOp: true, "command.createIndexes": { $exists: true } });For replica sets on MongoDB versions before 4.2, or for zero-impact index builds on very large collections in any version, a rolling index build is the production-safe approach: build the index on each secondary individually (by stepping it down and building as a standalone or by using the replSetResizeOplog technique), then step down the primary and build there last.
In MongoDB Atlas, use the Atlas UI to trigger rolling index builds. Atlas handles the replica set member stepping and build sequencing automatically, ensuring no member falls behind the oplog window during the build.
Index Intersection and Why Compound Indexes Win
MongoDB can use index intersection — combining two separate single-field indexes at query execution time — to satisfy queries that filter on multiple fields. It sounds appealing: one index per field, maximum flexibility. In practice, index intersection rarely outperforms a well-designed compound index.
// Two separate indexes
db.orders.createIndex({ status: 1 });
db.orders.createIndex({ merchantId: 1 });
// MongoDB may intersect these for:
db.orders.find({ status: "pending", merchantId: "M-44321" });
// The explain plan shows AND_SORTED or AND_HASH stage when intersection is used.
// These stages are expensive: they require materializing both index scans into memory
// and performing a set intersection before fetching documents.
// A single compound index almost always wins:
db.orders.createIndex({ merchantId: 1, status: 1 });Index intersection carries real costs: both indexes are scanned in full for their respective predicates, the resulting document ID sets are intersected in memory, and then documents are fetched. A compound index that covers both fields scans only the entries that satisfy both predicates simultaneously, with no in-memory set operation. The compound index wins on both keys examined and execution time in virtually every benchmark.
Index intersection can occasionally be useful when you cannot modify the index set — for instance, on a collection owned by a third-party service or when adding a compound index would exceed Atlas's index limit for the tier. In all other cases, design the compound index you need and drop the redundant single-field indexes.
Monitoring Index Usage in Production
An index that is never used is not free. Every index adds write overhead on insert, update, and delete operations, and consumes RAM in the WiredTiger cache. Production MongoDB instances commonly accumulate dead indexes — often from access pattern changes that were never reflected in the index set — that silently degrade write throughput for months or years.
$indexStats Aggregation
// Returns per-index access counts since the mongod process started (or stats were reset)
db.transactions.aggregate([{ $indexStats: {} }]);
// Sample output for one index:
// {
// "name": "accountId_1_createdAt_-1",
// "key": { "accountId": 1, "createdAt": -1 },
// "host": "rs0/mongo1:27017",
// "accesses": {
// "ops": 0, // ← zero operations used this index since restart
// "since": ISODate("2025-01-15T00:00:00Z")
// }
// }
// Find indexes with zero usage across all shards or replica set members
db.transactions.aggregate([
{ $indexStats: {} },
{ $match: { "accesses.ops": 0 } },
{ $project: { name: 1, key: 1, "accesses.since": 1 } }
]);$indexStats counts reset when mongod restarts or when index stats are explicitly reset. Before dropping any index with low counts, verify that the monitoring period covers your full traffic pattern — monthly reports, end-of-quarter batch jobs, or seasonal traffic spikes will not appear if the server was restarted recently. Run $indexStats continuously over at least one full business cycle before making index removal decisions.
Collection-Level Index Stats
// db.collection.stats() provides index sizes on disk
const stats = db.transactions.stats();
// stats.indexSizes shows each index's storage footprint
printjson(stats.indexSizes);
// {
// "_id_": 2048000000,
// "accountId_1_createdAt_-1": 8192000000,
// "status_1": 512000000 // ← 512MB index never accessed — candidate for removal
// }
// Also check totalIndexSize for overall index RAM pressure
printjson({ totalIndexSize: stats.totalIndexSize, totalSize: stats.totalSize });Do not drop an index directly in production without first hiding it. MongoDB 4.4+ supports hideIndex, which makes an index invisible to the query planner without removing it. Hide the index, monitor query performance for 24–48 hours across your full traffic pattern, and only drop it if no degradation is observed. This lets you unhide the index instantly if a low-traffic query path relied on it.
// Hide an index (MongoDB 4.4+)
db.transactions.hideIndex("status_1");
// Verify the planner ignores it
db.transactions.find({ status: "pending" }).explain("executionStats");
// If no regressions after monitoring period, drop it
db.transactions.dropIndex("status_1");
// If you need to restore visibility immediately
db.transactions.unhideIndex("status_1");- Match the index type to the access pattern: single-field for simple equality and range, compound for multi-field queries, multikey for arrays, text for full-text search, 2dsphere for geospatial, hashed for shard key distribution, and wildcard only for truly dynamic schemas.
- Apply the ESR rule to every compound index: Equality fields first, Sort fields second, Range fields last. Incorrect field order forces in-memory sorts and reduces selectivity.
- Diagnose queries with
explain("executionStats"). ACOLLSCANstage or atotalDocsExamined-to-nReturnedratio above 10:1 is a hard signal that an index is missing or misdesigned. - Design for covered queries on hot read paths by including every projected field in the index and explicitly excluding
_idwhen it is not needed. - Use partial indexes on collections with natural document subsets — active orders, unprocessed events, soft-deleted records — to reduce index size and write amplification.
- Prefer partial indexes over sparse indexes for all new index definitions; partial indexes are more expressive and have no edge-case surprises.
- Compound indexes almost always outperform index intersection. Do not rely on intersection as a design pattern; use it only when the index set is genuinely immutable.
- Run
$indexStatscontinuously across at least one full business cycle. Hide unused indexes withhideIndexbefore dropping them, and only drop after confirming no query regression. - The
background: trueoption forcreateIndexis deprecated and removed. Use rolling index builds on replica sets for large collections when minimizing oplog and lock impact.
Indexing decisions compound over time — a missed compound index on a growing collection that was fine at 10 million documents becomes a production incident at 200 million. If your team is dealing with slow MongoDB queries, bloated index sets, or uncertain index coverage on high-traffic collections, the JusDB team works with engineering teams to audit existing indexes, design compound index strategies aligned to real access patterns, and safely remove index overhead that is degrading write throughput. Reach out at jusdb.com to talk through your specific workload.