NoSQL Databases

Null Handling in MongoDB: null vs Missing Fields vs undefined Explained

MongoDB's flexible schema allows fields to be null, explicitly set to null, or simply absent — and these three states behave differently in queries, aggregations, and indexes.

JusDB Team
September 27, 2024
7 min read
157 views

A backend engineer at a logistics company spent an afternoon debugging a dashboard report that was consistently undercounting active shipments. The query looked correct: a straightforward countDocuments({ arrivedAt: null }) intended to return all shipments that had not yet arrived. What they did not realize was that when the application first started writing shipment documents, the arrivedAt field was simply omitted rather than set to null — and MongoDB's null query matched both documents where the field was explicitly null and documents where the field did not exist at all. Three months into the migration to a new schema convention, the collection had two distinct document shapes in production, and the count was wrong in ways that depended entirely on which convention a given document had been written under. The fix took five minutes once the cause was understood. The debugging took four hours because nobody on the team knew that MongoDB treats null and missing fields identically in equality queries — unless you tell it not to.

TL;DR
  • MongoDB documents can represent "no value" in three ways: a field set to null, a field that is entirely absent, and the JavaScript undefined type (which MongoDB coerces to null on write).
  • An equality query { field: null } matches both documents where the field is null and documents where the field is missing entirely — this surprises almost every developer encountering it for the first time.
  • Use $exists to distinguish between a field being present (regardless of value) and a field being absent.
  • Use $type: 10 to match only BSON null values, excluding missing fields.
  • Sparse indexes skip documents where the indexed field is missing, which changes query behavior for partial datasets and must be designed deliberately.
  • Schema consistency between null and missing fields is a data modeling decision with long-term query and index implications — pick one convention and enforce it.

The Three States: null, Missing, and undefined

MongoDB documents are BSON-encoded, and BSON defines a richer type system than JSON. Within that type system, there are three meaningfully different ways a field can represent the absence of a value, and they are not interchangeable at the query or index level.

Explicit null

A document with a field set to null stores a BSON Null type entry in the document. The field exists in the document structure — it is present in the key list — but its value is BSON type 10 (Null). This is intentional absence: the application explicitly communicated that the value is known to be absent or not applicable.

javascript
// Insert a document with an explicit null field
db.shipments.insertOne({
  _id: ObjectId(),
  trackingCode: "SHIP-001",
  arrivedAt: null   // Field exists, value is BSON null (type 10)
});

Missing field

A document that simply does not contain a field at all is different at the BSON level: there is no key, no value, no type entry. The field is absent. This is common in MongoDB's schemaless model where not every document in a collection needs every field. An older document written before a field was added to the application schema will be missing that field entirely.

javascript
// Insert a document with no arrivedAt field at all
db.shipments.insertOne({
  _id: ObjectId(),
  trackingCode: "SHIP-002"
  // arrivedAt is simply not present in this document
});

undefined (and why it becomes null)

JavaScript has an undefined type, and the MongoDB shell and Node.js driver both accept it. However, BSON does not have a native undefined type for document fields (the BSON spec defines type 6 as Undefined but marks it as deprecated). When a driver serializes a JavaScript object with an undefined field value, it is coerced to BSON null. This means undefined and null are functionally identical at the storage layer.

javascript
// Inserting undefined — this is coerced to null on write
db.shipments.insertOne({
  _id: ObjectId(),
  trackingCode: "SHIP-003",
  arrivedAt: undefined   // Stored as BSON null, not as undefined
});

// Verify: the stored document will show arrivedAt: null
db.shipments.findOne({ trackingCode: "SHIP-003" });
// { _id: ..., trackingCode: 'SHIP-003', arrivedAt: null }
Warning

If your application conditionally assigns undefined to fields (for example, when a config value is not set), those fields will be stored as null in MongoDB — not omitted. This is a common source of unexpected null values in collections where developers expected missing fields. Always explicitly omit fields rather than assigning undefined if you intend for the field to be absent.

How MongoDB Queries Handle null and Missing Fields

The behavior that trips up most developers is MongoDB's equality query semantics for null. An equality match against null is not a strict type check — it is a logical absence check that matches both explicit null values and missing fields.

Equality query: { field: null }

javascript
// Setup: three documents representing the three states
db.shipments.insertMany([
  { trackingCode: "A", arrivedAt: null },          // explicit null
  { trackingCode: "B" },                            // field missing
  { trackingCode: "C", arrivedAt: ISODate("2024-06-01") } // has a value
]);

// Query with null equality — matches BOTH "A" and "B"
db.shipments.find({ arrivedAt: null }).toArray();
// Returns: [{ trackingCode: 'A', arrivedAt: null }, { trackingCode: 'B' }]

// This is almost always NOT what developers expect the first time

This behavior is documented but not intuitive. MongoDB's rationale is that both a null field and a missing field semantically represent "no value" for that field in the context of filtering. For many use cases this is pragmatic — if you want all documents where a field has no meaningful value, you get them all regardless of how "no value" was represented. But for use cases where the distinction matters — auditing schema migration progress, differentiating "user explicitly cleared this field" from "this field was never set" — the equality query is insufficient.

Inequality query: { field: { $ne: null } }

javascript
// $ne: null also has symmetric behavior
// It excludes documents where the field is null OR missing
db.shipments.find({ arrivedAt: { $ne: null } }).toArray();
// Returns only: [{ trackingCode: 'C', arrivedAt: ISODate("2024-06-01") }]
Warning

The { field: { $ne: null } } query excludes both explicit null documents and documents where the field is missing. If you have a collection where some documents were written with arrivedAt: null and others with no arrivedAt field, this query treats them identically — both are excluded. A count using $ne: null will give you the correct total of "documents with a non-null arrivedAt value," but you will lose the ability to distinguish the two null representations in your result set.

The $exists Operator — Query for Field Presence

When you need to distinguish between a field being present in a document versus being absent, $exists is the correct operator. It tests for the key's presence in the BSON document structure regardless of the key's value — including when the value is null.

javascript
// Find documents where arrivedAt field EXISTS (present, any value including null)
db.shipments.find({ arrivedAt: { $exists: true } }).toArray();
// Returns: [{ trackingCode: 'A', arrivedAt: null }, { trackingCode: 'C', arrivedAt: ISODate(...) }]
// Does NOT return "B" (field is missing)

// Find documents where arrivedAt field is MISSING entirely
db.shipments.find({ arrivedAt: { $exists: false } }).toArray();
// Returns: [{ trackingCode: 'B' }]
// Does NOT return "A" (field exists, value is null)

// Combine $exists: true with a value check to find documents
// where the field is present AND has a non-null value
db.shipments.find({
  arrivedAt: { $exists: true, $ne: null }
}).toArray();
// Returns: [{ trackingCode: 'C', arrivedAt: ISODate(...) }]

The combination of { $exists: true, $ne: null } is the precise filter for "field is present and has a meaningful value." This is the query pattern to use when your application logic depends on distinguishing schema migration state (document written after a field was added versus before).

Tip

A common schema migration scenario: you add a new field to your data model and start writing it on all new documents. Old documents are missing the field. Use db.collection.countDocuments({ newField: { $exists: false } }) to monitor migration progress and understand how many old-schema documents remain. Do not use { newField: null } — that count will include any documents where you explicitly wrote null, making the migration progress metric unreliable if some documents have intentional null values.

$type Operator for Explicit null Checks

When you need to match only documents where a field is explicitly set to BSON null — and you want to exclude documents where the field is simply missing — use $type: 10. BSON type 10 is the Null type, and $type checks the stored BSON type of the field value. A missing field has no type, so it does not match.

javascript
// Match ONLY documents with an explicit null value (not missing fields)
db.shipments.find({ arrivedAt: { $type: 10 } }).toArray();
// Returns: [{ trackingCode: 'A', arrivedAt: null }]
// Does NOT return "B" (field missing — no type to match)

// You can also use the string alias "null" instead of the numeric type code
db.shipments.find({ arrivedAt: { $type: "null" } }).toArray();
// Same result as above

This is the precise inverse of $exists: false. Together they give you complete control over the three states:

javascript
// State 1: explicit null
db.shipments.find({ arrivedAt: { $type: 10 } });

// State 2: field missing entirely
db.shipments.find({ arrivedAt: { $exists: false } });

// State 3: field present with a non-null value
db.shipments.find({ arrivedAt: { $exists: true, $ne: null } });

// Alternative for State 3 — exclude both null and missing
db.shipments.find({
  arrivedAt: { $exists: true, $not: { $type: 10 } }
});

Aggregation Pipeline Behavior with null/Missing

Null and missing fields behave distinctly in aggregation pipelines, particularly in $group, $project, and expression operators. Understanding these behaviors is essential for accurate reporting queries.

$group and null/missing field values

javascript
// Group shipments by arrivedAt date (truncated to day)
// Documents with null arrivedAt and missing arrivedAt both group into the null bucket
db.shipments.aggregate([
  {
    $group: {
      _id: {
        $dateToString: {
          format: "%Y-%m-%d",
          date: "$arrivedAt",
          onNull: "not-arrived"  // handles both null and missing
        }
      },
      count: { $sum: 1 }
    }
  }
]);
// The "not-arrived" bucket will contain BOTH null and missing-field documents
javascript
// To separate null from missing in a $group, use $ifNull with $exists logic
// via $cond and $$REMOVE
db.shipments.aggregate([
  {
    $addFields: {
      arrivalState: {
        $cond: {
          if: { $eq: [{ $type: "$arrivedAt" }, "missing"] },
          then: "field-absent",
          else: {
            $cond: {
              if: { $eq: ["$arrivedAt", null] },
              then: "explicit-null",
              else: "has-value"
            }
          }
        }
      }
    }
  },
  {
    $group: {
      _id: "$arrivalState",
      count: { $sum: 1 }
    }
  }
]);
// Now you get three distinct buckets: "field-absent", "explicit-null", "has-value"

$ifNull and $cond for null-safe expressions

javascript
// $ifNull handles both explicit null AND missing field with a single expression
// This is the idiomatic MongoDB way to provide a default value for absent data
db.shipments.aggregate([
  {
    $project: {
      trackingCode: 1,
      // Returns the arrivedAt value if present and non-null, else the default string
      displayArrival: {
        $ifNull: ["$arrivedAt", "Pending arrival"]
      }
    }
  }
]);
// Documents with null arrivedAt: displayArrival = "Pending arrival"
// Documents with missing arrivedAt: displayArrival = "Pending arrival"
// Documents with a date arrivedAt: displayArrival = the date value
Tip

$ifNull in aggregation pipelines treats null values and missing fields identically — it returns the fallback for both. This is usually the desired behavior for display and reporting purposes. If you need to differentiate the two states in a pipeline, use { $eq: [{ $type: "$field" }, "missing"] } to detect a missing field, then handle explicit null separately with { $eq: ["$field", null] }.

Sparse vs Non-Sparse Indexes and null Values

Index behavior for null and missing fields is one of the most consequential production implications of MongoDB's null handling model, and it is where schema design decisions have the highest performance and correctness impact.

Default (non-sparse) indexes include null and missing

javascript
// Create a standard index on arrivedAt
db.shipments.createIndex({ arrivedAt: 1 });

// This index includes ALL documents:
// - Documents with a date value for arrivedAt (indexed at the date)
// - Documents with null arrivedAt (indexed at the null position in the B-tree)
// - Documents with missing arrivedAt (also indexed at the null position)

// A query { arrivedAt: null } will use this index and return
// both null-valued and missing-field documents
db.shipments.find({ arrivedAt: null }).explain("executionStats");
// Confirm: IXSCAN on arrivedAt_1, keysExamined covers null + missing entries

Sparse indexes skip missing fields

javascript
// Create a sparse index on arrivedAt
db.shipments.createIndex({ arrivedAt: 1 }, { sparse: true });

// A sparse index ONLY indexes documents where arrivedAt exists
// (regardless of whether the value is null or a real date)
// Documents where arrivedAt is entirely missing are NOT indexed

// Consequence: a query { arrivedAt: null } may NOT use the sparse index
// because it would miss the "missing field" documents that are not in the index
// MongoDB's query planner may choose a collection scan instead for correctness

db.shipments.find({ arrivedAt: null }).explain("executionStats");
// May show COLLSCAN if planner determines sparse index would miss documents
Warning

A sparse index on a field is not automatically used for { field: null } queries, because MongoDB knows that documents with a missing field are not in the sparse index. If your query needs to match missing-field documents alongside null-valued documents, a non-sparse index is required. Using a sparse index incorrectly here produces a silent correctness issue: the query runs, returns results, but misses all documents where the field is absent. Always verify index selection with explain() when working with sparse indexes and null queries.

javascript
// Partial indexes (preferred over sparse in most cases)
// Target documents where arrivedAt IS a date — skip null and missing entirely
db.shipments.createIndex(
  { arrivedAt: 1 },
  {
    partialFilterExpression: {
      arrivedAt: { $type: "date" }
    }
  }
);

// This partial index is precise: only documents with a real date value are indexed
// Queries that filter { arrivedAt: { $gt: someDate } } will use this index efficiently
// Queries that need null/missing documents cannot use this index (correct behavior)

// Verify the partial index is selected for a date-range query
db.shipments.find({
  arrivedAt: { $gte: ISODate("2024-01-01"), $lt: ISODate("2024-12-31") }
}).explain("executionStats");

Schema Design: When to Use null vs Omit the Field

The decision between storing explicit null versus omitting a field is an architectural choice that should be made deliberately for each field, not left to application-layer convention. Both approaches are valid but have different implications.

Arguments for explicit null

Intentionality. A field set to null communicates that the application explicitly evaluated the field and determined it has no value. A missing field might mean the data was never collected, the application had a bug, or the document predates the field's introduction. If these distinctions matter to your business logic, storing null explicitly preserves the intent.

Schema uniformity. Keeping a consistent set of keys across all documents in a collection simplifies application-layer deserialization. Object mappers and typed drivers do not need to handle missing-key cases when all documents always have the same key set — even if many values are null.

$type queries work predictably. If you choose the explicit-null convention, { field: { $type: 10 } } is a reliable "this field has no value" query. You do not need to combine it with $exists checks.

Arguments for omitting the field

Storage efficiency at scale. On a collection with 50 fields where most documents only have values for 10-15 of them, storing explicit null for the other 35-40 fields adds real storage overhead. At millions of documents this becomes significant, especially if the fields have long names. Missing fields occupy zero bytes.

Sparse index efficiency. Sparse indexes are valuable when a field applies to only a small fraction of documents — geolocation for an e-commerce platform where most products are not location-restricted, or a premium-only feature flag. Sparse indexes only work as intended when the field is omitted from non-applicable documents rather than stored as null.

Tip

Adopt a consistent per-field convention and enforce it at the application layer. A practical heuristic: use explicit null for fields that are part of your core document schema and represent a known "cleared" or "not applicable" state (user's middle name, optional address line 2, a feature toggle that was explicitly disabled). Omit fields that are additive — optional metadata, event timestamps for events that have not occurred, fields added to the schema after the collection was first created. Document the convention in your data model documentation so it survives team turnover.

Migration strategy: standardizing an inconsistent collection

javascript
// Scenario: you have a collection with mixed null and missing conventions
// for the same field. You want to standardize to explicit null.

// Step 1: Count documents in each state
const nullCount = db.shipments.countDocuments({ arrivedAt: { $type: 10 } });
const missingCount = db.shipments.countDocuments({ arrivedAt: { $exists: false } });
const valueCount = db.shipments.countDocuments({ arrivedAt: { $exists: true, $ne: null } });

print(`Explicit null: ${nullCount}`);
print(`Missing field: ${missingCount}`);
print(`Has value:     ${valueCount}`);

// Step 2: Backfill missing-field documents to explicit null (batched)
let updated = 0;
let batch;

do {
  const ids = db.shipments
    .find({ arrivedAt: { $exists: false } }, { _id: 1 })
    .limit(500)
    .toArray()
    .map(d => d._id);

  if (ids.length === 0) break;

  const result = db.shipments.updateMany(
    { _id: { $in: ids } },
    { $set: { arrivedAt: null } }
  );

  updated += result.modifiedCount;
  print(`Backfilled ${updated} documents...`);
  sleep(100); // rate-limit to avoid replica lag
} while (true);

print(`Migration complete. Total updated: ${updated}`);
Key Takeaways
  • MongoDB treats null and missing fields as equivalent in equality queries ({ field: null } matches both) — this is intentional behavior but routinely surprises developers, especially during schema migrations.
  • Use $exists: true to find documents where a field is present (even if null), and $exists: false to find documents where a field is absent entirely.
  • Use $type: 10 (or $type: "null") to match only explicit BSON null values, excluding missing fields — this is the precise counterpart to $exists: false.
  • JavaScript undefined is coerced to BSON null at write time by all MongoDB drivers — never rely on undefined to represent a missing field at the storage layer.
  • Sparse indexes do not index documents with missing fields, which can cause query planners to skip the index for { field: null } queries to preserve result correctness — always verify with explain().
  • Partial indexes with $type filter expressions are more precise than sparse indexes for selective field indexing and produce predictable query planner behavior.
  • Pick one convention — explicit null or field omission — for each field in your schema, document it, enforce it at the application layer, and backfill existing documents when switching conventions.

Working with JusDB on MongoDB Schema Design

JusDB advises on MongoDB schema design for engineering teams — data modeling, null handling strategy, index design, and query optimization. Our DBAs help you avoid schema anti-patterns before they cause production issues.

Explore JusDB MongoDB Management →  |  Talk to a DBA

Related reading:

Share this article