When our team migrated a product catalog service to MySQL in 2017, we stored variant attributes — colors, sizes, weights — in a TEXT column as raw JSON strings. Every time we needed to filter by a specific attribute, we resorted to fragile LIKE '%"color":"blue"%' queries that ignored indexes entirely, caused full table scans on 40 million rows, and broke silently whenever someone stored malformed JSON. MySQL 8.0 changed all of that: a native JSON column type, a rich function library for extraction and mutation, and — critically — the ability to index specific paths inside a document. This guide covers every major JSON capability in MySQL 8.0 so you can stop treating JSON as an opaque blob and start querying it like a first-class citizen.
- The
JSONcolumn type validates documents on insert and stores them in an optimized binary format — not plain text. - Use
JSON_EXTRACT()or the shorthand->/->>operators to read values at any path. JSON_SET(),JSON_INSERT(),JSON_REPLACE(),JSON_REMOVE(), andJSON_MERGE_PATCH()let you surgically update documents without rewriting the whole column.- Generated columns + functional indexes let you index any JSON path with zero application changes.
- Multi-valued indexes (MySQL 8.0.17+) index every element inside a JSON array in a single B-tree index, enabling fast
MEMBER OFandJSON_OVERLAPS()queries. - MySQL JSON is fully capable for most workloads; PostgreSQL JSONB has a richer operator set and GIN indexing — the right choice depends on your existing stack.
Background
JSON support in MySQL evolved in three distinct phases. MySQL 5.7 introduced the JSON data type and a first set of extraction functions. MySQL 8.0 added path expression enhancements, inline path operators, and the JSON_MERGE_PATCH() function. MySQL 8.0.17 then shipped multi-valued indexes — arguably the most impactful feature for teams storing arrays of tags, category IDs, or SKUs inside a column.
Before native JSON support, teams used one of three workarounds: (1) a TEXT or BLOB column with no validation, (2) the Entity-Attribute-Value (EAV) anti-pattern, or (3) a separate key-value table. All three approaches traded query complexity for schema flexibility and none of them scaled gracefully. Understanding why MySQL added a binary JSON format — and what that binary format actually gives you — is the foundation for using these features correctly.
JSON Column Type
Declaring a column as JSON does three things that a plain TEXT column cannot do: it validates every document on write and rejects malformed JSON with an error, it stores the document in MySQL's internal binary format (BLOB-like on disk, but with offset metadata for O(1) path lookups), and it enables partial in-place updates when you use JSON_SET() on documents where the new value is the same type and length or shorter.
-- Create a table with a JSON column
CREATE TABLE products (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
sku VARCHAR(64) NOT NULL,
attributes JSON NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Insert a valid document — MySQL validates and converts to binary
INSERT INTO products (sku, attributes) VALUES (
'SHOE-001',
'{"color": "navy", "sizes": [7, 8, 9, 10], "material": "leather", "weight_kg": 0.6}'
);
-- This will FAIL with ER_INVALID_JSON_TEXT — TEXT columns would silently accept it
INSERT INTO products (sku, attributes) VALUES (
'BAD-002',
'{"color": "red", sizes: [8, 9]}' -- missing quotes around key
);MySQL enforces JSON validity on write, but NULL is always accepted. If you want to disallow NULL documents, add NOT NULL to the column definition. Also note that MySQL normalizes JSON on storage: duplicate keys are removed (last value wins), and whitespace is stripped from the binary representation.
The binary storage format means that reading a single key from a large document does not require parsing the entire JSON string — MySQL can jump directly to the offset of the requested key. This makes point-lookup extraction much faster than parsing a TEXT column in application code, especially for documents larger than a few kilobytes.
Extracting Values
MySQL provides three equivalent ways to read data out of a JSON column, each progressively more concise.
JSON_EXTRACT() Function
-- Full function form — returns JSON type (quoted strings, JSON arrays, etc.)
SELECT
sku,
JSON_EXTRACT(attributes, '$.color') AS color,
JSON_EXTRACT(attributes, '$.sizes[0]') AS first_size,
JSON_EXTRACT(attributes, '$.weight_kg') AS weight
FROM products;
-- Path expression syntax:
-- $.key — top-level key
-- $.key.nested — nested key
-- $.array[0] — first array element
-- $.array[last] — last element (MySQL 8.0.27+)
-- $.** — wildcard: all descendantsArrow Operators -> and ->>
-- -> is syntactic sugar for JSON_EXTRACT(); returns quoted JSON string
SELECT sku, attributes->'$.color' AS color FROM products;
-- Result: "navy" (with quotes — still a JSON string type)
-- ->> is JSON_UNQUOTE(JSON_EXTRACT()); returns unquoted plain string
SELECT sku, attributes->>'$.color' AS color FROM products;
-- Result: navy (no quotes — a plain VARCHAR-like value)
-- Use ->> when you want to compare with a string literal directly
SELECT * FROM products WHERE attributes->>'$.color' = 'navy';
-- Nested and array paths work the same way
SELECT
sku,
attributes->>'$.material' AS material,
attributes->'$.sizes[1]' AS second_size
FROM products;Always prefer ->> over -> when comparing against string literals in WHERE clauses. The -> operator returns a JSON-typed value that includes surrounding quotes, so WHERE attributes->'$.color' = 'navy' will never match — you need either ->> or an explicit JSON_UNQUOTE() call.
Path Expression Reference
-- All supported path expression forms
SELECT
-- Scalar key
attributes->>'$.color' AS color,
-- Nested object
attributes->>'$.shipping.carrier' AS carrier,
-- Array element by index
attributes->'$.sizes[0]' AS size_0,
-- Array element from end (8.0.27+)
attributes->'$.sizes[last]' AS last_size,
-- All values under a key (returns JSON array)
JSON_EXTRACT(attributes, '$.tags[*]') AS all_tags,
-- Double wildcard: all values at any depth
JSON_EXTRACT(attributes, '$**.name') AS all_names
FROM products;Modifying JSON
MySQL provides five functions for modifying JSON documents. Each has a distinct insert/update/delete semantic — using the wrong one is a common source of bugs.
-- JSON_SET: INSERT if path does not exist, REPLACE if it does
UPDATE products
SET attributes = JSON_SET(attributes,
'$.color', 'midnight-blue', -- replaces existing key
'$.season', 'winter' -- inserts new key
)
WHERE sku = 'SHOE-001';
-- JSON_INSERT: INSERT only — silently ignores if path already exists
UPDATE products
SET attributes = JSON_INSERT(attributes,
'$.season', 'spring', -- ignored if $.season already set
'$.new_field', 'hello' -- inserted if missing
)
WHERE sku = 'SHOE-001';
-- JSON_REPLACE: REPLACE only — silently ignores if path does NOT exist
UPDATE products
SET attributes = JSON_REPLACE(attributes,
'$.color', 'charcoal' -- only updates if $.color already present
)
WHERE sku = 'SHOE-001';
-- JSON_REMOVE: delete one or more paths
UPDATE products
SET attributes = JSON_REMOVE(attributes,
'$.season',
'$.new_field'
)
WHERE sku = 'SHOE-001';
-- JSON_MERGE_PATCH: RFC 7396 merge — great for partial updates from API payloads
-- Keys in the patch overwrite the target; keys set to null are removed
UPDATE products
SET attributes = JSON_MERGE_PATCH(attributes,
'{"color": "black", "weight_kg": null}' -- updates color, removes weight_kg
)
WHERE sku = 'SHOE-001';JSON_MERGE_PATCH() follows RFC 7396: any key with a null value in the patch document is deleted from the target, not set to null. If you genuinely want to store a JSON null for a key, you must use JSON_SET() with a CAST(NULL AS JSON) argument instead. Also, JSON_MERGE_PATCH() performs a shallow merge for nested objects by default — it replaces the entire nested object rather than merging recursively unless you nest the patch structure.
Searching and Indexing
Full-document search functions let you query JSON content without knowing the exact path. They are useful but expensive on large tables without indexes.
-- JSON_CONTAINS: check if a value exists anywhere in a document
-- Returns 1 (true) or 0 (false)
SELECT sku FROM products
WHERE JSON_CONTAINS(attributes, '"leather"', '$.material');
-- JSON_OVERLAPS: true if two JSON documents share at least one value (8.0.17+)
-- Useful for checking array intersection
SELECT sku FROM products
WHERE JSON_OVERLAPS(attributes->'$.sizes', '[8, 9]');
-- JSON_SEARCH: find the path(s) where a string value appears
-- 'one' = first match, 'all' = all matches (returns JSON array of paths)
SELECT sku,
JSON_SEARCH(attributes, 'one', 'leather') AS path_to_leather,
JSON_SEARCH(attributes, 'all', 'cotton') AS all_cotton_paths
FROM products;Generated Columns and Functional Indexes
The most practical way to make a specific JSON path fast is a generated column or functional index. A virtual generated column adds no storage overhead; MySQL computes the value on read and can build a B-tree index on it.
-- Option 1: Virtual generated column + index
ALTER TABLE products
ADD COLUMN color_gen VARCHAR(64)
GENERATED ALWAYS AS (attributes->>'$.color') VIRTUAL,
ADD INDEX idx_color (color_gen);
-- Now this query uses the index:
SELECT sku FROM products WHERE color_gen = 'navy';
-- Option 2: Functional index directly on the JSON path (MySQL 8.0.13+)
-- No separate column needed; the index stores the expression result
ALTER TABLE products
ADD INDEX idx_color_functional ((attributes->>'$.color'));
-- MySQL optimiser picks up the functional index automatically:
SELECT sku FROM products WHERE attributes->>'$.color' = 'navy';
-- EXPLAIN will show: key = idx_color_functional
-- Verify with EXPLAIN
EXPLAIN SELECT sku FROM products WHERE attributes->>'$.color' = 'navy'\GFunctional indexes (the parenthesised expression form) are the preferred approach in MySQL 8.0.13+ because they require no schema change beyond the index itself and do not add a column to SELECT * output. Use a stored generated column only when you need to reference the extracted value in application queries by column name without always spelling out the full path expression.
Multi-Valued Indexes (MySQL 8.0.17+)
A standard B-tree index on a JSON array column can only index the entire array as a value — it cannot index individual elements. Multi-valued indexes solve this by creating one B-tree entry per array element, making it possible to ask "does this array contain the value X?" and get an index range scan instead of a full table scan.
-- Create a multi-valued index on the sizes array
-- CAST(... AS UNSIGNED ARRAY) tells MySQL each element is an unsigned integer
ALTER TABLE products
ADD INDEX idx_sizes ((CAST(attributes->'$.sizes' AS UNSIGNED ARRAY)));
-- Query using MEMBER OF operator — uses the multi-valued index
SELECT sku FROM products WHERE 9 MEMBER OF (attributes->'$.sizes');
-- Query using JSON_CONTAINS — also uses the multi-valued index
SELECT sku FROM products
WHERE JSON_CONTAINS(attributes->'$.sizes', '9');
-- Query using JSON_OVERLAPS — also index-eligible
SELECT sku FROM products
WHERE JSON_OVERLAPS(attributes->'$.sizes', '[8, 9, 10]');
-- Verify the index is used:
EXPLAIN SELECT sku FROM products WHERE 9 MEMBER OF (attributes->'$.sizes')\G
-- type: ref, key: idx_sizes, Extra: Using where
-- Multi-valued index on a string array (tag slugs, category IDs, etc.)
ALTER TABLE products
ADD INDEX idx_tags ((CAST(attributes->'$.tags' AS CHAR(64) ARRAY)));
SELECT sku FROM products
WHERE 'waterproof' MEMBER OF (attributes->'$.tags');Multi-valued indexes have important constraints: the indexed expression must evaluate to a JSON array (not a scalar), each element must be castable to the declared type, and the index cannot be used with ORDER BY or as a covering index. Also, inserting a document where the array element count exceeds innodb_ft_result_cache_limit may cause issues on very large arrays. Always test with EXPLAIN after adding a multi-valued index to confirm it is being selected by the optimizer.
MySQL JSON vs PostgreSQL JSONB
If you are evaluating which database to use for a JSON-heavy workload — or inheriting a system that mixes both — understanding the differences between MySQL JSON and PostgreSQL JSONB is essential.
| Feature | MySQL 8.0 JSON | PostgreSQL JSONB |
|---|---|---|
| Storage format | Binary (proprietary) | Binary (decomposed) |
| Extraction operator | -> (JSON), ->> (text) |
-> (JSON), ->> (text) |
| Containment operator | JSON_CONTAINS() |
@> and <@ |
| Key existence operator | JSON_CONTAINS_PATH() |
?, ?|, ?& |
| GIN/full-document index | Not supported | GIN index on entire column |
| Index on path | Functional index or generated column | Expression index or GIN path ops |
| Multi-valued / array index | Multi-valued index (8.0.17+) | GIN index with jsonb_path_ops |
| Partial in-place update | Yes (same type + shorter value) | No (always full row rewrite) |
| JSONPath standard support | Partial (MySQL path syntax) | Full SQL/JSON path (12+) |
| JSON schema validation | JSON_SCHEMA_VALID() (8.0.17+) |
Via extension or check constraint |
The most meaningful practical difference is indexing strategy: PostgreSQL lets you put a single GIN index on the entire JSONB column and use it for containment, key-existence, and full-text searches across any path — no need to know which keys you will query at schema-design time. MySQL requires you to anticipate which paths need indexing and create explicit functional or multi-valued indexes for each one. For workloads with highly variable query patterns across dynamic JSON schemas, PostgreSQL JSONB's GIN index is significantly more convenient. For workloads with a known set of queryable paths and high write throughput, MySQL's partial in-place update mechanism gives it an edge.
If you are coming from PostgreSQL and miss the @> containment operator, JSON_CONTAINS(target, candidate) is the direct MySQL equivalent. The argument order is reversed from what PostgreSQL users expect: in MySQL, the first argument is the document to search, and the second is the value you are looking for.
- Use the
JSONcolumn type — notTEXT— for any JSON data. Validation on write prevents silent corruption and unlocks all extraction and indexing features. - The
->>operator is the most ergonomic way to extract scalar values forWHERE,ORDER BY, andGROUP BY; it returns an unquoted string that compares correctly with literals. - For selective JSON path queries, add a functional index on the expression:
ADD INDEX ((attributes->>'$.color')). It costs nothing at read time and lets the optimizer use a range scan. - Use
JSON_SET()for upsert-style path writes,JSON_MERGE_PATCH()for applying partial API patch payloads, andJSON_REMOVE()to delete keys — matching the right function to the intent prevents silent data loss. - For columns containing JSON arrays you need to filter by membership, a multi-valued index (MySQL 8.0.17+) with
CAST AS ARRAYenables index-range scans onMEMBER OFandJSON_OVERLAPS(). - MySQL JSON is production-ready for known-schema JSON workloads; PostgreSQL JSONB's GIN index wins for ad-hoc querying across fully dynamic schemas.
Working with JusDB on MySQL
Implementing MySQL JSON functions correctly in production — choosing the right index strategy, avoiding the -> vs ->> trap in WHERE clauses, designing multi-valued indexes that the optimizer actually picks up — takes time that most engineering teams would rather spend on product work. JusDB's MySQL managed service handles index design reviews, schema migrations, and query optimization so your team ships features instead of debugging full table scans.
- Learn more about our MySQL managed database service including JSON workload support, performance tuning, and high-availability configurations.
- Ready to discuss your specific schema? Contact the JusDB team for a free architecture review.
Related reading:
- MySQL Explained: A Complete Guide for Modern Applications — covers storage engines, replication, and connection pooling fundamentals.
- PostgreSQL JSONB vs JSON: Indexing and Performance — deep dive on GIN vs B-tree indexing for PostgreSQL's two JSON types.
- MySQL EXPLAIN: Query Optimization Guide — how to read EXPLAIN output and use it to verify that your JSON path indexes are being selected.