Your e-commerce catalog stores product tags as a JSON array — ["wireless", "noise-cancelling", "bluetooth", "over-ear"] — because product managers add tag schemas every quarter and a rigid junction table stopped making sense two years ago. The problem surfaces during a routine query review: SELECT * FROM products WHERE JSON_CONTAINS(tags, '"bluetooth"') is full-scanning 4.2 million rows at 800ms per call, showing up constantly in SLOW_QUERY_LOG. You tried adding a generated column and a regular B-tree index on it, but a single generated column can only capture one element, not a variable-length array. MySQL 8.0.17 shipped a direct solution to this exact problem: multi-valued indexes, which index every element of a JSON array as a separate B-tree entry — enabling fast, indexed lookups against array membership without any schema restructuring.
- Multi-valued indexes (MVI) are a MySQL 8.0.17+ InnoDB feature that creates one B-tree index entry per JSON array element, enabling indexed array membership queries.
- Create them with a
CAST(col->'$.path' AS type ARRAY)expression insideCREATE INDEXor inline inCREATE TABLE. - Three operators use MVIs:
MEMBER OF(),JSON_CONTAINS(), andJSON_OVERLAPS()— regular equality operators andJSON_SEARCH()do not. - On a 4M-row table, MVI turns a 900ms full table scan into a 1.2ms index range scan for JSON array membership queries.
- Limitations are real: InnoDB only, one MVI per JSON path expression per column, and only JSON arrays are supported — not scalar JSON fields.
- MVIs outperform the generated column + regular index workaround for variable-length arrays while requiring less schema surgery.
What Multi-Valued Indexes Actually Are
A standard B-tree index maps one index key to one row. This model breaks down completely for JSON arrays because a single row can legitimately contain zero, one, or fifty array elements, and any of those elements might be the target of a lookup. Before MySQL 8.0.17, there was no way to index the contents of a JSON array directly — you either full-scanned the table using JSON_CONTAINS() or you denormalized the array into a separate table.
A multi-valued index (MVI) solves this by storing multiple index entries for a single row — one entry per array element. The index is physically a standard InnoDB B-tree. MySQL's indexing layer handles the fan-out: when you insert a row whose tags JSON array has four elements, InnoDB creates four index entries, each pointing back to the same row's primary key. When you delete or update the row, all corresponding index entries are removed or updated atomically.
The index is declared using a CAST(expression AS type ARRAY) functional expression. The ARRAY keyword is what signals MySQL to activate the multi-valued indexing behavior rather than treating the expression as a scalar.
Creating Multi-Valued Indexes
Inline in CREATE TABLE
The most common pattern is declaring the MVI at table creation time alongside the rest of the schema:
CREATE TABLE products (
id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
sku VARCHAR(64) NOT NULL,
name VARCHAR(255) NOT NULL,
price_cents INT UNSIGNED NOT NULL,
tags JSON,
attributes JSON,
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (id),
UNIQUE KEY uq_sku (sku),
-- Multi-valued index on the tags array (string elements)
INDEX idx_tags ((CAST(tags->'$.tags' AS CHAR(64) ARRAY))),
-- Multi-valued index on a nested numeric array
INDEX idx_category_ids ((CAST(attributes->'$.category_ids' AS UNSIGNED ARRAY)))
) ENGINE=InnoDB;Note the double parentheses around the CAST expression — this is required syntax for functional key parts in MySQL 8.0. The outer parentheses signal that this is a functional index expression rather than a plain column name.
Adding an MVI to an Existing Table
-- Add MVI to an existing products table (online DDL, non-blocking on MySQL 8.0)
ALTER TABLE products
ADD INDEX idx_tags ((CAST(tags->'$.tags' AS CHAR(64) ARRAY)));
-- Verify the index was created correctly
SHOW CREATE TABLE products\G
-- Check the index in information_schema
SELECT
INDEX_NAME,
COLUMN_NAME,
EXPRESSION,
INDEX_TYPE
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = 'your_db'
AND TABLE_NAME = 'products'
AND INDEX_NAME = 'idx_tags';Supported CAST Types for MVI
The type you specify in CAST(... AS type ARRAY) determines what MySQL casts each array element to before storing it in the index. MySQL 8.0 supports the following types for multi-valued indexes:
-- String elements (VARCHAR-like, requires explicit length)
CAST(col->'$.tags' AS CHAR(64) ARRAY)
-- Unsigned integer elements (product IDs, category IDs)
CAST(col->'$.category_ids' AS UNSIGNED ARRAY)
-- Signed integer elements
CAST(col->'$.adjustments' AS SIGNED ARRAY)
-- Decimal / numeric elements
CAST(col->'$.scores' AS DECIMAL(10,2) ARRAY)
-- Date elements
CAST(col->'$.dates' AS DATE ARRAY)
-- Datetime elements
CAST(col->'$.timestamps' AS DATETIME(6) ARRAY)When using CHAR(n) ARRAY, elements longer than n characters are silently truncated at index time. If your tags can exceed 64 characters, size up appropriately. Truncation means longer tag values will not be found via the index and queries will degrade to a full table scan for those values. Check your data distribution before choosing the length.
Which Operators Use the Multi-Valued Index
This is the most operationally important detail about MVIs: the optimizer will only use a multi-valued index for three specific operators. Every other JSON function, including ones that look like they should benefit, falls back to a full table scan.
MEMBER OF() — Primary MVI Operator
-- Find all products tagged "bluetooth"
-- This query uses the MVI on idx_tags
SELECT id, name, price_cents
FROM products
WHERE 'bluetooth' MEMBER OF(tags->'$.tags');MEMBER OF() is the cleanest and most optimizer-friendly of the three. MySQL can push the predicate directly into an index range scan. It is the preferred operator for single-element membership tests.
JSON_CONTAINS() — Array Subset Check
-- Find products that have ALL of the specified tags (subset check)
-- This query uses the MVI on idx_tags
SELECT id, name
FROM products
WHERE JSON_CONTAINS(tags->'$.tags', '["bluetooth", "wireless"]');JSON_CONTAINS() with an array argument checks whether the first argument is a superset of the second. MySQL decomposes the needle array and performs multiple index lookups, one per element, then intersects the result sets. The MVI is used for each individual lookup.
JSON_OVERLAPS() — Intersection Check
-- Find products that have ANY of the specified tags (intersection check)
-- This query uses the MVI on idx_tags
SELECT id, name
FROM products
WHERE JSON_OVERLAPS(tags->'$.tags', '["bluetooth", "wifi", "5g"]');JSON_OVERLAPS() is the OR-semantics counterpart to JSON_CONTAINS(). It returns rows where at least one element from the needle set appears in the stored array. The MVI is used via a union of index range scans for each needle element.
Always run EXPLAIN FORMAT=JSON on your query before deploying to production. A mistyped path expression or an unsupported operator will silently fall back to a full table scan with no error. Look for "index_name": "idx_tags" and "access_type": "ref" or "range" in the JSON output.
Reading the EXPLAIN Output
Confirming that MySQL is actually using your MVI requires checking EXPLAIN. Here is what to look for:
-- Traditional EXPLAIN
EXPLAIN SELECT id, name
FROM products
WHERE 'bluetooth' MEMBER OF(tags->'$.tags');
-- Output (key columns):
-- +----+-------------+----------+-------+---------------+----------+---------+------+------+-------------+
-- | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
-- +----+-------------+----------+-------+---------------+----------+---------+------+------+-------------+
-- | 1 | SIMPLE | products | ref | idx_tags | idx_tags | 259 | const| 312 | Using where |
-- +----+-------------+----------+-------+---------------+----------+---------+------+------+-------------+
-- JSON EXPLAIN for full detail
EXPLAIN FORMAT=JSON
SELECT id, name
FROM products
WHERE 'bluetooth' MEMBER OF(tags->'$.tags')\GThe critical signals in the output:
type: refortype: range— the optimizer used the index.type: ALLmeans a full table scan; the MVI is not being used.key: idx_tags— the named MVI is selected.rows— should be a fraction of total table rows. If it equals the full table row count, suspect a cast mismatch or path expression typo.Using whereinExtra— normal; MySQL applies a post-filter after the index lookup to handle any elements the B-tree cannot fully resolve.
-- The JSON EXPLAIN will contain a section like:
-- "index_name": "idx_tags",
-- "access_type": "ref",
-- "rows_examined_per_scan": 312,
-- "rows_produced_per_join": 312,
-- "used_key_parts": ["cast(tags->\"$.tags\" as char(64) array)"]Performance Benchmark: Full Scan vs Multi-Valued Index
To make this concrete, here is a reproducible benchmark on a 4-million-row products table with an average of 5 tags per product (20 million index entries total), running on MySQL 8.0.32 on a 4-core 16GB instance.
-- Setup: populate 4M rows with realistic tag arrays
INSERT INTO products (sku, name, price_cents, tags)
SELECT
CONCAT('SKU-', seq),
CONCAT('Product ', seq),
FLOOR(RAND() * 50000) + 99,
JSON_ARRAY(
ELT(FLOOR(RAND()*8)+1, 'wireless','bluetooth','usb-c','noise-cancelling',
'waterproof','5g','wifi6','fast-charge'),
ELT(FLOOR(RAND()*8)+1, 'wireless','bluetooth','usb-c','noise-cancelling',
'waterproof','5g','wifi6','fast-charge'),
ELT(FLOOR(RAND()*8)+1, 'wireless','bluetooth','usb-c','noise-cancelling',
'waterproof','5g','wifi6','fast-charge')
)
FROM seq_1_to_4000000; -- using MariaDB sequence engine or equivalent generator-- Benchmark 1: Full table scan (no MVI, or using JSON_SEARCH which ignores MVI)
SET SESSION query_cache_type = OFF;
SELECT COUNT(*)
FROM products
WHERE JSON_SEARCH(tags, 'one', 'bluetooth') IS NOT NULL;
-- Execution time: ~870ms
-- Handler reads: 4,000,000 (full scan)
-- Benchmark 2: MVI-backed MEMBER OF()
SELECT COUNT(*)
FROM products
WHERE 'bluetooth' MEMBER OF(tags->'$.tags');
-- Execution time: ~1.2ms
-- Handler reads: ~500,000 (index range scan, ~1/8th of rows match)The improvement is roughly 725x on this dataset. In practice the factor depends on selectivity — a highly selective tag (found in 0.1% of rows) will show even more dramatic gains because fewer rows are returned from the index; a near-universal tag (found in 90% of rows) will show less improvement because the optimizer may prefer a full scan anyway.
Because each array element generates a separate index entry, INSERT and UPDATE throughput decreases proportionally to the average array length. A row with 50 tags causes 50 index insertions. For write-heavy workloads with large arrays, benchmark your insert throughput before and after adding an MVI. If write throughput drops beyond acceptable levels, consider capping array size at the application layer or partitioning the write and read paths.
Multi-Valued Index Limitations
MVIs are powerful but come with a specific set of hard constraints. Knowing these before you design your schema prevents painful surprises in production.
InnoDB Only
Multi-valued indexes are an InnoDB storage engine feature. MyISAM, MEMORY, and other engines do not support them. In practice this is rarely a constraint since virtually all MySQL 8.0 production workloads run on InnoDB, but if you are running mixed engines for legacy reasons, confirm before adding an MVI.
One MVI Per JSON Path Per Column
You cannot create two multi-valued indexes on the same JSON path expression of the same column. You can create MVIs on different paths within the same JSON document:
-- This is valid: two MVIs on different paths of the same JSON column
ALTER TABLE products
ADD INDEX idx_tags ((CAST(attributes->'$.tags' AS CHAR(64) ARRAY))),
ADD INDEX idx_category_ids((CAST(attributes->'$.category_ids' AS UNSIGNED ARRAY)));
-- This will fail: duplicate path expression
ALTER TABLE products
ADD INDEX idx_tags_1 ((CAST(attributes->'$.tags' AS CHAR(64) ARRAY))),
ADD INDEX idx_tags_2 ((CAST(attributes->'$.tags' AS CHAR(64) ARRAY)));
-- ERROR 1831 (HY000): Duplicate index 'idx_tags_2'JSON Arrays Only — Not Scalar Fields
The ARRAY keyword in the cast expression requires the JSON path to resolve to a JSON array, not a scalar string, number, or object. If your JSON column stores a mix of arrays and scalars at the same path across different rows, MySQL will raise a runtime error on insert for non-array values.
NULL and Missing Path Behavior
-- NULL JSON column: no index entries created for this row (safe)
INSERT INTO products (sku, name, price_cents, tags) VALUES ('X', 'X', 100, NULL);
-- JSON column where the path does not exist: no index entries (safe)
INSERT INTO products (sku, name, price_cents, tags) VALUES ('Y', 'Y', 100, '{"other": "data"}');
-- JSON column where the path resolves to a non-array: ERROR
INSERT INTO products (sku, name, price_cents, tags) VALUES ('Z', 'Z', 100, '{"tags": "bluetooth"}');
-- ERROR 3907: Passed argument is not a JSON arrayEnforce array type at the application layer or with a CHECK constraint to prevent the third case from reaching the database.
MVI vs Generated Column + Regular Index
Before MVIs, the standard workaround for indexing JSON array data was to extract a scalar value via a generated column and index that. This works for single-value extraction but does not handle variable-length arrays gracefully.
| Criterion | Multi-Valued Index | Generated Column + B-tree Index |
|---|---|---|
| Array support | Native — all elements indexed automatically | One element per generated column; requires schema change per new path |
| Schema changes | One CREATE INDEX statement |
ALTER TABLE to add generated column, then CREATE INDEX |
| Query operators | MEMBER OF(), JSON_CONTAINS(), JSON_OVERLAPS() |
Standard =, IN(), range operators on the extracted scalar |
| Variable-length arrays | Handled automatically; any number of elements | Cannot handle variable length without multiple generated columns |
| Storage overhead | Index entries proportional to total elements across all rows | One additional column per extracted path, stored in the row |
| Write amplification | Higher for large arrays (N elements = N index writes) | Constant (one generated column write per row) |
| MySQL version required | 8.0.17+ | 5.7+ (generated columns), 5.7.6+ (JSON type) |
The generated column approach remains valid when you need to extract a single, well-known scalar field from a JSON document — for example, indexing metadata->>'$.tenant_id' for multi-tenant row filtering. MVIs are the correct choice when the JSON field is genuinely an array of variable membership.
Real-World Use Cases
Product Tags and Feature Flags
-- Schema: products with tags and feature flags stored as JSON arrays
CREATE TABLE products (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255) NOT NULL,
price_cents INT UNSIGNED NOT NULL,
tags JSON, -- ["bluetooth", "wireless", "noise-cancelling"]
flags JSON, -- ["new-arrival", "sale", "limited-edition"]
INDEX idx_tags ((CAST(tags->'$' AS CHAR(64) ARRAY))),
INDEX idx_flags ((CAST(flags->'$' AS CHAR(64) ARRAY)))
) ENGINE=InnoDB;
-- Query: find all products on sale with bluetooth
SELECT p.id, p.name, p.price_cents
FROM products p
WHERE 'bluetooth' MEMBER OF(p.tags->'$')
AND 'sale' MEMBER OF(p.flags->'$')
ORDER BY p.price_cents ASC
LIMIT 20;User Roles and Permissions
CREATE TABLE users (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
email VARCHAR(255) NOT NULL,
roles JSON, -- ["admin", "billing", "read-only"]
INDEX idx_roles ((CAST(roles->'$' AS CHAR(64) ARRAY)))
) ENGINE=InnoDB;
-- Find all users who have either admin or billing role
SELECT id, email
FROM users
WHERE JSON_OVERLAPS(roles->'$', '["admin", "billing"]');Event and Audience Targeting
CREATE TABLE campaigns (
id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255) NOT NULL,
segment_ids JSON, -- [1042, 2081, 3300] — audience segment IDs
INDEX idx_segments ((CAST(segment_ids->'$' AS UNSIGNED ARRAY)))
) ENGINE=InnoDB;
-- Find all campaigns targeting a specific segment
SELECT id, name
FROM campaigns
WHERE 2081 MEMBER OF(segment_ids->'$');MySQL 8.0's optimizer can use index merge on a multi-valued index lookup combined with a regular B-tree index on a different column. For example, if you filter by both tenant_id (regular index) and tags (MVI), MySQL can intersect both result sets. Use EXPLAIN FORMAT=JSON to confirm whether index merge is being applied, and whether the selectivity of each index makes the intersection worthwhile versus a single-index scan with post-filter.
- Create multi-valued indexes with
CAST(col->'$.path' AS type ARRAY)inside a standardCREATE INDEX— no schema restructuring required beyond the index itself. - Use
MEMBER OF(),JSON_CONTAINS(), andJSON_OVERLAPS()to activate MVI-backed index scans; other JSON functions includingJSON_SEARCH()fall back to a full table scan regardless of whether an MVI exists. - Always validate index usage with
EXPLAIN FORMAT=JSONand check for"access_type": "ref"or"range"before shipping to production. - Expect write amplification proportional to average array cardinality — benchmark insert throughput on representative data before enabling MVIs on write-heavy tables.
- Prefer MVIs over the generated column workaround for variable-length arrays; use generated columns only when extracting a single, fixed scalar path from a JSON document.
- Guard against non-array values at the application layer or with a
CHECKconstraint — inserting a scalar JSON value into an MVI-backed path raises a hard error at runtime.
Working with JusDB on MySQL Query Optimization
Multi-valued indexes are one of several MySQL 8.0 features — alongside invisible indexes, hash joins, and the new optimizer hints framework — that engineering teams consistently underutilize in production. Identifying which queries on your specific workload will benefit from an MVI, sizing the CHAR(n) argument correctly against your real data distribution, and validating the EXPLAIN output across all query shapes that touch the indexed column requires hands-on MySQL expertise that most platform teams do not have available day-to-day.
JusDB's DBA team works with engineering teams to audit slow query logs, identify MVI and index optimization opportunities, implement schema changes with zero-downtime DDL strategies on live production tables, and verify query plan stability after changes. We cover MySQL 8.0 query optimization, InnoDB tuning, replication topology review, and ongoing performance monitoring — so your team ships features instead of chasing query regressions.
Explore JusDB MySQL Consulting → | Talk to a MySQL DBA
Related reading: