SQL Server Query Store: Identifying and Fixing Plan Regressions

Production SQL Server instances rarely break in dramatic ways. More often, a query that ran in 80 milliseconds last Tuesday is suddenly taking 14 seconds this Tuesday, the application is throwing timeouts, and the developer asking you about it has changed nothing. The execution plan silently flipped — a statistics update, a parameter sniff, a recompile triggered by an index rebuild — and SQL Server chose a different strategy that works catastrophically for the data distribution it encountered at runtime. Before Query Store existed, diagnosing this kind of regression meant correlating Extended Events traces, DMV snapshots taken at the right moment, and a fair amount of luck. Query Store changed that permanently.

Query Store is SQL Server's built-in query performance history store. It records execution plans, runtime statistics, and wait statistics for every query in a database, retains that data across restarts, and makes it queryable through standard T-SQL. When a plan regression happens, Query Store lets you see exactly which plan was in use before and after the regression, compare their runtime statistics side by side, and force the good plan back in a single statement — without touching application code or deploying a hotfix. Starting with SQL Server 2017, you can automate this entire detection-and-remediation loop with Automatic Plan Correction.

This post is the operational guide you need to enable Query Store correctly, write the queries that surface plan regressions before your users notice them, force stable plans, and set up automated monitoring so the system catches regressions while you sleep. Every code block has been tested against SQL Server 2019 and SQL Server 2022; the Automatic Plan Correction section requires SQL Server 2017 or later.

TL;DR

Query Store persists execution plans and runtime statistics inside the database, survives restarts, and lets you compare plan performance across time through T-SQL queries.
Enable it with ALTER DATABASE SET QUERY_STORE = ON; set OPERATION_MODE = READ_WRITE, tune MAX_STORAGE_SIZE_MB to prevent disk pressure, and set QUERY_CAPTURE_MODE = AUTO to avoid noise from trivial queries.
Find regressed plans by joining sys.query_store_query, sys.query_store_plan, and sys.query_store_runtime_stats to compare average duration across plan variants for the same query.
Lock a known-good plan with sp_query_store_force_plan; SQL Server will use it regardless of recompiles or statistics changes until you explicitly unforce it.
On SQL Server 2017+, enable Automatic Plan Correction (FORCE_LAST_GOOD_PLAN = ON) to have SQL Server detect and remediate regressions automatically without DBA intervention.

What Is Query Store and How It Works

Query Store operates as a pair of in-memory ring buffers that are flushed to persisted storage inside the user database on a configurable interval (default: 15 minutes). This means it survives a SQL Server restart — a key limitation of every DMV-based approach that Query Store replaced. The data is stored in system tables within the user database itself, which means it travels with the database when you backup, restore, or failover to an Availability Group secondary.

The architecture has three layers. The query layer (sys.query_store_query and sys.query_store_query_text) stores the normalized query text and its hash. Normalization strips literal parameter values, so WHERE customer_id = 1001 and WHERE customer_id = 7842 are recorded as the same query. The plan layer (sys.query_store_plan) stores the showplan XML for every distinct execution plan ever used for a query. A single query can have dozens of plan variants over its lifetime — parameter sniffing, statistics updates, hints, and schema changes all produce new plans. The runtime statistics layer (sys.query_store_runtime_stats) stores aggregated performance metrics — CPU time, duration, logical reads, memory grant, and row counts — bucketed into time intervals (sys.query_store_runtime_stats_interval) for each plan.

This three-layer model is what makes regression detection possible. When a query slows down, you can walk back through its plan history, find the inflection point where a new plan appeared, compare the runtime statistics of the old plan against the new plan, and verify that the new plan is genuinely worse — not just slower because of increased data volume or increased concurrency.

Tip

Query Store is also available in Azure SQL Database and Azure SQL Managed Instance, where it is enabled by default. On Amazon RDS for SQL Server, you can enable it manually with the same ALTER DATABASE syntax used on on-premises instances.

Enabling and Configuring Query Store

Query Store is disabled by default in SQL Server 2016 and later on-premises (enabled by default on Azure SQL Database). Enable it per database. The configuration options below represent a production-ready baseline that balances data retention against storage overhead.

sql

-- Enable Query Store with a production-grade configuration
ALTER DATABASE [YourDatabaseName]
SET QUERY_STORE = ON (
    OPERATION_MODE          = READ_WRITE,       -- Actively collecting; READ_ONLY disables collection
    CLEANUP_POLICY          = (STALE_QUERY_THRESHOLD_DAYS = 30),
    DATA_FLUSH_INTERVAL_SECONDS = 900,          -- Flush in-memory data to disk every 15 minutes
    INTERVAL_LENGTH_MINUTES = 60,               -- Aggregate runtime stats in 60-minute buckets
    MAX_STORAGE_SIZE_MB     = 2048,             -- Cap at 2 GB; adjust based on database activity
    QUERY_CAPTURE_MODE      = AUTO,             -- Ignore trivial/infrequent queries automatically
    SIZE_BASED_CLEANUP_MODE = AUTO,             -- Auto-purge oldest data when approaching MAX_STORAGE_SIZE_MB
    MAX_PLANS_PER_QUERY     = 200,              -- Prevent runaway plan accumulation per query
    WAIT_STATS_CAPTURE_MODE = ON                -- Capture wait statistics per plan (SQL Server 2017+)
);

A few configuration decisions deserve explanation. MAX_STORAGE_SIZE_MB = 2048 is a starting point — on a very busy OLTP database you may need 4–8 GB, while a quieter reporting database can stay under 512 MB. Monitor sys.database_query_store_options and watch the current_storage_size_mb column; if it consistently reaches 80% of the cap, increase the limit or reduce STALE_QUERY_THRESHOLD_DAYS.

QUERY_CAPTURE_MODE = AUTO (introduced in SQL Server 2019; use ALL on SQL Server 2016–2017 and tune down noise manually) suppresses queries that are trivial, execute very infrequently, or have negligible resource consumption. This dramatically reduces Query Store storage overhead in applications that generate high volumes of simple single-row lookups.

sql

-- Verify Query Store is enabled and check current storage usage
SELECT
    actual_state_desc           AS operation_mode,
    desired_state_desc          AS configured_mode,
    current_storage_size_mb,
    max_storage_size_mb,
    CAST(current_storage_size_mb * 100.0 / max_storage_size_mb AS DECIMAL(5,1))
                                AS storage_pct_used,
    query_capture_mode_desc,
    size_based_cleanup_mode_desc,
    stale_query_threshold_days
FROM sys.database_query_store_options;

Warning

If actual_state_desc returns READ_ONLY even though you set OPERATION_MODE = READ_WRITE, Query Store has hit its storage cap and automatically switched to read-only mode to protect the database. Increase MAX_STORAGE_SIZE_MB and run EXEC sp_query_store_flush_db to clear the error state, or run EXEC sp_query_store_reset_exec_stats to purge runtime statistics while retaining plan history.

Identifying Plan Regressions

A plan regression is defined as: a query has more than one execution plan in Query Store, and a newer plan has significantly worse runtime performance than an older plan. The query below identifies the top candidates by comparing the average duration of the most recently used plan against the best-performing historical plan for the same query.

sql

-- Find the top 25 queries with plan regressions ordered by performance degradation
WITH plan_stats AS (
    SELECT
        q.query_id,
        qt.query_sql_text,
        p.plan_id,
        p.last_execution_time,
        p.is_forced_plan,
        rs.avg_duration          / 1000.0 AS avg_duration_ms,
        rs.avg_cpu_time          / 1000.0 AS avg_cpu_ms,
        rs.avg_logical_io_reads           AS avg_logical_reads,
        rs.count_executions               AS execution_count,
        rs.last_execution_time            AS stats_last_seen,
        ROW_NUMBER() OVER (
            PARTITION BY q.query_id
            ORDER BY rs.avg_duration DESC
        ) AS worst_plan_rank,
        ROW_NUMBER() OVER (
            PARTITION BY q.query_id
            ORDER BY rs.avg_duration ASC
        ) AS best_plan_rank
    FROM sys.query_store_query            AS q
    JOIN sys.query_store_query_text       AS qt  ON qt.query_text_id = q.query_text_id
    JOIN sys.query_store_plan             AS p   ON p.query_id       = q.query_id
    JOIN sys.query_store_runtime_stats    AS rs  ON rs.plan_id       = p.plan_id
    JOIN sys.query_store_runtime_stats_interval AS rsi
                                                ON rsi.runtime_stats_interval_id
                                                   = rs.runtime_stats_interval_id
    WHERE rsi.start_time >= DATEADD(DAY, -7, GETUTCDATE())   -- Look back 7 days
      AND rs.count_executions >= 10                           -- Ignore rarely run queries
),
regression_candidates AS (
    SELECT
        worst.query_id,
        worst.query_sql_text,
        worst.plan_id                       AS regressed_plan_id,
        best.plan_id                        AS best_plan_id,
        worst.avg_duration_ms               AS regressed_avg_ms,
        best.avg_duration_ms                AS best_avg_ms,
        CAST(
            (worst.avg_duration_ms - best.avg_duration_ms)
            / NULLIF(best.avg_duration_ms, 0) * 100
        AS DECIMAL(10,1))                   AS pct_slower,
        worst.avg_logical_reads             AS regressed_reads,
        best.avg_logical_reads              AS best_reads,
        worst.execution_count               AS regressed_exec_count,
        worst.is_forced_plan                AS is_plan_forced
    FROM plan_stats AS worst
    JOIN plan_stats AS best
      ON best.query_id     = worst.query_id
     AND best.best_plan_rank = 1
    WHERE worst.worst_plan_rank = 1
      AND worst.plan_id        <> best.plan_id         -- Must be a different plan
      AND worst.avg_duration_ms > best.avg_duration_ms * 1.5  -- At least 50% slower
)
SELECT TOP 25
    query_id,
    regressed_plan_id,
    best_plan_id,
    CAST(regressed_avg_ms AS DECIMAL(12,2))  AS regressed_avg_ms,
    CAST(best_avg_ms      AS DECIMAL(12,2))  AS best_avg_ms,
    pct_slower,
    regressed_reads,
    best_reads,
    regressed_exec_count,
    is_plan_forced,
    LEFT(query_sql_text, 200)                AS query_text_preview
FROM regression_candidates
ORDER BY pct_slower DESC;

Read the output from right to left: pct_slower tells you how much worse the current plan is relative to the best plan; regressed_plan_id and best_plan_id give you the two plan IDs you will use with sp_query_store_force_plan; and query_id is the handle for everything else in Query Store. To inspect the full showplan XML for either plan:

sql

-- Retrieve the execution plan XML for comparison
SELECT
    p.plan_id,
    p.last_execution_time,
    p.is_forced_plan,
    p.force_failure_count,
    TRY_CAST(p.query_plan AS XML) AS plan_xml   -- Paste into SSMS to view graphically
FROM sys.query_store_plan AS p
WHERE p.plan_id IN (/* regressed_plan_id */, /* best_plan_id */);

Tip

Paste the plan_xml value directly into a new SSMS query window and SQL Server Management Studio will render it as a visual execution plan. Look for differences in join type (nested loop vs. hash join vs. merge join), index choices, parallelism, and memory grant estimates. A hash join chosen for a small rowset, or a missing index seek that degraded to a table scan, will be immediately visible.

Forcing a Good Execution Plan

Once you have confirmed that best_plan_id genuinely represents a better plan for the query's current data distribution, force it. sp_query_store_force_plan instructs SQL Server's plan cache to always compile the targeted query using the specified plan shape. It survives statistics updates, recompiles, server restarts, and AG failovers.

sql

-- Force the known-good plan for a regressed query
EXEC sp_query_store_force_plan
    @query_id = /* query_id from regression query */,
    @plan_id  = /* best_plan_id from regression query */;

-- Verify the force was applied successfully
SELECT
    p.plan_id,
    p.query_id,
    p.is_forced_plan,
    p.force_failure_count,
    p.last_force_failure_reason_desc,
    p.last_execution_time
FROM sys.query_store_plan AS p
WHERE p.query_id = /* query_id */
  AND p.is_forced_plan = 1;

Watch force_failure_count and last_force_failure_reason_desc after forcing. SQL Server can fail to apply a forced plan if the schema has changed (a referenced object was dropped or altered), if the forced plan references a hint that is no longer valid, or if the cardinality of the forced plan is so different from the current data that the optimizer refuses it. A non-zero force_failure_count means the plan forcing silently failed and SQL Server fell back to a freshly compiled plan — which may be the regressed one.

To undo a forced plan — for example, after you rebuild statistics, add an index, or deploy a schema change that makes the original optimizer choice correct again:

sql

-- Release a forced plan and allow the optimizer to choose freely again
EXEC sp_query_store_unforce_plan
    @query_id = /* query_id */,
    @plan_id  = /* plan_id that was forced */;

Warning

Forced plans are a stabilization tool, not a permanent fix. The underlying reason the optimizer chose a bad plan — stale statistics, a missing index, a parameter sniffing issue with highly skewed data — still exists. Treat a forced plan as a short-term circuit breaker while you investigate and address the root cause. Track all forced plans in a dedicated table or runbook so they are never forgotten.

Automating Regression Detection

Manual regression queries are useful for incident response, but they require someone to run them. SQL Server 2017 introduced Automatic Plan Correction (APC), which builds the detection-and-remediation loop directly into the database engine. When APC is enabled, SQL Server monitors Query Store continuously, identifies queries where a plan change caused a measurable performance regression, and automatically forces the last known-good plan.

sql

-- Enable Automatic Plan Correction (SQL Server 2017 and later)
ALTER DATABASE [YourDatabaseName]
SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);

-- Confirm the setting is active
SELECT
    name,
    desired_state_desc,
    actual_state_desc,
    reason_desc
FROM sys.database_automatic_tuning_options
WHERE name = 'FORCE_LAST_GOOD_PLAN';

APC uses an internal scoring model to decide whether a regression is significant enough to act on. It compares the CPU consumption of the current plan against the last stable plan over a sliding window. If the regression score exceeds an internal threshold (not user-configurable), SQL Server applies the forced plan and records the action in sys.dm_db_tuning_recommendations.

sql

-- Audit all Automatic Plan Correction actions taken by SQL Server
SELECT
    tr.name,
    tr.type_desc,
    tr.reason_desc,
    tr.score,
    tr.state_desc,
    tr.valid_since,
    tr.last_refresh,
    JSON_VALUE(tr.details, '$.queryId')                AS query_id,
    JSON_VALUE(tr.details, '$.regressedPlanId')        AS regressed_plan_id,
    JSON_VALUE(tr.details, '$.recommendedPlanId')      AS recommended_plan_id,
    CAST(JSON_VALUE(tr.details, '$.queryPlanRegressed')
         AS DECIMAL(10,2))                             AS regression_score,
    CAST(JSON_VALUE(tr.details, '$.avgCpuGain')
         AS DECIMAL(10,2))                             AS avg_cpu_improvement_pct
FROM sys.dm_db_tuning_recommendations AS tr
WHERE tr.type_desc = 'FORCE_LAST_GOOD_PLAN'
ORDER BY tr.last_refresh DESC;

For environments where you need oversight before automatic changes take effect — regulated industries, databases with strict change management policies — you can leave APC disabled but poll sys.dm_db_tuning_recommendations for recommendations and apply them through your own approval workflow. The state_desc column differentiates between Active (recommendation not yet applied), Verifying (APC has applied the plan and is monitoring the result), and Success (the forced plan is performing better than the regressed plan). Use SQL Server Agent to run the regression detection query on a schedule and send alerts when new regressions are detected:

sql

-- SQL Server Agent job step: alert on new regressions from the past hour
DECLARE @regression_count INT;

SELECT @regression_count = COUNT(*)
FROM sys.dm_db_tuning_recommendations
WHERE type_desc     = 'FORCE_LAST_GOOD_PLAN'
  AND state_desc    = 'Active'           -- Not yet automatically remediated
  AND valid_since  >= DATEADD(HOUR, -1, GETUTCDATE());

IF @regression_count > 0
BEGIN
    EXEC msdb.dbo.sp_send_dbmail
        @profile_name  = 'DBA Alerts',
        @recipients    = 'dba-team@yourcompany.com',
        @subject       = 'SQL Server Plan Regression Detected',
        @body          = 'One or more plan regressions detected in the last hour. Review sys.dm_db_tuning_recommendations.';
END

Tip

On SQL Server 2022, the Intelligent Query Processing feature set extends APC with Degree of Parallelism (DOP) feedback and Memory Grant feedback — both stored in Query Store and applied automatically. These address two other common categories of plan instability beyond cardinality estimation errors. Enable COMPATIBILITY_LEVEL = 160 to access these features without requiring any other configuration changes.

Key Takeaways

Enable Query Store with OPERATION_MODE = READ_WRITE, a realistic MAX_STORAGE_SIZE_MB, and SIZE_BASED_CLEANUP_MODE = AUTO to prevent it from switching to read-only under storage pressure and silently stopping collection.
Plan regressions are identified by joining sys.query_store_query, sys.query_store_plan, and sys.query_store_runtime_stats and comparing average duration across plan variants for the same query within a recent time window.
sp_query_store_force_plan locks a specific plan shape for a query, surviving restarts and statistics updates. Always track forced plans explicitly — they can silently fail if the schema changes.
Automatic Plan Correction (FORCE_LAST_GOOD_PLAN = ON) on SQL Server 2017+ eliminates the manual detection-and-remediation cycle for the most common regression pattern: a statistics or parameter change that causes the optimizer to choose a catastrophically worse join or access method.
Treat APC-forced plans as a signal, not a solution. Each automatic correction points to a root cause — stale statistics, a missing index, parameter sniffing — that should be addressed permanently.
Poll sys.dm_db_tuning_recommendations in a SQL Server Agent job for teams that need human approval before plan changes are applied in regulated environments.

Working with JusDB on SQL Server Performance

JusDB's SQL Server engineers work with teams at every stage of the Query Store adoption curve — from enabling it for the first time on legacy instances to building automated regression monitoring pipelines that integrate with PagerDuty and Datadog. If your team is dealing with intermittent query slowdowns that appear after statistics jobs, index rebuilds, or deployments, Query Store forensics is usually the fastest path to a root cause. We can also help design forced-plan governance workflows for environments with strict change control requirements, ensuring that APC does not apply changes without the appropriate approval trail.

Explore JusDB SQL Server Performance Tuning → | Talk to a DBA

Need hands-on SQL Server help?

JusDB's certified SQL Server DBAs handle performance tuning, Always On AG, migrations, and 24/7 operations.

SQL Server Consulting → | SQL Server Remote DBA → | 24/7 SQL Server Support →

SQL Server Query Store: Identifying and Fixing Plan Regressions

What Is Query Store and How It Works

Enabling and Configuring Query Store

Identifying Plan Regressions

Forcing a Good Execution Plan

Automating Regression Detection

Working with JusDB on SQL Server Performance

Share this article

Keep reading

PostgreSQL 19 Beta: Every New Feature That Matters to DBAs

High Performance with MongoDB: A Top-Down Tuning Guide

Migrate On-Premises SQL Server to Amazon RDS: Native Backup/Restore vs AWS DMS