Database SRE

Ola Hallengren's SQL Server Maintenance Solution: Production Setup Guide

Production setup of Ola Hallengren's SQL Server Maintenance Solution: the four jobs that matter, FULL/DIFF/LOG backup cadence for your RPO, DBCC CHECKDB scheduling, IndexOptimize tuning, encryption, and CommandLog-based alerting.

JusDB Team
May 27, 2026
13 min read
0 views

Ola Hallengren's SQL Server Maintenance Solution is the de-facto industry standard for backup, integrity checking, and index/statistics maintenance on SQL Server. Microsoft's built-in Maintenance Plans were deprecated in practice years ago — they're verbose, hard to version, lack proper retry logic, and don't expose enough configuration for production workloads. Ola's scripts, by contrast, are battle-tested at thousands of installations, MIT-licensed, and configurable enough to drop into a SOC 2 audit without modification. This guide walks through a production setup: not a copy-paste of MaintenanceSolution.sql, but the configuration decisions that determine whether the scripts will actually save your data when you need them to.

TL;DR
  • Install via the single MaintenanceSolution.sql script — it creates the CommandLog table, the dbo. stored procedures, and the SQL Agent jobs in one shot.
  • The four jobs you actually run: DatabaseBackup - SYSTEM_DATABASES - FULL, DatabaseBackup - USER_DATABASES - FULL/DIFF/LOG, DatabaseIntegrityCheck, and IndexOptimize.
  • The decisions that matter most: backup destination (UNC path vs local), @Compress = 'Y' by default, @Verify = 'Y' for at least weekly verification, @CheckSum = 'Y' on every backup, @ChangeBackupType = 'Y' to handle databases that lack a FULL backup.
  • For very large databases (>2 TB), use @BlockSize and @BufferCount tuning + multiple @FileCount stripes — defaults are not optimal.
  • Always enable @LogToTable = 'Y' so failures land in the CommandLog table instead of disappearing into SQL Agent history.

Why Ola's Scripts vs. Built-In Maintenance Plans

SQL Server Maintenance Plans (the visual ones in SSMS) have a fundamental architectural problem: they aren't transactional, don't have proper error handling, and there's no clean way to skip a database that's currently offline or in recovery without breaking the whole job. They also produce useless output — figuring out which database failed and why often means parsing SQL Agent's plain-text log files.

Ola's scripts solve all three:

  • Idempotent and skip-safe — a database that's read-only, offline, or unrecoverable is skipped with a logged reason, the next database in the loop continues.
  • Granular logging — every command, its parameters, its start/end time, and any error message lands in dbo.CommandLog as one row. SQL queries against that table answer "what failed last night and when?" in seconds.
  • Configurable per-job — backup destination, compression, encryption, retention, parallelism, fragmentation thresholds — every knob is a parameter on the procedure call.

Installation: One Script, Five Minutes

Download the latest MaintenanceSolution.sql from ola.hallengren.com. Before running, edit two values at the top:

sql
-- Specify the database you want to host the objects.
-- Default is master; we recommend a dedicated 'admin' database.
USE [admin];

-- Edit the BackupDirectory variable to your backup destination.
-- For UNC paths use double backslashes.
SET @BackupDirectory = N'D:\Backups';
-- Or: SET @BackupDirectory = N'\\backup-srv\sqlbackup\$(MSSQL_SERVERNAME)';

-- Should jobs be created? Y/N
SET @CreateJobs = 'Y';

The script creates:

  • dbo.CommandLog — the audit table
  • dbo.DatabaseBackup — the backup procedure
  • dbo.DatabaseIntegrityCheck — DBCC CHECKDB wrapper
  • dbo.IndexOptimize — index rebuild/reorganize + statistics update
  • dbo.CommandExecute — the inner helper that all four procedures use
  • SQL Agent jobs for all of the above (disabled by default — you enable and schedule them)
Don't run from master

The README defaults to master because every server has it. We strongly recommend creating a dedicated admin database first and pointing the script there. Master should hold as little user-defined object as possible — it complicates recovery and version upgrades.

The Four Jobs to Schedule

Job 1: DatabaseBackup — SYSTEM_DATABASES — FULL

Backs up master, msdb, and model. tempdb is never backed up by SQL Server (it's recreated on startup). System database backups are small and fast — schedule daily, retain 14 days.

sql
EXECUTE [dbo].[DatabaseBackup]
    @Databases = 'SYSTEM_DATABASES',
    @Directory = 'D:\Backups',
    @BackupType = 'FULL',
    @Verify = 'Y',
    @Compress = 'Y',
    @CheckSum = 'Y',
    @CleanupTime = 336,                 -- 14 days × 24h
    @LogToTable = 'Y';

Job 2: DatabaseBackup — USER_DATABASES — FULL (Weekly) + DIFF (Daily) + LOG (Hourly)

This is the core RPO machinery. Three schedules, one procedure:

sql
-- Weekly FULL, e.g. Sunday 02:00
EXECUTE [dbo].[DatabaseBackup]
    @Databases = 'USER_DATABASES',
    @Directory = 'D:\Backups',
    @BackupType = 'FULL',
    @Verify = 'Y',
    @Compress = 'Y',
    @CheckSum = 'Y',
    @CleanupTime = 168,                 -- 7 days × 24h
    @LogToTable = 'Y',
    @ChangeBackupType = 'Y';            -- promote LOG → DIFF when no FULL exists yet

-- Daily DIFF, e.g. Mon-Sat 02:00
EXECUTE [dbo].[DatabaseBackup]
    @Databases = 'USER_DATABASES',
    @Directory = 'D:\Backups',
    @BackupType = 'DIFF',
    @Verify = 'Y',
    @Compress = 'Y',
    @CheckSum = 'Y',
    @CleanupTime = 168,
    @LogToTable = 'Y',
    @ChangeBackupType = 'Y';

-- Hourly LOG (only for FULL-recovery databases)
EXECUTE [dbo].[DatabaseBackup]
    @Databases = 'USER_DATABASES',
    @Directory = 'D:\Backups',
    @BackupType = 'LOG',
    @Verify = 'N',                      -- skip verify on hourly LOG to keep job runtime low
    @Compress = 'Y',
    @CheckSum = 'Y',
    @CleanupTime = 168,
    @LogToTable = 'Y';
Decoding the cadence

The FULL/DIFF/LOG triplet gives you a Recovery Point Objective (RPO) of 1 hour and a Recovery Time Objective (RTO) that's dominated by the size of the most recent FULL plus all DIFFs + LOGs since. If your RPO target is < 1 hour, drop LOG frequency to 15 or 5 minutes. If your data is non-critical and SIMPLE recovery is fine, skip LOG entirely.

Job 3: DatabaseIntegrityCheck

Runs DBCC CHECKDB. The catch: CHECKDB can be expensive on multi-TB databases. Two tactical knobs:

sql
EXECUTE [dbo].[DatabaseIntegrityCheck]
    @Databases = 'USER_DATABASES',
    @CheckCommands = 'CHECKDB',
    @LogToTable = 'Y',
    @PhysicalOnly = 'N',                -- 'Y' is faster but skips logical consistency
    @TimeLimit = 7200;                  -- abort after 2h to protect daily window

For databases >1 TB, consider weekly PhysicalOnly = 'N' (full check) plus daily PhysicalOnly = 'Y' (cheaper, catches most disk-level corruption). For databases >5 TB, look at the partitioned-CHECKDB pattern (CHECKFILEGROUP a different filegroup each day on rotation).

Job 4: IndexOptimize

This is where teams over-tune and waste hours of maintenance window. Defaults are good; the only knobs worth touching:

sql
EXECUTE [dbo].[IndexOptimize]
    @Databases = 'USER_DATABASES',
    @FragmentationLow = NULL,                       -- skip <5% fragmentation
    @FragmentationMedium = 'INDEX_REORGANIZE',      -- 5-30%: reorganize
    @FragmentationHigh = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',  -- >30%: rebuild
    @FragmentationLevel1 = 5,
    @FragmentationLevel2 = 30,
    @UpdateStatistics = 'ALL',
    @OnlyModifiedStatistics = 'Y',                  -- only update stats touched since last update
    @LogToTable = 'Y',
    @TimeLimit = 14400;                             -- 4h max
The fragmentation-doesn't-matter argument

Modern SSDs make page-level fragmentation a much smaller performance concern than it was on spinning rust. The maintenance benefit is now more about reclaiming wasted space (rebuilds compact the index) and updating statistics (which is what actually drives query plan selection). Don't burn your maintenance window obsessing over fragmentation thresholds — focus on @UpdateStatistics = 'ALL' with @OnlyModifiedStatistics = 'Y'.

Reading the CommandLog

Every command executed by the scripts lands in dbo.CommandLog. The two queries that matter:

sql
-- Last night's failures (or any non-success)
SELECT
    DatabaseName,
    CommandType,
    CommandText,
    StartTime,
    EndTime,
    DATEDIFF(SECOND, StartTime, EndTime) AS duration_sec,
    ErrorNumber,
    ErrorMessage
FROM dbo.CommandLog
WHERE StartTime > DATEADD(HOUR, -24, GETDATE())
  AND (ErrorNumber IS NOT NULL OR ErrorMessage IS NOT NULL)
ORDER BY StartTime DESC;

-- Long-running commands over the last week (capacity-planning input)
SELECT
    DatabaseName,
    CommandType,
    AVG(DATEDIFF(SECOND, StartTime, EndTime)) AS avg_sec,
    MAX(DATEDIFF(SECOND, StartTime, EndTime)) AS max_sec,
    COUNT(*) AS run_count
FROM dbo.CommandLog
WHERE StartTime > DATEADD(DAY, -7, GETDATE())
GROUP BY DatabaseName, CommandType
ORDER BY max_sec DESC;

Set up an alert that fires when any row in CommandLog from the last 25 hours has a non-null ErrorNumber — that's your backup-failed pager.

Performance Tuning for Large Databases

Defaults work well up to ~500 GB databases. Beyond that, three parameters get meaningful gains:

ParameterDefaultTune for >1 TBWhat it does
@FileCount 1 4–8 Stripes backup across N files; if you have 8-thread network or storage, this gives ~linear speedup
@BlockSize NULL (use default) 65536 I/O block size in bytes; larger blocks reduce per-block overhead on fast storage
@BufferCount NULL (auto) 50–100 Number of I/O buffers; more buffers let SQL Server pre-fetch more data during backup
@MaxTransferSize NULL (1 MB) 4194304 (4 MB) Maximum I/O size; useful on Azure Blob backups where TLS overhead dominates
sql
-- Example: 2 TB database with fast NVMe + 10 GbE network
EXECUTE [dbo].[DatabaseBackup]
    @Databases = 'BigDB',
    @Directory = 'D:\Backups',
    @BackupType = 'FULL',
    @Compress = 'Y',
    @CheckSum = 'Y',
    @FileCount = 8,
    @BlockSize = 65536,
    @BufferCount = 50,
    @MaxTransferSize = 4194304,
    @LogToTable = 'Y';

Backup Encryption

For PCI / HIPAA / SOC 2, backups must be encrypted at rest. Ola's scripts pass straight through to native backup encryption (SQL 2014+):

sql
EXECUTE [dbo].[DatabaseBackup]
    @Databases = 'USER_DATABASES',
    @Directory = 'D:\Backups',
    @BackupType = 'FULL',
    @Encrypt = 'Y',
    @EncryptionAlgorithm = 'AES_256',
    @ServerCertificate = 'BackupCert',  -- pre-created with CREATE CERTIFICATE
    @LogToTable = 'Y';

The BackupCert certificate must be created in master and backed up to a secure offsite location — without it, the encrypted backups are unrecoverable. We've seen this go wrong once; the customer had to restore from a 6-month-old backup. Test certificate restoration as part of the DR drill.

Common Pitfalls

PitfallHow it shows upFix
Backup destination fills up Jobs fail with "Cannot open backup device" or "There is not enough space on the disk" Set @CleanupTime shorter than your retention requirement, monitor the destination drive separately
SQL Service account can't write to UNC backup share Backups fail immediately on the first attempt; CommandLog shows access denied Grant the SQL Server service account NTFS Modify on the share; or use a backup-specific gMSA
@CleanupTime deletes valid backups before next FULL Restore fails because the most recent FULL is gone Set @CleanupTime to at least 2× the FULL cadence; never less than (FULL interval + DIFF retention)
CHECKDB never runs because TimeLimit too low Big database integrity goes unverified; CommandLog shows TimeLimit-aborted entries Move large-database CHECKDB to a weekend slot, or partition-CHECKFILEGROUP across days
IndexOptimize blocks user workload App times out during maintenance window; rebuild took 4h instead of 30min expected Use @MaxDOP to cap parallelism, use ONLINE for Enterprise Edition, schedule during low-traffic window

Monitoring and Alerting

A maintenance solution you don't monitor is just a hope. Three alerts every production deployment should have:

  1. Job failure — SQL Agent job failure → PagerDuty / Slack within 5 minutes. Trivial to set up via Operator + Alert.
  2. Backup age — query msdb.dbo.backupset hourly; if any database hasn't had a successful FULL in N days or LOG in N hours, alert.
  3. CHECKDB skipped — query CommandLog for any database with no CHECKDB run in the last 14 days. Silent corruption is the worst kind.
sql
-- "Has any database not been backed up in 25 hours?"
SELECT
    d.name,
    MAX(b.backup_finish_date) AS last_backup,
    DATEDIFF(HOUR, MAX(b.backup_finish_date), GETDATE()) AS hours_since
FROM sys.databases d
LEFT JOIN msdb.dbo.backupset b
       ON b.database_name = d.name
      AND b.type IN ('D', 'I')           -- FULL or DIFF
WHERE d.database_id > 4                   -- skip system DBs
  AND d.state_desc = 'ONLINE'
GROUP BY d.name
HAVING MAX(b.backup_finish_date) < DATEADD(HOUR, -25, GETDATE())
    OR MAX(b.backup_finish_date) IS NULL;

Working with JusDB on SQL Server Operations

JusDB deploys Ola's Maintenance Solution as the operational backbone on every SQL Server engagement we manage. Our SQL Server remote DBA service includes Ola-scripts setup, custom retention policies tuned to your RPO/RTO targets, and 24/7 monitoring on every backup, integrity check, and index job — so the CommandLog is read by a human before users notice anything. For one-off setup or audit-prep engagements, SQL Server consulting delivers a documented backup-and-DR posture you can hand to your compliance auditor.

Explore SQL Server Remote DBA →  |  SQL Server Consulting →  |  Talk to a SQL Server DBA

Related reading:

Share this article