REORGANIZE seems to work as well as REBUILD all the time

Let me quote another part of that same page, which basically says IT DEPENDS - emphasis mine:

These values provide a rough guideline for determining the point at which you should switch between ALTER INDEX REORGANIZE and ALTER INDEX REBUILD. However, the actual values may vary from case to case. It is important that you experiment to determine the best threshold for your environment.

There's not really a good answer for this except "it depends"... those are just general guidelines and may or may not be appropriate depending on the number of rows, the columns in your index, their data types, how wide the index is, what else is getting written to disk in between writes to those tables, etc.

Like how big should I size tempdb, what is a good threshold for PLE, and what type of car should I buy, the answer is the same: IT DEPENDS.

The big difference, in my environment, is that Reorganize is single threaded, and Rebuild can go as parallel as you have processors. Rebuild also has the added bonus of updating your statistics, so you can forgo separate statistics updates where you've rebuilt.

From my perspective, dealing with some big (100GB to multi-terabyte) tables, I pretty much set everything to rebuild. I've watched Reorganize spend 8 hours on one PK (300GB table) and barely make a dent. I'm solely responsible for about 100TB of data across 8 servers, and our maintenance windows are not the longest (1am - 7am).

I currently use Ola Hallengren scripts, but I am looking to make a move to Minion Reindex in the coming months, since they offer the ability to pre-fetch the objects you'd like to work on, so the whole maintenance window can be spent on maintenance, rather than gathering fragmentation data. The function that gathers fragmentation data can run for quite a while, even in LIMITED mode, on larger tables.

@AaronBertrand has a very good answer and is correct - it depends. I'd like to add that you're only looking at one metric - index fragmentation. That's a good start, but there are other things to consider.

Consider the following scenario (using AdventureWorks2014):

select * 
from Sales.SalesOrderDetail
where SalesOrderID < 50000;

--modify
delete 
from Sales.SalesOrderDetail
where SalesOrderID < 50000;
go

select 
    SchemaName = s.name,
    TableName = t.name, 
    IndexName = idx.name, 
    StatsName = st.name, 
    AvgFragmentationPct = idx.avg_fragmentation_in_percent,
    StatsModifications = st.modification_counter,
    StatsPctModified = 100.*st.modification_counter / st.unfiltered_rows,
    StatsSampleRate = st.sample_rate, 
    StatsLastUpdate = st.last_updated,
    Rows = st.rows
from (
    select s.object_id, s.name, sp.modification_counter, sp.unfiltered_rows, s.stats_id, sp.last_updated, sample_rate = (100.*sp.rows_sampled)/sp.unfiltered_rows , sp.rows
    from sys.stats s 
    cross apply sys.dm_db_stats_properties(s.object_id,s.stats_id) sp 
) st
left join (
        select ips.object_id, i.name, ips.partition_number, ips.avg_fragmentation_in_percent, i.index_id
        from sys.dm_db_index_physical_stats(db_id(),null, null, null, default) ips
        join sys.indexes i 
            on ips.index_id = i.index_id
            and ips.object_id = i.object_id
        ) idx
on st.object_id = idx.object_id
    and st.stats_id = idx.index_id
join sys.tables t 
    on st.object_id = t.object_id
join sys.schemas s
    on t.schema_id = s.schema_id
where s.name = 'Sales' and t.name = 'SalesOrderDetail'
go

delete 
from Sales.SalesOrderDetail
where ProductID = 779
go

select 
    SchemaName = s.name,
    TableName = t.name, 
    IndexName = idx.name, 
    StatsName = st.name, 
    AvgFragmentationPct = idx.avg_fragmentation_in_percent,
    StatsModifications = st.modification_counter,
    StatsPctModified = 100.*st.modification_counter / st.unfiltered_rows,
    StatsSampleRate = st.sample_rate, 
    StatsLastUpdate = st.last_updated,
    Rows = st.rows
from (
    select s.object_id, s.name, sp.modification_counter, sp.unfiltered_rows, s.stats_id, sp.last_updated, sample_rate = (100.*sp.rows_sampled)/sp.unfiltered_rows , sp.rows
    from sys.stats s 
    cross apply sys.dm_db_stats_properties(s.object_id,s.stats_id) sp 
) st
left join (
        select ips.object_id, i.name, ips.partition_number, ips.avg_fragmentation_in_percent, i.index_id
        from sys.dm_db_index_physical_stats(db_id(),null, null, null, default) ips
        join sys.indexes i 
            on ips.index_id = i.index_id
            and ips.object_id = i.object_id
        ) idx
on st.object_id = idx.object_id
    and st.stats_id = idx.index_id
join sys.tables t 
    on st.object_id = t.object_id
join sys.schemas s
    on t.schema_id = s.schema_id
where s.name = 'Sales' and t.name = 'SalesOrderDetail'
go

We create a new auto-created statistics object (_WA_Sys_00000006_44CA3770, in this example) by filtering on SalesOrderID and then we delete rows (~30,000 of them). We then introduce further modifications by deleting rows with a specific ProductID. This is important because it's a leading column in at least one of these indexes/stats which, in turn, is important for selectivity calculation reasons (statistics), and b-tree traversal (index).

enter image description here

The index fragmentation isn't so bad, so let's reorganize.

alter index all on Sales.SalesOrderDetail reorganize;
go

select 
    SchemaName = s.name,
    TableName = t.name, 
    IndexName = idx.name, 
    StatsName = st.name, 
    AvgFragmentationPct = idx.avg_fragmentation_in_percent,
    StatsModifications = st.modification_counter,
    StatsPctModified = 100.*st.modification_counter / st.unfiltered_rows,
    StatsSampleRate = st.sample_rate, 
    StatsLastUpdate = st.last_updated,
    Rows = st.rows
from (
    select s.object_id, s.name, sp.modification_counter, sp.unfiltered_rows, s.stats_id, sp.last_updated, sample_rate = (100.*sp.rows_sampled)/sp.unfiltered_rows , sp.rows
    from sys.stats s 
    cross apply sys.dm_db_stats_properties(s.object_id,s.stats_id) sp 
) st
left join (
        select ips.object_id, i.name, ips.partition_number, ips.avg_fragmentation_in_percent, i.index_id
        from sys.dm_db_index_physical_stats(db_id(),null, null, null, default) ips
        join sys.indexes i 
            on ips.index_id = i.index_id
            and ips.object_id = i.object_id
        ) idx
on st.object_id = idx.object_id
    and st.stats_id = idx.index_id
join sys.tables t 
    on st.object_id = t.object_id
join sys.schemas s
    on t.schema_id = s.schema_id
where s.name = 'Sales' and t.name = 'SalesOrderDetail'
go

enter image description here

There are no changes wrought by a reorganization.

Let's rebuild:

alter index all on Sales.SalesOrderDetail rebuild;
go

select 
    SchemaName = s.name,
    TableName = t.name, 
    IndexName = idx.name, 
    StatsName = st.name, 
    AvgFragmentationPct = idx.avg_fragmentation_in_percent,
    StatsModifications = st.modification_counter,
    StatsPctModified = 100.*st.modification_counter / st.unfiltered_rows,
    StatsSampleRate = st.sample_rate, 
    StatsLastUpdate = st.last_updated,
    Rows = st.rows
from (
    select s.object_id, s.name, sp.modification_counter, sp.unfiltered_rows, s.stats_id, sp.last_updated, sample_rate = (100.*sp.rows_sampled)/sp.unfiltered_rows , sp.rows
    from sys.stats s 
    cross apply sys.dm_db_stats_properties(s.object_id,s.stats_id) sp 
) st
left join (
        select ips.object_id, i.name, ips.partition_number, ips.avg_fragmentation_in_percent, i.index_id
        from sys.dm_db_index_physical_stats(db_id(),null, null, null, default) ips
        join sys.indexes i 
            on ips.index_id = i.index_id
            and ips.object_id = i.object_id
        ) idx
on st.object_id = idx.object_id
    and st.stats_id = idx.index_id
join sys.tables t 
    on st.object_id = t.object_id
join sys.schemas s
    on t.schema_id = s.schema_id
where s.name = 'Sales' and t.name = 'SalesOrderDetail'
go

enter image description here

Ok, our fragmentation numbers are modestly better but not significant and, to your point, perhaps not worth the overhead of rebuilding over reorganization.

There are a few other things worth noting, though.

Rebuilding your indexes also updates your statistics with a fullscan. It should be noted that it only updates the statistics that support the index - no other statistics are updated (as evidenced by the sad state of _WA_Sys_00000006_44CA3770).
Reorganizing your index does not update your statistics in any capacity

Conclusion

There's more to evaluate than whether or not you're getting good defragmentation rates. While certainly important to IO, I've also found it quite helpful to explore my systems and see if I can ascertain how to best maintain it, rather than updating, rebuilding, and reorganizing blindly.

REORGANIZE seems to work as well as REBUILD all the time

Tags:

Sql Server

Sql Server 2012

Index

Related

Recent Posts