Clustered Index fragmentation vs Index with Included columns fragmentation

Your questions:

Doesn't an Index with included columns have the exact same problem?

Yes.

Is a table with included columns not just the same as a "shadow table" with the same fragmentation problems?

Yes.

Should I migrate to use UserId, TipIndex as a ClusteredIndex instead of Id?

I would, yes.

How to prevent fragmentation?

There are a couple of different types of fragmentation to consider. One is when you only have part of your pages being used because you’ve had page splits. If you have a lot of inserts, this will happen. Don’t stress too much. The other is when you have pages where the subsequent page is in a different extent. Again, I wouldn’t worry too much. If your data is mostly in the buffer cache, it doesn’t really matter if it moves across extents.
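
To make the first kind of fragmentation concrete, here is a rough toy model in Python, not SQL Server's actual storage engine. It assumes a hypothetical page capacity of 8 rows, splits a full page 50/50 on a mid-index insert, and starts a fresh page when a key is appended past the end of the index. All names in it (`PAGE_CAPACITY`, `insert_key`, `average_fullness`) are made up for illustration.

```python
import bisect
import random

PAGE_CAPACITY = 8  # hypothetical rows per page; real 8 KB pages hold many more


def insert_key(pages, key):
    """Insert key into an ordered list of sorted pages, splitting a full page."""
    if not pages:
        pages.append([key])
        return
    # Pick the last page whose first key is <= key (or the first page).
    firsts = [p[0] for p in pages]
    idx = max(bisect.bisect_right(firsts, key) - 1, 0)
    page = pages[idx]
    if len(page) < PAGE_CAPACITY:
        bisect.insort(page, key)
        return
    if idx == len(pages) - 1 and key > page[-1]:
        # Append past the end of the index: new page, old page stays full.
        pages.append([key])
        return
    # Insert into the middle of a full page: split it roughly in half.
    mid = PAGE_CAPACITY // 2
    new_page = page[mid:]
    del page[mid:]
    pages.insert(idx + 1, new_page)
    target = page if key < new_page[0] else new_page
    bisect.insort(target, key)


def average_fullness(keys):
    """Return the fraction of page slots in use after inserting all keys."""
    pages = []
    for k in keys:
        insert_key(pages, k)
    return sum(len(p) for p in pages) / (len(pages) * PAGE_CAPACITY)


if __name__ == "__main__":
    sequential = list(range(800))
    shuffled = sequential[:]
    random.Random(0).shuffle(shuffled)
    print("ascending inserts:", average_fullness(sequential))
    print("random inserts:   ", average_fullness(shuffled))
```

In this model, ascending keys leave every page full, while randomly ordered keys leave pages partly empty after splits. That partly empty space is the "only part of your pages being used" effect described above.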

So... don’t worry about it too much. But don’t bother keeping a complete copy of the data in an order that you won’t actually be querying it by.


It seems you are too concerned about fragmentation. As long as you update statistics regularly, fragmentation shouldn't hurt performance much. You can read more about this in a video shared by Brent Ozar and on another page here. Let me try to answer your questions one by one:

Doesn't an Index with included columns have the exact same problem?

For this purpose, it makes little difference whether a column is a key column or an included column. Key columns are stored at every level of the B-tree, whereas included columns are stored only at the leaf level. Either way, any insert, update, or delete that touches those columns has to maintain the index as well, so the cost is much the same.

Is a table with included columns not just the same as a "shadow table" with the same fragmentation problems?

I'm not sure what you mean by "shadow table". If you mean a copy of the base table, then yes, you would have the same problem as far as fragmentation is concerned.

Should I migrate to use UserId, TipIndex as a ClusteredIndex instead of Id?

Going by your statement, "99% of the times I query on UserId", UserId is a good candidate for the clustered primary key. Since you are not going to use the Id column very often, I don't see any problem with a composite clustered primary key on UserId and TipIndex. In terms of index size, the key is about the same as Id (int, 4 bytes) plus one additional tinyint column (1 byte).
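
The size comparison above can be sketched as a back-of-the-envelope calculation. This assumes UserId is an int and TipIndex a tinyint, as the sizes above imply. The row count and number of nonclustered indexes are made-up inputs, and `key_bytes` and `extra_locator_bytes` are hypothetical helper names. It relies on the fact that, for a clustered table, each nonclustered index row carries the clustering key as its row locator, and it ignores per-row overhead and the case where a nonclustered index already contains those columns.

```python
# Fixed storage sizes of the SQL Server types involved, in bytes.
TYPE_BYTES = {"int": 4, "tinyint": 1}


def key_bytes(column_types):
    """Total width in bytes of a key made of the given column types."""
    return sum(TYPE_BYTES[t] for t in column_types)


def extra_locator_bytes(rows, nonclustered_indexes, old_key, new_key):
    """Rough extra space from carrying a wider clustering key in every
    nonclustered index row (simplified: no row overhead, no deduplication)."""
    return rows * nonclustered_indexes * (key_bytes(new_key) - key_bytes(old_key))


if __name__ == "__main__":
    old_key = ["int"]             # clustered on Id
    new_key = ["int", "tinyint"]  # clustered on (UserId, TipIndex), types assumed
    extra = extra_locator_bytes(10_000_000, 2, old_key, new_key)
    print(key_bytes(old_key), "->", key_bytes(new_key), "bytes per key")
    print("extra bytes across nonclustered indexes:", extra)
```

With those made-up inputs, the wider key costs one extra byte per row per nonclustered index, which is small next to the benefit of clustering on the columns you actually query by.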

Keep in mind that a clustered index is not a separate copy of the data the way a nonclustered index is. Its leaf level is the table's data rows themselves, kept in logical key order.

How to prevent fragmentation?

I would say updating statistics should take priority over dealing with fragmentation. You can use the maintenance scripts from Ola Hallengren, which many DBAs around the world use. You can schedule them weekly or bi-weekly, depending on your requirements.

Hope this helps.