Database size - MDF too large?

In your size estimates, have you taken into account the amount of space taken by indexes? Also if you have text fields that are set as multi-byte (N[VAR]CHAR rather than [VAR]CHAR) and the input files are UTF-8 or plain one-byte-per-character then that will push your storage requirements up by up to a factor of two. Furthermore remember that if you have a clustered key/index on a table the size of this affects all other indexes on the table because they include the clustered key value for every row (so to give an extreme example if a table has an NCHAR(10) key where an INT would do and that is your clustered key/index you are not only using an extra 16 bytes per row in the data pages you also waste 16 bytes per row in every other index on that table).

Also, some space will be allocated but unused, either because the DB engine has left some space allocated after deletes so that it can be used again quickly for new data in that table or because the pattern of inserts and deletes has left many pages only part full.

You can run:

SELECT o.name
     , SUM(ps.reserved_page_count)/128.0 AS ReservedMB
     , SUM(ps.used_page_count)/128.0 AS UsedMB
     , SUM(ps.reserved_page_count-ps.used_page_count)/128.0 AS DiffMB
FROM sys.objects o  
JOIN sys.dm_db_partition_stats ps ON o.object_id = ps.object_id  
WHERE OBJECTPROPERTYEX(o.object_id, 'IsMSShipped') = 0  
GROUP BY o.name  
ORDER BY SUM(ps.reserved_page_count) DESC

to get a quick look at what tables are taking up space.

Also EXEC sp_spaceused run within that DB will return two result sets. The first lists the total space allocated in the filesystem for the data files and how much of that is unallocated, the second lists how much of the allocated space is used for data pages, for index pages, or is currently unused.

sp_spaceused will return the space used by a given object too, so you can loop this to build a table for analysis:

-- TEMP TABLES FOR ANALYSIS
CREATE TABLE #tTables (sName NVARCHAR(MAX), iRows BIGINT, iReservedKB BIGINT, iDataKB BIGINT, iIndexKB BIGINT, iUnusedKB BIGINT)
CREATE TABLE #tTmp (sName NVARCHAR(MAX), iRows BIGINT, sReservedKB NVARCHAR(MAX), sDataKB NVARCHAR(MAX), sIndexKB NVARCHAR(MAX), sUnusedKB NVARCHAR(MAX))
-- COLLECT SPACE USE PER TABLE
EXEC sp_msforeachtable 'INSERT #tTmp EXEC sp_spaceused [?];'
-- CONVERT NUMBER-AS-TEXT COLUMNS TO NUMBER TYPES FOR EASIER ANALYSIS
INSERT #tTables SELECT sName, iRows
                     , CAST(REPLACE(sReservedKB, ' KB', '') AS BIGINT)
                     , CAST(REPLACE(sDataKB    , ' KB', '') AS BIGINT)
                     , CAST(REPLACE(sIndexKB   , ' KB', '') AS BIGINT)
                     , CAST(REPLACE(sUnusedKB  , ' KB', '') AS BIGINT) 
                FROM #tTmp
DROP TABLE #tTmp 
-- DO SOME ANALYSIS 
SELECT sName='TOTALS', iRows=SUM(iRows), iReservedKB=SUM(iReservedKB), iDataKB=SUM(iDataKB),  iIndexKB=SUM(iIndexKB), iUnusedKB=SUM(iUnusedKB) FROM #tTables ORDER BY sName
SELECT * FROM #tTables ORDER BY iReservedKB DESC
-- CLEAN UP
DROP TABLE #tTables

The above code will output all the table sizes in one list, plus a single row for the totals. If needed you can use the various system views (like sys.objects and sys.dm_db_partition_stats used in the first query above, see http://technet.microsoft.com/en-us/library/ms177862.aspx for much more detail) to get more details such as the space used by each index.


There are three classes of unused space in a data file:

  1. That which is not allocated to anything (this shows in the first resultset from sp_spaceused with no object specified)
  2. That which is allocated to an object (reserved) but not currently used (this shows in the "unused" count in sp_spaceused's output.
  3. That locked in part-used pages (this will look to be used as everything is allocated in single page chunks, one page being 8,192 bytes long). This is harder to detect/calculate. It is due to a mix of two factors:
    • Split pages. As data gets added you often end up with part empty pages (the storage engine could always normalise page contents, but this would be very inefficient), and as rows are deleted page contents are not automatically packed (again they could be, but the extra I/O load is generally far from worth it).
    • The storage engine won't split a row over multiple pages (this along with the page size where the 8,192 byte-per-row limit comes from). If your rows are fixed size and take 1,100 bytes each then you are going to "waste" at least 492 bytes of each data block allocated to that table (7 rows take 7,700 bytes and an 8th won't fit so the remaining bytes won't be used). The wider the rows, the worse this may be. Tables/indexes with variable length rows (which are far more common than completely fixed length ones) generally fair better (but are less easy to calculate the matter for).
      Another caveat here is large objects (TEXT columns, [N]VARCHAR(MAX) values above a certain size and so on) as they do get placed off-page, just taking 8 bytes in the main row data to hold a pointer to the data elsewhere) so can break the 8,192 bytes-per-row-limit.

tl;dr: Estimating expected database sizes can be a lot more involved than it is natural to initially assume.


Try running sp_spaceused on your database. As an example it returns:

reserved           data               index_size         unused
------------------ ------------------ ------------------ ------------------
6032 KB            2624 KB            1664 KB            1744 KB

To run it on the database just USE the database then run sp_spaceused.

If it still shows a great deal of unused space you can try the shrink again. Sometimes I do find that it takes multiple tries. Also sometimes I find it works best to shrink the individual file rather than the database as a whole. However what you may find is that you have 2.9Tb of data and another 4+Tb of indexes in which case the 7.5TB is pretty reasonable. If you want to get a feel for the amount of space (data & index) of each table then you can run sp_spaceused at a table level as well. You can run it across all tables in the database by using the following command:

EXEC sp_msforeachtable 'EXEC sp_spaceused [?];'

Although fair warning sp_msforeachtable is undocumented, unsupported and has been known to miss tables. On the other hand I've had a fair amount of luck with it myself.

All of that being said your database SHOULD have a certain percentage of free space depending on your expected growth. Basically you want to make sure that you have space for anywhere from 6 months to a couple of years worth of growth. Also you will want to check your autogrowth settings to make sure they are appropriate to your situation. Particularly given the size of your database you do NOT want to be using a % autogrowth.