Alternative way to compress NVARCHAR(MAX)?

Neither page compression nor row compression compresses BLOBs.

Because of their size, large-value data types are sometimes stored separately from the normal row data on special purpose pages. Data compression is not available for the data that is stored separately.

If you want to compress BLOBs you need to store them as VARBINARY(MAX) and apply your stream compression algorithm of choice, for example GZipStream. There are many examples of how to do this; just search for GZipStream and SQLCLR.
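
For illustration only, here is a minimal sketch of what this looks like from the T-SQL side. The scalar functions dbo.GZipCompress and dbo.GZipDecompress are hypothetical SQLCLR wrappers around GZipStream (they are not built in), and the table and column names are placeholders:

```sql
-- dbo.GZipCompress / dbo.GZipDecompress are hypothetical SQLCLR wrappers
-- around GZipStream; the table and column names are placeholders.
CREATE TABLE dbo.BigText
(
    Id             INT IDENTITY(1, 1) PRIMARY KEY,
    BodyCompressed VARBINARY(MAX) NULL  -- GZip-compressed UTF-16 bytes
);

-- Compress on the way in...
INSERT INTO dbo.BigText (BodyCompressed)
VALUES (dbo.GZipCompress(CONVERT(VARBINARY(MAX), N'...some long NVARCHAR(MAX) value...')));

-- ...and decompress (then convert back to NVARCHAR) on the way out.
SELECT Id,
       CONVERT(NVARCHAR(MAX), dbo.GZipDecompress(BodyCompressed)) AS Body
FROM   dbo.BigText;
```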


There are (now) potentially two ways to accomplish custom compression:

  1. Starting in SQL Server 2016, there are built-in COMPRESS and DECOMPRESS functions. These functions use the GZip algorithm (a minimal usage sketch follows this list).

  2. Use SQLCLR to implement any algorithm you choose (as @Remus mentioned in his answer). This option is available in versions prior to SQL Server 2016, going all the way back to SQL Server 2005.

    GZip is an easy choice because it is available within .NET and in the supported .NET Framework libraries (the code can be in a SAFE Assembly). Or, if you want GZip but don't want to deal with coding/deploying it, you can use the Util_GZip and Util_GUnzip functions that are available in the Free version of the SQL# SQLCLR library (which I am the author of).

    If you decide to use GZip, whether you code it yourself or use SQL#, please be aware that the algorithm used in .NET to do the GZip compression changed in Framework version 4.5 for the better (see the "Remarks" section on the MSDN page for GZipStream Class). This means:

    1. If you are using SQL Server 2005, 2008, or 2008 R2 -- all linked to CLR v 2.0, which handles Framework versions 2.0, 3.0, and 3.5 -- then the change made in Framework version 4.5 has no effect and you are unfortunately stuck with .NET's original, sucky algorithm.
    2. If you are using SQL Server 2012 or newer (so far 2014 and 2016) -- all linked to CLR v 4.0, which handles Framework versions 4.0, 4.5.x, and 4.6 -- then you can use the newer, better algorithm. The only requirement is that you have updated the .NET Framework on the server running SQL Server to version 4.5 or newer.

    However, you don't have to use GZip and are free to implement any algorithm you like.
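
As a rough sketch of option 1 above (SQL Server 2016 or newer; the sample string is just a placeholder):

```sql
-- COMPRESS returns the GZip-compressed bytes as VARBINARY(MAX);
-- DECOMPRESS reverses it, and the result must be cast back to NVARCHAR(MAX).
DECLARE @Original   NVARCHAR(MAX)  = REPLICATE(CONVERT(NVARCHAR(MAX), N'some repetitive text '), 1000);
DECLARE @Compressed VARBINARY(MAX) = COMPRESS(@Original);

SELECT DATALENGTH(@Original)   AS OriginalBytes,
       DATALENGTH(@Compressed) AS CompressedBytes,
       CAST(DECOMPRESS(@Compressed) AS NVARCHAR(MAX)) AS RoundTripped;
```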

PLEASE NOTE: all of the methods noted above are really "work-arounds" rather than actual replacements, even though they are technically "alternative ways to compress NVARCHAR(MAX)" data. The difference is that with the built-in Data Compression -- row and page -- offered by SQL Server, the compression is handled behind the scenes and the data is still usable, readable, and indexable. Compressing data into a VARBINARY saves space, but you give up some functionality. True, a 20k string is not indexable anyway, but it can still be used in a WHERE clause, or with any string functions. In order to do anything with a custom-compressed value you would need to decompress it on the fly (see the sketch below). When compressing binary files (PDFs, JPEGs, etc.) this is a non-issue, but this question was specific to NVARCHAR data.
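
To make that trade-off concrete, here is a rough sketch of filtering on custom-compressed data, using the 2016+ built-ins and assuming the hypothetical dbo.BigText table above was populated via COMPRESS (with a SQLCLR function the shape is the same, just substitute its decompress counterpart):

```sql
-- Every candidate row has to be decompressed before the predicate can be
-- evaluated, so no index on the compressed column can help this search.
SELECT Id
FROM   dbo.BigText
WHERE  CAST(DECOMPRESS(BodyCompressed) AS NVARCHAR(MAX)) LIKE N'%search term%';
```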