How does a recursive correlated expression speed up distinct queries?

Paul White explained this "trick" in great detail in his post Performance Tuning the Whole Query Plan under Finding the Distinct Values section.

Why the database cannot do this optimization itself?

Is the recursive CTE useful for other data base systems as well?

Optimizer is not perfect and it doesn't implement all possible techniques. People asked Microsoft to implement it. See this Connect item Implement Index Skip Scan. It was closed as Won't Fix, but that doesn't mean that it won't be addressed in the future. Other DBMSes may have implemented it (Connect item says that Oracle implements this optimization). If this kind of optimization is implemented in the DBMS engine, then this "trick" is not needed and optimizer would choose optimal method of calculating the result based on available statistics.

I don't get why this speeds up the query that much.

I am not sure in which cases this approach is beneficial

The simple DISTINCT query scans the whole index. "Scans" means that it reads each and every page of the index from the disk and aggregates the values in memory (or tempdb) to get a list of distinct values.

If you know that the table has a lot of rows, but only few different distinct values, then reading all these repeating values is a waste of time. Recursive CTE forces the server to seek an index for the first distinct value, then seek an index for the second value and so on. "Seek" means that server uses binary search in the index to find the value. Usually one seek requires reading of only few pages from disk. "Index" is a balanced tree.

If the table has only few distinct values it is faster to seek few times, than read all pages of the index. On the other hand, if there are a lot of distinct values then it would be faster to read all pages sequentially, than seeking for each consecutive value. This should give you an idea in what cases this approach is beneficial.

Obviously, if the table is small, it is faster to scan it. Only when the table becomes "large enough" you start seeing the difference in performance.


There is a related question on dba.se: Is it possible to get seek based parallel plan for distinct/group by?