Optimize Subquery with Windowing Function

When and if you are able to upgrade from SQL Server 2012 to SQL Server 2016, you may be able to take advantage of the much improved performance (especially for frameless window aggregates) provided by the new batch mode Window Aggregate operator.

Almost all large data processing scenarios work better with columnstore storage than rowstore. Even without changing to columnstore for your base tables, you can still gain the benefits of the new 2016 operator and batch mode execution by creating an empty nonclustered columnstore filtered index on one of the base tables, or by redundantly outer joining to a columnstore-organized table.

Using the second option, the query becomes:

-- Just to get batch mode processing and the window aggregate operator
CREATE TABLE #Dummy (a integer NOT NULL, INDEX DummyCC CLUSTERED COLUMNSTORE);

-- Identify any UniqueDates that are greater than the GroupDate within their GroupID
SELECT
    calc_maxUD.UniqueID,
    calc_maxUD.GroupID,
    calc_maxUD.GroupDate,
    calc_maxUD.UniqueDate
FROM 
(
    SELECT
        E.UniqueID,
        E.GroupID,
        E.GroupDate,
        E.UniqueDate,
        maxUniqueDate = MAX(UniqueDate) OVER (
            PARTITION BY GroupID)
    FROM #Example AS E
    LEFT JOIN #Dummy AS D -- The only change to the original query
        ON 1 = 0
) AS calc_maxUD
WHERE 
    calc_maxUD.maxUniqueDate > calc_maxUD.GroupDate
    AND calc_maxUD.maxUniqueDate = calc_maxUD.UniqueDate;

db<>fiddle

Note the only change to the original query is creating an empty temporary table and adding the left join. The execution plan is:

batch mode window aggregate plan

(58 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0
Table '#Example'. Scan count 1, logical reads 40, physical reads 0, read-ahead reads 0

For more information and options, see Itzik Ben-Gan's excellent series, What You Need to Know about the Batch Mode Window Aggregate Operator in SQL Server 2016 (in three parts).


I'm assuming there's no index, as you haven't provided any.

Right off the bat, the following index will eliminate a Sort operator in your plan, which would otherwise potentially consume a lot of memory:

CREATE INDEX IX ON #Example (GroupID, UniqueDate) INCLUDE (UniqueID, GroupDate);

The subquery isn't a performance problem in this case. If anything, I would look at ways to eliminate the window function (MAX... OVER) to avoid the Nested Loop and Table Spool construct.

With the same index, the following query may at first glance look less efficient, and it does go from two to three scans on the base table, but it eliminates a huge number of reads internally because it lacks Spool operators. I'm guessing that it'll still perform better, particularly if you have enough CPU cores and IO performance on your server:

SELECT e.UniqueID
     , e.GroupID
     , e.GroupDate
     , e.UniqueDate
FROM (
    SELECT GroupID, MAX(UniqueDate) AS maxUniqueDate
    FROM #Example
    GROUP BY GroupID) AS agg
INNER JOIN #Example AS e ON agg.GroupID=e.GroupID
WHERE agg.maxUniqueDate > e.GroupDate
    AND agg.maxUniqueDate = e.UniqueDate
OPTION (MERGE JOIN);

(Note: I added a MERGE JOIN query hint, but this should probably happen automatically if your statistics are in order. Best practice is to leave hints like these out if you can.)


I'm just gonna throw the ol' Cross Apply out there:

SELECT e.*
    FROM #Example AS e
    CROSS APPLY ( SELECT TOP 1 e2.UniqueDate AS maxUniqueDate
                    FROM #Example AS e2
                    WHERE e2.GroupID = e.GroupID 
                    ORDER BY e2.UniqueDate DESC
                    ) AS ca
    WHERE ca.maxUniqueDate > e.GroupDate
        AND ca.maxUniqueDate = e.UniqueDate;

With some kinda whatever indexes, it does pretty well.

CREATE CLUSTERED INDEX cx_whatever ON #Example (GroupID)

CREATE UNIQUE NONCLUSTERED INDEX ix_whatever ON #Example (GroupID, UniqueDate DESC, GroupDate)

The stats time and io look like this (your query is the first result)

Table 'Worktable'. Scan count 3, logical reads 28004, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Example'. Scan count 1, logical reads 51, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 15 ms,  elapsed time = 20 ms.

Table '#Example'. Scan count 10001, logical reads 21336, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 16 ms,  elapsed time = 11 ms.

Query plans are here (again, yours is first):

https://www.brentozar.com/pastetheplan/?id=BJYJvqAal

Why I prefer this version? I avoid the spools. If those start spilling to disk, it's gonna get ugly.

But you might wanna try this out, too.

SELECT e.*
    FROM #Example AS e
    CROSS APPLY ( SELECT e2.UniqueDate AS maxUniqueDate
                    FROM #Example AS e2
                    WHERE e2.GroupID = e.GroupID 
                    ) AS ca
    WHERE ca.maxUniqueDate > e.GroupDate
        AND ca.maxUniqueDate = e.UniqueDate;

If this is a large DW, you might prefer the Hash Join, and the row filtering in the join, rather than at the end in the TOP 1 query as a Filter operator.

Plan is here: https://www.brentozar.com/pastetheplan/?id=BkUF55ATx

Stats time and io here:

Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Example'. Scan count 2, logical reads 84, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 16 ms,  elapsed time = 5 ms.

Hope this helps!

One edit, based on @ypercube's idea, and a new index.

CREATE NONCLUSTERED INDEX ix_meh ON #Example (UniqueDate,GroupDate) INCLUDE (UniqueID,GroupID);

WITH t1 AS 
(
    SELECT DISTINCT
    e.GroupID ,
    MAX(UniqueDate) AS MaxUniqueDate
    FROM #Example AS e
    GROUP BY e.GroupID
)
SELECT *
FROM #Example AS e
CROSS APPLY (
SELECT *
FROM t1
    WHERE t1.MaxUniqueDate > e.GroupDate
        AND t1.MaxUniqueDate = e.UniqueDate
        AND t1.GroupID = e.GroupID
) ca

Here's the stats time and io:

Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Example'. Scan count 2, logical reads 91, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 4 ms.

Here's the plan:

https://www.brentozar.com/pastetheplan/?id=SJv8foR6g