Why does a subquery reduce the row estimate to 1?

This definitely seems like unintended behavior. It is true that cardinality estimates do not need to be consistent at each step of a plan but this is a relatively simple query plan and the final cardinality estimate is inconsistent with what the query is doing. Such a low cardinality estimate could result in poor choices for join types and access methods for other tables downstream in a more complicated plan.

Through trial and error we can come up with a few similar queries for which the issue does not appear:

SELECT 
  ID
, CASE
    WHEN ID <> 0 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE) 
    ELSE (SELECT -1) 
  END AS ID2
FROM dbo.X_HEAP;

SELECT 
  ID
, CASE
    WHEN ID < 500 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE) 
    WHEN ID >= 500 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE_2) 
  END AS ID2
FROM dbo.X_HEAP;

We can also come up with more queries for which the issue appears:

SELECT 
  ID
, CASE
    WHEN ID < 500 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE) 
    WHEN ID >= 500 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE_2) 
    ELSE (SELECT TOP 1 ID FROM X_OTHER_TABLE) 
  END AS ID2
FROM dbo.X_HEAP;

SELECT 
  ID
, CASE
    WHEN ID = 0 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE) 
    ELSE (SELECT -1) 
  END AS ID2
FROM dbo.X_HEAP;

SELECT 
  ID
, CASE
    WHEN ID = 0 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE) 
    ELSE (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE_2) 
  END AS ID2
FROM dbo.X_HEAP;

There appears to be a pattern: if there is an expression within the CASE that is not expected to be executed and the result expression is a subquery against a table then the row estimate falls to 1 after that expression.

If I write the query against a table with a clustered index the rules change somewhat. We can use the same data:

CREATE TABLE dbo.X_CI (ID INT NOT NULL, PRIMARY KEY (ID))

INSERT INTO dbo.X_CI WITH (TABLOCK)
SELECT * FROM dbo.X_HEAP;

UPDATE STATISTICS X_CI WITH FULLSCAN;

This query has a 1000 row final estimate:

SELECT 
  ID
, CASE
    WHEN ID = 0 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE_2) 
    ELSE (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE) 
  END
FROM dbo.X_CI;

But this query has a 1 row final estimate:

SELECT 
  ID
, CASE
    WHEN ID <> 0 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE) 
    ELSE (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE_2) 
  END
FROM dbo.X_CI;

To dig into this further we can use the undocumented trace flag 2363 to get information about how the query optimizer performed selectivity calculations. I found it helpful to pair that trace flag with the undocumented trace flag 8606. TF 2363 seems to give selectivity computations for both the simplified tree and the tree after project normalization. Having both trace flags enabled makes it clear which calculations apply to which tree.

Let's try it for the original query posted in the question:

SELECT 
  ID
, CASE
    WHEN ID <> 0 
    THEN (SELECT TOP 1 ID FROM X_OTHER_TABLE) 
    ELSE (SELECT TOP 1 ID FROM X_OTHER_TABLE_2) 
  END AS ID2
FROM X_HEAP
OPTION (QUERYTRACEON 3604, QUERYTRACEON 2363, QUERYTRACEON 8606);

Here is part of the part of output which I think is relevant along with some comments:

Plan for computation:

  CSelCalcColumnInInterval -- this is the type of calculator used

      Column: QCOL: [SE_DB].[dbo].[X_HEAP].ID -- this is the column used for the calculation

Pass-through selectivity: 0 -- all rows are expected to have a true value for the case expression

Stats collection generated: 

  CStCollOuterJoin(ID=8, CARD=1000 x_jtLeftOuter) -- the row estimate after the join will still be 1000

      CStCollBaseTable(ID=1, CARD=1000 TBL: X_HEAP)

      CStCollBaseTable(ID=2, CARD=1 TBL: X_OTHER_TABLE)

...

Plan for computation:

  CSelCalcColumnInInterval

      Column: QCOL: [SE_DB].[dbo].[X_HEAP].ID

Pass-through selectivity: 1 -- no rows are expected to have a true value for the case expression

Stats collection generated: 

  CStCollOuterJoin(ID=9, CARD=1 x_jtLeftOuter) -- the row estimate after the join will still be 1

      CStCollOuterJoin(ID=8, CARD=1000 x_jtLeftOuter) -- here is the row estimate after the previous join

          CStCollBaseTable(ID=1, CARD=1000 TBL: X_HEAP)

          CStCollBaseTable(ID=2, CARD=1 TBL: X_OTHER_TABLE)

      CStCollBaseTable(ID=3, CARD=1 TBL: X_OTHER_TABLE_2)

Now let's try it for a similar query that doesn't have the issue. I'm going to use this one:

SELECT 
  ID
, CASE
    WHEN ID <> 0 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE) 
    ELSE (SELECT -1) 
  END AS ID2
FROM dbo.X_HEAP
OPTION (QUERYTRACEON 3604, QUERYTRACEON 2363, QUERYTRACEON 8606);

Debug output at the very end:

Plan for computation:

  CSelCalcColumnInInterval

      Column: QCOL: [SE_DB].[dbo].[X_HEAP].ID

Pass-through selectivity: 1

Stats collection generated: 

  CStCollOuterJoin(ID=9, CARD=1000 x_jtLeftOuter)

      CStCollOuterJoin(ID=8, CARD=1000 x_jtLeftOuter)

          CStCollBaseTable(ID=1, CARD=1000 TBL: dbo.X_HEAP)

          CStCollBaseTable(ID=2, CARD=1 TBL: dbo.X_OTHER_TABLE)

      CStCollConstTable(ID=4, CARD=1) -- this is different than before because we select a constant instead of from a table

Let's try another query for which the bad row estimate is present:

SELECT 
  ID
, CASE
    WHEN ID < 500 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE) 
    WHEN ID >= 500 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE_2) 
    ELSE (SELECT TOP 1 ID FROM X_OTHER_TABLE) 
  END AS ID2
FROM dbo.X_HEAP
OPTION (QUERYTRACEON 3604, QUERYTRACEON 2363, QUERYTRACEON 8606);

At the very end the cardinality estimate drops to 1 row, again after Pass-through selectivity = 1. The cardinality estimate is preserved after a selectivity of 0.501 and 0.499.

Plan for computation:

 CSelCalcColumnInInterval

      Column: QCOL: [SE_DB].[dbo].[X_HEAP].ID

Pass-through selectivity: 0.501

...

Plan for computation:

  CSelCalcColumnInInterval

      Column: QCOL: [SE_DB].[dbo].[X_HEAP].ID

Pass-through selectivity: 0.499

...

Plan for computation:

  CSelCalcColumnInInterval

      Column: QCOL: [SE_DB].[dbo].[X_HEAP].ID

Pass-through selectivity: 1

Stats collection generated: 

  CStCollOuterJoin(ID=12, CARD=1 x_jtLeftOuter) -- this is associated with the ELSE expression

      CStCollOuterJoin(ID=11, CARD=1000 x_jtLeftOuter)

          CStCollOuterJoin(ID=10, CARD=1000 x_jtLeftOuter)

              CStCollBaseTable(ID=1, CARD=1000 TBL: dbo.X_HEAP)

              CStCollBaseTable(ID=2, CARD=1 TBL: dbo.X_OTHER_TABLE)

          CStCollBaseTable(ID=3, CARD=1 TBL: dbo.X_OTHER_TABLE_2)

      CStCollBaseTable(ID=4, CARD=1 TBL: X_OTHER_TABLE)

Let's again switch to another similiar query that does not have the issue. I'm going to use this one:

SELECT 
  ID
, CASE
    WHEN ID < 500 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE) 
    WHEN ID >= 500 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE_2) 
  END AS ID2
FROM dbo.X_HEAP
OPTION (QUERYTRACEON 3604, QUERYTRACEON 2363, QUERYTRACEON 8606);

In the debug output there is never a step which has a pass-through selectivity of 1. The cardinality estimate stays at 1000 rows.

Plan for computation:

  CSelCalcColumnInInterval

      Column: QCOL: [SE_DB].[dbo].[X_HEAP].ID

Pass-through selectivity: 0.499

Stats collection generated: 

  CStCollOuterJoin(ID=9, CARD=1000 x_jtLeftOuter)

      CStCollOuterJoin(ID=8, CARD=1000 x_jtLeftOuter)

          CStCollBaseTable(ID=1, CARD=1000 TBL: dbo.X_HEAP)

          CStCollBaseTable(ID=2, CARD=1 TBL: dbo.X_OTHER_TABLE)

      CStCollBaseTable(ID=3, CARD=1 TBL: dbo.X_OTHER_TABLE_2)

End selectivity computation

What about the query when it involves a table with a clustered index? Consider the following query with the row estimate issue:

SELECT 
  ID
, CASE
    WHEN ID <> 0 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE) 
    ELSE (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE_2) 
  END
FROM dbo.X_CI
OPTION (QUERYTRACEON 3604, QUERYTRACEON 2363, QUERYTRACEON 8606);

The end of the debug output is similar to what we already have seen:

Plan for computation:

  CSelCalcColumnInInterval

      Column: QCOL: [SE_DB].[dbo].[X_CI].ID

Pass-through selectivity: 1

Stats collection generated: 

  CStCollOuterJoin(ID=9, CARD=1 x_jtLeftOuter)

      CStCollOuterJoin(ID=8, CARD=1000 x_jtLeftOuter)

          CStCollBaseTable(ID=1, CARD=1000 TBL: dbo.X_CI)

          CStCollBaseTable(ID=2, CARD=1 TBL: dbo.X_OTHER_TABLE)

      CStCollBaseTable(ID=3, CARD=1 TBL: dbo.X_OTHER_TABLE_2)

However, the query against the CI without the issue has different output. Using this query:

SELECT 
  ID
, CASE
    WHEN ID = 0 
    THEN (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE_2) 
    ELSE (SELECT TOP 1 ID FROM dbo.X_OTHER_TABLE) 
  END
FROM dbo.X_CI
OPTION (QUERYTRACEON 3604, QUERYTRACEON 2363, QUERYTRACEON 8606);

Results in different calculators being used. CSelCalcColumnInInterval no longer appears:

Plan for computation:

  CSelCalcFixedFilter (0.559)

Pass-through selectivity: 0.559

Stats collection generated: 

  CStCollOuterJoin(ID=8, CARD=1000 x_jtLeftOuter)

      CStCollBaseTable(ID=1, CARD=1000 TBL: dbo.X_CI)

      CStCollBaseTable(ID=2, CARD=1 TBL: dbo.X_OTHER_TABLE_2)

...

Plan for computation:

  CSelCalcUniqueKeyFilter

Pass-through selectivity: 0.001

Stats collection generated: 

  CStCollOuterJoin(ID=9, CARD=1000 x_jtLeftOuter)

      CStCollOuterJoin(ID=8, CARD=1000 x_jtLeftOuter)

          CStCollBaseTable(ID=1, CARD=1000 TBL: dbo.X_CI)

          CStCollBaseTable(ID=2, CARD=1 TBL: dbo.X_OTHER_TABLE_2)

      CStCollBaseTable(ID=3, CARD=1 TBL: dbo.X_OTHER_TABLE)

In conclusion, we appear to get a bad row estimate after the subquery under the following conditions:

The CSelCalcColumnInInterval selectivity calculator is used. I don't know exactly when this is used but it seems to show up much more often when the base table is a heap.
Pass-through selectivity = 1. In other words, one of the CASE expressions is expected to evaluated to false for all of the rows. It does not matter if the first CASE expression evaluates to true for all rows.
There is an outer join to CStCollBaseTable. In other words, the CASE result expression is a subquery against a table. A constant value will not work.

Perhaps under those conditions the query optimizer is unintentionally applying the pass-through selectivity to the row estimate of the outer table instead of to the work done on the inner part of the nested loop. That would reduce the row estimate to 1.

I was able to find two workarounds. I was not able to reproduce the issue when using APPLY instead of a subquery. The output of trace flag 2363 was very different with APPLY. Here's one way to rewrite the original query in the question:

SELECT 
  h.ID
, a.ID2
FROM X_HEAP h
OUTER APPLY
(
SELECT CASE
    WHEN ID <> 0 
    THEN (SELECT TOP 1 ID FROM X_OTHER_TABLE) 
    ELSE (SELECT TOP 1 ID FROM X_OTHER_TABLE_2) 
  END
) a(ID2);

good query 1

The legacy CE appears to avoid the issue as well.

SELECT 
  ID
, CASE
    WHEN ID <> 0 
    THEN (SELECT TOP 1 ID FROM X_OTHER_TABLE) 
    ELSE (SELECT TOP 1 ID FROM X_OTHER_TABLE_2) 
  END AS ID2
FROM X_HEAP
OPTION (USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

good query 2

A connect item was submitted for this issue (with some of the details that Paul White provided in his answer).

This cardinality estimation (CE) issue surfaces when:

The join is an outer join with a pass-through predicate
The selectivity of the pass-through predicate is estimated to be exactly 1.

Note: The particular calculator used to determine the selectivity is not important.

Details

The CE computes the selectivity of the outer join as the sum of:

The inner join selectivity with the same predicate
The anti join selectivity with the same predicate

The only difference between an outer and inner join is that an outer join also returns rows that do not match on the join predicate. The anti join provides exactly this difference. Cardinality estimation for inner and anti join is easier than for outer join directly.

The join selectivity estimation process is very straightforward:

First, the selectivity S_PT of the pass-through predicate is assessed.
- This is done using whichever calculator is appropriate to the circumstances.
- The predicate is the whole thing, including any negating IsFalseOrNull component.
Inner join selectivity := 1 - S_PT
Anti join selectivity := S_PT

The anti join represents rows that will 'pass through' the join. The inner join represents rows that will not 'pass through'. Note that 'pass through' means rows that flow through the join without running the inner side at all. To emphasise: all rows will be returned by the join, the distinction is between rows that run the inner side of the join before emerging, and those that do not.

Clearly, adding 1 - S_PT to S_PT should always give a total selectivity of 1, meaning all rows are returned by the join, as expected.

Indeed, the above calculation works exactly as described for all values of S_PT except 1.

When S_PT = 1, both inner join and anti join selectivities are estimated to be zero, resulting in a cardinality estimate (for the join as a whole) of one row. As far as I can tell, this is unintentional, and should be reported as a bug.

A related issue

This bug is more likely to manifest than one might think, due to a separate CE limitation. This arises when the CASE expression uses an EXISTS clause (as is common). For example the following modified query from the question does not encounter the unexpected cardinality estimate:

-- This is fine
SELECT 
    CASE
        WHEN XH.ID = 1
        THEN (SELECT TOP (1) XOT.ID FROM dbo.X_OTHER_TABLE AS XOT) 
    END
FROM dbo.X_HEAP AS XH;

Introducing a trivial EXISTS does cause the issue to surface:

-- This is not fine
SELECT 
    CASE
        WHEN EXISTS (SELECT 1 WHERE XH.ID = 1)
        THEN (SELECT TOP (1) XOT.ID FROM dbo.X_OTHER_TABLE AS XOT) 
    END
FROM dbo.X_HEAP AS XH;

Using EXISTS introduces a semi join (highlighted) to the execution plan:

Semi join plan

The estimate for the semi join is fine. The problem is that the CE treats the associated probe column as a simple projection, with a fixed selectivity of 1:

Semijoin with probe column treated as a Project.

Selectivity of probe column = 1

This automatically meets one of the conditions required for this CE issue to manifest, regardless of the contents of the EXISTS clause.

For important background information, see Subqueries in CASE Expressions by Craig Freedman.

Why does a subquery reduce the row estimate to 1?

Details

A related issue

Tags:

Sql Server

Sql Server 2016

Cardinality Estimates

Related

Recent Posts