Filter condition not correctly applied to Clustered Columnstore index

This is a bug with SQL Server. If a column is deleted from a table with a clustered columnstore index, and then a new column is added with the same name, it appears to be using the old, deleted column for the predicate. Here is the MVCE:

This script starts off with 10000 rows with statusId of 1 and statusId2 of 5 - then drops the statusID column and renames statusId2 to statusId. So at the end all rows should have a statusId of 5.

But the following query hits the non clustered index...

select *
from example
where statusId = 1
    and total <= @filter
    and barcode = @barcode
    and id2 = @id2

... and returns 2 rows (with the selected statusId different from implied by the WHERE clause)...

+-------+---------+------+-------+----------+
|  id   | barcode | id2  | total | statusId |
+-------+---------+------+-------+----------+
|     5 |    5    | NULL |  5.00 |        5 |
| 10005 |    5    | NULL |  5.00 |        5 |
+-------+---------+------+-------+----------+

... whereas this one accesses the columnstore and correctly returns 0

select count(*) 
from example 
where statusId = 1

MVCE

/*Create table with clustered columnstore and non clustered rowstore*/
CREATE TABLE example
(
id        INT IDENTITY(1, 1),
barcode   CHAR(22),
id2       INT,
total     DECIMAL(10,2),
statusId  TINYINT,
statusId2 TINYINT,
INDEX cci_example CLUSTERED COLUMNSTORE,
INDEX ix_example (barcode, total)
);

/* Insert 10000 rows all with (statusId,statusId2) = (1,5) */
INSERT example
       (barcode,
        id2,
        total,
        statusId,
        statusId2)
SELECT TOP (10000) barcode = row_number() OVER (ORDER BY @@spid),
                   id2 = NULL,
                   total = row_number() OVER (ORDER BY @@spid),
                   statusId = 1,
                   statusId2 = 5
FROM   sys.all_columns c1, sys.all_columns c2;

ALTER TABLE example
  DROP COLUMN statusid
/* Now have 10000 rows with statusId2 = 5 */


EXEC sys.sp_rename
  @objname = N'dbo.example.statusId2',
  @newname = 'statusId',
  @objtype = 'COLUMN';
/* Now have 10000 rows with StatusID = 5 */

INSERT example
       (barcode,
        id2,
        total,
        statusId)
SELECT TOP (10000) barcode = row_number() OVER (ORDER BY @@spid),
                   id2 = NULL,
                   total = row_number() OVER (ORDER BY @@spid),
                   statusId = 5
FROM   sys.all_columns c1, sys.all_columns c2;
/* Now have 20000 rows with StatusID = 5 */


DECLARE @filter  DECIMAL = 5,
        @barcode CHAR(22) = '5',
        @id2     INT = NULL; 

/*This returns 2 rows from the NCI*/
SELECT *
FROM   example WITH (INDEX = ix_example)
WHERE  statusId = 1
       AND total <= @filter
       AND barcode = @barcode
       AND id2 = @id2;

/*This counts 0 rows from the Columnstore*/
SELECT COUNT(*)
FROM   example
WHERE  statusId = 1;

I have also raised an issue on Azure feedback portal:

And for anyone else who encounters this, rebuilding the Clustered Columnstore Index fixes the problem:

alter index cci_example on example rebuild

Rebuilding the CCI only fixes any existing data. If new records are added, the issue arises again on these records; so currently the only known fix for the table is to recreate it entirely.


This bug does not require dropping or renaming columns.

You will also see the same behaviour for statusId = 100 which was never present in any version of the column.

Requirements

  • A clustered columnstore
  • Nonclustered b-tree index
  • A plan that performs a lookup on the columnstore with
    • Target row(s) in delta store
    • A pushed non-SARG predicate
    • A comparison with NULL using an equality test

Example

DROP TABLE IF EXISTS dbo.Example;
GO
CREATE TABLE dbo.Example
(
    c1 integer NOT NULL,
    c2 integer NULL,

    INDEX CCS CLUSTERED COLUMNSTORE,
    INDEX IX NONCLUSTERED (c1)
);
GO
INSERT dbo.Example
    (c1, c2)
VALUES
    (1, NULL);
GO
DECLARE @c2 integer = NULL;

-- Returns one row but should not
SELECT
    E.* 
FROM dbo.Example AS E 
    WITH (INDEX(IX))
WHERE
    E.c2 = @c2;

Any of the following will avoid the bug:

  • Moving rows out of the delta store using any method including reorganizing with the compress rowgroups option specified
  • Writing the predicate to explicitly reject = NULL
  • Enabling undocumented trace flag 9130 to avoid pushing the predicate into the lookup

db<>fiddle demo.


This bug was fixed in CU15 for SQL Server 2017 (and CU7 for SQL Server 2016 SP2):

FIX: Query against table with both clustered columnstore index and nonclustered rowstore index may return incorrect results in SQL Server 2016 and 2017