Covering index used despite missing column

Case A
Query:

WHERE thread_id = 12345 
  AND placeholder = FALSE
ORDER BY some_column DESC 
LIMIT 20

Index:

(thread_id, date_created)

Plan:

Index is used
Using Where
Using filesort

No problem there, right? The index is used, but it matches only the first WHERE condition (thread_id = 12345). We still need a sort operation to order the results by some_column, which is not in the index, hence Using filesort. We also need an extra check (Using Where) to keep only the rows that match the 2nd condition. OK.
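To reproduce Case A, something like the following sketch could be used. The question shows only the WHERE/ORDER BY fragment, so the table name posts, the columns id and body, the column types, and the index name are all assumptions made for illustration.

```sql
-- Hypothetical schema: posts, id, body, the types and idx_thread_date are
-- assumptions. thread_id, placeholder, date_created and some_column come
-- from the question.
CREATE TABLE posts (
  id           BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  thread_id    INT UNSIGNED NOT NULL,
  placeholder  BOOLEAN NOT NULL DEFAULT FALSE,
  date_created DATETIME NOT NULL,
  some_column  INT NOT NULL,
  body         TEXT,
  INDEX idx_thread_date (thread_id, date_created)
);

-- Case A: the ORDER BY column is not part of the index.
EXPLAIN
SELECT id, body
FROM posts
WHERE thread_id = 12345
  AND placeholder = FALSE
ORDER BY some_column DESC
LIMIT 20;
-- On a populated table, the Extra column should resemble the plan above:
-- Using where; Using filesort.
```

On an empty or very small table the optimizer may pick a different plan, so it is worth loading a realistic amount of data before reading too much into the output.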


Case B (the question)
Query:

WHERE thread_id = 12345 
  AND placeholder = FALSE
ORDER BY date_created DESC 
LIMIT 20

Index:

(thread_id, date_created)

Plan:

Index is used
Using Where
-- no "Using filesort"

So, why does it not need a sort here? Because within a single thread_id, the index entries are already ordered by date_created, which is exactly the order the query wants. There is of course still the extra condition (AND placeholder = FALSE), which is not covered by the index.

But that extra condition does not force a sort. The index provides the rows that match the first condition (WHERE thread_id = 12345), already in the wanted output order. The only additional work needed, and what the plan does, is to fetch the rows from the table in the order provided by the index and check the 2nd condition until we get 20 matches. That's what the "Using Where" means.

We may get the 20 matches in the first 20 rows (so really good and fast) or in the first 100 (still likely fast enough) or in the first 1000000 (probably very, very slow) or we may get just 19 matches from the table even after reading all the matching rows from the index (really very slow on a big table). It all depends on the distribution of data.
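A rough way to see which of these situations applies to a given thread is to compare the number of rows in the thread with the number that pass the second condition. This is only a sketch: the table name posts is an assumption, since the question does not show the FROM clause.

```sql
-- posts is a hypothetical table name; the columns come from the question.
-- In MySQL a comparison evaluates to 1 or 0, so SUM() counts the matches.
SELECT COUNT(*)                 AS rows_in_thread,
       SUM(placeholder = FALSE) AS rows_matching
FROM posts
WHERE thread_id = 12345;
```

If rows_matching is close to rows_in_thread, the plan from Case B will probably find its 20 rows quickly. If it is a small fraction, or below 20, the scan may have to walk most of the thread. Note that this shows only the overall ratio, not where the matching rows sit in date_created order.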


Case C (even better plan)
Query:

WHERE thread_id = 12345 
  AND placeholder = FALSE
ORDER BY date_created DESC 
LIMIT 20

Index:

(placeholder, thread_id, date_created)

Plan:

Index is used
-- no "Using Where"
-- no "Using filesort"

Now our index matches both conditions and the order by. The plan is pretty simple: get the first* 20 matches from the index and read the corresponding rows from the table. No extra check (No "Using Where") and no sort (no "Using filesort") needed.

first*: the first 20 when reading the index backwards from the end (as we have ORDER BY .. DESC) but that's not a problem. B-tree indexes can be read forwards and backwards with almost equal performance.
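As a sketch of how Case C could be set up and checked, assuming a table named posts with columns id and body (none of these names appear in the question; the index name is made up too):

```sql
-- posts, id, body and the index name are hypothetical.
ALTER TABLE posts
  ADD INDEX idx_placeholder_thread_date (placeholder, thread_id, date_created);

EXPLAIN
SELECT id, body
FROM posts
WHERE thread_id = 12345
  AND placeholder = FALSE
ORDER BY date_created DESC
LIMIT 20;
-- The key column should show the new index and Extra should no longer
-- contain "Using filesort". Newer MySQL versions may additionally report
-- "Backward index scan" because of the DESC order.
```

The exact wording of the Extra column varies between MySQL versions, so treat the comments as what to look for rather than as literal expected output.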


  • Using index indicates a "covering index" -- every column mentioned anywhere in the SELECT is present in that one index. Your plan does not show it, so you do not have a "covering" index. And it is not practical to make a covering index for your query (too many columns mentioned).
  • Using where -- mostly noise.
  • Using filesort -- The query needs a sort, but it might be in RAM or in a temp table. And there may be multiple sorts (e.g., GROUP BY x ORDER BY b).
  • Either of these will make it possible to look only at 20 rows; any other index will require more rows be touched, possibly the entire table:

    INDEX(thread_id, placeholder, date_created)
    INDEX(placeholder, thread_id, date_created)
    
  • No, the cardinality of the components of a composite index does not matter when ordering the columns in the index.
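One way to check the "only 20 rows" claim on your own data is to look at the session's Handler counters before and after running the query. This is a sketch with an assumed table name (posts) and assumed column list (id, body), since the question shows neither.

```sql
-- posts, id and body are hypothetical names.
FLUSH STATUS;  -- resets the session counters; may require extra privileges

SELECT id, body
FROM posts
WHERE thread_id = 12345
  AND placeholder = FALSE
ORDER BY date_created DESC
LIMIT 20;

SHOW SESSION STATUS LIKE 'Handler_read%';
```

With one of the two indexes above in place, the Handler_read counters should add up to roughly 20 (for a DESC scan, mostly in Handler_read_prev). Numbers in the thousands or more suggest the query is touching far more rows than it returns.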

My Cookbook explains how to derive the optimal index, given a SELECT.