INDEX on DATE field and MONTH() YEAR() functions

The problem is not the representation of the date. The problem is the optimization of the query. When you use YEAR() and MONTH() on a date column, the column is a function of an argument.

That means that the compiler would have a lot of information about the function in order to optimize it using an index scan or index lookup. Such as:

  • The function would have to return the same values given the same arguments.
  • The function would take only one argument.
  • The function would need to be monotonic.
  • The function would need to have an inverse (for starting an index scan).

Of course, these things are possible. The challenge is actually building them into a function definition and into the optimizer. No one who builds SQL optimizers would want to add special cases for a specific function (well, almost no one; SQL Server does allow some use of cast() to take advantage of an index).

So, the answer to your question is that Bill's answer is still valid.


Correct. MySQL optimizer won't use an index once a column in the WHERE clause is wrapped with a function. The simple reasoning is that once you have an index on column my_date, the database sees no point to search in it when you're actually looking for a different value, for example the output of YEAR(my_date).

The good news though are that there is a quick and simple solution in some cases.

For example, the following query can be optimized using a different approach:

SELECT flight_company, count(*)
FROM flight_times
WHERE year(FlightDate) = 2017
group by flight_company

Instead of using the YEAR() function in the WHERE clause, you can just replace the function call with a range condition that will check the same thing:

SELECT flight_company, count(*) 
FROM flight_times
WHERE FlightDate between '2017-01-01'
and '2017-12-31'
GROUP BY flight_company

There are functions though that can't simply replaced by a range condition such as in YEAR's case. For example, how would you replace dayofweek() with a condition? Probably harder. Therefore, another approach is to use MySQL 5.7 Virtual (generated) columns. If you're taking this approach, you can just create this virtual column in your CREATE TABLE statement, to actually match the result of dayofweek():

Flight_dayofweek tinyint(4)
 GENERATED ALWAYS AS (dayofweek(FlightDate
 VIRTUAL

As the co-founder of EverSQL, I will humbly suggest you to use EverSQL SQL Query Optimizer to automatically optimize such queries.

Tags:

Mysql