MySQL query / clause execution order

In standard SQL, the general order of the logical query processing phases appears to be the following (at least as of SQL-92, starting on p. 177):

  • from clause
  • joined table
  • where clause
  • group by clause
  • having clause
  • query specification (i.e. SELECT)

You can find and download newer versions of the SQL standard documents here:

  • https://wiki.postgresql.org/wiki/Developer_FAQ#Where_can_I_get_a_copy_of_the_SQL_standards.3F

For MSSQL (which, in my experience, tends to stay fairly close to the standard), the logical query processing phases are generally:

  • FROM <left_table>
  • ON <join_condition>
  • <join_type> JOIN <right_table>
  • WHERE <where_condition>
  • GROUP BY <group_by_list>
  • WITH {CUBE | ROLLUP}
  • HAVING <having_condition>
  • SELECT
  • DISTINCT
  • ORDER BY <order_by_list>
  • <TOP_specification> <select_list>
    • From: Chapter 1 of Ben-Gan, Itzik, et al., Inside Microsoft SQL Server 2005: T-SQL Querying (Microsoft Press)
    • Also: Ben-Gan, Itzik, et al., Training Kit (Exam 70-461): Querying Microsoft SQL Server 2012 (MCSA) (Microsoft Press)

Note that MySQL can also be configured to operate closer to the standard, if desired, by setting the SQL mode (although this is probably only advisable in fringe cases):

  • https://dev.mysql.com/doc/refman/8.0/en/sql-mode.html
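
As a minimal sketch of what that looks like in practice, the statements below inspect and change the SQL mode for the current session only. ANSI is one of MySQL's documented combination modes; whether it is the right choice depends on the application, so treat the specific mode as an assumption for illustration.

```sql
-- Inspect the SQL mode currently in effect for this session.
SELECT @@SESSION.sql_mode;

-- Switch this session to the ANSI combination mode, which makes MySQL's
-- syntax and behavior conform more closely to standard SQL.
SET SESSION sql_mode = 'ANSI';

-- Restore the server-wide setting for this session when finished.
SET SESSION sql_mode = @@GLOBAL.sql_mode;
```

Using SET SESSION rather than SET GLOBAL keeps the change from affecting other connections.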

For MySQL, I searched both the MySQL and MariaDB documentation and could find nothing other than the few statements in the MySQL documentation for SELECT that Gordon Linoff mentioned in passing. They are:

  • If ORDER BY occurs within a parenthesized query expression and also is applied in the outer query, the results are undefined and may change in a future version of MySQL.
  • If LIMIT occurs within a parenthesized query expression and also is applied in the outer query, the results are undefined and may change in a future version of MySQL.
  • The HAVING clause is applied nearly last, just before items are sent to the client, with no optimization. (LIMIT is applied after HAVING.)
  • From the MySQL JOIN documentation: Natural joins and joins with USING, including outer join variants, are processed according to the SQL:2003 standard.

A quick skim of SQL-92, from "from clause" to "query specification", showed that the logic can be conditional depending on how the query is written. I could not find anything definitive in the MySQL or MariaDB documentation (not saying it is not there, I just could not find it), and other articles on MySQL's logical query processing phases conflicted with each other on the order. So it seems that the best way MySQL offers to determine the logical query processing order for a specific query (or at least the steps used for join optimization in the query plan) is to trace the query's execution as follows (from the MySQL documentation, "Tracing the Optimizer / Typical Usage"):

# Turn tracing on (it's off by default):
SET optimizer_trace="enabled=on";
SELECT ...; # your query here
SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;
# possibly more queries...
# When done with tracing, disable it:
SET optimizer_trace="enabled=off";

You can interpret the results by looking at MySQL documentation's Tracing Example.

Basically, it appears that you want to look for "join_optimization" (which indicates that the FROM/JOIN clauses are being evaluated for the specific query, identified by its "Select#"), then look for "condition_processing: condition", and then "clause_processing: clause". As the MySQL documentation for General Trace Structure says:

A trace follows closely the actual execution path: there is a join-preparation object, a join-optimization object, a join-execution object, for each JOIN... It is far from showing everything happening in the optimizer, but we plan to show more information in the future.
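
Since a trace can be long, it may help to select only the columns of interest and to check whether the trace was cut off. The sketch below assumes tracing is already enabled and the query of interest has just been run in the same session; the memory limit shown is an arbitrary example value, not a recommendation.

```sql
-- Optionally raise the per-trace memory cap BEFORE running the traced query
-- (1048576 bytes is an arbitrary example value).
SET optimizer_trace_max_mem_size = 1048576;

-- After running the traced query, read back the statement text, its trace,
-- and the number of bytes (if any) that were dropped because of the cap.
SELECT QUERY, TRACE, MISSING_BYTES_BEYOND_MAX_MEM_SIZE
FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;
```

A non-zero MISSING_BYTES_BEYOND_MAX_MEM_SIZE means the trace was truncated, so steps such as "join_optimization" may be incomplete or absent from the output.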

Interestingly enough, I found that for a query like the following, MySQL's apparent processing order during query optimization was FROM, WHERE, HAVING, ORDER BY, and then GROUP BY:

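# `database`, `table`, and `table2` are placeholder names; the backticks are
# needed because DATABASE and TABLE are reserved words in MySQL.
SELECT
    a.id
    , MAX(a.timestamp)
FROM `database`.`table` AS a
LEFT JOIN `database`.`table2` AS b ON a.id = b.id
WHERE a.id > 1
GROUP BY a.id
HAVING MAX(a.timestamp) > 0
ORDER BY a.id;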

I am assuming that, since "condition_processing" and "clause_processing" sit within the "Select#" group, they are processed before SELECT, which lines up with SQL-99. That is only an assumption, though.

In terms of operators and variables, Zawodny, Jeremy D., et al., High Performance MySQL, 2nd Edition (O'Reilly) states that:

The := assignment operator has lower precedence than any other operator, so you have to be careful to parenthesize explicitly.

I only mention this because, when a query that works with one or more user variables does not behave as expected, the cause may be the precedence of the assignment operator rather than the order of the logical query processing phases.
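
A small sketch of the precedence issue, using made-up variable names @x and @y. Note that, as far as I know, assigning user variables inside expressions is deprecated in MySQL 8.0, so this is for troubleshooting existing queries rather than a pattern to adopt.

```sql
-- := binds more loosely than =, so this is parsed as @x := (5 = 5)
-- and @x ends up holding 1 (true), not 5.
SELECT @x := 5 = 5;
SELECT @x;   -- 1

-- With explicit parentheses, @y receives 5 and the comparison
-- is applied to the result of the assignment.
SELECT (@y := 5) = 5;
SELECT @y;   -- 5
```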


The actual execution of MySQL statements is a bit tricky. However, the standard does specify the order of interpretation of elements in the query. This is basically in the order that you specify, although I think HAVING and GROUP BY could come after SELECT:

  • FROM clause
  • WHERE clause
  • SELECT clause
  • GROUP BY clause
  • HAVING clause
  • ORDER BY clause

This is important for understanding how queries are parsed. You cannot use a column alias defined in a SELECT in the WHERE clause, for instance, because the WHERE is parsed before the SELECT. On the other hand, such an alias can be in the ORDER BY clause.
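
To make the alias rule concrete, here is a short sketch against a hypothetical `orders` table (the table, its columns, and its rows are invented purely for illustration):

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE orders (id INT PRIMARY KEY, price DECIMAL(8,2), qty INT);
INSERT INTO orders VALUES (1, 10.00, 3), (2, 60.00, 2);

-- Fails with "Unknown column 'total' in 'where clause'":
-- the alias does not exist yet when WHERE is resolved.
SELECT price * qty AS total FROM orders WHERE total > 100;

-- Works: repeat the expression in WHERE, and use the alias in ORDER BY.
SELECT price * qty AS total
FROM orders
WHERE price * qty > 100
ORDER BY total;
```

MySQL also accepts such an alias in HAVING, but that is an extension to standard SQL rather than something the standard's ordering implies.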

As for actual execution, that is really left up to the optimizer. For instance:

. . .
GROUP BY a, b, c
ORDER BY NULL

and

. . .
GROUP BY a, b, c
ORDER BY a, b, c

In both cases the ORDER BY is effectively not executed at all, and so it is not executed after the GROUP BY. In the first case, the effect is to remove sorting from the GROUP BY; in the second, the ORDER BY does nothing beyond what the GROUP BY already does.


Here is a way to get a rough idea of how MySQL executes a SELECT query:

DROP TABLE if exists new_table;

CREATE TABLE `new_table` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`testdecimal` decimal(6,2) DEFAULT NULL,
PRIMARY KEY (`id`));

INSERT INTO `new_table` (`testdecimal`) VALUES ('1234.45');
INSERT INTO `new_table` (`testdecimal`) VALUES ('1234.45');

set @mysqlorder := '';

select @mysqlorder := CONCAT(@mysqlorder," SELECT ") from new_table,(select @mysqlorder := CONCAT(@mysqlorder," FROM ")) tt
JOIN (select @mysqlorder := CONCAT(@mysqlorder," JOIN1 ")) t on ((select @mysqlorder := CONCAT(@mysqlorder," ON1 ")) or rand() < 1)
JOIN (select @mysqlorder := CONCAT(@mysqlorder," JOIN2 ")) t2 on ((select @mysqlorder := CONCAT(@mysqlorder," ON2 ")) or rand() < 1)
where ((select @mysqlorder := CONCAT(@mysqlorder," WHERE ")) or IF(new_table.testdecimal = 1234.45,true,false))
group by (select @mysqlorder := CONCAT(@mysqlorder," GROUPBY ")),id
having (select @mysqlorder := CONCAT(@mysqlorder," HAVING "))
order by (select @mysqlorder := CONCAT(@mysqlorder," ORDERBY "));

select @mysqlorder;

Here is the output of the query above, from which you can work out the order in which MySQL evaluated each part of the SELECT:

FROM JOIN1 JOIN2 WHERE ON2 ON1 ORDERBY GROUPBY SELECT WHERE ON2 ON1 ORDERBY GROUPBY SELECT HAVING HAVING