How to measure or find the cost of creating a query plan?

How can I find out what the cost of creating a query plan is?

You can look at the properties of the root node in the query plan, for example:

[Screenshot: root node properties extract, from the free Sentry One Plan Explorer]

This information is also available by querying the plan cache, for example using a query based on the following relationships:

WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT 
    CompileTime = c.value('(QueryPlan/@CompileTime)[1]', 'int'),     -- milliseconds
    CompileCPU = c.value('(QueryPlan/@CompileCPU)[1]', 'int'),       -- milliseconds
    CompileMemory = c.value('(QueryPlan/@CompileMemory)[1]', 'int'), -- KB
    ST.[text],
    QP.query_plan
FROM sys.dm_exec_cached_plans AS CP
CROSS APPLY sys.dm_exec_query_plan(CP.plan_handle) AS QP
CROSS APPLY sys.dm_exec_sql_text(CP.plan_handle) AS ST
CROSS APPLY QP.query_plan.nodes('ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS N(c);

[Screenshot: results fragment]

For a full treatment of the options you have for handling these sorts of queries, see Erland Sommarskog's recently updated article.


Assuming "cost" is in terms of time (though not sure what else it could be in terms of ;-), then at the very least you should be able to get a sense of it by doing something like the following:

DBCC FREEPROCCACHE WITH NO_INFOMSGS; -- clears the ENTIRE plan cache; dev / test only

SET STATISTICS TIME ON;

EXEC sp_help 'sys.databases'; -- replace with your proc

SET STATISTICS TIME OFF;

The first item reported in the "Messages" tab should be:

SQL Server parse and compile time:

I would run this at least 10 times and average both the "CPU" and "Elapsed" milliseconds.
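
To automate that 10-run averaging, here is a rough sketch (dev / test only, since it clears the entire plan cache each pass): it runs the proc repeatedly and reads the CompileTime / CompileCPU attributes back out of the freshly cached plan, using sp_help as the placeholder proc again:

SET NOCOUNT ON;
DECLARE @i INT = 1;

WHILE @i <= 10
BEGIN
    DBCC FREEPROCCACHE WITH NO_INFOMSGS; -- dev / test only!

    EXEC sp_help 'sys.databases'; -- replace with your proc

    -- Sum the per-statement compile metrics from the fresh plan:
    ;WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
    SELECT Run = @i,
           CompileTimeMS = SUM(c.value('(QueryPlan/@CompileTime)[1]', 'int')),
           CompileCPUMS  = SUM(c.value('(QueryPlan/@CompileCPU)[1]', 'int'))
    FROM sys.dm_exec_cached_plans AS CP
    CROSS APPLY sys.dm_exec_sql_text(CP.plan_handle) AS ST
    CROSS APPLY sys.dm_exec_query_plan(CP.plan_handle) AS QP
    CROSS APPLY QP.query_plan.nodes(
        'ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS N(c)
    WHERE CP.objtype = N'Proc'
    AND   ST.[text] LIKE N'%sp[_]help%'; -- replace with your proc name

    SET @i += 1;
END;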

Ideally you would run this in Production so that you can get a true time estimate, but people are rarely allowed to clear the plan cache in Production. Fortunately, starting in SQL Server 2008 it became possible to clear a specific plan from the cache, in which case you can do the following:

DECLARE @SQL NVARCHAR(MAX) = '';
;WITH cte AS
(
  SELECT DISTINCT stat.plan_handle
  FROM sys.dm_exec_query_stats stat
  CROSS APPLY sys.dm_exec_text_query_plan(stat.plan_handle, 0, -1) qplan
  WHERE qplan.query_plan LIKE N'%sp[_]help%' -- your proc name; [_] escapes the LIKE wildcard and keeps this batch from matching itself
)
SELECT @SQL += N'DBCC FREEPROCCACHE ('
               + CONVERT(NVARCHAR(130), cte.plan_handle, 1)
               + N');'
               + NCHAR(13) + NCHAR(10)
FROM cte;
PRINT @SQL;
EXEC (@SQL);

SET STATISTICS TIME ON;

EXEC sp_help 'sys.databases'; -- replace with your proc

SET STATISTICS TIME OFF;

However, depending on the variability of the values being passed in for the parameter(s) causing the "bad" cached plan, there is another method to consider that is a middle ground between OPTION(RECOMPILE) and OPTION(OPTIMIZE FOR UNKNOWN): Dynamic SQL. Yes, I said it. And I even mean non-parameterized Dynamic SQL. Here is why.

You clearly have data that has an uneven distribution, at least in terms of one or more input parameter values. The downsides of the mentioned options are:

  • OPTION(RECOMPILE) will generate a plan for every execution and you will never be able to benefit from any plan re-use, even if the parameter values passed in are identical to the prior run(s). For procs that are called frequently -- once every few seconds or more -- this will save you from the occasional horrible situation, but leave you in a perpetually not-all-that-great one.

  • OPTION(OPTIMIZE FOR (@Param = value)) will generate a plan based on that particular value, which could help several cases but still leave you open to the current issue.

  • OPTION(OPTIMIZE FOR UNKNOWN) will generate a plan based on what amounts to an average distribution, which will help some queries but hurt others. This should be the same as the option of using local variables.
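
That last equivalence, for reference, looks like this (a minimal sketch with hypothetical proc, table, and column names):

-- Both forms compile against the average density of StatusID
-- rather than sniffing the runtime parameter value.
CREATE PROCEDURE dbo.GetOrders_Unknown (@StatusID INT)
AS
SELECT o.OrderID
FROM   dbo.Orders o
WHERE  o.StatusID = @StatusID
OPTION (OPTIMIZE FOR UNKNOWN);
GO

CREATE PROCEDURE dbo.GetOrders_LocalVar (@StatusID INT)
AS
DECLARE @Status INT = @StatusID; -- the optimizer can't sniff a local variable

SELECT o.OrderID
FROM   dbo.Orders o
WHERE  o.StatusID = @Status;
GO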

Dynamic SQL, however, when done correctly, will allow for the various values being passed in to have their own separate query plans that are ideal (well, as much as they are gonna be). The main cost here is that as the variety of values being passed in increases, the number of execution plans in the cache increases, and they do take up memory (there is a quick way to monitor this after the list below). The minor costs are:

  • needing to validate string parameters to prevent SQL Injection

  • possibly needing to set up a Certificate and Certificate-based User to maintain ideal security abstraction, since Dynamic SQL requires direct table permissions (a sketch of this appears after the example proc below).
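
As for the main cost mentioned above, here is a quick way to watch how many per-literal plans (and how much memory) a particular dynamic query accumulates; the LIKE pattern is just a placeholder for a distinctive fragment of your dynamic query text:

SELECT  PlanCount   = COUNT(*),
        TotalSizeMB = SUM(CONVERT(BIGINT, CP.size_in_bytes)) / 1048576.0
FROM    sys.dm_exec_cached_plans CP
CROSS APPLY sys.dm_exec_sql_text(CP.plan_handle) ST
WHERE   ST.[text] LIKE N'%MySchema.SomeTable%'; -- placeholder pattern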

So, here is how I have managed this situation when I had procs that were called more than once per second and hit multiple tables, each with millions of rows. I had tried OPTION(RECOMPILE), but that proved far too detrimental to the process in the 99% of cases that did not have the parameter sniffing / bad cached plan problem. And please keep in mind that one of these procs had about 15 queries in it and only 3 to 5 of them were converted to Dynamic SQL as described here; Dynamic SQL was not used unless it was necessary for a particular query.

  1. If there are multiple input parameters to the stored procedure, figure out which ones are used with columns that have highly disparate data distributions (and hence cause this problem) and which ones are used with columns that have more even distributions (and shouldn't be causing this problem).

  2. Build the Dynamic SQL string using parameters for the proc input params that are associated with evenly distributed columns. Parameterizing these helps limit the growth of cached execution plans for this query.

  3. For the remaining parameters that are associated with highly varied distributions, those should be concatenated into the Dynamic SQL as literal values. Since a unique query is determined by any changes to the query text, having WHERE StatusID = 1 is a different query, and hence a different query plan, than having WHERE StatusID = 2.

  4. If any of the proc input parameters that are to be concatenated into the text of the query are strings, then they need to be validated to protect against SQL Injection (though this is less likely to happen if the strings being passed in are generated by the app and not a user, but still). At the very least, do REPLACE(@Param, '''', '''''') to ensure that single-quotes become escaped single-quotes; a slightly more defensive sketch follows this list.

  5. If need be, create a Certificate that will be used to create a User and sign the stored procedure such that direct table permissions will be granted only to the new Certificate-based User and not to [public] or to Users that shouldn't otherwise have such permissions.
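
For step 4, here is a slightly more defensive sketch than a bare REPLACE (it uses the parameter names from the example proc below; the allowed-character pattern is purely hypothetical, and note that QUOTENAME only handles input up to 128 characters):

-- Reject anything outside an expected whitelist of characters:
IF @Param3 LIKE N'%[^a-zA-Z0-9 .-]%'
BEGIN
    RAISERROR(N'Invalid characters in @Param3.', 16, 1);
    RETURN;
END;

-- QUOTENAME wraps the value in single-quotes and doubles any embedded ones:
SET @SQL += N' AND tab.Field2 LIKE N' + QUOTENAME(@Param3 + N'%', N'''') + N';';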

Example proc:

CREATE PROCEDURE MySchema.MyProc
(
  @Param1 INT,
  @Param2 DATETIME,
  @Param3 NVARCHAR(50)
)
AS
SET NOCOUNT ON;

DECLARE @SQL NVARCHAR(MAX);

SET @SQL = N'
     SELECT  tab.Field1, tab.Field2, ...
     FROM    MySchema.SomeTable tab
     WHERE   tab.Field3 = @P1
     AND     tab.Field8 >= CONVERT(DATETIME, ''' +
  CONVERT(NVARCHAR(50), @Param2, 121) + -- skewed value inlined as a literal (step 3)
  N''')
     AND     tab.Field2 LIKE N''' +
  REPLACE(@Param3, N'''', N'''''') + -- escape embedded single-quotes (step 4)
  N'%'';';

EXEC sp_executesql
     @SQL,
     N'@P1 INT', -- evenly distributed param stays parameterized (step 2)
     @P1 = @Param1;
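
And, tying back to step 5 and the second minor cost above, here is a minimal sketch of the certificate-signing piece (all of the names and the password are placeholders):

CREATE CERTIFICATE [MyProcCert]
    ENCRYPTION BY PASSWORD = N'SomeStrongPassword!1'
    WITH SUBJECT = N'Permissions for MySchema.MyProc Dynamic SQL';

-- Certificate-based user that will hold the table permissions:
CREATE USER [MyProcCertUser] FROM CERTIFICATE [MyProcCert];

-- Grant access to the certificate user only, not to the callers:
GRANT SELECT ON MySchema.SomeTable TO [MyProcCertUser];

-- Signing the proc lets it carry the certificate user's permissions,
-- including for the Dynamic SQL it executes. (Remember: any ALTER of
-- the proc drops the signature, so re-sign after changes.)
ADD SIGNATURE TO MySchema.MyProc
    BY CERTIFICATE [MyProcCert]
    WITH PASSWORD = N'SomeStrongPassword!1';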