Is there a way to prevent Scalar UDFs in computed columns from inhibiting parallelism?

Yes, if all of the following are true:

  • you are running SQL Server 2014 or later;
  • you are able to run the query with trace flag 176 active; and
  • the computed column is PERSISTED.

Specifically, at least the following versions are required:

  • Cumulative Update 2 for SQL Server 2016 SP1
  • Cumulative Update 4 for SQL Server 2016 RTM
  • Cumulative Update 6 for SQL Server 2014 SP2

However, to avoid a bug introduced in those fixes (ref for 2014, and for 2016 and 2017), apply these instead:

  • Cumulative Update 1 for SQL Server 2017
  • Cumulative Update 5 for SQL Server 2016 SP1
  • Cumulative Update 8 for SQL Server 2016 RTM
  • Cumulative Update 8 for SQL Server 2014 SP2

The trace flag is effective as a start-up -T option, at both global and session scope using DBCC TRACEON, and per query with OPTION (QUERYTRACEON) or a plan guide.
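As a rough sketch of those scopes (the table dbo.Example is a hypothetical placeholder, not something from the question):

```sql
-- Start-up option: add -T176 to the SQL Server service start-up parameters.

-- Session scope: affects only the current connection.
DBCC TRACEON (176);

-- Global scope: the -1 argument applies the flag to all connections.
DBCC TRACEON (176, -1);

-- Per query: applies only to the statement carrying the hint.
-- dbo.Example is a hypothetical table used for illustration.
SELECT COUNT_BIG(*)
FROM dbo.Example AS E
OPTION (QUERYTRACEON 176);

-- Turning the flag off again at session and global scope.
DBCC TRACEOFF (176);
DBCC TRACEOFF (176, -1);
```

Note that the QUERYTRACEON hint normally requires sysadmin membership unless it is applied through a plan guide, so the per-query form may not be available to every caller.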

Trace flag 176 prevents persisted computed column expansion.

The initial metadata load performed when compiling a query brings in all columns, not just those directly referenced. This makes all computed column definitions available for matching, which is generally a good thing.

As an unfortunate side-effect, if one of the loaded (computed) columns uses a scalar user-defined function, its presence disables parallelism for the whole query, even when the computed column is not actually used.

Trace flag 176 helps with this, if the column is persisted, by not loading the definition (since expansion is skipped). This way, a scalar user-defined function is never present in the compiling query tree, so parallelism is not disabled.

The main drawback of trace flag 176 (aside from being only lightly documented) is that it also prevents query expression matching to persisted computed columns: if the query contains an expression matching a persisted computed column, trace flag 176 will prevent the expression from being replaced by a reference to the computed column.
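To make the trade-off concrete, here is a minimal sketch. All object names (dbo.MultiplyInts, dbo.Example, and its columns) are hypothetical, and the table would need enough rows for the optimizer to consider a parallel plan at all:

```sql
-- Hypothetical scalar UDF. SCHEMABINDING is needed so the function is
-- considered deterministic, which a PERSISTED computed column requires.
CREATE FUNCTION dbo.MultiplyInts (@First integer, @Second integer)
RETURNS integer
WITH SCHEMABINDING
AS
BEGIN
    RETURN (@First * @Second);
END;
GO

-- Hypothetical table with a persisted computed column built on the UDF.
CREATE TABLE dbo.Example
(
    c1 integer IDENTITY NOT NULL PRIMARY KEY,
    c2 integer NOT NULL,
    c3 integer NOT NULL,
    c4 AS dbo.MultiplyInts(c2, c3) PERSISTED
);
GO

-- c4 is not referenced, but its definition is still loaded at compile
-- time, so the UDF is expected to block parallelism here.
SELECT SUM(E.c2) FROM dbo.Example AS E;

-- With the flag, the definition of c4 is not loaded.
SELECT SUM(E.c2) FROM dbo.Example AS E
OPTION (QUERYTRACEON 176);

-- The drawback: this expression matches the definition of c4, but with
-- the flag active it is not replaced by a reference to the column.
SELECT E.c1 FROM dbo.Example AS E
WHERE dbo.MultiplyInts(E.c2, E.c3) = 100
OPTION (QUERYTRACEON 176);

-- Referencing the column by name avoids relying on expression matching.
SELECT E.c1 FROM dbo.Example AS E
WHERE E.c4 = 100
OPTION (QUERYTRACEON 176);
```

Comparing the execution plans of these statements is the simplest way to confirm the behaviour on a given build.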

For more details, see my SQLPerformance.com article, Properly Persisted Computed Columns.

Since the question mentions XML, as an alternative to promoting values using a computed column and scalar function, you could also look at using a Selective XML Index, as you wrote about in Selective XML Indexes: Not Bad At All.


In addition to @Paul's excellent Yes #1, there is actually a Yes #2 that:

  • works as far back as SQL Server 2005,
  • does not require setting a trace flag,
  • does not require that the computed column be PERSISTED, and
  • (due to not requiring trace flag 176), does not prevent query expression matching to persisted computed columns

The only drawbacks (as far as I can tell) are:

  • does not work on Azure SQL Database (at least not yet, though it does work on Amazon RDS SQL Server as well as SQL Server on Linux), and
  • is a little bit outside of many DBAs' comfort zone

And this option is: SQLCLR

That's right. One cool aspect of SQLCLR Scalar UDFs is that, if they do not do any data access (neither user nor system), then they do not prohibit parallelism. And that is not just theory or marketing. While I do not have time (at the moment) to do the fully detailed write-up, I have tested and proven this.

I used the initial setup from the following blog post (hopefully the O.P. doesn't consider this to be an unreliable source):

Bad Idea Jeans: Multiple Index Hints

And performed the following tests:

  1. Ran the initial query as is ─⇾ Parallelism (as expected)
  2. Added a non-persisted computed column defined as ([c2] * [c3]) ─⇾ Parallelism (as expected)
  3. Removed that computed column and added a non-persisted computed column that referenced a T-SQL Scalar UDF (created with SCHEMABINDING) defined as RETURN (@First * @Second); ─⇾ NO Parallelism (as expected)
  4. Removed the T-SQL UDF computed column and added a non-persisted computed column that referenced a SQLCLR Scalar UDF (tried with both IsDeterministic = true and = false) defined as return SqlInt32.Multiply(First, Second); ─⇾ Parallelism (woo hoo!!)
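The T-SQL side of tests 2 and 3 can be sketched roughly as follows. The table name dbo.TestTable, the column name cc, and the function name dbo.MultiplyTSql are placeholders I have assumed here, since the actual names come from the blog post's setup; only the column expression and the function body are taken from the list above:

```sql
-- Test 2: non-persisted computed column using a plain expression.
-- dbo.TestTable and cc are assumed names; c2 and c3 come from the test.
ALTER TABLE dbo.TestTable ADD cc AS ([c2] * [c3]);
GO

-- Test 3: drop that column, then replace it with one that references
-- a schema-bound T-SQL scalar UDF.
ALTER TABLE dbo.TestTable DROP COLUMN cc;
GO

CREATE FUNCTION dbo.MultiplyTSql (@First integer, @Second integer)
RETURNS integer
WITH SCHEMABINDING
AS
BEGIN
    RETURN (@First * @Second);
END;
GO

ALTER TABLE dbo.TestTable ADD cc AS dbo.MultiplyTSql([c2], [c3]);
GO
```

Test 4 follows the same pattern, except that the function is a SQLCLR one registered from a compiled assembly rather than a T-SQL module.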

So, while SQLCLR won't work for everyone, it certainly has its advantages for those people / situations / environments that are a good fit. As for this specific question, where the example given uses XQuery, it would certainly work (and, depending on what specifically is being done, it might even be a little faster).