What's the difference between WITH CTE & WITH CTE (<column_names>)?

You nearly have the answer for one of your questions already.

In the MSDN page, there is a line directly after your quote that explains this:

The basic syntax structure for a CTE is:

WITH expression_name [ ( column_name [,...n] ) ]

AS

( CTE_query_definition )

The list of column names is optional only if distinct names for all resulting columns are supplied in the query definition.

(Emphasis added)

This would mean that you would need to specify column names in a few situations:

  • This would work:

    WITH [test_table] ([NoName], [CAST], [Function]) 
    AS
    (
        SELECT 
            1
          , CAST('1' AS CHAR(1))
          , dbo.CastToChar(1)
    )
    SELECT * FROM [test_table];
    
  • as would this:

    WITH [test_table]  
    AS
    (
        SELECT 
            1 as [NoName]
          , CAST('1' AS CHAR(1)) as [CAST]
          , dbo.CastToChar(1) as [Function]
    )
    SELECT * FROM [test_table];
    
  • But this would not, because the columns have no names at all (let alone distinct ones):

    WITH [test_table] 
    AS
    (
        SELECT 
            1
          , CAST('1' AS CHAR(1))
          , dbo.CastToChar(1)
    )
    SELECT * FROM [test_table];
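
The "distinct names" requirement also bites when every column has a name but two of the names collide. Here is a minimal sketch of that case. The inline VALUES tables and the names pairs, left_id and right_id are made up for illustration, so the query should run without any existing objects:

```sql
-- Both inner columns are called "id", so the CTE needs a column list
-- (or distinct aliases) to give them distinct names.
WITH pairs (left_id, right_id)
AS
(
    SELECT a.id
         , b.id
    FROM (VALUES (1), (2)) AS a (id)
    CROSS JOIN (VALUES (10)) AS b (id)
)
SELECT left_id
     , right_id
FROM pairs;
```

Without the (left_id, right_id) list, SQL Server should reject the CTE because the column name id is specified more than once.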
    

Anecdotally, I prefer to name the columns inside the CTE instead of inside the WITH CTE (xxx) AS clause, since that way you can never inadvertently mismatch the names and the column contents.

Take for instance the following example:

;WITH MyCTE (x, y)
AS 
(
    SELECT mt.y
         , mt.x
    FROM MySchema.MyTable mt
)
SELECT MyCTE.x
     , MyCTE.y
FROM MyCTE;

What does this display? It shows the contents of the y column under the heading of x, and the contents of the x column under the heading y.
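
If you want to see the swap without creating MySchema.MyTable, here is a small self-contained sketch. The single-row VALUES source (aliased src, a name chosen for illustration) stands in for the table:

```sql
-- src.x is 1 and src.y is 2, but the column list is matched by position,
-- so the first selected column (src.y) ends up under the name x.
WITH MyCTE (x, y)
AS
(
    SELECT src.y
         , src.x
    FROM (VALUES (1, 2)) AS src (x, y)
)
SELECT MyCTE.x   -- returns 2
     , MyCTE.y   -- returns 1
FROM MyCTE;
```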

With this realization, I never specify the column names in the (xxx) AS clause; instead, I do it like this:

;WITH MyCTE
AS 
(
    SELECT Alias1 = mt.y
         , Alias2 = mt.x
    FROM MySchema.MyTable mt
)
SELECT MyCTE.Alias1
     , MyCTE.Alias2
FROM MyCTE;

This removes all doubt about what the column definitions are.

On a totally unrelated side note: always specify the schema name when referencing objects, and end your statements with a semicolon.


Ultimately each column needs a valid name and you can assign it in two ways:

  1. Column list

    ;WITH cte (foo)
    AS
    ( select col from tab )
    select foo from cte;
    
  2. Using original column names or aliases

    ;WITH cte
    AS
    ( select col from tab )
    select col from cte;
    

When you supply both a column list and aliases, the column list takes precedence:

  3. Both column list & aliases

    ;WITH cte (foo, bar)
    AS
    ( select col1,        -- the name col1 is not valid in the outer select
             col2 as colx -- the alias colx is not valid in the outer select
      from tab )
    select foo, bar from cte;
    

This is similar to the definition of a View or a Derived Table, where you can specify a list of column names, too.

Column list: When you have a lot of complex calculations, it's easier to spot the names because they're not scattered through the source code. It's also easier if you have a recursive CTE, and you can assign two different names to the same column, as in #3.

Original names/aliases: You only have to assign an alias if you do a calculation or want (or need) to rename a column.
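
To illustrate the recursive case mentioned under the column list option, here is a minimal sketch in SQL Server syntax. The names numbers and n are invented for the example. The column list names the column once, and both the anchor and the recursive member rely on it:

```sql
-- The anchor member selects an unnamed literal; the column list supplies
-- the name n, which the recursive member then references.
WITH numbers (n)
AS
(
    SELECT 1                 -- anchor member
    UNION ALL
    SELECT n + 1             -- recursive member
    FROM numbers
    WHERE n < 5
)
SELECT n
FROM numbers;                -- returns the rows 1 through 5
```

The alias-based alternative would be to write SELECT 1 AS n in the anchor member and drop the column list.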