Why does SQL Server require the datatype length to be the same when using UNPIVOT?

What is the logic behind requiring the same length when using UNPIVOT?

This question may only be truly answerable by the people who worked on the implementation of UNPIVOT. You might be able to obtain this by contacting them for support. The following is my understanding of the reasoning, which may not be 100% accurate:

T-SQL contains any number of instances of weird semantics and other counter-intuitive behaviours. Some of these will eventually go away as part of deprecation cycles, but others may never be 'improved' or 'fixed'. Quite aside from anything else, applications exist that depend on these behaviours, so backward compatibility has to be preserved.

The rules for implicit conversions, and expression type derivation account for a significant proportion of the weirdness mentioned above. I do not envy the testers who have to ensure that the weird (and often undocumented) behaviours are preserved (under all combinations of SET session values and so on) for new versions.

That said, there is no good reason not to make improvements, and avoid past mistakes, when introducing new language features (with obviously no backward compatibility baggage). New features like recursive common table expressions (as mentioned by Andriy M in a comment) and UNPIVOT were free to have relatively sane semantics and clearly-defined rules.

There will be a range of views as to whether including the length in the type is taking explicit typing too far, but personally I welcome it. In my view, the types varchar(25) and varchar(50) are not the same, any more than decimal(8) and decimal(10) are. Special casing string type conversion complicates things unnecessarily and adds no real value, in my opinion.

One could argue that only implicit conversions that might lose data should be required to be explicitly stated, but there are edge-cases there too. Ultimately, a conversion is going to be needed, so we might as well make it explicit.

If the implicit conversion from varchar(25) to varchar(50) were allowed, it would just be another (most likely hidden) implicit conversion, with all the usual weird edge cases and SET setting sensitivities. Why not make the implementation the simplest and most explicit possible? (Nothing is perfect, however, and it is a shame that hiding varchar(25) and varchar(50) inside a sql_variant is allowed.)

Rewriting the UNPIVOT with APPLY and UNION ALL avoids the (better) type behaviour because the rules for UNION are subject to backward compatibility, and are documented in Books Online as allowing different types so long as they are comparable using implicit conversion (for which the arcane rules of data type precedence are used, and so on).

The workaround involves being explicit about the data types and adding explicit conversions where necessary. This looks like progress to me :)

One way to write the explicitly-typed workaround:

SELECT
    U.PersonId,
    U.ColumnName,
    U.Value
FROM dbo.People AS P
CROSS APPLY
(
    VALUES (CONVERT(varchar(50), Lastname))
) AS CA (Lastname)
UNPIVOT
(
    Value FOR
    ColumnName IN (P.Firstname, CA.Lastname)
) AS U;

Recursive CTE example:

-- Fails
WITH R AS
(
    SELECT Dummy = 'A row'
    UNION ALL
    SELECT 'Another row'
    FROM R
    WHERE Dummy = 'A row'
)
SELECT Dummy
FROM R;

-- Succeeds
WITH R AS
(
    SELECT Dummy = CONVERT(varchar(11), 'A row')
    UNION ALL
    SELECT CONVERT(varchar(11), 'Another row')
    FROM R
    WHERE Dummy = 'A row'
)
SELECT Dummy
FROM R;

Finally, note that the rewrite using CROSS APPLY in the question is not quite the same as the UNPIVOT, because it does not reject NULL attributes.

Why does SQL Server require the datatype length to be the same when using UNPIVOT?

Tags:

Sql Server

Unpivot

Related

Recent Posts