Type system oddity: Enumerable.Cast<int>()

I think it's got to do with the fact that Array implements the non-generic IEnumerable and that there is all sorts of special casing for arrays in C#

Yes, you're correct. More precisely, it has to do with array variance. Array variance is a loosening of the type system that dates from .NET 1.0. It was problematic, but it allowed some tricky cases to be worked around. Here's an example:

string[] first = {"a", "b", "c"};
object[] second = first;
string[] third = (string[])second;
Console.WriteLine(third[0]); // Prints "a"

This weakens type safety, because the compiler doesn't stop us from writing:

string[] first = {"a", "b", "c"};
object[] second = first;
Uri[] third = (Uri[])second; // InvalidCastException

And there are worse cases again.
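One such case, as a minimal sketch: a write through the covariant object[] reference compiles cleanly and only fails when it runs. The class and method names here (CovariantWriteDemo, WriteThrows) are made up for illustration.

```csharp
using System;

static class CovariantWriteDemo
{
    // Returns true if storing a non-string through the object[] view throws.
    public static bool WriteThrows()
    {
        string[] first = { "a", "b", "c" };
        object[] second = first;   // implicit covariant conversion; still a string[] underneath

        try
        {
            second[0] = 42;        // compiles, but the runtime checks the element type on each store
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(WriteThrows()); // True
    }
}
```

The price of this check is paid on every store into an array of reference types, whether or not variance is actually in play.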

Array variance is less useful now that we have generics (from .NET 2.0 and C# 2 onwards) than it was before, when it let us overcome some of the limitations that the lack of generics imposed on us (if it was ever justified, which some would debate).

The rules allow us to do:

  • implicit casts to arrays of a base reference type (e.g. string[] to object[]);
  • explicit casts to arrays of a derived reference type (e.g. object[] to string[]);
  • explicit casts from Array or IEnumerable to any type of array;
  • and also (this is the sticky part) casts of Array and IEnumerable references that point at arrays of primitive types or enums to arrays of other primitive types or enums of the same size (int, uint and int-based enums are all the same size).
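The "sticky part" can be seen with int[] and uint[]. This is a small sketch with made-up names (PrimitiveArrayCastDemo and its methods); going through object defers the check to the runtime.

```csharp
using System;

static class PrimitiveArrayCastDemo
{
    // The type test runs at run time because the static type is object.
    public static bool IntArrayIsUIntArray()
    {
        object boxed = new int[] { 1, 2, 3 };
        return boxed is uint[];
    }

    // The cast succeeds and yields a view of the same array, not a copy.
    public static uint FirstAsUInt()
    {
        object boxed = new int[] { -1, 2, 3 };
        uint[] view = (uint[])boxed;
        return view[0];            // the bits of -1 read as an unsigned value
    }

    static void Main()
    {
        Console.WriteLine(IntArrayIsUIntArray()); // True
        Console.WriteLine(FirstAsUInt());         // 4294967295
    }
}
```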

This means that the optimisation in Cast<TResult>() of returning the source directly when it can already be cast to IEnumerable<TResult>, rather than casting each value individually, can have the surprising effects you note.
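Roughly, the optimisation has the following shape. This is a hypothetical re-creation for illustration (CastSketch and CastIterator are my names), not the library's actual source.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class CastSketchDemo
{
    // Hypothetical sketch of a Cast-like method with a fast path.
    public static IEnumerable<TResult> CastSketch<TResult>(IEnumerable source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        // Fast path: a runtime type test, so the CLR's lax array rules apply.
        if (source is IEnumerable<TResult> typed) return typed;

        // Slow path: cast (and unbox) one element at a time.
        return CastIterator<TResult>(source);
    }

    static IEnumerable<TResult> CastIterator<TResult>(IEnumerable source)
    {
        foreach (object item in source) yield return (TResult)item;
    }

    static void Main()
    {
        int[] ints = { 1, 2, 3 };

        IEnumerable<uint> fast = CastSketch<uint>(ints);
        Console.WriteLine(ReferenceEquals(fast, ints));   // True: the very same array comes back

        IEnumerable<object> slow = CastSketch<object>(ints);
        Console.WriteLine(ReferenceEquals(slow, ints));   // False: a new iterator object
    }
}
```

Because the fast path returns the original reference, GetType() on the result reports the source array's type rather than anything involving TResult.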

A practical effect of this that has tripped me up in the past shows up if you try enumValues.Cast<StringComparison>().ToArray() or enumValues.Cast<StringComparison>().ToList(). These would fail with ArrayTypeMismatchException, even though enumValues.Cast<StringComparison>().Skip(0).ToArray() would succeed. The reason is that, as well as Cast<TResult>() using the optimisation noted, ToArray<TSource>() and ToList<TSource>() use the optimisation of calling ICollection<T>.CopyTo() internally, and on arrays that call fails with the sort of variance involved here.
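A hedged sketch of the situation follows. It assumes enumValues is an int[] holding StringComparison values, which is my guess at the scenario, and CastEach is a hypothetical helper that casts element by element so the source array never reaches CopyTo(). Whether the direct call throws depends on the runtime, so the sketch reports the outcome instead of assuming one.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

static class CopyToDemo
{
    // Hypothetical helper: unboxes each element individually into a fresh list.
    public static T[] CastEach<T>(IEnumerable source)
    {
        var result = new List<T>();
        foreach (object item in source) result.Add((T)item);
        return result.ToArray();
    }

    static void Main()
    {
        IEnumerable enumValues = new int[] { 0, 1, 4 };   // assumed shape of the data

        try
        {
            StringComparison[] direct = enumValues.Cast<StringComparison>().ToArray();
            Console.WriteLine("Direct ToArray() succeeded with " + direct.Length + " items");
        }
        catch (ArrayTypeMismatchException)
        {
            Console.WriteLine("Direct ToArray() threw ArrayTypeMismatchException");
        }

        // Works regardless of how CopyTo() treats this kind of variance.
        StringComparison[] safe = CastEach<StringComparison>(enumValues);
        Console.WriteLine(string.Join(", ", safe));
    }
}
```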

In .NET Core there was a loosening of the restrictions on CopyTo() with arrays that means this code succeeds rather than throwing, but I forget at which version that change was introduced.


Jon Hanna's answer is pretty much correct, but I can add a few small details.

I'm expecting it to print the type to which the array is being cast Int32[], but instead it prints the original type Foo[]!

What should you have expected? The contract of Cast<int> is that the object that is returned can be used in any context that expects an IEnumerable<int>, and you got that. That's all you should have expected; the rest is implementation details.

Now, I grant you that the fact that a Foo[] can be used as IEnumerable<int> is odd, but remember, a Foo is just an extremely thin wrapper around an int. The size of a Foo is the same as the size of an int, the contents of a Foo are the same as the contents of an int, and so the CLR in its wisdom answers "yes" when asked "is this Foo[] usable as an IEnumerable<int>?"
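To make that concrete, here is a small sketch. The enum Foo below is a hypothetical stand-in for the one in the question (int-based by default), and the class and method names are mine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

enum Foo { A, B, C }   // hypothetical stand-in; underlying type is int

static class EnumCastDemo
{
    public static IEnumerable<int> CastToInt(Foo[] values)
    {
        return values.Cast<int>();
    }

    static void Main()
    {
        Foo[] enumValues = { Foo.A, Foo.B, Foo.C };
        IEnumerable<int> result = CastToInt(enumValues);

        Console.WriteLine(result.GetType());                    // Foo[]
        Console.WriteLine(ReferenceEquals(result, enumValues)); // True: no new sequence was allocated

        // The contract still holds: the result can be consumed as a sequence of int.
        foreach (int value in result) Console.WriteLine(value);
    }
}
```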

But what about this?

enumValues as IEnumerable<int> causes a compiler error

This sure sounds like a contradiction, doesn't it?

The problem is that the rules of C# and the rules of the CLR do not match in this situation.

  • The CLR says "a Foo[] can be used as an int[], and a uint[] and ... ".
  • The C# type analyzer is more restrictive. It does not use all of the lax covariance rules of the CLR. The C# type analyzer will allow string[] to be used as object[], and will allow IEnumerable<string> to be used as IEnumerable<object> but it will not allow Foo[] to be used as int[] or IEnumerable<int> and so on. C# only allows covariance when the varying types are both reference types. The CLR allows covariance when the varying types are reference types, or int, uint, or int-sized enums.

The C# compiler "knows" that the conversion from Foo[] to IEnumerable<int> cannot succeed in the C# type system, and so it produces a compiler error; a conversion in C# must be possible to be legal. The fact that this is possible in the more-lenient CLR type system is not considered by the compiler.

By inserting a cast to object or IEnumerable or whatever, you are telling the C# compiler to stop using the rules of C#, and start letting the runtime figure it out. By removing the cast, you're saying that you want the C# compiler to render its judgment, and it does.
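As a sketch of the difference, again with a hypothetical Foo enum and made-up method names: the direct conversion is commented out because the C# compiler rejects it, while the detour through object compiles and is decided by the runtime.

```csharp
using System;
using System.Collections.Generic;

enum Foo { A, B, C }   // hypothetical stand-in; underlying type is int

static class CompileVersusRuntimeDemo
{
    public static IEnumerable<int> ViaObject(Foo[] values)
    {
        // return values as IEnumerable<int>;      // rejected by the C# compiler
        return (object)values as IEnumerable<int>; // compiles; the CLR answers at run time
    }

    static void Main()
    {
        Foo[] enumValues = { Foo.A, Foo.B, Foo.C };
        Console.WriteLine(ViaObject(enumValues) != null); // True
    }
}
```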

So now we have a language design problem; plainly we have an inconsistency here. There are several ways out of this inconsistency.

  • C# could match the rules of the CLR, and allow covariant conversions amongst integer types.
  • C# could generate the as operator so that it implements the rules of C# at runtime; basically, it would have to detect legal-in-the-CLR but illegal-in-C# conversions and disallow them, making all such conversions slower. Moreover, it would then require your scenario to go to the memory-allocating slow path of Cast<T> instead of the reference-preserving fast path.
  • C# could be inconsistent and live with the inconsistency.

The second choice is obviously unfeasible. It only adds costs and has no benefits other than consistency.

It comes down then to the first and third choices, and the C# 1.0 design team chose the third. (Remember, the C# 1.0 design team did not know that they would be adding generics in C# 2.0 or generic variance in C# 4.0.) For the C# 1.0 design team the question was whether enumValues as int[] should be legal or not, and they decided not. Then that design decision was made again for C# 2.0 and C# 4.0.

There are plenty of principled arguments on either side but in practice this situation almost never arises in real world code, and the inconsistency almost never matters, so the lowest-cost choice is to just live with the odd fact that (IEnumerable<int>)(object)enumValues is legal but (IEnumerable<int>)enumValues is not.

For more on this, see my 2009 article on the subject

https://blogs.msdn.microsoft.com/ericlippert/2009/09/24/why-is-covariance-of-value-typed-arrays-inconsistent/

and this related question:

Why does my C# array lose type sign information when cast to object?

Tags: C#, LINQ