Understanding .AsEnumerable() in LINQ to SQL

There are three implementations of AsEnumerable.

DataTableExtensions.AsEnumerable

Extends a DataTable to give it an IEnumerable interface so you can use Linq against the DataTable.

Enumerable.AsEnumerable<TSource> and ParallelEnumerable.AsEnumerable<TSource>

The AsEnumerable<TSource>(IEnumerable<TSource>) method has no effect other than to change the compile-time type of source from a type that implements IEnumerable<T> to IEnumerable<T> itself.

AsEnumerable<TSource>(IEnumerable<TSource>) can be used to choose between query implementations when a sequence implements IEnumerable<T> but also has a different set of public query methods available. For example, given a generic class Table that implements IEnumerable<T> and has its own methods such as Where, Select, and SelectMany, a call to Where would invoke the public Where method of Table. A Table type that represents a database table could have a Where method that takes the predicate argument as an expression tree and converts the tree to SQL for remote execution. If remote execution is not desired, for example because the predicate invokes a local method, the AsEnumerable<TSource> method can be used to hide the custom methods and instead make the standard query operators available.

In other words.

If I have an

IQueryable<X> sequence = ...;

from a LinqProvider, like Entity Framework, and I do,

sequence.Where(x => SomeUnusualPredicate(x));

that query will be composed and run on the server. This will fail at runtime because the EntityFramework doesn't know how to convert SomeUnusualPredicate into SQL.

If I want that to run the statement with Linq to Objects instead, I do,

sequence.AsEnumerable().Where(x => SomeUnusualPredicate(x));

now the server will return all the data and the Enumerable.Where from Linq to Objects will be used instead of the Query Provider's implementation.

It won't matter that Entity Framework doesn't know how to interpret SomeUnusualPredicate, my function will be used directly. (However, this may be an inefficient approach since all rows will be returned from the server.)


At the point where the enumeration is enumerated through, the database will then be queried, and the entire resultset retrieved.

A part-and-part solution can be the way. Consider

var res = (
    from result in SomeSource
    where DatabaseConvertableCriterion(result)
    && NonDatabaseConvertableCriterion(result)
    select new {result.A, result.B}
);

Let's say also that NonDatabaseConvertableCriterion requires field C from result. Because NonDatabaseConvertableCriterion does what its name suggests, this has to be performed as an enumeration. However, consider:

var partWay =
(
    from result in SomeSource
    where DatabaseConvertableCriterion(result)
    select new {result.A, result.B, result.C}
);
var res =
(
    from result in partWay.AsEnumerable()
    where NonDatabaseConvertableCriterion select new {result.A, result.B}
);

In this case, when res is enumerated, queried or otherwise used, as much work as possible will be passed to the database, which will return enough to continue the job. Assuming that it is indeed really impossible to rewrite so that all the work can be sent to the database, this may be a suitable compromise.


The reason for AsEnumerable is to

AsEnumerable(TSource)(IEnumerable(TSource)) can be used to choose between query implementations when a sequence implements IEnumerable(T) but also has a different set of public query methods available

So when you were calling the Where method before, you were calling a different Where method from the IEnumerable.Where. That Where statement was for LINQ to convert to SQL, the new Where is the IEnumerable one that takes an IEnumerable, enumerates it and yields the matching items. Which explains why you see the different SQL being generated. The table will be taken in full from the database before the Where extension will be applied in your second version of the code. This could create a serious bottle neck, because the entire table has to be in memory, or worse the entire table would have to travel between servers. Allow SQL server to execute the Where and do what it does best.