EF Code First uses nvarchar(max) for all strings. Will this hurt query performance?

Larger nvarchar(max) values (over roughly 8,000 bytes) spill out of the row into LOB storage and require additional I/O. Smaller values are stored in-row. There are table options that control this behaviour, such as 'large value types out of row' (set with sp_tableoption) - see this MSDN article for more details.

If stored in-row there is no significant I/O performance overhead; there may be additional CPU overhead on processing the data type but this is likely to be minor.

However, leaving nvarchar(max) columns lying around the database where they are not needed is rather poor form. It does have some performance overhead, and column sizes are often quite helpful in understanding a table - for example, a varchar column 50 or 100 chars wide is likely to be a description or a free-text field, whereas one that's (say) 10-20 chars long is likely to be a code. You would be surprised how much meaning one often has to infer from a database through assumptions like this.

When you work in data warehousing, as often as not on poorly supported or documented legacy systems, a database schema that's easy to understand is quite valuable. If you think of the database as the application's legacy, try to be nice to the people who are going to inherit it from you.


Although this doesn't answer your specific question, it may save you from needing to ask it in the first place: you can set a maximum length on the string properties in your C# model class, which causes Entity Framework to generate SQL that uses a length-limited nvarchar type (e.g. nvarchar(50)) instead of nvarchar(max). Note that nvarchar(50) is still variable-length; the 50 is an upper bound, not a fixed width.

For example, instead of:

public string Name { get; set; }

You can use:

[StringLength(50)]
public string Name { get; set; }

You can also force the type to be varchar instead of nvarchar, if desired, as follows:

[Column(TypeName = "VARCHAR")]
[StringLength(50)]
public string Name { get; set; }

Source: https://stackoverflow.com/questions/7341783/entity-framework-data-annotations-set-stringlength-varchar/7341920
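
If you would rather keep attributes off the model class, the same limits can be set through the fluent API. This is a minimal sketch assuming EF6-style Code First (the System.Data.Entity namespace); the Person entity and PeopleContext class are hypothetical names used only for illustration.

```csharp
using System.Data.Entity;

// Hypothetical entity, used only for illustration.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical context, used only for illustration.
public class PeopleContext : DbContext
{
    public DbSet<Person> People { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Counterpart of [StringLength(50)]: Name maps to nvarchar(50)
        // instead of nvarchar(max).
        modelBuilder.Entity<Person>()
            .Property(p => p.Name)
            .HasMaxLength(50);

        // Uncomment to store Name as varchar(50) rather than nvarchar(50).
        // modelBuilder.Entity<Person>()
        //     .Property(p => p.Name)
        //     .IsUnicode(false);
    }
}
```

Whether to use attributes or the fluent API is mostly a matter of taste; the fluent version keeps all of the mapping decisions in one place.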


Indexing is the biggest concern. From BOL:

Columns that are of the large object (LOB) data types ntext, text, varchar(max), nvarchar(max), varbinary(max), xml, or image cannot be specified as key columns for an index.

If you can't index properly, you are going to have slow queries. And from a data integrity perspective, nvarchar(max) allows more bad data to be put in a field than a column with a specified limit would.
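
To illustrate the indexing point, here is a rough sketch that assumes EF 6.1 or later, where an [Index] attribute is available in System.ComponentModel.DataAnnotations.Schema. The Product class, its properties, and the index name are hypothetical. The bounded column can be an index key; the unbounded one cannot.

```csharp
using System.ComponentModel.DataAnnotations;
using System.ComponentModel.DataAnnotations.Schema;

// Hypothetical entity, used only for illustration.
public class Product
{
    public int Id { get; set; }

    // Bounded string: maps to nvarchar(20), which is small enough
    // to be used as an index key column.
    [StringLength(20)]
    [Index("IX_Product_Code")]
    public string Code { get; set; }

    // No length given: maps to nvarchar(max), which SQL Server
    // will not accept as an index key column.
    public string Notes { get; set; }
}
```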