SQL 'like' vs '=' performance

See https://web.archive.org/web/20150209022016/http://myitforum.com/cs2/blogs/jnelson/archive/2007/11/16/108354.aspx

Quote from there:

the rules for index usage with LIKE are loosely like this:

  • If your filter criteria uses equals = and the field is indexed, then most likely it will use an INDEX/CLUSTERED INDEX SEEK

  • If your filter criteria uses LIKE, with no wildcards (like if you had a parameter in a web report that COULD have a % but you instead use the full string), it is about as likely as #1 to use the index. The increased cost is almost nothing.

  • If your filter criteria uses LIKE, but with a wildcard at the beginning (as in Name0 LIKE '%UTER') it's much less likely to use the index, but it still may at least perform an INDEX SCAN on a full or partial range of the index.

  • HOWEVER, if your filter criteria uses LIKE, but starts with a STRING FIRST and has wildcards somewhere AFTER that (as in Name0 LIKE 'COMP%ER'), then SQL may just use an INDEX SEEK to quickly find rows that have the same first starting characters, and then look through those rows for an exact match.

(Also keep in mind, the SQL engine still might not use an index the way you're expecting, depending on what else is going on in your query and what tables you're joining to. The SQL engine reserves the right to rewrite your query a little to get the data in a way that it thinks is most efficient and that may include an INDEX SCAN instead of an INDEX SEEK)


You should also keep in mind that when using like, some sql flavors will ignore indexes, and that will kill performance. This is especially true if you don't use the "starts with" pattern like your example.

You should really look at the execution plan for the query and see what it's doing, guess as little as possible.

This being said, the "starts with" pattern can and is optimized in sql server. It will use the table index. EF 4.0 switched to like for StartsWith for this very reason.


It's a measureable difference.

Run the following:

Create Table #TempTester (id int, col1 varchar(20), value varchar(20))
go

INSERT INTO #TempTester (id, col1, value)
VALUES
(1, 'this is #1', 'abcdefghij')
GO

INSERT INTO #TempTester (id, col1, value)
VALUES
(2, 'this is #2', 'foob'),
(3, 'this is #3', 'abdefghic'),
(4, 'this is #4', 'other'),
(5, 'this is #5', 'zyx'),
(6, 'this is #6', 'zyx'),
(7, 'this is #7', 'zyx'),
(8, 'this is #8', 'klm'),
(9, 'this is #9', 'klm'),
(10, 'this is #10', 'zyx')
GO 10000

CREATE CLUSTERED INDEX ixId ON #TempTester(id)CREATE CLUSTERED INDEX ixId ON #TempTester(id)

CREATE NONCLUSTERED INDEX ixTesting ON #TempTester(value)

Then:

SET SHOWPLAN_XML ON

Then:

SELECT * FROM #TempTester WHERE value LIKE 'abc%'

SELECT * FROM #TempTester WHERE value = 'abcdefghij'

The resulting execution plan shows you that the cost of the first operation, the LIKE comparison, is about 10 times more expensive than the = comparison.

If you can use an = comparison, please do so.