What is the reason not to use select *?

The essence of the quote of not prematurely optimizing is to go for simple and straightforward code and then use a profiler to point out the hot spots, which you can then optimize to be efficient.

When you use select * you're make it impossible to profile, therefore you're not writing clear & straightforward code and you are going against the spirit of the quote. select * is an anti-pattern.


So selecting columns is not a premature optimization. A few things off the top of my head ....

  1. If you specify columns in a SQL statement, the SQL execution engine will error if that column is removed from the table and the query is executed.
  2. You can more easily scan code where that column is being used.
  3. You should always write queries to bring back the least amount of information.
  4. As others mention if you use ordinal column access you should never use select *
  5. If your SQL statement joins tables, select * gives you all columns from all tables in the join

The corollary is that using select * ...

  1. The columns used by the application is opaque
  2. DBA's and their query profilers are unable to help your application's poor performance
  3. The code is more brittle when changes occur
  4. Your database and network are suffering because they are bringing back too much data (I/O)
  5. Database engine optimizations are minimal as you're bringing back all data regardless (logical).

Writing correct SQL is just as easy as writing Select *. So the real lazy person writes proper SQL because they don't want to revisit the code and try to remember what they were doing when they did it. They don't want to explain to the DBA's about every bit of code. They don't want to explain to their clients why the application runs like a dog.


One major reason is that if you ever add/remove columns from your table, any query/procedure that is making a SELECT * call will now be getting more or less columns of data than expected.


If your code depends on the columns being in a specific order, your code will break when there are changes to the table. Also, you may be fetching too much from the table when you select *, especially if there is a binary field in the table.

Just because you are using all the columns now, it doesn't mean someone else isn't going to add an extra column to the table.

It also adds overhead to the plan execution caching since it has to fetch the meta data about the table to know what columns are in *.


  1. In a roundabout way you are breaking the modularity rule about using strict typing wherever possible. Explicit is almost universally better.

  2. Even if you now need every column in the table, more could be added later which will be pulled down every time you run the query and could hurt performance. It hurts performance because

    • You are pulling more data over the wire; and
    • Because you might defeat the optimizer's ability to pull the data right out of the index (for queries on columns that are all part of an index.) rather than doing a lookup in the table itself

When TO use select *

When you explicitly NEED every column in the table, as opposed to needing every column in the table THAT EXISTED AT THE TIME YOU WROTE THE QUERY. For example, if were writing an DB management app that needed to display the entire contents of the table (whatever they happened to be) you might use that approach.

Tags:

Sql