select * vs select column

It always pulls a tuple (except in cases where the table has been vertically segmented - broken up into columns pieces), so, to answer the question you asked, it doesn't matter from a performance perspective. However, for many other reasons, (below) you should always select specifically those columns you want, by name.

It always pulls a tuple, because (in every vendors RDBMS I am familiar with), the underlying on-disk storage structure for everything (including table data) is based on defined I/O Pages (in SQL Server for e.g., each Page is 8 kilobytes). And every I/O read or write is by Page.. I.e., every write or read is a complete Page of data.

Because of this underlying structural constraint, a consequence is that Each row of data in a database must always be on one and only one page. It cannot span multiple Pages of data (except for special things like blobs, where the actual blob data is stored in separate Page-chunks, and the actual table row column then only gets a pointer...). But these exceptions are just that, exceptions, and generally do not apply except in special cases ( for special types of data, or certain optimizations for special circumstances)
Even in these special cases, generally, the actual table row of data itself (which contains the pointer to the actual data for the Blob, or whatever), it must be stored on a single IO Page...

EXCEPTION. The only place where Select * is OK, is in the sub-query after an Exists or Not Exists predicate clause, as in:

   Select colA, colB
   From table1 t1
   Where Exists (Select * From Table2
                 Where column = t1.colA)

EDIT: To address @Mike Sherer comment, Yes it is true, both technically, with a bit of definition for your special case, and aesthetically. First, even when the set of columns requested are a subset of those stored in some index, the query processor must fetch every column stored in that index, not just the ones requested, for the same reasons - ALL I/O must be done in pages, and index data is stored in IO Pages just like table data. So if you define "tuple" for an index page as the set of columns stored in the index, the statement is still true.
and the statement is true aesthetically because the point is that it fetches data based on what is stored in the I/O page, not on what you ask for, and this true whether you are accessing the base table I/O Page or an index I/O Page.

For other reasons not to use Select *, see Why is SELECT * considered harmful? :


There are several reasons you should never (never ever) use SELECT * in production code:

  • since you're not giving your database any hints as to what you want, it will first need to check the table's definition in order to determine the columns on that table. That lookup will cost some time - not much in a single query - but it adds up over time

  • if you need only 2/3 of the columns, you're selecting 1/3 too much data which needs to be retrieving from disk and sent across the network

  • if you start to rely on certain aspects of the data, e.g. the order of the columns returned, you could get a nasty surprise once the table is reorganized and new columns are added (or existing ones removed)

  • in SQL Server (not sure about other databases), if you need a subset of columns, there's always a chance a non-clustered index might be covering that request (contain all columns needed). With a SELECT *, you're giving up on that possibility right from the get-go. In this particular case, the data would be retrieved from the index pages (if those contain all the necessary columns) and thus disk I/O and memory overhead would be much less compared to doing a SELECT *.... query.

Yes, it takes a bit more typing initially (tools like SQL Prompt for SQL Server will even help you there) - but this is really one case where there's a rule without any exception: do not ever use SELECT * in your production code. EVER.