What can be the downside of always having a single integer column as primary key?

Other than additional disk space (and in turn memory usage and I/O), there's not really any harm in adding an IDENTITY column even to tables that don't need one (an example of a table that doesn't need an IDENTITY column is a simple junction table, like mapping a user to his/her permissions).

In a blog post from 2010, I rail against blindly adding them to every single table:

  • Bad habits to kick : putting an IDENTITY column on every table

But surrogate keys do have valid use cases - just be careful not to assume that they guarantee uniqueness of the underlying data (which is sometimes why they get added - they should not be the only way to uniquely identify a row). If you need to use an ORM framework, and your ORM framework requires single-column integer keys even when your real key is not an integer, not a single column, or both, make sure that you define unique constraints/indexes for your real keys, too.
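To illustrate keeping the real key enforced alongside a surrogate, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server (so `INTEGER PRIMARY KEY` stands in for an IDENTITY column). The `company` table and its columns are hypothetical, invented purely for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: "id" is the surrogate key an ORM might insist on,
# while (country_code, tax_number) is the real, multi-column key.
conn.execute("""
    CREATE TABLE company (
        id           INTEGER PRIMARY KEY,   -- surrogate key
        country_code TEXT NOT NULL,
        tax_number   TEXT NOT NULL,
        name         TEXT NOT NULL,
        UNIQUE (country_code, tax_number)   -- the real key stays enforced
    )
""")

conn.execute(
    "INSERT INTO company (country_code, tax_number, name) VALUES (?, ?, ?)",
    ("DE", "123", "Example GmbH"),
)


def insert_duplicate(connection):
    """Insert the same real key again; return True if the database rejects it."""
    try:
        connection.execute(
            "INSERT INTO company (country_code, tax_number, name) VALUES (?, ?, ?)",
            ("DE", "123", "Example GmbH (again)"),
        )
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_rejected = insert_duplicate(conn)
row_count = conn.execute("SELECT COUNT(*) FROM company").fetchone()[0]
```

Without the `UNIQUE` clause the second insert would succeed and simply receive a new `id`, which is exactly the kind of duplicate the surrogate alone cannot prevent.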


In my experience, the main and overwhelming reason to use a separate ID for every table is the following:

In almost every case where my customer swore a blood oath in the conception phase that some external, "natural" field XYZBLARGH_ID would stay unique forever, would never change for a given entity, and would never be re-used, cases eventually appeared in which those primary key properties were broken. It just does not work out that way.

Then, from a DBA point of view, the things that make a DB slow or bloated are certainly not 4 bytes (or whatever) per row, but things like wrong or missing indexes, forgotten table/index reorganizations, wrong RAM/tablespace tuning parameters, neglecting to use bind variables and so on. Those can slow down the DB by factors of 10, 100, 10000... an additional ID column cannot.

So, even if there were a technical, measurable downside to having an additional 32 bits per row, the question is not whether you can optimize the ID away, but whether the ID will be essential at some point, which more likely than not it will be. And I'm not going to count off all the "soft" benefits from a software development stance (like your ORM example, or the fact that it makes things easier for software developers when all IDs have the same datatype by design, and so on).

N.B.: you do not need a separate ID for n:m association tables, because for such tables the IDs of the associated entities should form the primary key. A counterexample would be a weird n:m association which allows multiple associations between the same two entities for whatever bizarre reason - those would then need their own ID column to create a PK. There are ORM libraries which cannot handle multi-column PKs, though, so that would be a reason to be lenient with the developers if they have to work with such a library.
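As a rough sketch of such an association table, again using Python's sqlite3 module with hypothetical table names (`app_user`, `permission`, `user_permission`), the two foreign keys together form the primary key and no extra ID column is involved:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

conn.executescript("""
    CREATE TABLE app_user   (user_id       INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE permission (permission_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- n:m association: the pair of referenced IDs is the primary key.
    CREATE TABLE user_permission (
        user_id       INTEGER NOT NULL REFERENCES app_user (user_id),
        permission_id INTEGER NOT NULL REFERENCES permission (permission_id),
        PRIMARY KEY (user_id, permission_id)
    );

    INSERT INTO app_user        VALUES (1, 'alice');
    INSERT INTO permission      VALUES (10, 'read');
    INSERT INTO user_permission VALUES (1, 10);
""")

# Associating the same user with the same permission twice violates the PK.
try:
    conn.execute("INSERT INTO user_permission VALUES (1, 10)")
    duplicate_pair_rejected = False
except sqlite3.IntegrityError:
    duplicate_pair_rejected = True

pair_count = conn.execute("SELECT COUNT(*) FROM user_permission").fetchone()[0]
```

If the model really did need several associations between the same pair, this composite key would have to be replaced or extended, which is the counterexample described above.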


If you invariably add a meaningless extra column to every table and reference only those columns as foreign keys, then you will almost inevitably make the database more complex and difficult to use. Effectively you will be removing data of interest to users from the foreign key attributes and forcing the user/application to do an extra join to retrieve that same information. Queries become more complex, the optimizer's job becomes harder and performance may suffer.
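The extra join can be seen in a small sketch like the following, written with Python's sqlite3 module. The currency and invoice tables are hypothetical; variant A references the natural key directly, variant B references a surrogate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Variant A: the natural key (a currency code) is the primary key.
    CREATE TABLE currency_a (code TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE invoice_a (
        invoice_no    INTEGER PRIMARY KEY,
        currency_code TEXT NOT NULL REFERENCES currency_a (code)
    );

    -- Variant B: a meaningless surrogate is the primary key.
    CREATE TABLE currency_b (
        id   INTEGER PRIMARY KEY,
        code TEXT NOT NULL UNIQUE,
        name TEXT NOT NULL
    );
    CREATE TABLE invoice_b (
        invoice_no  INTEGER PRIMARY KEY,
        currency_id INTEGER NOT NULL REFERENCES currency_b (id)
    );

    INSERT INTO currency_a VALUES ('EUR', 'Euro');
    INSERT INTO invoice_a  VALUES (1, 'EUR');
    INSERT INTO currency_b VALUES (7, 'EUR', 'Euro');
    INSERT INTO invoice_b  VALUES (1, 7);
""")

# Variant A: the currency code is already in the invoice row.
code_without_join = conn.execute(
    "SELECT currency_code FROM invoice_a WHERE invoice_no = 1"
).fetchone()[0]

# Variant B: the invoice row only holds 7, so a join is needed to get the code.
code_with_join = conn.execute(
    """
    SELECT c.code
    FROM invoice_b AS i
    JOIN currency_b AS c ON c.id = i.currency_id
    WHERE i.invoice_no = 1
    """
).fetchone()[0]
```

Both queries return the same code; the difference is only in how much work the query and its author have to do to get there.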

Your tables will be more sparsely populated with "real" data than they would otherwise have been. The database will therefore be more difficult to comprehend and verify. You may also find it hard or impossible to enforce certain useful constraints (where constraints would involve multiple attributes that are no longer in the same table).

I'd suggest you choose your keys more carefully and make them integers only if/when you have good reasons to. Base your database designs on good analysis, data integrity, practicality and verifiable results rather than relying on dogmatic rules.