What are Identity Columns?

The goal is to implement the feature as defined in the SQL standard (the text below is copied from a draft dated 2011-12-21):

4.15.11 Identity columns

The columns of a base table BT can optionally include not more than one identity column. The declared type of an identity column is either an exact numeric type with scale 0 (zero), INTEGER for example, or a distinct type whose source type is an exact numeric type with scale 0 (zero). An identity column has a start value, an increment, a maximum value, a minimum value, and a cycle option. ...
... The definition of an identity column may specify GENERATED ALWAYS or GENERATED BY DEFAULT.

It is a property of a column which says that the values for the column are provided by the DBMS rather than by the user, in a specific manner and subject to certain restrictions (increasing, decreasing, having max/min values, cycling when the max/min value is reached).

Sequence generators (usually called just "sequences") are a related SQL standard feature: a mechanism that provides such values and can be used for identity columns.

Note the subtle difference: identity is a property of a single column, whereas a SEQUENCE is a standalone object that can provide values for one or more columns, or be called at will.
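
To make the distinction concrete, here is a minimal PostgreSQL sketch of one sequence feeding two tables and also being called directly. The names shared_seq, invoices and credit_notes are made up for illustration, and the columns are wired to the sequence with a plain DEFAULT rather than the identity syntax.

```sql
-- A standalone sequence object (hypothetical name)
CREATE SEQUENCE shared_seq START WITH 1 INCREMENT BY 1;

-- Two tables drawing their ids from the same sequence
CREATE TABLE invoices (
    id     bigint NOT NULL DEFAULT nextval('shared_seq'),
    amount numeric
);
CREATE TABLE credit_notes (
    id     bigint NOT NULL DEFAULT nextval('shared_seq'),
    amount numeric
);

INSERT INTO invoices (amount) VALUES (10.00);      -- id = 1
INSERT INTO credit_notes (amount) VALUES (-2.50);  -- id = 2

-- The sequence can also be called "at will", outside any table
SELECT nextval('shared_seq');                      -- returns 3
```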


The various DBMSs have so far implemented similar features in different ways and with different syntax (MySQL: AUTO_INCREMENT, SQL Server: IDENTITY (seed, increment), PostgreSQL: serial using SEQUENCE, Oracle: sequences with triggers, etc.), and some have only recently added the standard features (SQL Server added sequence generators in version 2012, Oracle added identity columns in 12c).

Up to now Postgres has implemented sequence generators (which can be used to provide values for a column, either with the special macros serial and bigserial or with the nextval() function) but has not yet implemented the syntax for identity columns as it is defined in the standard.
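
For reference, this is roughly what the serial macro expands to, following the description in the PostgreSQL documentation on serial types. The table and column names (tablename, colname) are placeholders.

```sql
-- Writing this:
--   CREATE TABLE tablename (colname serial);
-- is shorthand for approximately the following three statements:

CREATE SEQUENCE tablename_colname_seq AS integer;

CREATE TABLE tablename (
    colname integer NOT NULL DEFAULT nextval('tablename_colname_seq')
);

-- Tie the sequence's lifetime to the column, so dropping the column
-- (or the table) drops the sequence as well
ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;
```

Because the column ends up as an ordinary integer with a default, the link between column and sequence is fairly loose, which is where the usability issues discussed further down come from.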

Defining identity columns (and their slight difference from serial columns), along with the related syntax from the SQL standard (e.g. GENERATED ALWAYS, NEXT VALUE FOR, etc.), is what this feature is about. Some changes or improvements may also be needed in the implementation of sequences, as identity columns will be using sequences.

If you follow the link "identity columns" (from the page you saw), you'll find:

identity columns

From: Peter Eisentraut
To: pgsql-hackers
Subject: identity columns
Date: 2016-08-31 04:00:42
Message-ID: [email protected]


Here is another attempt to implement identity columns. This is a standard-conforming variant of PostgreSQL's serial columns. It also fixes a few usability issues that serial columns have:

  • need to set permissions on sequence in addition to table (*)
  • CREATE TABLE / LIKE copies default but refers to same sequence
  • cannot add/drop serialness with ALTER TABLE
  • dropping default does not drop sequence
  • slight weirdnesses because serial is some kind of special macro

(*) Not actually implemented yet, because I wanted to make use of the NEXT VALUE FOR stuff I had previously posted, but I have more work to do there.

...
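
Two of the serial quirks listed in the email (CREATE TABLE / LIKE sharing the sequence, and DROP DEFAULT leaving the sequence behind) can be reproduced with a short sketch like the one below. The table names orders and orders_copy are hypothetical.

```sql
CREATE TABLE orders (id serial PRIMARY KEY, item text);

-- LIKE ... INCLUDING DEFAULTS copies the default expression verbatim,
-- so the copy's default is still nextval('orders_id_seq')
CREATE TABLE orders_copy (LIKE orders INCLUDING DEFAULTS);

INSERT INTO orders (item) VALUES ('a');       -- id = 1
INSERT INTO orders_copy (item) VALUES ('b');  -- id = 2, same sequence

-- Dropping the defaults does not remove the sequence
ALTER TABLE orders_copy ALTER COLUMN id DROP DEFAULT;
ALTER TABLE orders ALTER COLUMN id DROP DEFAULT;

-- The sequence is still there and can still be queried
SELECT last_value FROM orders_id_seq;
```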

Update, September 2017: it seems the feature will be in Postgres 10, which is due to be released in a few days or weeks: What's New In Postgres 10: Identity Columns


Oracle has also implemented identity columns, in version 12c. The syntax follows the standard, as far as I have checked:
Identity Columns in Oracle Database 12c Release 1 (12.1)

The 12c database introduces the ability to define an identity clause against a table column defined using a numeric type. The syntax is shown below.

GENERATED
[ ALWAYS | BY DEFAULT [ ON NULL ] ]
AS IDENTITY [ ( identity_options ) ]
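
As an illustration of that Oracle syntax, a minimal sketch might look like this. The table and column names are made up, and the example assumes Oracle 12c or later.

```sql
-- Oracle 12c+: the identity value is used when the column is omitted
-- or (because of ON NULL) when NULL is supplied explicitly
CREATE TABLE identity_demo (
    id          NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY,
    description VARCHAR2(30)
);

INSERT INTO identity_demo (description) VALUES ('id omitted');
INSERT INTO identity_demo (id, description) VALUES (NULL, 'id given as NULL');
INSERT INTO identity_demo (id, description) VALUES (999, 'id given explicitly');
```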

How they're actually implemented in PG 10

You can see how they're actually implemented now using the test suite's expected output.

Some key takeaways:

  • You can specify the start value and the increment with sequence options, either at table creation or through ALTER TABLE, e.g. START 7 INCREMENT BY 5

  • An INSERT into a table with an identity column can now specify OVERRIDING USER VALUE, which makes the DBMS ignore the value supplied for the identity column and use the sequence-generated value instead:

    INSERT INTO t OVERRIDING USER VALUE VALUES (10, 'xyz');
    
    -- By contrast, a serial column keeps the user-supplied value,
    -- so the second INSERT below fails with a duplicate-key error.
    CREATE TABLE t ( a serial PRIMARY KEY, b text );
    INSERT INTO t (a,b) VALUES (1,'foo');
    INSERT INTO t (a,b) VALUES (1,'bar');
    
  • You can specify GENERATED ALWAYS to ensure the value is always generated. An INSERT that supplies a value for such a column must then specify OVERRIDING SYSTEM VALUE, or it fails with this error:

    ERROR:  cannot insert into column "a"
    DETAIL:  Column "a" is an identity column defined as GENERATED ALWAYS.
    HINT:  Use OVERRIDING SYSTEM VALUE to override.
    
  • Identity columns must be NOT NULL

  • Permissions come from the table; you no longer need to grant privileges on the underlying sequence separately.

  • The identity sequence can be reset to a different value with RESTART
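
The points above can be tried out with a short script such as the following. It is a sketch for PostgreSQL 10 or later; the table names items and events are hypothetical, and the ids in the comments assume the statements are run in order on freshly created tables.

```sql
-- GENERATED BY DEFAULT with custom start and increment
CREATE TABLE items (
    id   integer GENERATED BY DEFAULT AS IDENTITY (START WITH 7 INCREMENT BY 5),
    name text
);

INSERT INTO items (name) VALUES ('first');              -- id = 7
INSERT INTO items (name) VALUES ('second');             -- id = 12
INSERT INTO items (id, name) VALUES (100, 'explicit');  -- id = 100 (user value accepted)

-- OVERRIDING USER VALUE: the supplied 200 is ignored, the sequence is used
INSERT INTO items (id, name) OVERRIDING USER VALUE
VALUES (200, 'ignored');                                -- id = 17

-- GENERATED ALWAYS rejects user-supplied values by default
CREATE TABLE events (
    id   integer GENERATED ALWAYS AS IDENTITY,
    note text
);

INSERT INTO events (note) VALUES ('a');                 -- id = 1

-- This would fail with: cannot insert into column "id"
-- INSERT INTO events (id, note) VALUES (10, 'b');

-- OVERRIDING SYSTEM VALUE lets the supplied value through
INSERT INTO events (id, note) OVERRIDING SYSTEM VALUE
VALUES (10, 'b');                                       -- id = 10

-- RESTART moves the sequence to a new position
ALTER TABLE events ALTER COLUMN id RESTART WITH 50;
INSERT INTO events (note) VALUES ('c');                 -- id = 50
```

Note that a value inserted with OVERRIDING SYSTEM VALUE does not advance the sequence, so a later generated value could collide with it if the column has a unique constraint.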

You can read more about these in the PostgreSQL 10 docs for:

  • ALTER TABLE

    ALTER [ COLUMN ] column_name ADD GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY [ ( sequence_options ) ]
    ALTER [ COLUMN ] column_name DROP IDENTITY [ IF EXISTS ]
    ALTER [ COLUMN ] column_name { SET GENERATED { ALWAYS | BY DEFAULT } | SET sequence_option | RESTART [ [ WITH ] restart ] } [...]
    
  • CREATE TABLE

    GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY [ ( sequence_options ) ]
    
  • CREATE SEQUENCE, which documents the sequence_options mentioned above
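
As a rough illustration of the ALTER TABLE forms listed above, the following sketch turns an existing column into an identity column, changes its behaviour, and removes the identity property again. The accounts table is hypothetical; note that the column has to be NOT NULL before ADD GENERATED is accepted.

```sql
-- An ordinary table with a plain NOT NULL integer column
CREATE TABLE accounts (
    id   integer NOT NULL,
    name text
);

-- Turn the existing column into an identity column
ALTER TABLE accounts
    ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY (START WITH 100);

INSERT INTO accounts (name) VALUES ('alice');  -- id = 100

-- Switch from BY DEFAULT to ALWAYS
ALTER TABLE accounts ALTER COLUMN id SET GENERATED ALWAYS;

-- Change a sequence option, then restart at a chosen value
ALTER TABLE accounts ALTER COLUMN id SET INCREMENT BY 10;
ALTER TABLE accounts ALTER COLUMN id RESTART WITH 1000;

INSERT INTO accounts (name) VALUES ('bob');    -- id = 1000

-- Remove the identity property (and its implicit sequence) again
ALTER TABLE accounts ALTER COLUMN id DROP IDENTITY IF EXISTS;
```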