How does SQL Server choose an index key for a foreign key reference?

The (lack of) documentation suggests that this behaviour is an implementation detail, and is therefore undefined and subject to change at any time.

This is in stark contrast to CREATE FULLTEXT INDEX, where you have to specify the name of an index to attach to -- AFAIK, there is no undocumented FOREIGN KEY syntax to do the equivalent (though theoretically, there could be in the future).

As mentioned, it does make sense that SQL Server chooses the smallest physical index with which to associate the foreign key. If you change the script to create the unique constraint as CLUSTERED, the script "works" on 2008 R2. But that behaviour is still undefined and should not be relied upon.

As with most legacy applications, you'll just have to get down to the nitty-gritty and clean things up.
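As a starting point for that cleanup, a query along the following lines can list the foreign keys that are currently bound to something other than a primary key. This is only a sketch: it assumes you run it in the database in question, and the column aliases are my own.

```sql
-- List foreign keys whose backing index on the referenced table
-- is not the primary key (i.e. a unique constraint or unique index).
SELECT
    fk.name                                AS fk_name,
    OBJECT_NAME(fk.parent_object_id)       AS referencing_table,
    OBJECT_NAME(fk.referenced_object_id)   AS referenced_table,
    ix.name                                AS key_index_name,
    ix.type_desc                           AS key_index_type,
    ix.is_unique_constraint
FROM sys.foreign_keys AS fk
    JOIN sys.indexes AS ix
        ON ix.object_id = fk.referenced_object_id
        AND ix.index_id = fk.key_index_id
WHERE ix.is_primary_key = 0
ORDER BY referenced_table, fk_name;
```

Each row is a candidate for review: either the binding is intentional, or the foreign key can be dropped and recreated once the indexes on the referenced table are in the shape you want.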


Does SQL Server have a method of choosing between a unique index and a primary key?

It is at least possible to direct SQL Server to reference the primary key when the foreign key is created, even if alternate key constraints or unique indexes also exist on the referenced table.

If the primary key is to be referenced, specify only the name of the referenced table in the foreign key definition and omit the list of referenced columns:

ALTER TABLE Child
    ADD CONSTRAINT FK_Child_Parent FOREIGN KEY (ParentID)
        -- omit key columns of the referenced table
        REFERENCES Parent /*(ParentID)*/;

More details below.


Consider the following setup:

CREATE TABLE T (id int NOT NULL, a int, b int, c uniqueidentifier, filler binary(1000));
CREATE TABLE TRef (tid int NULL);

where table TRef intends to reference table T.

To create the referential constraint, one can use the ALTER TABLE command in either of two forms:

ALTER TABLE TRef
    ADD CONSTRAINT FK_TRef_T_1 FOREIGN KEY (tid) REFERENCES T (id);

ALTER TABLE TRef
    ADD CONSTRAINT FK_TRef_T_2 FOREIGN KEY (tid) REFERENCES T;

Notice that in the second case no columns of the referenced table are specified (REFERENCES T versus REFERENCES T (id)).

Since there are no unique indexes or key constraints on T yet, both commands fail.

The first command returns the following error:

Msg 1776, Level 16, State 0, Line 4

There are no primary or candidate keys in the referenced table 'T' that match the referencing column list in the foreign key 'FK_TRef_T_1'.

The second command, however, returns a different error:

Msg 1773, Level 16, State 0, Line 4

Foreign key 'FK_TRef_T_2' has implicit reference to object 'T' which does not have a primary key defined on it.

Note that in the first case the expectation is a primary or candidate key, whereas in the second case it is a primary key only.

Let's check whether SQL Server will use anything other than the primary key with the second command.

If we add some unique indexes and a unique constraint on T:

CREATE UNIQUE INDEX IX_T_1 ON T(id) INCLUDE (filler);
CREATE UNIQUE INDEX IX_T_2 ON T(id) INCLUDE (c);
CREATE UNIQUE INDEX IX_T_3 ON T(id) INCLUDE (a, b);

ALTER TABLE T
    ADD CONSTRAINT UQ_T UNIQUE CLUSTERED (id);

the command that creates FK_TRef_T_1 succeeds, but the command that creates FK_TRef_T_2 still fails with Msg 1773.

Finally, if we add a primary key on T:

ALTER TABLE T
    ADD CONSTRAINT PK_T PRIMARY KEY NONCLUSTERED (id);

the command that creates FK_TRef_T_2 succeeds.

Let's check which indexes of table T are referenced by the foreign keys of table TRef:

select
    ix.index_id,
    ix.name as index_name,
    ix.type_desc as index_type_desc,
    fk.name as fk_name
from sys.indexes ix
    left join sys.foreign_keys fk on
        fk.referenced_object_id = ix.object_id
        and fk.key_index_id = ix.index_id
        and fk.parent_object_id = object_id('TRef')
where ix.object_id = object_id('T');

This returns:

index_id  index_name  index_type_desc   fk_name
--------- ----------- ----------------- ------------
1         UQ_T        CLUSTERED         NULL
2         IX_T_1      NONCLUSTERED      FK_TRef_T_1
3         IX_T_2      NONCLUSTERED      NULL
4         IX_T_3      NONCLUSTERED      NULL
5         PK_T        NONCLUSTERED      FK_TRef_T_2

Note that FK_TRef_T_2 corresponds to PK_T.

So, yes, with the REFERENCES T syntax the foreign key of TRef is mapped to the primary key of T.

I was not able to find this behavior described directly in the SQL Server documentation, but the dedicated Msg 1773 suggests that it is not accidental. Most likely the implementation is there for compliance with the SQL Standard. Below is a short excerpt from section 11.8 of ANSI/ISO 9075-2:2003:

11 Schema definition and manipulation

11.8 <referential constraint definition>

Function
Specify a referential constraint.

Format

<referential constraint definition> ::=
    FOREIGN KEY <left paren> <referencing columns> <right paren>
        <references specification>

<references specification> ::=
    REFERENCES <referenced table and columns>
    [ MATCH <match type> ]
    [ <referential triggered action> ]
...

Syntax Rules
...
3) Case:
...
b) If the <referenced table and columns> does not specify a <reference column list>, then the table descriptor of the referenced table shall include a unique constraint that specifies PRIMARY KEY. Let referenced columns be the column or columns identified by the unique columns in that unique constraint and let referenced column be one such column. The <referenced table and columns> shall be considered to implicitly specify a <reference column list> that is identical to that <unique column list>.
...

Transact-SQL supports and extends ANSI SQL, but it does not conform to the SQL Standard exactly. A document named SQL Server Transact-SQL ISO/IEC 9075-2 Standards Support Document (MS-TSQLISO02 for short) describes the level of support that Transact-SQL provides and lists its extensions and variations to the standard. For example, it documents that the MATCH clause is not supported in the referential constraint definition. It lists no variations relevant to the cited piece of the standard, however. So, in my opinion, the observed behavior is documented well enough.
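To see which columns rule 3(b) would make REFERENCES T resolve to in the example above, the primary key columns of T can be read from the catalog views. This is a minimal sketch that assumes the T table and PK_T constraint from the setup; the column aliases are mine.

```sql
-- Columns of T's primary key, in key order: this is the column list
-- that REFERENCES T (without an explicit list) implicitly targets.
SELECT
    kc.name        AS pk_name,
    ic.key_ordinal,
    c.name         AS column_name
FROM sys.key_constraints AS kc
    JOIN sys.index_columns AS ic
        ON ic.object_id = kc.parent_object_id
        AND ic.index_id = kc.unique_index_id
    JOIN sys.columns AS c
        ON c.object_id = ic.object_id
        AND c.column_id = ic.column_id
WHERE kc.parent_object_id = OBJECT_ID('T')
    AND kc.type = 'PK'
ORDER BY ic.key_ordinal;
```

If the query returns no rows, the table has no primary key, which is the situation in which the REFERENCES T form fails with Msg 1773.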

With the REFERENCES T (<reference column list>) syntax, SQL Server appears to select the first suitable nonclustered index on the referenced table (seemingly the one with the lowest index_id, not the one with the smallest physical size as assumed in the question comments), or the clustered index if it is suitable and no suitable nonclustered index exists. This behavior seems to have been consistent since SQL Server 2008 (version 10.0). It is only an observation, of course, and comes with no guarantees.
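One way to probe that observation on your own version is to remove the index the foreign key is currently bound to, recreate the foreign key, and look at what it binds to next. The sketch below assumes the T, TRef, IX_T_1 and FK_TRef_T_1 objects from the setup above still exist. It makes no claim about the result, which should be checked rather than assumed.

```sql
-- The foreign key must be dropped first; the index it is bound to
-- cannot be dropped while the constraint still depends on it.
ALTER TABLE TRef DROP CONSTRAINT FK_TRef_T_1;
DROP INDEX IX_T_1 ON T;

-- Recreate the foreign key with an explicit column list.
ALTER TABLE TRef
    ADD CONSTRAINT FK_TRef_T_1 FOREIGN KEY (tid) REFERENCES T (id);

-- Show which index on T the recreated foreign key is now bound to.
SELECT
    fk.name      AS fk_name,
    ix.index_id,
    ix.name      AS index_name,
    ix.type_desc AS index_type_desc
FROM sys.foreign_keys AS fk
    JOIN sys.indexes AS ix
        ON ix.object_id = fk.referenced_object_id
        AND ix.index_id = fk.key_index_id
WHERE fk.name = 'FK_TRef_T_1';
```

Repeating the drop-and-recreate step for each remaining candidate index gives a picture of the selection order on that particular build.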