Help understanding why a deadlock occurred on row level index lock

I don't understand why the deadlock is happening.

For this execution plan, the sequence of locking operations involved in deleting each row is:

  1. U lock nonclustered index (taken at the index seek)
  2. U lock clustered index (taken at the delete operator)
  3. X lock clustered index (at the delete operator)
  4. X lock nonclustered index (at the delete operator)

I'm not sure why ... process330f11fc28 has an X-lock on this index, but the others don't.

The plan has no blocking operators, so it is a simple pipeline (roughly speaking, each row gets to the end of the pipeline before the next one is processed).
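The four locking steps and the pipeline behaviour can be sketched with Python generators. This is only a toy model: the function names `index_seek` and `clustered_index_delete` are made up to mirror the plan operators, and nothing here reflects SQL Server internals. It shows that each row goes through all four lock steps before the next row is read.

```python
# Toy pipeline (hypothetical operator names): generators pull one row at a
# time, much like an execution plan with no blocking operators.
log = []


def index_seek(rows):
    for row in rows:
        log.append((row, "U nonclustered"))   # step 1, taken at the seek
        yield row


def clustered_index_delete(source):
    for row in source:
        log.append((row, "U clustered"))      # step 2
        log.append((row, "X clustered"))      # step 3
        log.append((row, "X nonclustered"))   # step 4
        yield row


deleted = list(clustered_index_delete(index_seek(["r1", "r2"])))
for entry in log:
    print(entry)
```

In the printed log, all four entries for r1 appear before the first entry for r2.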

When the deadlock occurred, one process (session 193) had an X lock on a nonclustered index row (the final step above). Sessions 181 and 201 were blocked at the first step, trying to get an incompatible U lock on the same nonclustered index row that session 193 had exclusively locked.

I apologise in advance that the detailed explanation is somewhat involved.

Internal update locks

The update lock on the nonclustered index is taken automatically by the engine to avoid a common type of conversion deadlock, which occurs where two processes acquire an S lock on the same resource, then both try to convert to X. Each is prevented from converting S to X by the other, so a deadlock occurs.

Taking a U lock prevents this because U is compatible with S but not with another U, so only one process at a time can hold U on a given resource. S locks are not normally taken under read committed snapshot isolation (RCSI), but these U locks are. This avoids attempting to update a stale version of the row.
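The compatibility rules involved are small enough to write down. The sketch below is a toy model covering only the S, U, and X modes; the names `COMPATIBLE` and `can_grant` are invented for illustration and are not part of any SQL Server interface.

```python
# Toy model of S/U/X lock compatibility (illustrative only).
# COMPATIBLE[held][requested] is True when the request can be granted
# while another session holds `held` on the same resource.
COMPATIBLE = {
    "S": {"S": True,  "U": True,  "X": False},
    "U": {"S": True,  "U": False, "X": False},
    "X": {"S": False, "U": False, "X": False},
}


def can_grant(requested, held_by_others):
    """True if `requested` is compatible with every lock held by other sessions."""
    return all(COMPATIBLE[held][requested] for held in held_by_others)


# Conversion deadlock: both sessions hold S and each wants to convert to X.
# Each conversion is blocked by the other session's S lock.
a_can_convert = can_grant("X", ["S"])  # session A against session B's S lock
b_can_convert = can_grant("X", ["S"])  # session B against session A's S lock
conversion_deadlock = not a_can_convert and not b_can_convert

# With U locks, the second session waits at the first step instead,
# and the first session can convert U to X unopposed.
second_u_granted = can_grant("U", ["U"])
first_converts = can_grant("X", [])  # no other holder on the resource

print(conversion_deadlock, second_u_granted, first_converts)
```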

The automatic U lock is taken under RCSI only for the instance of the table that provides the row locator for the update operation. Other tables in the query (including any additional references to the update target) continue to use row versioning.

These automatic U locks have a different lifetime from regular update locks (such as might be taken with an UPDLOCK hint). Regular U locks are held to the end of the transaction. Internal U locks are held to the end of the statement, with an exception: if the same plan operator that took the lock can deduce that the row does not qualify for the update, the lock is released immediately.

See my article Data Modifications under Read Committed Snapshot Isolation.

Cyclic deadlock

This automatic U lock does not provide protection from a cyclic deadlock. Two transactions that each modify row R1 and row R2, but in reverse order, will deadlock if their steps interleave like this:

  1. Transaction T1 modifies row R1 (but does not commit or abort).
  2. Transaction T2 modifies row R2 (but does not commit or abort).
  3. T1 attempts to lock row R2 and blocks (T2 has an incompatible lock).
  4. T2 attempts to lock row R1 and blocks (T1 has an incompatible lock).
  5. Deadlock.

Here, "modifies" includes insert, update, delete, and so on.
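The five steps above can be replayed against a very small lock table. This is a sketch under simplifying assumptions: only exclusive locks exist, and the helper names `lock` and `deadlocked` are hypothetical.

```python
# Toy lock table with exclusive locks only (illustrative, hypothetical names).
owners = {}    # resource -> transaction holding the lock
waiting = {}   # transaction -> resource it is blocked on


def lock(txn, resource):
    """Try to take an exclusive lock; record a wait if another transaction owns it."""
    holder = owners.get(resource)
    if holder is None or holder == txn:
        owners[resource] = txn
        return True
    waiting[txn] = resource
    return False


def deadlocked(txn):
    """Follow the chain of waits from txn; arriving back at txn means a cycle."""
    current = txn
    seen = set()
    while current in waiting and current not in seen:
        seen.add(current)
        current = owners[waiting[current]]
        if current == txn:
            return True
    return False


step1 = lock("T1", "R1")   # T1 modifies R1
step2 = lock("T2", "R2")   # T2 modifies R2
step3 = lock("T1", "R2")   # T1 blocks: T2 owns R2
step4 = lock("T2", "R1")   # T2 blocks: T1 owns R1

print(step1, step2, step3, step4, deadlocked("T1"))
```

The first two requests are granted, the last two block, and the wait chain from T1 leads back to T1.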

Specific deadlock

The example in the question is a variation on this theme, where:

  • Session 193 has deleted row R1, holding X on that row
  • Session 193 is waiting to acquire U on row R2
  • Session 181 owns a U lock on R2
  • Session 181 is waiting to acquire U on R1
  • Deadlock

(Session 201 is also waiting to acquire U on row R1, the row session 193 has exclusively locked, but it is an innocent bystander.)
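One way to see who is actually part of the deadlock is to build a wait-for graph from the lock holders and waiters. The sketch below is a toy model with invented names (`holders`, `waiters`, `in_cycle`). Session 201 is modelled as waiting on the row that session 193 holds exclusively, following the description of the deadlock earlier in this answer.

```python
# Wait-for graph for the reported deadlock (toy model, hypothetical names).
# holders: row -> (session, mode); waiters: session -> row it wants to U lock.
holders = {"R1": (193, "X"), "R2": (181, "U")}
waiters = {193: "R2", 181: "R1", 201: "R1"}

# Edge: waiting session -> session holding the row it wants.
waits_for = {session: holders[row][0] for session, row in waiters.items()}


def in_cycle(session):
    """True if following waits-for edges from session leads back to it."""
    seen = set()
    current = waits_for.get(session)
    while current is not None and current not in seen:
        if current == session:
            return True
        seen.add(current)
        current = waits_for.get(current)
    return False


cycle_members = sorted(s for s in waits_for if in_cycle(s))
print(cycle_members)
```

Sessions 181 and 193 wait on each other. Session 201 waits on a member of the cycle but nothing waits on it, so it is blocked without being part of the cycle.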

To be clear: the exact deadlock sequence above cannot occur with the precise execution plan shown in the question. Session 181 could not hold U on R2 and then go on to request U on R1, because that plan has neither a blocking operator nor any separation between the acquire and release points for the nonclustered U lock. Any U-locked row found by the index seek is guaranteed to be converted to X before the next row from the seek is processed.

Nevertheless, the fact that this is the plan for the statement now does not mean it was the plan when the deadlock occurred. For example, when a statement-level recompile occurs, SQL Server can see the cardinality of the table variable. This could well lead to a hash join plan instead.

Hash join plan

In a hash join plan, rows from the table variable would be used to build a hash table. Once that is completed, SQL Server can start reading rows from AttributeValueHyperlink, taking a U lock on each row emitted by the index scan (there is nothing to seek on now).

At the hash join, each probe-side row is evaluated against the join predicate. If a match is found the row goes on to the Clustered Index Delete operator, where the clustered U, X, and nonclustered X locks are taken as part of locating and deleting the entries corresponding to the current row.

However, if the row does not join at the hash join, the U lock is not released. The U locks for unjoined rows will continue to accumulate until they are all released when the current statement ends. This is simply a consequence of the U locks being taken at one operator (nonclustered index scan) but tested for eligibility at another (the hash join).
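The difference in lock accumulation between the two plan shapes can be illustrated with a toy model. The function `u_locks_left_behind` and the sample row numbers are invented for this sketch. It assumes, as described above, that the join operator does not release a U lock taken by the index read.

```python
# Toy model of internal U lock lifetime (illustrative, hypothetical names).
# A U lock taken at the index read is converted to X if the row is deleted.
# If the join rejects the row, the lock stays until the statement ends,
# because the join is not the operator that took it.

def u_locks_left_behind(rows_read, rows_to_delete):
    """Return the U locks still held on rows that were read but not deleted."""
    held_u = set()
    for row in rows_read:
        held_u.add(row)          # U lock taken at the index read
        if row in rows_to_delete:
            held_u.discard(row)  # converted to X at the delete operator
    return held_u


index_rows = [1, 2, 3, 4, 5, 6]
to_delete = {2, 5}

# Hash join plan: the scan reads every index row.
scan_leftovers = u_locks_left_behind(index_rows, to_delete)

# Nested loops plan: the seek reads only rows matching the outer input.
seek_rows = [r for r in index_rows if r in to_delete]
seek_leftovers = u_locks_left_behind(seek_rows, to_delete)

print(sorted(scan_leftovers), sorted(seek_leftovers))
```

In this model the scan leaves U locks on every row that did not join, while the seek leaves none.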

These accumulated U locks make the reported deadlock possible: a session can hold a U lock on one row while waiting to acquire U on another.

Avoiding the deadlock

Of course, the simple nested loops plan might also deadlock when processing the same data (it would just be more obviously a cyclic deadlock). To avoid the deadlock, you would need to ensure the input sets are disjoint, or that the rows in each set are processed in strictly the same order (sorted the same way, and processed in the same sequence by the execution plan).
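The effect of processing order can be checked with a small simulation. This is a sketch under simplifying assumptions: two sessions take turns, every lock is exclusive, and all locks are held until the session has processed its last row. The function name `runs_to_completion` is made up for the example.

```python
def runs_to_completion(order_a, order_b):
    """Round-robin two sessions that lock rows in the given orders.

    Locks are held until a session has processed all of its rows.
    Returns False if both sessions end up blocked on each other.
    """
    orders = {"A": list(order_a), "B": list(order_b)}
    position = {"A": 0, "B": 0}
    owner = {}  # row -> session currently holding it

    while any(position[s] < len(orders[s]) for s in orders):
        progressed = False
        for s in orders:
            if position[s] == len(orders[s]):
                continue  # this session has already finished
            row = orders[s][position[s]]
            if owner.get(row, s) == s:
                owner[row] = s
                position[s] += 1
                progressed = True
                if position[s] == len(orders[s]):
                    # statement finished: release every lock it held
                    for r in [r for r, o in owner.items() if o == s]:
                        del owner[r]
        if not progressed:
            return False  # every unfinished session is blocked: deadlock
    return True


print(
    runs_to_completion([1, 2], [2, 1]),  # overlapping rows, opposite order
    runs_to_completion([1, 2], [1, 2]),  # overlapping rows, same order
    runs_to_completion([1, 2], [3, 4]),  # disjoint rows
)
```

Only the first case, with overlapping rows visited in opposite orders, fails to complete. With the same order, one session simply waits for the other to finish.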