Is a MERGE with OUTPUT better practice than a conditional INSERT and SELECT?

Because you are using a Sequence, you can use the same NEXT VALUE FOR function -- which you already have in a Default Constraint on the Id Primary Key field -- to generate a new Id value ahead of time. Generating the value first means that you don't need to worry about not having SCOPE_IDENTITY(), which in turn means that you need neither the OUTPUT clause nor an additional SELECT to get the new value; you will have the value before you do the INSERT, and you don't even need to mess with SET IDENTITY_INSERT ON / OFF :-)
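For example (a minimal sketch, using the dbo.MagicNumber sequence and dbo.NameLookup table from the test setup below):

DECLARE @NewId INT = NEXT VALUE FOR dbo.MagicNumber; -- value is known before the INSERT

INSERT INTO dbo.NameLookup ([Id], [ItemName])
VALUES (@NewId, N'SomeName');

-- @NewId is already populated; no SCOPE_IDENTITY(), OUTPUT clause, or extra SELECT needed.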

So that takes care of part of the overall situation. The other part is handling the concurrency issue of two processes, at the exact same time, not finding an existing row for the exact same string, and proceeding with the INSERT. The concern is about avoiding the Unique Constraint violation that would occur.

One way to handle these types of concurrency issues is to force this particular operation to be single-threaded. The way to do that is by using application locks, via sp_getapplock (which work across sessions). While effective, they can be a bit heavy-handed for a situation like this where the frequency of collisions is probably fairly low. A sketch of what that could look like follows.
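(A minimal sketch only; the lock resource name here is arbitrary, and in real code the return value of sp_getapplock should be checked, since values below 0 indicate timeout, deadlock, or other failure.)

BEGIN TRAN;

EXEC sp_getapplock
  @Resource = N'dbo.GetOrInsertName', -- arbitrary string; just needs to be used consistently
  @LockMode = N'Exclusive',
  @LockOwner = N'Transaction',        -- lock is released upon COMMIT / ROLLBACK
  @LockTimeout = 5000;                -- milliseconds to wait before giving up

-- ...the SELECT / INSERT logic, now effectively single-threaded across all sessions...

COMMIT TRAN;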

The other way to deal with the collisions is to accept that they will sometimes occur, and to handle them rather than try to avoid them. Using the TRY...CATCH construct, you can effectively trap a specific error (in this case: the "Cannot insert duplicate key row" error from the unique index, Msg 2601) and re-execute the SELECT to get the Id value, since we know that the row now exists due to being in the CATCH block for that particular error. Other errors can be handled in the typical RAISERROR / RETURN or THROW manner.

Test Setup: Sequence, Table, and Unique Index

USE [tempdb];

CREATE SEQUENCE dbo.MagicNumber
  AS INT
  START WITH 1
  INCREMENT BY 1;

CREATE TABLE dbo.NameLookup
(
  [Id] INT NOT NULL
         CONSTRAINT [PK_NameLookup] PRIMARY KEY CLUSTERED
        CONSTRAINT [DF_NameLookup_Id] DEFAULT (NEXT VALUE FOR dbo.MagicNumber),
  [ItemName] NVARCHAR(50) NOT NULL         
);

CREATE UNIQUE NONCLUSTERED INDEX [UIX_NameLookup_ItemName]
  ON dbo.NameLookup ([ItemName]);
GO

Test Setup: Stored Procedure

CREATE PROCEDURE dbo.GetOrInsertName
(
  @SomeName NVARCHAR(50),
  @ID INT OUTPUT,
  @TestRaceCondition BIT = 0
)
AS
SET NOCOUNT ON;

BEGIN TRY
  SELECT @ID = nl.[Id]
  FROM   dbo.NameLookup nl
  WHERE  nl.[ItemName] = @SomeName
  AND    @TestRaceCondition = 0;

  IF (@ID IS NULL)
  BEGIN
    SET @ID = NEXT VALUE FOR dbo.MagicNumber;

    INSERT INTO dbo.NameLookup ([Id], [ItemName])
    VALUES (@ID, @SomeName);
  END;
END TRY
BEGIN CATCH
  IF (ERROR_NUMBER() = 2601) -- "Cannot insert duplicate key row in object"
  BEGIN
    SELECT @ID = nl.[Id]
    FROM   dbo.NameLookup nl
    WHERE  nl.[ItemName] = @SomeName;
  END;
  ELSE
  BEGIN
    ;THROW; -- SQL Server 2012 or newer
    /*
    DECLARE @ErrorNumber INT = ERROR_NUMBER(),
            @ErrorMessage NVARCHAR(4000) = ERROR_MESSAGE();

    RAISERROR(N'Msg %d: %s', 16, 1, @ErrorNumber, @ErrorMessage);
    RETURN;
    */
  END;

END CATCH;
GO

The Test

DECLARE @ItemID INT;
EXEC dbo.GetOrInsertName
  @SomeName = N'test1',
  @ID = @ItemID OUTPUT;
SELECT @ItemID AS [ItemID];
GO

DECLARE @ItemID INT;
EXEC dbo.GetOrInsertName
  @SomeName = N'test1',
  @ID = @ItemID OUTPUT,
  @TestRaceCondition = 1;
SELECT @ItemID AS [ItemID];
GO

Question from O.P.

Why is this better than the MERGE? Won't I get the same functionality without the TRY by using the WHERE NOT EXISTS clause?

MERGE has various "issues" (several references are linked in @SqlZim's answer so no need to duplicate that info here). And, there is no additional locking in this approach (less contention), so it should be better on concurrency. In this approach, the caller will never see a Unique Constraint violation (it is trapped and handled in the CATCH block), all without any HOLDLOCK, etc. It's pretty much guaranteed to work.

The reasoning behind this approach is:

  1. If you have enough executions of this procedure such that you need to worry about collisions, then you don't want to:
    1. take any more steps than are necessary
    2. hold locks on any resources for longer than necessary
  2. Since collisions can only happen upon new entries (i.e. the same new entry submitted by two sessions at the exact same time), the frequency of falling into the CATCH block in the first place will be pretty low. It makes more sense to optimize the code that will run 99% of the time instead of the code that will run 1% of the time (unless there is no cost to optimizing both, which is not the case here).

Comment from @SqlZim's answer (emphasis added)

I personally prefer to try and tailor a solution to avoid doing that _when possible_. In this case, I don't feel that using the locks from serializable is a heavy-handed approach, and I would be confident it would handle high concurrency well.

I would agree with this first sentence if it were amended to state "and _when prudent_". Just because something is technically possible does not mean that the situation (i.e. intended use-case) would benefit from it.

The issue I see with this approach is that it locks more than what is being suggested. It is important to re-read the quoted documentation on "serializable", specifically the following (emphasis added):

  • Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes.

Now, here is the comment in the example code:

SELECT [Id]
FROM   dbo.NameLookup WITH (SERIALIZABLE) /* hold that key range for @vName */

The operative word there is "range". The lock being taken is not just on the value in @vName, but more accurately a range starting at the location where this new value should go (i.e. between the existing key values on either side of where the new value fits), but not including the value itself. Meaning, other processes will be blocked from inserting new values, depending on the value(s) currently being looked up. If the lookup is being done at the top of the range, then inserting anything that could occupy that same position will be blocked. For example, if values "a", "b", and "d" exist and one process is doing the SELECT on "f", then it will not be possible to insert values "g" or even "e" (since either of those would come immediately after "d"). But inserting a value of "c" will be possible since it wouldn't be placed in the "reserved" range.

The following example should illustrate this behavior:

(In query tab (i.e. Session) #1)

INSERT INTO dbo.NameLookup ([ItemName]) VALUES (N'test5');

BEGIN TRAN;

SELECT [Id]
FROM   dbo.NameLookup WITH (SERIALIZABLE) /* hold that key range for @vName */
WHERE  ItemName = N'test8';

--ROLLBACK;

(In query tab (i.e. Session) #2)

EXEC dbo.NameLookup_getset_byName @vName = N'test4';
-- works just fine

EXEC dbo.NameLookup_getset_byName @vName = N'test9';
-- hangs until you either hit "cancel" in this query tab,
-- OR issue a COMMIT or ROLLBACK in query tab #1

EXEC dbo.NameLookup_getset_byName @vName = N'test7';
-- hangs until you either hit "cancel" in this query tab,
-- OR issue a COMMIT or ROLLBACK in query tab #1

EXEC dbo.NameLookup_getset_byName @vName = N's';
-- works just fine

EXEC dbo.NameLookup_getset_byName @vName = N'u';
-- hangs until you either hit "cancel" in this query tab,
-- OR issue a COMMIT or ROLLBACK in query tab #1

Likewise, if value "C" exists, and value "A" is being selected (and hence locked), then you can insert a value of "D", but not a value of "B":

(In query tab (i.e. Session) #1)

INSERT INTO dbo.NameLookup ([ItemName]) VALUES (N'testC');

BEGIN TRAN

SELECT [Id]
FROM   dbo.NameLookup WITH (SERIALIZABLE) /* hold that key range for @vName */
WHERE  ItemName = N'testA';

--ROLLBACK;

(In query tab (i.e. Session) #2)

EXEC dbo.NameLookup_getset_byName @vName = N'testD';
-- works just fine

EXEC dbo.NameLookup_getset_byName @vName = N'testB';
-- hangs until you either hit "cancel" in this query tab,
-- OR issue a COMMIT or ROLLBACK in query tab #1

To be fair, in my suggested approach, when there is an exception, there will be 4 entries in the Transaction Log that won't be happening in this "serializable transaction" approach. BUT, as I said above, if the exception happens 1% (or even 5%) of the time, that is far less impacting than the far more likely case of the initial SELECT temporarily blocking INSERT operations.
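(If you want to verify this yourself, one way -- hedged, since fn_dblog is undocumented and suitable for ad-hoc inspection only -- is to cause a duplicate-key INSERT failure and then look at the tail of the Transaction Log:)

SELECT TOP (10) [Current LSN], [Operation], [Context], [AllocUnitName]
FROM   sys.fn_dblog(NULL, NULL)
ORDER BY [Current LSN] DESC;

-- expect to see LOP_BEGIN_XACT / LOP_INSERT_ROWS / LOP_DELETE_ROWS / LOP_ABORT_XACT
-- style entries for the auto-rolled-back INSERT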

Another, albeit minor, issue with this "serializable transaction + OUTPUT clause" approach is that the OUTPUT clause (in its present usage) sends the data back as a result set. A result set requires more overhead (probably on both sides: in SQL Server to manage the internal cursor, and in the app layer to manage the DataReader object) than a simple OUTPUT parameter. Given that we are only dealing with a single scalar value, and that the assumption is a high frequency of executions, that extra overhead of the result set probably adds up.

While the OUTPUT clause could be used in such a way as to return an OUTPUT parameter, that would require additional steps to create a temporary table or table variable, and then to select the value out of that temp table / table variable into the OUTPUT parameter.
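Meaning, roughly something like this (a sketch, inside the procedure, with @ID being the OUTPUT parameter):

DECLARE @NewIds TABLE ([Id] INT);

INSERT INTO dbo.NameLookup ([ItemName])
OUTPUT inserted.[Id] INTO @NewIds ([Id])
VALUES (@SomeName);

SELECT @ID = [Id]
FROM   @NewIds;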

Further Clarification: Response to @SqlZim's Response (updated answer) to my Response to @SqlZim's Response (in the original answer) to my statement regarding concurrency and performance ;-)

Sorry if this part is a wee bit long, but at this point we are just down to the nuances of the two approaches.

I believe the way the information is presented could lead to false assumptions about the amount of locking one could expect to encounter when using serializable in the scenario as presented in the original question.

Yes, I will admit that I am biased, though to be fair:

  1. It is impossible for a human to not be biased, at least to some small degree, and I do try to keep it to a minimum.
  2. The example given was simplistic, but that was for illustrative purposes, to convey the behavior without over-complicating it. Implying excessive frequency was not intended, though I understand that I also didn't explicitly state otherwise, and the example could be read as implying a larger problem than actually exists. I will try to clarify that below.
  3. I did also include an example of locking a range between two existing keys (the second set of "Query tab 1" and "Query tab 2" blocks).
  4. I did find (and volunteer) the "hidden cost" of my approach, that being the four extra Tran Log entries each time the INSERT fails due to a Unique Constraint violation. I have not seen that mentioned in any of the other answers / posts.

Regarding @gbn's "JFDI" approach, Michael J. Swart's "Ugly Pragmatism For The Win" post, and Aaron Bertrand's comment on Michael's post (regarding his tests showing what scenarios have decreased performance), and your comment on your "adaptation of Michael J. Swart's adaptation of @gbn's Try Catch JFDI procedure" stating:

If you are inserting new values more often than selecting existing values, this may be more performant than @srutzky's version. Otherwise I would prefer @srutzky's version over this one.

With respect to that gbn / Michael / Aaron discussion related to the "JFDI" approach, it would be incorrect to equate my suggestion to gbn's "JFDI" approach. Due to the nature of the "Get or Insert" operation, there is an explicit need to do the SELECT to get the ID value for existing records. This SELECT acts as the IF EXISTS check, which makes this approach equate more closely to the "CheckTryCatch" variation in Aaron's tests. Michael's re-written code (and your final adaptation of Michael's adaptation) also includes a WHERE NOT EXISTS to do that same check first. Hence, my suggestion (along with Michael's final code and your adaptation of his final code) won't actually hit the CATCH block all that often. The only way to get there is for two sessions, given the same non-existent ItemName, to run the INSERT...SELECT so close together that both receive a "true" for the WHERE NOT EXISTS at the exact same moment, and thus both attempt the INSERT at the exact same moment. That very specific scenario happens much less often than either selecting an existing ItemName or inserting a new ItemName when no other process is attempting to do so at the exact same moment.
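To make the timing concrete, the collision requires a (hypothetical) interleaving like this:

-- t0: Session A runs INSERT...SELECT WHERE NOT EXISTS; 'newName' not found, INSERT proceeds
-- t0: Session B runs INSERT...SELECT WHERE NOT EXISTS; 'newName' not found, INSERT proceeds
-- t1: Session A's INSERT succeeds and commits
-- t2: Session B's INSERT fails with Msg 2601 --> CATCH block --> SELECT the now-existing Id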

WITH ALL OF THE ABOVE IN MIND: Why do I prefer my approach?

First, let's look at what locking takes place in the "serializable" approach. As mentioned above, the "range" that gets locked depends on the existing key values on either side of where the new key value would fit. The beginning or ending of the range could also be the beginning or ending of the index, respectively, if there is no existing key value in that direction. Assume we have the following index and keys (^ represents the beginning of the index while $ represents the end of it):

Range #:    |--- 1 ---|--- 2 ---|--- 3 ---|--- 4 ---|
Key Value:  ^         C         F         J         $

If session 55 attempts to insert a key value of:

  • A, then range # 1 (from ^ to C) is locked: session 56 cannot insert a value of B, even if unique and valid (yet). But session 56 can insert values of D, G, and M.
  • D, then range # 2 (from C to F) is locked: session 56 cannot insert a value of E (yet). But session 56 can insert values of A, G, and M.
  • M, then range # 4 (from J to $) is locked: session 56 cannot insert a value of X (yet). But session 56 can insert values of A, D, and G.

As more key values are added, the ranges between key values become narrower, hence reducing the probability / frequency of concurrent inserts fighting over the same range. Admittedly, this is not a major problem, and fortunately it appears to be a problem that actually decreases over time.
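(If you want to see these range locks directly, querying sys.dm_tran_locks from the session holding the open SERIALIZABLE SELECT should show them; a sketch:)

SELECT resource_type, request_mode, request_status, resource_description
FROM   sys.dm_tran_locks
WHERE  request_session_id = @@SPID
AND    resource_type = N'KEY';

-- the rows with request_mode = 'RangeS-S' are the shared key-range locks described above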

The issue with my approach was described above: it only happens when two sessions attempt to insert the same key value at the same time. In this respect it comes down to which has the higher probability of happening: two different, yet close, key values are attempted at the same time, or the same key value is attempted at the same time? I suppose the answer lies in the structure of the app doing the inserts, but generally speaking I would assume it to be more likely that two different values that just happen to share the same range are being inserted. But the only way to really know would be to test both on the O.P.'s system.

Next, let's consider two scenarios and how each approach handles them:

  1. All requests being for unique key values:

    In this case, the CATCH block in my suggestion is never entered, hence no "issue" (i.e. no 4 extra tran log entries and no time spent writing them). But, in the "serializable" approach, even with all inserts being unique, there will always be some potential for blocking other inserts in the same range (albeit not for very long).

  2. High-frequency of requests for same key value at the same time:

    In this case -- a very low degree of uniqueness in terms of incoming requests for non-existent key values -- the CATCH block in my suggestion will be entered regularly. The effect of this will be that each failed insert will need to auto-rollback and write the 4 entries to the Transaction Log, which is a slight performance hit each time. But the overall operation should never fail (at least not due to this).

    BUT, in the "serializable" approach as originally written (using serializable alone on the SELECT), the operation will deadlock. Why? Because the serializable behavior only prevents INSERT operations in the range that has been read and hence locked; it doesn't prevent SELECT operations on that range. Two sessions can each take a shared range lock via the SELECT and then block each other's INSERT. (An updlock hint has since been added to the "updated" approach to address this, and it no longer gets deadlocks; a hypothetical repro of the serializable-only behavior follows this list.)

    The serializable approach, in this case, would seem to have no additional overhead, and might perform slightly better than what I am suggesting.
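To make the deadlock mechanics concrete, here is a hypothetical two-session repro of the serializable-only version (i.e. the SELECT without the updlock hint), with the steps run in the order shown:

-- Session 1:
BEGIN TRAN;
SELECT Id FROM dbo.NameLookup WITH (SERIALIZABLE) WHERE ItemName = N'newName';

-- Session 2:
BEGIN TRAN;
SELECT Id FROM dbo.NameLookup WITH (SERIALIZABLE) WHERE ItemName = N'newName';
-- shared (RangeS-S) range locks are compatible, so both sessions now hold one

-- Session 1:
INSERT INTO dbo.NameLookup (ItemName) VALUES (N'newName');
-- blocks, waiting on Session 2's shared range lock

-- Session 2:
INSERT INTO dbo.NameLookup (ItemName) VALUES (N'newName');
-- blocks, waiting on Session 1's range lock --> deadlock; one session is chosen as the victim

-- Adding updlock makes the initial SELECT take an update lock instead; update locks are
-- not compatible with each other, so the second SELECT waits rather than proceeding to
-- its INSERT, and the deadlock cannot occur.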

As with many / most discussions regarding performance, due to there being so many factors that can affect the outcome, the only way to really have a sense of how something will perform is to try it out in the target environment where it will run. At that point it won't be a matter of opinion :).


Updated Answer


Response to @srutzky

Another, albeit minor, issue with this "serializable transaction + OUTPUT clause" approach is that the OUTPUT clause (in its present usage) sends the data back as a result set. A result set requires more overhead (probably on both sides: in SQL Server to manage the internal cursor, and in the app layer to manage the DataReader object) than a simple OUTPUT parameter. Given that we are only dealing with a single scalar value, and that the assumption is a high frequency of executions, that extra overhead of the result set probably adds up.

I agree, and for those same reasons I do use output parameters when prudent. It was my mistake not to use an output parameter in my initial answer; I was being lazy.

Here is a revised procedure using an output parameter, additional optimizations, along with the next value for approach that @srutzky explains in his answer:

create procedure dbo.NameLookup_getset_byName (@vName nvarchar(50), @vValueId int output) as
begin
  set nocount on;
  set xact_abort on;
  set @vValueId = null;
  if nullif(@vName,'') is null                                 
    return;                                        /* if @vName is empty, return early */
  select  @vValueId = Id                                              /* go get the Id */
    from  dbo.NameLookup
    where ItemName = @vName;
  if @vValueId is not null                                 /* if we got the id, return */
    return;
  begin try;                                  /* if it is not there, then get the lock */
    begin tran;
      select  @vValueId = Id
        from  dbo.NameLookup with (updlock, serializable) /* hold key range for @vName */
        where ItemName = @vName;
      if @@rowcount = 0                    /* if we still do not have an Id for @vName */
      begin;                                         /* get a new Id and insert @vName */
        set @vValueId = next value for dbo.IdSequence;      /* get next sequence value */
        insert into dbo.NameLookup (ItemName, Id)
          values (@vName, @vValueId);
      end;
    commit tran;
  end try
  begin catch;
    if @@trancount > 0 
      begin;
        rollback transaction;
        throw;
      end;
  end catch;
end;

update note: Including updlock with the select will grab the proper locks in this scenario. Thanks to @srutzky, who pointed out that this could cause deadlocks when only using serializable on the select.

Note: If it is possible that the procedure will be called with a value already in @vValueId, keep the set @vValueId = null; after set xact_abort on; otherwise it can be removed.


Concerning @srutzky's examples of key range locking behavior:

@srutzky only uses one value in his table, and locks the "next"/"infinity" key for his tests to illustrate key range locking. While his tests illustrate what happens in those situations, I believe the way the information is presented could lead to false assumptions about the amount of locking one could expect to encounter when using serializable in the scenario as presented in the original question.

Even though I perceive a bias (perhaps falsely) in the way he presents his explanation and examples of key range locking, they are still correct.


After more research, I found a particularly pertinent blog article from 2011 by Michael J. Swart: Mythbusting: Concurrent Update/Insert Solutions. In it, he tests multiple methods for accuracy and concurrency. Method 4: Increased Isolation + Fine Tuning Locks is based on Sam Saffron's post Insert or Update Pattern For SQL Server, and it was the only method in the original test to meet his expectations (later joined by merge with (holdlock)).

In February of 2016, Michael J. Swart posted Ugly Pragmatism For The Win. In that post, he covers some additional tuning he made to his Saffron upsert procedures to reduce locking (which I included in the procedure above).

After making those changes, Michael wasn't happy that his procedure was starting to look more complicated, so he consulted a colleague named Chris. Chris read the entire original Mythbusting post and all of the comments, and asked about @gbn's TRY CATCH JFDI pattern. This pattern is similar to @srutzky's answer, and is the solution that Michael ended up using in that instance.

Michael J Swart:

Yesterday I had my mind changed about the best way to do concurrency. I describe several methods in Mythbusting: Concurrent Update/Insert Solutions. My preferred method is to increase the isolation level and fine tune locks.

At least that was my preference. I recently changed my approach to use a method that gbn suggested in the comments. He describes his method as the “TRY CATCH JFDI pattern”. Normally I avoid solutions like that. There’s a rule of thumb that says developers should not rely on catching errors or exceptions for control flow. But I broke that rule of thumb yesterday.

By the way, I love the gbn’s description for the pattern “JFDI”. It reminds me of Shia Labeouf’s motivational video.


In my opinion, both solutions are viable. While I still prefer to increase the isolation level and fine tune locks, @srutzky's answer is also valid and may or may not be more performant in your specific situation.

Perhaps in the future I too will arrive at the same conclusion that Michael J. Swart did, but I'm just not there yet.


It isn't my preference, but here is what my adaptation of Michael J. Swart's adaptation of @gbn's Try Catch JFDI procedure would look like:

create procedure dbo.NameLookup_JFDI (
    @vName nvarchar(50)
  , @vValueId int output
  ) as
begin
  set nocount on;
  set xact_abort on;
  set @vValueId = null;
  if nullif(@vName,'') is null                                 
    return;                     /* if @vName is empty, return early */
  begin try                                                 /* JFDI */
    insert into dbo.NameLookup (ItemName)
      select @vName
      where not exists (
        select 1
          from dbo.NameLookup
          where ItemName = @vName);
  end try
  begin catch        /* ignore duplicate key errors, throw the rest */
    if error_number() not in (2601, 2627) throw;
  end catch
  select  @vValueId = Id                              /* get the Id */
    from  dbo.NameLookup
    where ItemName = @vName;
  end;

If you are inserting new values more often than selecting existing values, this may be more performant than @srutzky's version. Otherwise I would prefer @srutzky's version over this one.

Aaron Bertrand's comments on Michael J. Swart's post link to relevant testing he has done, and led to this exchange. Excerpt from the comment section on Ugly Pragmatism For the Win:

Sometimes, though, JFDI leads to worse performance overall, depending on what % of calls fail. Raising exceptions has substantial overhead. I showed this in a couple of posts:

http://sqlperformance.com/2012/08/t-sql-queries/error-handling

https://www.mssqltips.com/sqlservertip/2632/checking-for-potential-constraint-violations-before-entering-sql-server-try-and-catch-logic/

Comment by Aaron Bertrand — February 11, 2016 @ 11:49 am

and the reply of:

You’re right Aaron, and we did test it.

It turns out that in our case, the percent of calls that failed was 0 (when rounded to the nearest percent).

I think you illustrate the point that as much as possible, evaluate things on a case-by-case basis over following rules-of-thumb.

It’s also why we added the not-strictly-necessary WHERE NOT EXISTS clause.

Comment by Michael J. Swart — February 11, 2016 @ 11:57 am


New links:

  • Ugly Pragmatism For The Win - Michael J. Swart - 2016-02-11
  • Mythbusting: Concurrent Update/Insert Solutions - Michael J. Swart - 2011-09-08
  • @gbn's TRY CATCH JFDI Pattern
  • Checking for potential constraint violations before entering SQL Server TRY and CATCH logic - Aaron Bertrand
  • Performance impact of different error handling techniques - Aaron Bertrand
  • SQL Server Isolation Levels: A Series - Paul White

Original answer


I still prefer the Sam Saffron upsert approach vs using merge, especially when dealing with a single row.

I would adapt that upsert method to this situation like this:

declare @vName nvarchar(50) = 'Invader';
declare @vValueId int       = null;

if nullif(@vName,'') is not null /* this gets your where condition taken care of before we start doing anything */
begin;
  begin tran;
    select @vValueId = Id
      from dbo.NameLookup with (serializable) 
      where ItemName = @vName;
    if @@rowcount > 0 
      begin;
        select @vValueId as id;
      end;
      else
      begin;
        insert into dbo.NameLookup (ItemName)
          output inserted.id
            values (@vName);
      end;
  commit tran;
end;

I would be consistent with your naming, and as serializable is the same as holdlock, pick one and be consistent in its use. I tend to use serializable because it is the same name used when specifying set transaction isolation level serializable.

By using serializable or holdlock, a range lock is taken based on the value of @vName, which makes any other operations wait if they are selecting or inserting values into dbo.NameLookup that fall in the range covering the value in the where clause.

For the range lock to work properly, there needs to be an index on the ItemName column; this applies when using merge as well.
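For example, something like the unique index from the test setup earlier (the index name here is illustrative):

create unique nonclustered index uix_NameLookup_ItemName
  on dbo.NameLookup (ItemName);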


Here is what the procedure would look like mostly following Erland Sommarskog's whitepapers for error handling, using throw. If throw isn't how you are raising your errors, change it to be consistent with the rest of your procedures:

create procedure dbo.NameLookup_getset_byName (@vName nvarchar(50) ) as
begin
  set nocount on;
  set xact_abort on;
  declare @vValueId int;
  if nullif(@vName,'') is null /* if @vName is null or empty, select Id as null */
    begin
      select Id = cast(null as int);
    end 
    else                       /* else go get the Id */
    begin try;
      begin tran;
        select @vValueId = Id
          from dbo.NameLookup with (serializable) /* hold key range for @vName */
          where ItemName = @vName;
        if @@rowcount > 0      /* if we have an Id for @vName select @vValueId */
          begin;
            select @vValueId as Id; 
          end;
          else                     /* else insert @vName and output the new Id */
          begin;
            insert into dbo.NameLookup (ItemName)
              output inserted.Id
                values (@vName);
          end;
      commit tran;
    end try
    begin catch;
      if @@trancount > 0 
        begin;
          rollback transaction;
          throw;
        end;
    end catch;
  end;
go

To summarize what is going on in the procedure above: set nocount on; set xact_abort on; like you always do, then if our input variable is null or empty, select id = cast(null as int) as the result. If it isn't null or empty, then get the Id for our variable while holding that spot in case it isn't there. If the Id is there, send it out. If it isn't there, insert it and send out that new Id.

Meanwhile, other calls to this procedure trying to find the Id for the same value will wait until the first transaction is done and then select & return it. Other calls to this procedure or other statements looking for other values will continue on because this one isn't in the way.

While I agree with @srutzky that you can handle collisions and swallow the exceptions for this sort of issue, I personally prefer to try and tailor a solution to avoid doing that when possible. In this case, I don't feel that using the locks from serializable is a heavy-handed approach, and I would be confident it would handle high concurrency well.

Quote from the SQL Server documentation on the table hints serializable / holdlock:

SERIALIZABLE

Is equivalent to HOLDLOCK. Makes shared locks more restrictive by holding them until a transaction is completed, instead of releasing the shared lock as soon as the required table or data page is no longer needed, whether the transaction has been completed or not. The scan is performed with the same semantics as a transaction running at the SERIALIZABLE isolation level. For more information about isolation levels, see SET TRANSACTION ISOLATION LEVEL (Transact-SQL).

Quote from the SQL Server documentation on transaction isolation level serializable:

SERIALIZABLE Specifies the following:

  • Statements cannot read data that has been modified but not yet committed by other transactions.

  • No other transactions can modify data that has been read by the current transaction until the current transaction completes.

  • Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes.


Links related to the solution above:

  • Insert or Update pattern for Sql Server - Sam Saffron

  • Documentation on serializable and other Table Hints - MSDN

  • Error and Transaction Handling in SQL Server Part One – Jumpstart Error Handling - Erland Sommarskog

  • Erland Sommarskog's advice regarding @@rowcount (which I didn't follow in this instance)

MERGE has a spotty history, and it seems to take more poking around to make sure that the code is behaving how you want it to under all that syntax (a sketch of the merge with (holdlock) pattern these articles scrutinize follows the list). Relevant merge articles:

  • An Interesting MERGE Bug - Paul White

  • UPSERT Race Condition With Merge - sqlteam

  • Use Caution with SQL Server's MERGE Statement - Aaron Bertrand

  • Can I optimize this merge statement - Aaron Bertrand

  • If you are using indexed views and MERGE, please read this! - Aaron Bertrand
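For reference, here is a hedged sketch of the merge with (holdlock) upsert pattern discussed in the articles above, adapted to this table (assuming @vName and @vValueId are declared as in the snippets earlier; shown for comparison, not as a recommendation):

merge dbo.NameLookup with (holdlock) as t
using (select @vName) as s (ItemName)
  on t.ItemName = s.ItemName
when not matched by target then
  insert (ItemName) values (s.ItemName);

select @vValueId = Id
  from dbo.NameLookup
  where ItemName = @vName;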

One last link: Kendra Little did a rough comparison of merge vs. insert with left join, with the caveat that "I didn’t do thorough load testing on this", but it is still a good read.