Can't update "CO2" to "CO₂" in table row

The subscript 2 is not part of the varchar character set (in any collation, not just Modern_Spanish). So make it an nvarchar constant:

UPDATE test SET description = N'CO₂' WHERE id = 1;

@gbn already explained the basic reason and fix, but the specific reason for the behavior that you are seeing is this:

You are using a VARCHAR literal (no N prefix) instead of an NVARCHAR literal (string with the N prefix), hence the Unicode character gets converted to VARCHAR.
VARCHAR is an 8-bit encoding that is, in most cases, one byte per character, but can also be two bytes per character. NVARCHAR, on the other hand, is a 16-bit encoding (UTF-16 Little Endian) that is either two bytes or four bytes per character.
Because fewer bytes are available for mapping characters, 8-bit encodings are, by their very nature, much more limited in the number of characters that they can map. VARCHAR data can map up to 256 characters for Single-Byte Character Sets (the majority of them) and up to 65,536 characters for Double-Byte Character Sets (only a few of these). NVARCHAR data, on the other hand, can map just over 1.1 million Unicode characters (though just under 250k are currently mapped).
Due to the limited number of mappings that can be done with 8-bit / VARCHAR data, different groupings of characters (based on Language / Culture) are spread out across multiple "Code Pages" (i.e. character sets).
Each Collation specifies which Code Page, if any, to use for VARCHAR data (NVARCHAR is all characters).
When converting a string literal or variable from NVARCHAR (i.e. Unicode / UTF-16 / all characters) to VARCHAR (character set based on the Code Page that is specified in most Collations), the default Collation of the Database is used.
If the Code Page of the Collation being used for the conversion does not contain the same character, but does contain a "best fit" mapping, then the "best fit" mapping will be used.
If the Code Page of the Collation being used for the conversion contains neither the same character nor a "best fit" mapping, then the default "replacement" character will be used (most commonly ?).
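To see which Code Page applies to a given Database, a query along the following lines should work. This is only a sketch: it assumes it is run while connected to the Database in question, and the column aliases are just illustrative names.

```sql
-- Default Collation of the current Database, plus the Code Page
-- that this Collation uses for VARCHAR data.
SELECT CONVERT(sysname, DATABASEPROPERTYEX(DB_NAME(), 'Collation')) AS [DatabaseCollation],
       COLLATIONPROPERTY(
           CONVERT(sysname, DATABASEPROPERTYEX(DB_NAME(), 'Collation')),
           'CodePage'
       ) AS [CodePage];
```

For a Database using one of the Modern_Spanish Collations, the second column should come back as 1252.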

So, what you are seeing is an NVARCHAR to VARCHAR conversion due to missing the N prefix on the string literal. And, the Code Page of the default Collation for the Database does not contain the exact same character, but a "best fit" mapping was found, which is why you are getting a 2 instead of a ?.

You can see this effect by doing the following simple test:

SELECT '₂', N'₂';

Returns:

2    ₂
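If the two characters are hard to tell apart in your results grid, comparing code points makes the difference explicit. A minimal sketch, assuming the default Collation of the current Database uses Code Page 1252 (the column aliases are made up for illustration):

```sql
-- UNICODE() returns the code point of the first character of its argument.
SELECT UNICODE('₂')  AS [WithoutN],  -- VARCHAR literal: converted via the Database's Code Page first
       UNICODE(N'₂') AS [WithN];     -- NVARCHAR literal: stays U+2082 (8322 in decimal)
```

Under that assumption, the first column should be 50 (the code point of a regular 2), since that is the "best fit" character shown above, and the second column should be 8322.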

To be clear, IF the Code Page of the default Collation for the Database did contain the exact same character, then it would have translated into the same character in that Code Page. And, then, in your case, since you are storing into an NVARCHAR column, it would have translated again, back to the original Unicode character. The final example below shows this behavior.

IMPORTANT: Please be aware that the conversion happens as the string literal is being interpreted, which is before it is stored into the column. This means that even if the column can hold that character, it will have already been converted into something else, based on the Database's default Collation, all due to leaving off the N prefix on that string literal. And this is exactly what you are (or were) experiencing.
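The same thing can be shown without any table at all. In the following sketch (the variable names are hypothetical), both variables are NVARCHAR and so are fully capable of holding the character, yet only the one assigned from an N-prefixed literal should still contain it on a Database whose default Collation uses Code Page 1252:

```sql
-- Both variables are NVARCHAR, so either one could hold the Subscript 2 character.
DECLARE @NoPrefix   NVARCHAR(10) = 'CO₂';   -- VARCHAR literal: converted before the assignment
DECLARE @WithPrefix NVARCHAR(10) = N'CO₂';  -- NVARCHAR literal: nothing to convert

SELECT @NoPrefix   AS [NoPrefix],
       @WithPrefix AS [WithPrefix],
       UNICODE(RIGHT(@NoPrefix, 1))   AS [NoPrefixLastCodePoint],
       UNICODE(RIGHT(@WithPrefix, 1)) AS [WithPrefixLastCodePoint];
```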

For example, if the default Collation of your Database had been one of the Korean Collations (one of the four Double-Byte Character Sets), then you would not have seen this problem, as the "Subscript 2" character is available in that character set (Code Page 949). Try the following test to see (it uses the Collation of the column instead of the Database's default Collation as that is easier to show):

CREATE TABLE #TestChar
(
    [8bit_Latin1_General-1252] VARCHAR(2) COLLATE Latin1_General_100_CI_AS_SC,
    [8bit_Korean-949] VARCHAR(2) COLLATE Korean_100_CI_AS_SC,
    [UTF16LE_Latin1_General-1252] NVARCHAR(2) COLLATE Latin1_General_100_CI_AS_SC
);

INSERT INTO #TestChar VALUES (N'₂', N'₂', N'₂');

SELECT * FROM #TestChar;

Returns:

8bit_Latin1_General-1252    8bit_Korean-949    UTF16LE_Latin1_General-1252
2                           ₂                  ₂

As you can see, the Latin1_General Collations, which use Code Page 1252 (same Code Page that the Modern_Spanish Collations use) for VARCHAR data, do not have an exact match, but they do have a "best fit" mapping (which is what you are seeing). BUT, the Korean Collations, which use Code Page 949 for VARCHAR data, do have an exact match for the "Subscript 2" character.
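The Code Page behind each of the Collations mentioned here can be confirmed with COLLATIONPROPERTY. A small sketch (the column aliases are only illustrative; note that the _SC Collation names are not available on older versions of SQL Server, in which case I would expect NULL for those two columns):

```sql
-- Code Page used for VARCHAR data by each Collation named above.
SELECT COLLATIONPROPERTY('Latin1_General_100_CI_AS_SC', 'CodePage') AS [Latin1_General],
       COLLATIONPROPERTY('Korean_100_CI_AS_SC', 'CodePage')         AS [Korean],
       COLLATIONPROPERTY('Modern_Spanish_CI_AS', 'CodePage')        AS [Modern_Spanish];
```

The two Latin1_General and Modern_Spanish values should both be 1252, and the Korean value should be 949.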

To further illustrate, we can create a new Database with a default Collation of one of the Korean Collations, and then run the exact SQL that is in the question:

CREATE DATABASE [TestKorean-949] COLLATE Korean_100_CI_AS_KS_WS_SC;
ALTER DATABASE [TestKorean-949] SET RECOVERY SIMPLE;
GO

USE [TestKorean-949];

CREATE TABLE test (
    id INT NOT NULL,
    description NVARCHAR(100) COLLATE Modern_Spanish_CI_AS NOT NULL
);
INSERT INTO test (id, description) VALUES (1, 'CO2');


SELECT * FROM test WHERE id = 1;
UPDATE test SET description = 'CO₂' WHERE id = 1;
SELECT * FROM test WHERE id = 1;

Returns:

id  description
1   CO2


id  description
1   CO₂
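That said, relying on the Database's default Collation is fragile, so the N prefix is still the better fix. As a rough follow-up sketch, assuming the test Database and table created above still exist, the prefixed version of the UPDATE should store the character no matter which Code Page the Database uses, and the test Database can then be dropped:

```sql
-- With the N prefix the literal is NVARCHAR, so no Code Page conversion takes place.
UPDATE test SET description = N'CO₂' WHERE id = 1;

SELECT id, description, UNICODE(RIGHT(description, 1)) AS [LastCodePoint]
FROM   test
WHERE  id = 1;
GO

-- Clean up the test Database once you are done with it.
USE [master];
DROP DATABASE [TestKorean-949];
```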

UPDATE

For anyone who is interested in finding out more about what exactly is going on here (i.e. all of the gory details), please see the two-part investigation I just posted:

Which Collation is Used to Convert NVARCHAR to VARCHAR in a WHERE Condition? (Part A of 2: “Duck”)
Which Collation is Used to Convert NVARCHAR to VARCHAR in a WHERE Condition? (Part B of 2: “Rabbit”)

Tags: Sql Server, Unicode, Sql Server 2008 R2, Collation, T Sql