How to save unicode data to oracle?

I can see five potential areas for problems:

  1. How are you actually getting the text into your .NET application? If it's hardcoded in a string literal, are you sure that the compiler is assuming the right encoding for your source file?

  2. There could be a problem in how you're sending it to the database.

  3. There could be a problem with how it's being stored in the database.

  4. There could be a problem with how you're fetching it in the database.

  5. There could be a problem with how you're displaying it again afterwards.

Now areas 2-4 sound like they're less likely to be an issue than 1 and 5. How are you displaying the text afterwards? Are you actually fetching it out of the database in .NET, or are you using Toad or something similar to try to see it?

If you're writing it out again from .NET, I suggest you skip the database entirely - if you just display the string itself, what do you see?

I have an article you might find useful on debugging Unicode problems. In particular, concentrate on every place where the encoding could be going wrong, and make sure that whenever you "display" a string you dump out the exact Unicode characters (as integers) so you can check those rather than just whatever your current font wants to display.

EDIT: Okay, so the database is involved somewhere in the problem.

I strongly suggest that you remove anything like ASP and HTML out of the equation. Write a simple console app that does nothing but insert the string and fetch it again. Make it dump the individual Unicode characters (as integers) before and after. Then try to see what's in the database (e.g. using Toad). I don't know the Oracle functions to convert strings into sequences of individual Unicode characters and then convert those characters into integers, but that would quite possibly be the next thing I'd try.

EDIT: Two more suggestions (good to see the console app, btw).

  1. Specify the data type for the parameter, instead of just giving it an object. For instance:

    command.Parameters.Add (":UnicodeString",
                            OracleType.NVarChar).Value = stringToSave;
    
  2. Consider using Oracle's own driver instead of the one built into .NET. You may wish to do this anyway, as it's generally reckoned to be faster and more reliable, I believe.


You can determine what characterset your database uses for NCHAR with the query:

SQL> SELECT VALUE
  2    FROM nls_database_parameters
  3   WHERE parameter = 'NLS_NCHAR_CHARACTERSET';

VALUE
------------
AL16UTF16

to check if your database configuration is correct, you could run the following in SQL*Plus:

SQL> CREATE TABLE unicodedata (ID NUMBER, unicodestring NVARCHAR2(100)); 

Table created
SQL> INSERT INTO unicodedata VALUES (11, 'Τι κάνεις;');

1 row inserted
SQL> SELECT * FROM unicodedata;

        ID UNICODESTRING
---------- ---------------------------------------------------
        11 Τι κάνεις;