Formatting a comma-delimited CSV to force Excel to interpret value as a string

For those who have control over the source data: apparently Excel will auto-detect the format of a CSV field unless the field is written in this form:

"=""Data Here"""

e.g.:

20,       5.5%,      "0404 123 351", "3-6",  "=""123"""
[number]  [percent]  [number]        [date]  [string]  <-- how Excel interprets

It also works in Google Spreadsheet, but I'm not sure whether other spreadsheet apps support this notation.

If you suspect any of the data may itself contain quotes, you need to double-escape them, like this:

"=""She said """"Hello"""" to him"""



(EDIT: Updated with corrections, thanks DMA57361!)


Like many others, I have been struggling with the decisions Microsoft makes here and have tried the various suggested solutions.

For Excel 2007, the following holds:

  • Putting all values in double quotes does NOT help
  • Putting an = before all values after putting them in double quotes DOES help, BUT makes the csv file useless for most other applications
  • Putting parentheses around the double quotes around all values is rubbish
  • Putting a space before all values before putting double quotes around them DOES prevent conversions to dates, but DOES NOT prevent trimming of leading or trailing zeroes.
  • Putting a single quote in front of a value only works when entering data within Excel.

However:

Putting a tab before all values before putting double quotes around them DOES prevent conversions to dates AND DOES prevent trimming of leading or trailing zeroes. The sheet does not even show the nasty warning markers in the upper left corner of each cell.

E.g.:

"<tab character><some value>","<tab character><some other value>"

Note that the tab character has to be within the double quotes. Edit: it turns out that the double quotes are not even necessary.
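For anyone writing the file from a script, the tab trick can be sketched like this. It is a minimal Python sketch under the assumptions above (behaviour observed in Excel 2007); `write_tab_prefixed` is a made-up name, and `QUOTE_ALL` is used only to reproduce the quoted form shown in the example. Keep in mind that any other program reading the file will see the tab as part of the value.

```python
import csv
import io


def write_tab_prefixed(rows, out):
    """Write rows as CSV with a tab in front of every value.

    Hypothetical helper. QUOTE_ALL puts double quotes around each
    field, so the tab ends up inside the quotes.
    """
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in rows:
        writer.writerow("\t" + str(value) for value in row)


if __name__ == "__main__":
    buf = io.StringIO()
    write_tab_prefixed([["0404", "3-6"], ["007", "1/2"]], buf)
    print(buf.getvalue(), end="")
```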

Double-clicking the csv file then opens it as a spreadsheet in Excel, with all values prefixed as described above treated as text. Make sure Excel is set to use '.' as the decimal point and not ',', or every line of the csv file will end up as a single text value in the first cell of its row. Apparently Microsoft thinks that CSV means "Not the decimal point" Separated Value.


Using Excel's import functionality allows you to specify the format (auto, text or date) each column should be interpreted as and does not require any modification to the data files.

You can find it as Data > Get External Data > From Text in Excel 2007/2010.
Or Data > Import External Data > Import Data in Excel 2003.

Here's an image of the Excel 2003 Text Import Wizard in action on the example data given, showing me importing the latter two columns as text:

[Image: Excel 2003 Text Import Wizard, Step 3 - data types]