How to read \" double-quote escaped values with read.table in R

My apologies ahead of time that this isn't more detailed -- I'm right in the middle of a code crunch.

You might consider using the scan() function. I created a simple sample file, "sample.csv", which consists of:

V1,V2
"_:b5507F4C7x59005","Fabiana D\"atri"

Two quick possibilities are (with output commented so you can copy-paste to the command line):

test <- scan("sample.csv", sep=",", what='character',allowEscapes=TRUE)
## Read 4 items
test
## [1] "V1"                "V2"                "_:b5507F4C7x59005"
## [4] "Fabiana D\\atri\n"

or

test <- scan("sample.csv", sep=",", what='character',comment.char="\\")
## Read 4 items
test
## [1] "V1"                "V2"                "_:b5507F4C7x59005"
## [4] "Fabiana D\\atri\n"

You'll probably need to play around with it a little more to get what you want. And I see that you've already mentioned writeLines, so you may have already tried this. Either way, good luck!


It seems to me that read.table/read.csv cannot handle escaped quotes.

...But I think I have an (ugly) workaround inspired by @nullglob:

  • First read the file WITHOUT a quote character. (This won't handle embedded commas, as @Ben Bolker noted.)
  • Then go through the string columns and remove the quotes.

The test file looks like this (I added a non-string column for good measure):

13,"foo","Fab D\"atri","bar"
21,"foo2","Fab D\"atri2","bar2"

And here is the code:

# Generate test file
writeLines(c("13,\"foo\",\"Fab D\\\"atri\",\"bar\"",
             "21,\"foo2\",\"Fab D\\\"atri2\",\"bar2\"" ), "foo.txt")

# Read ignoring quotes
tbl <- read.table("foo.txt", as.is=TRUE, quote='', sep=',', header=FALSE, row.names=NULL)

# Go through and clean up
for (i in seq_len(NCOL(tbl))) {
    if (is.character(tbl[[i]])) {
        x <- tbl[[i]]
        x <- substr(x, 2, nchar(x)-1) # Remove surrounding quotes
        tbl[[i]] <- gsub('\\\\"', '"', x) # Unescape quotes
    }
}

The output is then correct:

> tbl
  V1   V2          V3   V4
1 13  foo  Fab D"atri  bar
2 21 foo2 Fab D"atri2 bar2
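
If you need this more than once, the clean-up loop can be wrapped in a small helper. This is only a sketch: unquote_column is a name I made up, and it assumes, like the code above, that no field contains an embedded comma. Unlike the loop, it strips the outer quotes only from values that actually start and end with one. The sample rows are passed inline via text= so the snippet runs without foo.txt.

```r
# Hypothetical helper (not part of base R): strip the surrounding quotes
# and unescape \" in a character vector.
unquote_column <- function(x) {
  quoted <- grepl('^".*"$', x)                       # only touch quoted values
  x[quoted] <- substr(x[quoted], 2, nchar(x[quoted]) - 1)
  gsub('\\"', '"', x, fixed = TRUE)                  # \" becomes "
}

# Same two rows as the test file, supplied inline instead of via foo.txt
lines <- c('13,"foo","Fab D\\"atri","bar"',
           '21,"foo2","Fab D\\"atri2","bar2"')
tbl <- read.table(text = lines, as.is = TRUE, quote = "", sep = ",",
                  header = FALSE)

# Apply the helper to the character columns only
is_chr <- vapply(tbl, is.character, logical(1))
tbl[is_chr] <- lapply(tbl[is_chr], unquote_column)
tbl
```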

On Linux/Unix (or on Windows with cygwin or GnuWin32), you can use sed to convert the escaped double quotes \" to doubled double quotes "", which read.csv handles well:

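# FILENAME holds the path to the CSV file
p <- pipe(paste0('sed \'s/\\\\"/""/g\' "', FILENAME, '"'))
d <- read.csv(p, ...)  # read.csv opens the pipe and closes it when done
rm(p)                  # just drops the connection variable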

Effectively, the following sed command is used to preprocess the CSV input:

sed 's/\\"/""/g' file.csv

I don't call this beautiful, but at least you don't have to leave the R environment...
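
If sed is not available, the same substitution can probably be done in plain R before parsing. Here is a minimal sketch, assuming the file is small enough to hold in memory as lines; the sample data are given inline so it runs as-is, and for a real file you would read the lines with readLines instead. Like the sed command, it will misconvert a field that ends in an escaped backslash (\\ followed by the closing quote).

```r
# Sample input with a backslash-escaped quote, as in the question
lines <- c('V1,V2',
           '"_:b5507F4C7x59005","Fabiana D\\"atri"')
# For a real file you would use: lines <- readLines("file.csv")

# Replace every \" with "" (fixed = TRUE, so the pattern is taken literally)
fixed_lines <- gsub('\\"', '""', lines, fixed = TRUE)

# read.csv understands doubled quotes inside a quoted field
d <- read.csv(text = fixed_lines, stringsAsFactors = FALSE)
d$V2
```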