Remove comma between the quotes only in a comma delimited file

If the quotes are balanced, you want to remove the commas inside every other quote-delimited field. This can be expressed in awk like this:

awk -F'"' -v OFS='' '{ for (i=2; i<=NF; i+=2) gsub(",", "", $i) } 1' infile

Output:

123,ABC DEV 23,345,534.202,NAME

Explanation

The -F'"' option makes awk split each line at the double-quote characters, so every even-numbered field is the text between a pair of quotes. The for-loop runs gsub (short for "global substitute") on each of those fields, replacing every comma (",") with nothing (""). Because OFS is empty, a record that gets changed is rebuilt without the double quotes. The 1 at the end invokes the default code block: { print $0 }.
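
If the quotes should stay in the output, one possible variation is to set OFS to a double quote, so that the quotes are put back when awk rebuilds the record. This is a minimal sketch that assumes the same balanced-quote input; the function name strip_keep_quotes is made up for the illustration:

```shell
# Hypothetical helper: remove commas inside quoted fields but keep the quotes.
strip_keep_quotes() {
    # OFS='"' puts a double quote back between the fields when awk rebuilds $0.
    awk -F'"' -v OFS='"' '{ for (i = 2; i <= NF; i += 2) gsub(",", "", $i) } 1' "$@"
}

printf '%s\n' '123,"ABC, DEV 23",345,534.202,NAME' | strip_keep_quotes
# prints: 123,"ABC DEV 23",345,534.202,NAME
```

Called without arguments the helper reads standard input; a file name can be passed instead.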


Here is another good approach, using a single sed invocation with a loop (GNU sed, since \? and \| are GNU extensions):

echo '123,"ABC, DEV 23",345,534,"some more, comma-separated, words",202,NAME'|
  sed ':a;s/^\(\([^"]*,\?\|"[^",]*",\?\)*"[^",]*\),/\1 /;ta'
123,"ABC  DEV 23",345,534,"some more  comma-separated  words",202,NAME

Explanation:

  • :a; defines a label for the later branch.
  • s/^\(\([^"]*,\?\|"[^",]*",\?\)*"[^",]*\),/\1 / consists of three nested parts:
    • The inner group, [^"]*,\?\|"[^",]*",\?, matches either a string containing no double quote, optionally followed by a comma, or a string enclosed in double quotes that contains no comma, optionally followed by a comma.
    • The outer group consists of any number of repetitions of that inner group, followed by one double quote and some characters that are neither double quotes nor commas.
    • The outer group has to be followed by a comma, which is the one that gets replaced by a space.
    • Note that the rest of the line does not need to be touched.
  • ta branches back to :a if the previous s/ command made a substitution.
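
To see why the loop is needed, it can help to run the substitution once without :a and ta and compare the result. The sketch below assumes GNU sed (for \? and \|); the variable names are only illustrative:

```shell
line='123,"ABC, DEV 23",345,534,"some more, comma-separated, words",202,NAME'
re='s/^\(\([^"]*,\?\|"[^",]*",\?\)*"[^",]*\),/\1 /'

# One pass: only the first comma found inside quotes is replaced.
once=$(printf '%s\n' "$line" | sed "$re")
# With the loop: the substitution repeats until no quoted comma is left.
looped=$(printf '%s\n' "$line" | sed ":a;$re;ta")

printf '%s\n' "$once" "$looped"
# 123,"ABC  DEV 23",345,534,"some more, comma-separated, words",202,NAME
# 123,"ABC  DEV 23",345,534,"some more  comma-separated  words",202,NAME
```

The pattern is anchored at ^ and cannot step over a quoted string that still contains a comma, so each pass can only fix the leftmost remaining one.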

Once the loop is done, you can also add s/  */ /g (two spaces before the *):

echo '123,"ABC, DEV 23",345,534,"some more, comma-separated, words",202,NAME'|
    sed ':a;s/^\(\([^"]*,\?\|"[^",]*",\?\)*"[^",]*\),/\1 /;ta;s/  */ /g'

This collapses each run of spaces into a single space:

123,"ABC DEV 23",345,534,"some more comma-separated words",202,NAME

A general solution that can also handle several commas between balanced quotes needs a nested substitution. Here is a solution in perl that processes every line of the input and removes commas only inside each pair of quotes:

perl -pe 's/ "  (.+?  [^\\])  "               # find all non-escaped
                                              # quoting pairs
                                              # in a non-greedy way

           / ($ret = $1) =~ (s#,##g);         # remove all commas within quotes
             $ret                             # substitute the substitution :)
           /gex'

or, in short:

perl -pe 's/"(.+?[^\\])"/($ret = $1) =~ (s#,##g); $ret/ge'

You can either pipe the text you want to process into the command or pass the text file as the last command-line argument.