Remove only the commas present within the double quotes

If perl is OK, here is a short (and probably fast, if not necessarily simple :) ) way of doing it:

perl -pe 's:"(\d[\d,]+)":$1=~y/,//dr:eg' file

The e flag to the s::: operator (which is just another way of writing s///) causes the replacement to be treated as an expression which is evaluated for every match, and the g flag applies it to every quoted number on the line. That expression takes the $1 capture from the regex (which is already missing the quotes) and translates it (y///, which can also be written as tr///) by deleting (the d flag) all the commas. The r flag to y is necessary in order to get the translated string as the value, instead of the count of characters translated; it requires perl 5.14 or later.

For those who somehow feel sullied by perl, here is the python equivalent. Python is really not a shell one-liner tool, but sometimes it can be cajoled into co-operating. The following can be written as one line (unlike for loops, which cannot be), but the horizontal scrolling makes it (even more) unreadable:

python -c '
import re;
import sys;
r=re.compile(r"\"(\d+(,\d+)*)\"");
all(sys.stdout.write(r.sub(lambda m:m.group(1).replace(",",""),l)) or True
    for l in sys.stdin)
' < file
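
If the one-liner is too dense, the same substitution can be spelled out as a small function. This is only a sketch: the name strip_quoted_commas is made up for illustration, and it assumes a quoted number contains nothing but digits separated by commas.

```python
import re

# A double-quoted run of digits, optionally split by commas, e.g. "12,34,54".
QUOTED_NUMBER = re.compile(r'"(\d+(?:,\d+)*)"')

def strip_quoted_commas(line):
    """Drop the quotes and the commas from every quoted number in line."""
    # group(1) is the number without its quotes; remove its commas.
    return QUOTED_NUMBER.sub(lambda m: m.group(1).replace(",", ""), line)

print(strip_quoted_commas('56,72,"12,34,54",x,y,"foo,a,b,bar"'))
# 56,72,123454,x,y,"foo,a,b,bar"
```

Quoted fields that contain anything other than digits and commas, such as "foo,a,b,bar", do not match the pattern and are left alone.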

This (adapted from here) should do what you need, though @rici's Perl one is much simpler:

$ sed -r ':a;s/(("[0-9,]*",?)*"[0-9,]*),/\1/;ta; s/""/","/g; 
          s/"([0-9]*)",?/\1,/g ' file
56,72,123454,x,y,"foo,a,b,bar"
56,92,1234,x,y,"foo,a,b,bar"
56,72,12345478765467,x,y,"foo,a,b,bar"
56,72,x,y,"foo,a,b,bar",123454,
56,72,x,y,"foo,a,b,bar",123454,45578492,"bar,foo"

Explanation

  • :a : define a label called a.
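  • s/(("[0-9,]*",?)*"[0-9,]*),/\1/ : This one needs to be broken down:
    • First of all, with nested groups such as (foo(bar)), \1 will be foobar and \2 will be bar.
    • "[0-9,]*",? : match a ", then 0 or more of 0-9 or ,, then a closing ", followed by 0 or 1 ,.
    • ("[0-9,]*",?)* : match 0 or more of the above.
    • "[0-9,]* : match a " followed by 0 or more of 0-9 or ,.
    • The final , is outside the capture group, so replacing the whole match with \1 deletes that one comma.
  • ta; : branch back to the label a and run again if the substitution was successful.
  • s/""/","/g; : post-processing. Replace "" with ",". This is needed because the loop also deletes a comma that directly follows a closing quote, which leaves two adjacent quoted fields touching.
  • s/"([0-9]*)",?/\1,/g : remove the quotes around numbers and put a single , after each one. This is why a quoted number at the end of a line ends up with a trailing comma.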

This might be easier to understand with another example:

$ echo '"1,2,3,4"' | sed -nr ':a;s/(("[0-9,]*",?)*"[0-9,]*),/\1/;p;ta;'
"1,2,34"
"1,234"
"1234"
"1234"

So, while you can find a number that is right after a quote and followed by a comma and another number, join the two numbers together and repeat the process until it is no longer possible.
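
To watch the loop without sed, here is a rough Python sketch of the same idea. The function name comma_removal_steps is invented for this example, and Python's regex engine is not the POSIX one that sed uses, so treat it as an illustration of the label/branch logic rather than an exact emulation.

```python
import re

# Same pattern as in the sed command: group 1 captures everything in the
# match except the final comma, so substituting \1 deletes that comma.
LAST_COMMA = re.compile(r'(("[0-9,]*",?)*"[0-9,]*),')

def comma_removal_steps(line):
    """Return the line after each successful substitution (':a; s///; ta')."""
    steps = []
    while True:
        # count=1 mirrors sed's s/// without the g flag.
        line, count = LAST_COMMA.subn(r"\1", line, count=1)
        if count == 0:
            # The substitution failed, so sed's t would not branch again.
            return steps
        steps.append(line)

print(comma_removal_steps('"1,2,3,4"'))
# ['"1,2,34"', '"1,234"', '"1234"']
```

The sed run above prints "1234" twice only because p also runs after the final, unsuccessful substitution.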

At this point I believe it is useful to mention a quote from info sed that appears in the section describing advanced functions such as the label used above (thanks for finding it, @Braiam):

In most cases, use of these commands indicates that you are probably better off programming in something like `awk' or Perl.


For CSV data, I'd use a language with a real CSV parser. For example with Ruby:

ruby -rcsv -pe '
  row = CSV::parse_line($_).map {|e| e.delete!(",") if e =~ /^[\d,]+$/; e} 
  $_  = CSV::generate_line(row)
' <<END
56,72,"12,34,54",x,y,"foo,a,b,bar"
56,92,"12,34",x,y,"foo,a,b,bar"
56,72,"12,34,54,78,76,54,67",x,y,"foo,a,b,bar"
56,72,x,y,"foo,a,b,bar","12,34,54"
56,72,x,y,"foo,a,b,bar","12,34,54","45,57,84,92","bar,foo"
END
56,72,123454,x,y,"foo,a,b,bar"
56,92,1234,x,y,"foo,a,b,bar"
56,72,12345478765467,x,y,"foo,a,b,bar"
56,72,x,y,"foo,a,b,bar",123454
56,72,x,y,"foo,a,b,bar",123454,45578492,"bar,foo"
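
The same parser-based approach can be sketched with Python's standard csv module. The helper name strip_numeric_commas is hypothetical, and, like the Ruby version, it treats any field made up only of digits and commas as a number.

```python
import csv
import io
import re

# A field consisting only of digits and commas, like the Ruby /^[\d,]+$/ test.
NUMERIC = re.compile(r"[\d,]+")

def strip_numeric_commas(text):
    """Parse text as CSV, remove commas from numeric fields, re-emit as CSV."""
    out = io.StringIO()
    # lineterminator="\n" avoids the csv module's default "\r\n" line endings.
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        writer.writerow(
            [f.replace(",", "") if NUMERIC.fullmatch(f) else f for f in row]
        )
    return out.getvalue()

print(strip_numeric_commas('56,72,"12,34,54",x,y,"foo,a,b,bar"\n'), end="")
# 56,72,123454,x,y,"foo,a,b,bar"
```

The writer only quotes fields that still need it, so "foo,a,b,bar" keeps its quotes while the cleaned-up numbers lose theirs.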