How to grep lines which have more than a specific number of special characters

Perl solution:

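perl -ne 'print if tr/,// > 4' myfile
  • -n reads the input line by line without printing each line automatically
  • tr/,// returns the number of commas it matched; with an empty replacement list and no /d modifier, the line itself is left unchanged.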

To print the lines with fewer than 4 commas, just change > to <.


Using the grep command:

grep -E '(,.*){5}' myfile

does the job. Explanation:

-E: use an Extended Regex...

'(,.*): ... to find one comma followed by any number of characters, even zero...

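{5}': ... and repeat the previous pattern 5 times. Since the pattern is not anchored, it matches any line containing at least 5 commas, i.e. more than 4.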
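
If the threshold should be a parameter, the repetition count can be computed by the shell. This is only a sketch: the variable n, the sample lines and the use of mktemp are assumptions, not part of the original answer. "More than n commas" means "at least n + 1", hence the arithmetic expansion:

```shell
# Hypothetical sample data: 2, 4, 5 and 5 commas respectively.
sample=$(mktemp)
printf '%s\n' 'a,b,c' 'a,b,c,d,e' 'a,b,c,d,e,f' ',,,,,' > "$sample"

n=4   # select lines with MORE than n commas

# Double quotes let the shell expand $((n + 1)); the regex becomes (,.*){5}.
at_least=$(grep -E "(,.*){$((n + 1))}" "$sample")
printf '%s\n' "$at_least"

rm -f "$sample"
```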

If you want to grep lines with fewer than 4 commas, you'd need:

grep -xE '([^,]*,){0,3}[^,]*' myfile

This time, we need -x to anchor the pattern at both the start and the end of the line, so that it has to match the full line. We also use [^,]* instead of .*, because . matches any character, so .* would happily match strings containing commas.

Another approach is to invert the previous approach with -v. "Fewer than 4" is the same as not "at least 4", so:

grep -vE '(,.*){4}' myfile
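
A quick way to check that the anchored pattern and the inverted match select the same lines. The sample data and variable names below are hypothetical, and mktemp is assumed to be available:

```shell
# Hypothetical sample data: 2, 3, 4 and 0 commas respectively.
sample=$(mktemp)
printf '%s\n' 'a,b,c' 'a,b,c,d' 'a,b,c,d,e' 'plain' > "$sample"

# Whole-line match: at most 3 "field," groups followed by a comma-free tail.
anchored=$(grep -xE '([^,]*,){0,3}[^,]*' "$sample")

# Inverted match: drop every line that has at least 4 commas.
inverted=$(grep -vE '(,.*){4}' "$sample")

if [ "$anchored" = "$inverted" ]; then
    echo "both approaches agree:"
    printf '%s\n' "$anchored"
fi

rm -f "$sample"
```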

The awk version:

awk -F, 'NF > 5' myfile
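
With , as the field separator, a line containing n commas has n + 1 fields, which is why NF > 5 selects lines with more than 4 commas. A sketch with the threshold passed in as a variable, plus a variant that counts the commas directly with gsub (a substitute technique, not the one used above). The names n, by_fields and by_gsub and the sample data are assumptions:

```shell
# Hypothetical sample data: 2, 4 and 5 commas respectively.
sample=$(mktemp)
printf '%s\n' 'a,b,c' 'a,b,c,d,e' 'a,b,c,d,e,f' > "$sample"

# n commas split a line into n + 1 fields.
by_fields=$(awk -F, -v n=4 'NF > n + 1' "$sample")

# gsub returns the number of substitutions; "&" puts each comma back unchanged.
by_gsub=$(awk -v n=4 'gsub(/,/, "&") > n' "$sample")

printf '%s\n' "$by_fields"

rm -f "$sample"
```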