grep count multiple occurrences

You can get what you need just by using grep, sort and uniq.

grep -EIho 'alfa|beta|gamma' * | sort | uniq -c
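To see what that pipeline reports, here is a minimal sketch that reads standard input instead of the files matched by *. The function name count_words and the sample input are made up for illustration, and -I and -h are dropped because they only matter when grep reads files. Since -o prints each match on its own line, a word that appears twice on one line is counted twice.

```shell
# count_words is a hypothetical helper name, not part of the answer above.
# It counts every occurrence of the three words on standard input.
count_words() {
    grep -Eo 'alfa|beta|gamma' | sort | uniq -c
}

# Sample input: "gamma" occurs twice on one line, "delta" is not searched for.
printf '%s\n' 'alfa beta' 'gamma gamma' 'delta' | count_words
```

With this input it should report alfa once, beta once and gamma twice; the exact column padding of uniq -c differs between implementations.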

I don't think grep is capable of what you want to do.

Just use awk instead :-)

This solution is not optimized, so it may not work well for large files. It also works for plain words only, not regexps. But it's easy to add some features if so desired.

Low-end version, with the restrictions outlined in the comments below:

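awk '
# Count every whitespace-separated word of the input.
{
    split($0, b); for (i in b) ++A[b[i]]
}
# "$*" is expanded by the shell into the search words given to the script.
END {
    split("'"$*"'", a)
    for (i in a) printf "%s %d\n", a[i], A[a[i]]
}
'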

Just give the search strings directly to the script as arguments.

[EDIT]
Fixed version with regex support (see comment below). Please tell me if there are still any open issues.

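# ---- my favorite ----
awk -F' ?-c ' '
# FS also drives split(): "-c alfa -c beta" becomes a[1]="", a[2]="alfa", a[3]="beta"
BEGIN { split("'"$*"'", a) }
# For each input line, bump the counter of every pattern that matches it
{ for (i = 2; a[i]; ++i) if (match($0, a[i])) ++A[i] }
# Print only the patterns that matched at least one line
END { for (i = 2; a[i]; ++i) if (A[i]) print a[i] " " A[i] }
'
# ---- my favorite ----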

sample usage:

script_name -c alfa -c beta -c gamma << !
alfa
beta
gamma
gamma
!

gives:

alfa 1
beta 1
gamma 2

regex usage:

script_name -c   "^al"    -c "beta" -c gamma -c "m.$" << !
alfa
beta
gamma
gamma
!

gives:

^al 1
beta 1
gamma 2
m.$ 2

[/EDIT]


Another awk solution, with shell script wrapper thrown in:

#!/bin/sh -
awk '
BEGIN { split("alfa beta gamma", keyword)
        for (i in keyword) count[keyword[i]]=0
}
/alfa/  { count["alfa"]++ }
/beta/  { count["beta"]++ }
/gamma/ { count["gamma"]++ }
END   {
        for (i in keyword) print keyword[i], count[keyword[i]]
}'

If you want to be able to choose the search keywords at runtime (and provide them as arguments, as in sparkie’s answer), this script can be adapted to build the awk script dynamically.
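One way that adaptation might look, as a rough sketch rather than a finished tool: the wrapper below generates one pattern-action rule per argument and hands the resulting program to awk. The function name count_keywords is made up for illustration, input is read from standard input, and it assumes the keywords contain no slash, double quote, backslash or whitespace, since they are pasted into the awk source unescaped.

```shell
#!/bin/sh -
# count_keywords is a hypothetical name; each argument is a keyword (an awk regex).
# Assumption: keywords contain no '/', '"', backslash or whitespace.
count_keywords() {
    rules=''
    for kw in "$@"; do
        # Append one rule per keyword, e.g.  /alfa/ { count["alfa"]++ }
        rules="$rules/$kw/ { count[\"$kw\"]++ }
"
    done
    # "$*" joins the keywords with spaces so awk can split them again in END.
    awk "$rules"'END {
        n = split("'"$*"'", keyword)
        # Adding 0 prints 0 for keywords that never matched.
        for (i = 1; i <= n; i++) print keyword[i], count[keyword[i]] + 0
    }'
}

printf '%s\n' alfa beta gamma gamma | count_keywords alfa beta gamma
```

For that input it prints alfa 1, beta 1 and gamma 2, one per line and in argument order. Like the fixed-keyword script, it counts matching lines, not individual occurrences within a line.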
