How to count the occurrences of a specific string on a specific line in a file?

This needs to be done in three steps:

  1. Select line number N (example uses line 42):

    sed '42!d'
    
  2. Search the line for all occurrences of a specific pattern (here the string/regular expression hello) and print those separately:

    grep -o 'hello'
    
  3. Count the matches:

    wc -l
    

Or, as a single pipeline reading from file.txt:

sed '42!d' file.txt | grep -o 'hello' | wc -l

This is a good use case for putting Unix tools together in a pipeline.

line=5
str="ipsum"
sed -n "${line}p" filename | grep -o -- "$str" | wc -l

With -n, sed's p command prints only the given line of the file, which is piped into grep. grep's -o option makes it print each match for the given pattern on a separate line, and the -- keeps a pattern that begins with a dash from being taken as an option. grep's output is piped to wc -l, which counts the lines and therefore the matches.
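
For comparison, the same three stages (select the line, find the matches, count them) can be sketched in Python. This is only an illustration, not part of the answer above: the function name count_on_line and its fixed parameter are made up. It also shows one thing to watch for: grep treats its pattern as a regular expression unless you pass -F, which the sketch mirrors with re.escape. Python's regular expression syntax differs from grep's in places, so treat it as an analogy rather than an exact equivalent.

```python
import re


def count_on_line(lines, line_no, pattern, fixed=False):
    """Count non-overlapping matches of pattern on the 1-based line line_no.

    Hypothetical helper mirroring: sed -n "${line}p" | grep -o | wc -l
    With fixed=True the pattern is taken literally, similar to grep -F.
    """
    regex = re.escape(pattern) if fixed else pattern
    for number, line in enumerate(lines, 1):
        if number == line_no:
            # Find every match (grep -o) and count them (wc -l).
            return len(re.findall(regex, line))
    return 0  # the requested line does not exist


text = ["nothing here\n", "a.c abc a.c\n"]
print(count_on_line(text, 2, "a.c"))              # 3: "." matches any character
print(count_on_line(text, 2, "a.c", fixed=True))  # 2: literal "a.c" only
```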


Python

Here's one way to do it in Python with a list comprehension (see below for a shorter alternative).

$ python -c 'import sys;print([ l for i,l in enumerate(sys.stdin,1) if i==2][0].count("word"))' < input.txt
3
$ cat input.txt
nothing here
word and another word, and one more word
last line

How this works:

  • We run the python interpreter with the -c flag, where the commands are contained within single quotes.
  • The input file input.txt is redirected into the stdin stream of the python interpreter via the < shell operator. Hence we need the sys module.
  • Using a list comprehension of the form [something for item in something], we read lines of text from sys.stdin.
  • enumerate(sys.stdin,1) numbers the lines: with each iteration of the list comprehension we get the line of text in the variable l and its index in the variable i, with the count starting at 1.
  • The condition i==2 keeps only the line whose index equals 2. That's how we pick which line to extract.
  • As a result our list contains only one item, whose index within the list is 0. So we refer to that item as [<list comprehension stuff here>][0].
  • The .count("word") is what actually does the job of counting. By definition it returns the number of non-overlapping occurrences of a substring in a string.
  • Finally, all of that is wrapped in a print() call, so whatever number the .count() method returns shows up on screen.
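
The "non-overlapping" part matters when a string can overlap with itself. A small sketch to illustrate; the sample string and the helper count_overlapping are made up for this example:

```python
import re

line = "aaaa"

# str.count() scans left to right and skips past each match it finds,
# so "aa" is found at positions 0 and 2 only.
print(line.count("aa"))  # 2


def count_overlapping(text, sub):
    """Hypothetical helper: count occurrences of sub, allowing overlaps."""
    # A lookahead matches without consuming characters, so every
    # starting position in the text is tested.
    return len(re.findall("(?=" + re.escape(sub) + ")", text))


print(count_overlapping(line, "aa"))  # 3
```

grep -o reports non-overlapping matches as well, so the shell pipelines and .count() agree on this case.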

Shorter version

A shorter way to do the same in Python is to use the readlines() method instead of a list comprehension and refer to a specific item in the list that readlines() produces. Note that readlines() produces a list, and lists in Python are 0-indexed, which means that to read line x you should reference list item x-1. For instance,

$ python -c 'import sys;print(sys.stdin.readlines()[1].count("word"))' < input.txt
3
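
Two things to keep in mind with this version: readlines() loads the whole input into memory, and indexing past the last line raises IndexError. Below is a sketch of a variant that stops reading at the wanted line, assuming a missing line should simply count as zero. The function name count_word_on_line is hypothetical, and io.StringIO stands in for sys.stdin so the example is self-contained.

```python
import io
from itertools import islice


def count_word_on_line(stream, line_no, word):
    """Hypothetical helper: count word on the 1-based line line_no of stream."""
    # islice stops reading once the requested line has been consumed;
    # the "" default covers input with fewer than line_no lines.
    line = next(islice(stream, line_no - 1, line_no), "")
    return line.count(word)


# Same sample text as input.txt above.
sample = io.StringIO(
    "nothing here\n"
    "word and another word, and one more word\n"
    "last line\n"
)
print(count_word_on_line(sample, 2, "word"))  # 3
```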

sed+grep

Of course, we don't have to stick with scripting languages alone. sed and grep provide sufficient tools for the job. With grep -c we can count matching lines, so all we have to do is extract the specific line we need and split the words in that line onto separate lines (the \n in the replacement relies on GNU sed). Like so:

$ sed -n  '2{s/ /\n/g;p}' input.txt | grep -c 'word'
3
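
One caveat: grep -c counts matching lines, not matches, so after splitting on spaces a token that contains the string more than once is still counted once. A sketch of the difference in Python, using a made-up sample line:

```python
line = "word-word and word"

# What the sed + grep -c approach counts: space-separated tokens
# that contain "word" at least once.
tokens_with_word = sum(1 for token in line.split(" ") if "word" in token)
print(tokens_with_word)  # 2

# What str.count() (or grep -o piped to wc -l) counts: every occurrence.
print(line.count("word"))  # 3
```

For the sample input.txt the two agree, because each token contains the word at most once.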