Print only the Nth line before each line that matches a pattern

You need to keep a buffer of lines.

Try this:

awk -v N=4 -v pattern="example.*pattern" '{i=(1+(i%N)); if (NR>N && $0 ~ pattern) print buffer[i]; buffer[i]=$0}' file

Set N to how many lines before the match the line you want printed is.

Set pattern to the regex to search for.

buffer is an array of N elements, used as a circular buffer that stores the last N lines. Each time the pattern is found, the Nth line before the match is printed.
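As a quick check of the circular-buffer idea, here is a minimal sketch that wraps the command in a shell function and runs it on a small sample. The function name nth_before and the sample data are made up for illustration. It tests NR > N rather than the buffer contents, so an empty line N lines back is still printed.

```shell
# nth_before N PATTERN [FILE...]: hypothetical helper name, not a standard tool.
# Prints the line N lines before each line matching PATTERN.
nth_before() {
    n=$1 pattern=$2
    shift 2
    awk -v N="$n" -v pattern="$pattern" '
        {
            i = 1 + (i % N)               # slot in the ring of N lines, cycles 1..N
            if (NR > N && $0 ~ pattern)   # slot i still holds line NR-N
                print buffer[i]
            buffer[i] = $0                # overwrite it with the current line
        }' "$@"
}

# Sample input: the numbers 1 to 10, one per line; lines 7 and 10 match.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 | nth_before 4 '^(7|10)$'
```

On this sample it prints 3 and 6, the 4th lines before 7 and 10.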


That code doesn't work for previous lines. To get lines before the matched pattern, you need to save the lines already processed. Since awk only has associative arrays, I can't think of an equally simple way of doing what you want in awk, so here's a Perl solution:

perl -ne 'push @lines,$_; print $lines[0] if /PAT/ && $.>LIM; shift(@lines) if $.>LIM;' file

Change PAT to the pattern you want to match and LIM to the number of lines. For example, to print the 5th line before each occurrence of foo, you would run:

perl -ne 'push @lines,$_; print $lines[0] if /foo/ && $.>5; shift(@lines) if $.>5;' file

Explanation

  • perl -ne : read the input file line by line and apply the script given by -e to each line.
  • push @lines,$_ : add the current line ($_) to the array @lines.
  • print $lines[0] if /PAT/ && $.>LIM : print the first element in the array @lines ($lines[0]) if the current line matches the desired pattern and at least LIM lines precede it. At this point @lines holds the current line plus the LIM lines before it, so $lines[0] is the LIM-th line back. The $.>LIM test stops a match within the first LIM lines from printing line 1.
  • shift(@lines) if $.>LIM; : $. is the current line number. If that is greater than the limit, remove the 1st value from the array @lines. The result is that, after each line has been processed, @lines holds at most the last LIM lines.

tac file | awk 'c&&!--c;/pattern/{c=N}' | tac

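Here N is a placeholder for the number of lines. This has the same omission as the 'forwards' use case (printing the Nth line after a match) when there are multiple matches within N lines of each other: each new match resets the counter c, so the earlier match produces no output.

It also won't work well when the input is piped from a running process, because tac has to read all of its input before it can print anything, but it's the simplest way when the input file is complete and not growing.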
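One possible way around the omission is to remember the target line numbers in an array instead of a single counter. The following is only a sketch: the function name nth_before_tac and the array name want are made up, and it assumes a tac command is available (as in GNU coreutils). It still needs complete input, since tac reads everything first.

```shell
# nth_before_tac N PATTERN [FILE...]: hypothetical helper name.
# Prints the line N lines before each match, even when matches are close together.
nth_before_tac() {
    n=$1 pattern=$2
    shift 2
    tac "$@" | awk -v N="$n" -v pattern="$pattern" '
        NR in want   { print; delete want[NR] }  # requested by an earlier match
        $0 ~ pattern { want[NR + N] = 1 }        # request the line N further on
    ' | tac
}

# Two adjacent matches on lines 3 and 4.
printf '%s\n' a b foo foo e | nth_before_tac 2 foo
```

On this sample it prints a and b, whereas the counter version prints only a.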