How can I "grep" patterns across multiple lines?

Here's a sed one-liner that gives you grep-like behavior across multiple lines:

sed -n '/foo/{:start /bar/!{N;b start};/your_regex/p}' your_file

How it works

  • -n suppresses the default behavior of printing every line
  • /foo/{} instructs it to match foo and do what comes inside the squigglies to the matching lines. Replace foo with the starting part of the pattern.
  • :start is a branching label to help us keep looping until we find the end to our regex.
  • /bar/!{} will execute what's in the squigglies to the lines that don't match bar. Replace bar with the ending part of the pattern.
  • N appends the next line to the active buffer (sed calls this the pattern space)
  • b start will unconditionally branch to the start label we created earlier so as to keep appending the next line as long as the pattern space doesn't contain bar.
  • /your_regex/p prints the pattern space if it matches your_regex. Replace your_regex with the whole expression you want to match across multiple lines.
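
As a concrete illustration of the steps above, here is a minimal sketch with foo, bar and your_regex filled in. The file name sample.txt and its contents are made up for the example, and the one-line form of the script assumes GNU sed. It should print the three lines from "abc blah" through "def blah".

```shell
# Hypothetical sample input, created only so the snippet is self-contained.
printf '%s\n' 'abc blah' 'blah blah' 'def blah' 'blah blah' > sample.txt

# foo        -> abc         (start of the pattern)
# bar        -> def         (end of the pattern)
# your_regex -> abc.*def    (the whole multi-line expression)
# Lines are appended with N until the pattern space contains def,
# then the buffer is printed if it matches abc.*def.
sed -n '/abc/{:start /def/!{N;b start};/abc.*def/p}' sample.txt
```

Inside the pattern space the lines are joined by newlines, and sed's dot matches those newlines, which is why abc.*def can span several lines here.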

I generally use a tool called pcregrep, which can be installed on most Linux distributions using yum or apt.

For example, suppose you have a file named testfile with the following content:

abc blah
blah blah
def blah
blah blah

You can run the following command:

$ pcregrep -M 'abc.*(\n|.)*def' testfile

to do pattern matching across multiple lines.

You can do the same with sed:

$ sed -e '/abc/,/def/!d' testfile
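
To see what this range form selects, here is a small sketch that recreates the sample file (the printf line is only there to make the snippet self-contained). The address range /abc/,/def/ covers every line from the first one matching abc through the next one matching def, and !d deletes everything outside it, so whole lines are printed rather than just the matched text. The second command is an equivalent spelling using -n and p.

```shell
# Recreate the sample file from the example above.
printf '%s\n' 'abc blah' 'blah blah' 'def blah' 'blah blah' > testfile

# Delete every line that is NOT inside the abc..def range.
sed -e '/abc/,/def/!d' testfile

# Equivalent: suppress default output and print only the range.
sed -n '/abc/,/def/p' testfile
```

Both commands print the first three lines of testfile and drop the trailing "blah blah" line.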

A normal grep that supports the Perl-regexp option -P can also do this job.

$ echo 'abc blah
blah blah
def blah
blah blah' | grep -oPz '(?s)abc.*?def'
abc blah
blah blah
def

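(?s) is the DOTALL modifier, which makes the dot in your regex match line breaks as well as ordinary characters. The -z option makes grep treat the input as NUL-separated instead of newline-separated, so the whole input is searched as a single record, and -o prints only the matched part.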
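
One practical detail: with -z, grep also terminates each match it prints with a NUL byte instead of a newline. The sketch below assumes GNU grep built with PCRE support and uses a made-up file name, testfile; it runs the same search against a file and pipes the result through tr so each match ends with a newline.

```shell
# Hypothetical sample file with the same content as the echo example.
printf '%s\n' 'abc blah' 'blah blah' 'def blah' 'blah blah' > testfile

# -z makes grep end each printed match with NUL; tr turns that into a newline.
grep -oPz '(?s)abc.*?def' testfile | tr '\0' '\n'
```

The non-greedy .*? stops at the first def, so the match ends at "def" rather than at the end of that line.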