Print lines between start & end pattern, but if end pattern does not exist, don't print

You can accomplish this as follows:

$ sed -e '
    /BEGIN/,/END/!d
    H;/BEGIN/h;/END/!d;g
' inp

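How it works: the first command deletes every line outside a BEGIN/END range. Inside the range, each line is appended to the hold space (H), and a BEGIN line resets the hold space to just that line (h). Every line is then deleted until the END line is reached, at which point the hold space is copied back (g) and printed. Otherwise, if END never shows up, the range runs to the end of the input, every line is deleted, and we get nothing out.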
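To watch the hold-space logic in action, here is the same script split into one command per line with comments. The wrapper name print_complete_blocks, the file name sample.txt and its contents are made up for illustration.

```shell
# print_complete_blocks is a hypothetical wrapper around the script above.
print_complete_blocks() {
    sed -e '
        # Drop every line outside a BEGIN..END range.
        /BEGIN/,/END/!d
        # Append the current line to the hold space.
        H
        # On BEGIN, restart the hold space with just this line.
        /BEGIN/h
        # Until END shows up, print nothing.
        /END/!d
        # On END, copy the collected block to the pattern space; sed prints it.
        g
    ' "$@"
}

# Sample input (an assumption): one complete block, then a BEGIN with no END.
printf '%s\n' a BEGIN one END b BEGIN two > sample.txt
print_complete_blocks sample.txt
```

On that sample it should print only the BEGIN, one, END lines; the trailing BEGIN/two pair is collected in the hold space but never recalled.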


sed '/\*\*\*\*\* BEGIN \*\*\*\*\*/,/\*\*\*\*\* END \*\*\*\*\*/ p;d' input |
tac |
sed '/\*\*\*\*\* END \*\*\*\*\*/,/\*\*\*\*\* BEGIN \*\*\*\*\*/ p;d' |
tac

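The first sed keeps every BEGIN-to-END range, including a trailing BEGIN that has no END (that range simply runs to the end of the input). tac then reverses the lines, so the second sed sees each block as END-to-BEGIN; the unterminated block has no END to open a range, so it is dropped. The final tac puts the lines back in their original order.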
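A small run makes the effect of the double reversal visible. This is only a sketch: the wrapper name complete_blocks_tac, the file name sample.txt and the sample lines are assumptions, and it uses sed -n with p (equivalent to p;d without -n). tac comes from GNU coreutils, so it may be missing on other systems.

```shell
# complete_blocks_tac is a hypothetical wrapper name.
complete_blocks_tac() {
    begin='\*\*\*\*\* BEGIN \*\*\*\*\*'
    end='\*\*\*\*\* END \*\*\*\*\*'
    # Pass 1: keep BEGIN..END ranges (an unterminated BEGIN runs to end of input).
    sed -n "/$begin/,/$end/p" "$@" |
    tac |
    # Pass 2, on reversed lines: keep END..BEGIN ranges only.
    sed -n "/$end/,/$begin/p" |
    tac
}

# Sample input (an assumption): one complete block, then a BEGIN with no END.
printf '%s\n' a '***** BEGIN *****' one '***** END *****' b '***** BEGIN *****' two > sample.txt
complete_blocks_tac sample.txt
```

On that sample only the first block (the BEGIN marker, one, the END marker) should come out; the second BEGIN and the line after it survive pass 1 but are discarded in pass 2.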


With pcregrep:

pcregrep -M '(?s)BEGIN.*?END'

That also works if BEGIN and END are on the same line, but not in cases like:

BEGIN 1 END foo BEGIN 2
END

There, pcregrep catches the first BEGIN 1 END, but not the second block (BEGIN 2 through the END on the next line).

To handle those, with awk, you could do:

awk '
  !inside {
    if (match($0, /^.*BEGIN/)) {
      inside = 1
      remembered = substr($0, 1, RLENGTH)
      $0 = substr($0, RLENGTH + 1)
    } else next
  }
  {
    if (match($0, /^.*END/)) {
      print remembered $0
      if (substr($0, RLENGTH+1) ~ /BEGIN/)
        remembered = ""
      else
        inside = 0
    } else
      remembered = remembered $0 ORS
  }'

On an input like:

a
BEGIN blah END BEGIN 1
2
END
b
BEGIN foo END
c
BEGIN
bar
END BEGIN
baz END
d
BEGIN
xxx

It gives:

BEGIN blah END BEGIN 1
2
END
BEGIN foo END
BEGIN
bar
END BEGIN
baz END

Both approaches (pcregrep and awk) need to store everything from a BEGIN to the following END in memory. So if you have a huge file whose first line contains BEGIN but which has no END, the whole file will be stored in memory for nothing.

The only way around that would be to process the file twice, but of course that could only be done when the input is a regular file (not a pipe for instance).
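As a rough sketch of that two-pass idea, awk can be given the same file twice: the first pass records only the line numbers of blocks that do get an END, and the second pass prints those lines. The function name complete_blocks_two_pass and the sample file are made up here, and the sketch works at whole-line level only (it does not handle several BEGIN/END pairs on one line the way the awk script above does).

```shell
# complete_blocks_two_pass is a hypothetical name; it needs a regular file, not a pipe.
complete_blocks_two_pass() {
    awk '
        BEGIN { print_to = 0 }
        # Pass 1 (NR == FNR): remember line numbers only, never line contents.
        NR == FNR {
            if (!in_block && /BEGIN/)     { in_block = 1; start = FNR }
            else if (in_block && /END/)   { in_block = 0; end_of[start] = FNR }
            next
        }
        # Pass 2: a BEGIN line whose END was seen opens a printable span.
        FNR in end_of { print_to = end_of[FNR] }
        FNR <= print_to
    ' "$1" "$1"
}

# Sample input (an assumption): one complete block, then a BEGIN with no END.
printf '%s\n' a BEGIN one END b BEGIN two > sample.txt
complete_blocks_two_pass sample.txt
```

Memory use then depends on the number of complete blocks rather than on their size, at the cost of reading the file twice.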
