Print lines between start & end pattern, but if end pattern does not exist, don't print
You can accomplish this as follows:
$ sed -e '
/BEGIN/,/END/!d
H;/BEGIN/h;/END/!d;g
' inp
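How it works: every line inside the BEGIN/END range is appended to the hold space, and a BEGIN line resets the hold space so each block starts fresh. Lines are deleted until the END line is reached, at which point the contents of the hold space are recalled and printed. Otherwise, if END never shows up, nothing is printed.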
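As a rough illustration of the steps above, here is the same script with one comment per command, run on a small made-up input. The sample data and the temporary file are assumptions for the demo, and the sketch assumes BEGIN and END sit on separate lines.

```shell
# Hypothetical sample: one complete block, then a BEGIN that is never closed.
inp=$(mktemp)
printf '%s\n' a BEGIN one END b BEGIN two > "$inp"

out=$(sed -e '
  # Outside a BEGIN..END range: discard the line.
  /BEGIN/,/END/!d
  # Inside the range: append the line to the hold space.
  H
  # On a BEGIN line, overwrite the hold space so the block starts fresh.
  /BEGIN/h
  # Until END is seen, print nothing.
  /END/!d
  # On END, copy the collected block back so that sed prints it.
  g
' "$inp")

printf '%s\n' "$out"
# Prints:
# BEGIN
# one
# END
rm -f "$inp"
```

The unterminated second block is collected in the hold space but never recalled, so it does not appear in the output.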
cat input |
sed '/\*\*\*\*\* BEGIN \*\*\*\*\*/,/\*\*\*\*\* END \*\*\*\*\*/ p;d' |
tac |
sed '/\*\*\*\*\* END \*\*\*\*\*/,/\*\*\*\*\* BEGIN \*\*\*\*\*/ p;d' |
tac
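It works by having tac reverse the lines so that sed can find both delimiters in both orders: the first sed keeps everything from each BEGIN to the next END (or to the end of input if END is missing), and the second sed, reading the lines in reverse, keeps only what lies between an END and the preceding BEGIN, which drops any unterminated block.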
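To see the effect of each pass, here is a small sketch on made-up input. The sample data, the temporary file, and the variable names are assumptions; it uses `sed -n ... p` instead of `p;d`, which should behave the same here, and it needs GNU `tac`.

```shell
# Hypothetical sample: one complete block, then a BEGIN that is never closed.
input=$(mktemp)
printf '%s\n' 'a' '***** BEGIN *****' 'one' '***** END *****' \
              'b' '***** BEGIN *****' 'two' > "$input"

begin='\*\*\*\*\* BEGIN \*\*\*\*\*'
end='\*\*\*\*\* END \*\*\*\*\*'

# Pass 1: BEGIN..END ranges; a range with no END runs to the end of input,
# so the unterminated block is still present at this stage.
pass1=$(sed -n "/$begin/,/$end/p" "$input")

# Pass 2: on the reversed lines, keep END..BEGIN ranges only. The lines of
# the unterminated block come before any END, so they are dropped.
result=$(printf '%s\n' "$pass1" | tac | sed -n "/$end/,/$begin/p" | tac)

printf '%s\n' "$result"
# Prints:
# ***** BEGIN *****
# one
# ***** END *****
rm -f "$input"
```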
With pcregrep:
pcregrep -M '(?s)BEGIN.*?END'
That also works if BEGIN and END are on the same line, but not in cases like:
BEGIN 1 END foo BEGIN 2
END
Where pcregrep catches the first BEGIN 1 END, but not the second one.
To handle those, with awk, you could do:
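awk '
  # Not inside a block yet: look for a BEGIN on this line.
  !inside {
    if (match($0, /^.*BEGIN/)) {
      inside = 1
      # Keep everything up to the last BEGIN; the rest is scanned for END.
      remembered = substr($0, 1, RLENGTH)
      $0 = substr($0, RLENGTH + 1)
    } else next
  }
  # Inside a block: print it once END shows up, otherwise keep buffering.
  {
    if (match($0, /^.*END/)) {
      print remembered $0
      # A BEGIN after the last END opens a new block on the same line.
      if (substr($0, RLENGTH+1) ~ /BEGIN/)
        remembered = ""
      else
        inside = 0
    } else
      remembered = remembered $0 ORS
  }'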
On an input like:
a
BEGIN blah END BEGIN 1
2
END
b
BEGIN foo END
c
BEGIN
bar
END BEGIN
baz END
d
BEGIN
xxx
It gives:
BEGIN blah END BEGIN 1
2
END
BEGIN foo END
BEGIN
bar
END BEGIN
baz END
Both the pcregrep and the awk approaches need to store everything from the BEGIN to the following END in memory. So if you have a huge file whose first line contains BEGIN but without an END, the whole file will be stored in memory for nothing.
The only way around that would be to process the file twice, but of course that could only be done when the input is a regular file (not a pipe for instance).
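A minimal sketch of that two-pass idea with awk, reading the same regular file twice. This is not the approach from the answers above: it is a simplified, line-granular variant that assumes BEGIN and END appear on separate lines, and the sample data and the names `first`, `last`, `n` and `k` are made up for illustration. The first pass remembers only line numbers, so an unterminated BEGIN costs no memory.

```shell
# Hypothetical sample: one complete block, then a BEGIN that is never closed.
file=$(mktemp)
printf '%s\n' a BEGIN one END b BEGIN two > "$file"

out=$(awk '
  # Pass 1 (first reading): record the line numbers of each BEGIN that
  # is followed by an END. No line content is stored.
  NR == FNR {
    if (!inside && /BEGIN/) { start = FNR; inside = 1 }
    if (inside && /END/)    { first[++n] = start; last[n] = FNR; inside = 0 }
    next
  }
  # Pass 2 (second reading): print the lines that fall in a recorded range.
  FNR == 1 { k = 1 }
  k <= n && FNR >= first[k] { print }
  k <= n && FNR == last[k]  { k++ }
' "$file" "$file")

printf '%s\n' "$out"
# Prints:
# BEGIN
# one
# END
rm -f "$file"
```

Because the file name is passed twice, this cannot be fed from a pipe, which is the limitation mentioned above.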