sed ranges aren't always able to match only one line

Yes, that's an annoying thing about sed (see the sed FAQ about that). Since you're using GNU sed (-r is GNU specific), you can do:

 sed -En "0,/$1/p"

(I prefer -E over -r because it is also supported by some other sed implementations, such as FreeBSD's, and is consistent with grep and a few other tools. It is also going to be in the next issue of the POSIX/Single UNIX Specification standard.)

A better alternative (and portable) would be:

sed "/$1/q"

This tells sed to quit (and stop reading) after the first match.
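As a minimal sketch of the quit approach, the command can be wrapped in a function. The name upto_first_match is hypothetical, and the pattern in $1 is assumed to contain no unescaped slashes, since it is interpolated straight into the sed script:

```shell
# Hypothetical helper: print stdin up to and including the first line
# matching the basic regular expression in $1, then stop reading.
upto_first_match() {
    sed "/$1/q"
}

# The first line already matches, so only that line is printed.
printf '%s\n' 1 2 3 1 | upto_first_match 1
# → 1

# Otherwise everything up to the first match is printed.
printf '%s\n' 1 2 3 1 | upto_first_match 3
# → 1
#   2
#   3
```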

Note that awk doesn't have the issue so you can write:

PATTERN=$1 awk 'NR==1, $0 ~ ENVIRON["PATTERN"]'

(though like for sed, you'd rather write):

PATTERN=$1 awk '1; $0 ~ ENVIRON["PATTERN"] {exit}'
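The two awk variants can be compared side by side with a small sketch. The function names are hypothetical, and note that awk treats the pattern as an extended regular expression rather than a basic one:

```shell
# Hypothetical helpers wrapping the two awk commands.
# The pattern travels through the environment, so awk never has to
# parse it as part of the program text.
upto_awk_range() {
    # Range from record 1 to the first match; the end of the range is
    # tested on the first record too, so a one-line range is possible.
    PATTERN=$1 awk 'NR==1, $0 ~ ENVIRON["PATTERN"]'
}

upto_awk_exit() {
    # Print every record, and exit right after printing the first match.
    PATTERN=$1 awk '1; $0 ~ ENVIRON["PATTERN"] {exit}'
}

printf '%s\n' 1 2 3 1 | upto_awk_range 1   # → 1
printf '%s\n' 1 2 3 1 | upto_awk_exit 1    # → 1
```

Both print the same lines; the exit variant simply stops reading its input earlier.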

This is normal sed behavior. From the POSIX sed documentation:

Addresses in sed

An address is either a decimal number that counts input lines cumulatively across files, a '$' character that addresses the last line of input, or a context address (which consists of a BRE, as described in Regular Expressions in sed, preceded and followed by a delimiter, usually a slash).

An editing command with no addresses shall select every pattern space.

An editing command with one address shall select each pattern space that matches the address.

An editing command with two addresses shall select the inclusive range from the first pattern space that matches the first address through the next pattern space that matches the second. (If the second address is a number less than or equal to the line number first selected, only one line shall be selected.) Starting at the first line following the selected range, sed shall look again for the first address. Thereafter, the process shall be repeated. Omitting either or both of the address components in the following form produces undefined results:

[address[,address]]

As you can see, sed prints the inclusive range from the first address through the next line that matches the second address.

In your case, 1,/1/p, sed prints the first line because it matches address 1. Then, starting from the second line, sed looks for a line matching the second address, /1/, and stops printing once it finds one. Because no line from the second one onward matches /1/, sed prints the rest of the input.

With 1,/2/p, sed prints the first line as above; the second line then matches /2/, so sed prints it and the range ends. sed then looks for the first address again on the remaining lines, but none of them can match address 1, so nothing more is printed.

An example:

$ echo 1 2 3 1 4 1 | tr ' ' $'\n' | sed -rn '1,/1/p'
1
2
3
1
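The other cases described above can be checked with the same input. This is only a sketch: range_from_first is a hypothetical wrapper, and the pattern is assumed to contain no unescaped slashes:

```shell
# Hypothetical helper: print from line 1 through the next line
# (searched from line 2 onward) that matches the pattern in $1.
range_from_first() {
    sed -n "1,/$1/p"
}

# Line 2 matches /2/, so the range closes there.
printf '%s\n' 1 2 3 1 4 1 | range_from_first 2
# → 1
#   2

# Nothing matches /9/, so the range never closes and all six lines print.
printf '%s\n' 1 2 3 1 4 1 | range_from_first 9
```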

Because you use GNU sed, you can use the 0,addr2 form:

0,addr2
              Start  out  in  "matched  first  address"  state, until addr2 is
              found.  This is similar to 1,addr2, except that if addr2 matches
              the very first line of input the 0,addr2 form will be at the end
              of its range, whereas the 1,addr2 form  will  still  be  at  the
              beginning of its range.  This works only when addr2 is a regular
              expression.

So your command becomes:

seq 1 4 | tr ' ' $'\n' | sed -rn '0,/'"$1"'/p'

Then:

$ ./1.sh 1
1
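For a check that does not depend on a script file, the 0,addr2 form can be wrapped in a function. This sketch assumes GNU sed (the 0,addr2 form is a GNU extension); upto_gnu is a hypothetical name, and the pattern is assumed to contain no unescaped slashes:

```shell
# Hypothetical helper, GNU sed only: print from the start of input
# through the first line matching $1, even if that is line 1.
upto_gnu() {
    sed -En "0,/$1/p"
}

seq 1 4 | upto_gnu 1
# → 1

seq 1 4 | upto_gnu 3
# → 1
#   2
#   3
```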

There are several things you can do. For instance, your comment indicates you mean to:

...delete everything from the start of the file to some particular line, and that line happens to be the first one...

You can do this like:

sed -n "/$1"'/,$p'

You just reverse the form. The above command prints only from that particular line until the end of the file.

If you want to not print that particular line...

sed -n "/$1"'/,$p' | sed 1d

...should do the trick...
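Both forms can be sketched as functions. The names from_match and after_match are hypothetical, and the pattern is assumed to contain no unescaped slashes:

```shell
# Hypothetical helper: print from the first line matching $1
# through the end of input.
from_match() {
    sed -n "/$1"'/,$p'
}

# Hypothetical helper: the same, but drop the matching line itself.
after_match() {
    from_match "$1" | sed 1d
}

seq 5 | from_match 3
# → 3
#   4
#   5

seq 5 | after_match 3
# → 4
#   5
```

Because the range starts at the match instead of ending there, a match on the very first line needs no special handling.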

Else you could just address the line and take the cycle into your own hands.

seq 20 | sed -ne"/$1"'/!d;:B' -e'n;p;bB'
seq 20 | sed -n "/$1"'/!d;h;n;G;P;D'

Both commands delete every incoming line until encountering a $1 pattern.

The first one then sets up a :B label and overwrites pattern-space with the next input line. It prints that line and branches back to the :B label, where pattern-space is overwritten again with the next line. It loops in this way until end of file. This command is probably faster than the second - it does less.

The second one overwrites hold-space with the $1 match. It then also overwrites pattern-space with the next line on input. Next it Gets hold space and appends it to the input line it just pulled in - thus reversing the order of the two lines and delimiting between them with a newline character. Something like:

line 1 > hold space
line 2 > line 1
hold space >> line 2
= line 2 \n line 1

At this point sed Prints only up to the first \newline character occurring in pattern-space and Deletes same before restarting the cycle with the remainder - which winds up always being the line that first matched $1 for every line. So the first line matching $1 is always in pattern-space but never printed.

And so if $1 is 5, lines 6 through 20 of the seq 20 output are printed.
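A quick way to check both commands is to wrap them in functions and compare their output. The function names are hypothetical, the first script is split across -e options only for readability, and the pattern is assumed to contain no unescaped slashes:

```shell
# Hypothetical helper, branch-loop variant: delete lines until the first
# match, then repeatedly fetch the next line and print it.
after_match_loop() {
    sed -n -e "/$1/!d" -e ':B' -e 'n;p;bB'
}

# Hypothetical helper, hold-space variant: keep the matching line in
# pattern space across cycles and print only the line pulled in after it.
after_match_hold() {
    sed -n "/$1"'/!d;h;n;G;P;D'
}

seq 20 | after_match_loop 5   # prints 6 through 20
seq 20 | after_match_hold 5   # prints 6 through 20
```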
