Confused by sed output when using N. Can someone explain these results?

First, note that your solution doesn't really work. Consider this test file:

$ cat test1
Network
Administrator Network
Administrator

And then run the command:

$ sed '
 s/Network Administrator/System User/
 N
 s/Network\nAdministrator/System\nUser/
 s/Network Administrator/System User/
 ' test1
System
User Network
Administrator

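The problem is that the last Network\nAdministrator pair is never replaced: the Network at the end of line 2 and the Administrator on line 3 are handled in different cycles, so they are never in the pattern space together.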
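To see why, it can help to look at what N actually pairs up. The following is only a sketch: it feeds the contents of test1 through printf, drops the substitutions, and uses l to list the pattern space after each successful N. The variable name trace is purely illustrative.

```shell
# Contents of test1, supplied on stdin so no file is needed.
# -n suppresses automatic printing; l lists the pattern space unambiguously.
trace=$(printf 'Network\nAdministrator Network\nAdministrator\n' |
  sed -n 'N; l')
printf '%s\n' "$trace"
# prints: Network\nAdministrator Network$
```

Only lines 1 and 2 are ever joined. Line 3 starts a new cycle on its own, and N finds no further line to append, so the script ends before any substitution can see the second pair.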

This solution does work:

$ sed ':a; /Network$/{$!{N;ba}}; s/Network\nAdministrator/System\nUser/g; s/Network Administrator/System User/g' test1
System
User System
User

We can also apply this to your guide.txt:

$ sed ':a; /Network$/{$!{N;ba}}; s/Network\nAdministrator/System\nUser/g; s/Network Administrator/System User/g' guide.txt 
This guide is meant to walk you through a day as a System
User. By the end, hopefully you will be better
equipped to perform your duties as a System User
and maybe even enjoy being a System User that much more.
System User
System User
I'm a System User

The key is to keep reading in lines until you find one that does not end with Network. When that is accomplished, the substitutions can be done.

Compatibility Note: All the above use \n in the replacement text. This requires GNU sed. It will not work on BSD/OSX sed.

[Hat tip to Philippos.]

Multiline version

If it helps clarify, here is the same command split over multiple lines:

$ sed ':a
    /Network$/{
       $!{
           N
           ba
       }
    }
    s/Network\nAdministrator/System\nUser/g
    s/Network Administrator/System User/g
    ' filename

How it works

  1. :a

    This creates a label a.

  2. /Network$/{ $!{N;ba} }

    If this line ends with Network, then, if this is not the last line ($!) read and append the next line (N) and branch back to label a (ba).

  3. s/Network\nAdministrator/System\nUser/g

    Make the substitution with the intermediate newline.

  4. s/Network Administrator/System User/g

    Make the substitution with the intermediate blank.
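As a rough check of steps 1 and 2, one can replace the two substitutions with l and look at what the loop has collected by the time it stops. This is a sketch that assumes GNU sed (as the one-liner above does) and uses printf to stand in for test1; trace is just an illustrative variable name.

```shell
# Run only the label/branch loop, then list the pattern space with l.
trace=$(printf 'Network\nAdministrator Network\nAdministrator\n' |
  sed -n ':a; /Network$/{$!{N;ba}}; l')
printf '%s\n' "$trace"
# prints: Network\nAdministrator Network\nAdministrator$
```

Because every line that ends with Network pulls in its successor, all three lines end up in the pattern space at once, and both substitutions can then see each pair in full.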

Simpler solution (GNU only)

With GNU sed (not BSD/OSX), we only need one substitute command:

$ sed -zE 's/Network([[:space:]]+)Administrator/System\1User/g' test1
System
User System
User

And on the guide.txt file:

$ sed -zE 's/Network([[:space:]]+)Administrator/System\1User/g' guide.txt 
This guide is meant to walk you through a day as a System
User. By the end, hopefully you will be better
equipped to perform your duties as a System User
and maybe even enjoy being a System User that much more.
System User
System User
I'm a System User

In this case, -z tells sed to treat the NUL character, rather than newline, as the line separator. Since text files normally contain no NUL characters, this has the effect of reading the whole file in at once. We can then make the substitution without worrying about missing a line.

This method is not good if the file is huge (usually meaning gigabytes). If it is that large, then reading it all in at once might strain the system RAM.

Solution that works on both GNU and BSD sed

As suggested by Philippos, the following is a portable solution:

sed 'H;1h;$!d;x;s/Network\([[:space:]]\)Administrator/System\1User/g'

As you are learning sed, I'll take the time to add to @John1024's answer:

1) Please note that you are using \n in the replacement string. This works in GNU sed, but is not part of POSIX, so it will insert a backslash and an n in many other seds (using \n in the pattern is portable, btw).

Instead, I suggest s/Network\([[:space:]]\)Administrator/System\1User/g. The [[:space:]] matches a newline or any other whitespace character, so you don't need two s commands and can combine them into one. Surrounding it with \(...\) lets you refer to it in the replacement: \1 is replaced by whatever the first \(...\) pair matched.

2) To properly match patterns over two lines, you should know the N;P;D pattern:

 sed '$!N;s/Network\([[:space:]]\)Administrator/System\1User/g;P;D'

N always appends the next line, except on the last line; that is why it is "addressed" with $! (meaning "if not the last line"). You should always consider preceding N with $! to avoid accidentally ending the script. After the replacement, P prints only the first line of the pattern space, and D deletes that line and starts the next cycle with the remainder of the pattern space (without reading the next line). This is probably what you originally intended.

Remember this pattern, you will often need it.
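As a quick illustration, here is the N;P;D command run against the three-line test file from the first answer. This is a sketch, with the file contents supplied through printf and out as an illustrative variable name:

```shell
# Sliding two-line window: N appends, P prints the first line, D drops it.
out=$(printf 'Network\nAdministrator Network\nAdministrator\n' |
  sed '$!N;s/Network\([[:space:]]\)Administrator/System\1User/g;P;D')
printf '%s\n' "$out"
# prints:
# System
# User System
# User
```

Because D keeps the second line of the window for the next cycle, the Network at the end of line 2 is still in the pattern space when line 3 is appended, so the second pair is replaced as well.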

3) Another useful pattern for multiline editing, especially when more than two lines are involved: Hold space collecting, as I suggested to John:

sed 'H;1h;$!d;g;s/Network\([[:space:]]\)Administrator/System\1User/g'

I repeat it here so I can explain it: H appends each line to the hold space. As this would result in an extra newline before the first line, the first line needs to be copied instead of appended, which is what 1h does. The following $!d means "for all lines except the last one, delete the pattern space and start over". Thus, the rest of the script is only executed for the last line. At this point, the whole file is collected in the hold space (so don't use this for very large files!) and g copies it to the pattern space, so you can do all replacements at once, like you can with the -z option of GNU sed.

This is another useful pattern that I suggest keeping in mind.
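For comparison, here is the hold-space command applied to the same three-line test file used in the first answer. Again this is only a sketch, with printf standing in for the file and out as an illustrative variable name:

```shell
# Collect every line in the hold space, then substitute once on the last line.
out=$(printf 'Network\nAdministrator Network\nAdministrator\n' |
  sed 'H;1h;$!d;g;s/Network\([[:space:]]\)Administrator/System\1User/g')
printf '%s\n' "$out"
# prints:
# System
# User System
# User
```

The result matches the N;P;D version; the difference is that this one holds the whole input in memory before printing anything.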
