How to tell sed that "dot matches newline"

sed is a line-based tool. I don't think there is an option for this.
You can use h/H (hold) and g/G (get) to collect the lines in the hold space first.

$ echo -e 'one\ntwo\nthree' | sed -n '1h;1!H;${g;s/one.*two/one/p}'
one
three
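
For readers less used to the hold space, the one-liner above can be traced step by step. The following is only a rough Python sketch of what the sed script does, not how sed is implemented; the function name hold_then_substitute is made up for illustration.

```python
import re

def hold_then_substitute(text):
    """Rough model of: sed -n '1h;1!H;${g;s/one.*two/one/p}'"""
    hold = ""
    for number, pattern_space in enumerate(text.splitlines(), start=1):
        if number == 1:
            hold = pattern_space                 # 1h: copy the first line into the hold space
        else:
            hold = hold + "\n" + pattern_space   # 1!H: append a newline plus the line
    # ${g;...}: on the last line, copy the hold space back into the pattern space
    pattern_space = hold
    # s/one.*two/one/: the dot can now see the embedded newlines
    result, count = re.subn(r"one.*two", "one", pattern_space,
                            count=1, flags=re.DOTALL)
    # p under -n: output only if a substitution was made
    return result if count else None

print(hold_then_substitute("one\ntwo\nthree\n"))
```

The point of the sketch is that the substitution runs once, on the whole accumulated input, instead of once per line.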

Maybe you should try vim, where \_. matches any character including a line break:

:%s/one\_.*two/one/g

If you use GNU sed, a plain . can match any character, including line break characters. From its documentation:

.
         Matches any character, including newline.

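All you need is the -z option, which makes sed split its input on NUL bytes instead of newlines, so ordinary text is read into the pattern space as a whole: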

echo -e "one\ntwo\nthree" | sed -z 's/one.*two/one/'
# => one
#    three


However, one.*two might not be what you need since * is always greedy in POSIX regex patterns. So, one.*two will match the leftmost one, then any 0 or more chars as many as possible, and then the rightmost two. If you need to remove one, then any 0+ chars as few as possible, and then the leftmost two, you will have to use perl:

perl -i -0 -pe 's/one.*?two//sg' file             # Non-Unicode version
perl -i -CSD -Mutf8 -0 -pe 's/one.*?two//sg' file # S&R in a UTF8 file 

The -0 option sets the input record separator to the NUL character, so a file that contains no NUL bytes is read as a whole and not line by line (-0777 is the explicit slurp mode). -i enables in-place file modification, the s flag makes . match any char including line break chars, and .*? matches any 0 or more chars as few as possible because *? is non-greedy. The -CSD -Mutf8 part makes sure your input is decoded and your output is re-encoded correctly.
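
To see the difference between the greedy and the non-greedy quantifier on a small input, here is a minimal sketch in Python, whose re module supports the same *? syntax as Perl. The sample text is invented, and the replacement is "one" (as in the sed command) rather than the empty string used in the perl commands above.

```python
import re

# Invented sample: "two" occurs twice after "one".
text = "one\ntwo\nthree\ntwo\nfour\n"

# Greedy: .* runs on to the rightmost "two".
greedy = re.sub(r"one.*two", "one", text, flags=re.DOTALL)

# Non-greedy: .*? stops at the leftmost "two".
lazy = re.sub(r"one.*?two", "one", text, flags=re.DOTALL)

print(repr(greedy))  # 'one\nfour\n'
print(repr(lazy))    # 'one\nthree\ntwo\nfour\n'
```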


You can use python this way:

$ echo -e "one\ntwo\nthree" | python3 -c 'import re, sys; s = sys.stdin.read(); s = re.sub("(?s)one.*two", "one", s); sys.stdout.write(s)'
one
three

This reads all of Python's standard input (sys.stdin.read()), then substitutes "one" for "one.*two" with the dot-matches-all setting enabled (using (?s) at the start of the regular expression), and then writes the modified string back out (sys.stdout.write is used instead of print so that no extra newline is added).


This might work for you:

<<<$'one\ntwo\nthree' sed '/two/d'

or

<<<$'one\ntwo\nthree' sed '2d'

or

<<<$'one\ntwo\nthree' sed 'n;d'

or

<<<$'one\ntwo\nthree' sed 'N;N;s/two.//'

sed does match every character (including \n) with a dot, but by default it has already stripped the trailing \n from the line as part of its cycle, so the newline is no longer present in the pattern space to be matched.
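
The effect of that line-by-line cycle can be imitated in Python. This is a loose analogy rather than a model of sed's internals: the same pattern is applied once per line (newline already stripped) and once to the whole input.

```python
import re

text = "one\ntwo\nthree\n"

# Line by line, as in sed's default cycle: each line arrives without its
# newline, so "one.*two" never finds both words in the same string.
per_line = [re.sub(r"one.*two", "one", line, flags=re.DOTALL)
            for line in text.splitlines()]

# Whole input at once: the newlines are present, so the dot can cross them.
whole = re.sub(r"one.*two", "one", text, flags=re.DOTALL)

print(per_line)     # ['one', 'two', 'three']
print(repr(whole))  # 'one\nthree\n'
```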

Only certain commands (N, H and G) put newlines into the pattern or hold space.

  1. N appends a newline to the pattern space and then appends the next line.
  2. H does exactly the same except it acts on the hold space.
  3. G appends a newline to the pattern space and then appends whatever is in the hold space too.

The hold space is empty until you place something in it so:

sed G file

will insert an empty line after each line.

sed 'G;G' file

will insert two empty lines after each line, and so on.
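
As a rough illustration of why G produces blank lines, here is a toy Python model of the G command with a hold space that is never filled. The function name sed_G and its times parameter are hypothetical and exist only for this sketch.

```python
def sed_G(text, times=1):
    """Toy model of `sed G` (times=1) or `sed 'G;G'` (times=2)."""
    hold = ""  # nothing has been placed in the hold space
    out = []
    for pattern_space in text.splitlines():
        for _ in range(times):
            # G: append a newline, then the (empty) hold space
            pattern_space = pattern_space + "\n" + hold
        out.append(pattern_space)  # the pattern space is printed at the end of the cycle
    return "\n".join(out) + "\n" if out else ""

print(repr(sed_G("a\nb\n")))           # 'a\n\nb\n\n'
print(repr(sed_G("a\nb\n", times=2)))  # 'a\n\n\nb\n\n\n'
```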
