Why can I not group sed commands after an address in a block?

To output all lines of a file up to, but not including, the first line that matches a particular pattern, you may use

sed -n '/PATTERN/q; p;' file

Here, the default output of the pattern space at the end of each cycle is disabled with -n. Instead we explicitly output each line with p. If the given pattern matches, we halt processing with q.
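As a quick check of the behaviour described above, here is a minimal sketch on a made-up four-line input. The input lines and the pattern STOP are placeholders chosen for illustration, not anything from your data.

```shell
# Hypothetical input: four lines, with STOP standing in for PATTERN.
# -n disables automatic printing, so q exits without printing the matching line.
out=$(printf '%s\n' alpha beta STOP gamma | sed -n '/STOP/q; p')
printf '%s\n' "$out"
```

Only the lines before the match should be printed. The matching line and everything after it are dropped.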

Your actual, longer command renames chromosome 21 from just 21 to chr21 on the first line (the header) of a FASTA file, and then extracts the DNA for that chromosome until it reaches the next FASTA header line. It may be written as

sed -n -e '1 { s/^>21/>chr21/p; d; }' \
       -e '/^>/q' \
       -e p <in.fasta >out.fasta

or

sed -n '1 { s/^>21/>chr21/p; d; }; /^>/q; p' <in.fasta >out.fasta
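To see what the one-line variant does, here is a small sketch run against a made-up miniature FASTA stream. The header descriptions and sequence lines are invented for illustration, and real files will have far longer records.

```shell
# Hypothetical two-record FASTA input fed on standard input.
# Line 1: rename the header, print it, and start the next cycle with d.
# Later lines: quit at the next header, otherwise print the sequence line.
fasta_out=$(printf '%s\n' '>21 some description' ACGT TTGA '>22 another' GGCC |
    sed -n '1 { s/^>21/>chr21/p; d; }; /^>/q; p')
printf '%s\n' "$fasta_out"
```

The d on the first line is harmless here because nothing that needs to run for line 1 comes after it in the block.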

The issue with your original expression is that the d starts a new cycle (i.e., it forces the next line to be read into the pattern space and there's a jump to the start of the script). This means q would never be executed.
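The difference is easy to observe on a toy input. This is a sketch under assumed data (three lines a, b, c, with b as the pattern), comparing a block that contains d before q with a block that contains only q.

```shell
# With d first, the matching line is deleted and a new cycle starts,
# so q never runs and processing continues to the end of the input.
with_d=$(printf '%s\n' a b c | sed '/b/ { d; q; }')

# With q alone (and no -n), the matching line is printed and sed exits.
without_d=$(printf '%s\n' a b c | sed '/b/ { q; }')

printf 'with d:\n%s\n' "$with_d"
printf 'without d:\n%s\n' "$without_d"
```

In the first case the line c still appears in the output, which shows that sed did not stop at the match.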

To be syntactically correct with non-GNU sed implementations, your original script should be written as /PATTERN/ { d; q; }. Note the added ; after q (the spaces are not significant).


d does not just delete the pattern space. From the POSIX specification:

[2addr]d

Delete the pattern space and start the next cycle.

(my emphasis on "and start the next cycle")

The q command is therefore unreachable: whenever the address matches, d ends the current cycle before q can run.
