Grouped sorting of contiguous paragraphs (separated by blank lines)?

awk -v RS= -v cmd=sort '{print | cmd; close(cmd); print ""}' file

Setting the record separator RS to an empty string makes awk read the input one paragraph at a time. Each paragraph (in $0) is piped to cmd (which is set to sort), and sort writes the sorted lines to the output. close(cmd) waits for that sort to finish, so a fresh one is started for the next paragraph, and print "" emits a blank line to separate the output paragraphs.
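To see what paragraph mode does on its own, here is a minimal sketch that only counts the lines in each record instead of sorting them. The function names sample and count_paragraph_lines and the sample data are made up for illustration.

```shell
# Made-up sample input: two paragraphs separated by one blank line.
sample() {
    printf 'pear\napple\n\nzebra\nmango\nkiwi\n'
}

# With RS empty, each paragraph arrives as a single record in $0.
# split() on "\n" counts the lines inside that record.
count_paragraph_lines() {
    awk -v RS= '{ printf "record %d has %d lines\n", NR, split($0, lines, "\n") }'
}

sample | count_paragraph_lines
# record 1 has 2 lines
# record 2 has 3 lines
```

The blank line itself never shows up in $0, which is why the sorting command has to print "" to put a separator back.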

Since we're giving perl examples, here is an alternative to Stephane's approach:

perl -e 'undef $/; print join "\n", sort (split /\n/), "\n" 
    foreach(split(/\n\n/, <>))' < file

Unsetting the input record separator (undef $/) lets <> read the whole of STDIN in one go. We then split that on \n\n (paragraphs). For each paragraph, we split it into lines on newlines, sort them, join them back together and tack on a trailing \n.

However, this has the side effect of adding a trailing paragraph separator after the last paragraph (if it didn't have one before). You can get around that with the slightly less pretty:

perl -e 'undef $/; print join "\n", sort (split /\n/) , (\$_ == \$list[-1] ? "" : "\n")
    foreach(@list = split(/\n\n/, <>))' < file

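This assigns the paragraphs to @list, and a ternary operator then checks whether the current paragraph is the last element of the foreach (the \$_ == \$list[-1] check, which compares references rather than contents, so an earlier paragraph with the same text as the last one does not match). If it is the last (? ...), an empty string is appended; otherwise (: ...) "\n" is appended, as for all other paragraphs (elements of @list).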


Drav's awk solution is good, but it runs one sort command per paragraph. To avoid that, you could do:

< file awk -v n=0 '!NF{n++};{print n,$0}' | sort -k1n -k2 | cut -d' ' -f2-

Or you could do the whole thing in perl:

perl -ne 'if (/\S/){push@l,$_}else{print sort@l if@l;@l=();print}
          END{print sort @l if @l}' < file

Note that in the commands above, the separators are blank lines rather than empty lines: for the awk one, lines containing only space or tab characters; for the perl one, lines containing only horizontal or vertical spacing characters. If you want only empty lines to act as separators, replace !NF with !length or $0=="", and /\S/ with /./.
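It can help to look at what the awk stage emits before sort sees it. The sketch below splits the pipeline into two hypothetical functions (number_paragraphs and sort_paragraphs_once are names made up here) and uses the $0=="" variant, so only empty lines count as separators. LC_ALL=C is my addition, to keep the ordering independent of the locale.

```shell
# Hypothetical helper: prefix every line with its paragraph number.
# The separator line is numbered with the paragraph that follows it.
number_paragraphs() {
    awk -v n=0 '$0==""{n++};{print n,$0}'
}

# Sort numerically on the paragraph number, then on the rest of the line,
# and finally strip the number again.
sort_paragraphs_once() {
    number_paragraphs | LC_ALL=C sort -k1n -k2 | cut -d' ' -f2-
}

printf 'pear\napple\n\nzebra\nmango\n' | number_paragraphs
# 0 pear
# 0 apple
# 1 
# 1 zebra
# 1 mango

printf 'pear\napple\n\nzebra\nmango\n' | sort_paragraphs_once
# apple
# pear
#
# mango
# zebra
```

The separator becomes "1 " (number, space, nothing), which sorts ahead of the other lines numbered 1, so the blank line stays at the top of its paragraph.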


I wrote a tool in Haskell that allows you to use sort, shuf, tac or any other command on paragraphs of text.

https://gist.github.com/siers/01306a361c22f2de0122
EDIT: the tool is also included in this repo: https://github.com/siers/haskell-import-sort

It splits the text into blocks, joins the sub-blocks with a \0 character, pipes the result through the command, and finally does the same thing in reverse.
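The idea can be roughly approximated in shell. This is only a sketch of the approach as described, not the tool's actual code: paramap_sketch is a made-up name, and it joins lines with the \001 control character instead of \0, on the assumption that not every awk implementation copes with NUL bytes in strings.

```shell
# Hypothetical approximation. Usage: paramap_sketch COMMAND [ARGS...] < input
# 1. Fold each paragraph onto one line, joining its lines with \001.
# 2. Run the given line-oriented command on the folded paragraphs.
# 3. Unfold: restore the newlines and print a blank line after each paragraph.
paramap_sketch() {
    awk -v RS= -v sep='\001' '{ gsub(/\n/, sep); print }' |
        "$@" |
        awk -v sep='\001' '{ gsub(sep, "\n"); print; print "" }'
}

# Made-up sample: reorder whole paragraphs, leaving the lines inside them alone.
printf 'b\nx\n\na\ny\n' | paramap_sketch sort
```

Unlike the original input, the output of this sketch always ends with a blank line after the last paragraph, and it breaks if the text itself contains \001.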

28-08-2015: I found another, personal use for this tool: selecting N paragraphs after a line.

paramap grep -aA2 '^reddit usernames' < ~/my-username-file
reddit usernames

foo
bar
baz

a couple
more of these