Sort text files with multiple lines as a row

msort(1) was designed to sort files with multi-line records. It has an optional GUI as well as an ordinary command-line version that is usable by humans. (At least, humans who like to read manuals carefully and look for examples...)

AFAICT, you can't use an arbitrary pattern to delimit records, so unless your records are fixed-size (in bytes, not characters or lines), you have to get them into a form msort understands. msort does have a -b option for records that are blocks of lines separated by blank lines.

You can transform your input into a format that will work with -b pretty easily, by putting a blank line before every ###... (except the first one).

By default, it prints statistics on stderr, so at least it's easy to tell when it didn't sort because it thought the entire input was a single record.
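
If msort's statistics are not at hand, a rough way to check that the blank-line preprocessing really produced separate records is to count the paragraphs with awk. This is only a sketch: count_records is a hypothetical helper name, the sed expression assumes GNU sed (for \+ and for \n in the replacement), and awk's paragraph mode stands in for msort's own record splitting.

```shell
# Hypothetical helper: put a blank line before every line that starts with '#'
# (except line 1), then count the blank-line-separated records.
# Assumes GNU sed; RS = "" is awk's standard paragraph mode.
count_records() {
    sed '2,$ s/^#\+/\n&/' "$@" | awk 'BEGIN { RS = "" } END { print NR }'
}

# Usage: count_records unsorted.records
# A result of 1 means the whole input would be treated as a single record.
```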


msort works on your data. The sed command prepends a newline to every #+ line except for line 1. -w sorts the whole record (lexicographically). There are options for picking what part of a record to use as a key, but I didn't need them.

I also left out the step that strips the extra blank lines from the output.

$ sed '2,$ s/^#\+/\n&/' unsorted.records | msort -b -w 2>/dev/null 
####################################
KEY1
VAL11
VAL12
VAL13
VAL14

####################################
KEY2
VAL21
VAL22
VAL23
VAL24

####################################
KEY3
VAL31
VAL32
VAL33
VAL34
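
The stripping step that was left out could look something like the following. It is a minimal sketch that assumes the records themselves never contain blank lines, since every empty line is dropped; strip_blank_lines is a hypothetical helper name.

```shell
# Hypothetical helper: delete every empty line from stdin.
# Only safe when no record legitimately contains a blank line.
strip_blank_lines() {
    sed '/^$/d'
}

# Usage (assumes msort is installed and unsorted.records exists):
#   sed '2,$ s/^#\+/\n&/' unsorted.records | msort -b -w 2>/dev/null | strip_blank_lines
```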

I didn't have any luck with -r '#' to use that as the record separator. It thought the whole file was one record.


Another solution is to first replace the line feeds inside each block with an unused character of your choice ('|' in the example below), then sort the result, and finally change the chosen separator back to a line feed:

sed -e 'N; N; N; N; N; s/\n/|/g' file.txt \
| sort -k2,2 -t\| \
| sed 's/|/\n/g'
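
As a worked example of the pipeline above, here is a sketch run against made-up sample data: three 6-line records (separator line, key, four values) stored out of order. The sample contents are assumptions for illustration, and tr is swapped in for the final sed only because it converts '|' back to newlines without relying on GNU sed.

```shell
# Hypothetical sample data: three 6-line records, deliberately out of order.
sample=$(mktemp)
cat > "$sample" <<'EOF'
####################################
KEY3
VAL31
VAL32
VAL33
VAL34
####################################
KEY1
VAL11
VAL12
VAL13
VAL14
####################################
KEY2
VAL21
VAL22
VAL23
VAL24
EOF

# Join each group of 6 lines with '|', sort on the second field (the key),
# then turn every '|' back into a newline.
sorted=$(sed -e 'N; N; N; N; N; s/\n/|/g' "$sample" | sort -t'|' -k2,2 | tr '|' '\n')
printf '%s\n' "$sorted"
rm -f "$sample"
```

The number of N commands has to match the record length minus one, so this approach only fits records with a fixed number of lines.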

perl -0ne 'print sort /(#+[^#]*)/g' file.txt
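  • perl -0 sets the input record separator to the NUL byte, so a text file without NUL bytes is read in one go (-0777 is the explicit slurp mode)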
  • /(....)/g match and extract the records
  • print sort ... sort and print them