sort every field numerically, varying field counts

I think your problem is that you do not understand what sort is doing. The basic sort compares ASCII character values (at least in the C locale; other locales apply their own collation rules), where digits come before uppercase letters, which come before lowercase letters: '1' == 49, 'A' == 65, 'a' == 97. That explains the sort column, where '23' is sorted before '8 ', which is sorted before 'b b': the ASCII value of '2' is 50, of '8' is 56, and of 'b' is 98.
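A quick way to see this for yourself, as a minimal sketch: the three sample values are taken from the paragraph above, and LC_ALL=C is my addition to force plain byte-order comparison regardless of your locale settings.

```shell
# Plain (non-numeric) sort compares character by character.
# LC_ALL=C forces byte order so the result does not depend on the locale.
plain_sorted=$(printf '%s\n' 'b b' '8' '23' | LC_ALL=C sort)
printf '%s\n' "$plain_sorted"
# prints:
# 23
# 8
# b b
```

'23' comes first only because its first character '2' (50) is smaller than '8' (56); the second digit is never looked at.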

When sorting numerically (sort -n), non-numeric entries are interpreted as zero when compared to numbers such as 23 or 8, and are ordered among themselves by the regular method. Numeric entries are compared by value, not character by character, so '8' comes before '23'. As a result, the alphabetic entries sort before the (positive) numeric entries.
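The same kind of check for the numeric case, again only a sketch: the input values are illustrative, and LC_ALL=C is an assumption I add so that the tie between the two non-numeric lines is broken in byte order.

```shell
# With -n, 'a' and 'b b' both count as 0, so they land before 8 and 23.
# The tie between them is broken by the ordinary byte-order comparison.
numeric_sorted=$(printf '%s\n' 'b b' '8' '23' 'a' | LC_ALL=C sort -n)
printf '%s\n' "$numeric_sorted"
# prints:
# a
# b b
# 8
# 23
```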

Your best bet is to normalize the data so each column has the same type of value: either all numbers or all alpha-numeric, and sort appropriately.

In the last column (sorting by field), it will sort the entries with more fields first since you are explicitly specifying 4 (or more) fields. So (1,2,3) would be before (1,2). Without the -k option, sort takes the line as a whole into account.

You can read more information on the info coreutils sort page.


echo -e "b b 1\n23 44\nb 3\na 7\nb b 2\na 1\nb a 10\nb b 10\nb 1\nb a 1\n18 2\nb 10\n18 15\nb a 2\n23 9\nb 2" \
| sed -r 's/[a-z]/9999&/g' | sort -n -k1 -k2 -k3 | sed 's/9999//g' 
18 2
18 15
23 9
23 44
a 1
b 1
b 2
b 3
a 7
b 10
b a 1
b b 1
b a 2
b b 2
b a 10
b b 10

Is this what you want? Sort numerically where a field is numeric, with numbers before other characters?

I prefix every lowercase letter with a high number (9999) so that the strings sort last, and remove the 9999 again at the end.
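If you want to reuse the trick, it could be wrapped in a small function. This is only a sketch under a few assumptions: the name sort_numbers_first is made up, I use sed -E instead of -r (accepted by both GNU and BSD sed), I bound each key to a single field (-k1,1n and so on) rather than letting it run to the end of the line, and I assume at most three fields, lowercase letters only, and no real 9999 in the data (the final sed would strip it).

```shell
# Hypothetical helper: reads lines on stdin, writes them sorted on stdout.
sort_numbers_first() {
    # Step 1: prefix every lowercase letter, e.g. 'b a 1' -> '9999b 9999a 1'.
    # Step 2: sort fields 1 to 3 numerically, one field per key.
    # Step 3: strip the 9999 prefix again.
    sed -E 's/[a-z]/9999&/g' |
        LC_ALL=C sort -k1,1n -k2,2n -k3,3n |
        sed 's/9999//g'
}

prefix_sorted=$(printf '%s\n' 'b 2' '18 15' 'a 1' 'b a 1' '18 2' | sort_numbers_first)
printf '%s\n' "$prefix_sorted"
# prints:
# 18 2
# 18 15
# a 1
# b 2
# b a 1
```

As in the output further up, lines that start with a letter are ordered by their next numeric field first; the letter itself only decides ties.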