Listing directories based on size from largest to smallest on a single line

If you are confident that the directory names do not contain whitespace, then it is simple to get all the directory names on one line:

du -sk [a-z]*/ 2>/dev/null | sort -nr | awk '{printf $2" "}'

Getting the information into Python

To capture that output in a Python program as a list, using Python 2.7 or later:

import subprocess
dir_list = subprocess.check_output("du -sk [a-z]*/ 2>/dev/null | sort -nr | awk '{printf $2\" \"}'", shell=True).split()

In Python 2.6, which lacks check_output:

import subprocess
dir_list = subprocess.Popen("du -sk [a-z]*/ 2>/dev/null | sort -nr | awk '{printf $2\" \"}'", shell=True, stdout=subprocess.PIPE).communicate()[0].split()

We can also take advantage of Python's features to reduce the amount of work done by the shell and, in particular, to eliminate the need for awk:

subprocess.Popen("du -sk [a-z]*/ | sort -nr", shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()[0].split()[1::2]
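
To see why the [1::2] slice works: split() breaks the du output into alternating size and name tokens, so every second token starting at index 1 is a name. A small illustration follows; the sample output (sizes and names) is made up for the example and is not real du output:

```python
# Hypothetical `du -sk ... | sort -nr` output: size, a tab, then the name.
sample = "2048\tvideos/\n512\tdocs/\n4\ttmp/\n"

tokens = sample.split()   # ['2048', 'videos/', '512', 'docs/', '4', 'tmp/']
names = tokens[1::2]      # every second token, starting at index 1
print(names)              # ['videos/', 'docs/', 'tmp/']

# A name containing whitespace breaks the alternation, which is why this
# shortcut is only safe when names are known to be free of whitespace.
broken = "8\tmy dir/\n4\ttmp/\n".split()[1::2]
print(broken)             # ['my', '4']
```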

One could go further and read the du output directly into Python, convert the sizes to integers, and sort on size. It is simpler, though, just to do this with sort -nr in the shell.
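
For comparison, here is a minimal sketch of that all-Python variant. The function name parse_du is an invention for this example. It assumes that du separates size and name with a tab and that no name contains a newline; spaces in names are handled.

```python
import subprocess

def parse_du(text):
    """Turn `du -sk` output into (size_kb, name) pairs, largest first."""
    entries = []
    for line in text.splitlines():
        if not line:
            continue
        # Split on the first tab only, so spaces in the name survive.
        size, name = line.split("\t", 1)
        entries.append((int(size), name))
    entries.sort(key=lambda entry: entry[0], reverse=True)
    return entries

if __name__ == "__main__":
    # No sort or awk in the pipeline: Python does the sorting.
    proc = subprocess.Popen("du -sk [a-z]*/ 2>/dev/null", shell=True,
                            stdout=subprocess.PIPE, universal_newlines=True)
    out = proc.communicate()[0]
    print(" ".join(name for _size, name in parse_du(out)))
```

Keeping the sizes as integers also makes it easy to filter or total them later, which the awk version throws away.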

Specifying a directory

If the directories whose size you want are not in the current directory, there are two possibilities:

du -sk /some/path/[a-z]*/ 2>/dev/null | sort -nr | awk '{printf $2" "}'

and also:

cd /some/path/ && du -sk [a-z]*/ 2>/dev/null | sort -nr | awk '{printf $2" "}'

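The difference between the two is whether /some/path is included in the output: the first prints each name with the /some/path/ prefix, while the second prints names relative to /some/path.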


Using paste

du -sk [a-z]* 2>/dev/null | sort -nr | cut -f2- | paste -s -

zsh has the ability to sort its globs using globbing qualifiers. You can also define your own glob qualifiers with functions. For instance:

zdu() REPLY=$(du -s -- "$REPLY")

print -r -- [[:alpha:]]*(/nO+zdu)

would print the directories (/) whose names start with a letter, sorted numerically (n) in reverse order (O) by the output of the zdu function. (Note that [a-z] only makes sense in the C locale.)

Note that when you run:

du -s a b

and a and b contain hard links to the same files, their disk usage is counted for a but not for b. The zsh approach here avoids that, because it calls du separately for each directory.
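
The hard-link effect can be illustrated with a simplified model in Python. This is only a sketch, not how du is implemented: it sums st_size of regular files rather than allocated blocks, and all function names are made up for the example.

```python
import os

def file_bytes(top, seen):
    """Sum st_size of files under top, skipping inodes already in seen."""
    total = 0
    for root, _dirs, files in os.walk(top):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            key = (st.st_dev, st.st_ino)  # identifies the file, whatever its name
            if key in seen:
                continue                  # another hard link to a counted file
            seen.add(key)
            total += st.st_size
    return total

def sizes_shared(dirs):
    """Models a single `du -s a b` call: one seen-set for all arguments."""
    seen = set()
    return [file_bytes(d, seen) for d in dirs]

def sizes_separate(dirs):
    """Models one `du -s` call per directory: a fresh seen-set each time."""
    return [file_bytes(d, set()) for d in dirs]
```

If b contains only a hard link to a 1000-byte file in a, sizes_shared reports 1000 for a and 0 for b, while sizes_separate reports 1000 for each.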

If you're going to use Python, I'd do the same from there: call du -s for each of the files, and sort the resulting list in Python. Remember that file names can contain any character, including space, tab and newline.
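
A minimal sketch of that approach follows. The helper names du_kb, dirs_by_size and subdirs are invented for this example. It assumes a du that accepts -k and the "--" end-of-options marker. Because no shell is involved, names containing spaces, tabs or newlines are passed through unchanged.

```python
import os
import subprocess

def du_kb(path):
    """Disk usage of path in kilobytes, from its own `du -sk` call."""
    # An argument list (no shell) keeps odd characters in the name intact;
    # "--" stops a name starting with "-" from being read as an option.
    out = subprocess.check_output(["du", "-sk", "--", path])
    return int(out.split(None, 1)[0])  # the size is the first field

def dirs_by_size(paths, size_of=du_kb):
    """Return paths sorted from largest to smallest."""
    return sorted(paths, key=size_of, reverse=True)

def subdirs(top="."):
    """Immediate subdirectories of top (not symlinks) whose names start with a letter."""
    result = []
    for name in os.listdir(top):
        full = os.path.join(top, name)
        if name[:1].isalpha() and os.path.isdir(full) and not os.path.islink(full):
            result.append(full)
    return result

if __name__ == "__main__":
    for path in dirs_by_size(subdirs(".")):
        print(repr(path))  # repr makes embedded whitespace visible
```

One caveat: check_output raises CalledProcessError when du exits with a non-zero status, for instance if it cannot read a subdirectory, so a real script would probably want to catch that.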