Find where inodes are being used

I saw this question over on stackoverflow, but I didn't like any of the answers, and it really is a question that should be here on U&L anyway.

Basically an inode is used for each file on the filesystem. So running out of inodes generally means you've got a lot of small files laying around. So the question really becomes, "what directory has a large number of files in it?"

In this case, the filesystem we care about is the root filesystem /, so we can use the following command:

{ find / -xdev -printf '%h\n' | sort | uniq -c | sort -k 1 -n; } 2>/dev/null

This will dump a list of every directory on the filesystem prefixed with the number of files (and subdirectories) in that directory. Thus the directory with the largest number of files will be at the bottom.

In my case, this turns up the following:

   1202 /usr/share/man/man1
   2714 /usr/share/man/man3
   2826 /var/lib/dpkg/info
 306588 /var/spool/postfix/maildrop

So basically /var/spool/postfix/maildrop is consuming all the inodes.

*Note, this answer does have three caveats that I can think of. It does not properly handle anything with newlines in the path. I know my filesystem has no files with newlines, and since this is only being used for human consumption, the potential issue isn't worth solving and one can always replace the \n with \0 and use -z options for the sort and uniq commands above as following:

{ find / -xdev -printf '%h\0' |sort -z |uniq -zc |sort -zk1rn; } 2>/dev/null

Optionally you can add head -zn10 to the command to get top 10 most used inodes.

It also does not handle if the files are spread out among a large number of directories. This isn't likely though, so I consider the risk acceptable. It will also count hard links to a same file (so using only one inode) several times. Again, unlikely to give false positives*


The key reason I didn't like any of the answers on the stackoverflow answer is they all cross filesystem boundaries. Since my issue was on the root filesystem, this means it would traverse every single mounted filesystem. Throwing -xdev on the find commands wouldn't even work properly.
For example, the most upvoted answer is this one:

for i in `find . -type d `; do echo `ls -a $i | wc -l` $i; done | sort -n

If we change this instead to

for i in `find . -xdev -type d `; do echo `ls -a $i | wc -l` $i; done | sort -n

even though /mnt/foo is a mount, it is also a directory on the root filesystem, so it'll turn up in find . -xdev -type d, and then it'll get passed to the ls -a $i, which will dive into the mount.

The find in my answer instead lists the directory of every single file on the mount. So basically with a file structure such as:

/foo/bar
/foo/baz
/pop/tart

we end up with

/foo
/foo
/pop

So we just have to count the number of duplicate lines.


This is reposted from here at the asker's behest:

du --inodes -S | sort -rh | sed -n \
        '1,50{/^.\{71\}/s/^\(.\{30\}\).*\(.\{37\}\)$/\1...\2/;p}'

And if you want to stay in the same filesystem you do:

du --inodes -xS

Here's some example output:

15K     /usr/share/man/man3
4.0K    /usr/lib
3.6K    /usr/bin
2.4K    /usr/share/man/man1
1.9K    /usr/share/fonts/75dpi
...
519     /usr/lib/python2.7/site-packages/bzrlib
516     /usr/include/KDE
498     /usr/include/qt/QtCore
487     /usr/lib/modules/3.13.6-2-MANJARO/build/include/config
484     /usr/src/linux-3.12.14-2-MANJARO/include/config

NOW WITH LS:

Several people mentioned they do not have up-to-date coreutils and the --inodes option is not available to them. So, here's ls:

ls ~/test -AiR1U | 
sed -rn '/^[./]/{h;n;};G;
    s|^ *([0-9][0-9]*)[^0-9][^/]*([~./].*):|\1:\2|p' | 
sort -t : -uk1.1,1n |
cut -d: -f2 | sort -V |
uniq -c |sort -rn | head -n10

If you're curious, the heart-and-soul of that tedious bit of regex there is replacing the filename in each of ls's recursive search results with the directory name in which it was found. From there it's just a matter of squeezing repeated inode numbers then counting repeated directory names and sorting accordingly.

The -U option is especially helpful with the sorting in that it specifically does not sort, and instead presents the directory list in original order - or, in other words, by inode number.

And of course -1 is incredibly helpful in that it ensures a single result per line, regardless of possibly included newlines in filenames or other spectacularly unfortunate problems that might occur when you attempt to parse a list.

And of course -A for all and -i for inode and -R for recursive and that's the long and short of it.

The underlying method to this is that I replace every one of ls's filenames with its containing directory name in sed. Following on from that... Well, I'm a little fuzzy myself. I'm fairly certain it's accurately counting the files, as you can see here:

% _ls_i ~/test
> 100 /home/mikeserv/test/realdir
>   2 /home/mikeserv/test
>   1 /home/mikeserv/test/linkdir

This is providing me pretty much identical results to the du command:

DU:

15K     /usr/share/man/man3
4.0K    /usr/lib
3.6K    /usr/bin
2.4K    /usr/share/man/man1
1.9K    /usr/share/fonts/75dpi
1.9K    /usr/share/fonts/100dpi
1.9K    /usr/share/doc/arch-wiki-markdown
1.6K    /usr/share/fonts/TTF
1.6K    /usr/share/dolphin-emu/sys/GameSettings
1.6K    /usr/share/doc/efl/html

LS:

14686   /usr/share/man/man3:
4322    /usr/lib:
3653    /usr/bin:
2457    /usr/share/man/man1:
1897    /usr/share/fonts/100dpi:
1897    /usr/share/fonts/75dpi:
1890    /usr/share/doc/arch-wiki-markdown:
1613    /usr/include:
1575    /usr/share/doc/efl/html:
1556    /usr/share/dolphin-emu/sys/GameSettings:

I think the include thing just depends on which directory the program looks at first - because they're the same files and hardlinked. Kinda like the thing above. I could be wrong about that though - and I welcome correction...

DU DEMO

% du --version
> du (GNU coreutils) 8.22

Make a test directory:

% mkdir ~/test ; cd ~/test
% du --inodes -S
> 1       .

Some children directories:

% mkdir ./realdir ./linkdir
% du --inodes -S
> 1       ./realdir
> 1       ./linkdir
> 1       .

Make some files:

% printf 'touch ./realdir/file%s\n' `seq 1 100` | . /dev/stdin
% du --inodes -S
> 101     ./realdir
> 1       ./linkdir
> 1       .

Some hardlinks:

% printf 'n="%s" ; ln ./realdir/file$n ./linkdir/link$n\n' `seq 1 100` | 
    . /dev/stdin
% du --inodes -S
> 101     ./realdir
> 1       ./linkdir
> 1       .

Look at the hardlinks:

% cd ./linkdir
% du --inodes -S
> 101

% cd ../realdir
% du --inodes -S
> 101

They're counted alone, but go one directory up...

% cd ..
% du --inodes -S
> 101     ./realdir
> 1       ./linkdir
> 1       .

Then I ran my ran script from below and:

> 100     /home/mikeserv/test/realdir
> 100     /home/mikeserv/test/linkdir
> 2       /home/mikeserv/test

And Graeme's:

> 101 ./realdir
> 101 ./linkdir
> 3 ./

So I think this shows that the only way to count inodes is by inode. And because counting files means counting inodes, you cannot doubly count inodes - to count files accurately inodes cannot be counted more than once.


I used this answer from SO Q&A titled: Where are all my inodes being used? when our NAS ran out about 2 years ago:

$ find . -type d -print0 \
    | while IFS= read -rd '' i; do echo $(ls -a "$i" | wc -l) "$i"; done \
    | sort -n

Example

$ find . -type d -print0 \
    | while IFS= read -rd '' i; do echo $(ls -a "$i" | wc -l) "$i"; done \
    | sort -n
...
110 ./MISC/nodejs/node-v0.8.12/out/Release/obj.target/v8_base/deps/v8/src
120 ./MISC/nodejs/node-v0.8.12/doc/api
123 ./apps_archive/monitoring/nagios/nagios-check_sip-1.3/usr/lib64/nagios
208 ./MISC/nodejs/node-v0.8.12/deps/openssl/openssl/doc/crypto
328 ./MISC/nodejs/node-v0.8.12/deps/v8/src
453 ./MISC/nodejs/node-v0.8.12/test/simple

Checking device's Inodes

Depending on your NAS it may not offer a fully featured df command. So in these cases you can resort to using tune2fs instead:

$ sudo tune2fs -l /dev/sda1 |grep -i inode
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super huge_file uninit_bg dir_nlink extra_isize
Inode count:              128016
Free inodes:              127696
Inodes per group:         2032
Inode blocks per group:   254
First inode:              11
Inode size:           128
Journal inode:            8
Journal backup:           inode blocks

Crossing filesystem boundaries

You can use the -xdev switch to direct find to narrow it's search to only the device where you're initiating the search.

Example

Say I have my /home directory automounting via NFS shares from my NAS, whose name is mulder.

$ df -h /home/sam 
Filesystem            Size  Used Avail Use% Mounted on
mulder:/export/raid1/home/sam
                      917G  572G  299G  66% /home/sam

Notice that the mount point is still considered local to the system.

$ df -h /home/ .
Filesystem            Size  Used Avail Use% Mounted on
-                        0     0     0   -  /home
/dev/mapper/VolGroup00-LogVol00
                      222G  159G   52G  76% /

Now when I initiate find:

$ find / -xdev  | grep '^/home'
/home

It found /home but none of the automounted contents because they're on a different device!

Filesystem types

You can utilize the switch to find, -fstype to control which type's of filesystems find will look into.

   -fstype type
          File is on a filesystem of type type.  The valid filesystem types 
          vary among different versions of Unix; an incomplete list of 
          filesystem  types that are accepted on some version of Unix or 
          another is: ufs, 4.2, 4.3, nfs, tmp, mfs, S51K, S52K.  You can use 
          -printf with the %F directive to see the types of your
          filesystems.

Example

What filesystem's do I have?

$ find . -printf "%F\n" | sort -u
ext3

So you can use this to control the crossing:

only ext3

$ find . -fstype ext3 | head -5
.
./gdcm
./gdcm/gdcm-2.0.16
./gdcm/gdcm-2.0.16/Wrapping
./gdcm/gdcm-2.0.16/Wrapping/CMakeLists.txt

only nfs

$ find . -fstype nfs | head -5
$ 

ext3 & ext4

$ find . -fstype ext3 -o -fstype ext4 | head -5
.
./gdcm
./gdcm/gdcm-2.0.16
./gdcm/gdcm-2.0.16/Wrapping
./gdcm/gdcm-2.0.16/Wrapping/CMakeLists.txt