How to delete millions of files without disturbing the server

Solution 1:

Make a bash script like this:

#!/bin/bash
rm -- "$@"
sleep 0.5

Save it to a file, for example deleter.sh, and run chmod u+x deleter.sh to make it executable.

This script deletes all files passed to it as arguments, then sleeps for 0.5 seconds.

Then, you can run

find cache.bak -type f -print0 | xargs -0 -n 5 ./deleter.sh

This command retrieves a list of all files in cache.bak and passes five filenames at a time to the delete script.

So, you can adjust how many files are deleted at a time, and how long a delay is between each delete operation.
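A quick end-to-end run of the approach on a throwaway directory might look like this (deleter.sh is the example name used above; the sleep and batch size are shortened so the demo finishes quickly):

```shell
#!/bin/sh
# Throwaway stand-in for the real cache directory
mkdir -p cache.bak/sub
touch cache.bak/a cache.bak/b cache.bak/sub/c

# Write the delete script: remove the arguments, then pause briefly
printf '%s\n' '#!/bin/sh' 'rm -- "$@"' 'sleep 0.1' > deleter.sh
chmod u+x deleter.sh

# Feed two filenames per invocation instead of five, just to show the batching
find cache.bak -type f -print0 | xargs -0 -n 2 ./deleter.sh

find cache.bak -type f | wc -l    # no files left; directories remain
```

Because xargs invokes the script repeatedly, the sleep inside it becomes a pause between every batch, which is what spreads the I/O out over time.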

Solution 2:

You should consider keeping your cache on a separate filesystem that you can mount/unmount, as someone stated in the comments. Until you do, you can use this one-liner, assuming your find binary is located under /usr/bin and you want to see the progress on screen:

/usr/bin/find /path/to/files/ -type f -print0 -exec sleep 0.2 \; -exec echo \; -delete

Adjust the sleep accordingly, so you don't overstress your HDD.
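The separate-filesystem idea can be sketched with a loopback image; the image path, size, and mount point below are illustrative, the commands need root, and ext2 is chosen because it has no journal (see the journal discussion in Solution 3):

```shell
# Create a 1 GB image file and format it without a journal (ext2).
# All names and sizes here are illustrative; run as root.
dd if=/dev/zero of=/var/cache.img bs=1M count=1024
mkfs.ext2 -F /var/cache.img

# Mount it where the cache lives
mkdir -p /path/to/cache
mount -o loop /var/cache.img /path/to/cache

# Later, "deleting millions of files" becomes a quick re-format:
umount /path/to/cache
mkfs.ext2 -F /var/cache.img
mount -o loop /var/cache.img /path/to/cache
```

The re-format takes seconds regardless of how many files the cache held, which is exactly the property the comments were pointing at.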

Solution 3:

You may want to try ionice on a script consuming the output of a find command. Something like the following (saved as, say, cleanup.sh):

#!/bin/bash
for file in $(find cache.bak -type f); do
    rm "$file"
done
for dir in $(find cache.bak -depth -type d -empty); do
    rmdir "$dir"
done

Run it in the idle I/O scheduling class, so it only gets disk time when nothing else needs it:

ionice -c3 ./cleanup.sh

Depending on the filesystem, each file deletion may result in rewriting that entire directory. For large directories that can be quite a hit. There are additional updates required to the inode table, and possibly a free-space list.

If the filesystem has a journal, changes are written to the journal, applied, and then removed from the journal. This increases the I/O requirements of write-intensive activity.

You may want to use a filesystem without a journal for the cache.

Instead of ionice, you can use a sleep command to rate limit the actions. This will work even if ionice does not, but it will take a long time to delete all your files.
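That sleep-based throttling can be sketched with the same loop structure; the directory below is a throwaway demo stand-in and the delay is shortened so the example runs quickly:

```shell
#!/bin/sh
# Throwaway demo directory standing in for the real cache
mkdir -p cache.bak
touch cache.bak/a cache.bak/b cache.bak/c

# Delete one file at a time, sleeping between removals to spread out the I/O
for file in $(find cache.bak -type f); do
    rm "$file"
    sleep 0.01    # demo value; use something like 0.2-0.5 s on a real server
done

find cache.bak -type f | wc -l    # no files left
```

Unlike ionice, which depends on the kernel's I/O scheduler honoring the idle class, this works everywhere, at the cost of a much longer total run time.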

Solution 4:

I got many useful answers and comments here, which I'd like to summarize, along with my own solution.

  1. Yes, the best way to prevent such a thing from happening is to keep the cache dir on a separate filesystem. Nuking / quick-formatting a filesystem always takes a few seconds (maybe minutes) at most, regardless of how many files / dirs were present on it.

  2. The ionice / nice solutions didn't do anything, because the deleting process actually caused almost no I/O. What caused the I/O was, I believe, kernel / filesystem level queues / buffers filling up when files were deleted too quickly by the delete process.

  3. The way I solved it is similar to Tero Kilkanen's solution, but it didn't require calling a shell script. I used rsync's built-in --bwlimit switch to limit the speed of deleting.

Full command was:

mkdir empty_dir
rsync -v -a --delete --bwlimit=1 empty_dir/ cache.bak/

Now, bwlimit specifies bandwidth in kilobytes, which in this case applied to the filenames or paths of the files. By setting it to 1 KBps, it was deleting around 100,000 files per hour, or 27 files per second. Files had relative paths like cache.bak/e/c1/db98339573acc5c76bdac4a601f9ec1e, which is 47 characters long, so it would give 1000/47 ~= 21 files per second, roughly in line with the observed 100,000 files per hour.
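The path-length arithmetic can be checked directly in the shell, using the sample path quoted above:

```shell
#!/bin/sh
path='cache.bak/e/c1/db98339573acc5c76bdac4a601f9ec1e'

# --bwlimit=1 allows roughly 1000 bytes of "transfer" per second, and each
# deleted file costs about one pathname's worth of that budget.
echo "${#path}"               # 47 characters
echo $(( 1000 / ${#path} ))   # about 21 files per second
```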

Now why --bwlimit=1? I tried various values:

  • 10000, 1000, 100 -> system slowing down like before
  • 10 -> system working quite well for a while, but produces partial slowdowns once a minute or so. HTTP response times still < 1 sec.
  • 1 -> no system slowdown at all. I'm not in a hurry and 2 million files can be deleted in < 1 day this way, so I chose it.

I like the simplicity of rsync's built-in method, but this solution depends on the length of the relative paths. Not a big problem, as most people would find the right value via trial and error.
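For completeness, the empty-directory rsync trick can be tried end to end on a scratch directory; --bwlimit is omitted here so the demo finishes instantly, and the paths are throwaway stand-ins for the real cache:

```shell
#!/bin/sh
# Scratch stand-in for the real cache
mkdir -p cache.bak/e/c1
touch cache.bak/e/c1/db98339573acc5c76bdac4a601f9ec1e

# Sync an empty directory over it; --delete removes everything extra
mkdir -p empty_dir
rsync -a --delete empty_dir/ cache.bak/

find cache.bak -type f | wc -l    # no files remain
```

On a real server you would add --bwlimit=1 (or whatever value trial and error suggests) to slow the deletion down to a rate the disk can absorb.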