On Linux, what is a faster way than `find` or `diff -r` to see if something inside a directory has changed?

Solution 1:

GNU tar has a --newer-mtime option that takes a date argument, which would presumably be the time of your last backup. Which date you choose depends on how much work you are willing to do at restore time: if you use the date of the last full backup, you only need to restore the full dump and the latest daily archive; if you use the date of the last incremental, you need to restore the full dump and every dump made after it.

This option does rely on the modification timestamp on the file, so if that has been explicitly changed, then there's a chance your backup will miss it.
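
As a rough illustration of the option described above, here is a minimal sketch that assumes GNU tar. The helper name backup_since and its three arguments are made up for this example; the date would be whatever your own backup job recorded.

```shell
#!/bin/sh
# Sketch only: archive files whose modification time is newer than a date.
# backup_since is a hypothetical helper, not part of tar.
backup_since() {
    src_dir=$1   # directory to back up
    since=$2     # date of the last backup, e.g. "2024-01-01 00:00:00"
    archive=$3   # output archive; keep it outside src_dir

    # --newer-mtime compares only the data modification time (mtime).
    tar --create --file="$archive" --newer-mtime="$since" \
        --directory="$src_dir" .
}
```

Usage would look something like: backup_since /path/to/dir "2024-01-01 00:00:00" /backups/incr.tar. GNU tar reports each skipped file on stderr as unchanged, which is expected here.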

Solution 2:

The incron utility uses inotify to run commands when filesystem events occur. The configuration file is like a crontab, but instead of times you specify paths and events.

The command could either be your backup script (in which case the backup will start almost immediately after the files are modified), or it could create a flag file that the backup script checks for and then deletes. If the flag file exists, at least one of the events has occurred since the last run.
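
The flag-file variant could be sketched roughly as follows. The incrontab line in the comment, the flag path, and the helper name changed_since_last_run are all assumptions for illustration; check the event names against your incron version.

```shell
#!/bin/sh
# Hypothetical incrontab entry (format: <path> <event mask> <command>):
#   /path/to/dir IN_CREATE,IN_DELETE,IN_CLOSE_WRITE,IN_MOVED_FROM,IN_MOVED_TO touch /var/tmp/dir-changed
#
# Backup-side check. changed_since_last_run is a made-up helper name.
changed_since_last_run() {
    flag=$1   # flag file created by the incron command (assumed path)

    if [ -e "$flag" ]; then
        # Remove the flag before backing up, so that an event arriving
        # during the backup recreates it for the next run.
        rm -f "$flag"
        return 0
    fi
    return 1
}

# Example use inside a backup script:
#   if changed_since_last_run /var/tmp/dir-changed; then run_backup; fi
```

Note that inotify watches a single directory, not its whole subtree, so a deep tree would need one entry per directory you care about.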


Solution 3:

You could always pipe find's output to wc and get an integer count of the files changed within the last day (-ctime -1 matches files whose status changed less than 24 hours ago):

find . -ctime -1 | wc -l

Although David's answer requires fewer code changes :)


Solution 4:

This is a little bit of a wild idea, but you could play a little with md5sum and ls.

The idea is to look at the md5sum of just one file, where that file is a recursive listing of the directory you are watching. As long as nothing changes, the md5sum stays the same. But if a timestamp is updated, the md5sum will change, and you know you need to create a new tar and send it to your FTP server.

We could start with something like this:

ls -lR /path/to/dir/ | md5sum > file_list.txt.md5

Then you would need to add a comparison between the old md5 and the current one, and so on.
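
The comparison step might look something like the sketch below. The helper name listing_changed and the state-file argument are invented for this example, and the state file is assumed to live outside the watched directory so that it does not change the listing itself.

```shell
#!/bin/sh
# Sketch only: compare the checksum of the current listing with the one
# saved by the previous run. listing_changed is a hypothetical helper.
listing_changed() {
    dir=$1     # directory to watch
    state=$2   # file holding the checksum from the previous run

    new_sum=$(ls -lR "$dir" | md5sum)
    old_sum=$(cat "$state" 2>/dev/null)

    # Save the new checksum for the next run.
    printf '%s\n' "$new_sum" > "$state"

    # Succeeds (exit 0) when the listing differs, including the first run.
    [ "$new_sum" != "$old_sum" ]
}

# Example use:
#   if listing_changed /path/to/dir /var/tmp/file_list.md5; then run_backup; fi
```

One caveat: ls -l shows times to the minute at best, so an edit that keeps the file size and lands in the same minute as the previous listing would go unnoticed.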

/Johan


Solution 5:

Recent versions of GNU find have the action "-quit", which causes find to immediately stop searching:

— Action: -quit

Exit immediately (with return value zero if no errors have occurred). This is different to ‘-prune’ because ‘-prune’ only applies to the contents of pruned directories, whilst ‘-quit’ simply makes find stop immediately. No child processes will be left running, but no more files specified on the command line will be processed. For example, find /tmp/foo /tmp/bar -print -quit will print only ‘/tmp/foo’. Any command lines which have been built by ‘-exec ... +’ or ‘-execdir ... +’ are invoked before the program is exited.

You could use a find-expression to find files that have changed, and use -quit to stop as soon as you find one. That should be faster than find continuing its scan.
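
One possible find-expression for this, as a sketch: compare against a reference file with -newer and stop at the first hit. The helper name has_changes and the stamp file are assumptions for illustration, and -quit requires GNU find.

```shell
#!/bin/sh
# Sketch only: succeed if anything under a directory is newer than a
# reference file. has_changes is a hypothetical helper.
has_changes() {
    dir=$1     # directory to scan
    stamp=$2   # reference file whose mtime marks the last check

    # Print the first path modified after the stamp, then stop scanning.
    [ -n "$(find "$dir" -newer "$stamp" -print -quit)" ]
}

# Example use:
#   if has_changes /path/to/dir /var/tmp/last-backup.stamp; then
#       run_backup && touch /var/tmp/last-backup.stamp
#   fi
```

Because creating or deleting a file updates the modification time of its parent directory, this also notices additions and removals, not just edits to existing files.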

-quit was added in findutils 4.2.3.

Tags:

Linux

Backup