Is it better to run rm -rf locally on the server instead of over NFS?

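Yes, running rm on the server over ssh is the better option.

NFS uses a complex network protocol with many remote procedure calls and waits for data synchronization. When you run rm on the server over ssh, none of that applies, because the deletion happens on the server's local filesystem.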

Furthermore, there is a lot of blocking. File deletion over NFS works this way:

1. Your rm command issues the unlink() syscall.
2. The NFS driver converts it to a SunRPC request and sends it to the NFS server.
3. The NFS server converts this SunRPC request back to an unlink() call.
4. The server executes this unlink() call on its local filesystem.
5. After it succeeds, the server sends back an RPC reply equivalent to "all right, it is done" to the client.
6. The client-side kernel driver converts this reply into the return value 0 of the unlink() call made by your original rm.
7. rm moves on to the next file and goes back to step 1.

Now, the important thing is that between steps 2 and 7, rm has to wait. It could in principle send the next unlink() call asynchronously, but rm is a single-threaded tool that is not event-driven. Even if it could, that would still require tricky NFS mount flags. Until rm gets the result, it waits.

NFS, like any network filesystem, is therefore much slower for this kind of workload.
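A minimal sketch of the ssh approach, assuming you have shell access to the NFS server. The host name nfs-server.example.com, the path /export/shared/oldfilms, the function name remote_rm and the SSH variable are all hypothetical and only serve the illustration. Note that the path must be the directory as the server sees it (its export path), which may differ from the mount point on the client.

```shell
# remote_rm HOST PATH: run rm -rf on the server's local filesystem over ssh.
# Hypothetical helper. SSH defaults to "ssh"; set SSH=echo for a dry run
# that only prints what would be executed.
remote_rm() {
    host=$1
    path=$2   # path as seen on the server, not the client mount point
    # Single quotes protect spaces in the path on the remote side.
    # Assumption: the path itself contains no single quotes.
    ${SSH:-ssh} "$host" "rm -rf -- '$path'"
}

# Dry run:  SSH=echo remote_rm nfs-server.example.com /export/shared/oldfilms
# Real run: remote_rm nfs-server.example.com /export/shared/oldfilms
```

Doing a dry run first is cheap insurance, since rm -rf on the wrong server-side path cannot be undone.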

In many cases, you can make a recursive deletion appear almost instantaneous with a trick:

First, rename the directory (mv -vf oldfilms oldfilms-)
Then delete it in the background (rm -rf oldfilms- &)

In many (but not all) respects, the directory will look as if it had been removed in practically zero time. The rename is a single operation, while the slow deletion continues in the background.
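The rename-then-delete trick can be wrapped in a small shell function. This is only a sketch: the name quick_rm and the ".deleting" suffix are made up for illustration. It renames within the same parent directory, so the mv stays a plain rename rather than a copy.

```shell
# quick_rm DIR: rename DIR out of the way, then delete it in the background.
# Hypothetical helper; the ".deleting.PID" suffix is an arbitrary choice.
quick_rm() {
    dir=${1%/}                     # strip one trailing slash, if any
    doomed="${dir}.deleting.$$"    # $$ is the PID of the current shell
    mv -- "$dir" "$doomed" || return 1
    rm -rf -- "$doomed" &          # the slow part runs in the background
}

# Usage: quick_rm oldfilms
```

The original name is free for reuse as soon as the function returns, but the disk space is only released as the background rm makes progress.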

Extension: As @el.pascado mentions in his excellent comment, steps 2-7 actually have to run three times for every file:

to determine whether it is a file or a directory (with an lstat() syscall),
then to act accordingly. For an ordinary file, that means unlink(); for a directory, opendir(), recursively deleting everything in it, then closedir(), and finally rmdir().
finally, to move on to the next directory entry with a readdir() call.

Thus, it requires 3 NFS RPC calls per file, and an additional 3 per directory.
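Under this simplified model (3 RPCs per file, 3 + 3 = 6 per directory), a rough estimate for a given tree can be computed with find. The function name estimate_rpcs is hypothetical, and the result is only an order-of-magnitude figure: real NFS clients cache attributes and fetch several directory entries per request, so the actual count will differ.

```shell
# estimate_rpcs DIR: rough RPC count for rm -rf DIR under the model above.
# Hypothetical helper. Assumes no file names contain newlines, since
# entries are counted by line.
estimate_rpcs() {
    files=$(find "$1" ! -type d | wc -l)   # everything that is not a directory
    dirs=$(find "$1" -type d | wc -l)      # directories, including DIR itself
    echo $(( 3 * $files + 6 * $dirs ))
}

# Usage: estimate_rpcs /usr/ports
```

Multiplying the result by the network round-trip time gives a feel for how long the client will spend just waiting.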

Yes. Well, maybe. It depends. For a small number of files and directories, it wouldn't make much difference.

Doing file operations in bulk on an NFS-mounted directory is slow. If you have the opportunity to log in to the NFS server itself and run them on the actual directory, that will be quicker.

Let's test it by removing the OpenBSD ports collection that I have checked out from CVS and mounted over NFS:

On NFS server:

$ cd /export/shared/ports

$ du -hs .
2.6G    .

$ find . | wc -l
  179688

$ time rm -rf /export/shared/ports/*
0m20.87s real     0m00.12s user     0m04.62s system

On client (after restoring the original files from backup):

$ time rm -rf /usr/ports/*
6m49.73s real     0m01.55s user     1m08.96s system
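To put the two timings in perspective, the arithmetic can be done with awk. The numbers are taken from the runs above; the function name summarize_timings is just for illustration.

```shell
# Compare the two measured "real" times and compute the cost per entry.
summarize_timings() {
    awk 'BEGIN {
        entries = 179688            # from "find . | wc -l"
        local_s = 20.87             # real time on the NFS server
        nfs_s   = 6 * 60 + 49.73    # real time on the client (6m49.73s)
        printf "slowdown over NFS: %.1fx\n", nfs_s / local_s
        printf "time per entry: %.3f ms local, %.3f ms over NFS\n",
               local_s / entries * 1000, nfs_s / entries * 1000
    }'
}

summarize_timings
```

In this test the deletion over NFS was roughly 20 times slower, at about 2.3 ms per entry compared with about 0.12 ms on the server itself.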
