Big rsync -- push or pull?

How the rsync algorithm works is described here.

The algorithm identifies parts of the source file which are identical to some part of the destination file, and only sends those parts which cannot be matched in this way. Effectively, the algorithm computes a set of differences without having both files on the same machine. The algorithm works best when the files are similar, but will also function correctly and reasonably efficiently when the files are quite different.
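To make the excerpt above concrete, here is a rough sketch of the idea in Python. It is not rsync's actual code: real rsync pairs a cheap rolling checksum with a strong one and uses much larger blocks, whereas this toy version hashes every window with MD5 and uses a 4-byte block. The names signatures, delta and patch are made up for the illustration.

```python
import hashlib

BLOCK = 4  # tiny block size for illustration only; an assumption, not rsync's default


def signatures(old: bytes) -> dict:
    """Side holding the old copy: map each block's hash to its block index."""
    sigs = {}
    for i in range(0, len(old), BLOCK):
        sigs.setdefault(hashlib.md5(old[i:i + BLOCK]).digest(), i // BLOCK)
    return sigs


def delta(new: bytes, sigs: dict) -> list:
    """Side holding the new copy: emit ("copy", index) or ("data", bytes)."""
    ops, literal, pos = [], bytearray(), 0
    while pos < len(new):
        chunk = new[pos:pos + BLOCK]
        idx = sigs.get(hashlib.md5(chunk).digest())
        if idx is not None:
            # The other side already has this block: send a reference only.
            if literal:
                ops.append(("data", bytes(literal)))
                literal.clear()
            ops.append(("copy", idx))
            pos += len(chunk)
        else:
            # No match at this offset: send one literal byte and slide forward.
            literal.append(new[pos])
            pos += 1
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def patch(old: bytes, ops: list) -> bytes:
    """Rebuild the new file from the old copy plus the delta."""
    out = bytearray()
    for kind, arg in ops:
        out += old[arg * BLOCK:(arg + 1) * BLOCK] if kind == "copy" else arg
    return bytes(out)


old = b"the quick brown fox jumps"
new = b"the quick red fox jumps over"
ops = delta(new, signatures(old))
rebuilt = patch(old, ops)
```

Only the "data" entries carry file contents across the wire; the "copy" entries are just block numbers, which is why similar files transfer cheaply.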

So it should not matter whether you are uploading or downloading: the algorithm works on checksums of both the source and the destination files, so either machine can hold the source or the destination.

I found some more useful information here. Some excerpts:

RSync is a remote file (or data) synchronization protocol. It allows you to synchronize files between two computers. By synchronize, I mean make sure that both copies of the file are the same. If there are any differences, RSync detects these differences, and sends across the differences, so the client or server can update their copy of the file, to make the copies the same.

RSync is capable of synchronizing files without sending the whole file across the network. In the implementation I've done, only data corresponding to about 2% of the total file size is exchanged, in addition to any new data in the file, of course. New data has to be sent across the wire, byte for byte.

Because of the way RSync works, it can also be used as an incremental download / upload protocol, allowing you to upload or download a file over many sessions. If the current upload or download fails, you can just resume it later.


The rsync program actually runs a copy of itself on the remote server. Once rsync is running on both ends, they negotiate between themselves how to best transfer the requested files. I don't think it matters which one is started first.

However, I usually initiate the transfer from the machine closest to me. That way, if something goes wrong, I am more likely to be able to monitor the transfer's progress. If both machines are on the same LAN, this is no reason to prefer one over the other.
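For what it's worth, push and pull differ only in which side of the command line the remote path sits on. The sketch below just builds the two command lines; the host name, the paths and the rsync_command helper are hypothetical, and the options shown (-a, --partial, --progress) are one reasonable choice for a large transfer, not a requirement.

```python
import shlex


def rsync_command(local_path: str, remote_spec: str, direction: str) -> list:
    """Build an rsync argument list; direction is "push" or "pull"."""
    if direction not in ("push", "pull"):
        raise ValueError("direction must be 'push' or 'pull'")
    # -a: archive mode, --partial: keep partially transferred files,
    # --progress: show per-file progress while transferring.
    base = ["rsync", "-a", "--partial", "--progress"]
    if direction == "push":
        return base + [local_path, remote_spec]   # local -> remote
    return base + [remote_spec, local_path]       # remote -> local


# Hypothetical host and paths, for illustration only.
REMOTE = "user@remote.example.com:/backup/big/"
LOCAL = "/data/big/"

push = rsync_command(LOCAL, REMOTE, "push")
pull = rsync_command(LOCAL, REMOTE, "pull")
print(shlex.join(push))
print(shlex.join(pull))
```

Whichever machine you are sitting at runs the command, so that is where the progress output appears. To actually start the transfer you could hand the list to subprocess.run(push, check=True).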
