Why is my rsync so slow?

Solution 1:

Reasons can include: compression, encryption, the number and size of the files being copied, the disk I/O capabilities of your source and destination systems, TCP overhead... All of these factors can influence the speed of the transfer you're conducting.

Please post the rsync command you're using and provide details on the specifications of both computers.


Edit: Encryption is often a limiting factor in rsync speeds. You can run rsync over ssh with a lighter-weight cipher such as arcfour.

Something like: rsync -e "ssh -c arcfour"
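For example (on an older OpenSSH build that still supports arcfour; newer releases have dropped that cipher, and the paths below are just placeholders), a full command could look like:

# copy a tree over ssh with a cheaper cipher; paths are placeholders
rsync -avP -e "ssh -c arcfour" /data/src/ user@laptop:/data/dst/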

Or you can use a modified rsync/ssh that can disable encryption. See hpn-ssh: http://psc.edu/networking/projects/hpn-ssh

But again, your laptop has a slow drive compared to your workstation. Writes may be blocked and waiting for I/O going to your laptop. What are your real performance expectations?

Solution 2:

Another way to mitigate high CPU usage while keeping rsync's functionality is to move from rsync/SSH to rsync/NFS. You can export the paths you want to copy from via NFS and then run rsync locally, from the NFS mount to your destination, as sketched below.
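As a rough sketch of that setup (the paths, hostname, and subnet below are invented for illustration), the source machine exports the directory and the destination machine mounts it and runs a purely local rsync:

# On the source/NAS: export the tree read-only to the local subnet
echo '/share/data 192.168.1.0/24(ro,no_subtree_check)' >> /etc/exports
exportfs -ra

# On the destination: mount the export and copy locally
mount -t nfs nas:/share/data /mnt/nas
rsync -avP /mnt/nas/ /media/usbdisk/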

In one test with a WD MyBook Live network disk as the source, one or more rsyncs from the NAS over a Gigabit network to two local USB disks would not copy more than 10 MB/s (CPU: 80% usr, 20% sys). After exporting over NFS and rsyncing locally from the NFS share to both disks, I got a total of 45 MB/s (maxing out both USB2 disks) with little CPU usage. Disk utilization when using rsync/SSH was about 6%, and when using rsync/NFS it was closer to 24%, while both USB2 disks were close to 100%.

So we effectively moved the bottleneck from the NAS CPU to both USB2 disks.


Solution 3:

After some more testing, I finally found the answer myself. By default, rsync tunnels over ssh, and the crypto is what makes it slow. So I needed to get around the encryption.

Solution 1: Setting up an rsync server

To use it via the rsync protocol, you have to set up an rsyncd server. There was an /etc/init.d/rsync script on my laptop, so I guessed rsyncd was running. I was wrong: /etc/init.d/rsync start exits silently when rsync is not enabled in /etc/default/rsync. You then also have to configure it in /etc/rsyncd.conf, which is a pain.
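For reference, a minimal configuration on a Debian-style system looks roughly like this (the module name and path are just examples, not the values I actually used):

# /etc/default/rsync
RSYNC_ENABLE=true

# /etc/rsyncd.conf -- one module exposing a directory
[directory]
    path = /home/user/directory
    read only = false

# then start the daemon
/etc/init.d/rsync start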

Once all of that is done, you have to use rsync file.foo user@machine::directory. Note that there are two colons.

Solution 2: Old-school rsh-server

However, that configuration was way too complicated for me, so I just installed an rsh-server on my laptop. Invoking rsync on the workstation with -e rsh then uses rsh instead of ssh, which almost doubled the throughput to 44.6 MB/s. That is still slow: the speed bounces between 58 MB/s and 33 MB/s, which suggests some buffer or congestion control problem, but that is beyond the scope of this question.
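On a Debian-like system the steps were roughly as follows (the package name may vary, and since rsh sends everything unencrypted, only do this on a trusted network):

# on the laptop: install the rsh daemon (insecure; trusted LAN only)
apt-get install rsh-server

# on the workstation: tell rsync to use rsh instead of ssh
rsync -avP -e rsh bigfile.img user@laptop:/tmp/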


Solution 4:

This is a very old question with old answers, but one important thing is missing: if you are copying already-compressed or encrypted data, turn off compression.

If your data is neither compressed nor encrypted, you still only want to compress it once! rsync compresses with -z, and ssh compresses with -C (which may be on by default, depending on your configuration). I haven't tested which is better, since my data is already compressed.

While I'm at it, you can turn off X forwarding and TTY allocation, resulting in:

rsync -avh -e "ssh -x -T -c arcfour -o Compression=no" $src $dst

Lastly, make sure (for example using iptraf) that you are actually using the network interface you think you are using. To my great surprise, I noticed that on my OS X machine the outgoing ssh connection was binding to the IP of the default outgoing interface instead of the IP of the interface the packets were supposed to be routed out on. The direct Gigabit cross-connect between my two laptops (which were also connected by WiFi) was not being used. After investigation, this turned out to be caused by the 169.254/16 link-local addresses, which the Mac puts on all interfaces, and by the destination computer replying to ARP requests even though the request came in on a different interface.
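If you run into the same thing, you can check which interface will actually be used and, if needed, pin ssh to the source address of the cross-connect with its -b option (the addresses below are placeholders):

# show the route/interface that will be used for the destination (OS X)
route get 10.0.0.2

# force ssh (and therefore rsync) to bind to the cross-connect address
rsync -avh -e "ssh -b 10.0.0.1 -x -T -o Compression=no" $src $dst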