Maximizing rsync performance and throughput - directly-connected gigabit servers

Solution 1:

The file count and SSH encryption overhead are likely the biggest barriers. You're not going to see wire-speed on a transfer like this.

Options to improve include:

  • Using rsync+SSH with a less costly encryption algorithm (e.g. -e "ssh -c arcfour"); see the rsync sketch after this list.
  • Eliminating encryption entirely over the SSH transport with something like HPN-SSH.
  • Block-based transfers. Snapshots, dd, ZFS snapshot send/receive, etc.
  • If this is a one-time or infrequent transfer, using tar, netcat (nc), mbuffer or some combination.
  • Check your CentOS tuned-adm settings (sketched after this list).
  • Removing atime updates from your filesystem mounts (noatime). Examining other filesystem mount options.
  • Increasing the NIC and kernel send/receive buffers (see the buffer sketch after this list).
  • Tuning your rsync command. Would -W, the whole-file option, make sense here? Is compression enabled? (Both show up in the rsync sketch below.)
  • Optimize your storage subsystem for the type of transfers (SSDs, spindle count, RAID controller cache).

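For the cheaper-cipher and -W items above, a minimal sketch (hostnames and paths are placeholders; recent OpenSSH releases have dropped arcfour, so you may have to fall back to the cheapest cipher your build still offers, e.g. aes128-ctr):

# placeholder source/destination paths; drop -z/compression on a direct gigabit link
rsync -aHvW --numeric-ids --progress \
    -e "ssh -c arcfour -o Compression=no" \
    /var/mail/ root@secondary-host:/var/mail/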
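
For the tuned-adm and atime items, roughly this (the mount point is a placeholder, and the remount only applies if the data sits on its own filesystem; otherwise add noatime in /etc/fstab):

tuned-adm active
tuned-adm profile throughput-performance
# only if /var/mail (or wherever the data lives) is its own mount
mount -o remount,noatime /var/mail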
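
For the NIC/kernel buffer item, a starting point (eth0 and the sizes are assumptions; check ethtool -g for your hardware's ring-buffer maximums and tune the sysctls from there):

ethtool -G eth0 rx 4096 tx 4096
sysctl -w net.core.rmem_max=16777216
sysctl -w net.core.wmem_max=16777216
sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"
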
Solution 2:

As you probably know, copying a lot of small files (e.g. mailboxes using the Maildir format or similar) is definitely not the best way to take advantage of high-bandwidth interfaces. SSH is probably not the best transport protocol for that either. I would try using tar to build a tar stream on the source host and pipe it straight to your secondary host:

tar cf - /var/mail | ssh root@secondary-host 'tar xf - -C /var/backups'

If you need incremental backups, you may want to try the -g (--listed-incremental) option of GNU tar. If you still need to maximize throughput, try using netcat instead of SSH.
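
For incremental runs, GNU tar keeps its state in a snapshot file (the .snar path below is just an example); on the receiving side, -g /dev/null tells tar to treat the stream as an incremental archive without recording anything:

tar cf - -g /var/tmp/mail.snar /var/mail | ssh root@secondary-host 'tar xf - -g /dev/null -C /var/backups'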
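
And if SSH itself is the bottleneck, plain netcat over the direct link works; the port is arbitrary and the -l syntax differs between the traditional and OpenBSD netcat variants, so adjust for whichever your distro ships (mbuffer can be slotted into either end of the pipe as well):

nc -l 7000 | tar xf - -C /var/backups        # on secondary-host first
tar cf - /var/mail | nc secondary-host 7000  # then on the source host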