Sync LVM snapshots to backup server

Solution 1:

Although there are 'write-devices' and 'copy-devices' patches for rsync, they only work well on small images (1-2GB). rsync will spend ages searching for matching blocks on larger images, and it's almost useless for devices/files of 40GB or larger.

We use the following to perform a per-block checksum comparison (1024-byte blocks in the commands below) and then simply copy a block across if its checksum doesn't match. We use this to back up servers on a virtual host in the USA to a backup system in the UK, over the public internet. CPU usage is very low, and the snapshot's performance hit only occurs after hours, when the backup runs:

Create snapshot:

lvcreate -s -i 2 -L 25G /dev/vg_kvm/company-exchange -n company-exchange-snap1

export dev1='/dev/mapper/vg_kvm-company--exchange--snap1';
export dev2='/dev/mapper/vg_kvm-company--exchange';
export remote='[email protected]';
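
Note the doubled hyphens in the /dev/mapper paths above: device-mapper joins the volume group name and the logical volume name with a single hyphen, so any hyphen inside either name is escaped by doubling it. A small sketch of that mapping, in case you need to derive the paths in a script (the helper name dm_path is hypothetical, made up for this example):

```python
def dm_path(vg, lv):
    """Build the /dev/mapper path for a logical volume.

    Hyphens inside the VG or LV name are doubled, because a single
    hyphen is used as the separator between the two names.
    """
    return "/dev/mapper/{}-{}".format(vg.replace("-", "--"), lv.replace("-", "--"))
```

For the snapshot created above, dm_path("vg_kvm", "company-exchange-snap1") gives the value assigned to dev1.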

Initial seeding:

dd if=$dev1 bs=100M | gzip -c -9 | ssh -i /root/.ssh/rsync_rsa $remote "gzip -dc | dd of=$dev2"

Incremental nightly backup (only sends changed blocks):

ssh -i /root/.ssh/rsync_rsa $remote "
  perl -'MDigest::MD5 md5' -ne 'BEGIN{\$/=\1024};print md5(\$_)' $dev2 | lzop -c" |
  lzop -dc | perl -'MDigest::MD5 md5' -ne 'BEGIN{$/=\1024};$b=md5($_);
    read STDIN,$a,16;if ($a eq $b) {print "s"} else {print "c" . $_}' $dev1 | lzop -c |
ssh -i /root/.ssh/rsync_rsa $remote "lzop -dc |
  perl -ne 'BEGIN{\$/=\1} if (\$_ eq\"s\") {\$s++} else {if (\$s) {
    seek STDOUT,\$s*1024,1; \$s=0}; read ARGV,\$buf,1024; print \$buf}' 1<> $dev2"
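
If the one-liners are hard to follow, the three stages of the pipeline can be sketched in Python. This is only an illustrative in-process model, not a replacement for the commands above: the function names (dest_digests, make_ops, apply_ops, sync) are made up for this example, and both devices are assumed to be local, seekable file objects rather than being connected over ssh.

```python
import hashlib
import io

BLOCK = 1024  # same block size as the Perl one-liners


def dest_digests(dest):
    """Stage 1 (backup side): one 16-byte MD5 digest per block of the old copy."""
    dest.seek(0)
    digests = []
    while True:
        block = dest.read(BLOCK)
        if not block:
            break
        digests.append(hashlib.md5(block).digest())
    return digests


def make_ops(source, digests):
    """Stage 2 (source side): yield ("s", None) for an unchanged block,
    or ("c", data) for a block that has to be copied."""
    remaining = iter(digests)
    source.seek(0)
    while True:
        block = source.read(BLOCK)
        if not block:
            break
        if hashlib.md5(block).digest() == next(remaining, None):
            yield ("s", None)
        else:
            yield ("c", block)


def apply_ops(dest, ops):
    """Stage 3 (backup side): seek past skipped blocks, overwrite changed ones.
    Returns the number of blocks that were written."""
    dest.seek(0)
    copied = 0
    for op, data in ops:
        if op == "s":
            dest.seek(BLOCK, io.SEEK_CUR)
        else:
            dest.write(data)
            copied += 1
    return copied


def sync(source, dest):
    """Run all three stages; the digests are collected before any write happens."""
    return apply_ops(dest, make_ops(source, dest_digests(dest)))
```

With two files opened in binary mode (the destination as "r+b"), sync(src, dst) leaves dst identical to src while writing only the blocks whose MD5 differed. In the real pipeline only the digests and the "s"/"c" stream cross the network, which is why so little data is sent.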

Remove snapshot:

lvremove -f vg_kvm/company-exchange-snap1

Solution 2:

Standard rsync is missing this feature, but there is a patch for it in the rsync-patches tarball (copy-devices.diff), which can be downloaded from http://rsync.samba.org/ftp/rsync/. After applying it and recompiling, you can rsync devices with the --copy-devices option.


Solution 3:

People interested in doing this specifically with LVM snapshots might like my lvmsync tool, which reads the list of changed blocks in a snapshot and sends just those changes.


Solution 4:

Take a look at the Zumastor Linux Storage Project. It implements "snapshot" backup using binary "rsync" via the ddsnap tool.

From the man-page:

ddsnap provides block device replication given a block level snapshot facility capable of holding multiple simultaneous snapshots efficiently. ddsnap can generate a list of snapshot chunks that differ between two snapshots, then send that difference over the wire. On a downstream server, write the updated data to a snapshotted block device.


Solution 5:

There's a Python script called blocksync, which is a simple way to synchronize two block devices over a network via ssh, transferring only the changes.

  • Copy blocksync.py to the home directory on the remote host
  • Make sure your remote user can either sudo or is root itself
  • Make sure your local user (root?) can read the source device & ssh to the remote host
  • Invoke: python blocksync.py /dev/source user@remotehost /dev/dest

I've recently hacked on it to clean it up and change it to use the same fast-checksum algorithm as rsync (Adler-32).
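
To give a feel for what comparing blocks by Adler-32 looks like, here is a minimal sketch using Python's standard zlib.adler32. It is not blocksync's actual code: the function names and the 1MB default block size are assumptions chosen for illustration.

```python
import zlib


def block_checksums(stream, block_size=1024 * 1024):
    """Read a binary stream block by block and return one Adler-32
    checksum (an unsigned 32-bit integer) per block."""
    sums = []
    while True:
        block = stream.read(block_size)
        if not block:
            break
        sums.append(zlib.adler32(block))
    return sums


def changed_blocks(local_sums, remote_sums):
    """Return the indices of blocks whose checksums differ, or that
    exist on only one side because the devices differ in size."""
    longest = max(len(local_sums), len(remote_sums))
    return [
        i
        for i in range(longest)
        if i >= len(local_sums)
        or i >= len(remote_sums)
        or local_sums[i] != remote_sums[i]
    ]
```

Adler-32 is much cheaper to compute than MD5, but it is only 32 bits wide, so two different blocks can occasionally share a checksum. rsync itself backs its fast rolling checksum with a stronger hash for that reason.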