Copy large file from one Linux server to another

Solution 1:

Sneakernet, Anyone?

Assuming this is a one-time copy, I don't suppose it's possible to just copy the file to a CD (or other media) and overnight it to the destination, is there?

That might actually be your fastest option as a file transfer of that size, over that connection, might not copy correctly... in which case you get to start all over again.


rsync

My second choice/attempt would be rsync, as it detects failed transfers, partial transfers, etc., and can pick up from where it left off.

rsync --progress file1 file2 user@remotemachine:/destination/directory

The --progress flag will give you some feedback instead of just sitting there and leaving you to second-guess yourself. :-)
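If the transfer does get cut off, note that rsync discards the partial file by default. Adding --partial (or -P, which is shorthand for --partial --progress) keeps it, so a re-run can pick up roughly where it left off. A sketch, reusing the placeholder paths from above:

rsync -P file1 file2 user@remotemachine:/destination/directory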


Vuze (BitTorrent)

Third choice would probably be to try to use Vuze as a torrent server and then have your remote location use a standard BitTorrent client to download it. I know of others who have done this, but you know... by the time they got it all set up and running, etc... I could have overnighted the data...

Depends on your situation I guess.

Good luck!


UPDATE:

You know, I got thinking about your problem a little more. Why does the file have to be a single huge tarball? Tar is perfectly capable of splitting large files into smaller ones (to span media, for example), so why not split that huge tarball into more manageable pieces and then transfer the pieces over instead?
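As a rough sketch of that idea with GNU tar's multi-volume mode (the piece names and size below are made up; --tape-length counts units of 1024 bytes, so 2097152 gives ~2GB volumes, and note that tar won't compress a multi-volume archive):

tar --create --multi-volume --tape-length=2097152 --file=piece1.tar --file=piece2.tar --file=piece3.tar /data/to/send

Then, after copying the pieces across, on the other end:

tar --extract --multi-volume --file=piece1.tar --file=piece2.tar --file=piece3.tar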

Solution 2:

I've done that in the past, with a 60GB tbz2 file. I do not have the script anymore but it should be easy to rewrite it.

First, split your file into pieces of ~2 GB (the last argument is the prefix used for the piece names):

split --bytes=2000000000 your_file.tgz your_file.tgz.part-

For each piece, compute an MD5 hash (this is to check integrity) and store it somewhere, then start copying the pieces and their MD5s to the remote site with the tool of your choice (for me: a netcat-tar-pipe in a screen session).
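For example, assuming the .part- prefix from the split above, a made-up host and port, and a traditional netcat that understands -q:

md5sum your_file.tgz.part-* > pieces.md5

On the remote site, inside a screen session:

nc -l -p 5000 | tar -xf -

And on the source machine:

tar -cf - your_file.tgz.part-* pieces.md5 | nc -q0 remote.example.com 5000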

After a while, check the MD5s to make sure your pieces are okay, then:

cat your_file.tgz.part-* > your_remote_file.tgz

If you have also done an MD5 of the original file, check it too. If it matches, you can untar your file; everything should be okay.
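Assuming the checksum files from the sketch above, the checks are one-liners:

md5sum -c pieces.md5

md5sum your_remote_file.tgz

and compare that last hash against the one computed for the original file on the source machine.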

(If I find the time, I'll rewrite the script)


Solution 3:

Normally I'm a big advocate of rsync, but when transferring a single file for the first time, it doesn't seem to make much sense. If, however, you were re-transferring the file with only slight differences, rsync would be the clear winner. If you choose to use rsync anyway, I highly recommend running one end in --daemon mode to eliminate the performance-killing ssh tunnel. The man page describes this mode quite thoroughly.
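If you do go the daemon route, a bare-bones sketch (module name and path are placeholders, not anything from your setup): put an rsyncd.conf like this on the receiving machine,

[incoming]
    path = /destination/directory
    read only = false

start it with rsync --daemon (it listens on TCP 873 by default), and push from the other side without ssh:

rsync --partial --progress your_file.tgz rsync://remotemachine/incoming/

You may also need uid/gid settings and directory permissions that let the daemon actually write there.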

My recommendation? FTP or HTTP with servers and clients that support resuming interrupted downloads. Both protocols are fast and lightweight, avoiding the ssh-tunnel penalty. Apache + wget would be screaming fast.
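For instance, with the file dropped somewhere under Apache's document root (the URL is a placeholder):

wget -c http://server.example.com/your_file.tgz

The -c/--continue flag makes wget pick up a partial download where it stopped instead of starting over.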

The netcat pipe trick would also work fine. Tar is not necessary when transferring a single large file. And the reason it doesn't notify you when it's done is that you didn't tell it to. Add a -q0 flag to the sending side and it will behave exactly as you'd expect.

server$ nc -l -p 5000 > outfile.tgz

client$ nc -q0 server.example.com 5000 < infile.tgz

The downside to the netcat approach is that it won't allow you to resume if your transfer dies 74GB in...
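You can fake a resume by hand, though, if you're careful: check how many bytes arrived, then append just the remainder (same placeholder names as above, byte count made up; tail -c +K starts output at byte K, so use the received size plus one):

server$ stat -c %s outfile.tgz     # suppose this prints 79456894976
server$ nc -l -p 5000 >> outfile.tgz

client$ tail -c +79456894977 infile.tgz | nc -q0 server.example.com 5000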


Solution 4:

Give netcat (sometimes called nc) a shot. The following works on a directory, but it should be easy enough to tweak for copying just one file.

On the destination box:

netcat -l -p 2342 | tar -C /target/dir -xzf -

On the source box:

tar czf - * | netcat target_box 2342

You can try removing the 'z' option in both tar commands for a bit more speed, seeing as the file is already compressed.
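That is, on the destination box:

netcat -l -p 2342 | tar -C /target/dir -xf -

and on the source box:

tar cf - * | netcat target_box 2342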