2011-08-21

Copying files with on-the-fly compression

if you have to copy a really big file in the shortest possible time, this might be useful:

gzip -1 </path/to/sourcefile | ssh -c blowfish-cbc target.host gunzip ">" /path/to/destfile

the light compression (-1) seems to be a good tradeoff: higher compression levels spend more cpu time than they save on the wire, so the overall transfer takes longer.
the blowfish cipher puts less strain on the cpu.
using an old 10 Mbit connection (raw line rate: 1,250,000 bytes/sec), i've copied a vm disk file with an effective throughput of 13,980,000 bytes/sec, measured on the uncompressed data.
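
a rough way to reason about these numbers, as a sketch only: the pipeline runs at the speed of its slowest stage, so the effective rate is about the smaller of the compressor speed and the line rate multiplied by the compression ratio. the function names and the two hypothetical compressor profiles below are made up for illustration; only the 10 Mbit and 13,980,000 bytes/sec figures come from the measurement above. the model ignores ssh and protocol overhead.

```shell
#!/bin/sh
# All rates are in bytes/sec of UNCOMPRESSED data.

# effective_rate COMPRESSOR_RATE LINK_RATE RATIO
#   COMPRESSOR_RATE: how fast the compressor consumes input
#   LINK_RATE:       raw network rate
#   RATIO:           uncompressed size / compressed size
effective_rate() {
    awk -v c="$1" -v l="$2" -v r="$3" 'BEGIN {
        net = l * r                     # uncompressed bytes/sec the link can carry
        printf "%d\n", (c < net) ? c : net
    }'
}

# implied_ratio OBSERVED_RATE LINK_RATE
#   compression ratio needed to reach OBSERVED_RATE if the link is the bottleneck
implied_ratio() {
    awk -v o="$1" -v l="$2" 'BEGIN { printf "%.2f\n", o / l }'
}

LINK=$((10 * 1000 * 1000 / 8))          # 10 Mbit/s = 1250000 bytes/sec

implied_ratio 13980000 "$LINK"          # figure from the post      -> 11.18
effective_rate 19500000 "$LINK" 3       # hypothetical fast tool    -> 3750000
effective_rate 2000000 "$LINK" 5        # hypothetical slow tool    -> 2000000
```

in the last line the slower tool compresses better but still loses, because the cpu rather than the link has become the limit. the first line suggests the vm disk file shrank by a factor of about 11, assuming the link really was the bottleneck.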

2 comments:

  1. Suricou Raven, 8/11/11 08:27

    Depends on the hosts involved. If your connection is only 10 Mbit, and the hosts are modern... I'll run a quick test. 24M test file, i5 2.2GHz, and gzip takes 1.23 s real time. That's 19.5M/s, a lot more than the 10 Mbit connection can take, and I didn't use the -1. Bzip2 still manages 6.67M/s, which is also more than the 10 Mbit connection can take, so in this case bzip2 would achieve higher effective throughput than gzip (I have yet to find any data where gzip compresses better). You might even be able to manage 7z and some of the more processor-intensive options. It's just a matter of which resource is most limited: processing time or network throughput.

  2. there's one more resource you could be limited in: host obsolescence.
    gzip is itself old and widespread enough to be found nearly everywhere.
    in my experience, the same can't be said about bzip2 and 7z.

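
the availability concern from the last comment can be checked before starting a long transfer. a minimal sketch, assuming a POSIX shell on both ends; `first_available` is a made-up helper name, not part of any tool:

```shell
#!/bin/sh
# first_available NAME...
#   print the first NAME that exists on PATH; return non-zero if none does
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

# prefer the stronger compressor, fall back to gzip
first_available bzip2 gzip || echo "no compressor found" >&2
```

this only checks the local side. both hosts need the same tool, so the same test would have to be run on the target as well, for example with `ssh target.host 'command -v bzip2'`.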