
Mikuru tried to compress my files too using her superpower energy. Rzip still worked better.
After reading an old post on Jeremy Zawodny's weblog and installing Rzip myself, I have to say it's my new favourite compression algorithm!
From the developer's website:
rzip is a compression program, similar in functionality to gzip or bzip2, but able to take advantage of long distance redundancies in files, which can sometimes allow rzip to produce much better compression ratios than other programs. The original idea behind rzip is described in my PhD thesis.
For a bit of real-world testing, I decided to try compressing the www folder in my home directory on my MacBook Pro. I thought this folder would be a useful test because it's relatively large and contains a few large files mixed in with hundreds of smaller ones. From what I understand of compression algorithms, each tends to favour certain types of files in certain quantities, so I figured this mix would give a more balanced result.
The original folder size was 436.0 MiB with 312 files. The Tape Archive (tar) is the control, because every format except ZIP needs the files bundled into a single archive before they can be compressed. For convenience the names also redirect to their associated Wikipedia pages.
| Algorithm | File name | File size | % of original | % saved |
|---|---|---|---|---|
| Tape Archive | www.tar | 423.9 MiB | - | - |
| ZIP | www.tar.zip | 290.9 MiB | 68.62 | 31.38 |
| Bzip2 | www.tar.bz2 | 286.3 MiB | 67.72 | 32.28 |
| GNU zip | www.tar.gz | 284.8 MiB | 67.54 | 32.46 |
| Rzip | www.tar.rz | 104.7 MiB | 24.70 | 75.30 |
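For anyone wanting to check the arithmetic, the two percentage columns are relative to the uncompressed tar archive rather than the original folder. Here is a minimal Python sketch using the Rzip row; the function names are made up for illustration.

```python
def percent_of_original(size_mib: float, original_mib: float) -> float:
    """Compressed size as a percentage of the uncompressed tar archive."""
    return size_mib / original_mib * 100


def percent_saved(size_mib: float, original_mib: float) -> float:
    """Space saved, which is simply whatever percentage is left over."""
    return 100 - percent_of_original(size_mib, original_mib)


TAR_MIB = 423.9   # www.tar, the control
RZIP_MIB = 104.7  # www.tar.rz

print(f"{percent_of_original(RZIP_MIB, TAR_MIB):.2f}")  # 24.70
print(f"{percent_saved(RZIP_MIB, TAR_MIB):.2f}")        # 75.30
```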
What's curious is that Gzip was more efficient than Bzip2; in almost every other circumstance I've come across, the reverse was true. I'm not sure how much that affected the results of the other formats. The final result is clear though: Rzip was able to squash like nobody else!
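If you'd like to repeat this kind of comparison on a folder of your own, the following is a rough Python sketch of the same steps: tar first, then compress the tarball each way. It assumes the folder is small enough to read into memory. The `compare_compressors` name is hypothetical, and the Rzip step only runs if an `rzip` command happens to be installed. The `-k` flag (keep the input file) is taken from my reading of the rzip man page, so treat it as an assumption.

```python
import bz2
import gzip
import shutil
import subprocess
import tarfile
import tempfile
from pathlib import Path


def compare_compressors(folder: Path) -> dict[str, int]:
    """Tar `folder`, compress the tarball several ways, return sizes in bytes."""
    sizes: dict[str, int] = {}
    with tempfile.TemporaryDirectory() as tmp:
        tar_path = Path(tmp) / (folder.name + ".tar")
        # Step 1: bundle the folder into an uncompressed tar archive (the control).
        with tarfile.open(tar_path, "w") as tar:
            tar.add(folder, arcname=folder.name)
        raw = tar_path.read_bytes()
        sizes["tar"] = len(raw)
        # Step 2: compress the same tarball with the standard library codecs.
        sizes["gzip"] = len(gzip.compress(raw, compresslevel=9))
        sizes["bzip2"] = len(bz2.compress(raw, compresslevel=9))
        # Step 3 (optional): shell out to rzip, but only if it is installed.
        # Assumption: `rzip -k FILE` writes FILE.rz and keeps the input.
        if shutil.which("rzip"):
            subprocess.run(["rzip", "-k", str(tar_path)], check=True)
            sizes["rzip"] = Path(str(tar_path) + ".rz").stat().st_size
    return sizes
```

Calling `compare_compressors(Path.home() / "www")` would return a dictionary of sizes in bytes, which can then be turned into percentages of the tar size.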

Image © Jan Mehlich, from Wikimedia Commons. As with the image above, I thought it was mildly amusing given the subject matter. I hate dry weblog posts without pictures, you see.
From what I can make out from reading the developer's website, and with help from dadaist in real time on Twitter, Rzip isn't an entirely new compression algorithm per se. It essentially looks for repeated chunks of data over much longer distances, and then uses existing algorithms to process it all.
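To see why distance matters, here is a toy Python sketch. It is not Rzip's actual algorithm: the fixed 4 KiB chunks, the SHA-256 lookup and the `dedupe_chunks` name are all assumptions made for illustration. It builds 1 MiB of data whose second half repeats the first. zlib (the library behind gzip) only looks back 32 KiB, so it can't see the repeat. Removing the duplicate chunks first and then handing the rest to zlib roughly halves the output.

```python
import hashlib
import os
import zlib

CHUNK = 4096  # hypothetical chunk size for this toy example


def dedupe_chunks(data: bytes, chunk: int = CHUNK) -> tuple[list[bytes], list[int]]:
    """Split data into fixed-size chunks; return the unique chunks and an index list."""
    unique: list[bytes] = []
    seen: dict[bytes, int] = {}
    order: list[int] = []
    for start in range(0, len(data), chunk):
        piece = data[start:start + chunk]
        digest = hashlib.sha256(piece).digest()
        if digest not in seen:
            seen[digest] = len(unique)
            unique.append(piece)
        order.append(seen[digest])  # which unique chunk goes in this position
    return unique, order


block = os.urandom(512 * 1024)  # random bytes: incompressible on their own
data = block + block            # the repeat sits 512 KiB away

# zlib alone: the 32 KiB window never reaches the earlier copy.
plain = len(zlib.compress(data, 9))

# Long-range pass first, then zlib on what is left (4 bytes assumed per index).
unique, order = dedupe_chunks(data)
deduped = len(zlib.compress(b"".join(unique), 9)) + 4 * len(order)

print(f"zlib only:         {plain} bytes")
print(f"dedupe, then zlib: {deduped} bytes")
```

The original data can be rebuilt with `b"".join(unique[i] for i in order)`, so nothing is lost in the first pass.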
I theorise from reading up on this that only in the last decade have computers had enough processing power, and more importantly memory, to be able to pull this off. A 900 MiB search window is great for compression, but it can suck up all your memory pretty fast if you don't have much. I suspect this is why we haven't seen this level of compression until recently.
In any case, I know what I'll be using to compress all my large files and folders with now :).
That's a nice anime picture! :)
Speaking of compression, it would be interesting to see the RAR archiver in the comparison list. It is not free or open source (UNRAR is free), but it is still quite popular. I am wondering how it compares with the rest.
[...] other concern is that I do have backups of all this stuff, but the backups are in ultra compressed Rzip format to save space. The problem is, I don’t have enough drive space amongst my other external hard [...]
Could you give an example of how to use rzip to compress a directory? I am trying and failing. Or can you only use it to compress individual files? It doesn't even work for me with wildcards.
I want to use it to compress my /home, which has 170GB, to make a backup on a smaller hard disk. If I first run tar, do I need 170GB spare (which I don't have)?