29 October 2006
Standard Linux Compression Utilities
I recently looked into compression utilities to find the highest compression level (time was not that important), and decided to share some of my findings. Most of the utilities are Linux/Unix only.
The most popular of these, gzip, gives compression ranging from 110% to about 128%. In the benchmarks I've read, the default compression level of 6 is not much worse than level 9. Compression improves a lot with very little time difference over the first 5 levels and begins leveling off at 6; levels 7 to 9 give almost the same compression, with a large increase in time for each consecutive level.
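To see the leveling-off for yourself, a minimal sketch along these lines may help. It uses Python's standard gzip module rather than the gzip command itself, and the sample data and the function name sweep_gzip_levels are made up for illustration. The percentage it prints assumes the figures above mean original size divided by compressed size, which the benchmarks I read did not spell out.

```python
import gzip
import time


def sweep_gzip_levels(data):
    """Compress data at gzip levels 1..9.

    Returns a list of (level, compressed_size_in_bytes, seconds) tuples.
    """
    results = []
    for level in range(1, 10):
        start = time.perf_counter()
        compressed = gzip.compress(data, compresslevel=level)
        elapsed = time.perf_counter() - start
        results.append((level, len(compressed), elapsed))
    return results


if __name__ == "__main__":
    # Hypothetical sample input; substitute the bytes of a real file.
    sample = b"<item id='1'>some repetitive XML-like text</item>\n" * 20000
    for level, size, seconds in sweep_gzip_levels(sample):
        # Assumption: percentage = original size / compressed size.
        ratio = 100.0 * len(sample) / size
        print(f"level {level}: {size} bytes ({ratio:.0f}%), {seconds:.4f}s")
```

Highly repetitive sample data like this compresses far better than real files, so the shape of the curve matters more than the absolute numbers.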
Another popular utility, bzip2, starts at about 130% compression according to the benchmarks and rises to about 155% or 160% at its highest level. The disadvantage is that it is much slower than gzip, by a factor of about 30 to 40.
The popular zip utilities found on most computers seem to compress about as well as gzip.
I ran my own test of these three utilities on a tar file containing image files as well as XML and XSL documents. The tar contained about 36 files, 22 of which were JPEG images. For that tar, gzip and zip performed best, each producing output about 20 KB smaller than bzip2. All three were run at their highest compression setting.
Generally bzip2 compresses files more than gzip, but not always. If you are really looking for the most compression and speed isn't important, you should probably compress the file with both and keep whichever result is smaller.
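The "try both and keep the smaller one" approach can be sketched roughly as follows. This uses Python's standard gzip and bz2 modules instead of the command-line tools, and the function name smallest_compression is my own invention for this example.

```python
import bz2
import gzip


def smallest_compression(data):
    """Compress data with gzip and bzip2 at their highest levels.

    Returns (name, compressed_bytes) for whichever output is smaller.
    """
    candidates = {
        "gzip": gzip.compress(data, compresslevel=9),
        "bzip2": bz2.compress(data, compresslevel=9),
    }
    # Pick the candidate with the fewest bytes.
    name = min(candidates, key=lambda key: len(candidates[key]))
    return name, candidates[name]


if __name__ == "__main__":
    # Hypothetical sample input; substitute the bytes of a real tar file.
    sample = b"<doc><title>example</title></doc>\n" * 10000
    winner, blob = smallest_compression(sample)
    print(f"{winner} wins with {len(blob)} bytes")
```

Both compressions run in full, so this costs the combined time of the two tools, which is only reasonable when speed isn't important.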
According to the benchmarks, 7za and lzma provide the highest compression levels, at about 425% to 430%, but require 100 to 150 times more time to compress.
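For a rough feel of the size and time tradeoff, here is a small sketch that compares gzip against LZMA on the same input. It relies on Python's standard lzma module, which writes the .xz container by default, so it is only a stand-in for the 7za and lzma programs and its numbers will not match theirs. The helper names timed and compare_gzip_lzma are hypothetical.

```python
import gzip
import lzma
import time


def timed(compress, data):
    """Run compress(data); return (compressed_size_in_bytes, seconds)."""
    start = time.perf_counter()
    output = compress(data)
    return len(output), time.perf_counter() - start


def compare_gzip_lzma(data):
    """Return {label: (size, seconds)} for gzip level 9 and LZMA preset 9."""
    return {
        "gzip level 9": timed(lambda d: gzip.compress(d, compresslevel=9), data),
        # Assumption: Python's lzma module as a stand-in for 7za/lzma.
        "lzma preset 9": timed(lambda d: lzma.compress(d, preset=9), data),
    }


if __name__ == "__main__":
    # Hypothetical sample input; substitute the bytes of a real file.
    sample = b"<entry>benchmark sample text</entry>\n" * 20000
    for label, (size, seconds) in compare_gzip_lzma(sample).items():
        print(f"{label}: {size} bytes in {seconds:.4f}s")
```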
Labels: compression, linux, os, OSS, unix