Lzip Frequently Asked Questions

1. What is lzip?

Lzip is the most advanced file compression utility ever conceived. It is literally years ahead of gzip (though admittedly gzip was around first), and makes use of mathematical transforms the bzip developers have never even heard of. The practical upshot is that when you use lzip, you get the best compression on the planet: smaller file sizes and faster compression and decompression times. Used properly, lzip is capable of reducing a file down to 0% of its original size. Yes, you read that correctly: 0% of its original size. And regardless of file size, this can be done in constant time. Now do you see why some people are calling lzip the "holy grail" of file utilities?

2. What makes lzip different from gzip/bzip2?

Well, other than the performance benefits mentioned above, the real difference is that lzip uses a "lossy" compression scheme. Most other file compression utilities use a "lossless" compression scheme, mostly because the lossless algorithms are better understood and simpler mathematically (most programmers take shortcuts, particularly in areas that involve a lot of math). Going lossy has two side effects. The first is that files compressed with lzip cannot be restored to their original state -- this is the "lossy" in lossy compression. The second is that the performance is vastly improved. Why don't you go back up to question number one and read that second paragraph again? We're talking about a constant-time algorithm that can reduce a file down to 0% of its original size. What's not to like?

3. What do you mean I can't restore my files?

Ha! A common misconception. You can restore your files after they have been compressed with lzip. They just won't be exactly the same as they were before. This makes sense when you think about it; if you lose a lot of weight suddenly, and then put the same weight back on suddenly, you wouldn't expect to be in exactly the same health that you were when you started, would you? Compression is a dramatic process, and dramatic processes often change people. It's no different for your files. On the reassuring side, it is important to note that the compression algorithm used by lzip only discards the unimportant data. And if it was unimportant before, what makes it so important now? Huh? In fact, many users may find that compressing their entire file system and then restoring it will be a good way to learn what is truly important.

4. What is lossy compression?

Simply put, a lossy compression algorithm is one in which not all of the data is preserved. The JPEG file format uses lossy compression. By contrast, the GIF format uses lossless compression. And just look at all the trouble that decision has caused. Specifically, lzip uses the Lessiss-Moore algorithm to do its compression. You specify the level of compression that you want on the command line, and lzip meets your needs by tweaking the algorithm. The algorithm used by lunzip is currently a modified version of the PLACeBO algorithm, although this may change with the next release.
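
For readers who want the lossy/lossless distinction spelled out, here is a minimal Python sketch. It is not the Lessiss-Moore algorithm or anything taken from lzip; the function names and the "keep the top bits of each byte" scheme are assumptions made purely for illustration. The lossless path uses the standard `zlib` module, and the lossy path simply throws away the low bits of every byte.

```python
import zlib


def lossless_round_trip(data: bytes) -> bytes:
    """Compress and decompress with zlib; the output equals the input."""
    return zlib.decompress(zlib.compress(data))


def lossy_round_trip(data: bytes, keep_bits: int = 4) -> bytes:
    """Hypothetical lossy scheme: keep only the top `keep_bits` bits of
    each byte and zero the rest. The discarded bits cannot be recovered."""
    mask = (0xFF << (8 - keep_bits)) & 0xFF
    return bytes(b & mask for b in data)


original = bytes(range(256))
print(lossless_round_trip(original) == original)  # lossless: identical
print(lossy_round_trip(original) == original)     # lossy: not identical
```

The lossy version is smaller in information content precisely because two different inputs can produce the same output, which is also why it cannot be undone.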

5. What are the benefits?

Numerous; numerous. The size factor, obviously, is a prime benefit. But on a deeper level, using lossy compression to manage your files is a way to learn something about yourself. You will most likely experience a feeling of euphoria or lightheadedness as you watch your free disk space cascade upwards to 100%. You will become bolder, have increased stamina, and adrenaline may make you temporarily impervious to pain. You may also gain a new appreciation for backup devices (this has been widely reported among the developers). Lossy compression has benefits that extend well beyond day-to-day file management. Our short list includes permanent (irretrievable) archiving, ultra-high-speed transfers over existing network lines, and high-security "steganographic" storage of sensitive information.

6. Are there any drawbacks?

Not that we know of. Occasionally, in the pre-1.0 days, someone would compress a file down to 0K and it would be lost for good. But that has been happening less and less frequently, and these days it has been a long time since we received any complaints from the people who reported this originally.


7. Why don't more people use lossy compression?

Probably because it is so new. The Lessiss-Moore algorithm that lzip uses was only invented a few days ago, and the decompression algorithm is even now still under development. There are also a lot of people who are just content to stay satisfied with the status quo. We call these people "lazy dopes." Where would the world be today if it weren't for go-getters and dreamers like Tom Edison, Karl Marx, Henry Ford, or even Voltaire or the Earl of Sandwich? Just reflect on that next time you're eating lunch, if you catch my drift.


8. What is the Lessiss-Moore algorithm?

The Lessiss-Moore algorithm was invented by Werner von Lessiss and R.T. Moore in the middle of the last century. I'm sorry; I meant to say the middle of last week. [note to nate: change this]. It utilizes a two-pass bit-sieve to first remove all unimportant data from the data set. Lzip implements this quite effectively by eliminating all of the 0's. It then sorts the remaining bits into increasing order, and begins searching for patterns. The number of passes in this search is set to (10-N) in lzip, where N is the numeric command-line argument we've been telling you about. For every pattern of length (10/N) found in the data set, the algorithm makes a mark in its hash table. By keeping the hash table small, we can reduce memory overhead. Lzip uses a two-entry hash table. The data in this table is then plotted in three dimensions, and a discrete cosine transform converts it into frequency and amplitude data. This data is filtered for sounds that are beyond the range of the human ear, and the result is transformed back (via an indiscrete cosine) into the hash table, in random order. Take each pattern in the original data set, XOR it with the log of its entry in the new hash table, then shuffle each byte two positions to the left and you're done! As you can see, there is some very advanced thinking going on here. It is no wonder this algorithm took so long to develop!
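
The first two steps described above (eliminate all of the 0's, then sort the remaining bits into increasing order) can be sketched in a few lines of Python. This is only a guess at what those steps might look like, not code from lzip; the function name `sieve_and_sort` is hypothetical, and the later stages (pattern search, hash table, cosine transforms) are deliberately left out.

```python
def sieve_and_sort(data: bytes) -> str:
    """Hypothetical sketch of the bit-sieve and sort steps only."""
    # Write the input out as a string of bits, eight per byte.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Bit-sieve: remove the "unimportant" data by dropping every 0.
    survivors = bits.replace("0", "")
    # Sort the remaining bits into increasing order (they are all 1's).
    return "".join(sorted(survivors))


print(sieve_and_sort(b"\x0f"))
print(sieve_and_sort(b"\xf0"))
```

Under this reading, the output depends only on how many 1 bits the input contained, so different inputs such as 0x0f and 0xf0 collapse to the same result. That may go some way toward explaining why running the filter in reverse has proved difficult.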


9. What is the PLACeBO algorithm?

PLACeBO was the lzip team's first attempt to implement the Lessiss-Moore compression filter in reverse. The results were less than astounding, however, as analysis has shown Lessiss-Moore to be a trapdoor function. In the end, PLACeBO may be abandoned in favor of something else. For now, however, it is the method used by lunzip to decompress lzip-compressed files, even if it has its flaws.


10. Since PLACeBO doesn't work, why does lzip use it?

It may not be perfect, but it is the best tool we have. I don't want anyone to get the wrong impression: just because PLACeBO doesn't work doesn't mean it can't be used. Lunzip makes up for the shortcomings in PLACeBO by patching in a couple of support functions. We use the Warren Interior Point Method from Operations Research to step backwards through the cosine transform. This method, alternatively known as the Warren "Dice-Prayer" method, is very useful in OR problems when you don't have the time or perhaps the willpower to work through Simplex. Applying it to our filtering problem was not straightforward, but late in the process we added fast Monte Carlo sorting to the mix and everything seems to have turned out fine.


11. What is the Free Object-Oriented License?

The Free Object-Oriented License (or FO2L, or "foo" license) is an Open Source license we created under which to release the code for lzip. Many people create their own licenses every day, and we figured we should take a look at the existing ones to see which best met our needs. Unfortunately, the creation of the Lzip logo graphic took a lot longer than expected, and we never got around to looking at the existing licenses. The FO2L is what we came up with on our own. You can read it here.

12. What is Free Software?

I'm not sure. I've heard a lot about it, though, so I'm going to assume that it's here to stay. We decided to include "Free" in the name of our license because we liked the way it sounded, and we needed an "F" for the acronym to come out how we wanted. A lot of sites that talk about free software seem to point here. When I get the chance, I plan to check it out myself someday.

13. How can I aid in the development of lzip?

We'd love to have you help out! Unfortunately, the odds are pretty low that you have anything good to offer. You have to be pretty smart to keep up with the lzip team. We're already staffed with really smart people, several of whom have quite a bit of experience writing software of this sort. Those who have computers (unlike myself) tell me that programming isn't really all that interesting anyway. But if you're still up to the challenge, please tell us in the discussion forum.


14. What are your plans for the future of lzip?

We have many plans, including creating a library in addition to the standalone program, and adding a GUI with a variety of themeable "skins." Your suggestions are welcome.