BLZ77 serves as the model, replacing repeated patterns with short references to reduce redundancy; Huffman serves as the entropy coder, compressing the references and literal symbols to near their entropy
CLZ77 handles text data; Huffman handles binary data
DBoth perform the same function redundantly for error correction
This is the model-coder separation central to modern compression. LZ77 identifies and eliminates sequential redundancy by replacing repeated substrings with (distance, length) pairs pointing back to earlier occurrences. This transforms the data into a new representation with lower entropy. Huffman (or arithmetic) coding then compresses this representation close to its entropy. DEFLATE (used in gzip, PNG, ZIP) uses exactly this pipeline. The model captures structure; the coder converts it to bits.
Question 2 Multiple Choice
A 1 MB file of truly random bytes (each byte uniformly and independently distributed) is fed to gzip. The output will be approximately:
BApproximately 1 MB plus a small overhead — random data has maximum entropy (8 bits/byte) and cannot be compressed
CExactly 0 bytes — gzip recognizes random data and discards it
DApproximately 0.5 MB — compression always achieves at least 50% reduction
Truly random bytes have entropy 8 bits per byte — the maximum possible for a byte-valued source. The source coding theorem guarantees no lossless compressor can reduce this on average. gzip will find no repeated patterns (LZ77 matches) and the byte frequencies will be nearly uniform (no Huffman advantage). The output will be the original data plus gzip headers and metadata — slightly LARGER than 1 MB. This is a fundamental limit, not a deficiency of gzip.
Question 3 Short Answer
Lossy compression can exceed the entropy limit that bounds lossless compression. Explain why this is not a contradiction.
Think about your answer, then reveal below.
Model answer: The entropy bound applies to LOSSLESS compression: if you must reconstruct the original data exactly, you need at least H bits per symbol. Lossy compression relaxes this requirement — it allows some distortion (error) in the reconstruction. By tolerating distortion D, you only need to distinguish between equivalence classes of sources that are within distance D of each other, not between every individual source. The number of distinguishable classes is smaller, requiring fewer bits. Rate-distortion theory formalizes this: the rate-distortion function R(D) gives the minimum bit rate needed to represent the source with average distortion at most D. R(0) = H (lossless), and R(D) decreases as D increases.
This is why JPEG (lossy) can compress a photo to 1/20th its raw size while PNG (lossless) might only achieve 1/2. Lossy compression exploits the fact that the human visual or auditory system cannot distinguish small differences, so the 'wasted' information is information the receiver doesn't need.