Arithmetic coding represents an entire message as a single number in the interval [0, 1), achieving compression rates arbitrarily close to the source entropy. Unlike Huffman coding, which assigns a discrete codeword to each symbol, arithmetic coding encodes sequences by successively narrowing a subinterval: each symbol shrinks the current interval proportionally to its probability. The final interval width is the product of all symbol probabilities, and specifying a point within it requires approximately -log2 of that product = sum of -log2(p_i) bits — exactly the information content. Arithmetic coding is theoretically optimal and is the basis of modern entropy coders like ANS (asymmetric numeral systems).
Huffman coding assigns integer-length codewords to individual symbols, which limits it to at most 1 bit above entropy per symbol. Arithmetic coding removes this limitation by encoding entire messages as single numbers, effectively achieving fractional bit lengths per symbol.
The idea is beautifully simple. The unit interval [0, 1) is partitioned among the alphabet symbols proportionally to their probabilities. The first symbol in the message selects the corresponding subinterval. That subinterval is then partitioned again in the same proportions, and the second symbol selects a sub-subinterval. This continues for every symbol. After processing the entire message, you have a tiny interval whose width equals the probability of the specific message (the product of all symbol probabilities for i.i.d. sources). To transmit the message, you send enough bits to uniquely identify a point inside that interval — approximately -log2(width) bits.
The magic is that -log2(product of probabilities) = sum of -log2(p_i), which is exactly the information content of the message. More probable messages produce wider intervals requiring fewer bits; improbable messages produce narrow intervals requiring more bits. Over many symbols, the average rate converges to the entropy H(X). The overhead is at most 2 bits for the entire message (to handle interval boundaries and termination), making the per-symbol overhead negligible for long sequences.
In practice, arithmetic coding operates on integers using finite-precision arithmetic, with a "renormalization" step that outputs bits as the interval narrows and shifts, keeping the working precision manageable. Modern variants like ANS (asymmetric numeral systems), used in Facebook's Zstandard and Apple's LZFSE, achieve the same theoretical optimality with higher throughput by encoding the state as a single integer rather than maintaining interval endpoints. Context-adaptive arithmetic coding (as in H.265/HEVC video compression) pairs the arithmetic coder with a sophisticated context model, achieving compression rates that track the conditional entropy given recent context — far surpassing what per-symbol Huffman can achieve.