A 1000-bit string x consists of '01' repeated 500 times. A 1000-bit string y was generated by fair coin flips. Which has higher Kolmogorov complexity?
Ax has higher complexity because it is longer
BBoth have the same complexity because both are 1000 bits
Cy almost certainly has higher complexity — x can be generated by a short program ('print 01 500 times'), while y has no short description and its shortest program is approximately y itself
Dx has higher complexity because repetitive patterns are harder to encode
K(x) is very small — something like 'repeat 01 500 times' is a program of maybe 30 bits that produces x. K(y) is approximately 1000 bits because a random string has no exploitable patterns — the shortest program that produces it essentially has to contain y as a literal. With high probability over y, K(y) >= 1000 - c for a small constant c. This is the formal sense in which random strings are 'incompressible' — their Kolmogorov complexity is approximately their length.
Question 2 True / False
Kolmogorov complexity K(x) is computable — given any string x, there exists an algorithm that outputs K(x).
TTrue
FFalse
Answer: False
Kolmogorov complexity is NOT computable. This can be proved by contradiction: if K were computable, you could construct a program that finds the first string with K(x) > n and outputs it, contradicting the fact that this program has length approximately log(n) << n. This uncomputability is not a practical limitation that better algorithms could overcome — it is a fundamental consequence of the halting problem. We can compute upper bounds on K(x) (by finding short programs that output x), but we can never verify that we have found the SHORTEST program.
Question 3 Short Answer
Explain the relationship between Kolmogorov complexity and Shannon entropy. When does K(x) relate to H(X)?
Think about your answer, then reveal below.
Model answer: For a random variable X with distribution P, the expected Kolmogorov complexity E[K(X)] is approximately H(X) (up to an additive constant). Shannon entropy measures the average information content across a distribution; Kolmogorov complexity measures the information content of individual strings. Typical strings (in the AEP sense) from a source with entropy H have K approximately nH — they are individually incompressible at the level predicted by the source's entropy. The key difference: Shannon entropy requires a probability model; Kolmogorov complexity is model-free. Shannon theory says 'on average, strings from this distribution need H bits'; Kolmogorov theory says 'this particular string x needs K(x) bits, period.'
The connection is deep: Shannon's source coding theorem says most strings drawn from an entropy-H source need about nH bits. Kolmogorov complexity identifies exactly which strings need more or fewer bits — but at the cost of uncomputability. In practice, Shannon entropy is computable and sufficient for communication system design; Kolmogorov complexity provides the theoretical foundation for individual-sequence randomness.