A string x of length n is called 'c-incompressible' if K(x) >= n - c. What fraction of n-bit strings are c-incompressible?
AA negligible fraction — most strings have short descriptions
BAt least (1 - 2^(-c)) of all n-bit strings — by counting, there are fewer than 2^(n-c) programs shorter than n-c bits, so at most a fraction 2^(-c) of n-bit strings can be compressed by c or more bits
CExactly half of all strings
DIt depends on the choice of universal Turing machine
There are 2^n strings of length n but at most 2^0 + 2^1 + ... + 2^(n-c-1) < 2^(n-c) programs of length less than n-c. So at most 2^(n-c) strings can have K(x) < n-c, meaning at most a fraction 2^(-c) are compressible by c bits. The remaining fraction 1 - 2^(-c) are c-incompressible. For c = 10, over 99.9% of strings are incompressible. This is the counting argument: randomness (incompressibility) is the norm, not the exception.
Question 2 True / False
Chaitin's halting probability Omega is a well-defined real number that can be approximated from below but never computed exactly.
TTrue
FFalse
Answer: True
Omega = sum over halting programs p of 2^(-|p|) is perfectly well-defined — it is the probability that a prefix-free universal Turing machine halts on a random input. It can be approximated from below by running more programs and adding the contributions of those that halt. But it cannot be computed to arbitrary precision, because knowing the first n bits of Omega would solve the halting problem for all programs of length n. Omega is an example of a 'computably enumerable but not computable' real number — you can get closer from below but never know when you've converged.
Question 3 Short Answer
Explain how Solomonoff induction uses Kolmogorov complexity to define a universal theory of prediction, and why it is uncomputable but theoretically important.
Think about your answer, then reveal below.
Model answer: Solomonoff induction assigns prior probability to a hypothesis (program) p proportional to 2^(-|p|) — shorter programs get higher prior weight. The predictive distribution for the next observation given past data is the mixture over all programs consistent with the data, weighted by this prior. This is 'universal' because it converges to the true distribution regardless of what the true data-generating process is (as long as it is computable). It is uncomputable because it requires summing over all programs and determining which ones are consistent with the data (equivalent to solving the halting problem). Despite uncomputability, Solomonoff induction provides the theoretical gold standard for prediction: any computable predictor can be shown to be at most a constant factor worse.
Solomonoff induction bridges Kolmogorov complexity and Bayesian inference. The 'universal prior' 2^(-K(x)) is the shortest-description analog of Occam's razor: simpler hypotheses get more weight. This directly inspires the practical minimum description length (MDL) principle in statistics, where model selection favors models that compress the data most effectively.