A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Kolmogorov Complexity

Graduate · Depth 92 in the knowledge graph

5 topics build on this

528 prerequisites beneath it

algorithmic-information-theory randomness descriptional-complexity

Core Idea

The Kolmogorov complexity K(x) of a string x is the length of the shortest program that outputs x on a fixed universal Turing machine. It provides an objective measure of the information content or 'randomness' of a string — a string is random if its shortest description is roughly as long as itself. Kolmogorov complexity is uncomputable: no algorithm can compute K(x) for all x. It has deep connections to data compression, statistical inference, and the mathematical foundations of probability.

How It's Best Learned

Start with concrete examples: a string of one million zeros has very low Kolmogorov complexity (a short program generates it), while a truly random string of the same length likely requires a program nearly as long as itself. Prove the incompressibility lemma to rigorously establish that most strings are almost incompressible.
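The counting step behind the incompressibility lemma can be checked with a few lines of arithmetic. This is a minimal sketch, assuming programs are plain binary strings; the function name `max_compressible_fraction` is illustrative and not from any library.

```python
from fractions import Fraction

def max_compressible_fraction(n: int, c: int) -> Fraction:
    """Upper bound on the fraction of n-bit strings x with K(x) < n - c."""
    if not 0 <= c <= n:
        raise ValueError("need 0 <= c <= n")
    # Binary programs of length < n - c number
    # 2^0 + 2^1 + ... + 2^(n-c-1) = 2^(n-c) - 1,
    # and each program outputs at most one string.
    short_programs = 2 ** (n - c) - 1
    return Fraction(short_programs, 2 ** n)

# Fewer than 1 in 2^c strings can be compressed by more than c bits.
print(max_compressible_fraction(20, 10) < Fraction(1, 2 ** 10))  # True
```

The bound does not depend on the choice of universal machine: it only uses the fact that there are too few short programs to go around.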

Common Misconceptions

Kolmogorov complexity depends on the choice of universal Turing machine, but only up to an additive constant (the invariance theorem), making it machine-independent up to a fixed offset.
Randomness in the Kolmogorov sense is a property of individual strings, not of a probability distribution — a specific string either is or is not complex, regardless of how it was generated.

Explainer

You already understand Turing machines as universal computing devices. Kolmogorov complexity asks a different question about them: not "what can be computed?" but "how concisely can something be described?" The Kolmogorov complexity K(x) of a string x is the length (in bits) of the shortest program that, running on a fixed universal Turing machine, produces x and halts. It is the algorithmic analog of information content.

The intuition is immediate with examples. A string of one million zeros has K(x) of roughly log₂(1,000,000) ≈ 20 bits plus a constant: the program only needs to encode the number one million and the instruction "print zero that many times." But a string produced by fair coin flips has, with overwhelming probability, no description much shorter than itself: the string *is* essentially the most compact representation of itself. Such strings are called incompressible or, in Kolmogorov's sense, random. Note the paradox: most strings are random in this sense, yet you cannot single out a specific one with a short rule, because any short specification that pins down a string is itself a short description of it.
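A general-purpose compressor gives a rough, hands-on feel for this contrast. The sketch below uses zlib's compressed length as a stand-in; this is an assumption for illustration only, since a compressor yields at best an upper bound on K(x) (up to the constant size of the decompressor) and never the true value. It also works on a million zero bytes rather than a million bits, and uses `os.urandom` in place of coin flips.

```python
import os
import zlib

def compressed_size(data: bytes) -> int:
    """Length in bytes of zlib-compressed data: an upper-bound proxy, not K itself."""
    return len(zlib.compress(data, 9))

n = 1_000_000
zeros = bytes(n)            # n zero bytes: highly regular
coin_flips = os.urandom(n)  # OS randomness standing in for fair coin flips

print("zeros :", compressed_size(zeros), "bytes")
print("random:", compressed_size(coin_flips), "bytes")
```

The regular input shrinks to a tiny fraction of its size, while the random input does not shrink at all. The reverse inference is not valid: a string that zlib fails to compress may still have a short description that zlib cannot find, such as the output of a pseudo-random generator with a known seed.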

The invariance theorem resolves the worry that K depends on which universal Turing machine you choose. If U and V are two universal machines, then |K_U(x) − K_V(x)| ≤ c_UV for a constant c_UV that depends only on the two machines, not on x. So the complexity of any string is machine-independent up to a fixed additive constant — a constant that becomes negligible for long strings. This makes K an objective property of the string itself, not of any particular computational formalism.
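The standard argument behind the invariance theorem can be written out in a few lines. This is a sketch under the assumption that U can simulate V when given a fixed, self-delimiting interpreter prefix; the symbols s_V and s_U for those prefixes are introduced here for illustration and do not appear in the text above.

```latex
\begin{align*}
K_U(x) &= \min\{\, |p| : U(p) = x \,\}
  && \text{(definition)} \\
U(s_V\, p) &= V(p) \quad \text{for all } p
  && \text{($U$ simulates $V$ via the prefix $s_V$)} \\
K_U(x) &\le K_V(x) + |s_V|
  && \text{(prepend $s_V$ to a shortest $V$-program for $x$)} \\
K_V(x) &\le K_U(x) + |s_U|
  && \text{(same argument with the roles swapped)} \\
|K_U(x) - K_V(x)| &\le c_{UV} = \max(|s_U|, |s_V|)
\end{align*}
```

The constant is the length of an interpreter, so it can be large in practice, but it is the same for every x.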

Kolmogorov complexity is uncomputable. The proof formalizes the Berry paradox: if K were computable, a short program could take a number n, enumerate strings in order, and output the first string x with K(x) > n. That program has length about log₂(n) plus a constant, so for large n it describes x in far fewer than n bits, contradicting K(x) > n. This is closely related to the unsolvability of the halting problem you already know: computing K by brute force requires deciding whether short programs halt, which is in general undecidable. Kolmogorov complexity is therefore a theoretical tool rather than a practical compression algorithm; its power lies in the arguments it enables, not in computing it directly.
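The contradiction above can be pictured as a search procedure. The sketch below takes a complexity function as a parameter; since no computable K exists, it is run here with `toy_complexity`, a hypothetical stand-in that simply returns the string's length and is not Kolmogorov complexity. All names are illustrative.

```python
from itertools import count, product
from typing import Callable, Iterator

def binary_strings() -> Iterator[str]:
    """Yield every binary string in length-then-lexicographic order."""
    yield ""
    for length in count(1):
        for bits in product("01", repeat=length):
            yield "".join(bits)

def first_string_above(n: int, complexity: Callable[[str], int]) -> str:
    """Return the first enumerated string whose claimed complexity exceeds n."""
    for s in binary_strings():
        if complexity(s) > n:
            return s
    raise AssertionError("unreachable: the enumeration never ends")

def toy_complexity(s: str) -> int:
    # Hypothetical stand-in so the sketch runs; NOT Kolmogorov complexity.
    return len(s)

print(first_string_above(3, toy_complexity))  # prints 0000
```

The search code has fixed size, and the only part that grows with n is the encoding of n itself, about log₂(n) bits. If a real, computable K could be plugged in, the procedure would output a string of complexity above n from a description of roughly log₂(n) bits, which is impossible for large n.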

The incompressibility method is one of the most useful proof techniques that Kolmogorov complexity provides. To prove a combinatorial lower bound, you fix an incompressible object (one exists by counting), assume for contradiction that the bound fails, and show that the failure would yield a description of the object shorter than its complexity. Many lower bounds in combinatorics, data structures, and communication complexity have clean proofs via this method. Kolmogorov complexity also gives a rigorous foundation for randomness: an infinite sequence of bits is Martin-Löf random if and only if every prefix has prefix-free Kolmogorov complexity at least its length minus a constant (the Levin–Schnorr theorem), capturing the intuition that random sequences have no exploitable pattern.

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Introduction to Exponents → Order of Operations → Integer Order of Operations → Variable Expressions → The Distributive Property → Variables and Expressions Review → Introduction to Polynomials → Adding and Subtracting Polynomials → Multiplying Polynomials → Factorial → Permutations → Combinations → Counting Principles: Addition and Multiplication Rules → Introduction to Graph Theory → Propositional Logic Foundations → Logical Equivalences → 
Boolean Algebra → Introduction to Propositional Logic → Introduction to Predicate Logic (First-Order Logic) → First-Order Logic Syntax → ZFC Axioms Overview → Axiom Schema of Separation → Axiom Schema of Replacement → Von Neumann Ordinals → Hereditarily Finite Sets → Recursive Definitions on Finite Sets → Well-Founded Relations and Transfinite Recursion → The Axiom of Choice and Equivalent Formulations → Axiom of Choice → Well-Ordering Theorem → Infinite Cardinal Numbers → Cantor's Theorem → Uncountability and the Diagonal Argument → The Cantor Set: An Uncountable Nowhere Dense Example → Uncountable Sets and Cantor Diagonalization → The Halting Problem → Computability Reductions → Post Correspondence Problem → Rice's Theorem → Recursively Enumerable and Co-RE Languages → Kolmogorov Complexity

Longest path: 93 steps · 528 total prerequisite topics

Prerequisites (6)

Turing Machines (hard)
Recursively Enumerable and Co-RE Languages (soft)
Cardinality and Countability (soft)
Big-O Notation and Asymptotic Analysis (soft)
Combinations (soft)
Algorithm Analysis and Complexity Classes (soft)

Leads To (1)

Algorithmic Information Theory (hard)