A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Viterbi Algorithm

Research · Depth 97 in the knowledge graph

613 prerequisites beneath it

Core Idea

Viterbi uses dynamic programming to find the most likely hidden state sequence in an HMM given a sequence of observations. For each state at each time step it keeps only the highest-probability path ending there and discards the rest, which takes O(T × N²) time for T observations and N states.

Explainer

You know from Hidden Markov Models that a system has hidden states generating observable outputs: for example, weather conditions (hidden) producing observable activity choices, or part-of-speech tags (hidden) generating observed words. Given a sequence of observations, a natural question is: what is the most likely sequence of hidden states that produced them? A brute-force approach would enumerate every possible state sequence, compute its probability, and pick the best one, but with N possible states and T time steps that means N^T candidates, a count that grows exponentially in T and quickly becomes intractable. The Viterbi algorithm solves this problem in O(T × N²) time using dynamic programming.
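To make the N^T blow-up concrete, here is a minimal brute-force sketch. The weather model, its probabilities, and the name brute_force_decode are all hypothetical, chosen only for illustration; the point is that the loop visits every one of the N^T candidate sequences.

```python
from itertools import product

def brute_force_decode(obs, states, start_p, trans_p, emit_p):
    """Score every possible state sequence; return (best_path, best_prob, n_candidates)."""
    best_path, best_prob, n_candidates = None, -1.0, 0
    for path in product(states, repeat=len(obs)):  # N**T candidate sequences
        n_candidates += 1
        # Joint probability of this state sequence and the observations.
        prob = start_p[path[0]] * emit_p[path[0]][obs[0]]
        for t in range(1, len(obs)):
            prob *= trans_p[path[t - 1]][path[t]] * emit_p[path[t]][obs[t]]
        if prob > best_prob:
            best_path, best_prob = list(path), prob
    return best_path, best_prob, n_candidates

# Hypothetical toy model (illustrative numbers, not taken from the text above).
states = ["Rainy", "Sunny"]
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
obs = ["walk", "shop", "clean"]

path, prob, n = brute_force_decode(obs, states, start_p, trans_p, emit_p)
print(path, prob, n)  # n is 2**3 = 8 here; it would be 2**30 for 30 observations
```

With N = 2 and T = 3 this is only 8 sequences, but each extra observation doubles the count, which is why this approach does not scale.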

The key insight, which you will recognize from your study of dynamic programming, is optimal substructure: the most likely path ending in state sⱼ at time t must consist of the most likely path ending in some state sᵢ at time t−1, followed by a transition from sᵢ to sⱼ. You do not need to consider every path that reaches time t−1, only the best one ending in each state. At each time step t, the algorithm maintains a table δₜ(j) holding the joint probability of the most likely state path that ends in state j at time t together with the observations o₁, …, oₜ seen so far. The recursion is: δₜ(j) = maxᵢ [δₜ₋₁(i) × transition(i→j) × emission(j→oₜ)]. You also store backpointers recording which predecessor state i achieved the maximum, so you can reconstruct the full path at the end.
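The recursion and its backpointers can be written compactly as follows. The symbols a_{ij} for transition(i→j), b_j(o_t) for emission(j→oₜ), and ψ for the backpointer table are assumed notation, common in HMM texts but not used in the explanation above.

```latex
\begin{align*}
\delta_t(j) &= \Bigl[\max_{1 \le i \le N} \delta_{t-1}(i)\, a_{ij}\Bigr]\, b_j(o_t), \\
\psi_t(j)   &= \operatorname*{arg\,max}_{1 \le i \le N} \delta_{t-1}(i)\, a_{ij},
\qquad t = 2, \dots, T.
\end{align*}
```

The emission term b_j(o_t) does not depend on the predecessor i, so it can sit outside the maximization; this is the same recursion as above, only rearranged.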

The algorithm proceeds left to right through the observation sequence. Initialization sets δ₁(j) = π(j) × emission(j→o₁) for each state j, where π(j) is the initial state probability. Recursion fills in each subsequent column of the table using the formula above, taking O(N²) per time step (for each of N states, you maximize over N predecessors). Termination finds the state with the highest δ_T value, and backtracking follows the stored pointers backward from that state to recover the most likely complete path.
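The four phases above (initialization, recursion, termination, backtracking) can be sketched in a few lines of Python. This is a minimal illustration, assuming the model is given as nested dictionaries; the function name viterbi and the toy weather model are hypothetical, and the code multiplies raw probabilities, which is fine for short sequences but can underflow on long ones.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (most likely state path, its joint probability with obs)."""
    # Initialization: delta_1(j) = pi(j) * emission(j -> o_1)
    delta = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]  # back[t][j] = best predecessor of state j at time t

    # Recursion: N states, each maximizing over N predecessors, so O(N^2) per step.
    for t in range(1, len(obs)):
        delta.append({})
        back.append({})
        for j in states:
            best_i = max(states, key=lambda i: delta[t - 1][i] * trans_p[i][j])
            delta[t][j] = delta[t - 1][best_i] * trans_p[best_i][j] * emit_p[j][obs[t]]
            back[t][j] = best_i

    # Termination: pick the final state with the highest delta_T value.
    last = max(states, key=lambda s: delta[-1][s])

    # Backtracking: follow the stored pointers from the last step to the first.
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, delta[-1][last]

# Hypothetical toy model (illustrative numbers only).
states = ["Rainy", "Sunny"]
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

best_path, best_prob = viterbi(["walk", "shop", "clean"], states, start_p, trans_p, emit_p)
print(best_path, best_prob)
```

Storing the full delta table keeps the sketch close to the description; only the previous column is actually needed for the recursion, while the backpointers must be kept for every time step.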

The Viterbi algorithm appears throughout computer science and engineering: in speech recognition (decoding the most likely phoneme sequence), natural language processing (part-of-speech tagging), bioinformatics (gene finding), and digital communications (decoding convolutional codes). Its efficiency comes from the principle behind dynamic programming in general: exponentially many candidate solutions share overlapping subproblems, and you only need to keep the best partial solution reaching each intermediate state. Viterbi can also be read as a shortest-path search through a trellis graph, which makes the connection to dynamic programming concrete: the trellis has N nodes per time step, and if each step is weighted by the negative log of its probability, the shortest path from any start node to any end node is exactly the highest-probability state sequence.
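The shortest-path reading can be sketched directly by replacing probabilities with negative log costs, so that products become sums and max becomes min. This is a minimal illustration under assumed names (neg_log, viterbi_min_cost) and a hypothetical toy model; working in log space is also a common way to avoid numerical underflow on long sequences.

```python
import math

def neg_log(p):
    """Edge cost for probability p; impossible events get infinite cost."""
    return math.inf if p == 0 else -math.log(p)

def viterbi_min_cost(obs, states, start_p, trans_p, emit_p):
    """Return (cheapest state path through the trellis, its total cost)."""
    # Cost of reaching each node in the first trellis column.
    cost = {s: neg_log(start_p[s]) + neg_log(emit_p[s][obs[0]]) for s in states}
    back = []  # one dict of predecessor pointers per later column
    for o in obs[1:]:
        new_cost, pointers = {}, {}
        for j in states:
            # Cheapest way into node j from any node in the previous column.
            i_best = min(states, key=lambda i: cost[i] + neg_log(trans_p[i][j]))
            new_cost[j] = cost[i_best] + neg_log(trans_p[i_best][j]) + neg_log(emit_p[j][o])
            pointers[j] = i_best
        cost = new_cost
        back.append(pointers)
    # Cheapest end node, then walk the pointers backward.
    last = min(states, key=lambda s: cost[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, cost[last]

# Hypothetical toy model (illustrative numbers only).
states = ["Rainy", "Sunny"]
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

cheapest_path, total_cost = viterbi_min_cost(["walk", "shop", "clean"],
                                             states, start_p, trans_p, emit_p)
print(cheapest_path, math.exp(-total_cost))  # exp(-cost) recovers the path probability
```

Because the negative log is strictly decreasing, minimizing the summed cost selects the same path as maximizing the product of probabilities.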

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Dividing Integers → Unit Rates → Proportions → Percent Concept → Converting Between Fractions, Decimals, and Percents → Operations with Rational Numbers → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → 
Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Sine, Cosine, and Tangent Ratios → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Probability Density Functions → Expected Value → Weak Law of Large Numbers → Probability Axioms and Rules → Conditional Probability → Conditional Distributions → Conditional Expectation → Markov Chains → Hidden Markov Models → Viterbi Algorithm

Longest path: 98 steps · 613 total prerequisite topics

Prerequisites (2)

Hidden Markov Models (hard) · Dynamic Programming (hard)

Leads To (0)

No topics depend on this one yet.