A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Edit Distance: Levenshtein Distance and DP

College · Depth 90 in the knowledge graph

50 topics build on this

470 prerequisites beneath it

Core Idea

Edit distance (Levenshtein distance) is the minimum number of single-character edits (insert, delete, replace) needed to transform one string into another. For strings of length m and n, the standard DP solves it in O(mn) time and O(mn) space. Applications include spell checking, sequence alignment, and DNA comparison.

How It's Best Learned

Implement the DP recurrence: dp[i][j] = min(dp[i-1][j]+1, dp[i][j-1]+1, dp[i-1][j-1] + cost), where cost is 0 when the two characters being compared match and 1 otherwise. Trace on short strings. Optimize space to O(min(m,n)) using rolling arrays.
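As a starting point for that exercise, here is a minimal sketch of the recurrence with a full table. The function name `levenshtein` and the 0-indexed string access (`a[i - 1]`, `b[j - 1]`) are choices made for this illustration, not something the topic prescribes.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b using the full (m+1) x (n+1) table."""
    m, n = len(a), len(b)
    # dp[i][j] = edit distance between the prefixes a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all i characters of a[:i]
    for j in range(n + 1):
        dp[0][j] = j  # insert all j characters of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # delete a[i-1]
                dp[i][j - 1] + 1,         # insert b[j-1]
                dp[i - 1][j - 1] + cost,  # match (cost 0) or replace (cost 1)
            )
    return dp[m][n]


print(levenshtein("kitten", "sitting"))  # 3
```

Tracing the table by hand for two three-letter words and comparing against the function's result is a quick way to check your understanding.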

Common Misconceptions

Confusing edit distance with longest common subsequence; they're related but distinct.
Not understanding the three operations (insert, delete, replace) and their costs.
Assuming O(mn) space is necessary; space optimization often applies.
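The first misconception can be checked numerically. The sketch below assumes two helper names, `lcs_length` and `levenshtein`, written just for this comparison. It relies on the fact that if only insertions and deletions are allowed, the distance is m + n - 2·LCS; Levenshtein distance can be smaller because it also allows replacement.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]


def levenshtein(a: str, b: str) -> int:
    """Edit distance with insert, delete, and replace, each costing 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i] + [0] * len(b)
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr[j] = min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost)
        prev = curr
    return prev[len(b)]


a, b = "kitten", "sitting"
lcs = lcs_length(a, b)
indel_only = len(a) + len(b) - 2 * lcs  # distance if replace is not allowed
print(lcs, indel_only, levenshtein(a, b))  # 4 5 3
```

The two tables are filled in the same order, which is why the problems feel alike, but LCS maximizes matches while edit distance minimizes operations.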

Explainer

From your study of dynamic programming, you know the core technique: break a problem into overlapping subproblems, solve each one once, and store the results in a table. The edit distance problem (also called Levenshtein distance) is one of the cleanest applications of this idea. Given two strings — say "kitten" and "sitting" — the edit distance is the minimum number of single-character operations (insert, delete, or replace) needed to transform one string into the other. For "kitten" → "sitting," the answer is 3: replace k→s, replace e→i, insert g.

The DP solution builds a 2D table where dp[i][j] represents the edit distance between the first i characters of string A and the first j characters of string B. The base cases are straightforward: dp[i][0] = i (deleting all i characters from A to reach an empty B) and dp[0][j] = j (inserting all j characters of B into an empty A). For the general case, you compare the i-th character of A with the j-th character of B (written A[i] and B[j] here, counting from 1; in 0-indexed code these are A[i-1] and B[j-1]). If they match, no operation is needed and dp[i][j] = dp[i-1][j-1]. If they differ, you take the minimum of three choices: replace A[i] with B[j] (cost = dp[i-1][j-1] + 1), delete A[i] (cost = dp[i-1][j] + 1), or insert B[j] after A[i] (cost = dp[i][j-1] + 1). Each cell in the table depends only on the cell above, to the left, and diagonally above-left.
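Written compactly, the base cases and the general case described above look like this. The notation is an assumption made for this summary: D(i, j) stands for dp[i][j], characters are indexed from 1, and the bracket [A_i ≠ B_j] is 1 when the characters differ and 0 when they match.

```latex
\[
D(i, 0) = i, \qquad D(0, j) = j,
\]
\[
D(i, j) = \min
\begin{cases}
D(i-1, j) + 1 & \text{delete } A_i \\
D(i, j-1) + 1 & \text{insert } B_j \\
D(i-1, j-1) + [A_i \neq B_j] & \text{match or replace}
\end{cases}
\qquad 1 \le i \le m,\; 1 \le j \le n.
\]
```

When the characters match, the third line equals D(i-1, j-1), which is never larger than the other two options, so taking the minimum of all three gives the same result as the "if they match" shortcut.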

Walking through a small example makes this concrete. For A = "cat" and B = "car," the table is 4×4 (one extra row and column for the empty prefixes). Along the main diagonal, c = c gives dp[1][1] = 0, a = a gives dp[2][2] = 0, and t ≠ r forces one replacement, so dp[3][3] = 1, confirming that one replacement transforms "cat" into "car." Longer strings fill the table the same way, and because each cell is computed once in constant time, the total cost is O(mn) time for strings of length m and n.
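To see every cell of that example rather than just the diagonal, the following sketch builds and prints the whole table. The helper name `edit_table` is hypothetical and chosen for this illustration.

```python
def edit_table(a: str, b: str) -> list[list[int]]:
    """Return the full DP table; entry [i][j] is the distance a[:i] -> b[:j]."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,
                           dp[i][j - 1] + 1,
                           dp[i - 1][j - 1] + cost)
    return dp


# Rows correspond to prefixes of "cat", columns to prefixes of "car".
for row in edit_table("cat", "car"):
    print(" ".join(str(v) for v in row))
# 0 1 2 3
# 1 0 1 2
# 2 1 0 1
# 3 2 1 1
```

The bottom-right entry is the answer, and the first row and column are the base cases.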

The applications are broad. Spell checkers use edit distance to rank correction candidates. Note that plain Levenshtein distance treats a swap of two adjacent characters as two edits: "teh" is at distance 2 from "the" but distance 1 from "tea" (replace h with a). Spell checkers that want "the" ranked first often use the Damerau-Levenshtein variant, which counts an adjacent transposition as a single edit. Bioinformatics uses a generalized version (sequence alignment) where different operations have different costs to compare DNA sequences and identify mutations. Search engines use it to handle typos in queries. The space optimization is also worth knowing: since each row of the table depends only on the previous row, you can keep two rolling arrays of length min(m, n) + 1 (with the shorter string indexing the columns) instead of the full (m + 1) × (n + 1) table, reducing space from O(mn) to O(min(m, n)) while keeping the same O(mn) time complexity.
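The rolling-array idea can be sketched as follows. The function name `levenshtein_two_rows` is an assumption for this example; the swap at the top relies on edit distance being symmetric in its two arguments, so the shorter string can always index the columns.

```python
def levenshtein_two_rows(a: str, b: str) -> int:
    """Edit distance using two rows of length min(len(a), len(b)) + 1."""
    # Put the shorter string along the columns; the distance is symmetric.
    if len(b) > len(a):
        a, b = b, a
    prev = list(range(len(b) + 1))       # row 0: dp[0][j] = j
    for i, ca in enumerate(a, start=1):
        curr = [i] + [0] * len(b)        # dp[i][0] = i
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr[j] = min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or replace
            )
        prev = curr                  # the finished row becomes "previous"
    return prev[len(b)]


print(levenshtein_two_rows("kitten", "sitting"))  # 3
```

The trade-off is that only the final distance survives. If you also need the sequence of edits, the full table (or a more elaborate technique) is required to trace back through the choices.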

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Introduction to Exponents → Order of Operations → Integer Order of Operations → Variable Expressions → The Distributive Property → Variables and Expressions Review → Introduction to Polynomials → Adding and Subtracting Polynomials → Multiplying Polynomials → Factorial → Permutations → Combinations → Counting Principles: Addition and Multiplication Rules → Introduction to Graph Theory → Propositional Logic Foundations → Logical Equivalences → 
Boolean Algebra → Boolean Type and Truth Values → Comparison Operators and Boolean Tests → Logical Operators and Boolean Algebra → Conditional Statements → Defining and Calling Functions → Functions: Decomposing Problems → Function Parameters and Argument Passing → Return Values → Variable Scope → Introduction to Classes → Objects and Instances → Methods and Attributes → Algorithm Design Basics → Tree Structure and Node Properties → Binary Trees → Tree Traversals → Depth-First Search (DFS) → Depth-First Search: Implementation and Applications → Topological Sort → Dynamic Programming → Longest Common Subsequence (LCS) Problem → Edit Distance: Levenshtein Distance and DP

Longest path: 91 steps · 470 total prerequisite topics

Prerequisites (2)

Dynamic Programming (hard) · Longest Common Subsequence (LCS) Problem (soft)

Leads To (2)

0/1 Knapsack Problem: Bounded Capacity DP (soft) · Floyd-Warshall Algorithm for All-Pairs Shortest Paths (soft)