Dynamic programming (DP) solves optimization and counting problems by breaking them into overlapping subproblems and storing solutions to avoid redundant computation. Two key properties must hold: optimal substructure (the optimal solution contains optimal solutions to subproblems) and overlapping subproblems (the same subproblems recur many times). Classic examples include Fibonacci, 0/1 knapsack, longest common subsequence, and coin change. DP transforms exponential naive recursion into polynomial time by caching intermediate results.
Start with memoized Fibonacci to see the speedup from caching in isolation. Then tackle structured DP problems: coin change, longest common subsequence, 0/1 knapsack. For each, explicitly define the subproblem before writing any code.
Recursion solves a problem by reducing it to smaller instances of itself. The danger is that naive recursion often recomputes the same smaller instances many times. Consider computing Fibonacci(50) recursively: Fibonacci(3) gets recalculated billions of times because the call tree branches at every level, producing exponential work. Dynamic programming patches this by storing each result the first time it is computed and looking it up instead of recomputing — a technique called memoization.
Two structural properties must hold for DP to apply. Optimal substructure means the solution to the full problem can be assembled from optimal solutions to subproblems. The shortest path from A to C through B must use the shortest path from A to B as a segment — if that sub-path were suboptimal, you could improve the full path, contradicting its optimality. Overlapping subproblems means the same sub-instances appear repeatedly across the recursion tree. Without overlap, caching adds overhead with no benefit; this is why merge sort, which splits arrays into non-overlapping halves, is divide-and-conquer rather than DP.
The hardest part of dynamic programming is not the code — it is defining the subproblem correctly before you write a line. You should be able to state "dp[i] means ___" in a precise sentence. For coin change, the right definition is: "dp[i] = the minimum number of coins needed to make exactly amount i." Given that, the recurrence follows mechanically: dp[i] = 1 + min over all coins c ≤ i of dp[i - c]. A vague definition like "dp[i] represents the coins used so far" produces an ambiguous recurrence and subtle bugs that are very hard to debug.
Once the subproblem is defined, you can implement it two ways. Top-down (memoization): write the natural recursive solution, add a cache (array or hash map), and check the cache before computing. The code reads like the mathematical recurrence and only computes subproblems that are actually needed. Bottom-up (tabulation): fill a table starting from base cases, computing each entry using previously filled entries. This avoids recursion stack overhead and often performs better in practice. Both have the same asymptotic time complexity — they solve each of the O(n) (or O(n²), etc.) subproblems exactly once.
Your background in recurrence relations connects directly: the DP recurrence is that relation made computational. Your knowledge of time-space complexity explains why DP matters — the transformation from exponential naive recursion to polynomial time, achieved by eliminating redundant computation. And mathematical induction provides the proof structure: show the base case, assume all smaller subproblems are solved correctly, and verify the recurrence step is correct. The inductive structure of correctness proofs and the recursive structure of DP are two sides of the same coin.