Big-O notation describes asymptotic upper bounds: f(n) ∈ O(g(n)) if f(n) ≤ cg(n) for large n and constant c. It abstracts away constant factors and lower-order terms. Big-Θ and Big-Ω provide tighter and lower bounds respectively.
Start with simple functions like n, n², 2^n. Compare growth rates by computing limits and building intuition.
When analyzing algorithms, we want to understand how running time grows as the input size n increases — not the exact number of operations, which depends on the hardware, compiler, and implementation details. Big-O notation provides a language for this: we say f(n) ∈ O(g(n)) if there exist constants c and n₀ such that f(n) ≤ c·g(n) for all n ≥ n₀. In plain terms, g(n) is an upper bound on f(n) for large inputs, up to a constant multiplier.
The crucial feature of Big-O is what it deliberately ignores: constant factors and lower-order terms. The function 5n² + 1000n + 7 is O(n²) because for large enough n, the n² term dominates everything else. This simplification is intentional — it focuses attention on the fundamental growth rate, which is what matters when n is very large. The cost is that Big-O cannot compare two O(n²) algorithms; they might differ by a factor of 100 in practice.
Big-O (O) provides only an upper bound. Two companion notations give more precision: Ω(g(n)) is a lower bound — f grows at least as fast as g — and Θ(g(n)) means both, so f grows at exactly the rate g. Most introductory analysis establishes O-bounds (they are easier to prove), but Θ is the more informative statement. When someone says "merge sort runs in O(n log n)", they usually mean Θ(n log n): the best and worst cases both scale as n log n.
A common trap is confusing what Big-O says about an algorithm's speed versus an algorithm's input. O(n²) does not mean "slow for all inputs" — for n = 10, an O(n²) algorithm performs at most 100 operations (scaled by a constant). Asymptotic notation only becomes meaningful as n grows large. For small n, empirical profiling and constant factors matter far more than the Big-O class. Developing judgment about when the asymptotics kick in is part of practical algorithm design.