Any comparison-based sorting algorithm requires at least Ω(n log n) comparisons in the worst case, proved via a decision-tree argument: sorting n elements has n! possible outcomes, and a binary decision tree of depth d has at most 2^d leaves, so d ≥ log₂(n!) = Ω(n log n). Non-comparison sorts break this barrier by exploiting structure in the data. Counting sort runs in O(n + k) for integers in range [0, k]. Radix sort achieves O(dn) for d-digit numbers. These algorithms are faster in practice when keys have bounded range but cannot sort arbitrary comparable objects.
Prove the Ω(n log n) lower bound using the decision tree argument step by step. Implement counting sort and radix sort, then benchmark both against merge sort on large integer datasets to confirm the speedup.
You have implemented merge sort, quicksort, and heapsort, and you know they all achieve O(n log n) in the worst or average case. A natural question arises: can we do better? The comparison-based sorting lower bound proves that the answer is no — any algorithm that sorts by comparing pairs of elements must make at least Ω(n log n) comparisons in the worst case. The proof is elegant and uses a tool called a decision tree.
Imagine modeling any comparison sort as a binary tree where each internal node represents a comparison ("is a[i] < a[j]?") and each leaf represents one possible output permutation. Since the sort must be able to produce any of the n! permutations of the input, the tree needs at least n! leaves. A binary tree of height d has at most 2^d leaves, so we need 2^d ≥ n!, which gives d ≥ log₂(n!). Stirling's approximation tells us that log₂(n!) = Θ(n log n), so the minimum height — and therefore the minimum number of comparisons — is Ω(n log n). This is not a statement about any particular algorithm; it is a statement about the entire class of comparison-based sorts. Merge sort and heapsort are therefore asymptotically optimal within this class.
Non-comparison sorts sidestep this bound entirely by not comparing elements to each other. Counting sort works on integers in a known range [0, k]: it counts how many times each value appears, then uses those counts to place elements directly into their final positions. No comparisons are needed — just array indexing. The runtime is O(n + k), which is linear when k is O(n). The key constraint is that you must know the range of values in advance, and if k is much larger than n, the auxiliary count array wastes space and the algorithm loses its advantage.
Radix sort extends counting sort to handle larger keys by sorting digit by digit, from least significant to most significant. At each digit position, it uses a stable sort (typically counting sort) to arrange elements by that digit alone. Stability is critical: elements with the same digit at the current position must retain the relative order established by previous digit passes. After d passes (one per digit), the array is fully sorted. The runtime is O(d(n + k)) where d is the number of digits and k is the base. For fixed-width integers, this is effectively O(n). The takeaway is that the Ω(n log n) barrier is real but conditional: it applies only when your sole operation is comparing elements. If you can exploit the structure of your keys — their finiteness, their digit decomposition — you can sort in linear time.