The PRAM (Parallel Random Access Machine) is the standard theoretical model for shared-memory parallel computation: p processors operate synchronously on a shared memory, executing one instruction per step. The key complexity measures are work (total operations across all processors) and depth (longest chain of sequential dependencies, also called span or parallel time). Brent's theorem connects these: any algorithm with work W and depth D can be executed on p processors in time O(W/p + D). The complexity class NC (Nick's Class) captures problems solvable in polylogarithmic depth with polynomial work -- the parallel analog of P. Classic results include O(log n)-depth parallel prefix sums, O(log^2 n)-depth sorting networks, and the striking fact that some problems in P appear to be inherently sequential (P-complete problems), admitting no significant parallel speedup.
Sequential algorithm analysis asks one question: how many steps does the algorithm take? Parallel algorithm analysis asks two: work (total operations, summed over all processors) and depth (the longest chain of dependent operations that must execute sequentially). These two measures capture fundamentally different aspects of an algorithm's parallelizability. An algorithm with small depth can exploit many processors simultaneously, while an algorithm with small work avoids wasting computation. The ideal is both: work matching the best sequential algorithm and polylogarithmic depth.
Brent's theorem bridges theory and practice by showing that any algorithm with work W and depth D can run on p processors in O(W/p + D) time. The W/p term represents the work distributed evenly among processors; the D term represents the inherent sequential bottleneck. When p is much smaller than W/D (the algorithm's parallelism), the running time is approximately W/p -- linear speedup. When p exceeds W/D, adding more processors does not help because the algorithm is depth-bound. This theorem justifies focusing on work-efficient algorithms (W equal to sequential optimal): they guarantee that any available parallelism translates to proportional speedup.
The PRAM model provides the theoretical framework. It assumes p processors sharing a common memory, operating in lockstep. PRAM variants differ in memory access rules: EREW (exclusive read, exclusive write) is the most restrictive and models real hardware most closely; CRCW (concurrent read, concurrent write) is the most permissive and simplifies algorithm design. The difference matters: computing the OR of n bits takes O(1) depth on CRCW (every processor with a 1-bit writes concurrently) but Omega(log n) on EREW. In general, CRCW can be simulated on EREW with a logarithmic slowdown, so the models are polynomially equivalent but the constant factors in depth differ.
The parallel prefix (scan) operation is the workhorse of PRAM algorithms. Given an array [x_1, ..., x_n] and an associative operator, it computes all prefixes [x_1, x_1 + x_2, ..., x_1 + ... + x_n] in O(n) work and O(log n) depth. This deceptively simple primitive underlies an enormous range of parallel algorithms: array compaction (removing marked elements while preserving order), load balancing, segmented operations, tree computations (Euler tour technique), and even sorting. Once you can do parallel prefix, many problems that seem inherently sequential yield to elegant parallel solutions.
The complexity class NC formalizes "efficiently parallelizable": problems solvable with polynomial work and polylogarithmic depth. NC is contained in P (polylog depth, polynomial work implies polynomial sequential time), but whether NC equals P is a major open question. P-complete problems -- like the Circuit Value Problem (evaluating a Boolean circuit) -- are the hardest problems in P for parallel computation: they are in NC only if NC = P. The existence of P-complete problems suggests that some polynomial-time computations are inherently sequential, resisting any significant parallel speedup. This parallels the NP-completeness story: just as NP-complete problems are believed to require superpolynomial sequential time, P-complete problems are believed to require polynomial (not polylogarithmic) parallel depth.
No topics depend on this one yet.