5 questions to test your understanding
An algorithm performs W = O(n) total work and has depth D = O(log n). Using Brent's theorem, what is the running time on p = n / log n processors?
The class NC contains exactly those problems solvable in O(log^k n) depth with O(n^c) total work for constants k and c. A problem that is P-complete under logspace reductions is in NC if and only if NC = P.
The parallel prefix (scan) operation computes all prefix sums of an array of n elements. What are its work and depth, and why is this operation fundamental to parallel algorithm design?
PRAM models differ in how they handle concurrent memory access. The EREW (Exclusive Read Exclusive Write) model is strictly weaker than CRCW (Concurrent Read Concurrent Write) -- some problems require asymptotically more depth on EREW than CRCW.
Why is work-efficiency considered more important than minimizing depth in practical parallel algorithm design?