Alice has an n-bit string x and Bob has an n-bit string y. They want to determine if x = y. Why does the deterministic communication complexity of equality require Omega(n) bits while the randomized complexity is O(1)?
ADeterministic protocols must handle 2^n possible inputs for Alice, so by pigeonhole any protocol with fewer than n bits of communication conflates two of Alice's inputs — but they map to different outputs for some y. Randomized protocols use fingerprinting: Alice sends a random hash of x (O(log n) bits), Bob checks against his hash of y, and collision probability is O(1/n) per trial
BDeterministic protocols are slower because they cannot use hashing
CRandomized protocols use compression to reduce x to O(1) bits
DThe equality function is trivial — both complexities are O(1)
The deterministic lower bound follows from a counting argument: Alice's messages partition her 2^n possible inputs into at most 2^c groups (where c is the communication), and if c < n some group contains two distinct inputs x1 != x2. Bob cannot distinguish them, so for y = x1 he must give the same answer for both x1 and x2, but the correct answers differ. The randomized protocol exploits fingerprinting: pick a random prime p, send x mod p — this is O(log n) bits and equals y mod p with probability 1 when x = y, and disagrees with high probability when x != y. With O(1) rounds of amplification, the communication is O(log n), and with shared randomness even O(1) bits suffice.
Question 2 True / False
The set disjointness problem (are Alice's and Bob's subsets of {1,...,n} disjoint?) requires Omega(n) communication even with randomization. This lower bound directly implies that estimating the number of distinct elements in a data stream requires Omega(n) space.
TTrue
FFalse
Answer: True
The reduction works as follows: if a streaming algorithm uses s bits of space for distinct-element estimation, Alice can run the algorithm on her set elements, send the s-bit state to Bob, who continues the stream with his elements and checks the distinct count. If the sets are disjoint, distinct count = |A| + |B|; if they share an element, distinct count < |A| + |B|. Distinguishing these cases with the streaming algorithm solves set disjointness with s bits of communication. Since set disjointness requires Omega(n) randomized communication, s = Omega(n) for exact distinct counting. For approximate counting (epsilon-relative error), the lower bound is Omega(1/epsilon^2), also from communication complexity.
Question 3 Short Answer
Explain the information-theoretic approach to communication complexity lower bounds and how it strengthens the basic combinatorial approach.
Think about your answer, then reveal below.
Model answer: The information complexity framework measures not just how many bits are communicated, but how much INFORMATION the transcript reveals about the inputs. The information cost of a protocol is I(X; transcript | Y) + I(Y; transcript | X) — the mutual information between inputs and the transcript. Any protocol with communication c has information cost at most c, so information cost lower bounds imply communication lower bounds. The key advantage is that information cost is additive under direct-sum composition: computing f on k independent instances has information cost at least k times the single-instance cost. This yields tight bounds for problems like set disjointness (Omega(n) per instance) and provides the main technique for proving streaming lower bounds, where the stream can be decomposed into independent sub-problems.
The information complexity approach, developed by Bar-Yossef, Jayram, Kumar, Sivakumar, and refined by Braverman and others, unified many previously ad-hoc lower bound arguments. It also connects communication complexity to information theory and rate-distortion theory, enabling tight characterizations of multi-party and multi-round communication.
Question 4 True / False
Nondeterministic communication complexity of a function f can be exponentially smaller than deterministic communication complexity.
TTrue
FFalse
Answer: True
In nondeterministic communication complexity, a prover provides a certificate, and Alice and Bob verify it using minimal communication. For example, the 'not-equal' function (output 1 if x != y) has nondeterministic complexity O(log n): the prover specifies a position i where x and y differ, Alice sends x_i, Bob checks against y_i. But deterministic complexity is Theta(n). This exponential gap (log n vs n) is one of the largest known separations in communication complexity. The nondeterministic model is closely related to covering numbers and log-rank of Boolean matrices.