Questions: Streaming Algorithms

4 questions to test your understanding

Score: 0 / 4
Question 1 Multiple Choice

A count-min sketch uses d = log(1/delta) hash functions and w = ceil(e/epsilon) counters per row. When querying frequency of item x, it returns the MINIMUM of d counter values. Why the minimum, not the average?

AThe minimum is faster to compute than the average
BAll d counters are unbiased estimators that can only OVERCOUNT (due to hash collisions adding to the count), so the minimum is closest to the true frequency — averaging would preserve the positive bias from collisions
CThe minimum provides a lower bound while the average provides an upper bound
DThe minimum uses less memory than storing all d values for averaging
Question 2 Short Answer

HyperLogLog estimates the number of distinct elements in a stream. It works by hashing each element and tracking the maximum number of leading zeros observed. Explain why this works and why multiple registers are needed.

Think about your answer, then reveal below.
Question 3 True / False

The Alon-Matias-Szegedy result showed that computing the second frequency moment F_2 exactly requires Omega(n) space in the streaming model, but it can be (1+epsilon)-approximated in O(1/epsilon^2 * log n) space.

TTrue
FFalse
Question 4 True / False

Count-min sketches are mergeable: the sketch of a combined stream equals the entry-wise sum of the individual sketches. This makes them suitable for distributed and parallel settings.

TTrue
FFalse