The van Emde Boas (vEB) layout stores a complete binary tree so that a root-to-leaf search touches O(log_B N) cache blocks, matching B-trees. How does it achieve this without knowing B?
AIt stores the tree in BFS order, which naturally groups nearby levels into blocks
BIt recursively splits the tree at height h/2: the top subtree of height h/2 and all bottom subtrees of height h/2 are each stored contiguously. At the recursion level where subtree size ≈ B, each subtree fits in one block, and a root-to-leaf path crosses O(log N / log B) = O(log_B N) such subtrees
CIt randomizes the tree layout so that on average each block contains useful nodes
DIt stores the tree in DFS order, which optimizes for sequential access
The vEB layout is recursive: split a tree of height h into a top tree of height h/2 and sqrt(N) bottom trees of height h/2. Store the top tree contiguously, followed by each bottom tree contiguously. At the recursion depth where subtree size equals B, each subtree fits in a single cache block. A root-to-leaf path traverses h = log N levels, grouped into subtrees of size ~B, crossing at most log(N)/log(B) = log_B(N) subtrees = cache blocks. This works for ANY B because the recursive structure adapts to all block sizes simultaneously.
Question 2 True / False
Cache-oblivious algorithms assume an ideal (optimal offline) cache replacement policy. In practice, LRU provides a constant-factor simulation of the optimal policy.
TTrue
FFalse
Answer: True
Sleator and Tarjan showed that LRU with cache size 2M has at most twice the cache misses of the optimal offline policy (Bélády's MIN) with cache size M. Since cache-oblivious analysis uses the optimal policy, an LRU cache of size 2M achieves the same asymptotic I/O bounds. This constant-factor slack in cache size is why cache-oblivious algorithms work in practice with standard hardware caches that use LRU or LRU-approximation replacement policies. The assumption of an ideal replacement policy is not a weakness — it is validated by this simulation result.
Question 3 Short Answer
Explain why standard (RAM-model) mergesort is NOT cache-oblivious optimal, and how funnelsort achieves cache-oblivious optimality.
Think about your answer, then reveal below.
Model answer: Standard 2-way mergesort performs O(N/B * log_2(N/B)) I/Os because it merges only 2 runs at a time, regardless of how much memory is available. Cache-oblivious optimal is O(N/B * log_{M/B}(N/B)), requiring M/B-way merging — but a cache-oblivious algorithm cannot know M. Funnelsort solves this with a recursive 'funnel' data structure: a k-funnel merges k sorted streams using a binary tree of mergers, where each merger recursively uses sub-funnels. The recursive structure ensures that when a sub-funnel fits in cache (size ~M), it merges ~sqrt(M/B) streams without cache misses. This implicitly achieves the optimal merge degree at each cache level without knowing M or B. Funnelsort's I/O complexity matches the external memory sorting bound for all M and B simultaneously.
The key insight is that recursive decomposition at all scales creates a structure that automatically adapts to the cache size. At the level where a sub-funnel fits in cache, it achieves the bandwidth of an M/B-way merge. This 'automatic parameter adaptation' is the hallmark of cache-oblivious algorithm design.
Question 4 True / False
A cache-oblivious algorithm that is optimal for a two-level memory hierarchy (cache + main memory) is automatically optimal for all levels of a multi-level hierarchy.
TTrue
FFalse
Answer: True
This is one of the most elegant properties of cache-oblivious algorithms. Since the analysis holds for ALL M and B simultaneously, it applies to every adjacent pair of levels in a multi-level hierarchy (L1-L2, L2-L3, L3-RAM, RAM-disk). An algorithm that achieves the optimal I/O bound at each level pair is automatically optimal for the entire hierarchy. Cache-aware (external memory) algorithms, by contrast, are tuned for specific M and B values and may be suboptimal at other hierarchy levels. This multi-level optimality is the primary practical advantage of the cache-oblivious approach.