Two control flow paths converge at point P. On path A, the expression 'a + b' is computed and neither operand is modified before reaching P. On path B, 'a' is reassigned before P, and 'a + b' is not recomputed after the assignment. Can CSE safely reuse the result of 'a + b' at point P?
AYes — 'a + b' was computed on at least one incoming path, so it is available
BNo — available expressions analysis requires the expression to be computed on ALL incoming paths with operands unchanged since their last computation
CYes — the compiler can substitute the known new value of 'a' to compute a corrected result
DIt depends on which path the program takes at runtime, so the compiler must insert a runtime check
Available expressions analysis uses intersection at join points: an expression is available at P only if it is available on EVERY path reaching P. On path B, 'a' was reassigned after the last computation of 'a + b', so the stored result may not reflect the current value of 'a'. The compiler cannot know at compile time which path will execute, so it must conservatively declare 'a + b' unavailable. Using the path-A result when path B was taken would generate wrong code. This intersection semantics — not union — is what makes global CSE safe.
Question 2 Multiple Choice
A compiler performs available expressions analysis at a join point where 'x * y' is available on one incoming control flow edge but not the other. What does the analysis conclude?
A'x * y' is available — availability on any path is sufficient to justify reuse
B'x * y' is not available — intersection requires availability on ALL paths before reuse is safe
C'x * y' may be available depending on the loop iteration count
D'x * y' is available only if 'x' and 'y' are loop-invariant variables
Intersection is the conservative, correct choice for available expressions. If 'x * y' is available on only one of two incoming edges, the compiler cannot guarantee that a stored result exists for both execution paths. Reusing a non-existent result on the other path would produce incorrect output. Option A (union) would be unsafe: it claims availability even when the result is absent on some paths. Safety requires that the expression is available on ALL paths, which intersection enforces — potentially missing some opportunities, but never generating incorrect code.
Question 3 True / False
Local CSE and global CSE differ primarily in scale — both rely on the same underlying dataflow analysis, applied to a larger or smaller region of code.
TTrue
FFalse
Answer: False
Local CSE requires NO dataflow analysis. A basic block is a straight-line sequence with no branches, so there is only one path through it. The compiler scans forward, maintains a table of previously computed expressions, and replaces duplicates — no join points, no gen/kill propagation needed. Global CSE, by contrast, must propagate availability information across basic block boundaries using a full forward dataflow analysis with gen sets, kill sets, and intersection at join points. The two methods differ fundamentally in technique, not just code region size.
Question 4 True / False
The kill set for available expressions analysis at a statement that assigns to variable 'x' includes all expressions in the program that contain 'x' as an operand.
TTrue
FFalse
Answer: True
When 'x' is assigned a new value, any previously computed expression containing 'x' may now be stale — the stored result no longer reflects what 'x + y' or 'x * z' would yield with the new value of 'x'. Therefore, all such expressions must be killed (removed from the available set) at that assignment. This is conservative but correct: even if the new value of 'x' happens to equal the old value, the compiler cannot generally know this and must invalidate all expressions using 'x' to preserve safety.
Question 5 Short Answer
Why does global CSE use intersection (not union) at join points in the available expressions dataflow analysis? What would go wrong if union were used?
Think about your answer, then reveal below.
Model answer: Intersection is required because CSE can only safely reuse a result that is guaranteed to exist no matter which path reached the current point. If union were used, the compiler would claim an expression is 'available' even when it was computed on only one of several paths — on the other paths, no stored result exists. When the program takes one of those paths, the compiler would try to reuse a nonexistent result, generating incorrect output. Intersection ensures the result was definitely computed on every possible execution path to this point, making reuse unconditionally safe.
This is the fundamental correctness constraint in dataflow analysis for optimization. Intersection gives a safe under-approximation: the compiler may miss some reuse opportunities (it's conservative), but it never reuses a result that might be stale or absent. Union is an over-approximation — it finds more 'available' expressions but is unsafe. The broader principle: optimizations must be conservative when in doubt; performance is sacrificed before correctness. Global CSE is typically combined with loop-invariant code motion to recover additional reuse opportunities that intersection alone might miss.