Questions: Scanner Generator Implementation

5 questions to test your understanding

Question 1 Multiple Choice

A scanner specification lists the keyword 'if' before the general identifier pattern [a-zA-Z_][a-zA-Z0-9_]*. When the scanner processes the input 'iffy', which token does it produce?

A. Two tokens: keyword 'if' followed by identifier 'fy'
B. One token: identifier 'iffy', because the longest match rule takes precedence
C. A lexical error, because 'iffy' partially matches both a keyword and an identifier
D. One token: keyword 'if', because keywords always have highest priority
Question 2 Multiple Choice

Why does a scanner generator convert the combined NFA to a DFA before emitting scanner code, rather than simulating the NFA directly at runtime?

A. NFAs cannot recognize the same languages as DFAs and would miss some tokens
B. DFAs enable deterministic, O(1)-per-character processing: each state and input character maps to exactly one next state, enabling a simple table-driven scanner loop
C. NFAs require exponentially more memory than DFAs and cannot be stored in a transition table
D. DFAs are simpler to construct from regular expressions than NFAs using Thompson's construction
Question 3 True / False

A scanner generator combines all token patterns into a single NFA (using alternation) before converting to a DFA, so that the resulting DFA can classify tokens from any of the specified patterns in a single left-to-right pass.

T. True
F. False
Question 4 True / False

Because scanner generators use regular expressions, a sufficiently complex regex can recognize inputs with balanced nested parentheses, eliminating the need for a separate parser phase.

T. True
F. False
Question 5 Short Answer

Describe the pipeline from a regular expression specification to executable scanner code. What happens at each stage and why?

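One way to check your reasoning on Question 1 is to experiment with a small hand-written scanner. The sketch below is only an illustration, not what a scanner generator emits: it uses Python's `re` module per rule instead of a combined DFA, and the rule names (`KW_IF`, `IDENT`, `WS`) and the `scan` function are hypothetical. It assumes the usual disambiguation policy of preferring the longest match and breaking ties by rule order.

```python
import re

# Hypothetical rule list: order encodes priority (earlier rules win ties).
RULES = [
    ("KW_IF", re.compile(r"if")),
    ("IDENT", re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")),
    ("WS", re.compile(r"[ \t\n]+")),
]


def scan(text):
    """Return (token_name, lexeme) pairs: longest match first, then rule order."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so on equal lengths the earlier rule is kept.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"lexical error at position {pos}")
        if best_name != "WS":  # whitespace is matched but not reported
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


if __name__ == "__main__":
    # Compare these results with your answer to Question 1.
    for sample in ("if", "iffy", "if fy"):
        print(repr(sample), "->", scan(sample))
```

Running the script on the three samples shows how the two disambiguation rules interact; swapping the order of the first two entries in `RULES` is a quick way to see which results depend on priority and which depend on match length.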