Questions: Alphabets, Strings, and Language Definition
5 questions to test your understanding
Score: 0 / 5
Question 1 Multiple Choice
Let Σ = {a, b}. A student claims that because Σ is finite (only 2 symbols), Σ* must also be finite. What is wrong with this reasoning?
ANothing is wrong — Σ* is finite when Σ is finite
BΣ* is infinite because strings can be any finite length, so there are strings of length 1, 2, 3, ... with no upper bound — even a 2-symbol alphabet generates infinitely many strings
CΣ* is only infinite when Σ contains more than 10 symbols
DΣ* is uncountably infinite, so the student's intuition about finiteness is irrelevant
This is the single most important insight in this topic: a finite alphabet generates an infinite set of strings. There is no limit on string length, so over {a, b} you can form strings of length 0 (ε), 1 (a, b), 2 (aa, ab, ba, bb), 3 (aaa, aab, ...), and so on without end. Σ* is countably infinite even when Σ is small. Confusing the finiteness of the alphabet with the finiteness of Σ* is a persistent beginner error.
Question 2 Multiple Choice
Which of the following correctly distinguishes the empty string ε from the empty language ∅?
AThey are equivalent — both represent the absence of any string or symbol
Bε is a string of length zero (a member of Σ*); ∅ is the language containing no strings — {ε} is a language with one element, not an empty language
Cε is a symbol in some alphabets; ∅ is ε when the alphabet is empty
DBoth ε and ∅ are languages, and both are empty, just written differently
This distinction is crucial and commonly confused. ε is a string — specifically, the unique string of length zero. It is a valid member of Σ* for any alphabet. The language {ε} contains exactly one string (the empty string) and is therefore not empty. The language ∅ contains no strings at all — it is genuinely empty. The analogy to arithmetic: ε is like zero (a legitimate value), while ∅ is like an empty container. {ε} is a container holding the value zero.
Question 3 True / False
Because Σ (an alphabet) is defined as a finite set, Σ* (the set of most finite strings over Σ) is also finite.
TTrue
FFalse
Answer: False
A finite alphabet generates infinitely many strings. Σ* contains strings of every possible finite length — and there is no maximum length. For Σ = {0, 1}, Σ* includes ε, 0, 1, 00, 01, 10, 11, 000, 001, ..., and so on without end. The finiteness constraint on Σ means only that you cannot have infinitely many distinct symbols — it places no restriction on how long strings can be.
Question 4 True / False
A formal language over Σ can be any subset of Σ*, including finite sets, infinite sets, and even the empty set ∅.
TTrue
FFalse
Answer: True
The definition of a formal language is extraordinarily broad: any subset of Σ*. This includes the empty language ∅ (no strings), finite languages like {ab, ba, aab}, infinite languages like 'all strings over {a, b} with equal numbers of a's and b's,' and even Σ* itself (the set of all strings). The entire project of automata theory and computability is about classifying which of these infinitely many possible languages can be recognized by which kinds of computational machines.
Question 5 Short Answer
Why is the empty string ε considered a foundational element in formal language theory rather than a trivial edge case?
Think about your answer, then reveal below.
Model answer: ε plays a structural role in formal definitions that parallels zero in arithmetic — it seems trivial but is essential. ε is the identity element for string concatenation: appending ε to any string leaves it unchanged (w·ε = ε·w = w). It is the base case in inductive definitions of Σ* and in recursive language definitions. Many important language properties hinge on whether ε is included: the distinction between {ε} and ∅ can determine whether a language is recognized by a given automaton, and the question of whether ε belongs to a language is often the first test in parsing algorithms.
Students who treat ε as 'just the empty case' typically run into trouble when language definitions and automaton behaviors depend on it precisely. Understanding ε as a legitimate string with a specific role — rather than an absence of string — is necessary for working with formal proofs, regular expressions, and automaton transitions.