The moment-generating function M(t) = E[e^{tX}] doesn't exist for distributions with heavy tails (e.g., Cauchy), while the characteristic function φ(t) = E[e^{itX}] always exists. What is the fundamental mathematical reason for this difference?
A. The imaginary unit i makes the expectation automatically finite by algebraic convention
B. |e^{itX}| = 1 for all real t and X, so the integral always converges absolutely regardless of the tail behavior of X
C. The characteristic function averages positive and negative oscillations that cancel, keeping the result bounded
D. The Fourier transform is always bounded while the Laplace transform may not be; it's a transform-theory fact
Answer: B
By Euler's formula, e^{itX} = cos(tX) + i·sin(tX), so the modulus is |e^{itX}| = √(cos²(tX) + sin²(tX)) = 1 for all real t and X. The integrand in E[e^{itX}] = ∫ e^{itx} dF(x) is therefore bounded by 1 in absolute value, and because F has total mass 1 the integral converges absolutely for any probability distribution, including heavy-tailed ones. For the MGF, e^{tX} is unbounded: it grows exponentially as X → +∞ when t > 0 and as X → −∞ when t < 0. If the relevant tail of X decays more slowly than any exponential, E[e^{tX}] is infinite for every t of that sign. For the Cauchy distribution, whose tails both decay like 1/x, this happens for every t ≠ 0.
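The boundedness argument can be checked numerically. The sketch below is only an illustration, not part of the original quiz: it draws standard Cauchy samples by the inverse-CDF method, confirms that every term e^{itx} has modulus 1, and compares the sample average with the known Cauchy characteristic function e^{−|t|}. The helper name empirical_cf, the seed, and the sample size are assumptions chosen for the example.

```python
import cmath
import math
import random

def empirical_cf(samples, t):
    """Monte Carlo estimate of E[e^{itX}]: the sample mean of e^{itx}."""
    return sum(cmath.exp(1j * t * x) for x in samples) / len(samples)

rng = random.Random(0)  # fixed seed so the run is reproducible
# Standard Cauchy draws via the inverse CDF: tan(pi * (U - 1/2)), U uniform on [0, 1).
samples = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(100_000)]

t = 1.0
# Every term e^{itx} has modulus 1, no matter how extreme x is.
max_modulus = max(abs(cmath.exp(1j * t * x)) for x in samples)
cf_estimate = empirical_cf(samples, t)
cf_exact = math.exp(-abs(t))  # known characteristic function of the standard Cauchy

# By contrast, the largest MGF exponent t*x is far beyond what e^{tx} can
# represent in double precision (math.exp overflows a little above 709).
largest_exponent = t * max(samples)

print(max_modulus, abs(cf_estimate - cf_exact), largest_exponent)
```

The sample mean of e^{tx} is finite for any finite sample, so a simulation cannot prove that the MGF diverges; the exponent check only shows how quickly individual terms become unrepresentable.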
Question 2 Multiple Choice
Random variables X and Y are independent, each with characteristic function φ(t) = e^{−t²/2} (the standard normal). What is the characteristic function of X + Y?
A. e^{−t²/2}: the same, because normal distributions are closed under addition
B. e^{−t²} = (e^{−t²/2})²
C. 2e^{−t²/2}: the sum of the two characteristic functions
D. e^{−t⁴/4}: the convolution of two Gaussians in the frequency domain
Answer: B
For independent random variables, φ_{X+Y}(t) = φ_X(t) · φ_Y(t): the distribution of the sum is the convolution of the two distributions, and convolution corresponds to pointwise multiplication of characteristic functions. So φ_{X+Y}(t) = e^{−t²/2} · e^{−t²/2} = e^{−t²}. This is itself a Gaussian characteristic function (that of N(0, 2)), consistent with closure of the normal family under addition. Option C adds the characteristic functions instead of multiplying them; the result equals 2 at t = 0, whereas every characteristic function satisfies φ(0) = 1.
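As a rough numerical check of the multiplication property, the following sketch simulates independent standard normal X and Y and compares the empirical characteristic function of X + Y with e^{−t²}. The helper empirical_cf, the seed, and the sample size are illustrative assumptions.

```python
import cmath
import math
import random

def empirical_cf(samples, t):
    """Monte Carlo estimate of E[e^{itX}]: the sample mean of e^{itx}."""
    return sum(cmath.exp(1j * t * s) for s in samples) / len(samples)

rng = random.Random(1)  # fixed seed so the run is reproducible
n = 100_000
t = 1.0

# Independent standard normal draws for X and Y.
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [rng.gauss(0.0, 1.0) for _ in range(n)]

cf_sum = empirical_cf([x + y for x, y in zip(xs, ys)], t)  # estimate of phi_{X+Y}(t)
product_exact = math.exp(-t**2 / 2) ** 2                   # phi_X(t) * phi_Y(t)
target = math.exp(-t**2)                                   # option B, the N(0, 2) value

print(abs(cf_sum - target), abs(product_exact - target))
```

The estimate matches the target only up to Monte Carlo error, which shrinks roughly like 1/√n.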
Question 3 True / False
When the moment-generating function of a distribution exists, it contains strictly more probabilistic information than the characteristic function of the same distribution.
T. True
F. False
Answer: False
Both the MGF (when it is finite in a neighborhood of 0) and the characteristic function uniquely determine the probability distribution, so neither contains more information. The characteristic function is more general because it always exists, while the MGF may not. When both exist, they are related by analytic continuation, M(t) = φ(−it), and carry equivalent information about all moments and the full distributional shape. The advantage of the characteristic function is universality, not additional information content.
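The relation M(t) = φ(−it) can be illustrated with a distribution that has both transforms in closed form. The sketch below uses Exp(1) as an assumed example, not one taken from the quiz: its characteristic function is 1/(1 − it) and its MGF is 1/(1 − t) for t < 1. The function names are hypothetical.

```python
def cf_exponential(z):
    """Characteristic function of Exp(1), phi(z) = 1 / (1 - i z), allowing complex z."""
    return 1 / (1 - 1j * z)

def mgf_exponential(t):
    """MGF of Exp(1), M(t) = 1 / (1 - t), finite only for t < 1."""
    return 1 / (1 - t)

t = 0.5
via_cf = cf_exponential(-1j * t)  # phi(-it): the CF evaluated on the imaginary axis
direct = mgf_exponential(t)       # M(t) computed directly

print(via_cf, direct)
```

Both routes give the same number, which is the sense in which the two transforms carry the same information when the MGF exists.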
Question 4 True / False
The continuity theorem states that pointwise convergence of characteristic functions to a limit that is continuous at 0 implies convergence in distribution of the corresponding random variables.
T. True
F. False
Answer: True
This is Lévy's continuity theorem, the result that makes characteristic functions the standard tool for proving limit theorems. For i.i.d. summands with mean 0 and variance 1, the CLT proof proceeds by: (1) computing φ_{Sₙ/√n}(t) for the standardized sum, (2) showing it converges pointwise to e^{−t²/2} using Taylor expansion and the limit (1 + x/n)^n → e^x, (3) invoking the continuity theorem to conclude convergence in distribution to N(0,1). Each step is a short calculation. The continuity theorem converts pointwise function convergence (which is analytically tractable) directly into the probabilistic conclusion.
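Steps (1) and (2) can be watched numerically for a concrete distribution. The sketch below assumes Rademacher summands (X = ±1 with probability 1/2 each, so mean 0 and variance 1), an example chosen here because its characteristic function is simply cos(t); the function names are hypothetical.

```python
import math

def cf_rademacher(t):
    """Characteristic function of X = +1 or -1 with probability 1/2 each: cos(t)."""
    return math.cos(t)

def cf_standardized_sum(t, n):
    """phi_{S_n / sqrt(n)}(t) = [phi_X(t / sqrt(n))]^n for i.i.d. summands."""
    return cf_rademacher(t / math.sqrt(n)) ** n

t = 1.0
limit = math.exp(-t**2 / 2)  # standard normal characteristic function at t

# Distance to the normal limit for increasing n.
errors = [abs(cf_standardized_sum(t, n) - limit) for n in (10, 100, 1_000, 10_000)]
print(errors)
```

The errors shrink as n grows, which is the pointwise convergence that the continuity theorem then turns into convergence in distribution.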
Question 5 Short Answer
Explain why proving the central limit theorem via characteristic functions is more tractable than direct approaches, and identify the key algebraic steps that make it work.
Model answer: For i.i.d. summands, characteristic functions convert the sum of n independent variables into a product of n identical factors: φ_{Sₙ}(t) = [φ_X(t)]^n. For the scaled sum S_n/√n, this becomes [φ_X(t/√n)]^n. Taylor-expanding φ_X(t/√n) around 0, using E[X] = 0 and Var(X) = σ², gives 1 − t²σ²/(2n) + o(1/n). (The sharper remainder O(n^{−3/2}) is available only if the third moment is also finite.) Raising this to the n-th power and taking n → ∞ uses the fundamental limit (1 + x/n)^n → e^x, which is unaffected by the o(1/n) term, yielding e^{−t²σ²/2}, the characteristic function of N(0, σ²). The continuity theorem then converts this pointwise limit into convergence in distribution to N(0, σ²). Direct approaches via CDFs or densities require controlling integrals of increasingly complex functions over unbounded domains, which is far more technically demanding.
The two algebraic pivots are the multiplication-under-independence property (turning sums into products) and the limit (1 + x/n)^n → e^x (turning the product into an exponential). Given characteristic functions, both steps are elementary. Without them, the proof needs considerably more delicate estimates.
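The Taylor step in the model answer can be checked for a case with σ² ≠ 1. The sketch below assumes X uniform on (−1, 1), an illustrative choice with mean 0, variance σ² = 1/3, and characteristic function sin(s)/s. It measures n times the gap between φ_X(t/√n) and the second-order term 1 − t²σ²/(2n), which should tend to 0 if the remainder is o(1/n), and then compares the n-th power with e^{−t²σ²/2}. All names are hypothetical.

```python
import math

def cf_uniform(s):
    """Characteristic function of Uniform(-1, 1): sin(s)/s, with value 1 at s = 0."""
    return math.sin(s) / s if s != 0 else 1.0

sigma2 = 1.0 / 3.0  # variance of Uniform(-1, 1)
t = 2.0

# n * |phi_X(t/sqrt(n)) - (1 - t^2 sigma^2 / (2n))| for increasing n.
scaled_remainders = []
for n in (10**2, 10**4, 10**6):
    s = t / math.sqrt(n)
    second_order = 1 - t**2 * sigma2 / (2 * n)
    scaled_remainders.append(n * abs(cf_uniform(s) - second_order))

# The n-th power of the factor versus the claimed Gaussian limit.
n = 10**6
power = cf_uniform(t / math.sqrt(n)) ** n
limit = math.exp(-t**2 * sigma2 / 2)

print(scaled_remainders, abs(power - limit))
```

For this bounded distribution the remainder is in fact of order 1/n², so the scaled remainders fall off quickly; with only a finite second moment the decay can be much slower, though it still vanishes.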