Questions: Gram-Schmidt Process and QR Decomposition
5 questions to test your understanding
Question 1 Multiple Choice
After applying Gram-Schmidt to {v₁, v₂, v₃} to produce {u₁, u₂, u₃}, which statements are guaranteed to be true?
A. span{u₁} = span{v₁, v₂, v₃}: the first output vector spans the full original space
B. span{u₁, u₂} = span{v₁, v₂}: the process preserves the subspace structure at each prefix
C. u₁ = v₁: the first output is always identical to the first input
D. u₃ is the projection of v₃ onto the plane spanned by u₁ and u₂
Answer: B
The key structural guarantee of Gram-Schmidt is that span{u₁, …, uᵢ} = span{v₁, …, vᵢ} at every step: the orthonormal prefix spans the same subspace as the original prefix. This is stronger than saying the full sets span the same space, because the agreement holds at every intermediate level. Option A is wrong: u₁ spans only the line through v₁, not the full space. Option C is wrong: u₁ = v₁/‖v₁‖, which equals v₁ only when v₁ already has unit length. Option D has the direction inverted: u₃ is the normalized *residual* of v₃ after its projection onto the plane spanned by u₁ and u₂ has been subtracted, so u₃ is perpendicular to that plane rather than lying in it.
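The prefix property can be checked numerically. The following is a minimal sketch in plain Python, assuming linearly independent inputs; the helper names `dot`, `gram_schmidt`, and `distance_to_span` are illustrative and not part of any library.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt(vectors):
    """Classical Gram-Schmidt; assumes the inputs are linearly independent."""
    basis = []
    for v in vectors:
        # Coefficients of v along every previously built u.
        coeffs = [dot(u, v) for u in basis]
        w = list(v)
        for c, u in zip(coeffs, basis):
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis

def distance_to_span(v, basis):
    """Length of what remains of v after removing its components along an orthonormal basis."""
    w = list(v)
    for u in basis:
        c = dot(u, v)
        w = [wi - c * ui for wi, ui in zip(w, u)]
    return math.sqrt(dot(w, w))

v = [[2.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 1.0, 3.0]]
u = gram_schmidt(v)

print(u[0])                           # v1 normalized, not v1 itself (option C)
print(distance_to_span(v[1], u[:2]))  # v2 lies in span{u1, u2} (option B)
print(distance_to_span(v[1], u[:1]))  # v2 does not lie in span{u1} (option A)
```

With these inputs the third output vector points along the z-axis, perpendicular to the plane of u₁ and u₂, which is the opposite of what option D claims.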
Question 2 Multiple Choice
A numerical analyst must solve a least-squares problem Ax = b where the columns of A are nearly linearly dependent. She must choose between forming AᵀA and solving the normal equations, or computing A = QR and solving via back-substitution. Which is numerically safer and why?
A. Normal equations: they reduce the problem from a rectangular to a square system, which is simpler
B. QR decomposition: it avoids squaring the condition number of A, preventing amplification of floating-point errors
C. Both methods produce identical numerical results because they solve the same mathematical problem
D. Normal equations: AᵀA is always symmetric positive definite, which guarantees stability
When the columns of A are nearly linearly dependent, A has a large condition number κ. Forming AᵀA squares it: κ(AᵀA) = κ² in the 2-norm, which dramatically amplifies rounding errors. This can make the normal equations numerically useless even when the true solution is well-defined. QR decomposition avoids this: with A = QR, the least-squares solution comes from the triangular system Rx = Qᵀb, and AᵀA is never formed. Option C holds only in exact arithmetic; under floating-point the two methods can give very different results. Option D misses the point: AᵀA is symmetric positive definite whenever A has full column rank (otherwise it is only positive semidefinite), but positive definiteness does not guarantee accuracy when the condition number is enormous.
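A small illustration of the squaring effect, as a sketch in plain Python. The 3×2 matrix and the value of `eps` are chosen for illustration (a Läuchli-style example). The QR factor is built with Gram-Schmidt for brevity; library QR routines typically use Householder reflections instead.

```python
import math

eps = 1e-8
# Columns of A: nearly parallel, yet linearly independent.
a1 = [1.0, eps, 0.0]
a2 = [1.0, 0.0, eps]

def dot(x, y):
    return sum(p * q for p, q in zip(x, y))

# Normal equations: form the 2x2 Gram matrix A^T A.
g11, g12, g22 = dot(a1, a1), dot(a1, a2), dot(a2, a2)
det_gram = g11 * g22 - g12 * g12    # exact value is about 2 * eps**2

# QR route: Gram-Schmidt on the columns yields the entries of R directly.
r11 = math.sqrt(dot(a1, a1))
q1 = [x / r11 for x in a1]
r12 = dot(q1, a2)
w = [x - r12 * y for x, y in zip(a2, q1)]
r22 = math.sqrt(dot(w, w))          # about sqrt(2) * eps

print(det_gram)  # 0.0 in double precision, because 1 + eps**2 rounds to 1
print(r22)       # small but nonzero, so R is still invertible
```

The information that separates the two columns lives at the scale of eps. Squaring pushes it to eps², below double-precision resolution relative to 1, so the computed AᵀA is exactly singular, while R retains it.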
Question 3 True / False
The Gram-Schmidt process can be applied to any set of vectors, linearly independent or not, and generally produces an orthonormal set of the same size as the input.
T. True
F. False
Answer: False
When a vector is linearly dependent on the preceding ones, its residual after subtracting all projections is the zero vector — which cannot be normalized (division by zero). The process breaks down at that vector. In practice, a linearly dependent vector is discarded, and the output set is smaller than the input. Gram-Schmidt produces an orthonormal basis for the *span* of the input vectors; if the inputs are linearly dependent, the span has dimension less than the number of input vectors.
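One way to handle dependent inputs is sketched below in plain Python. The function name `orthonormal_basis` and the threshold `tol` are assumptions for this example; a practical implementation might scale the tolerance by the size of the input vectors.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt that skips vectors whose residual is (numerically) zero."""
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            c = dot(u, w)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
        # Otherwise v depends on the earlier vectors: nothing to normalize, so drop it.
    return basis

# The second vector is a multiple of the first.
vectors = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
basis = orthonormal_basis(vectors)
print(len(vectors), len(basis))
```

Here three inputs yield two orthonormal vectors, matching the dimension of their span.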
Question 4 True / False
In QR decomposition A = QR, the matrix R is upper triangular because each new orthonormal vector is built by subtracting projections only onto previously computed basis vectors, not future ones.
T. True
F. False
Answer: True
The entry Rᵢⱼ = uᵢ · vⱼ records the projection coefficient of vⱼ onto uᵢ. When processing vⱼ during Gram-Schmidt, you subtract projections onto u₁, …, uⱼ₋₁ (vectors already built), and the diagonal entry Rⱼⱼ is the length of the residual that is normalized to give uⱼ. So vⱼ is a combination of u₁, …, uⱼ only. The later vectors uⱼ₊₁, …, uₖ have not yet been constructed, and each is built orthogonal to u₁, …, uⱼ and hence to vⱼ, so there are no projection terms involving them. Consequently Rᵢⱼ = 0 whenever i > j, which is exactly the upper triangular pattern. The triangular structure is not an imposed constraint; it is a direct consequence of the sequential, forward-only structure of the Gram-Schmidt algorithm.
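The way R fills in can be seen in a short sketch. This is plain Python, assuming linearly independent columns; `qr_columns` is an illustrative name, and the matrix is stored as a list of columns for simplicity.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def qr_columns(cols):
    """Return (columns of Q, R) via Gram-Schmidt on linearly independent columns."""
    n = len(cols)
    q = []
    r = [[0.0] * n for _ in range(n)]
    for j, v in enumerate(cols):
        w = list(v)
        for i, u in enumerate(q):          # only i < j: vectors already built
            r[i][j] = dot(u, v)
            w = [wi - r[i][j] * ui for wi, ui in zip(w, u)]
        r[j][j] = math.sqrt(dot(w, w))     # length of the residual
        q.append([wi / r[j][j] for wi in w])
    return q, r

cols = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
q, r = qr_columns(cols)
for row in r:
    print([round(x, 3) for x in row])
```

Entries below the diagonal are never written, because the inner loop only visits indices i < j. Each column can be rebuilt as the sum of r[i][j] * q[i] over i ≤ j, which is the statement A = QR column by column.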
Question 5 Short Answer
Why does Gram-Schmidt subtract projections onto ALL previously computed orthonormal vectors at each step, rather than just the most recent one?
Model answer: Because each new vector uᵢ must be perpendicular to the entire set {u₁, …, uᵢ₋₁}, not just the previous vector uᵢ₋₁. If you only subtracted the projection onto uᵢ₋₁, the residual might still have components along u₁, …, uᵢ₋₂ from earlier steps. By subtracting the projection onto every previously established direction simultaneously, you remove all those components and guarantee the residual is orthogonal to the full set built so far. Each projection subtraction eliminates one direction; you need as many subtractions as there are previously established directions.
This is why each step involves every earlier vector rather than just a neighboring pair. The orthogonality requirement is cumulative: each new vector must be orthogonal not just to its neighbor but to everything that came before. Classical Gram-Schmidt computes all the projection coefficients from the original vector and subtracts them together, while modified Gram-Schmidt subtracts one projection at a time and computes each coefficient from the partially reduced residual, which improves numerical stability. Both give the same result in exact arithmetic.
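The difference between the classical and modified variants shows up on nearly dependent inputs. The sketch below is plain Python; the test vectors and `eps` are chosen for illustration, and the `modified` flag is a convenience of this example rather than a standard interface.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt(vectors, modified):
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            # Classical: coefficient taken from the original v.
            # Modified: coefficient taken from the partially reduced residual w.
            c = dot(u, w) if modified else dot(u, v)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis

eps = 1e-8
vectors = [
    [1.0, eps, 0.0, 0.0],
    [1.0, 0.0, eps, 0.0],
    [1.0, 0.0, 0.0, eps],
]
cgs = gram_schmidt(vectors, modified=False)
mgs = gram_schmidt(vectors, modified=True)

print(abs(dot(cgs[1], cgs[2])))  # far from zero: orthogonality is lost
print(abs(dot(mgs[1], mgs[2])))  # close to zero
```

In the classical variant, the rounding error committed while removing the u₁ component is never seen by the later coefficients, so u₂ and u₃ end up far from perpendicular. The modified variant measures each coefficient against the current residual and corrects for that error.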