Jacobi iteration is applied to a 10,000 × 10,000 sparse linear system. After many iterations, the residual does not decrease — the error stays roughly constant or oscillates. What is the most likely cause?
AThe matrix is too large; Jacobi only works for systems with fewer than 1,000 unknowns
BThe spectral radius of the Jacobi iteration matrix is greater than or equal to 1, so the iteration does not converge
CThe right-hand side vector b contains numerical errors that prevent convergence
DJacobi requires a symmetric matrix; the system must not be symmetric
Convergence of Jacobi iteration depends entirely on the spectral radius ρ(M) of the iteration matrix M = I − D⁻¹A, where D is the diagonal of A. If ρ(M) ≥ 1, the iteration diverges or fails to contract the error, regardless of system size. Matrix size, symmetry, or right-hand side accuracy are not the governing factors — the spectral radius is. A large sparse system is actually the use case for which iterative methods are designed; the problem here is the structure of A, not its size.
Question 2 Multiple Choice
Why is the conjugate gradient method typically preferred over Jacobi or Gauss-Seidel for large symmetric positive definite (SPD) systems?
AConjugate gradient does not require storing the matrix A, while Jacobi and Gauss-Seidel both require dense factorizations
BConjugate gradient converges in at most n steps in exact arithmetic and selects optimal update directions from a Krylov subspace, while Jacobi and Gauss-Seidel use simple component-wise updates with slower convergence rates
CConjugate gradient is the only iterative method guaranteed to work on sparse matrices
DJacobi and Gauss-Seidel cannot handle SPD matrices because the diagonal is always positive
For SPD systems, conjugate gradient builds its updates from a growing Krylov subspace, selecting at each step the direction that most efficiently reduces the error. This gives a convergence rate governed by the condition number κ(A) — well-conditioned SPD systems converge in very few iterations. Jacobi and Gauss-Seidel use simple component-wise updates that do not exploit the global structure of A in this way, leading to slower convergence. All three methods share the key advantage of preserving sparsity by working only with matrix-vector products.
Question 3 True / False
Gauss-Seidel typically converges faster than Jacobi on the same system because it uses the most recently computed values of x immediately within each iteration sweep.
TTrue
FFalse
Answer: True
This is the defining algorithmic difference between the two methods. In Jacobi, the entire new iterate x^(k+1) is computed using only values from x^(k). In Gauss-Seidel, as soon as a component x_i^(k+1) is updated, it is used immediately to compute x_{i+1}^(k+1). This means Gauss-Seidel is implicitly using fresher information throughout the sweep. For many matrix structures (diagonally dominant, SPD) this accelerates convergence at no additional computational cost per iteration. The convergence rate depends on the spectral radius of Gauss-Seidel's iteration matrix, which is typically smaller than Jacobi's.
Question 4 True / False
Iterative methods are typically preferable to direct methods (like Gaussian elimination) for solving large linear systems, since direct methods are too slow for any system of practical size.
TTrue
FFalse
Answer: False
The choice depends on the structure of the system. Direct methods like Gaussian elimination are O(n³) and fill in zeros during elimination, destroying sparsity — this makes them impractical for very large sparse systems. But for small or moderately sized systems, or for systems that must be solved exactly rather than to a tolerance, direct methods are often preferable because they always succeed and require no convergence analysis. Iterative methods are the right choice specifically for large sparse systems where matrix-vector products are cheap and a solution to moderate precision is acceptable. Neither class is universally superior.
Question 5 Short Answer
Explain why the spectral radius ρ(M) of the iteration matrix governs convergence of an iterative method, and what happens geometrically when ρ(M) ≥ 1.
Think about your answer, then reveal below.
Model answer: Each iteration applies the matrix M to the error vector e^(k) = x^(k) − x*, so e^(k) = M^k e^(0). The spectral radius is the largest absolute eigenvalue of M, which governs the long-run behavior of M^k. If ρ(M) < 1, repeated multiplication by M shrinks every component of the error in the directions of the eigenvectors, driving the error to zero. If ρ(M) ≥ 1, at least one eigencomponent of the error does not shrink — it stays constant or grows — so the iteration fails to converge. Geometrically, ρ(M) < 1 means the iteration is a contraction mapping on the error, pulling successive iterates toward the fixed point x*; ρ(M) ≥ 1 means the mapping is not a contraction and the iterates may spiral outward or oscillate.
The connection between spectral radius and convergence is the discrete analogue of asking whether a differential equation's solution decays (negative eigenvalue → decay) or grows (positive eigenvalue → growth). The iteration matrix M encodes exactly how much the error is scaled in each direction per step; ρ(M) is the worst-case scaling factor. Preconditioning works by transforming M into a matrix with smaller spectral radius, accelerating convergence.