Krylov Subspace Method
The Krylov subspace method refers to a broad family of iterative techniques for solving large linear systems and eigenvalue problems by building approximations within sequences of subspaces generated from repeated matrix-vector applications. These methods are particularly well suited to problems where the matrix is large and sparse, so that each matrix-vector product can be computed cheaply and the data footprint remains manageable. At heart, Krylov subspace methods seek to approximate the solution by combining the action of the matrix with initial information in a way that progressively improves accuracy without forming or factorizing the entire matrix.
A central theme across the family is the idea that, by exploring the subspace spanned by b, Ab, A^2b, ..., A^{m-1}b when solving Ax=b, one can capture the important directions in which the solution lives. These subspaces are the Krylov subspaces K_m(A,b) = span{b, Ab, ..., A^{m-1}b}. Different methods differ in how they choose the basis for these subspaces, how they enforce orthogonality or minimize residuals, and how they manage memory and stability. The conjugate gradient method is the prototypical Krylov method for symmetric positive definite matrices, while GMRES and related algorithms extend the idea to general nonsymmetric matrices. In eigenvalue contexts, the Lanczos method and related algorithms based on the Arnoldi iteration are used to build compact representations that reveal spectral information.
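As a concrete illustration of this construction, the following minimal NumPy sketch (the function name krylov_basis and the small test problem are ours, purely for illustration) forms the raw Krylov vectors b, Ab, ..., A^{m-1}b by repeated matrix-vector products. Practical methods do not use these raw vectors directly, since they become nearly linearly dependent as m grows; instead they orthogonalize them, as discussed below.

```python
import numpy as np

def krylov_basis(A, b, m):
    """Return the raw Krylov vectors b, Ab, ..., A^{m-1} b as columns.

    Illustrative only: the raw vectors become nearly linearly dependent
    as m grows, which is why practical methods orthogonalize them
    (Arnoldi, Lanczos) rather than using them directly.
    """
    K = np.zeros((b.shape[0], m))
    v = b.copy()
    for j in range(m):
        K[:, j] = v
        v = A @ v          # one matrix-vector product per new direction
    return K

# Tiny example: a 5x5 symmetric positive definite matrix.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)
b = rng.standard_normal(5)
K = krylov_basis(A, b, 3)              # spans K_3(A, b)
print(np.linalg.matrix_rank(K))        # typically 3 for this small example
```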
History
The concept behind Krylov subspace methods has roots in the early 20th century, with the work of the Russian mathematician and naval engineer Aleksei Krylov, after whom the subspaces are named, on iterative processes for linear systems. The practical prominence of these ideas came with the conjugate gradient method, developed by Hestenes and Stiefel in the early 1950s, which provided a robust, memory-efficient way to solve Ax=b when A is symmetric positive definite. Over the decades, the family expanded to handle nonsymmetric and indefinite problems, with variants such as BiCG and BiCGSTAB addressing some of the shortcomings of earlier approaches.
A major milestone was the Generalized Minimal Residual method, GMRES, introduced by Saad and Schultz in 1986. GMRES builds an orthonormal basis for a Krylov subspace via the Arnoldi iteration and solves a small least-squares problem that minimizes the residual over that subspace. While powerful, full GMRES can be memory-intensive, which led to restarted versions (GMRES(m)) and a broader set of Krylov recycling and deflation strategies designed to reuse information across a sequence of related problems. The Lanczos method, introduced by Cornelius Lanczos in 1950, provided a highly effective approach for extracting eigenvalues from large symmetric matrices and also underpins many modern Krylov techniques for spectral problems. The development of these methods reflects a long-running effort to balance convergence speed, robustness, and resource use in high-dimensional computations.
Theory and algorithms
Krylov subspace methods rest on the observation that many linear systems and eigenvalue problems can be approximated by operating within a subspace generated from the initial vector and repeated applications of the matrix. The fundamental objects and ideas include:
Krylov subspace: K_m(A,b) = span{b, Ab, A^2b, ..., A^{m-1}b}. These subspaces grow with m and capture directions that the matrix A uses to transform b.
Conjugate Gradient (for SPD A): The CG method stays within Krylov subspaces and minimizes the A-norm of the error (or equivalently, the energy norm) over K_m(A,b). It uses a three-term recurrence that keeps track of search directions and residuals, avoiding explicit matrix inversions. See Conjugate gradient.
GMRES (general nonsymmetric case): GMRES builds an orthonormal basis for K_m(A,b) using the Arnoldi iteration and then solves a small least-squares problem that minimizes the Euclidean norm of the residual over that subspace, which often yields robust convergence for a broad class of problems. Full GMRES requires storing the entire basis, which can be expensive for large m, leading to restarted versions GMRES(m). See GMRES and Arnoldi iteration.
Lanczos method and related eigenvalue solvers: For Hermitian or symmetric problems, the Lanczos process produces a tridiagonal representation that makes eigenvalue extraction efficient. The connection to CG is clear in the SPD case, but Lanczos is particularly valued for spectral computations and for building deflated or recycled Krylov spaces. See Lanczos method.
Other Krylov variants for nonsymmetric and indefinite problems: Methods such as BiCG (biconjugate gradient), BiCGSTAB, QMR, MINRES, and others offer different trade-offs between convergence speed, stability, and memory usage. These methods share the Krylov subspace philosophy but differ in how they generate and combine search directions and how they handle breakdowns or numerical instability. See BiCG, BiCGSTAB.
Preconditioning: A central practical concern is preconditioning, which aims to transform the original system into one that has more favorable spectral properties for Krylov convergence. Preconditioning can be left, right, or split, and may involve incomplete factorizations (e.g., ILU or Incomplete Cholesky), diagonal scaling, or even polynomial preconditioners. See Preconditioning and related techniques.
Restarting, deflation, and recycling: To manage memory and to exploit structure across related problems, practitioners use restarting (GMRES(m)), deflation (removing converged eigencomponents), and recycling (reusing Krylov information across sequences of solves). These ideas connect Krylov subspace methods to modern large-scale workflows. See Krylov subspace recycling.
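As a rough usage sketch of restarting in practice, the example below calls SciPy's GMRES with a fixed restart length on a small nonsymmetric test matrix; the matrix, restart length, and iteration cap are arbitrary choices made for illustration.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# A sparse, nonsymmetric test matrix (a 1-D convection-diffusion-like stencil).
n = 200
main = 2.0 * np.ones(n)
lower = -1.2 * np.ones(n - 1)    # asymmetry plays the role of a convection term
upper = -0.8 * np.ones(n - 1)
A = sp.diags([lower, main, upper], offsets=[-1, 0, 1], format="csr")
b = np.ones(n)

# Restarted GMRES: keep at most 30 basis vectors in memory, then restart.
x, info = spla.gmres(A, b, restart=30, maxiter=1000)
print("info =", info, " residual norm =", np.linalg.norm(b - A @ x))
```

Here info = 0 signals successful convergence, while a positive value reports the number of iterations performed without reaching the tolerance.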
Conjugate Gradient method (CG)
Applied to a symmetric positive definite matrix A, CG generates a sequence of approximate solutions x_k in K_k(A,b) that minimize the A-norm of the error. Each iteration involves a matrix-vector product with A, vector updates, and a few inner products, keeping the cost per iteration low and scalable to large, sparse systems. CG is also a natural partner to preconditioning, where a suitable preconditioner P approximates A^{-1} and improves convergence. See Conjugate gradient and Preconditioning.
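A minimal NumPy sketch of the unpreconditioned CG recurrence described above follows; the function name and the small SPD test problem are illustrative, not taken from any particular library.

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, maxiter=1000):
    """Minimal (unpreconditioned) CG for symmetric positive definite A."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x                        # residual
    p = r.copy()                         # first search direction
    rs_old = r @ r
    for _ in range(maxiter):
        Ap = A @ p                       # the only matrix-vector product per step
        alpha = rs_old / (p @ Ap)        # step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p    # next A-conjugate search direction
        rs_old = rs_new
    return x

# Small SPD test problem.
rng = np.random.default_rng(1)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)
b = rng.standard_normal(50)
x = conjugate_gradient(A, b)
print(np.linalg.norm(A @ x - b))         # residual norm, near the tolerance
```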
GMRES and Arnoldi-based methods
GMRES builds an orthonormal basis for the Krylov subspace via Arnoldi iteration, then solves a small least-squares problem to minimize the residual over that subspace. The method is memory-intensive for large m, which led to restarted GMRES (GMRES(m)) and to numerous extensions that trade memory for robustness or speed. GMRES is often preferred when A is nonsymmetric or indefinite and when robustness across a broad class of problems is desired. See GMRES and Arnoldi iteration.
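A compact, didactic sketch of full GMRES follows: it runs m Arnoldi steps with modified Gram-Schmidt and then solves the small least-squares problem for the coefficients of the approximate solution. It is meant to expose the structure of the method rather than to compete with library implementations; all names are ours.

```python
import numpy as np

def gmres_full(A, b, m, x0=None):
    """Didactic full GMRES: m Arnoldi steps, then minimize ||b - Ax|| over x0 + K_m."""
    n = b.shape[0]
    x0 = np.zeros(n) if x0 is None else x0
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)
    Q = np.zeros((n, m + 1))              # orthonormal Krylov basis
    H = np.zeros((m + 1, m))              # upper Hessenberg: A Q_m = Q_{m+1} H
    Q[:, 0] = r0 / beta
    for j in range(m):
        w = A @ Q[:, j]
        for i in range(j + 1):            # modified Gram-Schmidt orthogonalization
            H[i, j] = Q[:, i] @ w
            w -= H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-14:           # "lucky" breakdown: solution lies in K_{j+1}
            m = j + 1
            break
        Q[:, j + 1] = w / H[j + 1, j]
    # Small least-squares problem: minimize ||beta*e1 - H y|| over y.
    e1 = np.zeros(m + 1)
    e1[0] = beta
    y, *_ = np.linalg.lstsq(H[:m + 1, :m], e1, rcond=None)
    return x0 + Q[:, :m] @ y

rng = np.random.default_rng(2)
A = rng.standard_normal((40, 40)) + 10 * np.eye(40)   # nonsymmetric test matrix
b = rng.standard_normal(40)
x = gmres_full(A, b, m=40)
print(np.linalg.norm(b - A @ x))          # tiny residual when m equals the dimension
```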
Lanczos and eigenvalue problems
The Lanczos method focuses on Hermitian problems and provides an efficient route to a few extreme eigenvalues and eigenvectors by producing a tridiagonal representation of A in the Krylov subspace. Lanczos-based variants underpin many eigenvalue solvers and share the same core idea of reducing a large problem to a smaller, structured one. See Lanczos method.
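The sketch below shows the basic Lanczos three-term recurrence for a symmetric matrix, returning the tridiagonal matrix T whose eigenvalues (the Ritz values) approximate extreme eigenvalues of A. It omits reorthogonalization and breakdown handling, so it is illustrative only; the names and the test matrix are ours.

```python
import numpy as np

def lanczos(A, v0, m):
    """Basic Lanczos: tridiagonal T (m x m) and orthonormal basis Q (n x m).

    No reorthogonalization: orthogonality degrades in finite precision,
    which practical codes counter with full or selective reorthogonalization.
    """
    n = v0.shape[0]
    Q = np.zeros((n, m))
    alpha = np.zeros(m)
    beta = np.zeros(m - 1)
    q = v0 / np.linalg.norm(v0)
    q_prev = np.zeros(n)
    b_prev = 0.0
    for j in range(m):
        Q[:, j] = q
        w = A @ q - b_prev * q_prev       # three-term recurrence
        alpha[j] = q @ w
        w -= alpha[j] * q
        if j < m - 1:
            beta[j] = np.linalg.norm(w)
            q_prev, q, b_prev = q, w / beta[j], beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return T, Q

rng = np.random.default_rng(3)
M = rng.standard_normal((300, 300))
A = (M + M.T) / 2                          # symmetric test matrix
T, Q = lanczos(A, rng.standard_normal(300), 40)
print(np.linalg.eigvalsh(T)[-1],           # largest Ritz value ...
      np.linalg.eigvalsh(A)[-1])           # ... compared with the largest eigenvalue
```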
Practical considerations
Memory versus convergence: Full GMRES can converge quickly but requires storing the entire basis, making it impractical for very large problems. Restarting, or switching to a different Krylov method, is common in practice. See GMRES.
Orthogonality and rounding errors: In finite precision arithmetic, the orthogonality of vectors generated by Arnoldi or Lanczos can deteriorate, potentially slowing convergence or causing breakdowns. Reorthogonalization and stabilization techniques mitigate these issues but add cost; a sketch of one such reorthogonalization pass appears after this list. See Orthogonalization.
Deflation and recycling: In sequences of related problems (for example, parameter studies or time-stepping in PDE simulations), deflation of converged eigencomponents or recycling of Krylov spaces can dramatically reduce total effort. See Krylov subspace recycling.
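Referring to the orthogonality issue above, the sketch below performs one Gram-Schmidt pass against an existing orthonormal basis and repeats it when severe cancellation is detected (the classical "twice is enough" heuristic). The function name and threshold are illustrative choices, not a library API.

```python
import numpy as np

def orthogonalize_with_reorth(Q, w, eta=1.0 / np.sqrt(2.0)):
    """Orthogonalize w against the orthonormal columns of Q, reorthogonalizing
    once more if the first pass cancels most of w (norm drops below eta*||w||)."""
    h = Q.T @ w
    w_new = w - Q @ h
    if np.linalg.norm(w_new) < eta * np.linalg.norm(w):
        # Severe cancellation: rounding errors dominate, so do a second pass.
        h2 = Q.T @ w_new
        w_new = w_new - Q @ h2
        h = h + h2
    return w_new, h

# A vector lying almost entirely in span(Q) exposes the loss of orthogonality.
rng = np.random.default_rng(4)
Q, _ = np.linalg.qr(rng.standard_normal((100, 10)))
w = Q @ rng.standard_normal(10) + 1e-10 * rng.standard_normal(100)
w_orth, _ = orthogonalize_with_reorth(Q, w)
print(np.linalg.norm(Q.T @ w_orth))        # near machine precision after the second pass
```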
Preconditioning and practical considerations
Effective preconditioning is often the determining factor in practical performance. A good preconditioner P should satisfy two goals: (1) improve the spectrum or conditioning of the transformed system, and (2) be cheap to apply at each iteration. Common strategies include incomplete factorizations (e.g., ILU), incomplete Cholesky for SPD problems, diagonal or block-diagonal scaling, and even polynomial preconditioners that approximate A^{-1} over a spectral range. The choice of preconditioner is problem-dependent and frequently interacts with the chosen Krylov method.
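As one concrete, hedged example of these goals, the sketch below implements preconditioned CG with the simplest choice, a Jacobi (diagonal) preconditioner; the argument apply_Minv stands for the application of an approximate inverse of A and could equally wrap an incomplete factorization. The names and test problem are illustrative only.

```python
import numpy as np

def pcg(A, b, apply_Minv, tol=1e-10, maxiter=1000):
    """Preconditioned CG: apply_Minv(r) applies an approximation of A^{-1} to r."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_Minv(r)                      # preconditioned residual
    p = z.copy()
    rz_old = r @ z
    for _ in range(maxiter):
        Ap = A @ p
        alpha = rz_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = apply_Minv(r)
        rz_new = r @ z
        p = z + (rz_new / rz_old) * p
        rz_old = rz_new
    return x

# Jacobi (diagonal) preconditioning: cheap to build and to apply.
rng = np.random.default_rng(5)
M = rng.standard_normal((80, 80))
A = M @ M.T + np.diag(np.linspace(1.0, 1000.0, 80))   # SPD with widely varying scales
b = rng.standard_normal(80)
d = np.diag(A)
x = pcg(A, b, apply_Minv=lambda r: r / d)
print(np.linalg.norm(A @ x - b))
```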
Another practical issue is memory management. Since GMRES stores a basis vector for each iteration, large problems can exhaust memory quickly. Restarting, deflation, and recycling provide ways to balance memory usage with convergence speed, particularly in simulations that produce a sequence of related linear systems. See Preconditioning and Krylov subspace recycling.
Applications
Krylov subspace methods dominate many areas of computational science and engineering. They are routinely used to solve large sparse linear systems arising from discretizations of partial differential equations in physics and engineering, including computational fluid dynamics, structural analysis, and electromagnetics. They also play a major role in optimization, model reduction, and certain areas of machine learning where large sparse systems appear. For spectral problems, Lanczos- and Arnoldi-based methods are standard tools for computing a few eigenvalues and eigenvectors of large symmetric or nonsymmetric matrices. See Sparse matrix and Numerical linear algebra.