Singular Value Decomposition
Singular Value Decomposition (SVD) is a central concept in linear algebra that underpins a wide range of techniques in data analysis, signal processing, and numerical computation. For any real or complex matrix M of size m×n, there exist unitary matrices U and V and an m×n rectangular diagonal matrix Σ with nonnegative real entries such that M = U Σ V^*, where V^* denotes the conjugate transpose of V. The diagonal entries σ1 ≥ σ2 ≥ ... ≥ 0 of Σ are called the singular values, and the columns of U and V are the left and right singular vectors, respectively. This factorization reveals the intrinsic geometric and informational structure of M and provides a natural way to separate signal from noise, to compress data, and to approximate matrices by lower-rank representations. The SVD generalizes many familiar decompositions and connects directly to linear algebra notions such as eigenvalues and orthogonality.
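As a concrete illustration, the following minimal sketch (assuming NumPy; the matrix A is arbitrary random data, not from the text) computes a thin SVD, rebuilds the matrix from its factors, and checks the orthonormality of the singular vectors.

```python
# A minimal sketch using NumPy's SVD routine; A is arbitrary illustration data.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))          # a 5x3 real matrix M

# full_matrices=False returns the "thin" SVD: U is 5x3, s has 3 entries, Vt is 3x3.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Reconstruct A = U diag(s) V^T and check the factorization.
A_rebuilt = U @ np.diag(s) @ Vt
print(np.allclose(A, A_rebuilt))         # True

# Columns of U and rows of Vt are orthonormal.
print(np.allclose(U.T @ U, np.eye(3)))   # True
print(np.allclose(Vt @ Vt.T, np.eye(3))) # True
print(s)                                 # singular values in descending order
```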
In practical terms, SVD can be viewed as the most informative way to rewrite a matrix as a sum of rank-one components. If M has rank r, then M = σ1 u1 v1^* + σ2 u2 v2^* + ... + σr ur vr^* where ui and vi are the i-th left and right singular vectors, and σi are the corresponding singular values. This expansion makes several things explicit: the magnitude of each term is governed by σi, and the vectors ui and vi describe directions in which M acts most strongly. When σi are arranged in descending order, the first few terms often capture the bulk of the important structure, which is the basis for powerful low-rank approximations. The Eckart–Young theorem formalizes this idea by showing that the best rank-k approximation to M (in both the Frobenius and spectral norms) is obtained by truncating the sum after k terms. See also the Eckart–Young theorem for a precise statement.
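The truncation claim can be checked numerically. The sketch below (NumPy assumed; M is a random test matrix) forms the rank-k approximation from the k leading rank-one terms and confirms that the spectral-norm error equals σ_{k+1} and the Frobenius-norm error equals the root sum of squares of the discarded singular values.

```python
# A minimal sketch of the rank-one expansion and Eckart-Young truncation.
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((8, 6))
U, s, Vt = np.linalg.svd(M, full_matrices=False)

k = 2
# Best rank-k approximation: keep the k leading rank-one terms sigma_i u_i v_i^T.
M_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Spectral-norm error equals the (k+1)-th singular value;
# Frobenius-norm error equals sqrt(sigma_{k+1}^2 + ... + sigma_r^2).
print(np.isclose(np.linalg.norm(M - M_k, 2), s[k]))
print(np.isclose(np.linalg.norm(M - M_k, 'fro'), np.sqrt(np.sum(s[k:] ** 2))))
```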
Definition and formulation
- Algebraic statement: For M ∈ R^{m×n} (or complex), there exist unitary U ∈ R^{m×m} (or U ∈ C^{m×m}) and V ∈ R^{n×n} (or V ∈ C^{n×n}) such that M = U Σ V^*, where Σ is an m×n rectangular diagonal matrix with diagonal entries σ1 ≥ σ2 ≥ ... ≥ 0, of which exactly r = rank(M) are nonzero.
- Left and right singular vectors: The columns of U are the left singular vectors and the columns of V are the right singular vectors. These vectors form orthonormal bases in their respective spaces: U^* U = I_m and V^* V = I_n.
- Relationship to symmetric products: The nonzero singular values are the square roots of the nonzero eigenvalues of both M^* M and M M^*. Specifically, M^* M v_i = σ_i^2 v_i and M M^* u_i = σ_i^2 u_i, as checked numerically in the sketch after this list.
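A quick numerical check of the relationship between the SVD and M^* M, assuming NumPy and an arbitrary real test matrix:

```python
# A minimal sketch: the nonzero eigenvalues of M^T M are the squared singular values.
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 4))
U, s, Vt = np.linalg.svd(M, full_matrices=False)

# eigh handles the symmetric matrix M^T M; its eigenvalues come back ascending,
# so reverse them to compare against the descending singular values.
evals, evecs = np.linalg.eigh(M.T @ M)
evals = evals[::-1]
print(np.allclose(evals, s ** 2))        # eigenvalues of M^T M equal sigma_i^2

# Each right singular vector v_i satisfies M^T M v_i = sigma_i^2 v_i.
v0 = Vt[0]
print(np.allclose(M.T @ M @ v0, s[0] ** 2 * v0))
```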
Geometric and computational perspectives
- Geometric interpretation: M maps the unit sphere to a (possibly degenerate) ellipsoid; the right singular vectors give the input directions, the left singular vectors give the corresponding output directions, and the singular values are the stretch factors along those axes. This viewpoint clarifies why SVD is so effective for noise reduction and data compression.
- Connection to PCA: When the data matrix X is centered, the principal components of the data are the directions given by the right singular vectors of X, and the variances along those directions are proportional to the squared singular values (see the sketch after this list). See principal component analysis for a related perspective.
- Relationship to eigenvalue decompositions: M^* M is Hermitian (or symmetric in the real case), and its eigenvalue decomposition yields the right singular vectors and the squared singular values. Likewise, M M^* yields the left singular vectors and the same σ_i^2. See eigenvalue decomposition for the analogous concept on symmetric matrices.
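The PCA connection can be illustrated directly. The sketch below assumes NumPy and uses synthetic, anisotropic data; it compares the variances obtained from the singular values of the centered data matrix with the eigenvalues of the sample covariance matrix.

```python
# A minimal sketch of the PCA connection; X is synthetic illustration data.
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((100, 3)) @ np.diag([3.0, 1.0, 0.3])  # anisotropic cloud
Xc = X - X.mean(axis=0)                   # center the data first

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Rows of Vt are the principal directions; the squared singular values give the
# variance captured along each direction, up to the 1/(n-1) normalization.
explained_variance = s ** 2 / (X.shape[0] - 1)

# Cross-check against the eigendecomposition of the sample covariance matrix.
cov_evals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]
print(np.allclose(explained_variance, cov_evals))   # True
```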
Computation
- Algorithms: Dense SVD is typically computed by Golub–Kahan bidiagonalization followed by a QR-type iteration on the bidiagonal factor; for large sparse matrices, Lanczos-style iterative methods built on the same bidiagonalization are used. For very large matrices, randomized or truncated algorithms provide efficient approximations (a minimal randomized sketch follows this list). See Golub–Kahan bidiagonalization and randomized numerical linear algebra for details.
- Complexity: The exact SVD of an m×n matrix generally requires O(min(mn^2, m^2 n)) operations, though practical large-scale problems often use approximate methods to balance accuracy and cost.
- Numerical stability: The standard SVD algorithms are numerically stable, and the SVD is often preferred when matrices are ill-conditioned because the singular values make the separation between signal and numerical noise explicit.
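A randomized truncated SVD can be sketched in a few lines, in the spirit of randomized numerical linear algebra. The function below is illustrative only: the name randomized_svd, the oversample parameter, and the synthetic test matrix are assumptions of this sketch, not a library API.

```python
# A minimal sketch of a randomized truncated SVD (range finder + small exact SVD).
import numpy as np

def randomized_svd(A, k, oversample=10, seed=None):
    """Approximate the k leading singular triplets of A (illustrative helper)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Sample the range of A with a Gaussian test matrix.
    Omega = rng.standard_normal((n, k + oversample))
    Q, _ = np.linalg.qr(A @ Omega)        # orthonormal basis for the sampled range
    # Project A onto that basis and take the exact SVD of the small matrix.
    B = Q.T @ A                           # (k + oversample) x n
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k, :]

# Test matrix with rapidly decaying singular values (where the method shines;
# a power iteration step would help for slowly decaying spectra).
rng = np.random.default_rng(4)
A = (rng.standard_normal((500, 30)) * (0.5 ** np.arange(30))) @ rng.standard_normal((30, 200))

U, s, Vt = randomized_svd(A, k=10, seed=0)
s_exact = np.linalg.svd(A, compute_uv=False)[:10]
print(np.max(np.abs(s - s_exact) / s_exact))   # small relative error
```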
Special cases and variants
- Real and complex cases: The real case uses orthogonal U and V, while the complex case uses unitary U and V with V^* the conjugate transpose. The singular values themselves are always real and nonnegative.
- Symmetric and positive semidefinite matrices: If M is symmetric with eigendecomposition M = Q Λ Q^T, its singular values are the absolute values |λ_i| of the eigenvalues; one can take V = Q and obtain U by flipping the sign of each column of Q that corresponds to a negative eigenvalue (see the sketch after this list). When M is positive semidefinite, the singular values coincide with the eigenvalues and the left and right singular vectors coincide with the eigenvectors, so U = V = Q.
- Nonnegative matrix factorization and other decompositions: For certain applications, nonnegativity constraints or independence considerations lead to variations like nonnegative matrix factorization or independent component analysis; these offer alternatives when data interpretation requires nonnegativity or statistical independence.
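The symmetric case can be verified directly. In the sketch below (NumPy assumed; M is a random symmetric, generally indefinite matrix), the eigendecomposition is converted into a valid SVD by taking absolute values of the eigenvalues and absorbing their signs into the left singular vectors.

```python
# A minimal sketch of the symmetric case: sigma_i = |lambda_i|, with signs
# absorbed into the left singular vectors.
import numpy as np

rng = np.random.default_rng(5)
B = rng.standard_normal((4, 4))
M = B + B.T                               # real symmetric, possibly indefinite

lam, Q = np.linalg.eigh(M)                # M = Q diag(lam) Q^T
order = np.argsort(-np.abs(lam))          # sort by |lambda| descending
lam, Q = lam[order], Q[:, order]

sigma = np.abs(lam)                       # singular values
V = Q                                     # right singular vectors
U = Q * np.sign(lam)                      # flip columns where lambda_i < 0

print(np.allclose(U @ np.diag(sigma) @ V.T, M))                # valid SVD of M
print(np.allclose(sigma, np.linalg.svd(M, compute_uv=False)))  # matches np.linalg.svd
```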
Applications
- Data compression and noise reduction: By keeping only the leading k singular values and their associated vectors, one obtains a low-rank approximation that preserves the dominant structure of M while discarding smaller, noisier components (a compression sketch follows this list).
- Image and signal processing: SVD can be used to compress images and to denoise signals by truncating small singular values. See image compression for related techniques.
- Recommender systems: Collaborative filtering approaches frequently rely on low-rank matrix factorization to uncover latent user and item factors that explain observed ratings. See recommender systems for a broader treatment.
- Natural language processing: Latent semantic analysis (LSA) uses SVD on term-document matrices to reveal latent topics and associations in text data. See latent semantic analysis for more.
- Numerical linear algebra and model reduction: SVD provides a principled way to reduce the dimensionality of linear systems, aiding the stability and efficiency of simulations and computations. See model order reduction for related ideas.
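A minimal compression example, assuming NumPy and a synthetic grayscale "image" rather than a real file, shows the storage savings and reconstruction error of a rank-k truncation.

```python
# A minimal sketch of low-rank image compression on a synthetic 256x256 array.
import numpy as np

rng = np.random.default_rng(6)
x = np.linspace(0, 1, 256)
# Smooth low-rank structure plus a little noise.
img = np.outer(np.sin(4 * np.pi * x), np.cos(3 * np.pi * x)) \
      + 0.01 * rng.standard_normal((256, 256))

U, s, Vt = np.linalg.svd(img, full_matrices=False)

k = 20
img_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Storage: k columns of U, k rows of Vt, and k singular values.
stored = k * (U.shape[0] + Vt.shape[1] + 1)
rel_err = np.linalg.norm(img - img_k, 'fro') / np.linalg.norm(img, 'fro')
print(f"storage ratio: {stored / img.size:.3f}, relative error: {rel_err:.4f}")
```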
Limitations and considerations
- Linearity and linear models: SVD captures linear structure. Data with strong nonlinear relationships may require nonlinear methods or kernelized approaches. See kernel methods for alternatives.
- Interpretability of components: While singular vectors have a precise mathematical meaning, the interpretation of individual components can be subtle, especially in high-dimensional settings.
- Data preparation: In PCA-related workflows, centering the data is important because an SVD of the uncentered data matrix can blur the intended variance structure (see the sketch after this list). See data normalization for general preprocessing techniques.
- Sensitivity to outliers: Like many linear methods, SVD can be affected by outliers; robust variants or preprocessing steps may be warranted in noisy domains.
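The effect of centering can be seen in a small example. In the sketch below (NumPy assumed; the data are synthetic, with a large mean offset), the leading right singular vector of the uncentered matrix points toward the offset, while the centered version recovers the direction of largest variance.

```python
# A minimal sketch of why centering matters in PCA-style uses of the SVD.
import numpy as np

rng = np.random.default_rng(7)
# Data varying mostly along the y-axis, but offset far from the origin along x.
X = np.column_stack([10 + 0.1 * rng.standard_normal(200),
                     3.0 * rng.standard_normal(200)])

v_raw = np.linalg.svd(X, full_matrices=False)[2][0]                    # uncentered
v_cen = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)[2][0]   # centered

print(np.round(np.abs(v_raw), 3))   # ~[1, 0]: dominated by the mean offset
print(np.round(np.abs(v_cen), 3))   # ~[0, 1]: the true direction of variance
```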
See also
- linear algebra
- matrix
- eigenvalue decomposition
- principal component analysis
- image compression
- recommender systems
- latent semantic analysis
- numerical linear algebra