Levenberg–Marquardt algorithm

Levenberg–Marquardt is a widely used optimization method for nonlinear least squares. It blends the Gauss–Newton approach with gradient descent by introducing a damping parameter that adapts during iteration. The method is named after Kenneth Levenberg and Donald Marquardt and has become a standard tool in scientific computing, data fitting, and calibration tasks. It is especially valued for its robustness and efficiency in problems where the residuals are nonlinear functions of the parameters.

Introductory overview

  • The goal is to minimize a cost function of the form 1/2 ||r(x)||^2, where x is a vector of parameters and r(x) is a vector of residuals. This framework is commonly referred to as nonlinear least squares.
  • The method adapts between Gauss–Newton, which is fast near a good solution but can diverge far from it, and gradient descent, which is robust but slow. The adaptation is governed by a damping parameter that controls the step taken in parameter space.
  • In practice, Levenberg–Marquardt is implemented by solving a damped linear system at each iteration and updating the parameters accordingly. The approach is widely used in areas such as curve fitting, computer vision, and bundle adjustment in 3D reconstruction.
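
As an illustrative sketch only (the exponential-decay model, the synthetic data, and the parameter values below are assumptions for demonstration, not part of the method), SciPy's nonlinear least-squares interface exposes a Levenberg–Marquardt backend that can be used for a simple curve fit:

```python
# Minimal curve-fitting sketch using SciPy's Levenberg-Marquardt backend.
import numpy as np
from scipy.optimize import least_squares

def residuals(params, t, y):
    """Residuals of an illustrative exponential-decay model y = a * exp(-b * t)."""
    a, b = params
    return a * np.exp(-b * t) - y

# Synthetic data for demonstration purposes.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 50)
y = 2.5 * np.exp(-1.3 * t) + 0.05 * rng.standard_normal(t.size)

# method="lm" selects a MINPACK-based Levenberg-Marquardt implementation.
result = least_squares(residuals, x0=[1.0, 1.0], args=(t, y), method="lm")
print(result.x)  # fitted (a, b), close to the true values (2.5, 1.3)
```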

History

  • The ideas underlying nonlinear least squares have a long history, building on the Gauss–Newton framework. The particular damping approach that characterizes Levenberg–Marquardt was introduced by Kenneth Levenberg in 1944 and later formalized and popularized by Donald Marquardt in 1963.
  • The algorithm has since become a staple in many scientific computing libraries and has inspired numerous related methods, including variants designed for large-scale and sparse problems, as well as robust formulations that mitigate the influence of outliers.

Mathematical formulation

  • Consider a residual vector r(x) ∈ R^m, with x ∈ R^n as the parameter vector. The objective is to minimize F(x) = 1/2 ||r(x)||^2.
  • Let J(x) be the Jacobian matrix of r with respect to x (J ∈ R^(m×n)). A single Levenberg–Marquardt step involves solving the linear system (J^T J + λ diag(J^T J)) Δx = -J^T r, where λ > 0 is the damping parameter and diag(J^T J) denotes the diagonal matrix formed from the diagonal entries of J^T J; Levenberg's original formulation used the identity matrix in place of this scaling. A one-step sketch in code follows this list.
  • The parameter update is x_new = x + Δx. The role of λ is to interpolate between the Gauss–Newton direction (small λ) and gradient descent (large λ). When a step reduces the cost, λ is decreased to move toward Gauss–Newton; when a step fails to improve the cost, λ is increased to make the method more conservative.
  • An interpretation in terms of trust regions is common: the damping parameter effectively defines a trust region around the current x within which the model is trusted to approximate the true cost.
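
As a concrete illustration of the damped linear system above, the following minimal sketch computes a single step with NumPy; the function name lm_step is illustrative, and the diagonal of J^T J is used as the scaling matrix, as in the formulation given here:

```python
import numpy as np

def lm_step(J, r, lam):
    """Solve (J^T J + lam * diag(J^T J)) dx = -J^T r for one damped step.

    J   : (m, n) Jacobian of the residuals at the current parameters
    r   : (m,) residual vector at the current parameters
    lam : positive damping parameter
    """
    JTJ = J.T @ J
    # Marquardt-style scaling: damp along the diagonal of J^T J.
    A = JTJ + lam * np.diag(np.diag(JTJ))
    g = J.T @ r  # gradient of 0.5 * ||r||^2
    return np.linalg.solve(A, -g)
```

As lam approaches zero the step approaches the Gauss–Newton step, while a large lam shrinks it toward a scaled gradient-descent direction.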

Algorithm

  • Initialize x_0 and a positive λ_0. Choose a factor to scale λ up or down depending on whether the step succeeds.
  • Repeat until convergence (a minimal sketch of the full loop follows this list):
    • Compute r(x_k) and J(x_k).
    • Solve (J^T J + λ_k D) Δx_k = -J^T r(x_k), with D often chosen as diag(J^T J) or another positive definite diagonal matrix.
    • If F(x_k + Δx_k) < F(x_k), accept the step: x_{k+1} = x_k + Δx_k and reduce λ_k (e.g., λ_{k+1} = λ_k / ν with ν > 1).
    • Otherwise, reject the step, increase λ_k (e.g., λ_{k+1} = ν λ_k), and retry.
  • Stop when the gradient norm or the step size falls below a tolerance, or when the decrease in F becomes negligible.
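
Putting the steps above together, the following is a minimal sketch of the full iteration; the function and argument names are illustrative, the residual and jacobian callables are assumed to be supplied by the user, and production implementations add better damping schedules, scaling, and safeguards:

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, x0, lam=1e-3, nu=10.0,
                        tol=1e-8, max_iter=200):
    """Minimal Levenberg-Marquardt loop following the steps listed above."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = residual(x)
        J = jacobian(x)
        g = J.T @ r                        # gradient of F(x) = 0.5 * ||r||^2
        if np.linalg.norm(g, np.inf) < tol:
            break                          # gradient small enough: converged
        JTJ = J.T @ J
        D = np.diag(np.diag(JTJ))          # diagonal scaling matrix
        dx = np.linalg.solve(JTJ + lam * D, -g)
        r_new = residual(x + dx)
        if 0.5 * (r_new @ r_new) < 0.5 * (r @ r):
            x = x + dx                     # accept: move toward Gauss-Newton
            lam /= nu
            if np.linalg.norm(dx) < tol * (np.linalg.norm(x) + tol):
                break                      # step size negligible
        else:
            lam *= nu                      # reject: behave more conservatively
    return x
```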

Variants and extensions

  • Weighted and robust variants: The basic LM algorithm can be extended with robust loss functions to lessen the impact of outliers on the fit; a reweighting sketch follows this list.
  • Large-scale and sparse problems: For problems with many parameters but sparse Jacobians, sparse linear algebra techniques and iterative solvers are employed to improve efficiency.
  • Hybrid methods and damping strategies: Different schemes for updating the damping parameter, including more aggressive or adaptive schedules, can improve convergence in challenging landscapes.
  • Connections to other optimization paradigms: LM is closely related to trust-region methods and can be viewed as a specialized damped Newton method.
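
As a sketch of one robust variant, in the spirit of iteratively reweighted least squares (the helper names and the choice of the Huber loss are assumptions, and the dependence of the weights on the parameters is neglected when forming the Jacobian, as is common in practice), residuals and Jacobian rows can be rescaled before each LM solve so that large residuals carry less weight:

```python
import numpy as np

def huber_weights(r, delta=1.0):
    """IRLS weights w_i = rho'(r_i) / r_i for the Huber loss."""
    a = np.abs(r)
    return np.where(a <= delta, 1.0, delta / np.maximum(a, np.finfo(float).tiny))

def reweighted_problem(residual, jacobian, x, delta=1.0):
    """Scale residuals and Jacobian rows by sqrt(w) so that a standard
    Levenberg-Marquardt solve on the weighted problem down-weights outliers."""
    r = residual(x)
    J = jacobian(x)
    sw = np.sqrt(huber_weights(r, delta))
    return sw * r, sw[:, None] * J
```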

Applications

  • Curve fitting: Nonlinear models fitted to experimental or observational data are a standard use case.
  • Computer vision: LM underpins many parameter estimation tasks, including camera calibration, lens distortion correction, and feature tracking.
  • Photogrammetry and remote sensing: Parameter estimation for sensor models and geometric calibration often relies on LM.
  • Robotics and aerospace: State estimation, calibration, and system identification frequently employ Levenberg–Marquardt as a practical solver.
  • Bundle adjustment: A central technique in 3D reconstruction and structure-from-motion that refines camera poses and 3D point positions by solving a large nonlinear least squares problem.

Practical considerations

  • Jacobian computation: J can be obtained by analytic derivation, automatic differentiation, or finite differences; each choice trades off accuracy and computational cost. A finite-difference sketch follows this list.
  • Initialization: Since the objective is nonconvex in general, a sensible initial guess improves convergence reliability.
  • Scaling and conditioning: Proper scaling of residuals and parameters helps stabilize the algorithm, particularly when the problem has ill-conditioned Jacobians.
  • Numerical stability: Careful implementation of linear solvers and regularization of near-singular J^T J are important for robustness.
  • Software and libraries: Numerous numerical libraries implement Levenberg–Marquardt, often with interfaces to automatic differentiation and sparse matrix techniques; examples include MINPACK, SciPy, Ceres Solver, and the GNU Scientific Library.
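
For instance, when analytic derivatives are unavailable, a forward-difference approximation of the Jacobian can be used; the sketch below uses an illustrative step-size rule, and automatic differentiation or analytic derivatives are generally more accurate:

```python
import numpy as np

def numerical_jacobian(residual, x, eps=1e-6):
    """Forward-difference approximation of the Jacobian of residual(x)."""
    x = np.asarray(x, dtype=float)
    r0 = residual(x)
    J = np.empty((r0.size, x.size))
    for j in range(x.size):
        step = eps * max(1.0, abs(x[j]))   # scale the step to the parameter
        xp = x.copy()
        xp[j] += step
        J[:, j] = (residual(xp) - r0) / step
    return J
```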

See also