Infimal convolution
Infimal convolution is a fundamental operation in convex analysis and optimization that blends two penalty or objective functions into a single, well-behaved surrogate. For each point x it selects the best split of x into y and x − y, the one minimizing the sum of the two functions, and in doing so it encodes an additive decomposition that is central to regularization, proximal methods, and dual reformulations. Together with classical constructions of convex analysis, it serves as a bridge between non-smooth penalties and smooth approximations, and it plays a key role in algorithmic frameworks for large-scale optimization.
From a mathematical standpoint, infimal convolution gives a principled way to combine two objectives while preserving essential structural properties. It generalizes many familiar constructions and provides a versatile tool for both theory and computation. In practice, it often yields new functions that are easier to handle in optimization routines while still reflecting the original goals encoded by f and g. The operation is deeply linked to duality, to the geometry of sublevel sets, and to the proximal methods that underpin modern convex optimization.
Definition and basic properties
Let X be a real vector space (typically ℝ^n) and let f, g: X → (−∞, +∞] be proper lower semicontinuous convex functions. The infimal convolution of f and g is the function defined by (f ⊞ g)(x) = inf_y { f(y) + g(x − y) } for all x ∈ X. It can be viewed as the best way to decompose x into a sum y + (x − y) so that the total penalty f(y) + g(x − y) is minimized; when the infimum is attained for every x, the infimal convolution is said to be exact.
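A brief worked example, not taken from any particular source, may help fix the definition: the infimal convolution of two one-dimensional quadratics can be computed in closed form by differentiating in y.

```latex
% Infimal convolution of two scalar quadratics f(y) = (a/2) y^2 and g(z) = (b/2) z^2, with a, b > 0.
\[
(f \boxplus g)(x) \;=\; \inf_{y} \Big\{ \tfrac{a}{2}\, y^{2} + \tfrac{b}{2}\, (x - y)^{2} \Big\}.
\]
% Setting the derivative in y to zero gives a y - b (x - y) = 0, i.e. y^* = \frac{b}{a + b}\, x,
% and substituting back yields
\[
(f \boxplus g)(x) \;=\; \frac{1}{2} \cdot \frac{a b}{a + b}\, x^{2},
\]
% so the curvatures combine like a parallel sum and the result is flatter than either quadratic.
```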
Key properties and consequences:
- Commutativity: f ⊞ g = g ⊞ f, so the order of the terms does not matter.
- Identity element: The indicator function δ_0 of the origin, defined by δ_0(z) = 0 if z = 0 and δ_0(z) = +∞ otherwise, acts as the neutral element: f ⊞ δ_0 = f.
- Regularization connection: The infimal convolution with a quadratic function yields the Moreau envelope, a smooth approximation of f. Concretely, if q(z) = (1/(2λ))‖z‖², then e_λ f = f ⊞ q, where e_λ f is the Moreau envelope of f with parameter λ > 0.
- Convexity and regularity: If f and g are proper convex, then f ⊞ g is convex, and it is proper whenever f and g share a common affine minorant. Lower semicontinuity requires additional assumptions; it holds, for instance, when one of the functions is supercoercive (meaning f(x)/‖x‖ → +∞ as ‖x‖ → ∞), in which case the infimum is also attained. Under such conditions, infimal convolution is a natural tool for constructing well-behaved surrogate objectives.
- Dual viewpoint: The Fenchel conjugate of an infimal convolution satisfies (f ⊞ g)^* = f^* + g^*, connecting the primal operation to a simple sum in the dual world and enabling convenient dual analyses (a short derivation is sketched after this list).
- Subdifferential structure: Under mild regularity, the subdifferential of the infimal convolution at x can be traced to subgradients of f and g at an optimal splitting y that achieves the infimum, providing a practical handle for optimization algorithms.
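The dual identity admits a short derivation using only the definition of the Fenchel conjugate: the two suprema are merged and the substitution z = x − y decouples the terms.

```latex
\begin{align*}
(f \boxplus g)^{*}(u)
  &= \sup_{x} \Big\{ \langle u, x \rangle - \inf_{y} \big( f(y) + g(x - y) \big) \Big\} \\
  &= \sup_{x,\, y} \Big\{ \langle u, x \rangle - f(y) - g(x - y) \Big\} \\
  &= \sup_{y,\, z} \Big\{ \langle u, y \rangle - f(y) + \langle u, z \rangle - g(z) \Big\}
   \;=\; f^{*}(u) + g^{*}(u).
\end{align*}
```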
Further reading on these dual and subgradient relationships often appears in the context of Fenchel conjugate theory and Convex analysis.
Examples
Huber-style smoothing (Huber loss): The Moreau envelope e_λ f of f(x) = |x| is the Huber loss, a smooth approximation to the absolute value. This is a classic illustration of how infimal convolution with a quadratic produces a function that is quadratic near the origin, grows linearly like |x| away from it, and is differentiable everywhere, a property exploited in robust statistics and signal processing. See also Huber loss for connections to robust criteria.
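A minimal numerical sketch, assuming NumPy (the helper names and the grid-based minimization are illustrative choices, not a standard library API), compares a brute-force evaluation of the infimal convolution of |x| with a quadratic against the closed-form Huber expression:

```python
import numpy as np

def huber(x, lam):
    """Closed-form Moreau envelope of f(t) = |t| with parameter lam (the Huber function)."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= lam, x ** 2 / (2.0 * lam), np.abs(x) - lam / 2.0)

def envelope_numeric(x, lam, grid):
    """Brute-force infimal convolution of |.| with (1/(2*lam))*(.)^2 at a scalar x."""
    return np.min(np.abs(grid) + (x - grid) ** 2 / (2.0 * lam))

lam = 0.5
grid = np.linspace(-5.0, 5.0, 200001)
for x in (-2.0, -0.3, 0.0, 0.4, 3.0):
    print(f"x={x:+.2f}  numeric={envelope_numeric(x, lam, grid):.6f}  Huber={float(huber(x, lam)):.6f}")
```

The two columns agree to grid accuracy, illustrating that the Huber loss is exactly the infimal convolution of the absolute value with a scaled quadratic.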
Indicator functions and penalties: If f is a penalty and g = δ_C is the indicator of a constraint set C, then (f ⊞ δ_C)(x) = inf_{c ∈ C} f(x − c), so the penalty is charged only on the displacement needed to move x into C; this expresses the constraint in a relaxed or split way rather than enforcing it exactly. For indicators, the identity δ_0 plays the central role noted above.
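A standard special case makes this concrete: convolving a norm with the indicator of a closed convex set C produces the distance function to C (a textbook identity, restated here as an illustration).

```latex
\[
\big( \|\cdot\| \boxplus \delta_{C} \big)(x)
  \;=\; \inf_{y} \big\{ \|y\| + \delta_{C}(x - y) \big\}
  \;=\; \inf_{c \in C} \|x - c\|
  \;=\; \operatorname{dist}(x, C).
\]
```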
Proximal mappings: The proximal operator prox_{λf}(x) returns the minimizing point y in the infimal convolution of f with the quadratic (1/(2λ))‖·‖², that is, the optimal splitting behind the Moreau envelope. Infimal convolution therefore provides a broader lens for understanding how penalties interact with additive decompositions and regularization. See Proximal operator for related constructs and algorithmic implications.
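For f(x) = |x| this connection can be checked directly: the proximal operator is soft thresholding, and it coincides with the optimal splitting point in the infimal convolution. A short sketch, assuming NumPy and with illustrative helper names:

```python
import numpy as np

def prox_abs(x, lam):
    """prox_{lam*|.|}(x): soft thresholding, in closed form."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

# The prox is the optimal splitting point y in the infimal convolution of |.|
# with the quadratic (1/(2*lam))*(.)^2, verified here by brute-force search.
lam = 0.5
grid = np.linspace(-5.0, 5.0, 200001)
for x in (-2.0, -0.3, 0.4, 3.0):
    split_values = np.abs(grid) + (x - grid) ** 2 / (2.0 * lam)
    y_grid = grid[np.argmin(split_values)]
    print(f"x={x:+.2f}  grid argmin={y_grid:+.5f}  prox={prox_abs(x, lam):+.5f}")
```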
Applications and relevance
Infimal convolution arises in several core areas of optimization and applied mathematics:
Regularization and model decomposition: By blending penalty terms through infimal convolution, for instance a fidelity-oriented term with a structural regularizer, one obtains surrogate objectives that balance fidelity and structure in a controlled way, with the variable implicitly decomposed into parts handled by each term (as sketched below). This perspective underpins many additive decomposition schemes used in signal processing and statistics.
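In symbols, an infimal-convolution regularizer implicitly splits the variable into two components, each charged by one penalty. The schematic formulation below uses a generic data term D and penalties R_1, R_2, which are placeholders rather than a specific model from the literature.

```latex
\[
\min_{x} \; D(x) + (R_{1} \boxplus R_{2})(x)
  \;=\; \min_{x,\, u} \; D(x) + R_{1}(u) + R_{2}(x - u),
\]
% so x is decomposed as x = u + (x - u), with each component penalized by the term
% best suited to its structure.
```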
Moreau envelope and smoothing: The link to the Moreau envelope makes infimal convolution a natural tool for producing smooth approximations of non-smooth penalties. This is particularly valuable in gradient-based optimization, where differentiability improves both convergence behavior and numerical stability. See Moreau envelope for a detailed treatment.
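The smoothing claim can be made precise: for a proper lower semicontinuous convex f, the Moreau envelope is differentiable with a (1/λ)-Lipschitz gradient expressed through the proximal operator, and it has the same infimum and the same minimizers as f (standard identities, restated here for reference).

```latex
\[
\nabla e_{\lambda} f(x) \;=\; \frac{x - \operatorname{prox}_{\lambda f}(x)}{\lambda},
\qquad
\inf_{x} e_{\lambda} f(x) = \inf_{x} f(x),
\qquad
\operatorname*{arg\,min}_{x} e_{\lambda} f = \operatorname*{arg\,min}_{x} f .
\]
```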
Dual formulations and operator splitting: The duality property (f ⊞ g)^* = f^* + g^* often simplifies the derivation of dual problems and informs the design of operator-splitting algorithms such as Douglas–Rachford and related proximal splitting methods. See Douglas–Rachford algorithm and Proximal algorithm for context.
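As a minimal sketch of how proximal splitting uses the two proximal operators separately, the following one-dimensional Douglas–Rachford iteration (NumPy assumed; the step size, iteration count, and test problem are illustrative choices) minimizes a quadratic data term plus an ℓ1 penalty and recovers the known soft-thresholding solution:

```python
import numpy as np

def prox_soft(v, t):
    """prox of t*|.|: soft thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_quad(v, t, b):
    """prox of t*(1/2)*(x - b)**2: closed-form minimizer (v + t*b)/(1 + t)."""
    return (v + t * b) / (1.0 + t)

def douglas_rachford(b, lam, t=1.0, iters=200):
    """Minimize (1/2)*(x - b)**2 + lam*|x| by Douglas-Rachford splitting."""
    z = 0.0
    for _ in range(iters):
        x = prox_quad(z, t, b)                # prox step on the smooth data term
        y = prox_soft(2.0 * x - z, t * lam)   # prox step on the l1 penalty
        z = z + y - x                         # reflect-and-average update
    return x

b, lam = 2.0, 0.5
print(douglas_rachford(b, lam))   # approximately 1.5
print(prox_soft(b, lam))          # closed-form minimizer: soft-threshold(b, lam) = 1.5
```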
Robust and statistical methods: Because infimal convolution preserves convexity and computational tractability, it features in robust estimation and regularized learning, where non-smooth penalties (such as the ℓ1 norm) are paired with smoothing terms or constraints.