Jensen's inequality

Jensen's inequality is a cornerstone result in mathematics with wide-ranging implications in economics, statistics, and analysis. At its core, it compares the value of a convex (or concave) function applied to an average with the average of the function applied to individual values. This simple idea captures why nonlinear preferences and nonlinear losses behave differently from linear ones, and it provides practical bounds in uncertainty and optimization problems.

The inequality is named after the Danish mathematician Johan Jensen and appears in both discrete and continuous settings. It is a standard tool in the toolkit of convex analysis and is frequently invoked in the study of probability theory and statistics. Its power lies in translating information about the average input into information about the average output of a nonlinear transformation.

Statement and intuition

Let X be a random variable with a finite expectation (mean), and let φ be a function defined on an interval that contains the range of X. If φ is a convex function on that interval, then the inequality E[φ(X)] ≥ φ(E[X]) holds, where E denotes the expected value with respect to the underlying probability distribution. If φ is strictly convex, equality occurs only when X is almost surely constant (i.e., its values do not vary). If φ is concave instead of convex, the inequality reverses: E[φ(X)] ≤ φ(E[X]).

A useful intuition is that a convex function “curves upward”: every chord lies above the graph, so the average of the function’s values sits at or above the function evaluated at the average input. In the discrete case, if X takes finitely many values x_i with probabilities p_i, the same bound reads: ∑ p_i φ(x_i) ≥ φ(∑ p_i x_i).
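The discrete form above can be checked numerically. The following sketch uses φ(t) = t² (a convex function) and an illustrative three-point distribution; the specific values and probabilities are chosen only for the example.

```python
# Numerical check of discrete Jensen's inequality:
# sum(p_i * phi(x_i)) >= phi(sum(p_i * x_i)) for convex phi.

def phi(t):
    """A convex function (here, the square)."""
    return t * t

xs = [1.0, 2.0, 6.0]   # values taken by X (illustrative)
ps = [0.5, 0.3, 0.2]   # their probabilities (sum to 1)

avg_of_phi = sum(p * phi(x) for p, x in zip(ps, xs))   # E[phi(X)]
phi_of_avg = phi(sum(p * x for p, x in zip(ps, xs)))   # phi(E[X])

print(avg_of_phi, phi_of_avg)        # the first is always >= the second
assert avg_of_phi >= phi_of_avg
```

Swapping in any other convex φ (such as `abs` or `math.exp`) preserves the inequality; a concave φ reverses it.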

Key variants include the extension to conditional expectations, the version for measures other than probability measures, and generalizations to vector-valued inputs or to functions φ that are convex on more complex domains. See also the notions of convex function and concave function in the context of real-valued functions, and the role of the inequality in the broader framework of convex analysis.

Applications of the inequality often rely on choosing φ to reflect a particular loss, utility, or growth structure. For example, when φ(t) = t^2, Jensen’s inequality implies that E[X^2] ≥ (E[X])^2, which is equivalent to the familiar nonnegativity of variance: Var(X) = E[X^2] − (E[X])^2 ≥ 0.
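The variance identity can be verified directly on a sample. This sketch draws an illustrative sample (seed and sample size are arbitrary) and confirms that the sample analogue of E[X²] − (E[X])² is nonnegative, as the φ(t) = t² case of Jensen's inequality guarantees.

```python
# Var(X) = E[X^2] - (E[X])^2 >= 0, a direct consequence of
# Jensen's inequality with phi(t) = t**2, checked on a sample.
import random

random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(10_000)]

mean = sum(sample) / len(sample)                 # sample E[X]
mean_sq = sum(x * x for x in sample) / len(sample)  # sample E[X^2]
variance = mean_sq - mean ** 2

assert mean_sq >= mean ** 2   # Jensen: E[X^2] >= (E[X])^2
assert variance >= 0.0
```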

History and context

The result is attributed to Johan Jensen and became a standard reference point in early 20th-century developments in probability and analysis. Over time, it was recognized for its utility across disciplines:

  • In economics and econometrics, Jensen's inequality underpins arguments about risk, uncertainty, and the behavior of non-linear utility functions.
  • In statistics and research on estimators, it provides a simple route to derive bounds and to understand how nonlinear transformations interact with averaging.
  • In optimization and machine learning, convex losses and convex penalties often lead to bounds and convergence guarantees that rely on Jensen-type reasoning.

Variants and extensions

  • Conditional Jensen inequality: E[φ(X) | F] ≥ φ(E[X | F]) for a convex φ and a σ-algebra F, providing a way to reason about information available up to a certain point.
  • Multivariate and functional forms: The inequality extends to expectations of convex functionals on spaces of random vectors, enabling bounds in higher dimensions and in function spaces.
  • Generalizations to other spaces: Variants exist for measures on general spaces with appropriate integrability and convexity conditions.
  • Equality conditions: The precise character of when equality holds depends on the convexity type and the distribution of X; strict convexity yields equality only when X is degenerate.
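The conditional variant can be illustrated on a finite space, where conditioning on a σ-algebra reduces to averaging within the cells of a partition. The grouping and values below are invented for the example; the check is that E[φ(X) | F] ≥ φ(E[X | F]) holds on every cell for the convex φ(t) = |t|.

```python
# Sketch of the conditional Jensen inequality on a finite space:
# condition on a grouping (a finite sigma-algebra) and verify the
# inequality cell by cell, using the convex function phi(t) = abs(t).

def phi(t):
    return abs(t)   # convex

# Outcomes: (cell label, value of X), each equally likely (illustrative).
outcomes = [("a", -2.0), ("a", 3.0), ("b", 1.0), ("b", 5.0), ("b", -4.0)]

groups = {}
for g, x in outcomes:
    groups.setdefault(g, []).append(x)

for g, xs in groups.items():
    cond_mean = sum(xs) / len(xs)                        # E[X | F] on cell g
    cond_mean_phi = sum(phi(x) for x in xs) / len(xs)    # E[phi(X) | F]
    assert cond_mean_phi >= phi(cond_mean)
```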

Applications and implications

  • Risk and utility in economics: For a concave utility function u, Jensen's inequality implies E[u(W)] ≤ u(E[W]), illustrating why a risky wealth W yields lower expected utility than a sure outcome with the same mean. This underpins the qualitative idea of risk aversion and explains the value placed on reducing variance in outcomes, for example through diversification.
  • Finance and portfolio theory: The inequality helps justify why nonlinear pricing, convex payoff structures, and diversification strategies affect expected payoffs differently than linear models would suggest.
  • Statistics and estimation: By bounding the expectation of a nonlinear transformation, Jensen's inequality provides quick, interpretable bounds for moments and for estimators that involve nonlinear functions.
  • Information theory and learning: In machine learning, convex loss functions and convex combinations of predictions give rise to guarantees and bounds that rely on Jensen-type reasoning about averages and nonlinear maps.
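The risk-aversion point above can be made concrete. This sketch uses u(w) = √w as an example of a concave utility and an invented 50/50 lottery with mean wealth 100; it confirms that expected utility falls short of the utility of the sure mean, exactly as the concave form of Jensen's inequality predicts.

```python
# Risk aversion via Jensen: for a concave utility u, E[u(W)] <= u(E[W]).
# The utility and lottery below are illustrative assumptions.
import math

def u(w):
    return math.sqrt(w)   # concave on [0, infinity)

wealths = [50.0, 150.0]   # a 50/50 lottery with mean 100
probs = [0.5, 0.5]

expected_utility = sum(p * u(w) for p, w in zip(probs, wealths))  # E[u(W)]
utility_of_mean = u(sum(p * w for p, w in zip(probs, wealths)))   # u(E[W])

print(expected_utility, utility_of_mean)
assert expected_utility <= utility_of_mean
```

The gap between the two numbers is one way to quantify the premium a risk-averse agent would pay to replace the lottery with its mean.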

See also