Log Normal Distribution
The log normal distribution is a probability distribution of a positive-valued variable whose logarithm is normally distributed. In practical terms, if a random variable X is log-normally distributed, then Y = ln(X) follows a normal distribution. This relationship makes the log normal a natural model for quantities that grow multiplicatively over time or through successive stages, rather than by simple addition. The distribution plays a central role in statistics and in applied fields where products of random factors accumulate.
Formally, X is log-normally distributed with parameters mu and sigma^2 if ln(X) ~ N(mu, sigma^2). The probability density function for x > 0 is f(x) = (1 / (x sigma sqrt(2 pi))) exp(- (ln x - mu)^2 / (2 sigma^2)). The cumulative distribution function is F(x) = Phi((ln x - mu) / sigma), where Phi is the standard normal CDF. Key moments include:
- median(X) = exp(mu)
- mean(X) = exp(mu + sigma^2 / 2)
- variance(X) = (exp(sigma^2) - 1) exp(2 mu + sigma^2)
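The formulas above translate directly into code. The following is a minimal sketch using only the Python standard library; the function names are illustrative choices, not part of any established API.

```python
import math
from statistics import NormalDist


def lognormal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density f(x) of the log-normal; zero outside the support x > 0."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def lognormal_cdf(x: float, mu: float, sigma: float) -> float:
    """F(x) = Phi((ln x - mu) / sigma), with Phi the standard normal CDF."""
    if x <= 0:
        return 0.0
    return NormalDist().cdf((math.log(x) - mu) / sigma)


def lognormal_moments(mu: float, sigma: float) -> dict:
    """Median, mean, and variance from the closed-form expressions."""
    s2 = sigma * sigma
    return {
        "median": math.exp(mu),
        "mean": math.exp(mu + s2 / 2.0),
        "variance": (math.exp(s2) - 1.0) * math.exp(2.0 * mu + s2),
    }


# Half of the probability mass lies below the median exp(mu).
print(lognormal_cdf(math.exp(0.3), mu=0.3, sigma=0.8))
print(lognormal_moments(mu=0.0, sigma=1.0))
```

Evaluating the CDF at exp(mu) should give one half for any sigma, which is a quick sanity check on an implementation.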
The log normal is intimately connected to the normal distribution. If a variable X is log-normal, its logarithm is normal, and many properties of X can be derived from the corresponding normal distribution. One important consequence is closure under multiplication: if two independent positive variables are log-normal, their product is also log-normal, because the logarithm of the product is the sum of two independent normal variables, which is itself normal. This makes the log normal a natural model for processes in which growth compounds across independent steps.
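Closure under multiplication can be checked numerically. For independent X1 and X2 with parameters (mu1, sigma1^2) and (mu2, sigma2^2), ln(X1 X2) = ln(X1) + ln(X2) is normal with mean mu1 + mu2 and variance sigma1^2 + sigma2^2. The sketch below draws samples with the standard library's random.lognormvariate and compares the sample moments of the log-product with those values; the parameter values, seed, and sample size are arbitrary illustrative choices.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

mu1, sigma1 = 0.2, 0.5   # illustrative parameters for X1
mu2, sigma2 = -0.1, 0.3  # illustrative parameters for X2
n = 100_000

# Logarithms of products of independent log-normal draws.
log_products = [
    math.log(rng.lognormvariate(mu1, sigma1) * rng.lognormvariate(mu2, sigma2))
    for _ in range(n)
]

sample_mean = sum(log_products) / n
sample_var = sum((v - sample_mean) ** 2 for v in log_products) / n

print("mean of ln(X1*X2):", sample_mean, "expected:", mu1 + mu2)
print("var  of ln(X1*X2):", sample_var, "expected:", sigma1**2 + sigma2**2)
```

Agreement of the first two moments does not by itself prove normality of the log-product; it only confirms that the parameters combine as stated.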
Characteristics
- Support and shape: The log normal is strictly positive and is typically right-skewed, with a long tail extending toward large values. The degree of skew depends on sigma; larger sigma yields a fatter tail.
- Relationship to the normal distribution: The transformation Y = ln(X) maps the log-normal distribution to a normal distribution, providing a bridge between multiplicative processes and familiar Gaussian intuition.
- Moments and tails: The mean is pulled upward by the tail, so it exceeds the median, exp(mu), which in turn exceeds the most probable value (the mode), exp(mu - sigma^2). This makes the log normal useful for modeling quantities where a few extremely large outcomes drive averages.
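The effect of sigma on the shape can be made concrete with closed-form summaries. The sketch below uses the median and mean given earlier together with two standard results not stated in this article: the mode exp(mu - sigma^2) and the skewness (exp(sigma^2) + 2) sqrt(exp(sigma^2) - 1). The function name is an illustrative choice.

```python
import math


def shape_summary(mu: float, sigma: float) -> dict:
    """Closed-form mode, median, mean, and skewness of a log-normal."""
    s2 = sigma * sigma
    return {
        "mode": math.exp(mu - s2),
        "median": math.exp(mu),
        "mean": math.exp(mu + s2 / 2.0),
        # Skewness depends on sigma only, not on mu.
        "skewness": (math.exp(s2) + 2.0) * math.sqrt(math.exp(s2) - 1.0),
    }


# As sigma grows, mode < median < mean spread apart and the skewness rises.
for sigma in (0.25, 0.5, 1.0):
    print(sigma, shape_summary(0.0, sigma))
```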
Conceptually, the log-normal distribution arises in contexts where a quantity results from the product of many small, independent factors. The logarithm of such a product is a sum of many independent terms, so the central limit theorem (under its usual conditions, such as finite variance of the terms) gives approximate normality on the log scale, and hence approximate log-normality on the original scale. This perspective is common in studies of geometric growth, stock prices under certain models, and the sizes of firms or cities in some regimes.
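The multiplicative picture can be illustrated with a small simulation. In the sketch below each quantity is the product of 100 independent factors drawn uniformly from [0.9, 1.1]; the factor distribution, the counts, and the seed are assumptions made purely for illustration. The products themselves come out clearly right-skewed, while their logarithms are close to symmetric, as approximate normality on the log scale would suggest.

```python
import math
import random


def skewness(values):
    """Sample skewness: third central moment over variance to the power 3/2."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(1)  # fixed seed so the run is repeatable
n_factors = 100         # factors multiplied into each product
n_samples = 5000        # number of simulated products

products = []
for _ in range(n_samples):
    value = 1.0
    for _ in range(n_factors):
        value *= rng.uniform(0.9, 1.1)  # one random proportional factor
    products.append(value)

raw_skew = skewness(products)
log_skew = skewness([math.log(p) for p in products])

print("skewness of products:     ", raw_skew)
print("skewness of log-products: ", log_skew)
```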
Generation and interpretation
- Multiplicative processes: If a quantity grows by random proportional factors over time, the logarithm of the quantity tends to accumulate normal fluctuations, yielding a log-normal distribution for the quantity itself.
- Stochastic models: In finance, geometric Brownian motion is a standard model for stock prices; at a fixed time, prices are log-normally distributed, while the log returns are normally distributed. This connection to stochastic calculus underpins the use of log-normal models in risk management and option pricing.
- Gibrat's law and firm sizes: Gibrat's law posits that growth rates are independent of size, a hypothesis that can lead to log-normal-sized firms in certain conditions. In practice, empirical firm-size distributions often show a log-normal body with deviations in the tails, leading to ongoing debate about the precise tail behavior.
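The geometric Brownian motion case can be made concrete. Under the standard model dS = m S dt + s S dW, the log price at a fixed time T is normal with mean ln(S0) + (m - s^2 / 2) T and standard deviation s sqrt(T), so the price itself is log-normal. The sketch below computes those parameters and draws one terminal price; the function names and the numerical inputs are illustrative assumptions, not calibrated values.

```python
import math
import random


def gbm_log_params(s0: float, drift: float, vol: float, t: float):
    """Return (mu, sigma) of ln(S_t) under geometric Brownian motion."""
    mu = math.log(s0) + (drift - 0.5 * vol * vol) * t
    sigma = vol * math.sqrt(t)
    return mu, sigma


def sample_terminal_price(s0, drift, vol, t, rng):
    """Draw S_t by exponentiating a normal draw for ln(S_t)."""
    mu, sigma = gbm_log_params(s0, drift, vol, t)
    return math.exp(rng.gauss(mu, sigma))


# Illustrative inputs: start at 100, 5% drift, 20% volatility, one year.
mu, sigma = gbm_log_params(100.0, 0.05, 0.2, 1.0)
expected_price = math.exp(mu + sigma * sigma / 2.0)  # log-normal mean formula

print("mu, sigma of ln(S_T):", mu, sigma)
print("expected price:", expected_price, "vs S0*exp(m*T):", 100.0 * math.exp(0.05))
print("one draw:", sample_terminal_price(100.0, 0.05, 0.2, 1.0, random.Random(7)))
```

The log-normal mean formula recovers S0 exp(m T), which shows why the s^2 / 2 correction appears in the mean of the log price.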
Related concepts include the normal distribution, geometric Brownian motion, multiplicative processes, and income distribution.
Applications
- Economics and business: The log-normal distribution is used to model incomes and firm sizes in certain ranges, as well as the distribution of product and city sizes under multiplicative growth assumptions. It is a convenient baseline model when additive aggregation is not appropriate.
- Finance: Stock prices in the standard geometric Brownian motion framework are log-normally distributed at a fixed future time. The log-normal form provides tractable expressions for prices, derivatives, and risk measures when combined with normal-log-return assumptions.
- Natural and engineering sciences: Particle sizes in aerosols, reaction-rate constants in multiplicative environments, and various biological measures that result from multiplicative growth processes are modeled with log-normal distributions.
- Modeling considerations: In practice, practitioners assess whether a log-normal model adequately fits data, especially in the tails. When data exhibit heavier tails than the log-normal, alternatives such as mixtures or Pareto-like tail models may be invoked.
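Assessing fit usually starts with estimating the parameters. A minimal sketch follows, assuming strictly positive data: the maximum-likelihood estimates of mu and sigma are the mean and the (1/n) standard deviation of the log-data, and model quantiles can then be compared with empirical ones, including in the upper tail. The helper names are illustrative, and the data here are simulated from a known log-normal rather than taken from any real source.

```python
import math
import random
from statistics import NormalDist


def fit_lognormal(data):
    """Maximum-likelihood estimates (mu_hat, sigma_hat) from positive data."""
    logs = [math.log(x) for x in data]
    n = len(logs)
    mu_hat = sum(logs) / n
    sigma_hat = math.sqrt(sum((v - mu_hat) ** 2 for v in logs) / n)
    return mu_hat, sigma_hat


def model_quantile(p, mu, sigma):
    """Quantile of the log-normal: exp(mu + sigma * Phi^{-1}(p))."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))


def empirical_quantile(data, p):
    """Simple order-statistic quantile (no interpolation)."""
    ordered = sorted(data)
    index = min(len(ordered) - 1, int(p * len(ordered)))
    return ordered[index]


rng = random.Random(42)  # fixed seed so the run is repeatable
data = [rng.lognormvariate(1.0, 0.5) for _ in range(20_000)]  # simulated sample

mu_hat, sigma_hat = fit_lognormal(data)
print("estimates:", mu_hat, sigma_hat)

# Compare empirical and fitted quantiles, paying attention to the upper tail.
for p in (0.5, 0.9, 0.99):
    print(p, empirical_quantile(data, p), model_quantile(p, mu_hat, sigma_hat))
```

With real data, a systematic gap between empirical and fitted upper quantiles is one sign that the tail is heavier than the log-normal allows.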
From a policy and entrepreneurship viewpoint, the log-normal framework is often cited as a reflection of how many competitive processes unfold in free-market settings: many small, independent growth factors accumulate, producing a distribution with most of its mass at relatively small values and a long tail extending toward large outcomes. This interpretation emphasizes efficiency, competition, and the cumulative advantage that comes with multiplicative gains. Critics, however, point to empirical evidence that real-world distributions can exhibit tail behavior inconsistent with a pure log-normal picture, prompting the exploration of alternative models and mixed families.
Limitations and controversies
- Tail behavior and realism: In many empirical settings, especially for wealth or firm sizes, data show tails that are heavier than the log-normal, with Pareto-like behavior at the upper end. This has led researchers to consider models that blend log-normal cores with power-law tails or to use entirely different families to capture extremes. See discussions around the Pareto distribution and related heavy-tailed concepts.
- Model selection and data fitting: Choosing a log-normal model involves estimating mu and sigma from data, and mis-specification can misstate probabilities of extreme events. Model validation often requires careful attention to both the central region and the tails.
- Competing explanations: While a multiplicative-growth narrative accounts for many observations, others emphasize structural factors such as market concentration, risk preference, policy environments, or network effects. In the study of wealth and income distributions, these debates feature prominently, with some advocating for models that include regulatory influences or social mobility constraints.
- Controversies from different perspectives: Proponents of minimal intervention in markets often view log-normality as a natural outcome of competitive growth, whereas critics argue that policy choices and redistribution mechanisms significantly shape outcomes away from pure multiplicative dynamics. The debate centers on how much of the observed distribution is endogenous to market processes versus exogenous to policy and institutional design.
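The tail-behavior point in the list above can be illustrated by comparing survival probabilities P(X > x). The sketch below sets a log-normal with mu = 0 and sigma = 1 against a Pareto distribution with minimum 1 and exponent 2; these parameters are arbitrary illustrative choices, not fitted to any data. At moderate x the two can be close, but the log-normal survival function eventually falls off faster than any power law, so the ratio grows without bound.

```python
import math


def lognormal_survival(x: float, mu: float, sigma: float) -> float:
    """P(X > x) for a log-normal, via erfc for accuracy far in the tail."""
    z = (math.log(x) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def pareto_survival(x: float, x_min: float, alpha: float) -> float:
    """P(X > x) for a Pareto distribution with scale x_min and exponent alpha."""
    if x < x_min:
        return 1.0
    return (x_min / x) ** alpha


for x in (10.0, 100.0, 1000.0, 10000.0):
    ln_tail = lognormal_survival(x, 0.0, 1.0)
    pa_tail = pareto_survival(x, 1.0, 2.0)
    print(f"x={x:>8.0f}  log-normal={ln_tail:.3e}  Pareto={pa_tail:.3e}  "
          f"ratio={pa_tail / ln_tail:.3e}")
```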
Despite these debates, the log-normal model remains a staple in probability and statistics for its mathematical tractability and its intuitive foundation in multiplicative growth. It provides a useful benchmark against which more complicated models can be compared, especially over the moderate range of values where empirical data often fit it reasonably well.
See also
- normal distribution
- geometric Brownian motion
- multiplicative process
- Gibrat's law
- Pareto distribution
- income distribution
- wealth distribution
- financial economics
- risk management