Pmf
Pmf, short for probability mass function, is the function that gives the probability that a discrete random variable X takes a particular value x. It is the complete statistical description of a variable that can assume only a countable set of outcomes. As the discrete counterpart of the probability density function used for continuous variables, the pmf p(x) carries all the probabilistic information needed for prediction, decision-making, and risk assessment in situations where outcomes come in whole units. See Discrete random variable and Probability distribution for broader context on how discrete variables fit within probability theory.
In practice, the pmf underpins analyses across a wide range of fields—economics, finance, manufacturing, data science, and public policy—whenever one must reason about counts, categories, or other discrete outcomes. It enables direct calculation of the probability of a given outcome, as well as derived quantities such as the expected value and the spread of outcomes. See Expected value and Variance for the core moments that summarize a pmf, and Cumulative distribution function for a related way to summarize probabilities.
Definition and basic properties
- Definition: For a discrete random variable X with a countable support S, the pmf is p_X(x) = P(X = x) for x in S. The probabilities outside S are zero, and the sum over all x in S satisfies Σ_{x∈S} p_X(x) = 1.
- Notation: A pmf is commonly written p_X(x); when the variable X is clear from context, the subscript is dropped and one simply writes p(x).
- Values and support: The support S is the set of values that X can actually take. The pmf assigns nonnegative probabilities to each x in S, and zero to all other values.
- Relation to the cumulative distribution function: The cumulative distribution function F(x) = P(X ≤ x) is obtained by summing the pmf over all values t in S with t ≤ x: F(x) = Σ_{t≤x} p(t).
- Moments: If X takes numeric values, its mean (expected value) is E[X] = Σ_x x p(x), and its variance is Var(X) = E[X^2] − (E[X])^2, where the sums are taken over the support S. A numerical check of these properties appears in the sketch after this list.
- Multivariate extension: For a vector of discrete variables (X1, X2, …), the joint pmf p(x1, x2, …) describes the probability of each combination of values; independence between components means the joint pmf factors into the product of the marginals.
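The following sketch, in plain Python, checks the normalization, CDF, and moment formulas above. The four-point support and its probabilities are hypothetical values chosen for illustration, not taken from the text.

```python
# pmf of a hypothetical discrete variable X with support {1, 2, 3, 4}
pmf = {1: 0.1, 2: 0.4, 3: 0.3, 4: 0.2}

# Normalization: probabilities are nonnegative and sum to 1.
assert all(p >= 0 for p in pmf.values())
assert abs(sum(pmf.values()) - 1.0) < 1e-12

def cdf(x, pmf):
    """F(x) = P(X <= x): sum p(t) over all t in the support with t <= x."""
    return sum(p for t, p in pmf.items() if t <= x)

# Moments over the support: E[X] = sum of x p(x), Var(X) = E[X^2] - (E[X])^2.
mean = sum(x * p for x, p in pmf.items())
second_moment = sum(x**2 * p for x, p in pmf.items())
variance = second_moment - mean**2

print(f"F(2)   = {cdf(2, pmf):.2f}")    # 0.50
print(f"E[X]   = {mean:.2f}")           # 2.60
print(f"Var(X) = {variance:.2f}")       # 0.84
```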
Examples
- Bernoulli distribution: X ∈ {0, 1} with p(1) = p and p(0) = 1 − p. This pmf models a single yes/no trial, such as a binary decision or a success/failure event. See Bernoulli distribution.
- Binomial distribution: The number of successes in n independent Bernoulli trials, each with success probability p, has pmf p(k) = C(n, k) p^k (1 − p)^{n−k} for k = 0, 1, …, n. See Binomial distribution.
- Poisson distribution: The count of events in a fixed interval, when events occur at a known average rate λ and independently of the time since the last event, has pmf p(k) = e^{−λ} λ^k / k! for k = 0, 1, 2, … . See Poisson distribution.
- Uniform discrete distribution: If X is equally likely to take any value in a finite set S with |S| = m, then p(x) = 1/m for x ∈ S and p(x) = 0 otherwise. See Discrete uniform distribution. These four pmfs are evaluated numerically in the sketch below.
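A short numerical sketch of the example pmfs, using only the Python standard library. The parameter values (n = 5, p = 0.3, λ = 2.0, a six-point support) are arbitrary illustrative choices.

```python
from math import comb, exp, factorial

def bernoulli_pmf(k, p):
    """p(1) = p, p(0) = 1 - p."""
    return p if k == 1 else 1 - p

def binomial_pmf(k, n, p):
    """p(k) = C(n, k) p^k (1 - p)^(n - k) for k = 0, 1, ..., n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """p(k) = e^(-lam) lam^k / k! for k = 0, 1, 2, ..."""
    return exp(-lam) * lam**k / factorial(k)

def uniform_pmf(x, support):
    """p(x) = 1/|S| for x in S, and 0 otherwise."""
    return 1 / len(support) if x in support else 0.0

print(bernoulli_pmf(1, p=0.3))              # 0.3
print(binomial_pmf(2, n=5, p=0.3))          # 0.3087
print(poisson_pmf(3, lam=2.0))              # ~0.1804
print(uniform_pmf(4, support=range(1, 7)))  # 1/6, e.g. one face of a fair die
```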
Multivariate PMFs and independence
When dealing with several discrete variables, the joint pmf p(x1, x2, …) assigns probabilities to each combination of outcomes. If the variables are independent, the joint pmf factors into the product of the marginals: p(x1, x2, …) = p1(x1) p2(x2) …. This property is central to both theory and computation, and it supports modular modeling where complex systems are built from simpler components. See Joint probability distribution and Independence (probability theory).
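A minimal sketch of this factorization, assuming two hypothetical marginal pmfs: under independence the joint pmf is built as their product, and summing the joint over one variable recovers the other marginal.

```python
# Hypothetical marginal pmfs for two discrete variables X1 and X2.
p1 = {0: 0.7, 1: 0.3}
p2 = {0: 0.5, 1: 0.25, 2: 0.25}

# Under independence the joint pmf is the product of the marginals:
# p(x1, x2) = p1(x1) * p2(x2).
joint = {(x, y): p1[x] * p2[y] for x in p1 for y in p2}

# The joint pmf still sums to 1 over all combinations of outcomes ...
assert abs(sum(joint.values()) - 1.0) < 1e-12

# ... and summing out one variable recovers the other's marginal.
marginal_x1 = {x: sum(joint[(x, y)] for y in p2) for x in p1}
assert all(abs(marginal_x1[x] - p1[x]) < 1e-12 for x in p1)
```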
Estimation, inference, and computation
- Empirical pmf: Given a sample, the empirical or plug-in pmf assigns p_hat(x) = (number of observations equal to x) / n. This nonparametric estimate converges to the true pmf as the sample size grows, under standard sampling assumptions.
- Maximum likelihood estimation: If a parametric family is assumed (for example, a binomial family with unknown p), parameters are chosen to maximize the likelihood of observed data. See Maximum likelihood estimation.
- From data to model: The pmf provides a bridge from observed frequencies to a probabilistic model. Analysts select a discrete model that fits the data well and aligns with domain knowledge about the process generating the counts. See Model selection.
- Simulation and sampling: To generate random samples from a specified pmf, methods such as inverse transform sampling or the alias method can be used. See Inverse transform sampling and Alias method. A combined sketch of sampling and empirical estimation follows this list.
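The sketch below combines two points from this list: inverse transform sampling from a specified pmf, and the empirical (plug-in) pmf computed from the resulting sample. The three-category target pmf is a hypothetical example.

```python
import random
from collections import Counter
from itertools import accumulate

# Hypothetical target pmf over three categories.
pmf = {"a": 0.2, "b": 0.5, "c": 0.3}
values = list(pmf)
cum = list(accumulate(pmf[v] for v in values))  # cumulative sums: 0.2, 0.7, 1.0

def sample(rng):
    """Inverse transform sampling: draw U ~ Uniform(0, 1) and return the
    first value whose cumulative probability reaches U."""
    u = rng.random()
    for v, c in zip(values, cum):
        if u <= c:
            return v
    return values[-1]  # guard against floating-point rounding at the top end

rng = random.Random(0)  # fixed seed for reproducibility
n = 10_000
draws = [sample(rng) for _ in range(n)]

# Empirical (plug-in) pmf: p_hat(x) = (count of observations equal to x) / n.
counts = Counter(draws)
p_hat = {v: counts[v] / n for v in values}
print(p_hat)  # close to {'a': 0.2, 'b': 0.5, 'c': 0.3}
```

For a categorical family with unconstrained probabilities, the maximum likelihood estimate of the pmf coincides with these empirical frequencies, which is why the plug-in estimate is a natural default.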
Applications and uses
The pmf is foundational wherever counts, categories, or discrete outcomes matter. In economics and policy analysis, it supports modeling of discrete choices and outcomes of interest to stakeholders. In actuarial science, it underpins the assessment of claim counts and premium calculations. In quality control and reliability engineering, pmfs describe defect counts and failure events. In data science and machine learning, pmfs appear in language modeling (word frequencies), recommender systems, and discrete latent-variable models. See Actuarial science, Quality control, Data science, and Machine learning for broader connections.
Limitations and cautions
- Discreteness matters: The pmf fully describes discrete variables, but when the data are naturally continuous, a probability density function is the more appropriate description. See Probability distribution.
- Sparse support: If the set of possible outcomes is large and data are sparse, estimates of p(x) may be noisy; regularization or pooling categories can help. See Empirical distribution.
- Model misspecification: Relying on an inappropriate discrete model can lead to biased inferences; model checking and validation against held-out data are important. See Model validation.
See also
- Discrete random variable
- Probability distribution
- Cumulative distribution function
- Joint probability distribution
- Independence (probability theory)
- Bernoulli distribution
- Binomial distribution
- Poisson distribution
- Multinomial distribution
- Maximum likelihood estimation
- Empirical distribution
- Inverse transform sampling
- Alias method