Rejection Sampling
Rejection sampling is a Monte Carlo technique for drawing samples from a target distribution when direct sampling is difficult but sampling from a related, easier distribution is feasible. You sample from a proposal distribution g(x) and then keep or discard each draw according to a probabilistic rule that links the target density f(x), which may be unnormalized, to the proposal. The method requires a constant M such that f(x) ≤ M g(x) for all x, so g must be positive wherever f is. Given such a bound, repeat the following steps: draw x from g, draw u from Uniform(0,1), and accept x if u ≤ f(x)/(M g(x)); otherwise reject x and try again. The accepted x’s are independent draws from the target, that is, from f normalized to integrate to 1.
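The accept/reject loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `rejection_sample`, the unnormalized Beta(2, 5) target, the Uniform(0,1) proposal, and the bound M = 0.082 are all assumptions chosen for the example.

```python
import random


def rejection_sample(f, g_sample, g_pdf, M, n, rng):
    """Return (samples, proposals): n accepted draws and the number of proposals used.

    Assumes f(x) <= M * g_pdf(x) wherever f(x) > 0; this is not checked.
    """
    samples, proposals = [], 0
    while len(samples) < n:
        x = g_sample(rng)            # draw x from the proposal g
        u = rng.random()             # draw u from Uniform(0, 1)
        proposals += 1
        # Same test as u <= f(x) / (M g(x)), written without a division.
        if u * M * g_pdf(x) <= f(x):
            samples.append(x)
    return samples, proposals


# Illustrative target (an assumption): unnormalized Beta(2, 5) density on [0, 1].
def f(x):
    return x * (1.0 - x) ** 4


def g_pdf(x):
    return 1.0                       # Uniform(0, 1) proposal density


def g_sample(rng):
    return rng.random()


# f peaks at x = 0.2 with value 0.08192, so M = 0.082 is a valid bound.
M = 0.082

samples, proposals = rejection_sample(f, g_sample, g_pdf, M, 10_000, random.Random(0))
print("sample mean:", sum(samples) / len(samples))     # Beta(2, 5) has mean 2/7
print("acceptance rate:", len(samples) / proposals)    # theory: (1/30) / 0.082
```

Passing the random generator explicitly keeps runs reproducible. The function trusts the caller to supply a valid M; if the bound is violated anywhere, the output no longer follows the target.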
Rejection sampling is valued for its mathematical clarity and its exactness: with the bound and the acceptance test in place, the accepted samples are independent, exact draws from the intended distribution rather than approximations. Its simplicity makes it a common teaching tool and a reliable workhorse in situations where the target density is known up to a normalization constant and a suitable envelope M g can be found. The trade-off is efficiency. The acceptance probability is exactly 1/M when f is a normalized density, and Z/M when f is unnormalized with Z = ∫ f(x) dx, so on average M/Z proposals are needed per accepted sample. M is small only when g closely follows the shape of f, and in many practical problems, especially as dimensionality grows, the gap between f and M g widens. The result can be a large number of rejected samples, which makes the method impractical for high-dimensional targets or for sharply peaked distributions that the proposal does not track.
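The exactness and the acceptance rate both follow from a short calculation. The derivation below is a sketch that assumes g is a normalized density and writes Z for the integral of f, a symbol introduced here for convenience (Z = 1 when f is already normalized).

```latex
\begin{align*}
P(\text{accept})
  &= \int g(x)\,\frac{f(x)}{M\,g(x)}\,dx
   = \frac{1}{M}\int f(x)\,dx
   = \frac{Z}{M},\\[4pt]
p(x \mid \text{accept})
  &= \frac{g(x)\,\dfrac{f(x)}{M\,g(x)}}{Z/M}
   = \frac{f(x)}{Z},\\[4pt]
\mathbb{E}[\text{proposals per accepted sample}]
  &= \frac{M}{Z}.
\end{align*}
```

The second line shows that accepted draws follow the normalized target without Z ever being computed. The third follows because the number of proposals until the first acceptance is geometric with success probability Z/M.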
Conceptual foundations and terminology
- Target distribution: the distribution from which you ultimately want samples; often specified by f(x), an unnormalized density whose normalizing constant (the integral of f) need not be known. See Probability distribution.
- Proposal distribution: the distribution from which you actually generate samples, denoted g(x). The choice of g is crucial for efficiency; a good match to f minimizes wasted samples. See proposal distribution.
- Majorizing constant: the constant M that provides the envelope f(x) ≤ M g(x) for all x. A smaller M yields a higher acceptance rate. See Acceptance–rejection method.
- Density and envelope: rejection sampling operates with a density (or unnormalized density) f and an envelope M g that bounds it from above. Because the acceptance test depends only on the ratio f(x)/(M g(x)), the accepted samples follow the normalized target even when the normalizing constant is never computed. See Probability density function.
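The smallest valid majorizing constant is the supremum of f(x)/g(x). The sketch below compares a grid estimate of that supremum with the closed-form value for one illustrative pair, an unnormalized standard normal target with a standard Cauchy proposal. The pair, the helper name `estimate_M`, the search interval, and the 1% safety margin are assumptions made for the example.

```python
import math


def f(x):
    """Unnormalized standard normal density (illustrative target)."""
    return math.exp(-0.5 * x * x)


def g(x):
    """Standard Cauchy density (illustrative proposal)."""
    return 1.0 / (math.pi * (1.0 + x * x))


def estimate_M(f, g, lo, hi, steps=100_001):
    """Largest value of f(x) / g(x) on an evenly spaced grid over [lo, hi].

    A grid maximum can only underestimate the true supremum, so the result
    should be padded before it is used as a bound.
    """
    best = 0.0
    for i in range(steps):
        x = lo + (hi - lo) * i / (steps - 1)
        best = max(best, f(x) / g(x))
    return best


M_grid = estimate_M(f, g, -10.0, 10.0)
# Closed form for this pair: the ratio pi * (1 + x^2) * exp(-x^2 / 2)
# is maximized at x = +1 and x = -1.
M_exact = 2.0 * math.pi / math.sqrt(math.e)
M_safe = 1.01 * M_grid               # small safety margin (an arbitrary choice)

print("grid estimate:", M_grid)
print("closed form:  ", M_exact)
print("acceptance probability Z / M:", math.sqrt(2.0 * math.pi) / M_exact)
```

A numerical estimate of this kind is only a sanity check. When an analytic bound is available it is preferable, because an M that is too small silently biases the output.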
Variants and practical extensions
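- Adaptive rejection sampling (ARS): for log-concave targets, ARS builds a piecewise-linear upper bound on the log-density (equivalently, a piecewise-exponential envelope on the density) and tightens it using the points where the density has already been evaluated, improving efficiency while maintaining exactness. See Adaptive rejection sampling.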
- Piecewise envelopes and more sophisticated envelopes: in practice, one often uses envelopes that adapt to the shape of f to keep M moderate, sometimes by partitioning the domain and using local envelopes. See Acceptance–rejection method.
- Slice sampling and related methods: when a good envelope is hard to obtain, Markov chain methods such as Slice sampling may be preferable. Unlike rejection sampling, these produce correlated samples that follow the target only asymptotically. See Markov chain Monte Carlo.
- Hybrid use with other methods: rejection sampling can complement other techniques, for example as an outer loop to generate samples from a simple posterior, with inner steps using MCMC or importance sampling to explore subspaces more efficiently. See Importance sampling and Metropolis-Hastings algorithm.
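As a rough illustration of the piecewise-envelope idea, the sketch below partitions an interval into equal bins and uses a constant envelope on each bin. It is not adaptive rejection sampling. It assumes the target is non-increasing on the interval, so the value at each bin's left edge bounds f on that bin. The target exp(-5x) on [0, 1] and the name `make_piecewise_sampler` are hypothetical choices for the example.

```python
import math
import random


def make_piecewise_sampler(f, lo, hi, k):
    """Build a sampler that uses a piecewise-constant envelope with k equal bins.

    Assumes f is non-increasing on [lo, hi], so f(left edge) bounds f on each bin.
    Returns (sample, area), where area is the total area under the envelope.
    """
    width = (hi - lo) / k
    lefts = [lo + i * width for i in range(k)]
    heights = [f(x) for x in lefts]
    area = sum(h * width for h in heights)
    bins = range(k)

    def sample(rng):
        """Return (x, proposals) for one accepted draw."""
        proposals = 0
        while True:
            proposals += 1
            # Bins have equal width, so bin probability is proportional to height.
            i = rng.choices(bins, weights=heights)[0]
            x = lefts[i] + width * rng.random()      # uniform within the bin
            if rng.random() * heights[i] <= f(x):    # local acceptance test
                return x, proposals

    return sample, area


# Illustrative target (an assumption): unnormalized exponential truncated to [0, 1].
def f(x):
    return math.exp(-5.0 * x)


Z = (1.0 - math.exp(-5.0)) / 5.0                     # exact integral of f on [0, 1]
for k in (1, 10, 100):
    _, area = make_piecewise_sampler(f, 0.0, 1.0, k)
    print(k, "bins -> acceptance probability", Z / area)

sample, _ = make_piecewise_sampler(f, 0.0, 1.0, 10)
print(sample(random.Random(0)))
```

With a single bin this reduces to plain rejection sampling under a flat envelope. Adding bins shrinks the area between the envelope and f, which is what raises the acceptance probability.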
Efficiency, limitations, and when it makes sense
- Dimensionality and efficiency: the method scales poorly with dimensionality because constructing a tight envelope becomes harder as the target becomes more complex or concentrated. Even a small per-coordinate mismatch between proposal and target compounds multiplicatively, so M often grows exponentially with the dimension. In many real-world, high-dimensional problems, the acceptance rate drops to impractical levels, making alternatives like Markov chain Monte Carlo methods preferable.
- Simplicity and transparency: when a good envelope is available, rejection sampling is extremely transparent and easy to audit, which can be appealing in applications where reproducibility and code simplicity matter. See Probability distribution and Random number generator.
- Unnormalized targets and Bayesian contexts: rejection sampling is particularly natural when working with unnormalized densities, such as certain posterior distributions in Bayesian statistics where the normalizing constant is difficult to compute. See Bayesian statistics.
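The effect of dimensionality can be made concrete with a simple case. The sketch below assumes a standard normal target in d dimensions and a slightly wider normal proposal with standard deviation sigma > 1 in every coordinate. For that pair the tightest bound is M = sigma^d, so the acceptance probability is sigma^(-d). The setup and the function names are illustrative assumptions.

```python
import math
import random


def acceptance_probability(sigma, d):
    """Exact acceptance probability 1 / M = sigma ** (-d) for this Gaussian pair."""
    return sigma ** (-d)


def empirical_acceptance(sigma, d, trials, rng):
    """Monte Carlo estimate of the acceptance rate for the same pair."""
    # Here f(x) / (M g(x)) = exp(-c * |x|^2) with c = (1 - 1 / sigma^2) / 2.
    c = 0.5 * (1.0 - 1.0 / sigma ** 2)
    accepted = 0
    for _ in range(trials):
        # Squared norm of a draw from the proposal N(0, sigma^2 I_d).
        sq_norm = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(d))
        if rng.random() <= math.exp(-c * sq_norm):
            accepted += 1
    return accepted / trials


sigma = 1.2
for d in (1, 2, 5, 10, 20, 50):
    print(d, acceptance_probability(sigma, d))

print("empirical, d = 5:", empirical_acceptance(sigma, 5, 20_000, random.Random(0)))
```

Even though the proposal is only 20% wider than the target in each coordinate, the acceptance probability shrinks geometrically with d. That is the sense in which the method scales poorly.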
Controversies and debates within the approach
- Efficiency versus exactness: a central debate centers on whether the efficiency losses of rejection sampling are acceptable given its exactness. Proponents argue that, with a good envelope and careful implementation, rejection sampling remains a robust baseline, especially for simple or low-dimensional problems. Critics point out that for complex or high-dimensional problems, more flexible families of methods (notably MCMC approaches or importance sampling with adaptive techniques) typically yield better resource use.
- When to deploy versus when to switch: in practice, practitioners weigh the cost of crafting a suitable envelope against the benefits of exact sampling. If a fast, approximate solution suffices, or if an envelope is hard to obtain, alternatives may dominate. The broader point is that method choice should align with problem structure, computational budgets, and the required fidelity of the samples.
- Interpretability of results: supporters of transparent sampling routines highlight that rejection sampling makes the sampling mechanism explicit and easy to reason about, which can help in audits and validation. Critics sometimes treat any single-method reliance as a potential bottleneck; the pragmatic stance is to use the simplest tool that reliably delivers correct results for the task at hand.
Controversies from a pragmatic perspective and a note on critique
- Some critics external to the field have argued that certain algorithmic approaches, including sampling methods, reflect broader debates about data interpretation and fairness. From a practical standpoint, the math of rejection sampling does not embed social or moral values; the method either correctly samples from the target or it does not, regardless of how results are interpreted in policy or social contexts. The responsible takeaway is to separate model design and data choices from the intrinsic properties of the sampling algorithm, and to apply transparent, well-validated methods in analysis. This clarity helps prevent misattributing methodological shortcomings to broader ideological critiques.
See also