Composite Likelihood

Composite likelihood is a practical approach to statistical inference designed for problems where the full joint likelihood is too complex to handle. By stitching together a collection of lower-dimensional likelihoods—such as pairwise, marginal, or conditional components—the method yields estimators that are typically consistent and asymptotically normal under mild regularity conditions. It is especially appealing when the model involves high-dimensional dependencies or intricate spatial or network structure, making a full likelihood calculation prohibitively expensive or even intractable. In that spirit, composite likelihood emphasizes transparent assumptions, tractable computation, and results that practitioners can rely on without requiring enormous computational budgets.

From a pragmatic, results-driven perspective, composite likelihood offers a disciplined compromise between theoretical optimality and practical feasibility. It tends to scale well with data size and model complexity, enabling analysts to produce estimates and uncertainty quantification in contexts where traditional likelihood-based methods would stall. This has made it popular in fields such as spatial statistics, population genetics, epidemiology, and environmental modeling. The approach also aligns with a broader preference in applied statistics for methods that remain robust when the full joint distribution is misspecified or unknown, provided the component models capture key dependencies.

Nevertheless, the method is not without controversy. Critics point out that, by construction, composite likelihood sacrifices some statistical efficiency relative to the full likelihood. In practice, that means standard errors can be larger and confidence intervals may be less tight than those derived from the full likelihood, especially when the joint dependence structure is strong or when components overlap heavily. Proponents respond that the trade-off is often worthwhile, because the gain in computational tractability and model robustness can lead to more reliable inference in real-world settings. They emphasize that proper variance estimation—typically via a robust, sandwich-type form based on Godambe information—and careful selection of components help maintain valid inference even when the joint model is difficult to specify precisely.

What is composite likelihood?

Composite likelihood is formed by combining the likelihoods of simpler, lower-dimensional component models. In many applications, this takes the form of multiplying component likelihoods, possibly with weights to reflect their relative importance or independence. The estimator that maximizes the composite log-likelihood is known as the maximum composite likelihood estimator. Important variants include:

  • marginal composite likelihood, which uses marginal distributions of small subsets of variables;
  • conditional composite likelihood, which uses conditional distributions given other components;
  • pairwise likelihood, which multiplies likelihoods of bivariate (two-variable) marginal densities; this is a particularly common and tractable specialization.

Related ideas appear under the umbrella of pseudo-likelihood methods, especially in settings such as Markov random fields, where exact likelihoods are costly but local conditional distributions can be efficient to work with.

The core insight is that, even when the full joint distribution is messy, many useful inferences can be obtained by focusing on carefully chosen parts of the distribution. Estimation proceeds by maximizing the composite likelihood with respect to the parameter vector θ, often under a weighting scheme that tempers redundant information among overlapping components.
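As a concrete illustration, a pairwise likelihood can be sketched for an equicorrelated multivariate normal model with standard margins and a single unknown correlation ρ. The model choice, function names, and crude grid search below are illustrative assumptions, not part of any standard library:

```python
import math
from itertools import combinations

def bivariate_normal_logpdf(x, y, rho):
    """Log-density of a standard bivariate normal with correlation rho."""
    det = 1.0 - rho * rho
    quad = (x * x - 2.0 * rho * x * y + y * y) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad

def pairwise_loglik(rho, data):
    """Sum the bivariate log-density over all variable pairs and all rows."""
    total = 0.0
    for row in data:
        for i, j in combinations(range(len(row)), 2):
            total += bivariate_normal_logpdf(row[i], row[j], rho)
    return total

def maximize_pairwise(data, grid=None):
    """Crude grid search for the pairwise-likelihood estimate of rho."""
    if grid is None:
        grid = [k / 100.0 for k in range(-95, 96)]
    return max(grid, key=lambda r: pairwise_loglik(r, data))
```

Each term touches only two coordinates at a time, so the objective never requires inverting the full covariance matrix — the source of the scalability discussed below.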

Theory and properties

  • Consistency and asymptotic normality: Under standard regularity conditions, the composite likelihood estimator is consistent for the true parameter values and converges in distribution to a normal as the sample size grows. The asymptotic covariance is not the inverse of the usual Fisher information, but a robust form built from the variability of the score contributions and their sensitivity to θ. This robust covariance is commonly referred to via Godambe information and is estimated with a sandwich-type estimator.

  • Efficiency and comparison to full likelihood: The composite likelihood estimator generally loses some efficiency relative to the full likelihood, particularly when the components do not capture all dependence in the data or when they overlap substantially. However, in many practical problems the computational gains and robustness to misspecification justify the efficiency trade-off, especially when the full likelihood is unavailable or too costly to compute.

  • Variance estimation and hypothesis testing: Because standard errors come from a robust, composite-information-based formula, standard likelihood-based tests (e.g., likelihood ratio tests) do not apply in the same way. Instead, tests and confidence intervals rely on the Godambe-based asymptotics and appropriate resampling or sandwich-variance procedures.

  • Model misspecification and robustness: Composite likelihood can be more robust to misspecification of the full joint model if the chosen components still capture the essential dependence structure. Nevertheless, misspecification of the component models can bias inference, so careful model checking and sensitivity analyses are important.
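The sandwich calculation behind the Godambe-based asymptotics can be sketched numerically for a scalar parameter: estimate the sensitivity H (minus the mean second derivative of the per-observation composite log-likelihood) and the variability J (variance of the per-observation scores), then combine them as H⁻¹JH⁻¹/n. The function names and finite-difference scheme here are illustrative:

```python
def sandwich_variance(loglik_per_obs, theta_hat, data, eps=1e-5):
    """Godambe-style variance estimate for a scalar parameter theta_hat.

    loglik_per_obs(theta, obs) -> composite log-likelihood of one observation.
    """
    n = len(data)
    # Per-observation scores via central differences.
    scores = [
        (loglik_per_obs(theta_hat + eps, obs)
         - loglik_per_obs(theta_hat - eps, obs)) / (2.0 * eps)
        for obs in data
    ]
    # Variability J: variance of the score contributions.
    mean_s = sum(scores) / n
    J = sum((s - mean_s) ** 2 for s in scores) / n
    # Sensitivity H: minus the average second derivative.
    H = -sum(
        (loglik_per_obs(theta_hat + eps, obs)
         - 2.0 * loglik_per_obs(theta_hat, obs)
         + loglik_per_obs(theta_hat - eps, obs)) / (eps * eps)
        for obs in data
    ) / n
    # Godambe information G = H J^{-1} H; estimator variance is G^{-1} / n.
    return J / (H * H * n)
```

When the composite likelihood coincides with the full likelihood (H = J), this reduces to the usual inverse-Fisher-information variance; otherwise the two factors differ and the sandwich form is required.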

Methods and variants

  • Pairwise likelihood: Uses the joint likelihoods of pairs of variables. This is the most common and computationally tractable variant in high-dimensional problems.

  • Marginal and conditional composite likelihoods: Use lower-dimensional marginal or conditional distributions to assemble the objective function.

  • Block and hierarchical composite likelihoods: Partition the data into blocks (possibly with overlapping components) and combine likelihoods within and across blocks. This approach lends itself to parallel computation and scalable analysis.

  • Weighted composite likelihoods: Apply weights to components to reflect their information content or to address overlap, helping to balance efficiency and robustness.

Computational considerations

  • Scalability: Because each component involves only a subset of variables, composite likelihood methods scale more favorably to large datasets and complex dependence structures than full likelihood methods.

  • Parallelization: Blockwise or pairwise components can often be evaluated independently, enabling substantial speedups on multi-core processors or distributed computing environments.

  • Software and implementation: Implementations typically provide routines for constructing the composite likelihood, maximizing it, and estimating the robust covariance. Users should pay attention to the choice of components and weights, as these have a direct impact on inference.
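The parallelization point above follows directly from the structure of the objective: block contributions are independent sums, so they can be dispatched to workers and added up. A minimal sketch using Python's standard thread pool, with a toy per-block likelihood standing in for a real component model:

```python
from concurrent.futures import ThreadPoolExecutor

def block_loglik(block):
    """Toy per-block contribution: a Gaussian log-likelihood up to a constant."""
    return -0.5 * sum(x * x for x in block)

def composite_loglik_parallel(blocks, max_workers=4):
    """Sum independent block contributions, evaluated concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(block_loglik, blocks))
```

For numerically heavy components, a process pool or a distributed map would follow the same pattern, since no state is shared between blocks during evaluation.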

Applications

  • Spatial statistics: Modeling spatial fields and areal data where exact joint distributions are unwieldy; pairwise and conditional components are common in geostatistics and environmental modeling.

  • Population genetics and genomics: Analyzing sequence data or linkage patterns where full likelihoods are computationally prohibitive; composite likelihood offers a tractable route to parameter estimation and hypothesis testing.

  • Epidemiology and biostatistics: Studying correlated outcomes and hierarchical data where simplifying assumptions about the joint distribution aid inference without sacrificing practical accuracy.

  • Time-series and longitudinal data: Component-based likelihoods can handle dependencies across time while avoiding intractable joint models.

See also