McDonald–Kreitman test
The McDonald–Kreitman test is a foundational method in population genetics used to infer whether natural selection has acted on protein-coding genes by comparing genetic variation within a species to fixed differences between species. The core insight is simple: if nonsynonymous changes (which alter the amino acid sequence of a protein) are neutral, the ratio of nonsynonymous to synonymous changes (the latter do not alter the protein) should be the same among polymorphisms within a species as among fixed differences between species. When the observed pattern deviates from this expectation, it is evidence that selection has shaped the gene in question. The test was introduced in 1991 by McDonald and Kreitman, and it has since become a standard tool in comparative genomics, applied across taxa from Drosophila to humans.
At its heart, the MK test uses a 2x2 contingency table that catalogs four quantities for a given gene:
- Pn and Ps: the numbers of nonsynonymous and synonymous polymorphisms within a species
- Dn and Ds: the numbers of fixed nonsynonymous and synonymous differences between species
Under neutrality, the ratio Pn/Ps should equal Dn/Ds. A significant excess of fixed nonsynonymous differences (Dn/Ds greater than Pn/Ps) suggests adaptive evolution, whereas an excess of nonsynonymous polymorphisms (Pn/Ps greater than Dn/Ds) points toward purifying selection acting on variants that segregate within the species but rarely reach fixation. Statistical significance is typically assessed with Fisher’s exact test or a related method, and researchers often report a neutrality index, NI = (Pn/Ps)/(Dn/Ds), that summarizes the direction and size of the deviation from neutrality.
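As a minimal sketch of the significance test described above, the following Python computes a two-sided Fisher's exact p-value for the 2x2 table using only the standard library. The function name `fisher_exact_two_sided` and the counts are assumptions made for illustration; the counts are hypothetical and do not come from any study.

```python
from math import comb

def fisher_exact_two_sided(pn, dn, ps, ds):
    """Two-sided Fisher's exact test on the MK table.

                      polymorphic   fixed
    nonsynonymous         pn          dn
    synonymous            ps          ds
    """
    row_n, row_s = pn + dn, ps + ds      # row totals
    col_p = pn + ps                      # total number of polymorphisms
    total = comb(row_n + row_s, col_p)

    def prob(x):
        # Hypergeometric probability of x nonsynonymous polymorphisms,
        # holding all row and column totals fixed.
        return comb(row_n, x) * comb(row_s, col_p - x) / total

    p_obs = prob(pn)
    lo, hi = max(0, col_p - row_s), min(row_n, col_p)
    # Sum over every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))

# Hypothetical counts for a single gene (illustration only).
p_value = fisher_exact_two_sided(pn=5, dn=12, ps=40, ds=18)
print(f"Fisher's exact p = {p_value:.4f}")
```

A small p-value here indicates only that Pn/Ps and Dn/Ds differ; the direction of the difference still has to be read from the counts themselves.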
History
The method traces back to a 1991 publication by McDonald and Kreitman that framed a simple, testable prediction about protein-coding evolution. Early applications focused on model organisms such as Drosophila species, where high-quality polymorphism data and clear outgroup comparisons made the approach especially informative. Over time, the MK test was adopted across diverse lineages, including Homo sapiens and other vertebrates, and it spurred a wave of methodological refinements aimed at improving robustness to realistic population histories and genomic architecture.
Methodology
- Data requirements: coding sequence data from a focal species, an outgroup or closely related species to polarize substitution direction, and reasonably large samples to count polymorphisms and fixed differences with confidence. Researchers typically separate sites into synonymous and nonsynonymous categories and tally Pn, Ps, Dn, and Ds.
- Analysis steps: construct the 2x2 table, compute the neutrality index (NI) or related statistics, and test for deviation from neutrality (often via Fisher’s exact test). A value of NI less than 1 signals more fixed nonsynonymous differences than expected under neutrality (positive selection), while NI greater than 1 signals an excess of nonsynonymous polymorphisms (purifying selection within the lineage).
- Extensions and refinements: several variants address common concerns such as the impact of demography, slightly deleterious mutations, and outgroup choice. These include approaches that evaluate the site frequency spectrum, incorporate polarized site data across multiple populations, or adjust for weakly deleterious variants that inflate the polymorphism counts at nonsynonymous sites. Researchers also combine MK-type analyses with other methods of detecting selection to triangulate evidence of adaptation.
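The tallying and neutrality-index steps above can be sketched in a few lines of Python. The input format (one `(effect, status)` pair per variable site), the helper names `tally`, `neutrality_index`, and `direction`, and the counts are all hypothetical choices for this illustration. NI is taken as (Pn/Ps)/(Dn/Ds), which matches the interpretation given above, where values below 1 indicate an excess of fixed nonsynonymous differences.

```python
from collections import Counter

# Hypothetical input format: one (effect, status) pair per variable coding site.
#   effect: "N" = nonsynonymous, "S" = synonymous
#   status: "P" = polymorphic within the focal species, "D" = fixed difference
sites = [("N", "P")] * 5 + [("S", "P")] * 40 + [("N", "D")] * 12 + [("S", "D")] * 18

def tally(sites):
    """Count sites in each of the four MK categories."""
    counts = Counter(sites)
    return {
        "pn": counts[("N", "P")],
        "ps": counts[("S", "P")],
        "dn": counts[("N", "D")],
        "ds": counts[("S", "D")],
    }

def neutrality_index(pn, ps, dn, ds):
    """NI = (Pn / Ps) / (Dn / Ds), computed as (Pn * Ds) / (Ps * Dn)."""
    if ps == 0 or dn == 0:
        raise ValueError("NI is undefined when Ps or Dn is zero")
    return (pn * ds) / (ps * dn)

def direction(ni):
    """Describe the direction of the deviation; says nothing about significance."""
    if ni < 1:
        return "excess of fixed nonsynonymous differences"
    if ni > 1:
        return "excess of nonsynonymous polymorphisms"
    return "no deviation from the neutral expectation"

table = tally(sites)
ni = neutrality_index(**table)
print(table, f"NI = {ni:.4f}", direction(ni))
```

The sketch raises an error rather than returning a value when Ps or Dn is zero, since small counts make the ratio unstable; how to handle such genes in practice is a separate analysis decision.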
Interpretations and debates
A central challenge in applying the MK test is separating the signal of selection from the effects of population history. Demographic events such as bottlenecks, expansions, or population structure can distort the balance between polymorphisms and fixed differences, producing apparent deviations from neutrality even in the absence of adaptive evolution. Proponents counter that synonymous and nonsynonymous sites are interspersed within the same gene and share the same genealogical history, so many demographic effects act on both classes alike; on this view, carefully applied MK-based inferences about selection are robust to many demographic scenarios or can be corrected with complementary data and models.
A related issue concerns slightly deleterious mutations. These mutations can segregate within a population but contribute little to divergence between species, inflating Pn relative to Dn and potentially masking a signal of adaptive evolution. To address this, researchers have developed extended approaches that emphasize higher-frequency variants, apply frequency-spectrum corrections, or simulate scenarios under differing demographic histories to gauge how robust conclusions are under realistic conditions.
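One way to emphasize higher-frequency variants is to discard rare polymorphisms before counting Pn and Ps. The sketch below assumes each polymorphism is recorded with a derived-allele frequency. The 0.15 cutoff, the helper names, and the data are hypothetical and chosen only to show the mechanics, not as a recommended setting.

```python
def filter_by_frequency(polymorphisms, min_freq=0.15):
    """Drop polymorphisms whose derived-allele frequency is below min_freq."""
    return [(effect, freq) for effect, freq in polymorphisms if freq >= min_freq]

def count_pn_ps(polymorphisms):
    """Return (Pn, Ps) for a list of (effect, frequency) pairs."""
    pn = sum(1 for effect, _ in polymorphisms if effect == "N")
    ps = sum(1 for effect, _ in polymorphisms if effect == "S")
    return pn, ps

# Hypothetical polymorphisms: "N" = nonsynonymous, "S" = synonymous,
# paired with an illustrative derived-allele frequency.
polymorphisms = [
    ("N", 0.02), ("N", 0.05), ("N", 0.30),
    ("S", 0.04), ("S", 0.25), ("S", 0.40), ("S", 0.60),
]

raw = count_pn_ps(polymorphisms)
filtered = count_pn_ps(filter_by_frequency(polymorphisms))
print("Pn, Ps before filtering:", raw)
print("Pn, Ps after filtering: ", filtered)
```

In this toy data the rare nonsynonymous variants account for most of Pn, so filtering lowers Pn/Ps. The trade-off is that removing sites also reduces the counts available to the test and therefore its statistical power.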
Choice of outgroup matters as well. If the outgroup is too distant, multiple substitutions at the same site can obscure the true direction of change; if it is too close, insufficient divergence can reduce statistical power. These practical considerations shape how MK test results are interpreted and motivate the use of multiple lines of evidence when asserting adaptive evolution at a gene.
Applications
The McDonald–Kreitman framework has been applied to a broad set of genes and organisms. In Drosophila and related species, the test has supported episodes of adaptive evolution in immune-related genes and other functional categories, contributing to a narrative in which host–pathogen interactions drive rapid protein evolution. In humans and other mammals, MK-type analyses have been used to probe adaptation in immune genes, metabolism, and other pathways, often in concert with other population-genetic tools to map the landscape of selection across the genome. The method remains a go-to approach for researchers seeking a relatively simple, testable criterion for selection in coding regions, especially when high-quality polymorphism data and appropriate outgroups are at hand. See also adaptive evolution, positive selection, and population genetics for related perspectives.