Hardy–Weinberg equilibrium is a population-genetic model for a diploid, sexually reproducing population in which a locus with two alleles (A and a) has stable allele frequencies and predictable genotype frequencies from one generation to the next. The model links allele frequencies to genotype frequencies through the binomial expansion of \((p+q)^2\), where \(p\) and \(q\) are the frequencies of A and a.
Equilibrium meaning in population genetics
“Equilibrium” describes constancy of allele frequencies over generations when the model conditions hold. Under random union of gametes and in the absence of systematic evolutionary forces, genotype frequencies depend only on allele frequencies and take the characteristic proportions \(p^2\), \(2pq\), and \(q^2\) for genotypes AA, Aa, and aa.
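The constancy of allele frequencies can be checked in one line. As a sketch, take \(p\) and \(q\) to be the frequencies of alleles A and a with \(p+q=1\), assume the parental generation has genotype proportions \(p^2\), \(2pq\), and \(q^2\), and write \(p'\) (a symbol introduced here for illustration) for the frequency of A among the gametes it produces. Every gamete from an AA parent carries A, and half of the gametes from an Aa parent do, so

\[
p' = p^2 + \tfrac{1}{2}(2pq) = p(p+q) = p .
\]

The allele frequency in the next generation equals the one in the current generation, which is the sense in which the population is at equilibrium.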
Allele-frequency model and genotype-frequency equations
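Let \(p\) be the frequency of allele A in the gene pool and \(q\) be the frequency of allele a. For two alleles,
\[
p + q = 1 .
\]
Under Hardy–Weinberg assumptions, expected genotype frequencies are:
\[
f_{\mathrm{AA}} = p^2, \qquad f_{\mathrm{Aa}} = 2pq, \qquad f_{\mathrm{aa}} = q^2 .
\]
These three frequencies sum to 1 because
\[
p^2 + 2pq + q^2 = (p+q)^2 = 1 .
\]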
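The mapping from allele frequency to expected genotype frequencies can be written as a short function. The following is a minimal sketch; the function name `expected_genotype_frequencies` and the dictionary keys are chosen here for illustration and are not part of any standard library.

```python
def expected_genotype_frequencies(p: float) -> dict:
    """Return Hardy-Weinberg genotype frequencies for allele A frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in the interval [0, 1]")
    q = 1.0 - p  # two alleles only, so p + q = 1
    return {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}


# Illustrative input: p = 0.6 is an arbitrary example value.
freqs = expected_genotype_frequencies(0.6)
print(freqs)
print("sum:", sum(freqs.values()))  # should be 1 up to floating-point rounding
```

Deriving \(q\) from \(p\) inside the function, rather than accepting both, avoids inputs that violate \(p+q=1\).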
Standard model conditions
The Hardy–Weinberg equilibrium model is most informative when its assumptions are made explicit, because departures from expectations can reflect biology, sampling, or measurement.
| Model condition | Meaning in a population | Common biological or practical departure |
|---|---|---|
| Random mating at the locus | Mating is not correlated with genotype at this locus | Inbreeding, assortative mating, population structure |
| No selection | Genotypes have equal survival and reproductive success | Viability or fertility differences among genotypes |
| No mutation | Alleles do not convert to other alleles at appreciable rates | New alleles introduced by mutation over time |
| No migration (gene flow) | Allele frequencies are not altered by immigration/emigration | Mixing between subpopulations with different p and q |
| Large population size | Sampling noise in allele transmission is negligible | Genetic drift in small populations or founder effects |
| Accurate genotyping and counting | Observed genotype counts reflect true genotypes | Misclassification, null alleles, missing data |
Estimating p and q from observed genotype data
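With genotype frequencies or counts for AA, Aa, and aa, allele frequencies follow directly from allele counting: each homozygote carries two copies of one allele, and each heterozygote carries one copy of each. Using observed genotype frequencies \(f_{\mathrm{AA}}, f_{\mathrm{Aa}}, f_{\mathrm{aa}}\),
\[
p = f_{\mathrm{AA}} + \tfrac{1}{2} f_{\mathrm{Aa}}, \qquad q = f_{\mathrm{aa}} + \tfrac{1}{2} f_{\mathrm{Aa}} .
\]
Using counts with sample size \(N\) and observed counts \(n_{\mathrm{AA}}, n_{\mathrm{Aa}}, n_{\mathrm{aa}}\),
\[
p = \frac{2n_{\mathrm{AA}} + n_{\mathrm{Aa}}}{2N}, \qquad q = \frac{2n_{\mathrm{aa}} + n_{\mathrm{Aa}}}{2N} .
\]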
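Allele counting from genotype counts is simple to express in code. The sketch below assumes a hypothetical helper named `allele_frequencies` that takes the three genotype counts and returns \((p, q)\); the example counts are the ones used later in this article's goodness-of-fit illustration.

```python
def allele_frequencies(n_AA: int, n_Aa: int, n_aa: int) -> tuple:
    """Estimate (p, q) by allele counting from diploid genotype counts."""
    n = n_AA + n_Aa + n_aa  # number of individuals sampled
    if n <= 0:
        raise ValueError("at least one individual is required")
    total_alleles = 2 * n  # each diploid individual carries two alleles
    p = (2 * n_AA + n_Aa) / total_alleles  # copies of allele A
    q = (2 * n_aa + n_Aa) / total_alleles  # copies of allele a
    return p, q


p, q = allele_frequencies(100, 80, 20)
print(f"p = {p:.3f}, q = {q:.3f}")
```

Because both estimates are computed from the same denominator, they sum to 1 by construction.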
Visualization of genotype-frequency curves across allele frequency
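The three genotype-frequency curves \(p^2\), \(2pq\), and \(q^2\) can be tabulated across the range of \(p\) without any plotting library. The following sketch uses a hypothetical function `genotype_curves` and an arbitrary grid of evenly spaced values of \(p\); the resulting rows can be passed to any plotting tool.

```python
def genotype_curves(steps: int = 10) -> list:
    """Return rows (p, p^2, 2pq, q^2) for p on an even grid from 0 to 1."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    rows = []
    for i in range(steps + 1):
        p = i / steps
        q = 1.0 - p
        rows.append((p, p * p, 2.0 * p * q, q * q))
    return rows


print("   p      AA      Aa      aa")
for p, f_AA, f_Aa, f_aa in genotype_curves():
    print(f"{p:4.2f}  {f_AA:6.4f}  {f_Aa:6.4f}  {f_aa:6.4f}")
```

In the tabulated values, the AA curve rises from 0 to 1 as \(p\) increases, the aa curve falls from 1 to 0, and the heterozygote curve is symmetric about \(p=0.5\).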
Worked example from a recessive phenotype frequency
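When a trait is fully recessive and expressed only by genotype aa, the observed phenotype frequency serves as an estimate of \(q^2\), provided the population is assumed to be at Hardy–Weinberg proportions for the locus. With an observed recessive phenotype frequency of \(0.09\),
\[
q^2 = 0.09 \;\Rightarrow\; q = \sqrt{0.09} = 0.3, \qquad p = 1 - q = 0.7 .
\]
The expected genotype frequencies become:
\[
p^2 = 0.49, \qquad 2pq = 0.42, \qquad q^2 = 0.09 .
\]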
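The same steps can be reproduced numerically. This is a minimal sketch under the assumption stated above, that the recessive phenotype frequency equals \(q^2\); the function name `from_recessive_phenotype` is hypothetical.

```python
import math


def from_recessive_phenotype(freq_aa: float) -> dict:
    """Infer p, q and genotype frequencies from the aa phenotype frequency.

    Assumes Hardy-Weinberg proportions, so that freq_aa = q^2.
    """
    if not 0.0 <= freq_aa <= 1.0:
        raise ValueError("a frequency must lie in [0, 1]")
    q = math.sqrt(freq_aa)
    p = 1.0 - q
    return {"p": p, "q": q, "AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}


result = from_recessive_phenotype(0.09)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

The heterozygote (carrier) frequency \(2pq\) in the output is the quantity that cannot be observed directly from phenotypes when the trait is fully recessive.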
Observed versus expected counts and a standard equilibrium check
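With a sample size \(N\), expected genotype counts follow \(E_{\mathrm{AA}}=Np^2\), \(E_{\mathrm{Aa}}=N(2pq)\), and \(E_{\mathrm{aa}}=Nq^2\). A widely used goodness-of-fit statistic compares observed counts \(O\) to expected counts \(E\), summing over the three genotypes:
\[
\chi^2 = \sum_{g \in \{\mathrm{AA},\,\mathrm{Aa},\,\mathrm{aa}\}} \frac{(O_g - E_g)^2}{E_g} .
\]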
The table below illustrates the calculation for \(N=200\) with observed counts \((100, 80, 20)\). Allele frequencies from these observations are \(p=\frac{2(100)+80}{2(200)}=0.7\) and \(q=0.3\), giving expected counts \((98,84,18)\).
| Genotype | Observed \(O\) | Expected \(E\) | \((O-E)^2/E\) |
|---|---|---|---|
| AA | 100 | 98 | \(\frac{(100-98)^2}{98} \approx 0.0408\) |
| Aa | 80 | 84 | \(\frac{(80-84)^2}{84} \approx 0.1905\) |
| aa | 20 | 18 | \(\frac{(20-18)^2}{18} \approx 0.2222\) |
| Total | 200 | 200 | \(\chi^2 \approx 0.4535\) |
For a single biallelic locus with allele frequencies estimated from the same sample, the test has \(\mathrm{df}=1\): three genotype classes, minus one because the counts must sum to \(N\), minus one for the estimated allele frequency. At a typical \(\alpha=0.05\), the critical value is approximately \(3.84\), so \(\chi^2 \approx 0.45\) does not reject Hardy–Weinberg expectations for this illustration.
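The calculation in the table can be reproduced with a few lines of code. The sketch below uses a hypothetical function `hwe_chi_square` and hard-codes the critical value \(3.84\) quoted above rather than calling a statistics library, so that it stays self-contained.

```python
CRITICAL_VALUE_DF1_ALPHA_05 = 3.84  # chi-square critical value, df = 1, alpha = 0.05


def hwe_chi_square(n_AA: int, n_Aa: int, n_aa: int) -> tuple:
    """Return (chi2, expected_counts) for a biallelic Hardy-Weinberg check."""
    n = n_AA + n_Aa + n_aa
    if n <= 0:
        raise ValueError("at least one individual is required")
    p = (2 * n_AA + n_Aa) / (2 * n)  # allele counting
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("locus is monomorphic; the statistic is undefined")
    expected = (n * p * p, n * 2.0 * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


chi2, expected = hwe_chi_square(100, 80, 20)
print("expected counts:", [round(e, 2) for e in expected])
print(f"chi-square = {chi2:.4f}")
print("reject HWE at alpha = 0.05:", chi2 > CRITICAL_VALUE_DF1_ALPHA_05)
```

The guard against a monomorphic locus avoids division by zero when one allele is absent from the sample, a case in which the test carries no information.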
A statistically significant departure from Hardy–Weinberg expectations does not uniquely identify a cause. Heterozygote deficiency can reflect inbreeding or population subdivision, heterozygote excess can reflect certain selection regimes, and apparent deviations can arise from genotyping error or non-random sampling.
Common pitfalls
Hardy–Weinberg equilibrium is often conflated with “no evolution” in a broad sense; the model is locus-specific and force-specific, and other loci can evolve under selection or drift while one locus matches Hardy–Weinberg expectations. Another frequent error is treating \(2pq\) as a fixed number rather than a function of allele frequency; heterozygosity changes predictably with \(p\) and peaks at \(p=0.5\). Small expected genotype counts reduce the reliability of large-sample chi-square approximations, so interpretation benefits from attention to sample size and data quality.
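The location of the heterozygosity peak follows from a short calculation. Writing \(H(p)\) (a symbol introduced here for illustration) for the expected heterozygote frequency and substituting \(q = 1 - p\),

\[
H(p) = 2p(1-p), \qquad \frac{dH}{dp} = 2 - 4p = 0 \iff p = \tfrac{1}{2}, \qquad H\!\left(\tfrac{1}{2}\right) = \tfrac{1}{2} .
\]

The second derivative is \(-4 < 0\), so this stationary point is a maximum: under Hardy–Weinberg proportions, heterozygotes never exceed half of the population at a biallelic locus.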