What is Hardy–Weinberg equilibrium, and how do allele frequencies p and q determine the expected genotype frequencies p², 2pq, and q² in a diploid population?

Hardy–Weinberg equilibrium describes a population state where allele frequencies remain constant and genotype frequencies follow p², 2pq, and q² for alleles A and a under standard model conditions.

Hardy–Weinberg Equilibrium: Genotype Frequencies and Conditions

Accepted answer

Hardy–Weinberg equilibrium is a population-genetic model for a diploid, sexually reproducing population in which a locus with two alleles (A and a) has stable allele frequencies and predictable genotype frequencies from one generation to the next. The model links allele frequencies to genotype frequencies through a simple binomial expansion.

Equilibrium meaning in population genetics

“Equilibrium” describes constancy of allele frequencies over generations when the model conditions hold. Under random union of gametes and in the absence of systematic evolutionary forces, genotype frequencies depend only on allele frequencies and take the characteristic proportions \(p^2\), \(2pq\), and \(q^2\) for genotypes AA, Aa, and aa.

Allele-frequency model and genotype-frequency equations

Let \(p\) be the frequency of allele A in the gene pool and \(q\) be the frequency of allele a. For two alleles,

\[ p + q = 1 \]

Under Hardy–Weinberg assumptions, expected genotype frequencies are:

\[ f(\mathrm{AA}) = p^2,\quad f(\mathrm{Aa}) = 2pq,\quad f(\mathrm{aa}) = q^2 \]

These three frequencies sum to 1 because

\[ p^2 + 2pq + q^2 = (p+q)^2 = 1 \]
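The equations above translate directly into a few lines of code. The following is a minimal sketch; the function name `hardy_weinberg_frequencies` and the example value p = 0.6 are illustrative assumptions, not part of the original answer.

```python
def hardy_weinberg_frequencies(p: float) -> tuple[float, float, float]:
    """Return the expected (AA, Aa, aa) frequencies for allele-A frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    q = 1.0 - p  # two alleles only, so p + q = 1
    return p * p, 2.0 * p * q, q * q


# Illustrative value (an assumption for this sketch): p = 0.6
f_AA, f_Aa, f_aa = hardy_weinberg_frequencies(0.6)
print(f"AA={f_AA:.2f}  Aa={f_Aa:.2f}  aa={f_aa:.2f}")
print(f"sum={f_AA + f_Aa + f_aa:.2f}")  # the three frequencies sum to 1
```

The range check is there because a frequency outside [0, 1] would still produce three numbers that sum to 1, which could hide an input error.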

Standard model conditions

The Hardy–Weinberg equilibrium model is most informative when its assumptions are made explicit, because departures from expectations can reflect biology, sampling, or measurement.

Model condition	Meaning in a population	Common biological or practical departure
Random mating at the locus	Mating is not correlated with genotype at this locus	Inbreeding, assortative mating, population structure
No selection	Genotypes have equal survival and reproductive success	Viability or fertility differences among genotypes
No mutation	Alleles do not convert to other alleles at appreciable rates	New alleles introduced by mutation over time
No migration (gene flow)	Allele frequencies are not altered by immigration/emigration	Mixing between subpopulations with different p and q
Large population size	Sampling noise in allele transmission is negligible	Genetic drift in small populations or founder effects
Accurate genotyping and counting	Observed genotype counts reflect true genotypes	Misclassification, null alleles, missing data

Estimating p and q from observed genotype data

With genotype frequencies or counts for AA, Aa, and aa, allele frequencies follow directly from allele counting. Using genotype frequencies,

\[ p = f(\mathrm{AA}) + \frac{1}{2}f(\mathrm{Aa}),\quad q = f(\mathrm{aa}) + \frac{1}{2}f(\mathrm{Aa}) \]

Using counts with sample size \(N\) and observed counts \(n_{\mathrm{AA}}, n_{\mathrm{Aa}}, n_{\mathrm{aa}}\),

\[ p = \frac{2n_{\mathrm{AA}} + n_{\mathrm{Aa}}}{2N},\quad q = \frac{2n_{\mathrm{aa}} + n_{\mathrm{Aa}}}{2N} = 1-p \]
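Allele counting from genotype counts can be sketched as follows. The function name `allele_frequencies` is a hypothetical helper chosen for illustration; the counts (100, 80, 20) are the ones used later in this answer.

```python
def allele_frequencies(n_AA: int, n_Aa: int, n_aa: int) -> tuple[float, float]:
    """Estimate (p, q) by counting alleles in observed genotype counts."""
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    # Each AA individual carries two A alleles, each Aa carries one;
    # the sample holds 2N alleles in total.
    p = (2 * n_AA + n_Aa) / (2 * n)
    return p, 1.0 - p


p, q = allele_frequencies(100, 80, 20)
print(f"p={p:.2f}  q={q:.2f}")
```

Allele counting does not assume Hardy–Weinberg equilibrium, so these estimates remain valid even when genotype frequencies depart from \(p^2\), \(2pq\), and \(q^2\).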

Visualization of genotype-frequency curves across allele frequency

For any allele frequency \(p\), Hardy–Weinberg equilibrium predicts a specific split of genotype frequencies into \(p^2\), \(2pq\), and \(q^2\). Heterozygosity \(2pq\) is maximized at \(p=0.5\).
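As a text substitute for a plot, the three curves can be tabulated on a grid of allele frequencies. This is a rough sketch; the helper name `genotype_curve` and the step size of 0.1 are assumptions made for illustration.

```python
def genotype_curve(steps: int = 10) -> list[tuple[float, float, float, float]]:
    """Tabulate (p, p^2, 2pq, q^2) on an evenly spaced grid of p in [0, 1]."""
    rows = []
    for i in range(steps + 1):
        p = i / steps
        q = 1.0 - p
        rows.append((p, p * p, 2.0 * p * q, q * q))
    return rows


rows = genotype_curve()
print("p\tAA\tAa\taa")
for p, f_AA, f_Aa, f_aa in rows:
    print(f"{p:.1f}\t{f_AA:.2f}\t{f_Aa:.2f}\t{f_aa:.2f}")

# Locate the grid point with the highest heterozygote frequency.
peak = max(rows, key=lambda row: row[2])
print(f"max 2pq = {peak[2]:.2f} at p = {peak[0]:.1f}")
```

The same result follows analytically: \(2pq = 2p(1-p)\) has derivative \(2 - 4p\), which is zero at \(p = 0.5\), where \(2pq = 0.5\).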

Worked example from a recessive phenotype frequency

When a trait is fully recessive and expressed only by genotype aa, the observed phenotype frequency serves as an estimate of \(q^2\), provided the population is assumed to be in Hardy–Weinberg equilibrium at that locus. With an observed recessive phenotype frequency of \(0.09\),

\[ q^2 = 0.09 \Rightarrow q = \sqrt{0.09} = 0.3,\quad p = 1-q = 0.7 \]

The expected genotype frequencies become:

\[ p^2 = 0.49,\quad 2pq = 2(0.7)(0.3)=0.42,\quad q^2 = 0.09 \]
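The same calculation as a short sketch in code. The function name `from_recessive_frequency` is hypothetical, and the approach assumes the population is in Hardy–Weinberg equilibrium, since \(q\) is recovered from the aa class alone.

```python
import math


def from_recessive_frequency(f_aa: float) -> tuple[float, float, tuple[float, float, float]]:
    """Infer p, q and expected genotype frequencies from the aa frequency.

    Assumes Hardy-Weinberg proportions, so that f(aa) = q^2.
    """
    if not 0.0 <= f_aa <= 1.0:
        raise ValueError("f_aa must lie in [0, 1]")
    q = math.sqrt(f_aa)
    p = 1.0 - q
    return p, q, (p * p, 2.0 * p * q, q * q)


p, q, (f_AA, f_Aa, f_aa) = from_recessive_frequency(0.09)
print(f"p={p:.2f}  q={q:.2f}")
print(f"AA={f_AA:.2f}  Aa={f_Aa:.2f}  aa={f_aa:.2f}")
```

With an input of 0.09 this reproduces the values in the worked example: carriers (Aa) are expected at 0.42 even though only 0.09 of the population shows the recessive phenotype.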

Observed versus expected counts and a standard equilibrium check

With a sample size \(N\), expected genotype counts follow \(E_{\mathrm{AA}}=Np^2\), \(E_{\mathrm{Aa}}=N(2pq)\), and \(E_{\mathrm{aa}}=Nq^2\). A widely used goodness-of-fit statistic compares observed counts \(O\) to expected counts \(E\):

\[ \chi^2 = \sum \frac{(O-E)^2}{E} \]

The table below illustrates the calculation for \(N=200\) with observed counts \((100, 80, 20)\). Allele frequencies from these observations are \(p=\frac{2(100)+80}{2(200)}=0.7\) and \(q=0.3\), giving expected counts \((98,84,18)\).

Genotype	Observed \(O\)	Expected \(E\)	\((O-E)^2/E\)
AA	100	98	\(\frac{(100-98)^2}{98} \approx 0.0408\)
Aa	80	84	\(\frac{(80-84)^2}{84} \approx 0.1905\)
aa	20	18	\(\frac{(20-18)^2}{18} \approx 0.2222\)
Total	200	200	\(\chi^2 \approx 0.4535\)

For a single biallelic locus with allele frequencies estimated from the same sample, the test has \(\mathrm{df}=1\): three genotype classes, minus one for the fixed total, minus one for the estimated allele frequency. At a typical \(\alpha=0.05\), the critical value is approximately \(3.84\), so \(\chi^2 \approx 0.45\) falls well below it and the data in this illustration are consistent with Hardy–Weinberg expectations.
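The check above can be sketched with the Python standard library alone. The function name `hwe_chi_square` is a hypothetical helper. The p-value line is an addition not in the original answer: it relies on the identity that a chi-square variable with one degree of freedom has survival function \(\operatorname{erfc}(\sqrt{x/2})\), so it applies only to the \(\mathrm{df}=1\) case discussed here.

```python
import math


def hwe_chi_square(n_AA: int, n_Aa: int, n_aa: int) -> tuple[float, float, tuple[float, float, float]]:
    """Chi-square goodness-of-fit test against Hardy-Weinberg proportions.

    Returns (chi2, p_value, expected_counts); the p-value assumes df = 1.
    """
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    # Allele frequencies estimated from the same sample by allele counting.
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    observed = (n_AA, n_Aa, n_aa)
    expected = (n * p * p, n * 2.0 * p * q, n * q * q)
    # Classes with zero expected count (only when p is 0 or 1) are skipped.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value, expected


chi2, p_value, expected = hwe_chi_square(100, 80, 20)
print("expected counts:", [round(e, 1) for e in expected])
print(f"chi2 = {chi2:.4f}, p-value = {p_value:.3f}")
print("reject HWE at alpha=0.05:", chi2 > 3.84)
```

Comparing `chi2` with 3.84 and comparing `p_value` with 0.05 are equivalent decisions for this test; the p-value is reported only because it is easier to interpret across different significance levels.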

A statistically significant departure from Hardy–Weinberg expectations does not uniquely identify a cause. Heterozygote deficiency can reflect inbreeding or population subdivision, heterozygote excess can reflect certain selection regimes, and apparent deviations can arise from genotyping error or non-random sampling.

Common pitfalls

Hardy–Weinberg equilibrium is often conflated with “no evolution” in a broad sense; the model is locus-specific and force-specific, and other loci can evolve under selection or drift while one locus matches Hardy–Weinberg expectations. Another frequent error is treating \(2pq\) as a fixed number rather than a function of allele frequency; heterozygosity changes predictably with \(p\) and peaks at \(p=0.5\). Small expected genotype counts reduce the reliability of large-sample chi-square approximations, so interpretation benefits from attention to sample size and data quality.
