Genetic drift simulation (random sampling in finite populations)
Genetic drift is the change in allele frequencies caused by random sampling in a finite population. Even if an
allele has no selective advantage, its frequency can rise or fall purely by chance from one generation to the next.
Drift is strongest in small populations and, given enough generations, ends in either fixation (allele frequency becomes 1) or
loss (allele frequency becomes 0).
Model setup
Consider one locus with two alleles, \(A\) and \(a\). Let \(p_t\) be the frequency of allele \(A\) in generation \(t\),
and \(q_t\) be the frequency of allele \(a\). By definition:
\[
\begin{aligned}
p_t + q_t &= 1
\end{aligned}
\]
The population consists of \(N\) diploid individuals, so there are \(2N\) allele copies at this locus each generation.
The key idea is that the next generation’s \(2N\) alleles are a random sample, drawn with replacement, from the current generation’s alleles.
Drift as binomial sampling
If the current allele frequency is \(p_t\), then the number of \(A\) alleles in the next generation, denoted \(k_t\),
is modeled as a binomial random variable:
\[
\begin{aligned}
k_t &\sim \text{Binomial}(2N,\ p_t)
\end{aligned}
\]
The updated allele frequency is then:
\[
\begin{aligned}
p_{t+1} &= \frac{k_t}{2N}, \qquad q_{t+1}=1-p_{t+1}
\end{aligned}
\]
The calculator repeats the update from \(p_t\) to \(p_{t+1}\) once per generation, for the number of generations you choose.
Because each update is a random draw, each run produces a different trajectory, even with the same starting value \(p_0\).
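The update rule can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function names `drift_step` and `simulate_trajectory` are hypothetical, and the binomial draw is built from \(2N\) independent Bernoulli trials using only the standard library.

```python
import random

def drift_step(p, n_diploid, rng):
    """One generation of drift: draw k ~ Binomial(2N, p) and return k / (2N)."""
    copies = 2 * n_diploid
    # Each of the 2N allele copies is A with probability p, independently.
    k = sum(1 for _ in range(copies) if rng.random() < p)
    return k / copies

def simulate_trajectory(p0, n_diploid, generations, rng):
    """Return the list [p_0, p_1, ..., p_G] for a single replicate."""
    trajectory = [p0]
    for _ in range(generations):
        trajectory.append(drift_step(trajectory[-1], n_diploid, rng))
    return trajectory

rng = random.Random(1)  # seed chosen arbitrarily for this sketch
trajectory = simulate_trajectory(p0=0.5, n_diploid=20, generations=50, rng=rng)
print(trajectory[:5])
```

Every value in `trajectory` is a multiple of \(1/(2N)\), because \(k_t\) is a whole number of allele copies.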
Expected behavior and sampling variance
Drift does not have a directional “push” like selection. In the binomial sampling model:
\[
\begin{aligned}
\mathbb{E}[p_{t+1}\mid p_t] &= p_t
\end{aligned}
\]
So the expected allele frequency stays the same in one step, but there is variance due to sampling:
\[
\begin{aligned}
\mathrm{Var}(p_{t+1}\mid p_t) &= \frac{p_t(1-p_t)}{2N}
\end{aligned}
\]
This formula explains two important patterns you can observe in the graphs:
• Drift is stronger when \(N\) is smaller: the variance is inversely proportional to \(2N\).
• The per-generation sampling variance is largest at \(p_t=0.5\) and shrinks to zero as \(p_t\) approaches 0 or 1.
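Both results follow from the binomial model: \(\mathbb{E}[k_t]=2Np_t\) and \(\mathrm{Var}(k_t)=2Np_t(1-p_t)\), so dividing \(k_t\) by \(2N\) divides the variance by \((2N)^2\). The sketch below checks this numerically by drawing many single-generation updates from the same \(p_t\). The helper names and the parameter values (\(p_t=0.3\), \(N=25\), 5000 draws) are assumptions chosen only for illustration.

```python
import random
import statistics

def drift_step(p, n_diploid, rng):
    """One generation of drift: k ~ Binomial(2N, p), returned as k / (2N)."""
    copies = 2 * n_diploid
    k = sum(1 for _ in range(copies) if rng.random() < p)
    return k / copies

def sampling_variance(p, n_diploid):
    """Predicted one-step variance p(1 - p) / (2N)."""
    return p * (1 - p) / (2 * n_diploid)

p_t, n_diploid, draws = 0.3, 25, 5000
rng = random.Random(7)
samples = [drift_step(p_t, n_diploid, rng) for _ in range(draws)]

empirical_mean = statistics.mean(samples)      # should be close to p_t
empirical_var = statistics.pvariance(samples)  # should be close to the prediction
predicted_var = sampling_variance(p_t, n_diploid)

print(f"mean: {empirical_mean:.4f} (expected {p_t})")
print(f"variance: {empirical_var:.5f} (predicted {predicted_var:.5f})")
```

Calling `sampling_variance` with different arguments reproduces the two patterns: `sampling_variance(0.5, 10)` is larger than `sampling_variance(0.5, 1000)`, and `sampling_variance(0.5, 10)` is larger than `sampling_variance(0.05, 10)`.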
Fixation and loss
If a run reaches \(p_t=1\), allele \(A\) is fixed. If it reaches \(p_t=0\), allele \(A\) is lost.
These are absorbing boundaries in the model: once a run reaches 0 or 1, it stays there.
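When the calculator runs multiple replicate simulations, it estimates the probabilities of fixation and loss by generation \(G\) as the fraction of replicates in each state: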
\[
\begin{aligned}
\widehat{P}(\text{fix by generation }G) &= \frac{\#\{r:\ p_G^{(r)}=1\}}{R} \\
\widehat{P}(\text{loss by generation }G) &= \frac{\#\{r:\ p_G^{(r)}=0\}}{R}
\end{aligned}
\]
Here \(R\) is the number of replicates, and \(p_G^{(r)}\) is the allele frequency at generation \(G\) in replicate \(r\).
Runs that are neither 0 nor 1 at generation \(G\) are still segregating.
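These estimators amount to counting replicates. The following sketch shows one way to compute them, assuming hypothetical function names and illustrative settings (\(p_0=0.5\), \(N=10\), \(G=200\), \(R=200\)). It stops a replicate early once it hits 0 or 1, which is valid because those states are absorbing.

```python
import random

def final_frequency(p0, n_diploid, generations, rng):
    """Simulate one replicate and return its allele frequency at generation G."""
    copies = 2 * n_diploid
    p = p0
    for _ in range(generations):
        if p == 0.0 or p == 1.0:
            break  # absorbing boundary: the frequency can no longer change
        k = sum(1 for _ in range(copies) if rng.random() < p)
        p = k / copies
    return p

def estimate_outcomes(p0, n_diploid, generations, replicates, seed):
    """Return the fractions of replicates that are fixed, lost, or segregating."""
    rng = random.Random(seed)
    finals = [final_frequency(p0, n_diploid, generations, rng)
              for _ in range(replicates)]
    fixed = sum(1 for p in finals if p == 1.0)
    lost = sum(1 for p in finals if p == 0.0)
    return {
        "fix": fixed / replicates,
        "loss": lost / replicates,
        "segregating": (replicates - fixed - lost) / replicates,
    }

outcomes = estimate_outcomes(p0=0.5, n_diploid=10, generations=200,
                             replicates=200, seed=3)
print(outcomes)
```

As a sanity check, the expected frequency never changes under pure drift, so the long-run fixation probability of \(A\) equals \(p_0\). With \(p_0=0.5\) and enough generations, the `fix` and `loss` fractions should both be near 0.5, up to sampling noise from the finite number of replicates.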
What the visualizations show
Spaghetti plot: Each line is one replicate trajectory \(p_t\). Although all replicates start from the same \(p_0\), the lines spread out due to drift.
Some reach fixation or loss earlier than others, especially for small \(N\). Hovering over the plot reveals generation-specific
values and highlights the replicate under the cursor.
Final-generation histogram: This shows the distribution of \(p_G\) across replicates. For strong drift (small \(N\)
and/or many generations), the distribution often piles up near 0 and 1 as more runs fix or lose the allele.
Interpreting the replicate summary
The calculator reports the mean and standard deviation of final allele frequencies:
\[
\begin{aligned}
\overline{p}_G &= \frac{1}{R}\sum_{r=1}^{R} p_G^{(r)} \\
s_G &= \sqrt{\frac{1}{R-1}\sum_{r=1}^{R}\left(p_G^{(r)}-\overline{p}_G\right)^2}
\end{aligned}
\]
The mean summarizes the average outcome across replicates, while the standard deviation quantifies how spread out the
outcomes are. A large \(s_G\) indicates high variability among runs.
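The two summary formulas translate directly into code. The sketch below is an illustration only: `replicate_summary` is a hypothetical name, and the list of final frequencies is made-up example data, not calculator output. The standard library's `statistics.stdev` uses the same \(R-1\) denominator, so it serves as a cross-check.

```python
import math
import statistics

def replicate_summary(final_freqs):
    """Return (mean, sample standard deviation) of final allele frequencies."""
    r = len(final_freqs)
    if r < 2:
        raise ValueError("need at least two replicates for a sample standard deviation")
    mean = sum(final_freqs) / r
    squared_deviations = sum((p - mean) ** 2 for p in final_freqs)
    return mean, math.sqrt(squared_deviations / (r - 1))

# Hypothetical final frequencies from R = 8 replicates.
final_freqs = [0.0, 1.0, 0.45, 0.6, 1.0, 0.0, 0.3, 0.85]
mean_p, s_p = replicate_summary(final_freqs)
cross_check = statistics.stdev(final_freqs)  # same R - 1 denominator

print(f"mean = {mean_p:.3f}, s = {s_p:.3f}, statistics.stdev = {cross_check:.3f}")
```

Because several replicates in this example sit at 0 or 1, the standard deviation is large relative to the 0-to-1 range, which is the pattern strong drift produces.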
Notes and assumptions
This is a basic drift model with random sampling only. It assumes no selection, no mutation, no migration, and a constant
population size \(N\). Because the model is stochastic, fixing the random seed makes the simulation reproducible: the same
inputs and seed generate the same trajectories.
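Seeded reproducibility can be demonstrated with a small sketch. It assumes the simulation owns a private `random.Random` generator created from the seed. How the calculator itself handles seeding is not specified here, and the function name is hypothetical.

```python
import random

def simulate_trajectory(p0, n_diploid, generations, seed):
    """Simulate one replicate using a private generator created from `seed`."""
    rng = random.Random(seed)  # the same seed gives the same random stream
    copies = 2 * n_diploid
    trajectory = [p0]
    for _ in range(generations):
        p = trajectory[-1]
        k = sum(1 for _ in range(copies) if rng.random() < p)
        trajectory.append(k / copies)
    return trajectory

run_a = simulate_trajectory(p0=0.5, n_diploid=50, generations=30, seed=42)
run_b = simulate_trajectory(p0=0.5, n_diploid=50, generations=30, seed=42)
run_c = simulate_trajectory(p0=0.5, n_diploid=50, generations=30, seed=43)

print(run_a == run_b)  # True: identical inputs and seed
print(run_a == run_c)  # a different seed gives a different random stream
```

Creating the generator inside the function, rather than relying on the module-level `random` state, keeps a run reproducible even if other code draws random numbers in between.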