Test cross outcomes
A test cross is a genetics strategy used to infer an unknown genotype by crossing the unknown individual with a
homozygous recessive tester (aa, aabb, or aabbcc). Because the tester can contribute only recessive alleles,
the offspring phenotypes reveal which alleles the unknown parent contributes.
Key assumptions used in this calculator
- Complete dominance at each gene: A dominates a, B dominates b, C dominates c.
- Independent assortment between genes (no linkage) so probabilities multiply across loci.
- No epistasis, lethality, or selection affecting offspring classes (intro-level model).
Phenotype class notation
Each gene has two phenotype states:
dominant phenotype (written as A_) or recessive phenotype (written as aa).
The underscore means “at least one dominant allele.”
For multiple genes, classes are combined, for example:
A_B_, A_bb, aaB_, aabb.
One-gene test cross (A/a)
The tester is aa. The unknown parent could be AA, Aa, or aa.
Offspring outcomes:
\[
\begin{aligned}
\text{AA} \times \text{aa} &\Rightarrow 100\% \; \text{A\_} \\
\text{Aa} \times \text{aa} &\Rightarrow 50\% \; \text{A\_} \;+\; 50\% \; \text{aa} \\
\text{aa} \times \text{aa} &\Rightarrow 100\% \; \text{aa}
\end{aligned}
\]
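The three outcomes above can be reproduced by enumerating the two alleles the unknown parent can transmit and pairing each with the tester's only allele, a. The following is a minimal sketch under the stated assumptions (complete dominance, equal transmission of both alleles); the function name `one_gene_test_cross` is illustrative and not part of the calculator.

```python
from collections import Counter
from fractions import Fraction


def one_gene_test_cross(unknown: str) -> dict:
    """Expected phenotype proportions for unknown x aa (complete dominance)."""
    tester_allele = "a"  # the aa tester can only pass on 'a'
    counts = Counter()
    # Each of the unknown parent's two alleles is transmitted with probability 1/2.
    for gamete in unknown:
        offspring = gamete + tester_allele
        phenotype = "A_" if "A" in offspring else "aa"
        counts[phenotype] += 1
    total = sum(counts.values())
    return {ph: Fraction(n, total) for ph, n in counts.items()}


for genotype in ("AA", "Aa", "aa"):
    print(genotype, "x aa ->", one_gene_test_cross(genotype))
```

Exact fractions are used instead of floats so that the 1:1 split is represented without rounding.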
Interpretation: if you observe both dominant and recessive offspring in a ~1:1 pattern, the unknown parent is most consistent with Aa.
Two-gene test cross (A/a and B/b)
The tester is aabb. Each locus contributes either a fixed outcome (if the unknown is homozygous) or a 1:1 split (if heterozygous).
Probabilities combine by multiplication.
Classic example: if the unknown is AaBb, then each gene splits 1:1, producing 4 phenotype classes in a 1:1:1:1 pattern:
\[
\begin{aligned}
\text{AaBb} \times \text{aabb} \Rightarrow
&\; \frac{1}{4}\; \text{A\_B\_} \;+\; \frac{1}{4}\; \text{A\_bb} \;+\; \frac{1}{4}\; \text{aaB\_} \;+\; \frac{1}{4}\; \text{aabb}
\end{aligned}
\]
If only one gene is heterozygous (for example AaBB), then you get only 2 phenotype classes with a 1:1 split:
\[
\begin{aligned}
\text{AaBB} \times \text{aabb} \Rightarrow \frac{1}{2}\; \text{A\_B\_} \;+\; \frac{1}{2}\; \text{aaB\_}
\end{aligned}
\]
Three-gene test cross (A/a, B/b, C/c)
The tester is aabbcc. The same logic applies: every heterozygous locus doubles the number of phenotype classes.
If the unknown is heterozygous at all three genes (AaBbCc), there are 8 classes, each expected at 1/8.
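The doubling rule can be checked by listing the phenotype classes that have a nonzero expected frequency. The sketch below assumes each locus is written as a two-letter string such as "Aa" or "BB"; the helper name `phenotype_classes` is hypothetical.

```python
from itertools import product


def phenotype_classes(unknown_loci):
    """List the phenotype classes that can appear in a test cross."""
    per_locus = []
    for locus in unknown_loci:
        letter = locus[0].upper()
        options = []
        # A dominant allele in the unknown parent makes the dominant phenotype possible.
        if any(ch.isupper() for ch in locus):
            options.append(letter + "_")
        # A recessive allele makes the recessive phenotype possible (the tester supplies the other).
        if any(ch.islower() for ch in locus):
            options.append(letter.lower() * 2)
        per_locus.append(options)
    return ["".join(combo) for combo in product(*per_locus)]


classes = phenotype_classes(["Aa", "Bb", "Cc"])
print(len(classes), classes)
```

A homozygous locus contributes one option and a heterozygous locus contributes two, so the class count is 2 raised to the number of heterozygous loci.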
General probability model used
For each locus j, the calculator assigns a probability of dominant vs recessive phenotype depending on the unknown genotype at that locus:
\[
\begin{aligned}
\text{If } \text{AA} &: \; P(\text{A\_}) = 1,\; P(\text{aa})=0 \\
\text{If } \text{Aa} &: \; P(\text{A\_}) = \frac{1}{2},\; P(\text{aa})=\frac{1}{2} \\
\text{If } \text{aa} &: \; P(\text{A\_}) = 0,\; P(\text{aa})=1
\end{aligned}
\]
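For a multi-gene phenotype class, the calculator multiplies the per-locus probabilities, where \(P_j\) is the probability of the class's phenotype at locus j and k is the number of genes: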
\[
\begin{aligned}
P(\text{phenotype class}) &= \prod_{j=1}^{k} P_j
\end{aligned}
\]
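As an illustration of the product rule, the sketch below builds the per-locus table and multiplies across loci. It is one possible implementation under the independent-assortment assumption, not the calculator's actual code; the names `locus_probs` and `class_probs` are made up for this example.

```python
import math
from fractions import Fraction
from itertools import product


def locus_probs(genotype):
    """Per-locus phenotype probabilities for a test cross, e.g. 'Aa' -> {A_: 1/2, aa: 1/2}."""
    letter = genotype[0].upper()
    n_dominant = sum(ch.isupper() for ch in genotype)  # 0, 1, or 2 dominant alleles
    p_dominant = Fraction(n_dominant, 2)
    return {letter + "_": p_dominant, letter.lower() * 2: 1 - p_dominant}


def class_probs(unknown_loci):
    """Multiply per-locus probabilities to get each multi-gene class probability."""
    tables = [locus_probs(g) for g in unknown_loci]
    result = {}
    for combo in product(*(t.items() for t in tables)):
        name = "".join(phenotype for phenotype, _ in combo)
        result[name] = math.prod((p for _, p in combo), start=Fraction(1))
    return result


print(class_probs(["Aa", "Bb"]))  # four classes, each 1/4
print(class_probs(["Aa", "BB"]))  # A_B_ and aaB_ at 1/2, the other two at 0
```

Classes that cannot occur under a hypothesis are kept with probability 0, which makes it easy to line them up against observed counts later.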
From expected proportions to expected counts
If you enter observed offspring counts \(O_i\), the total is
\(N = \sum_i O_i\). With \(p_i\) the expected proportion of phenotype class i under a hypothesis, expected counts are computed by:
\[
\begin{aligned}
E_i &= N \cdot p_i
\end{aligned}
\]
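A short worked example of this conversion, using made-up counts for an AaBb hypothesis (the numbers are illustrative only, and `expected_counts` is a hypothetical helper name):

```python
def expected_counts(observed, proportions):
    """Scale expected proportions p_i by the observed total N to get E_i."""
    n = sum(observed.values())
    return {cls: n * p for cls, p in proportions.items()}


# Hypothetical observed counts for the four two-gene classes.
observed = {"A_B_": 27, "A_bb": 22, "aaB_": 24, "aabb": 27}
# Under AaBb x aabb every class is expected at 1/4.
proportions = {cls: 0.25 for cls in observed}

expected = expected_counts(observed, proportions)
print(expected)
```

Here N is 100, so each class has an expected count of 25.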
Comparing observed data to hypotheses (lightweight inference)
When observed counts are provided, the calculator compares each genotype hypothesis to the data using a simple chi-square statistic summed over the m phenotype classes:
\[
\begin{aligned}
\chi^2 &= \sum_{i=1}^{m} \frac{(O_i - E_i)^2}{E_i} \\
\text{df} &= m - 1
\end{aligned}
\]
- Best match: the hypothesis with the smallest \(\chi^2\) (equivalently, the largest p-value when all hypotheses are tested with the same df).
- Goodness-of-fit indicator: a quick label (good / borderline / poor) based on the p-value.
Important caution: the chi-square approximation is less reliable when some expected counts are small (a common rule of thumb is to treat any class with \(E_i < 5\) as too small).
This calculator will warn you when that happens.
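The comparison can be sketched as follows for a one-gene cross with hypothetical counts. Two points here are assumptions rather than documented calculator behavior: a class with zero expected count is skipped when nothing was observed in it and makes the statistic infinite otherwise, and df counts only classes with nonzero expectation. The name `chi_square` and the `min_expected` parameter are illustrative.

```python
import math


def chi_square(observed, expected, min_expected=5.0):
    """Return (statistic, df, classes flagged for small expected counts)."""
    stat = 0.0
    used = 0
    small = []
    for cls, e in expected.items():
        o = observed.get(cls, 0)
        if e == 0:
            # Assumption: offspring in an impossible class rule the hypothesis out.
            if o > 0:
                stat = math.inf
            continue
        used += 1
        if e < min_expected:
            small.append(cls)  # rule-of-thumb warning
        stat += (o - e) ** 2 / e
    return stat, max(used - 1, 0), small


observed = {"A_": 48, "aa": 52}  # made-up counts
n = sum(observed.values())
hypotheses = {
    "AA": {"A_": 1.0, "aa": 0.0},
    "Aa": {"A_": 0.5, "aa": 0.5},
    "aa": {"A_": 0.0, "aa": 1.0},
}
results = {
    name: chi_square(observed, {cls: n * p for cls, p in props.items()})
    for name, props in hypotheses.items()
}
best = min(results, key=lambda name: results[name][0])
print(results)
print("best match:", best)
```

A p-value can then be taken from the chi-square survival function, for example `scipy.stats.chi2.sf(stat, df)` if SciPy is available; it is left out here to keep the sketch dependency-free.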
How to use the calculator effectively
- Select 1, 2, or 3 genes. The tester genotype updates automatically (aa, aabb, aabbcc).
- For each gene, choose the unknown genotype option (e.g., choosing A_ tests both the AA and Aa hypotheses).
- Optional: enter observed counts per phenotype class. You can paste counts, paste a two-column CSV (phenotype,count), or upload a CSV file.
- Click Calculate to see expected ratios for each hypothesis and the best match (if data were provided).
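For readers preparing data outside the calculator, a two-column CSV of the kind mentioned above could be read like this. The exact format the calculator accepts is not specified here, so the optional header row, the skipping of malformed rows, and the function name `parse_counts` are all assumptions.

```python
import csv
import io


def parse_counts(text):
    """Parse 'phenotype,count' rows into a dict, skipping headers and malformed rows."""
    counts = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # blank or incomplete line
        phenotype, raw = row[0].strip(), row[1].strip()
        if not raw.isdigit():
            continue  # e.g. a header row such as "phenotype,count"
        counts[phenotype] = counts.get(phenotype, 0) + int(raw)
    return counts


sample = "phenotype,count\nA_B_,27\nA_bb,22\naaB_,24\naabb,27\n"
print(parse_counts(sample))
```

Repeated phenotype labels are summed rather than overwritten, which is a design choice made for this sketch.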
How to interpret the visualizations
- Expected vs observed bar chart: side-by-side bars per phenotype class. Hover over a bar to see the exact value. Use the mouse wheel or the zoom buttons to zoom, and drag to pan.
- Flow diagram: “Hypothesis → Expected ratio → Compare to data” summarizes the inference for the selected hypothesis.
- Tester badge: reminds you that the recessive tester forces the unknown parent’s alleles to be revealed in offspring phenotypes.
Common reasons real data may not match perfectly
- Sampling variation when the number of offspring is small.
- Linkage (genes close together on the same chromosome) violates independent assortment.
- Epistasis (one gene masking another) changes phenotype class structure.
- Viability/selection differences among genotype classes.
If your dataset consistently deviates from all hypotheses under the independent-assortment model, consider linkage or gene interactions as next-step explanations.