Definition and conditions
The law of total probability computes an overall probability by splitting a problem into mutually exclusive, exhaustive cases and adding the case-wise contributions. The key requirement is that the cases form a partition of the sample space.
Partition conditions: events \(B_1,\dots,B_n\) must satisfy:
- Mutually exclusive: \(B_i \cap B_j = \varnothing\) for \(i \ne j\)
- Exhaustive: \(B_1 \cup \cdots \cup B_n = \Omega\)
- Nonzero where used: \(P(B_i) > 0\) whenever \(P(A\mid B_i)\) is taken
Law of total probability (general form)
If \(B_1,\dots,B_n\) form a partition, then for any event \(A\), \[ P(A)=\sum_{i=1}^{n} P(A\mid B_i)\cdot P(B_i). \]
Interpretation: compute \(P(A)\) by taking a weighted average of the conditional probabilities \(P(A\mid B_i)\), with weights \(P(B_i)\).
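The weighted-average reading can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `total_probability` and the three-case numbers are assumptions made up for the example, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction
import math


def total_probability(priors, likelihoods):
    """Return sum_i P(A | B_i) * P(B_i) for cases B_1..B_n.

    priors[i] is P(B_i) and likelihoods[i] is P(A | B_i).
    """
    if len(priors) != len(likelihoods):
        raise ValueError("need exactly one likelihood per case")
    # Exhaustive, mutually exclusive cases must carry total probability 1.
    if not math.isclose(sum(priors), 1):
        raise ValueError("case probabilities must sum to 1")
    return sum(lik * prior for lik, prior in zip(likelihoods, priors))


# Hypothetical three-case partition (illustrative numbers, not from the text).
example_priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
example_likelihoods = [Fraction(1, 10), Fraction(1, 2), Fraction(1)]
example_result = total_probability(example_priors, example_likelihoods)
print(example_result)  # 2/5
```

The sum-to-one check catches an incomplete case list early; it cannot detect overlapping cases, which have to be ruled out when the events are defined.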
Derivation in one line
Because \(A = (A\cap B_1)\cup \cdots \cup (A\cap B_n)\) and the sets \(A\cap B_i\) are disjoint, \[ P(A)=\sum_{i=1}^{n} P(A\cap B_i)=\sum_{i=1}^{n} P(A\mid B_i)\cdot P(B_i). \]
The figure partitions the sample space into cases \(B_1,B_2,B_3\). The overall probability \(P(A)\) is obtained by adding the contributions from each case: \(P(A\cap B_i)\).
Why this rule is natural in biology
Biological outcomes are often mixtures of distinct underlying states. In Mendelian genetics, an observed phenotype can correspond to multiple genotypes, and an offspring outcome depends on which parental genotypes are actually present. The law of total probability formalizes the idea:
- choose the hidden state (a genotype, a cross type, a population subgroup) with probability \(P(B_i)\);
- within that state, the outcome occurs with probability \(P(A\mid B_i)\);
- sum over all states.
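The three steps above also describe a sampling procedure, so one way to sanity-check the rule is to simulate it. The sketch below draws a hidden state, then draws the outcome within that state, and compares the observed frequency with the weighted sum. The function name, the seed, and the state probabilities are all hypothetical choices for illustration.

```python
import random


def simulate_outcome_frequency(priors, likelihoods, n_trials=100_000, seed=0):
    """Estimate P(A) by sampling a hidden state, then the outcome within it."""
    rng = random.Random(seed)
    states = range(len(priors))
    # Step 1: choose the hidden state B_i with probability P(B_i).
    drawn_states = rng.choices(states, weights=priors, k=n_trials)
    # Step 2: within that state, the outcome occurs with probability P(A | B_i).
    hits = sum(1 for i in drawn_states if rng.random() < likelihoods[i])
    return hits / n_trials


# Hypothetical hidden-state probabilities and within-state outcome probabilities.
state_priors = [0.5, 0.3, 0.2]
state_likelihoods = [0.1, 0.5, 1.0]

# Step 3: the exact answer is the sum over all states.
exact = sum(lik * prior for lik, prior in zip(state_likelihoods, state_priors))
estimate = simulate_outcome_frequency(state_priors, state_likelihoods)
print(f"exact = {exact:.4f}, simulated = {estimate:.4f}")
```

With this many trials the simulated frequency should land close to the exact value; the gap shrinks roughly with the square root of the number of trials.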
Worked genetics example: dominant phenotype parent × recessive parent
Consider a single gene with alleles \(A\) (dominant) and \(a\) (recessive). A parent shows the dominant phenotype, so the genotype is either \(AA\) or \(Aa\). The other parent is known to be \(aa\). The goal is to compute the probability that a child is \(aa\).
Step 1: Define events
- \(A\): “child genotype is \(aa\)” (here \(A\) names the event, not the dominant allele)
- \(B_1\): “dominant-phenotype parent is \(AA\)”
- \(B_2\): “dominant-phenotype parent is \(Aa\)”
Here \(\{B_1,B_2\}\) is a partition of the dominant parent’s possible genotypes (given that the phenotype is dominant).
Step 2: Assign genotype probabilities for the dominant parent
Assume the dominant-phenotype parent is drawn at random from a population in Hardy–Weinberg proportions, with allele frequency \(p=0.8\) for the dominant allele and \(q=0.2\) for the recessive allele. Then \[ P(AA)=p^2=0.64,\quad P(Aa)=2pq=0.32,\quad P(aa)=q^2=0.04. \] Conditioning on the parent showing the dominant phenotype (not \(aa\)) gives the case probabilities \(P(B_1)\) and \(P(B_2)\): \[ P(AA\mid \text{dom})=\frac{0.64}{0.64+0.32}=\frac{0.64}{0.96}=\frac{2}{3},\quad P(Aa\mid \text{dom})=\frac{0.32}{0.96}=\frac{1}{3}. \]
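The arithmetic in this step can be checked with exact fractions. The sketch below assumes only the allele frequencies stated above; the variable names are illustrative.

```python
from fractions import Fraction

p = Fraction(4, 5)  # frequency of the dominant allele (0.8, from the text)
q = 1 - p           # frequency of the recessive allele (0.2)

# Hardy-Weinberg genotype frequencies.
genotype_freq = {"AA": p**2, "Aa": 2 * p * q, "aa": q**2}

# Condition on the dominant phenotype: keep AA and Aa, then renormalize.
p_dominant = genotype_freq["AA"] + genotype_freq["Aa"]
given_dominant = {g: genotype_freq[g] / p_dominant for g in ("AA", "Aa")}

for genotype, prob in given_dominant.items():
    print(f"P({genotype} | dom) = {prob}")
```

Renormalizing over the genotypes compatible with the observed phenotype is what makes the two cases sum to 1, which the partition conditions require.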
Step 3: Compute the conditional child probabilities
With the other parent \(aa\):
- If \(B_1\) (parent \(AA\)), then all children receive an \(A\) from that parent, so \(P(A\mid B_1)=0\).
- If \(B_2\) (parent \(Aa\)), half the gametes carry \(a\), so \(P(A\mid B_2)=\frac{1}{2}\).
Step 4: Apply the law of total probability
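\[ P(A)=P(A\mid B_1)\cdot P(B_1)+P(A\mid B_2)\cdot P(B_2)=0\cdot\frac{2}{3}+\frac{1}{2}\cdot\frac{1}{3}=\frac{1}{6}. \] The probability that the child is \(aa\) is \(\frac{1}{6}\approx 0.1667\) under the stated allele-frequency assumption.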
| Case \(B_i\) | Meaning | \(P(B_i)\) | \(P(A\mid B_i)\) | Contribution \(P(A\mid B_i)\cdot P(B_i)\) |
|---|---|---|---|---|
| \(B_1\) | Dominant parent is \(AA\) | \(\frac{2}{3}\) | \(0\) | \(0\) |
| \(B_2\) | Dominant parent is \(Aa\) | \(\frac{1}{3}\) | \(\frac{1}{2}\) | \(\frac{1}{6}\) |
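The table can be reproduced programmatically, which is a convenient way to double-check the contributions. This is a small sketch using the case probabilities and conditional probabilities listed above; the data layout is an assumption chosen for readability.

```python
from fractions import Fraction

# (case label, P(B_i), P(child is aa | B_i)) taken from the table.
cases = [
    ("B1: dominant parent is AA", Fraction(2, 3), Fraction(0)),
    ("B2: dominant parent is Aa", Fraction(1, 3), Fraction(1, 2)),
]

# Each case contributes P(A | B_i) * P(B_i) to the total.
contributions = [(label, lik * prior) for label, prior, lik in cases]
p_child_aa = sum(value for _, value in contributions)

for label, value in contributions:
    print(f"{label}: contribution {value}")
print(f"P(child is aa) = {p_child_aa} (about {float(p_child_aa):.4f})")
```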
Common pitfalls
- Overlapping cases: if the \(B_i\) are not mutually exclusive, summing double-counts outcomes.
- Incomplete case list: if the \(B_i\) are not exhaustive, the sum omits the part of \(A\) that lies outside the listed cases, so it can underestimate \(P(A)\).
- Conditioning on the wrong information: in genetics, probabilities often must be conditioned on observed phenotype or family history before applying the rule.
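A quick numerical check makes the last two pitfalls concrete. The sketch below reuses the worked example's numbers and compares two sets of weights: the unconditioned Hardy–Weinberg frequencies (which do not sum to 1 over the two cases) and the same frequencies renormalized over the dominant phenotype. The helper name `weighted_sum` is hypothetical.

```python
from fractions import Fraction

# P(child is aa | parent genotype) when the other parent is aa.
likelihood = {"AA": Fraction(0), "Aa": Fraction(1, 2)}

# Wrong weights: population frequencies 0.64 and 0.32, not conditioned on phenotype.
raw_weights = {"AA": Fraction(16, 25), "Aa": Fraction(8, 25)}

# Correct weights: renormalize over the genotypes compatible with the phenotype.
raw_total = sum(raw_weights.values())
conditioned_weights = {g: w / raw_total for g, w in raw_weights.items()}


def weighted_sum(weights):
    """Sum of P(A | B_i) * weight_i over the listed cases."""
    return sum(likelihood[g] * weights[g] for g in weights)


wrong = weighted_sum(raw_weights)          # 4/25 = 0.16
right = weighted_sum(conditioned_weights)  # 1/6
print(f"weights sum to {raw_total}: result {wrong}")
print(f"weights sum to {sum(conditioned_weights.values())}: result {right}")
```

The unconditioned result, 0.16, is the joint probability that the parent shows the dominant phenotype and the child is \(aa\), not the conditional probability the problem asks for. Checking that the weights sum to 1 flags the mistake.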
Connection to Bayes’ theorem
The law of total probability often supplies the denominator in Bayes’ theorem by providing \(P(A)\) from case-wise likelihoods \(P(A\mid B_i)\). In biological inference problems (carrier testing, diagnostic sensitivity/specificity), this linkage is essential for converting test outcomes into posterior probabilities.
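As an illustration of that linkage, the sketch below updates the dominant parent's genotype probabilities after a hypothetical observation that is not part of the worked example: a single child who shows the dominant phenotype. The case probabilities are the ones derived earlier; the observation and variable names are assumptions for the sake of the example.

```python
from fractions import Fraction

# Probabilities for the dominant-phenotype parent's genotype before the observation.
prior = {"AA": Fraction(2, 3), "Aa": Fraction(1, 3)}

# P(child shows dominant phenotype | parent genotype), other parent aa:
# AA x aa always gives Aa children; Aa x aa gives Aa half the time.
likelihood = {"AA": Fraction(1), "Aa": Fraction(1, 2)}

# Denominator of Bayes' theorem, supplied by the law of total probability.
evidence = sum(likelihood[g] * prior[g] for g in prior)

# Posterior P(parent genotype | child shows dominant phenotype).
posterior = {g: likelihood[g] * prior[g] / evidence for g in prior}

print(f"P(child dominant) = {evidence}")
for genotype, prob in posterior.items():
    print(f"P(parent is {genotype} | child dominant) = {prob}")
```

Under these assumptions, one dominant child shifts the probability that the parent is a carrier (\(Aa\)) from \(\frac{1}{3}\) down to \(\frac{1}{5}\).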