What is the law of total probability, and how is it used to compute overall probabilities from conditional biological cases?

The law of total probability states that if events \(B_1,\dots,B_n\) form a partition of the sample space, then \(P(A)=\sum_{i=1}^n P(A\mid B_i)\cdot P(B_i)\). An overall probability is therefore obtained by weighting each conditional probability \(P(A\mid B_i)\) by the probability \(P(B_i)\) of its case.

Law of Total Probability (Biology & Genetics)


Definition and conditions

The law of total probability is a rule for computing an overall probability by breaking a problem into mutually exclusive cases and adding the case-wise contributions. The key requirement is a partition of the sample space.

Partition conditions: events \(B_1,\dots,B_n\) must satisfy:

Mutually exclusive: \(B_i \cap B_j = \varnothing\) for \(i \ne j\)
Exhaustive: \(B_1 \cup \cdots \cup B_n = \Omega\)
Positive probability: \(P(B_i) > 0\) for every case in which \(P(A\mid B_i)\) appears, since the conditional probability is otherwise undefined

Law of total probability (general form)

If \(B_1,\dots,B_n\) form a partition, then for any event \(A\),

\[ P(A)=\sum_{i=1}^{n} P(A\mid B_i)\cdot P(B_i) \]

Interpretation: compute \(P(A)\) by taking a weighted average of the conditional probabilities \(P(A\mid B_i)\), with weights \(P(B_i)\).
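The weighted-average reading can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `total_probability` and the three-case numbers are made up for the example, and exact fractions are used so the arithmetic can be checked by hand.

```python
from fractions import Fraction


def total_probability(case_probs, cond_probs):
    """Return P(A) = sum_i P(A | B_i) * P(B_i) for a partition B_1..B_n."""
    if len(case_probs) != len(cond_probs):
        raise ValueError("need one conditional probability per case")
    # Exhaustive partition: the case probabilities must sum to 1
    # (a small tolerance keeps the check usable with floats too).
    if abs(sum(case_probs) - 1) > 1e-12:
        raise ValueError("case probabilities must sum to 1")
    return sum(c * w for c, w in zip(cond_probs, case_probs))


# Hypothetical three-case partition (illustrative numbers only).
weights = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]  # P(B_i)
conds = [Fraction(1, 10), Fraction(1, 2), Fraction(1)]       # P(A | B_i)

p_a = total_probability(weights, conds)
print(p_a)  # 2/5
```

Because the weights are non-negative and sum to 1, the result always lies between the smallest and largest \(P(A\mid B_i)\), which is what makes it a weighted average.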

Derivation in one line

Because \(A = (A\cap B_1)\cup \cdots \cup (A\cap B_n)\) and the sets \(A\cap B_i\) are disjoint, \[ P(A)=\sum_{i=1}^{n} P(A\cap B_i)=\sum_{i=1}^{n} P(A\mid B_i)\cdot P(B_i). \]

Figure: the sample space partitioned into cases \(B_1,B_2,B_3\). The overall probability \(P(A)\) is the sum of the contributions \(P(A\cap B_i)\) from each case.

Why this rule is natural in biology

Biological outcomes are often mixtures of distinct underlying states. In Mendelian genetics, an observed phenotype can correspond to multiple genotypes, and an offspring outcome depends on which parental genotypes are actually present. The law of total probability formalizes the idea:

choose the hidden state (a genotype, a cross type, a population subgroup) with probability \(P(B_i)\);
within that state, the outcome occurs with probability \(P(A\mid B_i)\);
sum over all states.
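The three steps above can also be read as a two-stage sampling procedure, which a small Monte Carlo sketch makes concrete. The function `simulate` and the numbers are hypothetical, chosen only so that the exact answer from the formula is 0.4; the simulated frequency should land close to it.

```python
import random


def simulate(case_probs, cond_probs, n_trials=100_000, seed=0):
    """Estimate P(A) by sampling the hidden state first, then the outcome."""
    rng = random.Random(seed)
    case_ids = list(range(len(case_probs)))
    hits = 0
    for _ in range(n_trials):
        # Stage 1: choose the hidden state B_i with probability P(B_i).
        i = rng.choices(case_ids, weights=case_probs)[0]
        # Stage 2: within that state, A occurs with probability P(A | B_i).
        if rng.random() < cond_probs[i]:
            hits += 1
    return hits / n_trials


# Illustrative numbers only (not taken from any data set).
case_probs = [0.5, 0.3, 0.2]   # P(B_i)
cond_probs = [0.1, 0.5, 1.0]   # P(A | B_i)

exact = sum(c * w for c, w in zip(cond_probs, case_probs))
estimate = simulate(case_probs, cond_probs)
print(f"exact = {exact:.3f}, simulated = {estimate:.3f}")
```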

Worked genetics example: dominant phenotype parent × recessive parent

Consider a single gene with alleles \(A\) (dominant) and \(a\) (recessive). A parent shows the dominant phenotype, so the genotype is either \(AA\) or \(Aa\). The other parent is known to be \(aa\). The goal is to compute the probability that a child is \(aa\).

Step 1: Define events

\(A\): “child genotype is \(aa\)” (an event, not to be confused with the allele \(A\))
\(B_1\): “dominant-phenotype parent is \(AA\)”
\(B_2\): “dominant-phenotype parent is \(Aa\)”

Here \(\{B_1,B_2\}\) is a partition of the dominant parent’s possible genotypes (given that the phenotype is dominant), so \(P(B_1)\) and \(P(B_2)\) below denote probabilities conditional on that observed phenotype.

Step 2: Assign genotype probabilities for the dominant parent

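Assume the dominant-phenotype parent is a random individual from a population with Hardy–Weinberg genotype frequencies, where the allele frequencies are \(p=0.8\) for allele \(A\) and \(q=0.2\) for allele \(a\). (Here \(p\) and \(q\) are allele frequencies, not probabilities of the event \(A\) defined in Step 1.) Then \[ P(AA)=p^2=0.64,\quad P(Aa)=2pq=0.32,\quad P(aa)=q^2=0.04. \] Conditioning on the parent showing the dominant phenotype (genotype not \(aa\)): \[ P(AA\mid \text{dom})=\frac{0.64}{0.64+0.32}=\frac{0.64}{0.96}=\frac{2}{3},\quad P(Aa\mid \text{dom})=\frac{0.32}{0.96}=\frac{1}{3}. \]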
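The Step 2 arithmetic can be checked with exact fractions. This is a short sketch under the same assumptions as the text (Hardy–Weinberg frequencies, allele frequency 0.8 for the dominant allele); the variable names are illustrative.

```python
from fractions import Fraction

p = Fraction(4, 5)   # frequency of the dominant allele (0.8)
q = 1 - p            # frequency of the recessive allele (0.2)

# Hardy-Weinberg genotype frequencies in the population.
freq = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

# Condition on the dominant phenotype: only AA and Aa remain possible.
p_dom = freq["AA"] + freq["Aa"]
given_dom = {g: freq[g] / p_dom for g in ("AA", "Aa")}

print(freq)       # AA = 16/25, Aa = 8/25, aa = 1/25
print(given_dom)  # AA = 2/3, Aa = 1/3
```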

Step 3: Compute the conditional child probabilities

With the other parent \(aa\):

If \(B_1\) (parent \(AA\)), then all children receive an \(A\) from that parent, so \(P(A\mid B_1)=0\).
If \(B_2\) (parent \(Aa\)), half the gametes carry \(a\), so \(P(A\mid B_2)=\frac{1}{2}\).

Step 4: Apply the law of total probability

\[ P(A)=P(A\mid B_1)\cdot P(B_1)+P(A\mid B_2)\cdot P(B_2) =0\cdot \frac{2}{3}+\frac{1}{2}\cdot \frac{1}{3}=\frac{1}{6}. \]

The probability that the child is \(aa\) is \(\frac{1}{6}\approx 0.1667\) under the stated allele-frequency assumption.

Case \(B_i\)	Meaning	\(P(B_i)\)	\(P(A\mid B_i)\)	Contribution \(P(A\mid B_i)\cdot P(B_i)\)
\(B_1\)	Dominant parent is \(AA\)	\(\frac{2}{3}\)	\(0\)	\(0\)
\(B_2\)	Dominant parent is \(Aa\)	\(\frac{1}{3}\)	\(\frac{1}{2}\)	\(\frac{1}{6}\)
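The table can be reproduced with a few lines of Python, again using exact fractions. The sketch takes the case probabilities and conditional probabilities from Steps 2 and 3 as given; the variable names are illustrative.

```python
from fractions import Fraction

# (genotype of dominant parent, P(B_i | dom), P(child is aa | B_i))
cases = [
    ("AA", Fraction(2, 3), Fraction(0)),
    ("Aa", Fraction(1, 3), Fraction(1, 2)),
]

# Contribution of each case: P(A | B_i) * P(B_i).
contributions = {g: cond * weight for g, weight, cond in cases}
p_child_aa = sum(contributions.values())

for genotype, value in contributions.items():
    print(f"parent {genotype}: contribution {value}")
print(f"P(child is aa) = {p_child_aa} = {float(p_child_aa):.4f}")  # 1/6 = 0.1667
```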

Common pitfalls

Overlapping cases: if the \(B_i\) are not mutually exclusive, summing double-counts outcomes.
Incomplete case list: if the \(B_i\) are not exhaustive, the computed \(P(A)\) is missing probability mass.
Conditioning on the wrong information: in genetics, probabilities often must be conditioned on observed phenotype or family history before applying the rule.
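The last two pitfalls can be seen in the worked example. If the unconditioned population frequencies 0.64 and 0.32 are used as weights, they sum to 0.96 rather than 1, and the result comes out as 0.16 instead of \(\frac{1}{6}\). The sketch below, with illustrative variable names, shows the check and the correction.

```python
from fractions import Fraction

# Population frequencies of the two dominant-phenotype genotypes,
# NOT yet conditioned on the parent showing the dominant phenotype.
raw = {"AA": Fraction(16, 25), "Aa": Fraction(8, 25)}
# P(child is aa | parent genotype), other parent aa.
cond = {"AA": Fraction(0), "Aa": Fraction(1, 2)}

total_weight = sum(raw.values())
print(total_weight)  # 24/25: the weights do not form a full partition

# Naive sum with the wrong weights.
naive = sum(cond[g] * raw[g] for g in raw)
# Renormalizing is the same as conditioning on the dominant phenotype.
corrected = naive / total_weight

print(naive, corrected)  # 4/25 versus 1/6
```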

Connection to Bayes’ theorem

The law of total probability often supplies the denominator in Bayes’ theorem by providing \(P(A)\) from case-wise likelihoods \(P(A\mid B_i)\): \[ P(B_k\mid A)=\frac{P(A\mid B_k)\cdot P(B_k)}{\sum_{i=1}^{n} P(A\mid B_i)\cdot P(B_i)}. \] In biological inference problems (carrier testing, diagnostic sensitivity/specificity), this linkage is essential for converting test outcomes into posterior probabilities.
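As an illustration, the worked example can be extended one step. Suppose the child turns out to show the dominant phenotype, and we want the updated probability that the dominant parent is a carrier (\(Aa\)). This scenario is an added assumption for illustration, not part of the example above; the priors \(\frac{2}{3}\) and \(\frac{1}{3}\) are reused from Step 2.

```python
from fractions import Fraction

# Prior genotype probabilities for the dominant-phenotype parent (Step 2).
prior = {"AA": Fraction(2, 3), "Aa": Fraction(1, 3)}
# Likelihood of a dominant-phenotype child when the other parent is aa.
likelihood = {"AA": Fraction(1), "Aa": Fraction(1, 2)}

# Denominator of Bayes' theorem, via the law of total probability.
evidence = sum(likelihood[g] * prior[g] for g in prior)

# Posterior probability of each parental genotype given the observation.
posterior = {g: likelihood[g] * prior[g] / evidence for g in prior}

print(evidence)   # 5/6
print(posterior)  # AA = 4/5, Aa = 1/5
```

Observing a dominant-phenotype child lowers the carrier probability from \(\frac{1}{3}\) to \(\frac{1}{5}\), because such a child is more likely under \(AA\) than under \(Aa\).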
