Linearity for the hypergeometric distribution is seen most directly by writing the hypergeometric count as a sum of indicator random variables. The mean then follows immediately from linearity of expectation, even though the draws are dependent.
1) Setup: the hypergeometric model
A finite population has size \(N\). Exactly \(K\) items are labeled “success” and \(N-K\) are “failure”. A sample of size \(n\) is drawn without replacement.
Let \(X\) be the number of successes in the sample. Then \(X\) follows a hypergeometric distribution with parameters \((N, K, n)\).
2) The key idea: linearity of expectation
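Define one indicator variable per draw, for \(i=1,\dots,n\): \[ I_i = \begin{cases} 1, & \text{if the } i\text{-th draw is a success},\\ 0, & \text{otherwise}. \end{cases} \]
The total number of successes is the sum of these indicators: \[ X = \sum_{i=1}^{n} I_i. \]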
Linearity of expectation states that for any random variables \(Y_1,\dots,Y_n\), \[ \mathbb{E}\!\left[\sum_{i=1}^{n} Y_i\right] = \sum_{i=1}^{n} \mathbb{E}[Y_i]. \] Independence is not required.
3) Applying linearity to find the hypergeometric mean
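First compute the expectation of one indicator. Each draw (viewed marginally) is a success with probability \(p = K/N\), because by symmetry each of the \(N\) items is equally likely to occupy the \(i\)-th position of the sample, so \[ \mathbb{E}[I_i] = \Pr(I_i = 1) = \frac{K}{N}. \]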
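Then linearity gives \[ \mathbb{E}[X] = \mathbb{E}\!\left[\sum_{i=1}^{n} I_i\right] = \sum_{i=1}^{n} \mathbb{E}[I_i] = n \cdot \frac{K}{N}. \]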
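The marginal probability of each draw and the resulting mean can be checked by brute force on a small population. The following is a minimal sketch, assuming a toy case with \(N=6\), \(K=2\), \(n=3\); the function names are illustrative and not part of the text. It enumerates every equally likely ordered sample and computes the expectations as exact fractions.

```python
from fractions import Fraction
from itertools import permutations


def enumerate_ordered_samples(N, K, n):
    """Yield every equally likely ordered sample as a tuple of 0/1 indicators."""
    population = [1] * K + [0] * (N - K)  # 1 = success, 0 = failure
    # Permute item positions (not labels) so all outcomes are equally likely.
    for idx in permutations(range(N), n):
        yield tuple(population[i] for i in idx)


def marginals_and_mean(N, K, n):
    """Return ([E[I_1], ..., E[I_n]], E[X]) as exact fractions."""
    samples = list(enumerate_ordered_samples(N, K, n))
    total = len(samples)
    # E[I_i]: fraction of ordered samples with a success in position i.
    marginals = [Fraction(sum(s[i] for s in samples), total) for i in range(n)]
    # E[X]: average of the indicator sum over all ordered samples.
    mean_x = Fraction(sum(sum(s) for s in samples), total)
    return marginals, mean_x


marginals, mean_x = marginals_and_mean(N=6, K=2, n=3)
print(marginals)  # [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]
print(mean_x)     # 1
```

Every position has the same marginal \(K/N = 1/3\), and the mean equals \(nK/N = 1\), even though the draws within each sample are dependent.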
4) Where dependence matters: variance requires covariances
Although linearity makes the mean straightforward, the variance must account for dependence between draws. Applying \(\mathrm{Var}\!\left(\sum Y_i\right)=\sum \mathrm{Var}(Y_i)+2\sum_{i<j}\mathrm{Cov}(Y_i,Y_j)\) with \(Y_i = I_i\), the required components are:
| Component | Value in the hypergeometric setting | Reason |
|---|---|---|
| \(\mathrm{Var}(I_i)\) | \(\dfrac{K}{N}\left(1-\dfrac{K}{N}\right)\) | \(I_i\) is Bernoulli with \(p=K/N\) marginally. |
| \(\mathrm{Cov}(I_i,I_j)\) for \(i\neq j\) | \(-\dfrac{K(N-K)}{N^2(N-1)}\) | Without replacement makes successes slightly less likely after a success (negative dependence). |
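A compact derivation of the covariance uses the probability that draws \(i\) and \(j\) are both successes: \[ \mathbb{E}[I_i I_j] = \Pr(I_i=1,\, I_j=1) = \frac{K}{N}\cdot\frac{K-1}{N-1}, \] so that \[ \mathrm{Cov}(I_i,I_j) = \mathbb{E}[I_i I_j] - \mathbb{E}[I_i]\,\mathbb{E}[I_j] = \frac{K(K-1)}{N(N-1)} - \left(\frac{K}{N}\right)^{2} = -\frac{K(N-K)}{N^2(N-1)}. \]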
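Substituting into the variance formula (and simplifying) yields the standard hypergeometric variance: \[ \mathrm{Var}(X) = n \cdot \frac{K}{N}\left(1-\frac{K}{N}\right) - n(n-1)\cdot\frac{K(N-K)}{N^2(N-1)} = n \cdot \frac{K}{N}\left(1-\frac{K}{N}\right)\cdot \frac{N-n}{N-1}. \]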
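The covariance and variance expressions can be cross-checked exactly on a small case. This is a sketch under stated assumptions: the function names are hypothetical, the enumeration is only practical for small \(N\), and it requires \(n \ge 2\) so that a pair of draws exists.

```python
from fractions import Fraction
from itertools import permutations


def exact_moments(N, K, n):
    """Return (Cov(I_1, I_2), Var(X)) by enumerating all ordered samples."""
    population = [1] * K + [0] * (N - K)  # 1 = success, 0 = failure
    samples = [tuple(population[i] for i in idx)
               for idx in permutations(range(N), n)]
    total = len(samples)

    def expect(f):
        # Exact expectation of f over the equally likely ordered samples.
        return Fraction(sum(f(s) for s in samples), total)

    cov = (expect(lambda s: s[0] * s[1])
           - expect(lambda s: s[0]) * expect(lambda s: s[1]))
    mean_x = expect(sum)
    var_x = expect(lambda s: sum(s) ** 2) - mean_x ** 2
    return cov, var_x


def formula_moments(N, K, n):
    """Closed-form covariance and variance from the derivation."""
    p = Fraction(K, N)
    cov = -Fraction(K * (N - K), N * N * (N - 1))
    var_x = n * p * (1 - p) * Fraction(N - n, N - 1)
    return cov, var_x


print(exact_moments(6, 2, 3))    # (Fraction(-2, 45), Fraction(2, 5))
print(formula_moments(6, 2, 3))  # (Fraction(-2, 45), Fraction(2, 5))
```

The negative covariance is what shrinks the variance below the binomial value \(np(1-p)\).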
5) Numerical example
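Suppose \(N=20\), \(K=7\), and \(n=5\). Then \(p=K/N=7/20=0.35\), so \[ \mathbb{E}[X] = 5 \cdot 0.35 = 1.75, \qquad \mathrm{Var}(X) = 5 \cdot 0.35 \cdot 0.65 \cdot \frac{15}{19} \approx 0.898. \]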
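The example parameters can also be checked against the full distribution. The sketch below assumes the standard hypergeometric pmf \(\Pr(X=k)=\binom{K}{k}\binom{N-K}{n-k}/\binom{N}{n}\), which the text above does not state explicitly; `hypergeom_pmf` is an illustrative helper name.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(N, K, n):
    """Exact pmf of X as a dict {k: P(X = k)} built from binomial coefficients."""
    denom = comb(N, n)
    lo, hi = max(0, n - (N - K)), min(n, K)  # support of X
    return {k: Fraction(comb(K, k) * comb(N - K, n - k), denom)
            for k in range(lo, hi + 1)}


N, K, n = 20, 7, 5
pmf = hypergeom_pmf(N, K, n)

# Moments computed directly from the pmf, without using linearity.
mean = sum(k * p for k, p in pmf.items())
var = sum(k * k * p for k, p in pmf.items()) - mean ** 2

print(mean, float(mean))  # 7/4 1.75
print(var)                # 273/304, about 0.898
```

The pmf-based mean matches \(n \cdot K/N\), and the pmf-based variance matches the formula with the factor \((N-n)/(N-1) = 15/19\).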
6) Visualization: “sum of indicators” view that explains linearity
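A small simulation can make the "sum of indicators" view concrete. The following is a minimal sketch, assuming the example values \(N=20\), \(K=7\), \(n=5\) and a text bar chart in place of a plot; the function name, trial count, and seed are arbitrary choices for illustration.

```python
import random


def simulate_indicator_means(N, K, n, trials, seed=0):
    """Estimate E[I_i] for each draw position, and E[X] as their sum."""
    rng = random.Random(seed)
    population = [1] * K + [0] * (N - K)  # 1 = success, 0 = failure
    position_totals = [0] * n
    for _ in range(trials):
        draws = rng.sample(population, n)  # ordered sample without replacement
        for i, indicator in enumerate(draws):
            position_totals[i] += indicator
    indicator_means = [t / trials for t in position_totals]
    # The sample mean of X is exactly the sum of the per-position means.
    return indicator_means, sum(indicator_means)


indicator_means, mean_x = simulate_indicator_means(N=20, K=7, n=5, trials=100_000)
for i, m in enumerate(indicator_means, start=1):
    bar = "#" * round(40 * m)  # one text bar per draw position
    print(f"I_{i}: {m:.3f} {bar}")
print(f"sum of indicator means = {mean_x:.3f} (n*K/N = {5 * 7 / 20:.3f})")
```

Each bar should sit near \(K/N = 0.35\) regardless of draw position, and stacking the five bars gives an estimate close to \(\mathbb{E}[X] = 1.75\).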
7) Summary of “linearity” for the hypergeometric distribution
- Expressing \(X\) as \(X=\sum_{i=1}^{n} I_i\) makes the mean immediate: \[ \mathbb{E}[X]=n \cdot \frac{K}{N}. \]
- Independence is unnecessary for the mean because linearity of expectation always holds.
- Dependence matters for the variance because covariance terms appear, producing the finite population correction: \[ \mathrm{Var}(X)=n \cdot \frac{K}{N}\left(1-\frac{K}{N}\right)\cdot \frac{N-n}{N-1}. \]