Hypergeometric distribution: sampling without replacement
The hypergeometric distribution models the number of “successes” you obtain when you draw items
without replacement from a finite population. Imagine a population of size \(N\) that contains \(K\) successes
(for example, \(K\) hearts in a deck, or \(K\) defective parts in a batch). If you draw \(n\) items at random, the random
variable \(X\) counts how many of the drawn items are successes.
Probability mass function (PMF)
To have exactly \(k\) successes in the sample, you must choose \(k\) items from the \(K\) successes and
\(n-k\) items from the \(N-K\) failures. The total number of equally likely samples of size \(n\) is \(\binom{N}{n}\).
Therefore the PMF is
\[
P(X=k)=\frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}.
\]
Not every \(k\) is possible. Feasibility requires
\[
\max\!\bigl(0,\; n-(N-K)\bigr)\le k \le \min(n,\;K),
\]
The upper bound holds because you cannot draw more successes than exist (\(k\le K\)) or more than the sample size (\(k\le n\)).
The lower bound holds because the remaining \(n-k\) items must all come from the \(N-K\) failures, so \(n-k\le N-K\).
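As a rough illustration of the formula and the feasibility bounds above, the PMF can be sketched in a few lines of Python. The function name `hypergeom_pmf` and its argument order are assumptions made for this example, not the calculator's actual implementation.

```python
from math import comb

def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k) for n draws without replacement from N items, K of them successes.

    Illustrative helper (hypothetical name), not the calculator's own code.
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("require 0 <= K <= N and 0 <= n <= N")
    # Outside the feasible range max(0, n-(N-K)) <= k <= min(n, K) the probability is zero.
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    # Ways to pick k successes and n-k failures, divided by all samples of size n.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Example: exactly 2 hearts in a 5-card hand from a 52-card deck with 13 hearts.
p_two_hearts = hypergeom_pmf(2, N=52, K=13, n=5)
print(f"P(X=2) = {p_two_hearts:.4f}")
```

For the card example the result is about 0.274, and summing the function over \(k=0,\dots,5\) should give 1 up to floating-point rounding.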
Mean and variance
The expected number of successes equals the sample size times the population success fraction:
\[
\mathbb{E}[X]=n\frac{K}{N}.
\]
The variance includes a finite population correction because draws are dependent when you do not replace items:
\[
\mathrm{Var}(X)=n\frac{K}{N}\left(1-\frac{K}{N}\right)\frac{N-n}{N-1}.
\]
When \(N\) is very large compared to \(n\), the correction \(\frac{N-n}{N-1}\) is close to 1, and the distribution is well approximated by a binomial distribution with \(n\) trials and success probability \(p=K/N\).
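One way to sanity-check the mean and variance formulas is to compute both moments directly from the PMF and compare them with the closed forms. The sketch below does this; the helper names are hypothetical, and it assumes \(N>1\) so that the correction factor is defined.

```python
from math import comb

def pmf(k, N, K, n):
    # Hypergeometric PMF; callers only pass feasible k values here.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def moments_from_pmf(N, K, n):
    """Mean and variance obtained by summing over the feasible support."""
    lo, hi = max(0, n - (N - K)), min(n, K)
    support = range(lo, hi + 1)
    mean = sum(k * pmf(k, N, K, n) for k in support)
    var = sum((k - mean) ** 2 * pmf(k, N, K, n) for k in support)
    return mean, var

def moments_closed_form(N, K, n):
    """Mean n*K/N and variance with the finite population correction (assumes N > 1)."""
    p = K / N
    return n * p, n * p * (1 - p) * (N - n) / (N - 1)

# Example: 5 cards from a 52-card deck, counting hearts (K = 13).
print(moments_from_pmf(52, 13, 5))
print(moments_closed_form(52, 13, 5))
```

The two printed pairs should agree up to floating-point rounding; for this example the mean is 1.25.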
Why “without replacement” matters
In a binomial model, trials are independent: the success probability stays constant from trial to trial. In a hypergeometric model,
the probability changes after each draw because the population composition changes. This is exactly what happens when you draw cards from a deck
or sample items from a shipment without putting them back.
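To make the changing probability concrete, the following sketch multiplies the per-draw success probabilities for a run of consecutive successes, once with replacement (binomial setting) and once without (hypergeometric setting). The function name and the `replace` flag are illustrative choices, and the code assumes \(n\le N\).

```python
from fractions import Fraction

def prob_all_successes(N: int, K: int, n: int, replace: bool) -> Fraction:
    """Probability that n consecutive draws are all successes (assumes n <= N)."""
    prob = Fraction(1)
    successes, total = K, N
    for _ in range(n):
        # Chance that the next draw is a success given the current composition.
        prob *= Fraction(successes, total)
        if not replace:
            # Without replacement, the drawn success leaves the population.
            successes -= 1
            total -= 1
    return prob

# Two hearts in a row from a 52-card deck with 13 hearts.
print(prob_all_successes(52, 13, 2, replace=True))   # (13/52) * (13/52) = 1/16
print(prob_all_successes(52, 13, 2, replace=False))  # (13/52) * (12/51) = 1/17
```

The second draw has success probability 12/51 rather than 13/52 once a heart has been removed, which is why the two results differ.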
How to use this tool
Enter \(N\) (population size), \(K\) (successes in the population), \(n\) (draw size), and the target \(k\).
The calculator returns \(P(X=k)\) and, if enabled, cumulative values like \(P(X\le k)\) and the tail \(P(X\ge k)\).
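The cumulative values are sums of the PMF over part of the feasible range. A minimal sketch, using hypothetical function names `cdf` and `upper_tail` rather than anything taken from the tool itself:

```python
from math import comb

def pmf(k, N, K, n):
    # Zero outside the feasible range max(0, n-(N-K)) <= k <= min(n, K).
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def cdf(k, N, K, n):
    """P(X <= k): sum the PMF from 0 up to k (capped at n)."""
    return sum(pmf(j, N, K, n) for j in range(0, min(k, n) + 1))

def upper_tail(k, N, K, n):
    """P(X >= k): sum the PMF from k (floored at 0) up to n."""
    return sum(pmf(j, N, K, n) for j in range(max(k, 0), n + 1))

# Example: at most 1 heart, and at least 2 hearts, in a 5-card hand.
print(cdf(1, 52, 13, 5), upper_tail(2, 52, 13, 5))
```

Because \(P(X\le k)\) and \(P(X\ge k)\) both include the term \(P(X=k)\), they sum to \(1+P(X=k)\) rather than 1; the complementary pair is \(P(X\le k-1)\) and \(P(X\ge k)\).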
The interactive visualization shows an urn-style population view and the PMF bar chart.
After you calculate, press Play to animate drawing \(n\) items without replacement and compare the simulated success count
with the theoretical PMF. Related advanced topics include the multivariate hypergeometric distribution (more than two categories)
and the binomial and normal approximations for large populations.
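The comparison between simulated draws and the theoretical PMF can also be reproduced offline. The sketch below is one possible way to do it, not a description of how the animation works internally: it represents the population as a list of 1s (successes) and 0s (failures) and uses Python's `random.sample`, which draws without replacement. The function name, the trial count, and the seed are arbitrary choices for this example.

```python
import random
from collections import Counter
from math import comb

def simulate_success_counts(N, K, n, trials, seed=0):
    """Repeat 'draw n items without replacement' and tally the success counts."""
    rng = random.Random(seed)             # fixed seed so runs are repeatable
    population = [1] * K + [0] * (N - K)  # 1 = success, 0 = failure
    return Counter(sum(rng.sample(population, n)) for _ in range(trials))

def pmf(k, N, K, n):
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Compare empirical frequencies with the theoretical PMF for the hearts example.
N, K, n, trials = 52, 13, 5, 20_000
counts = simulate_success_counts(N, K, n, trials)
for k in range(max(0, n - (N - K)), min(n, K) + 1):
    print(f"k={k}: simulated {counts[k] / trials:.4f}, theoretical {pmf(k, N, K, n):.4f}")
```

With more trials the simulated frequencies should move closer to the theoretical values, although any single run will show some random deviation.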