The Hypergeometric Probability Distribution

Use this when you sample without replacement from a finite population (so trials are dependent). Enter the population size N, the number of successes in the population r, the sample size n, and the number of observed successes x.

Quick setup: N = population size, r = successes in population, N − r = failures in population, n = sample size, x = successes in sample, n − x = failures in sample.

For cumulative probabilities (like “at most” / “at least”), we use the addition rule because the events X = 0, X = 1, … are mutually exclusive.

N: finite population size. For performance, keep N reasonably sized (e.g., ≤ 20000).
r: must satisfy 0 ≤ r ≤ N.
n: must satisfy 0 ≤ n ≤ N (sampling without replacement).
x: valid range depends on N, r, and n: max(0, n − (N − r)) ≤ x ≤ min(n, r).
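
The input constraints above can be checked before any probability is computed. The following is a minimal sketch, not the calculator's actual code; the function name `validate_inputs` and the requirement N ≥ 1 are assumptions made for illustration.

```python
def validate_inputs(N, r, n, x):
    """Return a list of problems with the inputs (an empty list means valid).

    Hypothetical helper: the N >= 1 requirement is an assumption;
    the other checks follow the constraints stated for r, n, and x.
    """
    errors = []
    if N < 1:
        errors.append("N must be at least 1")
    if not 0 <= r <= N:
        errors.append("r must satisfy 0 <= r <= N")
    if not 0 <= n <= N:
        errors.append("n must satisfy 0 <= n <= N")
    # The feasible range for x only makes sense once N, r, n are valid.
    if not errors:
        lo = max(0, n - (N - r))
        hi = min(n, r)
        if not lo <= x <= hi:
            errors.append(f"x must satisfy {lo} <= x <= {hi}")
    return errors


print(validate_inputs(20, 5, 4, 1))   # valid inputs
print(validate_inputs(20, 25, 4, 1))  # r exceeds N
```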

Frequently Asked Questions

What is the hypergeometric probability distribution used for?

It models the number of successes X in a sample of size n drawn without replacement from a finite population of size N that contains r successes. It is the standard model when draws are dependent because items are not replaced.

What is the hypergeometric probability formula for P(X = x)?

The formula is P(X = x) = [C(r, x) × C(N − r, n − x)] / C(N, n), where C(a, b) is the number of ways to choose b items from a. The numerator counts the samples with x successes and n − x failures; the denominator counts all samples of size n.
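
As an illustration, the formula can be evaluated directly with Python's `math.comb`. This is a sketch under stated assumptions: the function name `hypergeom_pmf` and the example numbers (N = 20, r = 5, n = 4, x = 1) are chosen here and do not come from the calculator.

```python
from math import comb


def hypergeom_pmf(x, N, r, n):
    """P(X = x) for a sample of size n drawn without replacement
    from a population of size N containing r successes."""
    # Outside the feasible range the probability is zero.
    if x < max(0, n - (N - r)) or x > min(n, r):
        return 0.0
    # Ways to pick x successes and n - x failures, over all samples of size n.
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)


# Illustrative numbers: C(5, 1) * C(15, 3) / C(20, 4) = 2275 / 4845
print(hypergeom_pmf(1, 20, 5, 4))
```

Because `math.comb` works with exact integers, the only rounding happens in the final division.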

What values of x are possible in a hypergeometric problem?

Not every integer is feasible. The support is max(0, n − (N − r)) ≤ x ≤ min(n, r).
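
The bounds can be computed in a couple of lines. In this sketch the name `hypergeom_support` and the example values are assumptions for illustration: with N = 10, r = 7, and n = 5 there are only 3 failures in the population, so a sample of 5 must contain at least 2 successes.

```python
def hypergeom_support(N, r, n):
    """Feasible values of x as a range object."""
    lo = max(0, n - (N - r))  # successes forced once the failures run out
    hi = min(n, r)            # cannot exceed the sample size or the successes available
    return range(lo, hi + 1)


print(list(hypergeom_support(10, 7, 5)))  # [2, 3, 4, 5]
```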

How do you calculate P(X <= x) or P(X >= x) for a hypergeometric distribution?

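Cumulative probabilities are sums of exact probabilities because the outcomes X = 0, 1, 2, ... are mutually exclusive. For example, P(X ≤ k) is the sum of P(X = x) over all feasible x up to k, and P(X ≥ k) is the sum over all feasible x from k upward, which equals 1 − P(X ≤ k − 1).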
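
A minimal sketch of the addition rule in code, summing exact probabilities over the feasible values of x. The function names `prob_at_most` and `prob_at_least` and the example numbers are assumptions for illustration.

```python
from math import comb


def hypergeom_pmf(x, N, r, n):
    """P(X = x); zero outside the feasible range."""
    if x < max(0, n - (N - r)) or x > min(n, r):
        return 0.0
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)


def prob_at_most(k, N, r, n):
    """P(X <= k): add P(X = x) for every feasible x up to k."""
    lo = max(0, n - (N - r))
    hi = min(n, r)
    return sum(hypergeom_pmf(x, N, r, n) for x in range(lo, min(k, hi) + 1))


def prob_at_least(k, N, r, n):
    """P(X >= k): add P(X = x) for every feasible x from k upward."""
    lo = max(0, n - (N - r))
    hi = min(n, r)
    return sum(hypergeom_pmf(x, N, r, n) for x in range(max(k, lo), hi + 1))


# Illustrative numbers: N = 20, r = 5, n = 4
print(prob_at_most(1, 20, 5, 4))   # P(X = 0) + P(X = 1) = 3640 / 4845
print(prob_at_least(2, 20, 5, 4))  # complement of the line above
```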

When is the binomial distribution a reasonable approximation to hypergeometric?

When the population is very large and the sample is a small fraction of the population, the hypergeometric model can be close to binomial with p approximately r/N. For finite populations with sampling without replacement, hypergeometric is the correct default.
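
To see how the quality of the approximation depends on the sampling fraction, the two probability functions can be compared value by value. This is a rough sketch; the helper name `max_pmf_gap` and both parameter sets are assumptions chosen for illustration, with p = 0.3 in each case.

```python
from math import comb


def hypergeom_pmf(x, N, r, n):
    """P(X = x) when sampling without replacement."""
    if x < max(0, n - (N - r)) or x > min(n, r):
        return 0.0
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)


def binom_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def max_pmf_gap(N, r, n):
    """Largest absolute difference between the two models over x = 0..n."""
    p = r / N
    return max(
        abs(hypergeom_pmf(x, N, r, n) - binom_pmf(x, n, p))
        for x in range(n + 1)
    )


# Sample is 0.1% of the population: the two models nearly agree.
large_population_gap = max_pmf_gap(10_000, 3_000, 10)
# Sample is half of the population: the binomial model is noticeably off.
small_population_gap = max_pmf_gap(20, 6, 10)

print(large_population_gap, small_population_gap)
```

In the second case, drawing 10 of only 20 items makes the draws strongly dependent, which is why the gap grows.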