Poisson statistical distribution
The Poisson distribution is a discrete probability distribution for a random count \(X\) of events occurring in a fixed interval of time, distance, area, or volume. The model is appropriate when events occur independently, the average rate is constant within the interval, and simultaneous events are negligible at the chosen measurement resolution.
The single parameter \(\lambda>0\) is the expected number of events in the interval. The probability of observing exactly \(k\) events is proportional to \(\lambda^k/k!\), so it rises with \(k\) while \(k<\lambda\) and is then driven toward zero by the factorial.
Probability mass function
A Poisson-distributed random variable \(X\sim\text{Poisson}(\lambda)\) takes values \(k=0,1,2,\dots\) with probability mass function (PMF)
\[ P(X=k)=\frac{e^{-\lambda}\lambda^k}{k!}, \qquad k=0,1,2,\dots,\ \lambda>0 \]
Because \(\sum_{k=0}^{\infty}\lambda^k/k!=e^{\lambda}\), the factor \(e^{-\lambda}\) normalizes the PMF so that the probabilities sum to 1 over all nonnegative integers.
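The PMF can be evaluated directly from the formula. The following is a minimal sketch using only the Python standard library; the function name `poisson_pmf` is an illustrative choice, and the log-space evaluation is one possible way to avoid overflow for large \(k\), not the only one.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam), assuming lam > 0."""
    if k < 0:
        return 0.0
    # log P(X = k) = -lam + k*log(lam) - log(k!), with log(k!) = lgamma(k + 1)
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

# Partial sum over k = 0..59; for lam = 2 the omitted tail is negligible,
# so the result should be very close to 1.
total = sum(poisson_pmf(k, 2.0) for k in range(60))
print(total)
```

Summing the PMF over a long enough range of \(k\) is a quick sanity check that the normalization holds numerically.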
Core properties and interpretation of \(\lambda\)
| Property | Statement | Interpretation |
|---|---|---|
| Support | \(k\in\{0,1,2,\dots\}\) | Counts only |
| Mean | \(E[X]=\lambda\) | Average count per interval |
| Variance | \(\operatorname{Var}(X)=\lambda\) | Dispersion matches the mean under the model |
| Standard deviation | \(\sigma=\sqrt{\lambda}\) | Typical fluctuation scale |
| Additivity | If \(X_1\sim\text{Poisson}(\lambda_1)\), \(X_2\sim\text{Poisson}(\lambda_2)\) independent, then \(X_1+X_2\sim\text{Poisson}(\lambda_1+\lambda_2)\) | Combining independent intervals adds expected counts |
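
The mean, variance, and additivity statements in the table can be checked numerically. The sketch below truncates the infinite sums at a hypothetical cutoff `K`, which is adequate only for small \(\lambda\); the helper names are illustrative.

```python
import math

def pmf(k, lam):
    """Poisson PMF written directly from the formula."""
    return math.exp(-lam) * lam**k / math.factorial(k)

K = 80  # truncation point; the tail beyond K is negligible for the small lam used here

def moments(lam):
    """Mean and variance computed from the truncated PMF."""
    mean = sum(k * pmf(k, lam) for k in range(K))
    var = sum((k - mean) ** 2 * pmf(k, lam) for k in range(K))
    return mean, var

def sum_pmf(k, lam1, lam2):
    """P(X1 + X2 = k) for independent X1, X2 via discrete convolution."""
    return sum(pmf(j, lam1) * pmf(k - j, lam2) for j in range(k + 1))

mean, var = moments(3.0)
conv = sum_pmf(4, 1.5, 2.5)   # convolution of Poisson(1.5) and Poisson(2.5)
direct = pmf(4, 4.0)          # Poisson(1.5 + 2.5) evaluated directly
print(mean, var, conv, direct)
```

Under these assumptions, `mean` and `var` should both be close to 3, and `conv` should match `direct` up to floating-point error.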
Computing probabilities
Point probability
A single-count probability follows directly from the PMF. For example, if \(\lambda=2\) events per interval, the probability of exactly \(5\) events is
\[ P(X=5)=\frac{e^{-2}\cdot 2^5}{5!} =\frac{e^{-2}\cdot 32}{120} \approx 0.0360894 \]
Cumulative probability
Cumulative probabilities are sums of PMF values. For \(\lambda=2\),
\[ P(X\le 2)=P(X=0)+P(X=1)+P(X=2) =e^{-2}\left(1+\frac{2^1}{1!}+\frac{2^2}{2!}\right) \approx 0.6766764 \]
Right-tail probabilities use complements, such as \(P(X\ge 5)=1-P(X\le 4)\). With \(\lambda=2\), the sum \(\sum_{k=0}^{4}2^k/k!\) equals 7, so \[ P(X\ge 5)=1-7e^{-2}\approx 0.0526530 \]
| \(\lambda\) | Quantity | Expression | Approximate value |
|---|---|---|---|
| 2 | \(P(X=0)\) | \(e^{-2}\) | 0.1353353 |
| 2 | \(P(X=5)\) | \(e^{-2}\cdot 2^5/5!\) | 0.0360894 |
| 2 | \(P(X\le 2)\) | \(\sum_{k=0}^{2} e^{-2}2^k/k!\) | 0.6766764 |
| 2 | \(P(X\ge 5)\) | \(1-\sum_{k=0}^{4} e^{-2}2^k/k!\) | 0.0526530 |
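
The point, cumulative, and right-tail quantities for \(\lambda=2\) can be recomputed with a few lines of standard-library Python. This is a sketch with illustrative function names; the printed values should agree with the table to the displayed precision.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k) as a finite sum of PMF values."""
    return sum(poisson_pmf(j, lam) for j in range(k + 1))

def poisson_tail_ge(k, lam):
    """P(X >= k) via the complement 1 - P(X <= k - 1)."""
    return 1.0 - poisson_cdf(k - 1, lam)

lam = 2.0
print(f"P(X=0)  = {poisson_pmf(0, lam):.7f}")
print(f"P(X=5)  = {poisson_pmf(5, lam):.7f}")
print(f"P(X<=2) = {poisson_cdf(2, lam):.7f}")
print(f"P(X>=5) = {poisson_tail_ge(5, lam):.7f}")
```

Note that the complement uses \(k-1\), not \(k\), because the count is integer-valued: \(P(X\ge 5)\) excludes the outcomes 0 through 4 only.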
[Figure: Poisson PMF for different values of \(\lambda\)]
Rate and interval length
A common parameterization separates a rate \(r\) (events per unit time) and an exposure length \(t\). The expected count in the interval is \(\lambda=rt\). For example, a rate of \(r=2\) calls per minute over \(t=3\) minutes gives \(\lambda=6\), so the total call count in 3 minutes is modeled by \(\text{Poisson}(6)\) under constant-rate, independent-occurrence conditions.
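The call-count example can be sketched in code as follows. The helper `expected_count` and the choice to evaluate the probability of exactly 6 calls are illustrative assumptions, not part of the original example.

```python
import math

def expected_count(rate, exposure):
    """lambda = r * t: expected events for a constant rate over an exposure."""
    return rate * exposure

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = expected_count(rate=2.0, exposure=3.0)  # 2 calls/min over 3 min -> lambda = 6
p_exactly_6 = poisson_pmf(6, lam)             # probability of exactly 6 calls in 3 minutes
print(lam, p_exactly_6)
```

The rate and the exposure must use the same unit (here, minutes); mixing units silently rescales \(\lambda\).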
Estimating \(\lambda\) from data
For independent observations \(x_1,\dots,x_n\) from \(\text{Poisson}(\lambda)\), each covering an interval of the same length, the maximum likelihood estimator of \(\lambda\) is the sample mean:
\[ \widehat{\lambda}=\frac{1}{n}\sum_{i=1}^{n} x_i \]
With unequal exposures \(t_i\) and counts \(x_i\), a rate estimate uses total counts divided by total exposure, \[ \widehat{r}=\frac{\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} t_i}, \qquad \widehat{\lambda}(t)=\widehat{r}\,t \]
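Both estimators reduce to simple ratios. The sketch below uses made-up counts and exposures purely for illustration; the function names are likewise hypothetical.

```python
def estimate_lambda(counts):
    """Sample mean of counts observed over equal-length intervals."""
    return sum(counts) / len(counts)

def estimate_rate(counts, exposures):
    """Total events divided by total exposure, for unequal exposures."""
    return sum(counts) / sum(exposures)

# Hypothetical data: five observation windows with differing lengths.
counts = [3, 1, 4, 2, 0]
exposures = [2.0, 1.0, 3.0, 1.5, 0.5]

lam_hat = estimate_lambda(counts)         # meaningful only if the windows were equal length
r_hat = estimate_rate(counts, exposures)  # events per unit exposure
lam_hat_t4 = r_hat * 4.0                  # predicted expected count for an exposure of t = 4
print(lam_hat, r_hat, lam_hat_t4)
```

The rate estimate pools counts before dividing; averaging the per-window ratios \(x_i/t_i\) instead would give short windows too much weight.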
Connections and approximations
The Poisson distribution arises as a limiting form of the binomial distribution when \(n\) is large, \(p\) is small, and \(\lambda=np\) is of moderate size. In that regime, \[ \text{Binomial}(n,p)\approx \text{Poisson}(\lambda=np) \]
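A quick numerical comparison illustrates the limit. The sketch below holds \(\lambda=np\) fixed at 2 and increases \(n\); the largest pointwise PMF difference over \(k=0,\dots,20\) is used as a rough, illustrative measure of agreement, not a formal error bound.

```python
import math

def binom_pmf(k, n, p):
    """P(Y = k) for Y ~ Binomial(n, p); zero outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def max_abs_diff(n, lam, kmax=20):
    """Largest |Binomial - Poisson| PMF gap over k = 0..kmax, with p = lam / n."""
    p = lam / n
    return max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(kmax + 1))

diffs = {n: max_abs_diff(n, 2.0) for n in (10, 100, 1000)}
for n, d in diffs.items():
    print(n, d)
```

The gap should shrink as \(n\) grows with \(np\) held fixed, consistent with the limiting statement.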
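For large \(\lambda\), the Poisson distribution becomes approximately normal: \[ X \approx N(\lambda,\lambda) \] Because \(X\) is integer-valued, a continuity correction, \(P(X\le k)\approx\Phi\!\left(\frac{k+0.5-\lambda}{\sqrt{\lambda}}\right)\) with \(\Phi\) the standard normal CDF, generally improves the approximation.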
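The effect of the continuity adjustment can be checked numerically. The sketch below compares the exact Poisson CDF with the normal approximation with and without a half-unit shift; \(\lambda=100\) and \(k=110\) are arbitrary illustrative values.

```python
import math

def poisson_cdf(k, lam):
    """Exact P(X <= k), accumulating terms iteratively to avoid huge factorials."""
    term = math.exp(-lam)  # P(X = 0)
    total = term
    for j in range(1, k + 1):
        term *= lam / j    # P(X = j) = P(X = j - 1) * lam / j
        total += term
    return total

def normal_cdf(x, mean, var):
    """CDF of N(mean, var) expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))

lam, k = 100.0, 110
exact = poisson_cdf(k, lam)
plain = normal_cdf(k, lam, lam)            # no continuity correction
corrected = normal_cdf(k + 0.5, lam, lam)  # half-unit continuity correction
print(exact, plain, corrected)
```

For these values the corrected approximation should land closer to the exact probability than the uncorrected one.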
Common pitfalls
| Issue | What changes | Typical symptom in data |
|---|---|---|
| Non-constant rate within the interval | \(\lambda\) varies with time or space | Systematic patterns by time-of-day or location |
| Dependence or clustering of events | Independence fails | Variance larger than the mean (overdispersion) |
| Simultaneous events at the measurement resolution | Events arrive in batches rather than one at a time | Excess mass at higher counts compared with Poisson |
| Excess zeros (structural zeros) | Mixture mechanisms beyond Poisson | Observed zeros far exceed Poisson predictions |
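
Two of these symptoms, overdispersion and excess zeros, can be screened with simple descriptive checks. The sketch below is a rough diagnostic on made-up counts, not a formal hypothesis test; the function names and data are hypothetical.

```python
import math
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean; near 1 under a Poisson model."""
    return variance(counts) / mean(counts)

def zero_excess(counts):
    """Observed fraction of zeros minus the Poisson prediction exp(-mean)."""
    observed = sum(1 for c in counts if c == 0) / len(counts)
    return observed - math.exp(-mean(counts))

# Hypothetical counts: mostly zeros with a few large values.
counts = [0, 0, 0, 0, 0, 0, 7, 9, 0, 8]

print(dispersion_index(counts))  # well above 1 suggests overdispersion
print(zero_excess(counts))       # well above 0 suggests more zeros than Poisson predicts
```

With small samples both statistics are noisy, so values moderately different from 1 and 0 are weak evidence on their own.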