Geometric distribution: waiting time for the first success
The geometric distribution models repeated, independent trials where each trial is a success with probability \(p\)
and a failure with probability \(q=1-p\). The key question is a waiting-time question: How long until the first success?
Such questions arise in practice, for example “How many customers arrive before a sale?” or “How many attempts until a device works?”, provided the attempts are independent and each has the same success chance.
Two standard definitions
Textbooks use two closely related versions of the geometric random variable \(X\):
- Trials until first success: \(X\in\{1,2,3,\dots\}\) and \(X=k\) means the first success happens on trial \(k\).
Then the first \(k-1\) trials must be failures and the \(k\)-th must be a success, so
\[
P(X=k)=q^{k-1}p.
\]
- Failures before first success: \(X\in\{0,1,2,\dots\}\) and \(X=k\) means you see \(k\) failures and then a success.
In that case,
\[
P(X=k)=q^{k}p.
\]
The two formulas differ only in the exponent, so it’s important to pick the definition that matches your problem statement. For example, “first success on the third trial” is \(k=3\) in the first version but \(k=2\) in the second.
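As a minimal sketch of the two conventions, the Python function below evaluates either PMF. The function name `geometric_pmf` and the `counts_failures` flag are illustrative choices, not part of the tool.

```python
def geometric_pmf(k: int, p: float, counts_failures: bool = False) -> float:
    """P(X = k) for a geometric variable with success probability p.

    counts_failures=False: X = number of trials until the first success (k >= 1).
    counts_failures=True:  X = number of failures before the first success (k >= 0).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must satisfy 0 < p <= 1")
    start = 0 if counts_failures else 1
    if k < start:
        return 0.0  # outside the support of this version
    q = 1.0 - p
    exponent = k if counts_failures else k - 1  # number of failures observed
    return q ** exponent * p


# The same event under two labels: first success on trial 3
# is the same as 2 failures before the first success.
trials_version = geometric_pmf(3, 0.2)
failures_version = geometric_pmf(2, 0.2, counts_failures=True)
print(trials_version, failures_version)
```

Both calls describe two failures followed by a success, so they should print the same probability, \(0.8^{2}\cdot 0.2\).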
Cumulative probability and “at most” statements
Besides a single value \(P(X=k)\), you often want the cumulative probability \(P(X\le k)\), which answers “success occurs by time \(k\).”
For the trials-until-success version,
\[
P(X\le k)=1-q^{k},
\]
because \(q^{k}\) is the probability that the first \(k\) trials are all failures. For the failures-before-success version,
\[
P(X\le k)=1-q^{k+1}.
\]
These closed forms avoid summing many terms and make it easy to check intuition: as \(k\) grows, the cumulative probability approaches 1.
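The closed forms can be checked against a term-by-term sum. The following Python sketch does this for one arbitrary choice of \(p\) and \(k\); the helper name `geometric_cdf` and its `counts_failures` flag are assumptions made for illustration.

```python
def geometric_cdf(k: int, p: float, counts_failures: bool = False) -> float:
    """P(X <= k) from the closed form 1 - q**(number of trials covered)."""
    q = 1.0 - p
    # "X <= k" covers k trials in the trials version, k + 1 in the failures version.
    n_trials = k + 1 if counts_failures else k
    if n_trials <= 0:
        return 0.0
    return 1.0 - q ** n_trials


p, k = 0.3, 5
q = 1.0 - p

closed_form = geometric_cdf(k, p)
# Direct sum of the trials-until-success PMF for j = 1..k.
term_by_term = sum(q ** (j - 1) * p for j in range(1, k + 1))
# "At most 4 failures" is the same event as "success within 5 trials".
failures_form = geometric_cdf(k - 1, p, counts_failures=True)

print(closed_form, term_by_term, failures_form)
```

All three printed values should agree up to floating-point rounding.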
Mean, variance, and the memoryless property
The geometric distribution has simple moments. If \(X\) counts trials until the first success, then
\[
\mathbb{E}[X]=\frac{1}{p}, \qquad \mathrm{Var}(X)=\frac{1-p}{p^{2}}.
\]
If \(X\) counts failures before success, it equals the trials count minus 1, so the mean shifts down by 1 to \(\mathbb{E}[X]=\frac{1-p}{p}\), while the variance stays \(\frac{1-p}{p^{2}}\).
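One way to sanity-check these moments is to sum the series numerically. The sketch below truncates the infinite sums after a fixed number of terms, which is an approximation; the function name `moments_by_series` and the cutoff of 10,000 terms are illustrative assumptions.

```python
def moments_by_series(p: float, counts_failures: bool = False, terms: int = 10_000):
    """Approximate (mean, variance) by summing k*P(X=k) and k^2*P(X=k)."""
    q = 1.0 - p
    start = 0 if counts_failures else 1  # smallest value X can take
    mean = 0.0
    second_moment = 0.0
    for k in range(start, start + terms):
        prob = q ** (k - start) * p  # (k - start) failures, then a success
        mean += k * prob
        second_moment += k * k * prob
    return mean, second_moment - mean ** 2


p = 0.25
mean_trials, var_trials = moments_by_series(p)
mean_failures, var_failures = moments_by_series(p, counts_failures=True)

print(mean_trials, 1 / p)              # compare with 1/p
print(mean_failures, (1 - p) / p)      # compare with (1-p)/p
print(var_trials, var_failures, (1 - p) / p ** 2)  # both compare with (1-p)/p^2
```

With \(p=0.25\) the formulas give means of 4 and 3 and a common variance of 12, and the truncated sums should match these closely.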
A standout feature is that the geometric waiting time is memoryless:
\[
P(X>m+n \mid X>m)=P(X>n).
\]
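Informally, if the first \(m\) trials were all failures, the additional waiting time has the same distribution as the original one: past failures do not “use up” probability.
The identity above is stated for the trials-until-success version, where \(P(X>n)=q^{n}\); for the failures-before-success version it holds with \(\ge\) in place of \(>\).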
This property is one reason geometric models are used as discrete-time analogs of the exponential distribution in continuous time.
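The memoryless identity can be verified numerically for the trials-until-success version, where \(P(X>n)=q^{n}\). This is a small sketch with arbitrary values of \(p\), \(m\), and \(n\); the helper name `survival` is an illustrative choice.

```python
def survival(n: int, p: float) -> float:
    """P(X > n) for the trials-until-success version: the first n trials all fail."""
    return (1.0 - p) ** n


p, m, n = 0.2, 3, 4

# P(X > m+n | X > m) = P(X > m+n) / P(X > m), since {X > m+n} is contained in {X > m}.
conditional = survival(m + n, p) / survival(m, p)
unconditional = survival(n, p)

print(conditional, unconditional)
```

The two printed values should agree up to rounding, since \(q^{m+n}/q^{m}=q^{n}\).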
How to use this tool
Choose which definition of \(X\) you want (trials-until-success or failures-before-success), enter \(p\) and \(k\), and click Calculate.
The tool reports \(P(X=k)\), optionally \(P(X\le k)\), and the mean/variance with clear steps.
The visualization shows a trial sequence (failures then success) and a PMF bar chart you can pan and zoom.
Press Play to animate how the sequence unfolds and to sweep a highlighted \(k\) across the PMF, which helps build intuition for how the probabilities decay by powers of \(1-p\).