What makes a discrete PMF valid?
A discrete random variable \(X\) takes values \(x\) from a finite or countably infinite set, and its distribution assigns a probability to each of those values.
The function \(p(x)=P(X=x)\) is called a probability mass function (PMF).
Unlike a continuous density, a PMF is a list (or table) of probability “masses” placed at specific points.
Because probabilities must obey the axioms of probability, a valid PMF has two essential properties:
- Non-negativity: \(p(x_i)\ge 0\) for every value \(x_i\).
- Total probability equals 1: \(\sum_i p(x_i)=1\).
The second condition means the PMF accounts for the entire sample space of outcomes. If your listed values
omit outcomes that have positive probability, the total will be less than 1; if you accidentally double-count or enter inconsistent
numbers, the total might exceed 1. Either way, the PMF is not valid as written.
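As a minimal sketch of the two conditions, the Python snippet below checks them exactly using the standard-library `Fraction` type, so no rounding is involved. The function name `is_valid_pmf` and the example lists are hypothetical, chosen only for illustration; they are not part of the validator itself.

```python
from fractions import Fraction

def is_valid_pmf(probs):
    """Exact check: every probability is >= 0 and the total equals 1."""
    return all(p >= 0 for p in probs) and sum(probs) == 1

# Hypothetical example data (exact rational probabilities).
complete = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]
partial = complete[:2]                          # one outcome missing
double_counted = complete + [Fraction(3, 10)]   # one outcome listed twice

print(is_valid_pmf(complete), sum(complete))              # True 1
print(is_valid_pmf(partial), sum(partial))                # False 7/10
print(is_valid_pmf(double_counted), sum(double_counted))  # False 13/10
```

The incomplete list totals less than 1 and the double-counted list totals more than 1, matching the two failure modes described above.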
Tolerance and rounding in real inputs
In practice, input values are often rounded (for example, \(0.3333\) instead of \(1/3\)), so three such entries sum to \(0.9999\) rather than exactly 1.
That’s why validators typically use a tolerance \(\mathrm{tol}\) and check whether
\(\left|\sum_i p_i - 1\right|\le \mathrm{tol}\) rather than demanding exact equality.
The same idea can apply to tiny negative values that occur from rounding or subtraction,
such as \(-10^{-12}\). If \(\mathrm{tol}\) is much larger than that magnitude, it may be reasonable to
clamp those tiny negatives to 0. However, genuinely negative probabilities (clearly below \(-\mathrm{tol}\))
indicate an invalid model.
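The tolerance check and optional clamping can be sketched in Python as follows. This is an illustration under stated assumptions, not the validator's actual code: the name `validate`, the default `tol=1e-9`, and the rule that a negative value fails whenever clamping is off are all choices made for this example.

```python
import math

def validate(probs, tol=1e-9, clamp=False):
    """Return (is_valid, total) for a list of float probabilities.

    Assumed policy: a value in [-tol, 0) is set to 0 when clamp=True;
    any other negative value makes the PMF invalid.
    """
    cleaned = []
    for p in probs:
        if p < 0:
            if clamp and p >= -tol:
                p = 0.0  # treat as rounding noise
            else:
                return False, math.fsum(probs)
        cleaned.append(p)
    total = math.fsum(cleaned)  # fsum limits accumulated rounding error
    return abs(total - 1.0) <= tol, total

rounded_thirds = [0.3333, 0.3333, 0.3333]
print(validate(rounded_thirds, tol=1e-3)[0])   # True: off by about 1e-4
print(validate(rounded_thirds, tol=1e-6)[0])   # False: tolerance too tight
print(validate([0.5, 0.5, -1e-12], clamp=True)[0])   # True after clamping
print(validate([0.6, 0.5, -0.1], clamp=True)[0])     # False: clearly negative
```

The same rounded input passes or fails depending on \(\mathrm{tol}\), which is why the tolerance should reflect how many digits the inputs were rounded to.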
Normalization: when it helps (and when it changes the model)
If all probabilities are non-negative and the total is positive but not equal to 1, you can sometimes normalize
by dividing each value by the total:
\[
p_i'=\frac{p_i}{\sum_j p_j}.
\]
This forces \(\sum_i p_i' = 1\). Normalization is useful when your inputs are really “weights” that were not yet scaled,
or when a PMF is incomplete due to truncation and you want to renormalize the remaining mass.
At the same time, it’s important to recognize that normalization changes the distribution:
the relative proportions stay the same, but every probability is rescaled by the same factor. If you expected the inputs to sum to 1 already,
normalization may hide an underlying input mistake, such as a missing outcome or a mistyped value.
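The normalization formula above can be sketched in a few lines of Python. The function name `normalize` and the example weights are hypothetical; the sketch assumes the inputs are non-negative weights with a positive total and raises an error otherwise.

```python
import math

def normalize(weights):
    """Divide each non-negative weight by the total so the result sums to 1."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = math.fsum(weights)
    if total <= 0:
        raise ValueError("total must be positive to normalize")
    return [w / total for w in weights]

weights = [2, 1, 1]          # unscaled weights, total 4
pmf = normalize(weights)
print(pmf)                   # [0.5, 0.25, 0.25]
print(pmf[0] / pmf[1])       # 2.0, same ratio as weights[0] / weights[1]
```

The ratio between any two entries is unchanged, but each individual value is divided by the original total, so the output should only be trusted when the inputs really were meant as relative weights.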
Extra: moments from a PMF
Once you have a valid (or normalized) PMF, you can compute summary quantities like the mean and variance:
\[
E[X]=\sum_i x_i p_i, \qquad E[X^2]=\sum_i x_i^2 p_i, \qquad \mathrm{Var}(X)=E[X^2]-\big(E[X]\big)^2.
\]
These formulas highlight why validity matters: if the probabilities don’t sum to 1 or include negatives, the sums above
are no longer true expectations, and the computed “variance” can even come out negative.
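As an illustration of the moment formulas, the following Python sketch computes the mean and variance from paired lists of values and probabilities. The name `pmf_moments` and the example data are assumptions made for this example, and the function deliberately performs no validity check so the effect of a bad input is visible.

```python
import math

def pmf_moments(xs, ps):
    """Return (mean, variance) using E[X] and E[X^2] from the PMF."""
    mean = math.fsum(x * p for x, p in zip(xs, ps))
    second_moment = math.fsum(x * x * p for x, p in zip(xs, ps))
    return mean, second_moment - mean ** 2

# Valid PMF on the values 0, 1, 2.
print(pmf_moments([0, 1, 2], [0.25, 0.5, 0.25]))  # (1.0, 0.5)

# Invalid input: a single "probability" of 2.
print(pmf_moments([1], [2.0]))                    # (2.0, -2.0)
```

The second call shows what goes wrong when the total is not 1: the formula still returns numbers, but a variance of \(-2\) cannot belong to any real distribution.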
How to use the validator
Enter pairs \((x,p)\) one per line (for example, “\(x=2, p=0.5\)” or “2 0.5”). Choose a tolerance \(\mathrm{tol}\),
and optionally enable clamping for tiny negatives. Click Calculate to see pass/fail checks,
the total sum, and (if enabled) the normalized PMF when the total isn’t 1. The interactive chart displays a “sum-to-1” meter
and PMF bars; press Play to animate how the bars and total fill in, and use pan/zoom to inspect large PMFs.
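For readers who want to reproduce the input step offline, here is one possible way to parse lines in the two formats mentioned above. It is a hypothetical sketch, not the validator's own parser: the name `parse_pairs` and the rule "take the two numbers found on each line" are assumptions made for this example.

```python
import re

# A signed decimal number with an optional exponent, e.g. 2, -0.5, 1e-12.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

def parse_pairs(text):
    """Parse one (x, p) pair per non-blank line; raise on malformed lines."""
    pairs = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        numbers = NUMBER.findall(line)
        if len(numbers) != 2:
            raise ValueError(
                f"line {line_no}: expected two numbers, found {len(numbers)}"
            )
        x, p = (float(n) for n in numbers)
        pairs.append((x, p))
    return pairs

print(parse_pairs("x=2, p=0.5\n3 0.5"))  # [(2.0, 0.5), (3.0, 0.5)]
```

Rejecting lines that do not contain exactly two numbers keeps a stray value from silently shifting which number is read as \(x\) and which as \(p\).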