Runs Test for Randomness
The runs test is a nonparametric procedure used to check whether the order of observations in a sequence is
“random-like.” It does not test normality or independence directly; instead, it evaluates whether the sequence shows
too much clustering (too few runs) or too much alternation (too many runs) compared with what we would
expect under randomness.
What is a run?
A run is a maximal block of identical symbols occurring consecutively. For example, the sequence
\[
+\;+\;-\;-\;-\;+\;-\;-\;+\;+\;+
\]
has runs: \((++)\), \((---)\), \((+)\), \((--)\), \((+++)\), so the number of runs is \(R=5\).
Inputs and sequence construction
The test requires a binary sequence of “+” and “−”. This calculator supports two common ways to build it:
- Binary mode: You enter the symbols directly (e.g., H/T, 0/1, +/−, or custom tokens).
  The calculator maps your chosen symbols into “+” and “−”.
- Numeric mode: You enter a numeric series \(x_1,x_2,\dots,x_n\) and convert it into a binary sequence using a
  cutoff \(c\):
  - Median rule: \(c = \mathrm{median}(x)\)
  - Threshold rule: \(c\) is a user-chosen value
Then define
\[
\begin{aligned}
S_i &=
\begin{cases}
+, & x_i > c, \\
-, & x_i < c.
\end{cases}
\end{aligned}
\]
If \(x_i=c\), you may drop that observation (recommended) or force it into “+” or “−”.
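The numeric-mode conversion can be sketched in Python as follows. This is a minimal illustration, not the calculator's actual code: the function name `binarize` and its `cutoff` and `ties` parameters are hypothetical names chosen for this example.

```python
from statistics import median


def binarize(values, cutoff=None, ties="drop"):
    """Convert a numeric series into '+'/'-' symbols relative to a cutoff.

    cutoff=None applies the median rule; any other value applies the
    threshold rule. `ties` decides what happens when x equals the cutoff:
    "drop" removes the observation, "+" or "-" forces it into that class.
    (All names here are illustrative assumptions.)
    """
    if ties not in ("drop", "+", "-"):
        raise ValueError("ties must be 'drop', '+', or '-'")
    c = median(values) if cutoff is None else cutoff
    symbols = []
    for x in values:
        if x > c:
            symbols.append("+")
        elif x < c:
            symbols.append("-")
        elif ties != "drop":
            symbols.append(ties)  # tie forced into the chosen class
    return symbols


# Median rule: the median of this series is 3, so both 3s are dropped.
print(binarize([1, 5, 3, 7, 3, 2, 9]))
```

Dropping ties shortens the sequence, so \(N\) in the formulas that follow refers to the number of symbols left after conversion, not the number of raw observations.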
Notation
After conversion, let:
\[
\begin{aligned}
n_+ &= \#\{i : S_i=+\}, \\
n_- &= \#\{i : S_i=-\}, \\
N &= n_+ + n_-.
\end{aligned}
\]
The test requires both categories to appear (\(n_+>0\) and \(n_->0\)).
Hypotheses
The null hypothesis is that the sequence is random-like (with fixed counts \(n_+\) and \(n_-\)). The alternative can
be selected depending on the pattern you suspect:
\[
\begin{aligned}
H_0 &: \text{The sequence order is random-like (in terms of runs)} \\
H_1 &: \text{The sequence is not random-like (two-sided)} \\
H_1 &: \text{Too few runs (clustering)} \\
H_1 &: \text{Too many runs (oscillation)}
\end{aligned}
\]
Counting the number of runs
The number of runs \(R\) can be computed by counting transitions:
\[
\begin{aligned}
R &= 1 + \sum_{i=2}^{N} \mathbf{1}(S_i \ne S_{i-1}).
\end{aligned}
\]
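The transition-counting formula translates almost directly into code. The sketch below assumes the sequence is already binary; `count_runs` is a hypothetical helper name, and returning 0 for an empty sequence is a convention chosen here rather than something the formula defines.

```python
def count_runs(symbols):
    """Count runs as 1 plus the number of positions where the symbol changes."""
    if len(symbols) == 0:
        return 0  # convention for this sketch; the formula assumes N >= 1
    changes = sum(1 for prev, cur in zip(symbols, symbols[1:]) if cur != prev)
    return 1 + changes


# The example sequence from "What is a run?" has 5 runs.
print(count_runs("++---+--+++"))  # 5
```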
Normal approximation (z method)
When the sample size is moderate or large, a normal approximation is commonly used. Under \(H_0\), the expected number
of runs and the variance are:
\[
\begin{aligned}
\mu_R &= \frac{2n_+ n_-}{n_+ + n_-} + 1, \\
\sigma_R^2 &= \frac{2n_+ n_-(2n_+ n_- - n_+ - n_-)}{(n_+ + n_-)^2 (n_+ + n_- - 1)}.
\end{aligned}
\]
The standardized statistic is
\[
\begin{aligned}
z &= \frac{R - \mu_R}{\sigma_R}.
\end{aligned}
\]
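As a worked illustration (not calculator output), take the example sequence from the “What is a run?” section, which has \(n_+=6\), \(n_-=5\), \(N=11\), and \(R=5\):
\[
\begin{aligned}
\mu_R &= \frac{2\cdot 6\cdot 5}{11} + 1 \approx 6.455, \\
\sigma_R^2 &= \frac{2\cdot 6\cdot 5\,(2\cdot 6\cdot 5 - 6 - 5)}{11^2\cdot 10} = \frac{2940}{1210} \approx 2.430, \\
z &= \frac{5 - 6.455}{\sqrt{2.430}} \approx -0.933.
\end{aligned}
\]
The observed run count sits a little under one standard deviation below its expected value. With only 11 symbols, the normal approximation is rough, so this serves to illustrate the arithmetic rather than to recommend the z method for a sample this small.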
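A continuity correction is sometimes applied because \(R\) is discrete while the normal distribution is continuous.
The calculator provides the common \(\pm 0.5\) adjustment for the selected tail, which typically moves \(R\) half a
unit toward \(\mu_R\) before standardizing.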
The p-value is computed using the standard normal CDF \(\Phi(\cdot)\):
\[
\begin{aligned}
p\text{-value} &=
\begin{cases}
2\left(1-\Phi(|z|)\right), & \text{two-sided}, \\
\Phi(z), & \text{few runs (left-tail)}, \\
1-\Phi(z), & \text{many runs (right-tail)}.
\end{cases}
\end{aligned}
\]
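The z method can be sketched with the standard library alone. This is an illustration under stated assumptions, not the calculator's implementation: the names `normal_cdf` and `runs_test_z` and the labels `"two-sided"`, `"few"`, and `"many"` are hypothetical, and the continuity correction is deliberately left out.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal CDF, Phi(z), expressed through the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def runs_test_z(runs, n_plus, n_minus, alternative="two-sided"):
    """Return (z, p_value) from the normal approximation, uncorrected."""
    if n_plus <= 0 or n_minus <= 0:
        raise ValueError("both symbols must appear at least once")
    n = n_plus + n_minus
    prod = 2 * n_plus * n_minus
    mu = prod / n + 1                            # expected number of runs
    var = prod * (prod - n) / (n**2 * (n - 1))   # variance of the run count
    if var <= 0:
        raise ValueError("variance is zero; the sequence is too short")
    z = (runs - mu) / sqrt(var)
    if alternative == "two-sided":
        p = 2 * (1 - normal_cdf(abs(z)))
    elif alternative == "few":                   # left tail: clustering
        p = normal_cdf(z)
    elif alternative == "many":                  # right tail: oscillation
        p = 1 - normal_cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'few', or 'many'")
    return z, p


# Example sequence from "What is a run?": R = 5, n_+ = 6, n_- = 5.
z, p = runs_test_z(5, 6, 5)
print(f"z = {z:.4f}, two-sided p = {p:.4f}")
```

Using `erf` avoids a dependency on a statistics package; a library routine for the normal CDF would serve equally well.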
Exact method (small samples)
For smaller \(N\), an exact p-value can be obtained from the exact distribution of \(R\) given \(n_+\) and \(n_-\).
Under \(H_0\), all sequences with \(n_+\) pluses and \(n_-\) minuses are equally likely, and the total number of such
sequences is
\[
\begin{aligned}
\#\{\text{sequences}\} &= \binom{N}{n_+}.
\end{aligned}
\]
The exact p-value is the probability (under \(H_0\)) of observing a run count at least as extreme as the observed
\(R\), according to the chosen alternative:
\[
\begin{aligned}
p\text{-value}
&=
\frac{\sum_{r \in \mathcal{T}} \#\{\text{sequences with } r \text{ runs}\}}
{\binom{N}{n_+}},
\end{aligned}
\]
\[
\begin{aligned}
\mathcal{T}
&=
\begin{cases}
\{r: r \le R\}, & \text{few runs}, \\
\{r: r \ge R\}, & \text{many runs}, \\
\{r: |r-\mu_R| \ge |R-\mu_R|\}, & \text{two-sided}.
\end{cases}
\end{aligned}
\]
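One way to evaluate the numerator is with the standard closed-form count of arrangements having exactly \(r\) runs. That counting formula is not stated in this text and is brought in here as an assumption for the sketch; the function names and alternative labels are likewise hypothetical. For an even count \(r=2k\) there are \(2\binom{n_+-1}{k-1}\binom{n_--1}{k-1}\) arrangements, and for an odd count \(r=2k+1\) there are \(\binom{n_+-1}{k}\binom{n_--1}{k-1}+\binom{n_+-1}{k-1}\binom{n_--1}{k}\).

```python
from fractions import Fraction
from math import comb


def sequences_with_runs(r, n_plus, n_minus):
    """Count arrangements of n_plus '+' and n_minus '-' with exactly r runs."""
    if r < 2:
        return 0  # with both symbols present there are at least 2 runs
    k = r // 2
    a, b = n_plus - 1, n_minus - 1
    if r % 2 == 0:
        # r = 2k: k runs of each symbol, and either symbol may come first
        return 2 * comb(a, k - 1) * comb(b, k - 1)
    # r = 2k + 1: one symbol forms k + 1 runs, the other forms k
    return comb(a, k) * comb(b, k - 1) + comb(a, k - 1) * comb(b, k)


def exact_runs_pvalue(runs, n_plus, n_minus, alternative="two-sided"):
    """Exact p-value for the run count, conditional on n_plus and n_minus."""
    if n_plus <= 0 or n_minus <= 0:
        raise ValueError("both symbols must appear at least once")
    n = n_plus + n_minus
    # Exact rational mean so ties in the two-sided comparison are not lost
    mu = Fraction(2 * n_plus * n_minus, n) + 1
    if alternative == "few":
        in_tail = lambda r: r <= runs
    elif alternative == "many":
        in_tail = lambda r: r >= runs
    elif alternative == "two-sided":
        in_tail = lambda r: abs(r - mu) >= abs(runs - mu)
    else:
        raise ValueError("alternative must be 'two-sided', 'few', or 'many'")
    favourable = sum(
        sequences_with_runs(r, n_plus, n_minus)
        for r in range(2, n + 1)
        if in_tail(r)
    )
    return favourable / comb(n, n_plus)


# Example sequence from "What is a run?": R = 5, n_+ = 6, n_- = 5.
print(exact_runs_pvalue(5, 6, 5, "few"))  # 121/462, about 0.262
```

The mean is kept as a `Fraction` because the two-sided tail set depends on comparing distances from \(\mu_R\), and floating-point rounding could drop a value of \(r\) that lies exactly as far from \(\mu_R\) as the observed \(R\).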
Decision and interpretation
With significance level \(\alpha\):
\[
\begin{aligned}
\text{Reject } H_0 &\text{ if } p\text{-value} \le \alpha, \\
\text{otherwise } &\text{fail to reject } H_0.
\end{aligned}
\]
Practical interpretation:
- Too few runs \(\Rightarrow\) values tend to cluster (long streaks), suggesting non-random grouping.
- Too many runs \(\Rightarrow\) values alternate frequently, suggesting oscillation or over-regularity.
- Near expected runs \(\Rightarrow\) no evidence against randomness based on runs.
Reminder: “Fail to reject” does not prove randomness; it means the run count is not unusually extreme at the chosen
\(\alpha\).