The Sign Test (Theory)
The sign test is a simple nonparametric method for testing a median using only the direction of deviations from a hypothesized value.
It is useful when data are ordinal, strongly non-normal, or when a very robust procedure is preferred.
1) What the test does
One-sample median: tests whether the population median equals a hypothesized value \(m_0\).
Paired samples: tests whether the median of paired differences equals \(m_0\) (often \(0\)).
\[
\begin{aligned}
\text{One-sample:}\quad & d_i = x_i - m_0 \\
\text{Paired:}\quad & d_i = (A_i - B_i) - m_0
\end{aligned}
\]
Each difference is reduced to a sign:
\(+\) if \(d_i > 0\), \(0\) if \(d_i = 0\), and \(-\) if \(d_i < 0\).
2) Hypotheses
\[
\begin{aligned}
H_0 &: \text{median} = m_0 \\
H_1 &:
\begin{cases}
\text{median} \ne m_0 & \text{(two-sided)}\\
\text{median} > m_0 & \text{(right-tailed)}\\
\text{median} < m_0 & \text{(left-tailed)}
\end{cases}
\end{aligned}
\]
3) Test statistic and null distribution
Let \(n_+\) be the number of positive differences, \(n_-\) the number of negative differences, and \(n_0\) the number of ties.
The classical sign test drops ties and uses the effective sample size:
\[
\begin{aligned}
n &= n_{+} + n_{-} \\
X &= n_{+}
\end{aligned}
\]
Under \(H_0\), a non-tied observation is equally likely to be “+” or “−” (probability \(0.5\) each), so:
\[
X \sim \mathrm{Bin}(n, 0.5).
\]
Note on ties: dropping ties is the standard convention for the exact sign test. Other tie-handling schemes are sometimes used for descriptive summaries,
but the exact binomial model is based on \(n = n_+ + n_-\).
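The reduction from data to signs can be sketched in a few lines of Python. This is only an illustration: the helper name `sign_counts` and the paired values `a` and `b` are made up for the example and do not come from the app.

```python
def sign_counts(diffs, m0=0.0):
    """Return (n_plus, n_minus, n_zero) for deviations of diffs from m0."""
    n_plus = sum(1 for d in diffs if d > m0)
    n_minus = sum(1 for d in diffs if d < m0)
    n_zero = len(diffs) - n_plus - n_minus  # ties: values exactly equal to m0
    return n_plus, n_minus, n_zero


# Hypothetical paired measurements (illustrative values only).
a = [12, 15, 9, 14, 11, 13, 10, 16]
b = [10, 15, 11, 12, 9, 10, 10, 13]
diffs = [ai - bi for ai, bi in zip(a, b)]  # A_i - B_i

n_plus, n_minus, n_zero = sign_counts(diffs, m0=0)
n = n_plus + n_minus  # effective sample size after dropping ties
x = n_plus            # test statistic X
print(n_plus, n_minus, n_zero, n, x)
```

Here the two tied pairs are dropped, so the test proceeds with \(n = 6\) and \(x = 5\).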
4) Exact p-value (binomial tail)
Right-tailed \((\text{median} > m_0)\):
\[
p\text{-value} = P(X \ge x).
\]
Left-tailed \((\text{median} < m_0)\):
\[
p\text{-value} = P(X \le x).
\]
Two-sided \((\text{median} \ne m_0)\):
\[
p\text{-value}
= 2\,P\!\left(X \le \min(x,\,n-x)\right),
\quad \text{then cap at } 1.
\]
Because the null distribution \(\mathrm{Bin}(n, 0.5)\) is symmetric about \(n/2\), the two-sided p-value can be computed by doubling the smaller tail probability.
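A minimal sketch of the three tail computations above, using only the Python standard library. The function names and the `alternative` labels (`"greater"`, `"less"`, `"two-sided"`) are assumptions chosen for this example, not the app's actual interface.

```python
from math import comb


def binom_tail_le(k, n):
    """P(X <= k) for X ~ Bin(n, 0.5), summed exactly with integer counts."""
    return sum(comb(n, i) for i in range(0, k + 1)) / 2 ** n


def binom_tail_ge(k, n):
    """P(X >= k) for X ~ Bin(n, 0.5)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


def sign_test_pvalue(x, n, alternative="two-sided"):
    """Exact sign-test p-value for x plus signs among n non-tied observations."""
    if not 0 <= x <= n:
        raise ValueError("x must satisfy 0 <= x <= n")
    if alternative == "greater":    # H1: median > m0
        return binom_tail_ge(x, n)
    if alternative == "less":       # H1: median < m0
        return binom_tail_le(x, n)
    if alternative == "two-sided":  # double the smaller tail, cap at 1
        return min(1.0, 2.0 * binom_tail_le(min(x, n - x), n))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


print(sign_test_pvalue(5, 6, "greater"))    # 7/64
print(sign_test_pvalue(5, 6, "two-sided"))  # 14/64
```

With \(x = 5\) and \(n = 6\), the right tail contains the outcomes \(X = 5\) and \(X = 6\), giving \((6 + 1)/64 = 7/64\); the two-sided value doubles the matching lower tail \(P(X \le 1)\).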
5) Normal approximation (large \(n\))
When \(n\) is reasonably large, the binomial distribution can be approximated by a normal distribution.
With continuity correction:
\[
\begin{aligned}
\mu &= \frac{n}{2}, \quad \sigma = \sqrt{\frac{n}{4}} \\
z &\approx \frac{x - 0.5 - \mu}{\sigma} \quad \text{(right tail)} \\
z &\approx \frac{x + 0.5 - \mu}{\sigma} \quad \text{(left tail)}
\end{aligned}
\]
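The right-tailed p-value is then \(1 - \Phi(z)\) and the left-tailed p-value is \(\Phi(z)\), where \(\Phi\) is the standard normal CDF.
Prefer the exact binomial p-value whenever possible; the normal approximation is mainly a convenience for very large samples.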
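The continuity-corrected approximation can be sketched as follows, assuming the one-sided p-values are read off the standard normal CDF. The function name `sign_test_normal_pvalue` is hypothetical; `statistics.NormalDist` is part of the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def sign_test_normal_pvalue(x, n, alternative="greater"):
    """One-sided normal-approximation p-value with continuity correction."""
    mu = n / 2
    sigma = sqrt(n / 4)
    phi = NormalDist().cdf  # standard normal CDF
    if alternative == "greater":  # right tail: P(X >= x)
        z = (x - 0.5 - mu) / sigma
        return 1.0 - phi(z)
    if alternative == "less":     # left tail: P(X <= x)
        z = (x + 0.5 - mu) / sigma
        return phi(z)
    raise ValueError("alternative must be 'greater' or 'less'")


# Example: 60 plus signs among 100 non-ties gives z = (59.5 - 50) / 5 = 1.9.
print(sign_test_normal_pvalue(60, 100, "greater"))
```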
6) Effect size idea
A simple descriptive effect is the proportion of plus signs among non-ties:
\[
\hat p = \frac{n_{+}}{n}.
\]
If \(\hat p > 0.5\), more differences are positive (median tends to be above \(m_0\));
if \(\hat p < 0.5\), more differences are negative.
7) Optional confidence interval for \(p\)
An exact confidence interval for the underlying probability \(p\) of a plus sign among non-ties (which \(\hat p\) estimates) can be computed using the Clopper–Pearson method.
With confidence level \(1-\alpha\) and \(x=n_+\) successes out of \(n\) non-tied observations:
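\[
\hat p = \frac{x}{n},
\qquad
\mathrm{CI}_{1-\alpha} = [p_L, p_U],
\]
where \(p_L\) is the \(\alpha/2\) quantile of \(\mathrm{Beta}(x,\, n-x+1)\) (with \(p_L = 0\) when \(x = 0\)) and \(p_U\) is the \(1-\alpha/2\) quantile of \(\mathrm{Beta}(x+1,\, n-x)\) (with \(p_U = 1\) when \(x = n\)).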
In this app, the CI (when enabled) uses the same drop-ties model as the exact sign test.
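As an illustration, the Clopper–Pearson bounds can be found without a statistics library by inverting the binomial tails numerically: \(p_L\) solves \(P(X \ge x \mid p_L) = \alpha/2\) and \(p_U\) solves \(P(X \le x \mid p_U) = \alpha/2\). The sketch below uses bisection; the function names and the tolerance are assumptions for the example, and this is not necessarily how the app computes the interval.

```python
from math import comb


def binom_sf(x, n, p):
    """P(X >= x) for X ~ Bin(n, p); increasing in p for 1 <= x <= n."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(x, n + 1))


def _bisect_increasing(f, target, tol=1e-10):
    """Find p in [0, 1] with f(p) = target, for f increasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact (Clopper-Pearson) interval for p given x successes in n trials."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    # Lower bound: P(X >= x | p_L) = alpha / 2, or 0 when x = 0.
    p_lower = 0.0 if x == 0 else _bisect_increasing(
        lambda p: binom_sf(x, n, p), alpha / 2)
    # Upper bound: P(X <= x | p_U) = alpha / 2, which is equivalent to
    # P(X >= x + 1 | p_U) = 1 - alpha / 2; the bound is 1 when x = n.
    p_upper = 1.0 if x == n else _bisect_increasing(
        lambda p: binom_sf(x + 1, n, p), 1 - alpha / 2)
    return p_lower, p_upper


print(clopper_pearson(5, 6))  # interval for 5 plus signs among 6 non-ties
```

Bisection is used here only to keep the example dependency-free; the same bounds can be obtained from beta-distribution quantiles when a statistics library is available.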
8) Assumptions and when to use it
- Observations (or pairs) are random and independent.
- The scale supports “above vs below” comparisons relative to \(m_0\).
- Very robust to non-normality and outliers, but uses less information than rank-based methods.
If the magnitudes of the differences are meaningful (not just their signs) and their distribution is roughly symmetric, the Wilcoxon signed-rank test is often more powerful.
The sign test is the safest choice when you only trust direction.