Kruskal–Wallis Test
The Kruskal–Wallis test is a nonparametric alternative to one-way ANOVA for
k independent groups. Instead of comparing group means, it ranks all observations together
and compares the groups using rank sums. It is commonly used when normality is doubtful,
when outliers are present, or when the measurement scale is ordinal.
Setup and hypotheses
Suppose we have \(k\) independent samples with sizes
\(n_1, n_2, \dots, n_k\) and total
\(N = \sum_{j=1}^{k} n_j\). After ranking all \(N\) observations together (using average
ranks for ties), let:
- \(R_j\) be the sum of ranks in group \(j\)
- \(\bar{R}_j = R_j/n_j\) be the average rank in group \(j\)
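The ranking step can be sketched in Python as follows. This is a minimal illustration using only the standard library; the helper names `average_ranks` and `rank_sums` and the sample data are hypothetical, chosen only to show how tied values share an average rank.

```python
def average_ranks(values):
    """Return 1-based ranks for `values`; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` over the block of values tied with the one at `start`.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # sorted positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = avg_rank
        start = end + 1
    return ranks


def rank_sums(groups):
    """Rank all observations together; return (rank sums R_j, mean ranks)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    sums, offset = [], 0
    for g in groups:
        sums.append(sum(ranks[offset:offset + len(g)]))
        offset += len(g)
    means = [r / len(g) for r, g in zip(sums, groups)]
    return sums, means


# Hypothetical data: three groups with sizes 3, 4, 2 (N = 9), including ties.
groups = [[2.1, 3.4, 3.4], [3.4, 5.0, 6.2, 7.7], [1.0, 2.1]]
R, R_bar = rank_sums(groups)
print(R)      # [12.5, 29.0, 3.5]
print(R_bar)  # mean rank per group
```

A quick sanity check is that the rank sums always add up to \(N(N+1)/2\), here 45.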
The hypotheses are:
\[
\begin{aligned}
H_0 &: \text{All } k \text{ groups come from the same population distribution.} \\
H_1 &: \text{At least one group differs (in location) from the others.}
\end{aligned}
\]
Test statistic
The Kruskal–Wallis statistic, before any correction for ties (hence the subscript "unc"), is computed from the group rank sums:
\[
\begin{aligned}
H_{\text{unc}}
&= \frac{12}{N(N+1)} \sum_{j=1}^{k}\frac{R_j^2}{n_j} - 3(N+1).
\end{aligned}
\]
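As a rough sketch, the formula translates directly into a few lines of Python. The function name `kruskal_h_uncorrected` and the tie-free example (ranks 1 to 3, 4 to 6, and 7 to 9 falling in three separate groups) are assumptions made for illustration.

```python
def kruskal_h_uncorrected(rank_sums, sizes):
    """H_unc = 12 / (N (N + 1)) * sum(R_j^2 / n_j) - 3 (N + 1)."""
    n_total = sum(sizes)
    weighted = sum(r * r / n for r, n in zip(rank_sums, sizes))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Hypothetical tie-free example: ranks 1-3, 4-6, 7-9 fall in groups 1, 2, 3.
rank_sums = [6.0, 15.0, 24.0]
sizes = [3, 3, 3]
h_unc = kruskal_h_uncorrected(rank_sums, sizes)
print(round(h_unc, 4))  # 7.2
```

Because the groups in this example are perfectly separated, 7.2 is the largest value the statistic can take for three groups of three observations.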
Tie correction
If there are ties (equal values), ranks are averaged within each tie block.
Let the tie block sizes be \(t_1, t_2, \dots\). A standard correction factor is:
\[
\begin{aligned}
C
&= 1 - \frac{\sum_{r}(t_r^3 - t_r)}{N^3 - N}.
\end{aligned}
\]
When the tie correction is applied, the corrected statistic is obtained by dividing by \(C\); with no ties, \(C = 1\) and \(H = H_{\text{unc}}\):
\[
\begin{aligned}
H
&= \frac{H_{\text{unc}}}{C}.
\end{aligned}
\]
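The correction factor can be computed by counting how often each value occurs in the pooled sample. The sketch below assumes a hypothetical helper `tie_correction` and made-up data; the value of `h_unc` is an assumed input used only to show the division step.

```python
from collections import Counter


def tie_correction(pooled):
    """C = 1 - sum(t^3 - t) / (N^3 - N), summing over tie-block sizes t."""
    n_total = len(pooled)
    # Values that occur once have t = 1 and contribute t^3 - t = 0.
    tie_sum = sum(t**3 - t for t in Counter(pooled).values())
    return 1.0 - tie_sum / (n_total**3 - n_total)


# Hypothetical pooled sample (N = 9) with tie blocks of size 2 (2.1) and 3 (3.4).
pooled = [2.1, 3.4, 3.4, 3.4, 5.0, 6.2, 7.7, 1.0, 2.1]
c = tie_correction(pooled)  # 1 - (6 + 24) / 720
h_unc = 5.7944              # assumed uncorrected statistic for these data
h = h_unc / c
print(round(c, 4), round(h, 4))  # 0.9583 6.0463
```

Since \(C \le 1\), the corrected statistic is never smaller than the uncorrected one.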
p-value (chi-square approximation)
Under \(H_0\), and for moderate to large samples (a common rule of thumb is at least five observations per group), \(H\) is approximately chi-square distributed with
\(\text{df} = k - 1\):
\[
\begin{aligned}
\text{df} &= k - 1, \\
p\text{-value} &= P\!\left(\chi^2_{\text{df}} \ge H\right).
\end{aligned}
\]
Decision rule (equivalently):
\[
\begin{aligned}
H_{\text{crit}} &= \chi^2_{\text{df},\,1-\alpha}, \\
\text{Reject } H_0 &\text{ if } H \ge H_{\text{crit}} \quad (\text{or if } p\text{-value} \le \alpha).
\end{aligned}
\]
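To illustrate the p-value and decision rule without external dependencies, the sketch below uses a hand-written upper-tail function `chi2_sf` (a hypothetical helper based on the closed-form series for integer degrees of freedom). In practice a statistics library routine such as `scipy.stats.chi2.sf` would normally be used instead; the values of `h`, `k`, and `alpha` are assumed for the example.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi2_df >= x) for a positive integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2))
    #         + sqrt(2x/pi) * exp(-x/2) * sum_{r=0}^{(df-3)/2} x^r / (1*3*...*(2r+1))
    term, total = 1.0, 0.0
    for r in range((df - 1) // 2):
        if r > 0:
            term *= x / (2 * r + 1)
        total += term
    tail = math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total
    return math.erfc(math.sqrt(x / 2)) + tail


h, k, alpha = 7.2, 3, 0.05  # hypothetical statistic, number of groups, level
df = k - 1
p_value = chi2_sf(h, df)
reject = p_value <= alpha
print(round(p_value, 4), reject)  # 0.0273 True
```

With \(\text{df} = 2\) the tail probability reduces to \(e^{-H/2}\), which makes this case easy to check by hand.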
Effect size
Two common rank-based effect sizes are epsilon-squared, \(\varepsilon^2\), and eta-squared based on \(H\), \(\eta_H^2\):
\[
\begin{aligned}
\varepsilon^2
&= \frac{H}{N-1}, \\
\eta_H^2
&= \frac{H - (k-1)}{N-k}.
\end{aligned}
\]
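A small sketch of both measures, using the commonly cited definitions \(\varepsilon^2 = H/(N-1)\) and \(\eta_H^2 = (H - k + 1)/(N - k)\). The function names and input values are hypothetical.

```python
def epsilon_squared(h, n_total):
    """epsilon^2 = H / (N - 1)."""
    return h / (n_total - 1)


def eta_squared_h(h, k, n_total):
    """eta_H^2 = (H - k + 1) / (N - k)."""
    return (h - k + 1) / (n_total - k)


h, k, n_total = 7.2, 3, 9  # hypothetical statistic, groups, total sample size
print(round(epsilon_squared(h, n_total), 4))   # 0.9
print(round(eta_squared_h(h, k, n_total), 4))  # 0.8667
```

Note that \(\eta_H^2\) can come out slightly negative when \(H < k - 1\); it is often reported as zero in that case.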
Optional post-hoc comparisons (Dunn test)
If the overall test is significant, pairwise differences can be examined with Dunn’s test.
It compares mean ranks between groups and uses a normal approximation.
For groups \(i\) and \(j\), with mean ranks \(\bar{R}_i\) and \(\bar{R}_j\):
\[
\begin{aligned}
z_{ij}
&= \frac{\bar{R}_i - \bar{R}_j}
{\sqrt{\left(\frac{N(N+1)}{12}\right)\left(\frac{1}{n_i}+\frac{1}{n_j}\right)}}.
\end{aligned}
\]
If tie correction is used, a common adjustment multiplies the rank-variance term by \(C\):
\[
\begin{aligned}
z_{ij,\text{tie}}
&= \frac{\bar{R}_i - \bar{R}_j}
{\sqrt{\left(\frac{N(N+1)}{12} \cdot C\right)\left(\frac{1}{n_i}+\frac{1}{n_j}\right)}}.
\end{aligned}
\]
Two-sided p-values are:
\[
\begin{aligned}
p_{ij}
&= 2\left(1 - \Phi\!\left(\lvert z_{ij}\rvert\right)\right),
\end{aligned}
\]
and then adjusted for multiple comparisons (e.g., Bonferroni or Holm).
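
The pairwise procedure can be sketched as follows, using only the standard library. The helper names `dunn_pairwise` and `holm_adjust` and the example mean ranks are assumptions for illustration; passing `c` other than 1 applies the tie-corrected variance from the formula above.

```python
import math
from itertools import combinations


def dunn_pairwise(mean_ranks, sizes, c=1.0):
    """Return {(i, j): (z_ij, two-sided p)} for all group pairs i < j."""
    n_total = sum(sizes)
    rank_var = n_total * (n_total + 1) / 12.0 * c  # c = 1 means no tie correction
    results = {}
    for i, j in combinations(range(len(sizes)), 2):
        se = math.sqrt(rank_var * (1.0 / sizes[i] + 1.0 / sizes[j]))
        z = (mean_ranks[i] - mean_ranks[j]) / se
        p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
        results[(i, j)] = (z, p)
    return results


def holm_adjust(p_values):
    """Holm step-down adjustment; returns adjusted p-values in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda idx: p_values[idx])
    adjusted, running_max = [0.0] * m, 0.0
    for rank, idx in enumerate(order):
        # Multiply the smallest p by m, the next by m - 1, and keep monotonicity.
        running_max = max(running_max, (m - rank) * p_values[idx])
        adjusted[idx] = min(1.0, running_max)
    return adjusted


# Hypothetical tie-free example: mean ranks 2, 5, 8 with three observations each.
pairs = dunn_pairwise([2.0, 5.0, 8.0], [3, 3, 3])
raw_p = [p for _, p in pairs.values()]
bonferroni = [min(1.0, len(raw_p) * p) for p in raw_p]
holm = holm_adjust(raw_p)
for (pair, (z, p)), p_b, p_h in zip(pairs.items(), bonferroni, holm):
    print(pair, round(z, 3), round(p, 4), round(p_b, 4), round(p_h, 4))
```

In this example only the comparison between the first and third groups has a raw p-value below 0.05, and it remains below 0.05 under both adjustments. Holm is never more conservative than Bonferroni, which is why it is often preferred.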