
Wilcoxon–Mann Test for Two Independent Samples (Rank-Sum / U Statistic)

How is the Wilcoxon–Mann test performed for two independent samples, including rank assignment, the U statistic, and a p-value-based conclusion?


The "Wilcoxon–Mann" test is the Wilcoxon rank-sum test, which is equivalent to the Mann–Whitney U test. It is a nonparametric method for comparing two independent samples that analyzes the ranks of the pooled observations rather than assuming normality.

Appropriate setting and hypotheses

  • Two independent samples with sizes \(n_1\) and \(n_2\).
  • Measurements are at least ordinal (ranking is meaningful).
  • Null hypothesis: \(H_0\): the two population distributions are identical.
  • Alternative hypothesis (two-sided): \(H_A\): the distributions differ (often interpreted as a location shift if shapes are similar).

Step-by-step procedure (rank-sum and U statistic)

1) Pool and rank the data

Combine both samples into one list of size \(N=n_1+n_2\). Assign ranks 1 to \(N\) from smallest to largest. If ties occur, each tied value receives the average of the ranks it would have occupied.
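The ranking rule above, including the average-rank treatment of ties, can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `average_ranks` and the sample data are assumptions made for the example, not part of the original answer.

```python
def average_ranks(values):
    """Return 1-based ranks for `values`; tied values share the mean of their ranks."""
    # Indices of `values` in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Sorted positions i..j hold ranks i+1..j+1; assign their average.
        mean_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


# Hypothetical pooled data with one tie (two 7s), for illustration only.
pooled = [3, 7, 7, 10, 1]
ranks = average_ranks(pooled)
print(ranks)  # the two 7s would occupy ranks 3 and 4, so each gets 3.5
```

Whatever the tie pattern, the ranks still sum to \(N(N+1)/2\), which is a quick sanity check on any ranking routine.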

2) Compute the rank sum for one group

Let \(R_1\) be the sum of ranks for Group 1 (also denoted \(W\), the Wilcoxon rank-sum statistic).

\[ R_1=\sum_{i \in \text{Group 1}} \text{rank}(x_i) \]

3) Convert rank sum to the Mann–Whitney U statistic

The two U statistics are:

\[ U_1 = R_1 - \frac{n_1(n_1+1)}{2}, \qquad U_2 = n_1 n_2 - U_1 \]

For a two-sided test, many references use \(U_{\min}=\min(U_1,U_2)\) as the test statistic. Equivalently, either \(U_1\) or \(U_2\) can be used with the matching tail convention.
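A useful way to read \(U_1\) is as a count of pairs: it equals the number of (Group 1, Group 2) pairs in which the Group 1 value is larger, with tied pairs counted as one half. The sketch below computes \(U_1\) both ways and checks that they agree. The function names and the data are hypothetical, and the rank-sum version assumes no ties for simplicity.

```python
def u_from_rank_sum(group1, group2):
    """Return (U_1, U_2) from the rank-sum formula. Assumes all values are distinct."""
    pooled = sorted(group1 + group2)
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # valid only without ties
    n1, n2 = len(group1), len(group2)
    r1 = sum(rank[x] for x in group1)
    u1 = r1 - n1 * (n1 + 1) // 2
    return u1, n1 * n2 - u1


def u_from_pairs(group1, group2):
    """Return U_1 as the count of pairs with x > y, counting ties as 1/2."""
    return sum((x > y) + 0.5 * (x == y) for x in group1 for y in group2)


# Hypothetical data, for illustration only.
g1 = [4, 9, 11]
g2 = [2, 5, 7, 8]
u1, u2 = u_from_rank_sum(g1, g2)
print(u1, u2, u_from_pairs(g1, g2))
```

The pair-counting form also makes the identity \(U_1 + U_2 = n_1 n_2\) easy to see when there are no ties: every pair is won by exactly one group.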

Worked example (no ties)

Group A (n1=5): 12, 15, 17, 18, 20
Group B (n2=5): 8, 9, 10, 11, 13

Pool and rank the 10 observations:

Value   Group   Rank
8       B       1
9       B       2
10      B       3
11      B       4
12      A       5
13      B       6
15      A       7
17      A       8
18      A       9
20      A       10

\[ R_A = 5 + 7 + 8 + 9 + 10 = 39 \]

\[ U_A = R_A - \frac{n_1(n_1+1)}{2} = 39 - \frac{5 \cdot 6}{2} = 39 - 15 = 24 \]

\[ U_B = n_1 n_2 - U_A = 5 \cdot 5 - 24 = 25 - 24 = 1, \qquad U_{\min} = 1 \]

Normal approximation (p-value) and decision

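When sample sizes are large enough that exact tables are impractical, a common approach is to standardize \(U\) under \(H_0\). The worked example is small, so the approximation is applied to it mainly to illustrate the arithmetic. With no ties: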

\[ \mu_U = \frac{n_1 n_2}{2}, \qquad \sigma_U = \sqrt{\frac{n_1 n_2 (N+1)}{12}} \]

Using \(n_1=5\), \(n_2=5\), \(N=10\):

\[ \mu_U = \frac{5 \cdot 5}{2} = 12.5, \qquad \sigma_U = \sqrt{\frac{5 \cdot 5 \cdot 11}{12}}=\sqrt{\frac{275}{12}} \approx 4.787 \]

With a continuity correction for the lower tail (since \(U_{\min} < \mu_U\)):

\[ z \approx \frac{U_{\min} - \mu_U + 0.5}{\sigma_U} = \frac{1 - 12.5 + 0.5}{4.787} = \frac{-11}{4.787} \approx -2.298 \]

Two-sided p-value:

\[ p \approx 2 \cdot P\!\left(Z \le -|z|\right) = 2 \cdot P(Z \le -2.298) \approx 2 \cdot 0.0108 \approx 0.0216 \]
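The standardization can be reproduced with Python's standard library as a quick check. This sketch assumes no ties and \(U_{\min} \le \mu_U\), so the +0.5 continuity correction moves \(U_{\min}\) toward the mean. The function name `normal_approx_p` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def normal_approx_p(u_min, n1, n2):
    """Return (z, two-sided p) from the no-ties normal approximation.

    Assumes u_min = min(U_1, U_2), so u_min <= n1 * n2 / 2.
    """
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_min - mu + 0.5) / sigma  # continuity correction for the lower tail
    p = min(1.0, 2 * NormalDist().cdf(-abs(z)))
    return z, p


z, p = normal_approx_p(u_min=1, n1=5, n2=5)
print(round(z, 3), round(p, 4))
```

For the worked example this gives \(z\) close to \(-2.30\) and a two-sided p-value close to 0.0216, matching the hand calculation to rounding.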

Conclusion at \(\alpha=0.05\)

Since \(p \approx 0.0216 \le 0.05\), reject \(H_0\). The samples provide evidence that the two populations differ; the rank pattern indicates Group A tends to have larger values than Group B.

Ties (variance correction)

If ties occur, average ranks are still used, but \(\sigma_U^2\) should be adjusted. If tied groups have sizes \(t_1,t_2,\dots\), a common correction is:

\[ \sigma_U^2 = \frac{n_1 n_2}{12} \left[ (N+1) - \frac{\sum_k (t_k^3 - t_k)}{N(N-1)} \right] \]
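As a sketch of how the correction might be applied, the function below counts the size of each group of tied values in the pooled data and plugs those sizes into the formula above. Untied values have \(t_k = 1\) and contribute zero. The function name `sigma_u` and the tied data set are assumptions for illustration.

```python
from collections import Counter
from math import sqrt


def sigma_u(group1, group2):
    """Standard deviation of U under H0, using the tie-corrected variance."""
    n1, n2 = len(group1), len(group2)
    n_total = n1 + n2
    # Size of each group of identical values in the pooled sample.
    tie_sizes = Counter(group1 + group2).values()
    tie_term = sum(t**3 - t for t in tie_sizes)  # zero when nothing is tied
    variance = n1 * n2 / 12 * ((n_total + 1) - tie_term / (n_total * (n_total - 1)))
    return sqrt(variance)


# No ties: reduces to sqrt(n1 * n2 * (N + 1) / 12).
print(sigma_u([12, 15, 17, 18, 20], [8, 9, 10, 11, 13]))

# Hypothetical data with one tie group of size 3 (three 2s).
print(sigma_u([1, 2, 2], [2, 3, 4]))
```

Ties always shrink the variance, so ignoring the correction makes the test slightly conservative.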

Effect size (recommended alongside \(p\))

Two standard summaries are the common-language effect size \(\hat{A}\) and the rank-biserial correlation \(r_{rb}\):

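\[ \hat{A}=\frac{U_A}{n_1 n_2}, \qquad r_{rb}=\frac{U_A-U_B}{n_1 n_2}, \qquad |r_{rb}|=1-\frac{2U_{\min}}{n_1 n_2} \]

The sign of \(r_{rb}\) is positive when Group A tends to have the larger values. For the worked example:

\[ \hat{A}=\frac{24}{25}=0.96, \qquad r_{rb}=\frac{24-1}{25}=1-\frac{2 \cdot 1}{25}=0.92 \]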
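Both summaries can be computed directly from the raw data by counting pairs, which also makes their interpretation concrete: \(\hat{A}\) estimates the probability that a randomly chosen Group A value exceeds a randomly chosen Group B value (ties counted as one half). The sketch below is illustrative and the function name is hypothetical.

```python
def effect_sizes(group_a, group_b):
    """Return (A_hat, r_rb) by counting pairs; ties count as 1/2 toward each group."""
    n_pairs = len(group_a) * len(group_b)
    u_a = sum((a > b) + 0.5 * (a == b) for a in group_a for b in group_b)
    u_b = n_pairs - u_a
    a_hat = u_a / n_pairs
    r_rb = (u_a - u_b) / n_pairs
    return a_hat, r_rb


group_a = [12, 15, 17, 18, 20]
group_b = [8, 9, 10, 11, 13]
a_hat, r_rb = effect_sizes(group_a, group_b)
print(a_hat, r_rb)
```

The two measures carry the same information, since \(r_{rb} = 2\hat{A} - 1\). Here 24 of the 25 pairs favor Group A.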

Visualization: pooled ranks with group markers

[Figure: pooled ranks 1 to 10 on a number line. Circles mark Group A; squares mark Group B.]
Group B occupies most of the smallest ranks (1–4 and 6), while Group A occupies most of the largest ranks (5 and 7–10), which corresponds to \(U_A=24\) and a small \(U_{\min}=1\).

Common checks before reporting

  • Independence: if observations are paired or otherwise dependent, use the Wilcoxon signed-rank test instead.
  • Direction: for one-sided alternatives, use the \(U\) tail consistent with “Group 1 tends larger” or “Group 1 tends smaller.”
  • State what was used: exact p-value (small samples) versus normal approximation (larger samples), and whether a tie correction/continuity correction was applied.