
Paired vs Unpaired Permutation Tests in Statistics

What is the difference between paired vs unpaired permutation tests, and how is the p-value computed in each case?


Core idea behind permutation tests

A permutation test (also called a randomization test) evaluates a null hypothesis by generating the distribution of a chosen test statistic under rearrangements that are valid if the null is true. The p-value is computed as the fraction of rearrangements that produce a statistic at least as extreme as the observed statistic.

Key decision in paired vs unpaired permutation tests: the randomization scheme must match the data structure.

  • Paired (dependent) samples: observations come in matched pairs (before/after on the same subject, twin pairs, matched units).
  • Unpaired (independent) samples: observations come from two separate groups with no natural one-to-one matching.

Paired permutation test (dependent samples)

In paired data, the pairing is essential and must be preserved. A standard approach reduces each pair to a within-pair difference \(d_i\), then tests whether the typical difference is \(0\).

Null hypothesis and statistic

Typical null: \(H_0:\) the within-pair differences are symmetrically distributed about \(0\), so that \(d_i\) and \(-d_i\) are equally likely (often stated as “no systematic paired effect”).

Choose a test statistic based on the differences, for example:

  • Mean difference: \(T = \bar d = \dfrac{1}{n}\sum_{i=1}^{n} d_i\)
  • Sum of differences: \(T = \sum_{i=1}^{n} d_i\)
  • Median difference (less common for exact enumeration but possible)

Valid permutations for paired data

Under \(H_0\), swapping the two labels inside a pair (e.g., “before” and “after”) should not change the joint distribution. This induces a sign change in the difference:

\[ d_i = (\text{after})_i - (\text{before})_i \quad\Longrightarrow\quad d_i \text{ becomes } -d_i \text{ if the within-pair labels are swapped.} \]

Therefore, a common paired permutation test enumerates all sign patterns \((s_1,\dots,s_n)\) with each \(s_i \in \{+1,-1\}\), or samples them at random when \(n\) is large, and computes for each pattern \(b\) \[ T^{(b)} = \frac{1}{n}\sum_{i=1}^{n} s_i \cdot d_i. \] There are \(2^n\) possible sign patterns, so exact enumeration is feasible for modest \(n\).

Paired p-value calculation

For a two-sided test using \(T=\bar d\): \[ p = \frac{\#\left\{b:\, \left|T^{(b)}\right| \ge \left|T_{\text{obs}}\right|\right\}}{2^n}. \] The “\(\ge\)” (not “\(>\)”) ensures the observed statistic is counted and gives an exact finite-sample p-value.
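The exact sign-flip calculation can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `paired_sign_flip_pvalue`, the `tol` parameter, and the toy data are assumptions made for the example. It compares sums instead of means, which gives the same p-value because \(n\) is fixed.

```python
from itertools import product


def paired_sign_flip_pvalue(diffs, tol=1e-12):
    """Exact two-sided sign-flip p-value based on the sum of differences."""
    observed = abs(sum(diffs))
    n_extreme = 0
    n_patterns = 0
    # Visit every sign pattern (s_1, ..., s_n) with s_i in {+1, -1}.
    for signs in product((1, -1), repeat=len(diffs)):
        flipped = sum(s * d for s, d in zip(signs, diffs))
        n_patterns += 1
        # ">=" (with a small float tolerance) counts the observed pattern too.
        if abs(flipped) >= observed - tol:
            n_extreme += 1
    return n_extreme / n_patterns


# Toy differences (made up for illustration): only the all-plus and
# all-minus patterns reach |sum| = 6, so the p-value is 2/8.
print(paired_sign_flip_pvalue([1, 2, 3]))  # 0.25
```

The tolerance only matters for non-integer data, where two sign patterns that are tied in theory can differ by floating-point rounding.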

Unpaired permutation test (independent samples)

With two independent groups, the null hypothesis typically states that both samples come from the same distribution, so group labels are exchangeable.

Null hypothesis and statistic

Typical null: \(H_0:\) the two groups have the same distribution (in particular, there is no shift in location between them).

A common statistic is the difference in sample means: \[ T = \bar X - \bar Y. \] Alternatives include difference in medians, trimmed means, or other robust location measures (the permutation framework remains the same).

Valid permutations for unpaired data

Pool all \(N=n_1+n_2\) observations, then repeatedly reassign \(n_1\) of them to “Group 1” and the rest to “Group 2” (label shuffling). Each reassignment produces a permuted statistic \(T^{(b)}\).

The number of distinct label assignments is \[ \binom{N}{n_1}. \] Exact enumeration is feasible when \(\binom{N}{n_1}\) is not too large; otherwise, Monte Carlo sampling of many random shuffles is used.
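To get a feel for when exact enumeration stops being practical, the count \(\binom{N}{n_1}\) can be tabulated for a few group sizes. The helper name `n_labelings` and the chosen sizes are illustrative assumptions.

```python
from math import comb


def n_labelings(n1, n2):
    """Number of distinct ways to assign n1 of the N = n1 + n2 values to Group 1."""
    return comb(n1 + n2, n1)


for n1, n2 in ((5, 5), (10, 10), (20, 20)):
    print(f"n1={n1}, n2={n2}: {n_labelings(n1, n2):,} labelings")
# n1=5, n2=5: 252 labelings
# n1=10, n2=10: 184,756 labelings
# n1=20, n2=20: 137,846,528,820 labelings
```

Two groups of 20 already give more than \(10^{11}\) labelings, which is the kind of size where Monte Carlo sampling becomes the practical choice.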

Unpaired p-value calculation

For a two-sided test: \[ p = \frac{\#\left\{b:\, \left|T^{(b)}\right| \ge \left|T_{\text{obs}}\right|\right\}}{\binom{N}{n_1}} \quad \text{(exact)} \qquad\text{or}\qquad p \approx \frac{1+\#\left\{b:\, \left|T^{(b)}\right| \ge \left|T_{\text{obs}}\right|\right\}}{1+B} \quad \text{(Monte Carlo)}, \] where \(B\) is the number of random shuffles drawn. The “\(+1\)” form is a standard finite-sample correction: it treats the observed labeling as one of the shuffles, which prevents a p-value of \(0\) when using random shuffles.
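The Monte Carlo version of the unpaired test might look like the sketch below, which uses only the Python standard library. The function name `unpaired_mc_pvalue`, its defaults (`n_shuffles=9999`, `seed=0`), and the sample values are assumptions for illustration, not part of the original answer.

```python
import random


def unpaired_mc_pvalue(x, y, n_shuffles=9999, seed=0, tol=1e-12):
    """Monte Carlo two-sided permutation p-value for the difference in means."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n1 = len(x)
    pooled = list(x) + list(y)

    def mean_diff(values):
        g1, g2 = values[:n1], values[n1:]
        return sum(g1) / len(g1) - sum(g2) / len(g2)

    observed = abs(mean_diff(pooled))  # computed before any shuffling
    n_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # random relabeling: first n1 values form "Group 1"
        if abs(mean_diff(pooled)) >= observed - tol:
            n_extreme += 1
    # "+1" correction: the observed labeling counts as one of the shuffles.
    return (1 + n_extreme) / (1 + n_shuffles)


# Hypothetical measurements, made up for this example.
x = [5.1, 4.8, 6.0, 5.5]
y = [4.2, 4.9, 4.4, 4.0]
print(unpaired_mc_pvalue(x, y))
```

With the correction, the smallest attainable p-value is \(1/(1+B)\), so \(B\) should be chosen large enough for the significance level of interest.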

Worked examples comparing paired vs unpaired permutation tests

Example A: paired permutation test (sign-flips)

Scenario: measurements taken on the same \(n=6\) subjects before and after an intervention; analyze differences \(d_i=(\text{after})_i-(\text{before})_i\).

Subject | \(d_i\)
1 | 2
2 | -1
3 | 3
4 | 0
5 | 4
6 | -2

Observed mean difference: \[ T_{\text{obs}}=\bar d=\frac{2+(-1)+3+0+4+(-2)}{6}=\frac{6}{6}=1. \]

Under \(H_0\), each difference can be sign-flipped. There are \(2^6=64\) sign patterns. The exact two-sided p-value is the fraction of sign patterns with \(\left|\bar d^{(b)}\right|\ge 1\). For this dataset, that fraction equals \[ p=\frac{28}{64}=0.4375. \]

Interpretation: evidence against \(H_0\) is weak for a two-sided paired effect using the mean difference as statistic.
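The count of 28 out of 64 in Example A can be checked by brute force. The sketch below tabulates the full sign-flip distribution of the sum of differences (equivalent to the mean, since \(n=6\) is fixed); the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

d = [2, -1, 3, 0, 4, -2]  # differences from Example A
observed_sum = sum(d)     # 6, i.e. a mean difference of 1

# Tabulate the sum of sign-flipped differences over all 2**6 patterns.
sums = Counter(
    sum(s * di for s, di in zip(signs, d))
    for signs in product((1, -1), repeat=len(d))
)

n_patterns = sum(sums.values())
n_extreme = sum(
    count for total, count in sums.items() if abs(total) >= abs(observed_sum)
)
p_exact = Fraction(n_extreme, n_patterns)

print(n_extreme, n_patterns, float(p_exact))  # 28 64 0.4375
```

Because one difference is \(0\), flipping its sign changes nothing, so every value of the sum appears an even number of times in the tabulation.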

Example B: unpaired permutation test (label shuffling)

Scenario: two independent groups (\(n_1=5\), \(n_2=5\)); test for a difference in means using \(T=\bar X-\bar Y\).

Group 1 values | Group 2 values
12 | 8
9 | 7
11 | 9
10 | 6
13 | 10

Compute sample means: \[ \bar X=\frac{12+9+11+10+13}{5}=\frac{55}{5}=11, \qquad \bar Y=\frac{8+7+9+6+10}{5}=\frac{40}{5}=8. \] Observed statistic: \[ T_{\text{obs}}=\bar X-\bar Y=11-8=3. \]

Under \(H_0\), all \(N=10\) observations are exchangeable across labels. The number of distinct labelings is \[ \binom{10}{5}=252. \] Enumerating all 252 labelings and recomputing \(T^{(b)}\) gives an exact two-sided p-value: \[ p=\frac{\#\left\{b:\, \left|T^{(b)}\right|\ge 3\right\}}{252}=\frac{10}{252}\approx 0.03968. \]

Interpretation: the p-value of about \(0.040\) is below \(\alpha=0.05\), so \(H_0\) is rejected at the 5% level in favor of a two-sided difference in means, assuming independent samples.
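Example B can be verified the same way by enumerating all 252 labelings. This is a sketch with illustrative variable names; it uses exact fractions so that ties with \(T_{\text{obs}}\) are not affected by floating-point rounding.

```python
from fractions import Fraction
from itertools import combinations

group1 = [12, 9, 11, 10, 13]
group2 = [8, 7, 9, 6, 10]
pooled = group1 + group2
n1, n2 = len(group1), len(group2)
total = sum(pooled)


def mean_diff(sum1):
    """Difference in means given the sum of the values labeled 'Group 1'."""
    return Fraction(sum1, n1) - Fraction(total - sum1, n2)


t_obs = mean_diff(sum(group1))  # equals 3

n_labelings = 0
n_extreme = 0
# Choose which positions of the pooled sample receive the "Group 1" label.
for idx in combinations(range(len(pooled)), n1):
    t = mean_diff(sum(pooled[i] for i in idx))
    n_labelings += 1
    if abs(t) >= abs(t_obs):
        n_extreme += 1

p_exact = Fraction(n_extreme, n_labelings)
print(n_extreme, n_labelings, round(float(p_exact), 5))  # 10 252 0.03968
```

Enumerating index positions rather than values keeps the two 9s and the two 10s distinct, which is what the count \(\binom{10}{5}=252\) assumes.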

Visualization: permutation distribution and “extreme” regions

[Figure: stylized histogram of the permutation distribution of \(T=\bar X-\bar Y\) under unpaired label shuffling. Horizontal axis: test statistic \(T\), with ticks from \(-3\) to \(3\); vertical axis: frequency. A vertical line marks the observed value \(T_{\text{obs}}=3\), and the shaded tail bars are the values at least as extreme that are counted in the two-sided p-value.]

How to choose correctly between paired vs unpaired permutation tests

  • Use a paired permutation test when each observation in one condition corresponds to a specific observation in the other condition (matched pairs). Randomize within each pair (swap labels or sign-flip differences).
  • Use an unpaired permutation test when observations are independent across groups. Randomize by shuffling group labels across the pooled sample.
  • Do not break the structure: treating paired data as unpaired discards pairing information and can misstate variability; treating unpaired data as paired invents pairings and invalidates the exchangeability argument.
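The cost of ignoring the pairing can be seen on a small made-up dataset in which subjects differ a lot from one another but every subject improves slightly. The data and variable names below are hypothetical and chosen only to make the contrast visible.

```python
from itertools import combinations, product

# Hypothetical before/after values for six subjects (made up for illustration).
before = [10, 20, 30, 40, 50, 60]
after = [11, 22, 33, 44, 55, 66]

# Paired analysis: sign-flip the within-subject differences.
diffs = [a - b for a, b in zip(after, before)]
obs_sum = abs(sum(diffs))
flipped_sums = [
    sum(s * d for s, d in zip(signs, diffs))
    for signs in product((1, -1), repeat=len(diffs))
]
p_paired = sum(abs(f) >= obs_sum for f in flipped_sums) / len(flipped_sums)

# Unpaired analysis of the same numbers: shuffle labels across the pooled sample.
# With equal group sizes, |mean difference| is proportional to |2*S1 - total|,
# where S1 is the sum of the values labeled "after".
pooled = after + before
total = sum(pooled)
obs_gap = abs(2 * sum(after) - total)
gaps = [
    abs(2 * sum(pooled[i] for i in idx) - total)
    for idx in combinations(range(len(pooled)), len(after))
]
p_unpaired = sum(g >= obs_gap for g in gaps) / len(gaps)

print(p_paired)    # 0.03125, since all six differences are positive
print(p_unpaired)  # much larger: between-subject spread swamps the shift
```

The paired test only sees the consistent within-subject improvement, while the unpaired test has to detect the same small shift against the large variation between subjects.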

Practical notes for reliable implementation

  • Exact vs Monte Carlo: exact enumeration visits all \(2^n\) sign patterns (paired) or all \(\binom{N}{n_1}\) labelings (unpaired); when that count is too large to enumerate, approximate the p-value with a large number \(B\) of random permutations.
  • Two-sided vs one-sided: for a two-sided test with symmetric statistic \(T\), count permutations with \(\left|T^{(b)}\right|\ge\left|T_{\text{obs}}\right|\); for a one-sided test, count \(T^{(b)}\ge T_{\text{obs}}\) (or \(\le\)).
  • Choice of statistic: difference in means targets mean shifts; median-based or trimmed statistics target robust shifts. The permutation logic stays the same as long as the statistic is computed consistently for each shuffle.
  • Reporting: state whether the test was paired vs unpaired, specify the statistic, and specify whether the p-value is exact or Monte Carlo (including \(B\)).
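The notes above can be combined in one small routine: a Monte Carlo sign-flip test that supports one-sided and two-sided alternatives and returns the items worth reporting. This is a sketch under stated assumptions; the function name `sign_flip_mc`, the `alternative` labels, and the returned dictionary layout are all illustrative choices.

```python
import random


def sign_flip_mc(diffs, n_flips=9999, alternative="two-sided", seed=0, tol=1e-12):
    """Monte Carlo paired permutation test on the mean difference."""
    if alternative not in ("two-sided", "greater", "less"):
        raise ValueError(f"unknown alternative: {alternative!r}")
    rng = random.Random(seed)
    n = len(diffs)
    t_obs = sum(diffs) / n
    n_extreme = 0
    for _ in range(n_flips):
        # Draw an independent random sign for each pair.
        t = sum(rng.choice((1, -1)) * d for d in diffs) / n
        if alternative == "two-sided":
            extreme = abs(t) >= abs(t_obs) - tol
        elif alternative == "greater":
            extreme = t >= t_obs - tol
        else:  # "less"
            extreme = t <= t_obs + tol
        n_extreme += extreme
    return {
        "design": "paired",
        "statistic": "mean difference",
        "alternative": alternative,
        "method": "Monte Carlo",
        "B": n_flips,
        "p_value": (1 + n_extreme) / (1 + n_flips),
    }


# Differences from Example A; the exact two-sided p-value there is 0.4375.
report = sign_flip_mc([2, -1, 3, 0, 4, -2])
print(report)
```

With \(B=9999\) random sign patterns, the Monte Carlo estimate should land close to the exact value, and the returned fields cover what the reporting note asks for.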

Correct use of paired vs unpaired permutation tests rests on matching the randomization scheme to the data-generating design: preserve pairs when pairs exist, and shuffle labels only when samples are independent.
