Permutation Test P Value Calculator


Written by STEM Calculators Team • Published February 15, 2026 • Updated February 24, 2026

Compute a non-parametric permutation test p-value by randomly shuffling labels between two groups and comparing a test statistic (mean difference) against its permutation distribution. Includes an animated histogram and a “shuffle” animation with a Play button.

Paste Group 1 and Group 2 data, choose tail, set number of permutations \(N\), and click Run test. Press Play to animate how the permutation distribution builds.

Group 1 data

Comma-, space-, or newline-separated numbers.

Group 2 data

Comma-, space-, or newline-separated numbers.

Alternative / tail

Test statistic is \(T=\bar{x}_1-\bar{x}_2\).

Permutations \(N\)

Typical: 1,000–10,000. Limit: 300,000.

Histogram bins

Repeatable randomness

Use deterministic seed

Animation speed

Plot overlays

Show observed T and “extreme” region


Permutation distribution (animated) + label shuffle


The histogram builds from the first permutations up to \(N\). “Shuffle” shows one random relabeling of the pooled data.

Click “Run test” to compute a permutation p-value and the permutation distribution.


Permutation tests: non-parametric p-values by label shuffling

A permutation test (also called a randomization test) is a non-parametric way to compute a p-value by repeatedly shuffling group labels and measuring how unusual the observed statistic is under the null hypothesis. It is especially useful when you do not want to assume normality or equal variances.

1) The key assumption: exchangeability under the null

Suppose you have two groups (e.g., treatment vs. control). Under the null hypothesis \(H_0\) (“no treatment effect”), the labels are exchangeable: if there is truly no difference, then every assignment of the pooled values into two groups of sizes \(n_1\) and \(n_2\) is equally likely.

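This is exactly the logic of randomized experiments: if labels were assigned at random, exchangeability holds under \(H_0\) by design, so reshuffling the labels reproduces the assignment mechanism. For observational data, exchangeability is an assumption you have to justify.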

2) Choose a test statistic

The permutation framework works with many statistics (mean difference, median difference, correlation, regression slopes, etc.). This calculator uses the difference in means:

\[ T=\bar{x}_1-\bar{x}_2. \]

Compute the observed statistic \(T_{\text{obs}}\) from your original groups.
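As a minimal sketch of this step, the snippet below computes \(T_{\text{obs}}\) for two small samples. The function name `mean_difference` and the data values are hypothetical and only illustrate the formula; they are not taken from the calculator.

```python
from statistics import fmean

def mean_difference(group1, group2):
    """Return T = mean(group1) - mean(group2)."""
    return fmean(group1) - fmean(group2)

# Hypothetical example data (not from the calculator).
group1 = [12.1, 9.8, 11.4, 10.9]
group2 = [9.2, 8.7, 10.1, 9.5]

t_obs = mean_difference(group1, group2)
print(f"T_obs = {t_obs:.3f}")
```

A positive value means the Group 1 mean is larger; swapping the two groups flips the sign of \(T\).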

3) Build the permutation distribution

Pool the data, shuffle, split back into two groups, recompute the statistic, and repeat many times:

\[ \text{Pool } \{x_1,\dots,x_{n_1},y_1,\dots,y_{n_2}\}, \ \text{shuffle labels, form } (\mathbf{x}^*,\mathbf{y}^*), \ \text{then compute } T^*. \]

After \(N\) permutations you have \(T^{*(1)},\dots,T^{*(N)}\). This collection approximates the permutation distribution of \(T\), that is, the distribution of \(T\) under \(H_0\) given the observed pooled values.
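The pool, shuffle, split, recompute loop can be sketched as follows. This is an illustration under stated assumptions, not the calculator's actual implementation: the function name `permutation_distribution` and its parameters `n_perm` and `seed` are hypothetical, and the seed plays the role of the “deterministic seed” option.

```python
import random
from statistics import fmean

def permutation_distribution(group1, group2, n_perm=1000, seed=None):
    """Return n_perm shuffled-label statistics T* = mean(x*) - mean(y*)."""
    rng = random.Random(seed)            # fixed seed => repeatable shuffles
    pooled = list(group1) + list(group2) # pool both groups
    n1 = len(group1)
    stats = []
    for _ in range(n_perm):
        rng.shuffle(pooled)              # one random relabeling of the pooled data
        x_star, y_star = pooled[:n1], pooled[n1:]  # split back into sizes n1, n2
        stats.append(fmean(x_star) - fmean(y_star))
    return stats

# Hypothetical usage with made-up data.
t_star = permutation_distribution([1, 2, 3], [4, 5, 6], n_perm=200, seed=42)
print(len(t_star), min(t_star), max(t_star))
```

Shuffling the pooled list and taking the first \(n_1\) values as the new Group 1 is equivalent to shuffling the labels while keeping the group sizes fixed.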

4) Tail choice and “extreme” permutations

The p-value depends on the alternative hypothesis:

Two-sided (the group means differ): count permutations with \(|T^*|\ge |T_{\text{obs}}|\).
Right-tailed (Group 1 mean is larger): count permutations with \(T^*\ge T_{\text{obs}}\).
Left-tailed (Group 1 mean is smaller): count permutations with \(T^*\le T_{\text{obs}}\).
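The three counting rules above translate directly into code. The sketch below is illustrative: the function name `count_extreme`, the tail labels, and the small tolerance `tol` are assumptions. The tolerance is there so that a permutation reproducing the observed split is not lost to floating-point rounding.

```python
def count_extreme(t_obs, t_perm, tail="two-sided", tol=1e-12):
    """Count permutation statistics at least as extreme as t_obs."""
    if tail == "two-sided":
        return sum(abs(t) >= abs(t_obs) - tol for t in t_perm)
    if tail == "right":
        return sum(t >= t_obs - tol for t in t_perm)
    if tail == "left":
        return sum(t <= t_obs + tol for t in t_perm)
    raise ValueError("tail must be 'two-sided', 'right', or 'left'")

# Hypothetical permutation statistics and observed value.
t_perm = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(count_extreme(1.0, t_perm, "two-sided"))  # |T*| >= 1 -> 4
print(count_extreme(1.0, t_perm, "right"))      # T* >= 1   -> 2
print(count_extreme(1.0, t_perm, "left"))       # T* <= 1   -> 4
```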

5) Monte Carlo p-value estimate

If \(E\) out of \(N\) permutations are at least as extreme as observed, the standard estimate is:

\[ \hat{p}=\frac{E}{N}. \]

Because \(E/N\) can be exactly \(0\) when \(E=0\), many texts recommend counting the observed labeling itself as one of the permutations, which gives an estimate that is never \(0\):

\[ \hat{p}_{\text{adj}}=\frac{E+1}{N+1}. \]
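To see how the two estimates differ, the short sketch below evaluates both for given counts. The helper name `p_values` is hypothetical.

```python
def p_values(e, n):
    """Return (E/N, (E+1)/(N+1)) for E extreme permutations out of N."""
    if n <= 0 or not 0 <= e <= n:
        raise ValueError("need N > 0 and 0 <= E <= N")
    return e / n, (e + 1) / (n + 1)

# With no extreme permutations the plain estimate is 0,
# while the adjusted one is bounded below by 1/(N+1).
p_hat, p_adj = p_values(0, 999)
print(p_hat, p_adj)   # 0.0 and 0.001

# With many extreme permutations the two estimates are close.
print(p_values(480, 9999))
```

The adjusted estimate can never fall below \(1/(N+1)\), so the smallest p-value you can report is limited by the number of permutations you run.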

6) Exact vs approximate permutation tests (university note)

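If your sample sizes are very small, you can enumerate all \(\binom{n_1+n_2}{n_1}\) distinct labelings (an “exact permutation test”). This count grows quickly: \(n_1=n_2=5\) gives 252 labelings, while \(n_1=n_2=20\) gives more than \(10^{11}\). For moderate or large samples, enumeration is infeasible, so we use a Monte Carlo approximation with large \(N\).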
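For very small samples, full enumeration can be sketched with `itertools.combinations`, which visits every way of choosing which pooled values form Group 1. The function name `exact_permutation_p`, the tail labels, and the tolerance `tol` are assumptions made for illustration.

```python
from itertools import combinations
from math import comb
from statistics import fmean

def exact_permutation_p(group1, group2, tail="two-sided", tol=1e-12):
    """Exact p-value for T = mean(group1) - mean(group2) by full enumeration."""
    pooled = list(group1) + list(group2)
    n1, n2 = len(group1), len(group2)
    total = sum(pooled)
    t_obs = fmean(group1) - fmean(group2)
    extreme = 0
    for idx in combinations(range(len(pooled)), n1):  # every labeling once
        s1 = sum(pooled[i] for i in idx)
        t = s1 / n1 - (total - s1) / n2
        if tail == "two-sided":
            extreme += abs(t) >= abs(t_obs) - tol
        elif tail == "right":
            extreme += t >= t_obs - tol
        elif tail == "left":
            extreme += t <= t_obs + tol
        else:
            raise ValueError("unknown tail")
    return extreme / comb(n1 + n2, n1)

# Hypothetical data: 20 labelings in total, 2 of them as extreme as observed.
print(exact_permutation_p([1, 2, 3], [4, 5, 6]))          # 0.1
print(exact_permutation_p([1, 2, 3], [4, 5, 6], "left"))  # 0.05
```

Because the observed labeling is always among the enumerated ones, the exact p-value is at least \(1/\binom{n_1+n_2}{n_1}\) and is never \(0\).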

7) Interpreting the histogram animation

The histogram shows the permutation distribution of \(T^*\). The observed \(T_{\text{obs}}\) is drawn as a vertical line, and the shaded region indicates what counts as “extreme” for the chosen tail. A smaller p-value means the observed statistic falls in a rare tail region under \(H_0\).
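How a histogram of this kind can be built from the \(T^*\) values is sketched below using equal-width bins. The calculator's actual binning rule is not documented here, so `histogram_counts` and its handling of the maximum value are assumptions.

```python
def histogram_counts(values, bins=10):
    """Return (edges, counts) for equal-width bins spanning min..max of values."""
    lo, hi = min(values), max(values)
    if hi == lo:                 # degenerate case: all values identical
        hi = lo + 1.0
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        k = min(int((v - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[k] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts

# Hypothetical permutation statistics.
edges, counts = histogram_counts([0, 1, 2, 3, 4], bins=2)
print(edges, counts)   # edges 0, 2, 4 with counts [2, 3]
```

For a right-tailed test, the shaded “extreme” region corresponds to the values at or to the right of \(T_{\text{obs}}\), and the p-value is roughly the shaded share of the total count.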

8) Practical tips

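Increase \(N\) to reduce Monte Carlo error: the standard error of \(\hat{p}\) is roughly \(\sqrt{\hat{p}(1-\hat{p})/N}\), and the histogram becomes smoother.
Permutation tests reflect your chosen statistic: the mean difference is sensitive to location shifts but may miss differences in variance or shape.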
For paired data, do not shuffle labels across individuals; use a paired permutation approach (e.g., sign flipping).
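For the paired case mentioned in the last tip, a sign-flipping test can be sketched as follows: under \(H_0\) each within-pair difference is equally likely to be positive or negative, so the signs are flipped at random instead of shuffling labels across individuals. The function name `paired_sign_flip_p`, its parameters, and the choice of a two-sided test with the \((E+1)/(N+1)\) estimate are assumptions for illustration.

```python
import random
from statistics import fmean

def paired_sign_flip_p(before, after, n_perm=10000, seed=None, tol=1e-12):
    """Two-sided Monte Carlo p-value for the mean paired difference."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(after, before)]  # one difference per pair
    t_obs = abs(fmean(diffs))
    extreme = 0
    for _ in range(n_perm):
        # Flip the sign of each difference independently with probability 1/2.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(fmean(flipped)) >= t_obs - tol:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)

# Hypothetical before/after measurements on the same five individuals.
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [11.0, 13.5, 9.5, 12.0, 14.0]
print(paired_sign_flip_p(before, after, n_perm=2000, seed=0))
```

With only five pairs there are \(2^5=32\) sign patterns, so the smallest two-sided p-value obtainable is about \(2/32\), no matter how many random flips are drawn.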
