Problem
The task “make lmperm output into anova table” means converting the results of a permutation-based linear model (such as one fitted with the R package lmPerm) into the familiar ANOVA table layout: Source, \(df\), \(SS\), \(MS\), \(F\), and a p-value (here, a permutation p-value).
What an ANOVA table contains
An ANOVA table partitions variability into components attributed to model terms and an error (residual) component:
| Source | \(df\) | \(SS\) | \(MS\) | \(F\) | p-value |
|---|---|---|---|---|---|
| Term 1 (e.g., factor A) | \(df_A\) | \(SS_A\) | \(MS_A=SS_A/df_A\) | \(F_A=MS_A/MS_E\) | Permutation p |
| Term 2 (e.g., factor B) | \(df_B\) | \(SS_B\) | \(MS_B=SS_B/df_B\) | \(F_B=MS_B/MS_E\) | Permutation p |
| Error (Residuals) | \(df_E\) | \(SS_E=SSE\) | \(MS_E=SS_E/df_E\) | — | — |
| Total | \(df_T\) | \(SS_T\) | — | — | — |
Core computations behind the table
The mechanical part of making an ANOVA table is the same whether p-values come from a classical \(F\) distribution or from permutations. The difference is the final p-value column.
1) Decide which sums of squares are intended
- Sequential (Type I): each term is tested as it enters the model; the order of terms matters.
- Partial / marginal (common in regression): each term is tested after adjusting for the others (compare full vs reduced models).
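Many permutation-model outputs correspond to comparing nested models. The nested-model method below, which drops one term at a time from the full model, yields partial sums of squares. For sequential sums of squares, apply the same subtraction but compare the model containing the terms up to and including a given term with the model containing only the terms before it. In unbalanced designs the two schemes generally give different \(SS\), and partial \(SS\) need not add up to \(SS_T\).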
2) Use nested models to get \(SS\) for each term
Let the full model contain all terms of interest, and let \(SSE_{\text{full}}\) be its residual sum of squares. For a specific term \(j\), fit a reduced model that removes term \(j\) (but keeps all other terms), and compute \(SSE_{\text{reduced},j}\).
\[ SS_j = SSE_{\text{reduced},j} - SSE_{\text{full}} \]
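The degrees of freedom for the term equal the number of parameters removed from the full model to form the reduced model, which is the difference in residual degrees of freedom: \[ df_j = df_{\text{reduced},j} - df_{\text{full}} \] Here \(df_{\text{reduced},j}\) and \(df_{\text{full}}\) are the residual degrees of freedom of the reduced and full models, so \(df_j\) is positive.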
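The nested-model subtraction can be sketched in plain Python. This is an illustration under stated assumptions, not lmPerm's own computation: the helper `sse`, the dummy-coded columns, and the toy data are all hypothetical. The helper obtains each residual sum of squares by orthogonalizing the design columns (Gram-Schmidt), which matches a least-squares fit for a small, well-conditioned design.

```python
def sse(columns, y):
    """Return (residual sum of squares, residual df) for y regressed on columns."""
    basis = []  # orthonormal basis of the column space
    for col in columns:
        v = [float(x) for x in col]
        scale = sum(a * a for a in v)
        for q in basis:
            c = sum(a * b for a, b in zip(v, q))
            v = [a - c * b for a, b in zip(v, q)]
        norm2 = sum(a * a for a in v)
        if scale > 0 and norm2 > 1e-12 * scale:  # skip linearly dependent columns
            basis.append([a / norm2 ** 0.5 for a in v])
    resid = [float(x) for x in y]
    for q in basis:
        c = sum(a * b for a, b in zip(resid, q))
        resid = [a - c * b for a, b in zip(resid, q)]
    return sum(r * r for r in resid), len(y) - len(basis)


# Hypothetical balanced toy data: factor A (3 levels) crossed with factor B (2 levels)
A = ["a1"] * 4 + ["a2"] * 4 + ["a3"] * 4
B = ["b1", "b1", "b2", "b2"] * 3
y = [4.1, 5.0, 6.2, 5.7, 7.9, 8.4, 9.1, 9.8, 5.5, 6.1, 7.0, 7.7]

intercept = [1.0] * len(y)
A_cols = [[1.0 if a == lev else 0.0 for a in A] for lev in ("a2", "a3")]
B_cols = [[1.0 if b == "b2" else 0.0 for b in B]]

sse_full, df_full = sse([intercept] + A_cols + B_cols, y)
sse_no_A, df_no_A = sse([intercept] + B_cols, y)  # reduced model without A
sse_no_B, df_no_B = sse([intercept] + A_cols, y)  # reduced model without B

ss_A, df_A = sse_no_A - sse_full, df_no_A - df_full
ss_B, df_B = sse_no_B - sse_full, df_no_B - df_full
print(f"A: SS={ss_A:.3f}, df={df_A}")
print(f"B: SS={ss_B:.3f}, df={df_B}")
print(f"Residuals: SS={sse_full:.3f}, df={df_full}")
```

Because this toy design is balanced, the two term sums of squares and the residual sum of squares add up to the total; with unbalanced data that identity generally fails for partial sums of squares.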
3) Compute the error row
The error (residual) sum of squares is: \[ SS_E = SSE_{\text{full}} \]
If the sample size is \(n\) and the full model has \(p\) estimated parameters (including the intercept), then: \[ df_E = n - p \]
Then: \[ MS_E = \frac{SS_E}{df_E} \]
4) Compute \(MS\) and \(F\) for each term
\[ MS_j = \frac{SS_j}{df_j} \qquad F_j = \frac{MS_j}{MS_E} \]
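Steps 3 and 4 are plain arithmetic once the sums of squares and degrees of freedom are known. A minimal Python sketch, with a hypothetical function name and made-up input values:

```python
def mean_squares_and_f(terms, ss_error, df_error):
    """terms maps a term name to (SS, df); returns ({name: (MS, F)}, MS_E)."""
    ms_error = ss_error / df_error
    result = {}
    for name, (ss, df) in terms.items():
        ms = ss / df
        result[name] = (ms, ms / ms_error)
    return result, ms_error


# Hypothetical inputs: n = 24 observations, p = 4 parameters in the full model
n, p = 24, 4
df_error = n - p  # 20
stats, ms_e = mean_squares_and_f(
    {"A": (50.0, 2), "B": (30.0, 1)}, ss_error=200.0, df_error=df_error
)
for name, (ms, f) in stats.items():
    print(f"{name}: MS={ms:.2f}, F={f:.2f}")
print(f"MS_E={ms_e:.2f}")
```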
5) Replace classical p-values with permutation p-values
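In a permutation approach, the observed statistic \(F_{j,\text{obs}}\) is compared to its permutation distribution \(F_{j}^{(1)},\dots,F_{j}^{(B)}\), where \(B\) is the number of permutations. A standard right-tail permutation p-value is: \[ p_j = \frac{1 + \sum_{b=1}^{B}\mathbf{1}\!\left(F_{j}^{(b)} \ge F_{j,\text{obs}}\right)}{B+1} \] The added 1 in the numerator and denominator counts the observed arrangement as one of the permutations, so the p-value is never exactly zero.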
This is the key step for “make lmperm output into anova table”: keep \(SS\), \(df\), \(MS\), and \(F\) in the usual ANOVA format, but report p-values from permutations.
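The right-tail formula translates directly into code. In this Python sketch the function name and the list of permuted statistics are hypothetical; in practice the permuted \(F\) values would come from refitting the model on permuted data.

```python
def permutation_p_value(f_obs, f_perm):
    """Right-tail permutation p-value: (1 + #{F_perm >= F_obs}) / (B + 1)."""
    exceed = sum(1 for f in f_perm if f >= f_obs)
    return (1 + exceed) / (len(f_perm) + 1)


# Hypothetical permuted F statistics (B = 9) and observed statistic
f_perm = [0.3, 1.1, 5.2, 0.8, 2.4, 0.1, 4.9, 1.7, 0.6]
p = permutation_p_value(4.8, f_perm)
print(p)  # two permuted values are >= 4.8, so p = (1 + 2) / (9 + 1)
```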
[Figure: how sums of squares partition variability]
Worked miniature example (table formatting)
Suppose a model has two terms, A and B, fitted to \(n=28\) observations (so \(df_T = n-1 = 27\)). After computing \(SS\) via nested models and obtaining permutation p-values, an ANOVA table could be presented as follows (all values are illustrative):
| Source | \(df\) | \(SS\) | \(MS\) | \(F\) | Permutation p |
|---|---|---|---|---|---|
| A | 2 | 120.0 | \(120.0/2=60.0\) | \(60.0/12.5=4.8\) | 0.031 |
| B | 1 | 18.0 | \(18.0/1=18.0\) | \(18.0/12.5=1.44\) | 0.228 |
| Residuals | 24 | 300.0 | \(300.0/24=12.5\) | — | — |
| Total | 27 | 438.0 | — | — | — |
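As a quick arithmetic check, the following Python sketch rebuilds the table from its \(df\), \(SS\), and p-value entries. The permutation p-values are copied from the table rather than computed, and the Total row is obtained by summing the rows above it, which works for these illustrative numbers but is not guaranteed for partial sums of squares in unbalanced designs.

```python
# (source, df, SS, permutation p) taken from the illustrative table
terms = [("A", 2, 120.0, 0.031), ("B", 1, 18.0, 0.228)]
df_e, ss_e = 24, 300.0
ms_e = ss_e / df_e

lines = [
    "| Source | df | SS | MS | F | Permutation p |",
    "|---|---|---|---|---|---|",
]
for name, df, ss, p in terms:
    ms = ss / df
    lines.append(f"| {name} | {df} | {ss:.1f} | {ms:.1f} | {ms / ms_e:.2f} | {p:.3f} |")
lines.append(f"| Residuals | {df_e} | {ss_e:.1f} | {ms_e:.1f} | | |")

df_t = sum(t[1] for t in terms) + df_e
ss_t = sum(t[2] for t in terms) + ss_e
lines.append(f"| Total | {df_t} | {ss_t:.1f} | | | |")

table = "\n".join(lines)
print(table)
```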
Practical checklist for “make lmperm output into anova table”
- Choose the testing scheme (sequential Type I, or partial via reduced vs. full models) and list the model terms accordingly.
- For each term \(j\), compute \(SS_j\) from nested-model \(SSE\) differences and compute \(df_j\).
- Set \(SS_E=SSE_{\text{full}}\) and \(df_E=n-p\); compute \(MS_E\).
- Compute \(MS_j\) and \(F_j\) using \(F_j=MS_j/MS_E\).
- Fill the p-value column using permutation p-values for each \(F_j\).
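The checklist can be strung together in one short Python sketch. Everything here is an assumption made for illustration: the function names, the toy data, and in particular the permutation scheme, which simply shuffles the response values. lmPerm and other tools may permute differently and apply their own stopping rules, so the p-values from this sketch need not match theirs.

```python
import random


def sse(columns, y):
    """Return (residual sum of squares, residual df) via Gram-Schmidt projection."""
    basis = []
    for col in columns:
        v = [float(x) for x in col]
        scale = sum(a * a for a in v)
        for q in basis:
            c = sum(a * b for a, b in zip(v, q))
            v = [a - c * b for a, b in zip(v, q)]
        norm2 = sum(a * a for a in v)
        if scale > 0 and norm2 > 1e-12 * scale:  # skip dependent columns
            basis.append([a / norm2 ** 0.5 for a in v])
    resid = [float(x) for x in y]
    for q in basis:
        c = sum(a * b for a, b in zip(resid, q))
        resid = [a - c * b for a, b in zip(resid, q)]
    return sum(r * r for r in resid), len(y) - len(basis)


def f_stats(term_cols, y):
    """Partial SS, df, and F for each term; term_cols maps name -> list of columns."""
    intercept = [1.0] * len(y)
    full = [intercept] + [c for cols in term_cols.values() for c in cols]
    sse_full, df_full = sse(full, y)
    ms_e = sse_full / df_full
    out = {}
    for name in term_cols:
        reduced = [intercept] + [
            c for other, cols in term_cols.items() if other != name for c in cols
        ]
        sse_red, df_red = sse(reduced, y)
        ss, df = sse_red - sse_full, df_red - df_full
        out[name] = (df, ss, (ss / df) / ms_e)
    return out, (df_full, sse_full, ms_e)


def permutation_anova(term_cols, y, n_perm=999, seed=0):
    """Rows of (source, df, SS, MS, F, p); p comes from shuffling y (an assumption)."""
    rng = random.Random(seed)
    observed, (df_e, ss_e, ms_e) = f_stats(term_cols, y)
    exceed = {name: 0 for name in term_cols}
    y_perm = list(y)
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        permuted, _error = f_stats(term_cols, y_perm)
        for name in term_cols:
            if permuted[name][2] >= observed[name][2]:
                exceed[name] += 1
    rows = [
        (name, df, ss, ss / df, f, (1 + exceed[name]) / (n_perm + 1))
        for name, (df, ss, f) in observed.items()
    ]
    rows.append(("Residuals", df_e, ss_e, ms_e, None, None))
    return rows


# Hypothetical balanced toy data: factor A (3 levels) crossed with factor B (2 levels)
A = ["a1"] * 4 + ["a2"] * 4 + ["a3"] * 4
B = ["b1", "b1", "b2", "b2"] * 3
y = [4.1, 5.0, 6.2, 5.7, 7.9, 8.4, 9.1, 9.8, 5.5, 6.1, 7.0, 7.7]
term_cols = {
    "A": [[1.0 if a == lev else 0.0 for a in A] for lev in ("a2", "a3")],
    "B": [[1.0 if b == "b2" else 0.0 for b in B]],
}

rows = permutation_anova(term_cols, y)
for row in rows:
    print(row)
```

Shuffling the raw response is the simplest scheme to write down; when a term is tested in the presence of other terms, schemes that permute residuals from the reduced model are a common alternative.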