Z-scores & outlier checks
This topic answers two practical questions:
(1) how unusual is each observation compared with the rest of the dataset, and
(2) which observations should be flagged as potential outliers using a clear rule.
Key idea: An “outlier” flag is not proof that a value is wrong. It is a signal to
double-check the measurement, data entry, units, and whether the value comes from a different process.
1) Z-score: definition and interpretation
A z-score measures how many standard deviations an observation is from the mean.
\[
z_i = \frac{x_i - \bar{x}}{s}
\]
- \(z = 0\) means the value equals the mean.
- \(z = 1\) means the value is one standard deviation above the mean.
- \(z = -2\) means the value is two standard deviations below the mean.
When z-scores are most meaningful: when the distribution is roughly symmetric rather than strongly skewed.
For strongly skewed data, robust methods (such as the IQR rule) are usually better for outlier checks.
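As a worked instance of the definition, take a small hypothetical sample made up purely for illustration: 2, 4, 4, 5, 10. Its mean is \(\bar{x} = 5\). The squared deviations 9, 1, 1, 0, 25 sum to 36, so with the \(n - 1\) denominator the sample standard deviation is \(s = \sqrt{36/4} = 3\). The z-scores of the largest and smallest values are then:
\[
x = 10:\quad z = \frac{10 - 5}{3} \approx 1.67
\qquad\qquad
x = 2:\quad z = \frac{2 - 5}{3} = -1
\]
So 10 lies about 1.67 standard deviations above the mean, and 2 lies exactly one standard deviation below it.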
2) Mean and sample standard deviation
Z-scores depend on the mean and the (sample) standard deviation.
\[
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\qquad\text{and}\qquad
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
\]
Important edge cases:
- If \(n < 2\), the sample standard deviation cannot be computed (with \(n = 1\) the denominator \(n - 1\) is zero).
- If \(s = 0\) (all values identical), z-scores are undefined because the denominator is zero.
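The formulas and edge cases above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind this topic's visualizations: the function name `z_scores` and the sample data are hypothetical, and the standard-library `statistics.stdev` is used because it divides by \(n - 1\).

```python
import statistics


def z_scores(values):
    """Return one z-score per value, using the sample standard deviation (n - 1)."""
    if len(values) < 2:
        # Edge case: s cannot be computed from fewer than two observations.
        raise ValueError("need at least two observations to compute s")
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation, divides by n - 1
    if s == 0:
        # Edge case: all values identical, so the denominator would be zero.
        raise ValueError("all values are identical; z-scores are undefined")
    return [(x - mean) / s for x in values]


data = [2, 4, 4, 5, 10]  # hypothetical measurements, for illustration only
print([round(z, 2) for z in z_scores(data)])  # [-1.0, -0.33, -0.33, 0.0, 1.67]
```

Raising an error for the two edge cases is one possible design choice; returning missing values instead may suit a data pipeline better.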
3) Outlier rule A: Z-threshold
A common rule flags an observation if its absolute z-score exceeds a chosen threshold.
\[
\text{Outlier (Z rule)} \iff |z_i| > c
\]
Typical choices for \(c\):
- \(c = 2.0\) (lenient): flags more values
- \(c = 2.5\) (moderate)
- \(c = 3.0\) (strict): flags fewer values
Equivalent cutoff on the original scale: the rule \(|z| > c\) flags exactly the values that satisfy
\[
x < \bar{x} - c\cdot s \quad\text{or}\quad x > \bar{x} + c\cdot s
\]
This is why the histogram visualization shades the two tail regions beyond these cutoffs.
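A short Python sketch of the Z rule, expressed through the cutoffs on the original scale. The function name `z_rule_outliers` and the data are hypothetical, chosen only to illustrate the rule; the default `c=3.0` is just one of the typical thresholds.

```python
import statistics


def z_rule_outliers(values, c=3.0):
    """Return (lower cutoff, upper cutoff, flagged values) for the rule |z| > c."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # raises StatisticsError if fewer than 2 values
    if s == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    lower = mean - c * s
    upper = mean + c * s
    # |z| > c is the same as lying strictly outside [lower, upper].
    flagged = [x for x in values if x < lower or x > upper]
    return lower, upper, flagged


# Hypothetical sample: eleven typical values and one extreme value.
data = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11, 10, 30]
lower, upper, flagged = z_rule_outliers(data, c=3.0)
print(flagged)  # [30]
```

Note that the extreme value itself inflates both \(\bar{x}\) and \(s\), which widens the cutoffs; in this sample 30 only just exceeds the upper cutoff at \(c = 3.0\).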
4) Outlier rule B: IQR (Tukey fences)
The IQR rule is more robust than the Z rule because it uses quartiles instead of the mean and standard deviation.
\[
\mathrm{IQR} = Q_3 - Q_1
\]
\[
\text{Lower fence} = Q_1 - k\cdot \mathrm{IQR}
\qquad
\text{Upper fence} = Q_3 + k\cdot \mathrm{IQR}
\]
\[
\text{Outlier (IQR rule)} \iff x < \text{Lower fence} \ \text{or}\ x > \text{Upper fence}
\]
- \(Q_1\) is the 25th percentile; \(Q_3\) is the 75th percentile.
- \(k = 1.5\) is the standard choice.
- \(k = 3\) is a stricter rule (flags only very extreme values).
Why “robust”? Quartiles are less sensitive to extreme values than \(\bar{x}\) and \(s\).
If your data are skewed (common in biological measurements), the IQR rule often gives a more sensible outlier flag.
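The fences can be sketched in Python as follows. The function name `iqr_outliers` and the data are hypothetical. Quartile conventions differ between tools, and this topic does not state which one its plots use, so `method="inclusive"` (linear interpolation between sorted values) is an assumption here; other conventions can give slightly different fences.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return (lower fence, upper fence, flagged values) for Tukey's IQR rule."""
    # quantiles(..., n=4) returns the three cut points [Q1, median, Q3].
    # method="inclusive" is an assumed quartile convention, not the only one.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    flagged = [x for x in values if x < lower or x > upper]
    return lower, upper, flagged


# Hypothetical sample: eleven typical values and one extreme value.
data = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11, 10, 30]
lower, upper, flagged = iqr_outliers(data, k=1.5)
print(lower, upper, flagged)  # 8.5 12.5 [30]
```

Because the quartiles ignore how far the extreme value lies from the rest, the fences stay close to the bulk of the data and 30 is flagged by a wide margin.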
5) How to read the visualizations
Histogram + shaded z-tail regions
- The bars show the frequency of values in bins.
- Two shaded regions indicate where values would be flagged by the Z-threshold (below \(\bar{x} - c\cdot s\) and above \(\bar{x} + c\cdot s\)).
- Dots represent individual observations; clicking a dot highlights the corresponding table row.
Boxplot + IQR fences
- The box spans from \(Q_1\) to \(Q_3\), with the median inside.
- Fences are drawn at \(Q_1 - k\cdot \mathrm{IQR}\) and \(Q_3 + k\cdot \mathrm{IQR}\).
- Points beyond the fences are IQR outliers; clicking a point highlights its row in the table.
Interactivity note: The plots support hover tooltips and zoom/pan (wheel to zoom, drag to pan).
This helps inspect exact values without shrinking the graphs on small screens.
6) Practical checklist for biology data
- Confirm units before interpreting outliers (for example, mg vs g, mL vs L).
- Check data entry issues (swapped digits, misplaced decimals, headers copied into data).
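- Consider the mechanism: an extreme value may be a genuine biological observation rather than an error.
- Small samples: with very small \(n\), any rule can be unstable; use judgment and context. For example, with the sample standard deviation no \(|z|\) can exceed \((n - 1)/\sqrt{n}\), so \(c = 3.0\) can never flag a value when \(n \le 10\).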
- Skewed distributions: prefer the IQR rule; consider a transformation when appropriate.
Best practice: use the outlier flag to trigger a review, not an automatic deletion.
If you exclude points, document the rule and the reason.