Meaning of standard deviation
what does standard deviation mean refers to the typical size of deviations from the mean, expressed in the same units as the data. Larger standard deviation corresponds to a more widely spread dataset; smaller standard deviation corresponds to a more tightly clustered dataset.
Core interpretation. Standard deviation is a root-mean-square distance from the mean: it summarizes how far values tend to lie from the center when “distance” is measured by squared deviations and then square-rooted back to the original units.
Variance and root-mean-square deviation
For a population with mean \(\mu\), the variance is the mean squared deviation from \(\mu\), and the population standard deviation is its square root:
\[ \sigma^{2} = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2, \qquad \sigma = \sqrt{\sigma^{2}}. \]Squaring makes all deviations nonnegative and emphasizes larger departures. The square root returns the measure to the original unit scale of the data.
Population and sample standard deviation
For a sample \(x_1,\dots,x_n\) with sample mean \(\bar{x}\), the sample variance and sample standard deviation are
\[ s^{2} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2, \qquad s = \sqrt{s^{2}}. \]The factor \(n-1\) (Bessel’s correction) produces an unbiased estimator of the population variance under standard random-sampling assumptions.
| Context | Center | Variance | Standard deviation | Typical use |
|---|---|---|---|---|
| Population | \(\mu\) | \(\sigma^2 = \dfrac{1}{N}\sum (x_i-\mu)^2\) | \(\sigma = \sqrt{\sigma^2}\) | Full population is available; dispersion is a fixed attribute of the population. |
| Sample | \(\bar{x}\) | \(s^2 = \dfrac{1}{n-1}\sum (x_i-\bar{x})^2\) | \(s = \sqrt{s^2}\) | Population dispersion is unknown; \(s\) estimates \(\sigma\). |
Interpretation on a distribution graph
On a histogram or smooth density curve, standard deviation corresponds to horizontal spread around the mean. Under a normal distribution, standard deviation has an especially direct probabilistic meaning through well-known coverage percentages.
Numerical example and “typical distance”
Consider the dataset \(2, 4, 4, 4, 5, 5, 7, 9\), with \(\bar{x} = 5\). Squared deviations from the mean sum to \[ \sum (x_i-\bar{x})^2 = (2-5)^2 + (4-5)^2 + (4-5)^2 + (4-5)^2 + (5-5)^2 + (5-5)^2 + (7-5)^2 + (9-5)^2 = 32. \]
The population variance and standard deviation for these eight values (treated as the full population) are \[ \sigma^2 = \frac{32}{8} = 4, \qquad \sigma = \sqrt{4} = 2. \] The sample standard deviation (treated as a sample) is \[ s^2 = \frac{32}{7} \approx 4.5714, \qquad s \approx \sqrt{4.5714} \approx 2.1381. \]
The numerical outcome \(s \approx 2.14\) indicates that a typical deviation from the mean is a bit above 2 units on the original scale of the data.
Standard deviation versus standard error
Standard deviation describes spread among individual observations. Standard error describes spread of an estimator across repeated samples. For the sample mean, a common relation is \[ \mathrm{SE}(\bar{x}) = \frac{s}{\sqrt{n}}. \] The distinction matters because a large dataset can have a large standard deviation while the sample mean still has a small standard error.
Common pitfalls
Standard deviation is sensitive to outliers because deviations are squared. Standard deviation depends on measurement units, so rescaling data by a factor \(c\) rescales the standard deviation by \(|c|\). Strong skewness or multimodality can make a single-number spread summary incomplete; in such settings, dispersion is often reported alongside shape summaries (histogram, box plot, or quantiles) rather than in isolation.