Formula of the Variance (Population and Sample)

What is the formula of the variance, and how is it computed for a population versus a sample?


Meaning of variance

The formula of the variance expresses numerical spread by averaging squared deviations from the mean. Squaring makes positive and negative deviations contribute equally and gives extra weight to values farther from the mean.

Units: variance is measured in squared units (for example, \(\text{cm}^2\) if the data are in cm). Standard deviation is the square root of variance and returns to the original units.

Core formulas (population and sample)

| Setting | Notation | Variance formula | Interpretation |
| --- | --- | --- | --- |
| Population (all values) | \(x_1, x_2, \dots, x_N\), mean \( \mu \) | \(\sigma^2 = \dfrac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2\) | Average squared deviation across the entire population. |
| Sample (subset, unbiased) | \(x_1, x_2, \dots, x_n\), mean \( \bar{x} \) | \(s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2\) | Unbiased estimator of \(\sigma^2\) under random sampling. |
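
The two formulas above can be sketched directly in Python. This is a minimal illustration rather than a reference implementation; the function names `population_variance` and `sample_variance` are hypothetical and chosen only for this example.

```python
from typing import Sequence


def population_variance(values: Sequence[float]) -> float:
    """Average squared deviation from the mean, dividing by N."""
    n = len(values)
    if n == 0:
        raise ValueError("population variance needs at least one value")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def sample_variance(values: Sequence[float]) -> float:
    """Sum of squared deviations from the sample mean, dividing by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)
```

Python's standard library provides `statistics.pvariance` and `statistics.variance` for the same two quantities, so hand-written versions like these are mainly useful for seeing where the denominators differ.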

Computational (shortcut) forms

The same variance formulas can be rewritten so that the mean is subtracted once from a sum of squares instead of from every value:

\[ \sigma^2 = \left(\frac{1}{N}\sum_{i=1}^{N}x_i^2\right) - \mu^2 \]

\[ s^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n}x_i^2 - n\bar{x}^2\right) \]

Both forms are algebraically equivalent to averaging squared deviations; the difference lies only in how the arithmetic is organized.
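
A short numerical check can make the equivalence concrete. The sketch below assumes the five values \(2, 4, 4, 4, 6\) used in the worked example of this answer; the variable names are illustrative only.

```python
import math

data = [2, 4, 4, 4, 6]  # same values as the worked example in this answer
n = len(data)
mean = sum(data) / n
sum_sq = sum(x * x for x in data)  # sum of x_i^2

# Definitional forms: work from the squared deviations.
pop_def = sum((x - mean) ** 2 for x in data) / n
samp_def = sum((x - mean) ** 2 for x in data) / (n - 1)

# Shortcut forms: work from the sum of squares and the mean.
pop_short = sum_sq / n - mean ** 2
samp_short = (sum_sq - n * mean ** 2) / (n - 1)

# Compare with a tolerance because floating-point arithmetic is inexact.
print(math.isclose(pop_def, pop_short), math.isclose(samp_def, samp_short))
```

In floating-point arithmetic the shortcut forms subtract two nearly equal quantities when the mean is large relative to the spread, so the definitional form is usually the safer choice in code.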

Worked example with interpretation

Data (ungrouped): \(2, 4, 4, 4, 6\). The mean is \( \bar{x} = \dfrac{2+4+4+4+6}{5} = \dfrac{20}{5} = 4 \).

\[ \sum (x_i-\bar{x})^2 = (2-4)^2 + (4-4)^2 + (4-4)^2 + (4-4)^2 + (6-4)^2 \]

\[ \sum (x_i-\bar{x})^2 = 4 + 0 + 0 + 0 + 4 = 8 \]

\[ \sigma^2 = \frac{8}{5} = 1.6 \qquad\text{and}\qquad s^2 = \frac{8}{4} = 2 \]

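The sample variance \(s^2\) exceeds the population variance \(\sigma^2\) for the same numbers because the same sum of squares, \(8\), is divided by \(n-1=4\) instead of \(N=5\). The smaller denominator compensates for measuring deviations from the sample mean \(\bar{x}\), which tend to be smaller than deviations from the unknown population mean \(\mu\).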
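
The worked example can be cross-checked with Python's standard `statistics` module, where `pvariance` divides by \(N\) and `variance` divides by \(n-1\). This is only a verification sketch for the five values above.

```python
import statistics

data = [2, 4, 4, 4, 6]

pop_var = statistics.pvariance(data)  # population variance, divides by N
samp_var = statistics.variance(data)  # sample variance, divides by n - 1

# Standard deviations are the square roots and return to the original units.
pop_sd = statistics.pstdev(data)
samp_sd = statistics.stdev(data)

print(pop_var, samp_var)
print(pop_sd, samp_sd)
```

For any data set with at least two values, the two results differ by the factor \(n/(n-1)\), which is \(5/4\) here.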

Visualization of squared deviations

Figure: Variance as squared deviations from the mean. Left panel: data on a number line (mean at 4) with each deviation (x − mean). Right panel: the corresponding squared deviations (x − mean)² as bars. Sum of squares = 4 + 0 + 0 + 0 + 4 = 8.
The variance formula averages squared deviations from the mean. The data values \(2,4,4,4,6\) have mean \(4\). Deviations are \(-2,0,0,0,+2\), squared deviations are \(4,0,0,0,4\), and their sum is \(8\), which leads to \(\sigma^2=8/5\) for a population and \(s^2=8/4\) for an unbiased sample variance.

Grouped-data adaptation (frequency tables)

When values are presented with frequencies (or grouped into classes), the same variance concept is applied using representative values and weights. For a frequency table with distinct values \(x_j\) and frequencies \(f_j\), the total count is \(n=\sum f_j\) and the mean is \( \bar{x}=\dfrac{\sum f_j x_j}{n} \).

\[ s^2 = \frac{1}{n-1}\sum_{j} f_j\,(x_j-\bar{x})^2 \]

For grouped classes, the representative value is often the class midpoint; the resulting variance approximates the variance of the original raw data.
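
The frequency-table formula can be sketched as follows. The function name `frequency_sample_variance` and the dictionary layout (value mapped to frequency) are assumptions made for this illustration; the table restates the worked example's data \(2, 4, 4, 4, 6\) as frequencies.

```python
from typing import Mapping


def frequency_sample_variance(freq: Mapping[float, int]) -> float:
    """Sample variance from a {value: frequency} table, dividing by n - 1."""
    n = sum(freq.values())  # total count, n = sum of f_j
    if n < 2:
        raise ValueError("sample variance needs a total count of at least two")
    mean = sum(f * x for x, f in freq.items()) / n
    return sum(f * (x - mean) ** 2 for x, f in freq.items()) / (n - 1)


# The value 4 occurs three times; 2 and 6 occur once each.
table = {2: 1, 4: 3, 6: 1}
print(frequency_sample_variance(table))  # 2.0
```

For grouped classes, the keys would be class midpoints instead of exact values, and the result would then be an approximation.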

Common pitfalls

  • Squared units are expected; standard deviation \(s=\sqrt{s^2}\) restores the original units.
  • Dividing by \(N\) gives the population variance; dividing by \(n-1\) gives the unbiased sample variance. Mixing the two denominators changes the numerical value.
  • Rounding the mean too early can shift the sum of squared deviations; higher precision in intermediate calculations reduces rounding error.
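
The rounding pitfall can be seen in a small sketch. The three data values below are hypothetical, chosen only because their mean, \(5/3\), is not a short decimal.

```python
data = [1, 2, 2]  # hypothetical values with mean 5/3
n = len(data)

exact_mean = sum(data) / n
rounded_mean = round(exact_mean, 1)  # 1.7, rounded before the deviations are taken

# Sum of squared deviations about the full-precision mean and the rounded mean.
ss_exact = sum((x - exact_mean) ** 2 for x in data)
ss_rounded = sum((x - rounded_mean) ** 2 for x in data)

print(ss_exact, ss_rounded)
```

The sum of squared deviations is smallest when taken about the exact mean, so a rounded mean can only inflate it; here the inflation equals \(n\) times the squared rounding error.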