How do you find the median of a data set, including cases with an odd or even number of observations and data given as a frequency table?

The median is the middle ordered value for odd sample size and the average of the two middle ordered values for even sample size, with analogous position rules in frequency tables and grouped-data approximations.

How to Find the Median (Odd, Even, Frequency Tables, Grouped Data)

Accepted answer

Median as a positional center

The median is a measure of central tendency defined by position rather than by arithmetic balance. It is the value that splits an ordered data set so that at least half of the observations are at or below it and at least half are at or above it. In percentile language, the median is the 50th percentile.

The procedure always starts by arranging the data from smallest to largest; the rule applied afterwards depends on whether the number of observations is odd or even.

Ungrouped data (raw list of observations)

Let \(x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}\) denote the ordered observations (order statistics) from a sample of size \(n\).

Odd sample size

When \(n\) is odd, the median equals the single middle ordered observation:

\[ \operatorname{Median} = x_{\left(\frac{n+1}{2}\right)} \qquad (n \text{ odd}) \]

Example (already ordered): 3, 7, 7, 9, 12 has \(n=5\), so the median is \(x_{(3)}=7\).

Even sample size

When \(n\) is even, there are two middle ordered observations. The conventional median is their average:

\[ \operatorname{Median} = \frac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}}{2} \qquad (n \text{ even}) \]

Example (already ordered): 2, 4, 7, 10, 13, 18 has \(n=6\), so the median is \(\frac{7+10}{2}=8.5\). The median need not be an observed data value when \(n\) is even.
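The odd and even rules above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `median_of_list` is hypothetical and chosen only for this example.

```python
def median_of_list(values):
    """Median of a raw list of numbers (illustrative helper, not from the original answer)."""
    ordered = sorted(values)      # the rules apply to the ordered list
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2                  # 0-based index of the upper middle element
    if n % 2 == 1:
        # odd n: single middle value, x_((n+1)/2) in 1-based notation
        return ordered[mid]
    # even n: average of x_(n/2) and x_(n/2 + 1) in 1-based notation
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of_list([3, 7, 7, 9, 12]))        # 7
print(median_of_list([2, 4, 7, 10, 13, 18]))   # 8.5
```

Sorting inside the function means the input does not have to be ordered in advance.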

Visualization of the “middle” rule

Figure (not reproduced here): ordered values shown on number lines. The median is the middle ordered observation for odd sample size and the midpoint between the two central ordered observations for even sample size.

Frequency tables (ungrouped values with counts)

A frequency table lists distinct values \(v_1 < v_2 < \dots < v_k\) with frequencies \(f_1, f_2, \dots, f_k\), giving total \(N=\sum_{i=1}^{k} f_i\). The median is located by position, using cumulative frequency \(F_j=\sum_{i=1}^{j} f_i\).

When \(N\) is odd, the median is the first value whose cumulative frequency reaches or exceeds the position \(\frac{N+1}{2}\). When \(N\) is even, the two central positions \(\frac{N}{2}\) and \(\frac{N}{2}+1\) are located in the cumulative counts in the same way. If both positions fall on the same value, that value is the median; if they fall on different values, the median is the average of the two.

Value	Frequency \(f\)	Cumulative frequency \(F\)
1	2	2
3	1	3
5	4	7
8	2	9

The table has \(N=9\), so the median position is \(\frac{9+1}{2}=5\). The cumulative frequency jumps from 3 to 7 at value 5, so it first reaches or exceeds position 5 there, and the median equals 5.
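The cumulative-frequency lookup can be sketched as follows. This is a minimal illustration under the assumption that the table is given as `(value, frequency)` pairs; the function name `median_from_frequency_table` and the helper `value_at` are hypothetical names introduced here.

```python
def median_from_frequency_table(table):
    """Median from (value, frequency) pairs (illustrative helper, not from the original answer)."""
    rows = sorted(table)                        # order rows by value
    total = sum(freq for _, freq in rows)       # N, the total number of observations
    if total == 0:
        raise ValueError("the table contains no observations")

    def value_at(position):
        # Return the first value whose cumulative frequency reaches the 1-based position.
        cumulative = 0
        for value, freq in rows:
            cumulative += freq
            if cumulative >= position:
                return value
        raise ValueError("position exceeds the total frequency")

    if total % 2 == 1:
        return value_at((total + 1) // 2)       # odd N: single middle position
    lower = value_at(total // 2)                # even N: two central positions
    upper = value_at(total // 2 + 1)
    return (lower + upper) / 2


print(median_from_frequency_table([(1, 2), (3, 1), (5, 4), (8, 2)]))   # 5
```

For even \(N\) the function always averages the two central values, which returns the shared value when both positions fall on the same row.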

Grouped data (class intervals)

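Grouped data place observations into intervals (classes), such as \([10,20)\), \([20,30)\), and so on. Because the individual values are not known, the median can only be estimated: locate the median class, the first class whose cumulative frequency reaches or exceeds \(N/2\), and interpolate linearly within that class. The interpolation assumes the observations are spread evenly across the class.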

\[ \widetilde{m} = L + \left(\frac{\frac{N}{2}-C_{\text{before}}}{f_{\text{class}}}\right)w \]

Here \(L\) is the lower class boundary of the median class, \(C_{\text{before}}\) is the cumulative frequency before the median class, \(f_{\text{class}}\) is the frequency in the median class, and \(w\) is the class width.

Class interval	Frequency \(f\)	Cumulative frequency
[0, 10)	3	3
[10, 20)	5	8
[20, 30)	4	12
[30, 40)	2	14

The total is \(N=14\), so \(N/2=7\). The cumulative frequency crosses 7 in the class \([10,20)\), making it the median class. With \(L=10\), \(C_{\text{before}}=3\), \(f_{\text{class}}=5\), and \(w=10\),

\[ \widetilde{m}=10+\left(\frac{7-3}{5}\right)\cdot 10 = 18 \]
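The interpolation formula can be sketched in Python as below. This is a minimal illustration that assumes the classes are listed in ascending order as `(lower, upper, frequency)` tuples; the function name `grouped_median` is hypothetical.

```python
def grouped_median(classes):
    """Estimate the median from (lower, upper, frequency) classes in ascending order.

    Illustrative helper, not from the original answer. Uses linear
    interpolation inside the median class.
    """
    total = sum(freq for _, _, freq in classes)   # N
    half = total / 2                              # target position N/2
    cumulative_before = 0                         # C_before
    for lower, upper, freq in classes:
        # The median class is the first non-empty class whose cumulative
        # frequency reaches or exceeds N/2.
        if freq > 0 and cumulative_before + freq >= half:
            width = upper - lower                 # w
            return lower + (half - cumulative_before) * width / freq
        cumulative_before += freq
    raise ValueError("no observations in the table")


table = [(0, 10, 3), (10, 20, 5), (20, 30, 4), (30, 40, 2)]
print(grouped_median(table))   # 18.0
```

The result is an estimate rather than an exact median, since the raw observations inside each class are unknown.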

Common pitfalls

Unordered data obscure the median’s positional definition; the median is defined on the ordered list \(x_{(1)},\dots,x_{(n)}\). Repeated values (ties) are fully compatible with the definition and often produce a median equal to the repeated value. For even \(n\), the median is commonly not an observed value because it is an average of two central observations.
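The first two pitfalls can be illustrated with a short Python sketch. It uses `statistics.median` from the standard library, which sorts its input internally; the variable names are chosen only for this example.

```python
import statistics

# Same values as the odd-sample example, but in arbitrary order.
data = [12, 3, 9, 7, 7]

# Pitfall: taking the middle element of the unordered list.
naive_middle = data[len(data) // 2]
print(naive_middle)                 # 9, which is not the median

# statistics.median orders the data before applying the positional rule.
print(statistics.median(data))      # 7

# Ties are handled naturally: both central positions fall on the value 4.
print(statistics.median([1, 4, 4, 4, 4, 9]))   # 4.0
```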
