Stem-and-Leaf Displays
A stem-and-leaf display is a way to show quantitative data
in condensed form without losing the individual observations.
Each value is split into two parts:
- The stem (all but the last one or two digits), and
- The leaf (the last one or two digits).
All leaves belonging to the same stem are listed on one row. Rejoining each
stem with each of its leaves reproduces every value in the original data set.
Two-digit data: one-digit leaves
For two-digit data such as test scores (for example 52, 75, 96), we often
choose
\[
\text{stem} = \text{tens digit}, \qquad
\text{leaf} = \text{ones digit}.
\]
Thus, \(75\) has stem \(7\) and leaf \(5\); \(52\) has stem \(5\) and leaf
\(2\). The stems might be 5, 6, 7, 8, 9, and each row lists the leaves for
that stem.
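The tens/ones split above can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the function name `stem_and_leaf` is hypothetical, and every score other than 52, 75, and 96 is made up for the example.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group each value's ones digit (leaf) under its tens digit (stem)."""
    rows = defaultdict(list)
    for x in values:
        stem, leaf = divmod(x, 10)  # stem = tens digit, leaf = ones digit
        rows[stem].append(leaf)     # leaves stay in the order the data arrive
    return dict(rows)

# Hypothetical test scores (only 52, 75, 96 come from the text).
scores = [52, 75, 96, 57, 81, 79, 68, 93, 72, 65]
display = stem_and_leaf(scores)

for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

Each printed row shows one stem, a vertical bar, and that stem's leaves, for example `7 | 5 9 2` for the scores 75, 79, and 72.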
Three- and four-digit data: two-digit leaves
For larger values such as monthly rents (for example 880, 1081, 1231), it is
convenient to use the last two digits as the leaf:
\[
\text{stem} = \left\lfloor \frac{x}{100} \right\rfloor, \qquad
\text{leaf} = x \bmod 100.
\]
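Here \(880\) is written with stem \(8\) and leaf \(80\), while
\(1231\) has stem \(12\) and leaf \(31\). The stems might run from
\(6\) to \(13\), and the leaves on each row show the last two digits (tens and
ones) of each value. A leaf below \(10\) is written with a leading zero, so
\(1005\) has stem \(10\) and leaf \(05\).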
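As a rough sketch of the formula above, Python's built-in `divmod(x, 100)` returns the floor quotient and the remainder in one call. The helper name `split_rent` and the value 1005 are assumptions added for illustration; the latter shows why two-digit leaves are usually zero-padded.

```python
def split_rent(x):
    """Return (stem, leaf), where leaf is the last two digits of x."""
    return divmod(x, 100)  # (floor(x / 100), x mod 100)

# 880, 1081, 1231 are from the text; 1005 is a hypothetical extra value.
rents = [880, 1081, 1231, 1005]

rows = {}
for x in rents:
    stem, leaf = split_rent(x)
    # Pad to two digits so a leaf of 5 prints as "05", not "5".
    rows.setdefault(stem, []).append(f"{leaf:02d}")

lines = [f"{stem:>2} | {' '.join(rows[stem])}" for stem in sorted(rows)]
print("\n".join(lines))
```

Without the padding, the leaf `5` under stem 10 could be misread as 105 or 1050 rather than 1005.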
Unsorted vs ranked displays
When the leaves are written in the order the data are given, the result is an
unsorted stem-and-leaf display. If the leaves for each stem
are rearranged in increasing order, we obtain a
ranked stem-and-leaf display. Both forms show the shape of the
distribution, but a ranked display also makes it easy to read off the median,
quartiles, or other summary features.
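The step from an unsorted to a ranked display amounts to sorting the leaves within each stem. The sketch below assumes one-digit leaves and uses hypothetical data; it then reads the values back in order and finds the median with the standard-library `statistics.median`.

```python
from statistics import median

# Hypothetical unsorted display: stem -> leaves in the order recorded.
unsorted_display = {5: [2, 7], 6: [8, 5], 7: [5, 9, 2], 8: [1], 9: [6, 3]}

# Ranked display: same stems, leaves sorted in increasing order.
ranked = {stem: sorted(leaves) for stem, leaves in unsorted_display.items()}

# Reading stems top to bottom and leaves left to right gives sorted data.
ordered_values = [10 * stem + leaf
                  for stem in sorted(ranked)
                  for leaf in ranked[stem]]

print(ranked)
print(median(ordered_values))  # mean of the 5th and 6th of the 10 values
```

With ten values, the median is the average of the fifth and sixth leaves counted from the top of the ranked display, here 72 and 75.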
Grouped stems (condensed displays)
If there are many stems with only a few leaves, the display can be
grouped by combining several stems into a single row
(for example, grouping stems 0–2, 3–5, and 6–8). Asterisks or separators are
then used to show where one original stem ends and the next begins. Grouped
stem-and-leaf displays provide a more compact summary when the data are spread
thinly over many stems.
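One possible way to build a grouped display is sketched below. The function name `grouped_display`, its parameters, the row-label format, and the data are all assumptions for illustration; the asterisk marks where one original stem ends and the next begins.

```python
def grouped_display(rows, group_size=3, sep=" * "):
    """Combine consecutive stems into one row, separating them with sep."""
    lo, hi = min(rows), max(rows)
    lines = []
    for start in range(lo, hi + 1, group_size):
        end = min(start + group_size - 1, hi)
        parts = []
        for stem in range(start, end + 1):
            leaves = sorted(rows.get(stem, []))  # a stem with no data gives ""
            parts.append(" ".join(str(leaf) for leaf in leaves))
        lines.append(f"{start}-{end} | " + sep.join(parts))
    return lines

# Hypothetical data: stem -> one-digit leaves.
rows = {0: [3, 8], 1: [5], 2: [1, 4], 3: [7], 4: [0],
        5: [2, 9], 6: [0], 7: [6], 8: [4]}

for line in grouped_display(rows):
    print(line)
```

The nine original rows collapse to three, and because the position of each leaf between the separators identifies its stem, no value is lost.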
The main advantage of a stem-and-leaf display over a grouped frequency table
is that no information about individual observations is lost: every data
value can be recovered directly from its stem and leaf.
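The contrast can be checked with a small sketch on hypothetical scores: a grouped frequency table with classes of width 10 keeps only counts, whereas the stem-and-leaf display built from the same data can be turned back into the full list of values.

```python
from collections import Counter

# Hypothetical test scores.
scores = [52, 57, 65, 68, 72, 75, 79, 81, 93, 96]

# Grouped frequency table: class lower bound -> count (50 means 50-59).
freq_table = Counter(10 * (x // 10) for x in scores)

# Stem-and-leaf display of the same data.
display = {}
for x in scores:
    stem, leaf = divmod(x, 10)
    display.setdefault(stem, []).append(leaf)

# Every value can be rebuilt from its stem and leaf.
recovered = sorted(10 * stem + leaf
                   for stem, leaves in display.items()
                   for leaf in leaves)

print(dict(freq_table))
print(recovered == sorted(scores))
```

The frequency table records that three scores fall in the 70s but not which ones; the display row `7 | 2 5 9` records both.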