Organizing and Graphing Quantitative Data
Quantitative data are numerical values (scores, earnings, measurements, etc.).
To summarize a large data set, we group the values into classes and build a
grouped frequency distribution.
Classes, Class Limits, and Frequency
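The range of the data is divided into nonoverlapping intervals called
classes, so that every observation falls into exactly one class. Each class
has a lower limit and an upper limit (for example, the class \(801\) to
\(1000\) has lower limit \(801\) and upper limit \(1000\)).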
The number of observations in a class is its frequency \(f\).
A table listing all classes and their frequencies is a frequency distribution
for quantitative data (grouped data).
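As a minimal sketch of the counting step, the following Python snippet tallies how many values fall between each pair of class limits. The data values, the class limits, and the function name `frequency_distribution` are hypothetical and chosen only for illustration.

```python
# Hypothetical data and class limits, used only to illustrate the counting.
data = [12, 7, 25, 18, 3, 30, 21, 14, 9, 27]
classes = [(1, 10), (11, 20), (21, 30)]  # (lower limit, upper limit)


def frequency_distribution(values, class_limits):
    """Count the values that fall within each class (both limits inclusive)."""
    return [
        sum(1 for v in values if lower <= v <= upper)
        for lower, upper in class_limits
    ]


frequencies = frequency_distribution(data, classes)
for (lower, upper), f in zip(classes, frequencies):
    print(f"{lower}-{upper}: f = {f}")
# 1-10: f = 3
# 11-20: f = 3
# 21-30: f = 4
```

Because the classes do not overlap and together cover the whole range of the data, the frequencies add up to the number of observations.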
Class Boundaries, Class Width, and Class Midpoint
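Class limits leave a gap between consecutive classes (for example, between
\(1000\) and \(1001\)). To remove these gaps on a histogram we use class
boundaries: each boundary is the midpoint between the upper limit of one class
and the lower limit of the next. If one class ends at \(1000\) and the
next begins at \(1001\), the common boundary is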
\[
\frac{1000 + 1001}{2} = 1000.5.
\]
The class width (class size) is
\[
\text{Class width} = \text{Upper boundary} - \text{Lower boundary}.
\]
The class midpoint (class mark) is the average of the limits:
\[
\text{Class midpoint} =
\frac{\text{Lower limit} + \text{Upper limit}}{2}.
\]
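The three definitions above can be sketched in Python for the class \(801\) to \(1000\). The function names and the `gap` parameter (the distance between the upper limit of one class and the lower limit of the next, assumed to be \(1\) for whole-number data) are illustrative assumptions, not part of the definitions.

```python
def class_boundaries(lower_limit, upper_limit, gap=1):
    """Return (lower boundary, upper boundary) of a class.

    Assumes consecutive classes are separated by `gap`, so each boundary
    lies half a gap outside the corresponding limit.
    """
    half = gap / 2
    return lower_limit - half, upper_limit + half


def class_width(lower_limit, upper_limit, gap=1):
    """Class width = upper boundary - lower boundary."""
    lower_boundary, upper_boundary = class_boundaries(lower_limit, upper_limit, gap)
    return upper_boundary - lower_boundary


def class_midpoint(lower_limit, upper_limit):
    """Class midpoint = average of the two limits."""
    return (lower_limit + upper_limit) / 2


print(class_boundaries(801, 1000))  # (800.5, 1000.5)
print(class_width(801, 1000))       # 200.0
print(class_midpoint(801, 1000))    # 900.5
```

Note that the width is \(200\), not \(1000 - 801 = 199\), because it is measured between boundaries rather than limits.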
Number of Classes and Approximate Class Width
A grouped table usually uses about 5 to 20 classes, with more classes for
larger data sets. If \(N\) is the number of observations, a common guideline
(Sturges' rule) for the number of classes \(k\) is
\[
k \approx 1 + 3.322 \log_{10} N.
\]
Once \(k\) is chosen, an approximate class width is
\[
\text{Approximate class width}
= \frac{\text{Largest value} - \text{Smallest value}}{k},
\]
which is then rounded to a convenient number and used to generate the class limits.
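A short Python sketch of these two formulas follows. Rounding \(k\) to the nearest whole number is an assumption made here for illustration, and the sample size, the smallest and largest values, and the function names are hypothetical.

```python
import math


def sturges_classes(n):
    """Suggested number of classes, k = 1 + 3.322 * log10(n), rounded.

    Rounding to the nearest integer is an illustrative choice; in practice
    k is a guideline and may be adjusted.
    """
    return round(1 + 3.322 * math.log10(n))


def approximate_class_width(largest, smallest, k):
    """Approximate class width = (largest value - smallest value) / k."""
    return (largest - smallest) / k


k = sturges_classes(100)                   # 1 + 3.322 * 2 = 7.644
width = approximate_class_width(93, 5, k)  # hypothetical extremes 93 and 5
print(k)      # 8
print(width)  # 11.0
```

Here a width of \(11\) might be rounded to a more convenient \(10\) or \(12\), with the number of classes adjusted so that the classes still cover every observation.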
Relative Frequency and Percentage
Let \(f\) be the frequency of a class and \(\sum f\) the total frequency. The
relative frequency and percentage of the class are
\[
\text{Relative frequency} =
\frac{\text{Frequency of that class}}{\sum f}
= \frac{f}{\sum f},
\]
\[
\text{Percentage} = \text{Relative frequency} \cdot 100.
\]
The relative frequencies of all classes sum to \(1.00\) and the percentages
sum to \(100\%\); totals computed from rounded values may differ slightly
from these because of rounding.
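As a small illustration of these formulas, the following Python sketch converts a list of class frequencies into relative frequencies and percentages. The frequencies are hypothetical.

```python
# Hypothetical class frequencies, used only for illustration.
frequencies = [3, 3, 4]

total = sum(frequencies)                         # sum of f
relative = [f / total for f in frequencies]      # f / (sum of f)
percentages = [r * 100 for r in relative]        # relative frequency * 100

for f, r, p in zip(frequencies, relative, percentages):
    print(f"f = {f}  relative frequency = {r:.2f}  percentage = {p:.1f}%")

print(f"sum of relative frequencies = {sum(relative):.2f}")
print(f"sum of percentages = {sum(percentages):.1f}%")
```

The values are rounded only when printed, so the totals are computed from the unrounded values.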
Histograms, Frequency Polygons, and Frequency Curves
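Histogram: mark the classes on the horizontal axis and draw one bar per
class, with its base running between the class boundaries (or limits) and its
height proportional to the frequency, relative frequency, or percentage of
that class. When boundaries are used, adjacent bars touch because neighboring
classes share a boundary.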
Frequency polygon: place a point above each class midpoint at a
height equal to the class frequency and join consecutive points with straight
segments. To close the polygon, add one class with frequency \(0\) before the
first class and one after the last, and join their midpoints to the ends.
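The construction of the frequency polygon can be sketched in Python as a list of points to be joined in order. The function name `polygon_points` and the sample classes are hypothetical, and the sketch assumes at least two classes of equal width.

```python
def polygon_points(class_limits, frequencies):
    """Return the (midpoint, frequency) points of a frequency polygon.

    Assumes at least two classes of equal width. One zero-frequency point
    is added before the first midpoint and one after the last, so the
    polygon closes on the horizontal axis.
    """
    midpoints = [(lower + upper) / 2 for lower, upper in class_limits]
    step = midpoints[1] - midpoints[0]  # distance between adjacent midpoints
    xs = [midpoints[0] - step] + midpoints + [midpoints[-1] + step]
    ys = [0] + list(frequencies) + [0]
    return list(zip(xs, ys))


points = polygon_points([(1, 10), (11, 20), (21, 30)], [3, 3, 4])
print(points)
# [(-4.5, 0), (5.5, 3), (15.5, 3), (25.5, 4), (35.5, 0)]
```

Passing these points to any line-plotting routine and joining them in order draws the polygon.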
Frequency distribution curve: for a large data set, as the number of
classes increases and the class width shrinks, the frequency polygon
approaches a smooth curve that shows how the frequencies are distributed over
the variable's scale.