Cumulative Frequency Plots
A cumulative frequency plot
is a way to display cumulative information graphically.
It shows the number, percentage, or proportion of observations
that are less than or equal to particular values.
Frequency vs. Cumulative Frequency
In a data set, the cumulative frequency for a
value x is the total number of scores that are less than
or equal to x. In this section, we use two charts to illustrate the
difference between frequency and cumulative frequency.
Both charts show scores for a test administered to 300 students.
In the first chart (shown below), column height indicates frequency:
the number of students in each test score grouping. For example, about
30 students received a test score from 51 to 60.
[Chart: Frequency (Y axis, 0 to 100) by test score group (X axis: 41-50, 51-60, 61-70, 71-80, 81-90, 91-100)]
In the next chart (shown below), column height shows cumulative
frequency: the number of students who scored up to and including
each test score. The chart shows that 30 students received
a test score of at most 50; 60 students received a score of at
most 60; 120 students received a score of at most 70; and so on.
[Chart: Cumulative frequency (Y axis, 0 to 300) by test score (X axis: 50, 60, 70, 80, 90, 100)]
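The step from frequency to cumulative frequency is a running total.
The short Python sketch below illustrates it. The first three group
counts (30, 30, 60) follow from the figures quoted above; the last
three counts are assumed for illustration only, chosen so that the
total is 300 students.

```python
from itertools import accumulate

# Upper bound of each test score group, as on the X axis of the chart.
upper_bounds = [50, 60, 70, 80, 90, 100]

# Students per group. The first three counts follow from the text;
# the last three are ASSUMED for illustration (all six sum to 300).
frequencies = [30, 30, 60, 90, 60, 30]

# Running total: each entry adds the current group to all earlier groups.
cumulative = list(accumulate(frequencies))

for bound, total in zip(upper_bounds, cumulative):
    print(f"score <= {bound}: {total} students")
```

The last cumulative value always equals the total number of
observations, which is a quick way to check the arithmetic.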
Absolute vs. Relative Frequency
Frequency counts can be measured in terms of absolute numbers or
relative numbers (e.g.,
proportions
or percentages). The chart
below duplicates the cumulative frequency chart above, except that
it expresses the counts in terms of percentages rather than
absolute numbers.
[Chart: Cumulative percentage (Y axis, 0 to 100) by test score (X axis: 50, 60, 70, 80, 90, 100)]
Note that the columns in the chart have the same shape, whether the
Y axis is labeled with actual frequency counts or with percentages. If
we had used proportions instead of percentages, the shape would remain
the same.
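To see why the shape does not change, note that every count is divided
by the same total. The Python sketch below converts cumulative counts
to proportions and percentages. The counts for scores of 80, 90, and
100 are assumed for illustration; only the first three come from the
text.

```python
# Cumulative counts for scores of at most 50, 60, 70, 80, 90, 100.
# The first three come from the text; the last three are ASSUMED.
cumulative_counts = [30, 60, 120, 210, 270, 300]

# The final cumulative count is the total number of students.
total = cumulative_counts[-1]

# Dividing by a constant rescales the Y axis but keeps the shape.
cumulative_proportions = [count / total for count in cumulative_counts]
cumulative_percentages = [100 * count / total for count in cumulative_counts]

for count, pct in zip(cumulative_counts, cumulative_percentages):
    print(f"{count:>3} students -> {pct:.0f}%")
```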
Discrete vs. Continuous Variables
Each of the previous cumulative charts has used a
discrete
variable on the X axis (i.e., the horizontal axis).
The chart below duplicates the previous cumulative charts,
except that it uses a
continuous
variable for the test scores on the X axis.
Let's work through an example to understand how to read this
cumulative frequency plot. Specifically, let's find the
median.
Follow the grid line to the right from the Y axis at 50%.
This line intersects the curve over the X axis at a test score of
about 73. This means that half of the students received a test score
of at most 73, and half received a test score of at least 73.
Thus, the median is about 73.
You can use the same process
to find the cumulative percentage associated with any other test
score. For example, what percentage of students received a test score of
64 or less? From the graph, you can see that about 25% of students received
a score of 64 or less.
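Reading a value off the curve amounts to interpolating between known
points. The Python sketch below does this with straight-line segments.
The curve points are assumed for illustration (they are not taken from
the plotted curve), and interpolate is a hypothetical helper name, so
the results only approximate the values read from the graph.

```python
def interpolate(x, xs, ys):
    """Linearly interpolate y at x, given increasing xs and matching ys."""
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1:
            y0, y1 = ys[i], ys[i + 1]
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the range of the curve")

# ASSUMED points on the cumulative curve: test score and the
# percentage of students scoring at or below it.
scores = [40, 50, 60, 70, 80, 90, 100]
percents = [0, 10, 20, 40, 70, 90, 100]

# Median: start at 50% on the Y axis and read the score on the X axis.
median = interpolate(50, percents, scores)

# Reverse lookup: start at a score of 64 and read the percentage.
pct_at_64 = interpolate(64, scores, percents)

print(f"median is about {median:.1f}")
print(f"about {pct_at_64:.0f}% scored 64 or less")
```

With a smooth plotted curve the readings differ slightly from these
straight-line estimates, but the lookup works the same way in both
directions.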
Test Your Understanding
Problem 1
Below, the cumulative frequency plot shows height (in inches) of
college basketball players.
What is the
interquartile range?
(A) 3 inches
(B) 6 inches
(C) 25 inches
(D) 50 inches
(E) None of the above
Solution
The correct answer is (B). The
interquartile range is the range spanned by the middle 50% of the
distribution, defined as Q3 minus Q1.
Q1 is the height for which the cumulative percentage is 25%.
To find Q1 from the cumulative frequency plot,
follow the grid line to the right from the Y axis at 25%.
This line intersects the curve over the X axis at a height of
about 71 inches. This means that 25% of the basketball players are
at most 71 inches tall, so Q1 is 71.
To find Q3, follow the grid line to the right from the Y axis at 75%.
This line intersects the curve over the X axis at a height of
about 77 inches. This means that 75% of the basketball players are
at most 77 inches tall, so Q3 is 77.
Since the interquartile range is Q3 minus Q1, the interquartile
range is 77 - 71 = 6 inches.
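As a quick check of the arithmetic, the Python sketch below computes
the interquartile range from the two quartiles read off the plot and
matches it against the numeric answer choices. The variable names are
illustrative only.

```python
# Quartiles read from the cumulative frequency plot (in inches).
q1 = 71  # height at a cumulative percentage of 25%
q3 = 77  # height at a cumulative percentage of 75%

# The interquartile range is the distance between the quartiles.
iqr = q3 - q1

# Numeric answer choices from the problem, in inches.
choices = {"A": 3, "B": 6, "C": 25, "D": 50}
matching = [label for label, value in choices.items() if value == iqr]

print(f"IQR = {iqr} inches, matching choice(s): {matching}")
```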