## What you need to know

## Cumulative Frequency

The cumulative frequency of a value (or group of values) is the running total of the frequencies up to and including that value or group. You will need to be able to work out cumulative frequencies and use them to plot a cumulative frequency graph.

Before delving into the world of cumulative frequency, you should be familiar with the idea of frequency tables. If you aren’t, revise frequency tables first.

## Example 1: Cumulative Frequency Tables

Below is a frequency table of data compiled on a group of college students’ heights. Construct a cumulative frequency table for this data.

Calculating the cumulative frequency as part of a table is as easy as adding up the frequencies as you go along.

The first cumulative frequency is simply the first frequency, 13. We then add the second frequency to this to get the second cumulative frequency:

13 + 33 = 46

Continuing this, we get that

46 + 35 = 81

people were 180cm or shorter, and the final cumulative frequency is

81 + 11 = 92
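The running total used in this example can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original worked solution: the function name `cumulative_frequencies` is a hypothetical choice, and the frequencies are the four values that appear in the sums above.

```python
def cumulative_frequencies(frequencies):
    """Return the running total of a list of frequencies."""
    totals = []
    running_total = 0
    for frequency in frequencies:
        running_total += frequency  # add each frequency to the total so far
        totals.append(running_total)
    return totals


# Frequencies from Example 1, in order of increasing height.
heights_frequencies = [13, 33, 35, 11]
heights_cumulative = cumulative_frequencies(heights_frequencies)
print(heights_cumulative)  # [13, 46, 81, 92]
```

The last entry always equals the total number of data values, which is a useful check on the arithmetic.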

## Example 2: Cumulative Frequency Graphs

Using the table below, plot a cumulative frequency graph.

The points plotted on your graph should be plotted at the end of each class, i.e. the point which has cumulative frequency 13 should be plotted at 160 on the height axis, and so on.

You should join up the plotted points with a smooth curve. It should end up looking like an elongated ‘S’ shape.
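As a rough sketch, the points to plot can be listed by pairing the upper boundary of each class with its cumulative frequency. The boundaries 160 and 180 come from the text; 170 and 190 are assumptions based on equal 10 cm class widths, since the original table is not reproduced here. The name `plot_points` is hypothetical.

```python
# Upper class boundaries in cm. 160 and 180 appear in the text;
# 170 and 190 are ASSUMED (equal 10 cm classes).
upper_boundaries = [160, 170, 180, 190]

# Cumulative frequencies from Example 1.
cumulative = [13, 46, 81, 92]

# Each point is (height at the END of the class, cumulative frequency).
plot_points = list(zip(upper_boundaries, cumulative))

# Cumulative frequencies can never decrease from one class to the next.
never_decreases = all(
    earlier <= later for earlier, later in zip(cumulative, cumulative[1:])
)

for height, total in plot_points:
    print(f"plot ({height}, {total})")
```

If `never_decreases` were false, a frequency would have been added incorrectly somewhere in the table.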

Now that we’ve constructed our cumulative frequency graph, we can put it to good use: we can use it to draw a box plot. If you don’t remember what box plots are, revise them before continuing.

## Example 3: Cumulative Frequency Graphs – Median and IQR

Use the cumulative frequency graph below to calculate the median and interquartile range.

There are 92 people in total, so the lower quartile, median, and upper quartile will be the 23rd person, 46th person, and 69th person. So, we find these points on the y-axis, and then draw a line across to the graph to find the corresponding heights on the x-axis. This is shown on the cumulative frequency graph below.

Here, we get

Q_1 = 163,\hspace{5mm} \text{ median } = 170,\hspace{5mm} Q_3 = 176

The interquartile range is therefore

176-163=13
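The reading-off process in this example can be approximated numerically. The sketch below finds the quartile positions from the total and then estimates each height by joining neighbouring plotted points with a straight line (linear interpolation). This only approximates the smooth curve, so the estimates can differ slightly from values read off a hand-drawn graph. The function names are hypothetical, and the 170 cm boundary is an assumption (the text states only 160 and 180).

```python
def quartile_positions(total):
    """Positions of Q1, the median and Q3 on the cumulative frequency axis."""
    return total / 4, total / 2, 3 * total / 4


def read_value(points, position):
    """Estimate the x-value at a given cumulative frequency `position`.

    `points` is a list of (x, cumulative frequency) pairs in increasing order.
    Straight lines between the points stand in for the smooth curve.
    """
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if cf0 <= position <= cf1:
            if cf1 == cf0:  # flat section: no data in this interval
                return x0
            fraction = (position - cf0) / (cf1 - cf0)
            return x0 + fraction * (x1 - x0)
    raise ValueError("position lies outside the plotted points")


# (height in cm, cumulative frequency); the 170 boundary is ASSUMED.
points = [(160, 13), (170, 46), (180, 81)]

q1_pos, median_pos, q3_pos = quartile_positions(92)  # 23.0, 46.0, 69.0
q1 = read_value(points, q1_pos)
median = read_value(points, median_pos)
q3 = read_value(points, q3_pos)
print(f"{q1:.1f} {median:.1f} {q3:.1f}")  # 163.0 170.0 176.6
print(f"IQR is roughly {q3 - q1:.1f} cm")
```

The straight-line estimates are close to the graph readings of 163, 170 and 176, with the small differences coming from the curve being drawn by eye.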

### Example Questions

**Question 1:** Below is a frequency table of data showing the amount of time people spent on a particular website in one day.

a) Complete the cumulative frequency column in the table above.

b) Using the data from your table, plot a cumulative frequency diagram.

a) For the cumulative frequency column, we simply need to add up the frequencies as we move downwards in the table:

The first cumulative frequency box is simply the number 16.

For the second cumulative frequency box, we need to add 24 to the 16 from the previous cumulative frequency box.

24+16=40

For the third cumulative frequency box, we need to add 19 to the 40 from the previous cumulative frequency box.

19+40=59

This process of adding the frequency total to the cumulative frequency total repeats until the table is complete.

The final table should look like this:

b) For the cumulative frequency diagram, we need to plot the time in minutes along the x-axis and your cumulative frequency totals on the y-axis.

Then we need to plot each of the cumulative frequency figures with the corresponding class interval maximums. In other words, for the time interval of 0 – 20 minutes, you would go along the x-axis to 20 minutes (the maximum in this time range) and go upwards to the cumulative frequency value of 16.

Once you have plotted all points, including the origin (0,0), join up all points with a smooth curve. (A cumulative frequency graph is always a smooth curve which goes up.)

Your graph should look like this:

**Question 2:** Shown below is a cumulative frequency graph showing the number of hours per week people spent exercising. Draw a box plot to represent this same information.

From the graph, we can see that the minimum number of hours was 0 and the maximum was 12.

We can also see from the graph that it represents exercising data from 72 people since this is where the graph ends.

In order to draw a box plot, we need to know the following values:

a) the minimum value

b) the maximum value

c) the median value

d) the lower quartile

e) the upper quartile

We have already established that the minimum value is 0 and the maximum value is 12.

The median value is the middle value. Since there are 72 values in total, the median value is the 36^{th} value (since 36 is half of 72). If we go up the y-axis and locate the 36^{th} value, go across to the line and then down, we can see that it corresponds to a value of 3.2 hours.

The lower quartile is half-way between the minimum and the median. To work out which value is the lower quartile, find \frac{1}{4} of the total number of values:

72 \div4 = 18

The lower quartile is therefore the value of the 18^{th} term. If we go up the y-axis and locate the 18^{th} value, go across to the line and then down, we can see that it corresponds to a value of 1.6 hours.

The upper quartile is half-way between the maximum and the median values. To work out which value is the upper quartile, simply find \frac{3}{4} of the total number of values:

\dfrac{3}{4}\times72 = 54

The upper quartile is therefore the value of the 54^{th} term. If we go up the y-axis and locate the 54^{th} value, go across to the line and then down, we can see that it corresponds to a value of 5 hours.

The graph below illustrates the above:

Using this information, the resulting box plot will look like this:
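The five values needed for the box plot in Question 2 can be gathered into one small record. The sketch below uses the readings from this solution; the `FiveNumberSummary` class is a hypothetical helper written for illustration, not a standard library type.

```python
from dataclasses import dataclass


@dataclass
class FiveNumberSummary:
    minimum: float
    lower_quartile: float
    median: float
    upper_quartile: float
    maximum: float

    def interquartile_range(self):
        """Width of the box in the box plot."""
        return self.upper_quartile - self.lower_quartile

    def is_ordered(self):
        """The five values must increase from left to right."""
        values = [self.minimum, self.lower_quartile, self.median,
                  self.upper_quartile, self.maximum]
        return values == sorted(values)


# Readings (in hours) taken from the cumulative frequency graph.
exercise = FiveNumberSummary(0, 1.6, 3.2, 5, 12)
print(exercise.is_ordered())
print(round(exercise.interquartile_range(), 1))
```

Checking the order before drawing helps catch a misread value, such as confusing the total number of people with the maximum number of hours.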

**Question 3:** The table below shows the number of cars at a showroom and the price brackets that they fall into:

a) Complete the cumulative frequency table, and draw a cumulative frequency graph to represent this data:

From your graph work out:

b) the approximate median price of a second-hand car

c) an estimate for how many cars cost more than £17,500

a) We know that there are 8 cars with a value between £0 and £10,000, so we can insert 8 in the £0 – £10,000 cumulative frequency box.

The next box is the £0 – £15,000 box. We know that there are 8 cars with a value of less than £10,000, and a further 12 cars which have a value of more than £10,000 but less than £15,000, therefore 20 cars have a value of between £0 and £15,000, so this is the next value we can insert.

Continue this process until all values are calculated and your cumulative frequency table should look as follows:

To draw the graph, we need to plot the cumulative frequency totals on the vertical axis (the cumulative frequency is always on the y-axis) and the price in pounds on the x-axis. However, since we are dealing with grouped data (the prices are price bands), we need to plot the cumulative frequency total against the highest value in the band. So, the first point we plot (after plotting (0,0), the origin) would be the cumulative frequency total of 8 against £10,000 (the top value in the £0 – £10,000 band). The next point we would plot would be the cumulative frequency value of 20 against £15,000 (the highest value in the £10,000 – £15,000 band).

Your completed cumulative frequency graph should look as follows:

b) We know that there are 48 cars in the showroom in total (since the maximum value on the cumulative frequency table is 48). To find the median, we need to read the value of the car that corresponds to a cumulative frequency total of 24 (half of 48). By locating the value of 24 on the cumulative frequency axis, we can see that this corresponds to a value of approximately £16,000.

c) For this question, we need to locate £17,500 on the x-axis and see what cumulative frequency total this corresponds to. By drawing a line up from £17,500 until it touches the line and then drawing a horizontal line to the y-axis, we should hit a cumulative frequency total of approximately 26.

This means that 26 cars have a value that is *up to* £17,500. Therefore the remaining cars must have a value which is greater than £17,500. Since there are 48 cars in total then the number of cars which have a value of more than £17,500 is simply 48-26=22 cars.
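The subtraction in part c) works for any "more than" question on a cumulative frequency graph. A minimal sketch, using a hypothetical helper name `count_above` and the figures from this solution:

```python
def count_above(total, cumulative_at_threshold):
    """Number of values greater than a threshold.

    `cumulative_at_threshold` is the cumulative frequency read from the
    graph at the threshold, i.e. how many values are UP TO the threshold.
    """
    return total - cumulative_at_threshold


# 48 cars in total; about 26 cost up to £17,500.
cars_above = count_above(48, 26)
print(cars_above)  # 22
```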

**Question 4:** Below is a cumulative frequency graph which shows how 2 classes performed in a mock GCSE English exam. Class A is the darker line and class B is the pink line.

Where you see the gaps in the statements below, complete with:

**A for the phrase “is greater than”**

**B for the phrase “is less than”**

**C for the phrase “is the same as”**

a) The median of class A _______ the median of class B.

b) The number of students who scored less than 20 in class A _______ the number of students who scored less than 30 in class B.

c) The number of students who scored between 40 and 50 marks in class A _______ the number of students who scored between 40 and 50 marks in class B.

d) The number of students who scored more than 60 in class A _______ the number of students who scored above 60 in class B.

e) The interquartile range of class A _______ the interquartile range of class B.

a) In both classes, we can work out how many students there are in total. In both classes, there are 80 students since both graphs end at a cumulative frequency total of 80.

To calculate the median, we simply need to see how many marks the 40^{th} student achieved in each class (the 40^{th} value because 40 is half of 80).

In class A the 40^{th} pupil achieved 40 marks.

In class B, the 40^{th} pupil achieved 50 marks.

Therefore, the median of class A **is less than** the median of class B **(answer B)**.


b) For this question, we need to locate 20 marks on the horizontal axis and see where it touches the line for class A. The value of 20 marks corresponds to a cumulative frequency total of 10. This means that 10 students scored *up to* 20 marks (in other words, less than 20 marks).

We now need to locate 30 marks on the horizontal axis and see where it touches the line for class B. The value of 30 marks corresponds to a cumulative frequency total of 10. This means that 10 students scored *up to* 30 marks.

Since 10 people in class A scored up to 20 marks and 10 students in class B scored up to 30 marks, we can now complete the statement:

The number of students who scored less than 20 in class A **is the same as** the number of students who scored less than 30 in class B **(answer C)**.

c) For this question, we need to locate 40 marks as well as 50 marks on the horizontal axis to see where these values touch the line for class A. The value of 40 marks corresponds to a cumulative frequency total of 40. The value of 50 marks corresponds to a cumulative frequency total of approximately 75. This means that 40 students scored up to 40 marks and 75 students scored up to 50 marks. All the students who fall into the category of ‘up to 40 marks’ also fall into the category of ‘up to 50 marks’, so we can work out how many students scored between 40 marks and 50 marks by simply subtracting the number of students who scored up to 40 marks from the number of students who scored up to 50 marks.

75-40=35\text{ students}

We need to follow exactly the same steps for class B. 30 students scored up to 40 marks and 40 students scored up to 50 marks, so this means that 40-30=10 students scored between 40 and 50 marks.

In class A, 35 students scored between 40 and 50 marks, compared to 10 in class B. We can now complete our statement:

The number of students who scored between 40 and 50 marks in class A **is greater than** the number of students who scored between 40 and 50 marks in class B **(answer A)**.

d) For this question, we need to locate 60 marks on the horizontal axis to locate the corresponding cumulative frequency for both classes A and B.

For class A, this may seem confusing. When you locate 60 marks on the horizontal axis, you should see that this corresponds to a cumulative frequency total of 80. This means that 80 students scored up to 60 marks. We know that there are 80 students in total, so if all 80 scored up to 60 marks, then 0 students scored above 60 marks. (You can see that the line is flat from 60 to 70 marks for class A on the graph. When the line is flat on a section of a cumulative frequency graph, this represents a frequency of 0 for that interval.)

When we locate 60 marks for class B, this corresponds to a cumulative frequency total of approximately 65. This tells us that 65 pupils scored up to 60 marks. If there are 80 pupils in total, and 65 scored up to 60 marks, then the remaining 15 must have scored above 60 marks.

In class A, no students scored more than 60 marks, compared to 15 in class B. We can now complete our statement:

The number of students who scored more than 60 in class A **is less than** the number of students who scored above 60 in class B **(answer B)**.

e) To find the interquartile range, we need to locate the lower quartile and upper quartile for each class. We know that there are 80 students in total, so the lower quartile will come from the 20^{th} student (because 20 is \frac{1}{4} of 80). The upper quartile will come from the 60^{th} student (because 60 is \frac{3}{4} of 80).

For class A, the 20^{th} student achieved 30 marks, and the 60^{th} student achieved approximately 45 marks. The interquartile range is the value of the upper quartile minus the lower quartile, so the interquartile range is 15 marks.

For class B, the 20^{th} student achieved approximately 35 marks, and the 60^{th} student achieved approximately 58 marks. The interquartile range is the value of the upper quartile minus the lower quartile, so the interquartile range is 23 marks.

In class A, the interquartile range is 15 compared to 23 marks for class B, so we can now complete our statement:

The interquartile range of class A **is less than** the interquartile range of class B **(answer B)**.
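The five comparisons in Question 4 can be checked together with a short sketch. The `compare` helper is hypothetical, and the numbers are the approximate graph readings quoted in the solution above.

```python
def compare(class_a, class_b):
    """Return the answer letter and phrase that completes the statement."""
    if class_a > class_b:
        return "A: is greater than"
    if class_a < class_b:
        return "B: is less than"
    return "C: is the same as"


# (class A reading, class B reading) for parts a) to e).
readings = {
    "a": (40, 50),            # medians
    "b": (10, 10),            # up to 20 marks (A) vs up to 30 marks (B)
    "c": (75 - 40, 40 - 30),  # between 40 and 50 marks
    "d": (80 - 80, 80 - 65),  # more than 60 marks
    "e": (45 - 30, 58 - 35),  # interquartile ranges
}

answers = {part: compare(a, b) for part, (a, b) in readings.items()}
for part, answer in answers.items():
    print(part, answer)
```

Because the readings are estimates, two values that are very close on the graph should be compared with care before choosing "is the same as".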

**Question 5:** The lengths of snakes kept in Bob Exotic’s Snake Sanctuary were recorded. The table below shows this information:

a) Draw a cumulative frequency graph for this information.

b) What is the interquartile range for the length of these snakes?

c) In another snake sanctuary in the same country, the median length of their snakes is 1.78m. To one decimal place, by what percentage are these snakes smaller than the snakes in Bob Exotic’s Snake Sanctuary?

a) In order to draw our cumulative frequency graph, we need to work out the cumulative frequency totals:

To represent this information on a graph, we need to plot the cumulative frequency totals on the vertical axis against snake length on the horizontal axis. After plotting our first point of (0,0), the next point is (1,23); the key thing to remember is that since the snake length data is grouped, we need to plot the highest value in each length band (so for the 0 metres – 1 metre band, we would plot the corresponding cumulative frequency total against 1 metre). Once we have plotted all the points, we need to join them together with a smooth line, with an end result similar to the below:

b) The interquartile range is calculated by subtracting the lower quartile from the upper quartile.

In this data set, the lower quartile is the length of the 40^{th} snake (40 because 40 is \frac{1}{4} of 160). To find the length of the 40^{th} snake, find 40 on the vertical cumulative frequency axis and find the corresponding length on the horizontal axis. The length of the 40^{th} snake is approximately 1.5 metres.

In this data set, the upper quartile is the length of the 120^{th} snake (120 because 120 is \frac{3}{4} of 160). To find the length of the 120^{th} snake, find 120 on the vertical cumulative frequency axis and find the corresponding length on the horizontal axis. The length of the 120^{th} snake is approximately 3.5 metres.

Therefore the interquartile range is 3.5-1.5=2 metres.

c) For this question, we need to work out the median snake length at Bob Exotic’s Snake Sanctuary. Since there are 160 snakes in total, the median snake length is the length of the 80^{th} snake (80 because 80 is \frac{1}{2} of 160). The 80^{th} snake has a length of approximately 2.6m.

If at the other snake sanctuary, the median snake length is 1.78 metres, then to calculate how much smaller as a percentage, we need to find out how much smaller the median snake is:

2.6\text{ m } - 1.78\text{ m} = 0.82\text{ m}

To work out a percentage increase or decrease, you need to remember the simple formula:

\dfrac{ \text{ difference}}{\text{ original value}}\times 100

\dfrac{0.82\text{ m}}{2.6\text{ m}}\times 100=31.5\%

(In this question, however, it may not be obvious what the original value is. The question asks us to work out by what percentage these snakes are smaller than the snakes in Bob Exotic’s Snake Sanctuary. Because of the word ‘than’, it is the length of the snakes at Bob Exotic’s Snake Sanctuary that we should consider as the ‘original’ value. Words / phrases like ‘compared to’ or ‘than’ always indicate what we are working out a percentage of.)
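The percentage calculation in part c) can be sketched as follows. The function name `percentage_smaller` is hypothetical; the key point, as explained above, is that the difference is divided by the value that follows the word ‘than’.

```python
def percentage_smaller(original, new):
    """Percentage by which `new` is smaller than `original`."""
    difference = original - new
    return difference / original * 100


# Median lengths in metres: Bob Exotic's sanctuary (original) vs the other.
result = percentage_smaller(2.6, 1.78)
print(round(result, 1))  # 31.5
```

Dividing by 1.78 instead would answer a different question: by what percentage the snakes at Bob Exotic’s are longer than those at the other sanctuary.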
