
# Histogram Questions, Worksheets and Revision


## Histograms

When displaying grouped data, especially continuous data, a histogram is often the best way to do it – specifically in cases where not all the groups/classes are the same width. Histograms are like bar charts with 2 key differences:

• There are no gaps between the bars
• It’s the area (as opposed to the height) of each bar that tells you the frequency of that class.

Understanding grouped frequency tables is essential for this topic.

### Frequency Density

In order to make this work, when drawing a histogram, we plot frequency density on the y-axis rather than frequency. The frequency density for each group is found using the formula:

$\text{frequency density} = \dfrac{\text{frequency}}{\text{class width}}$

## Example 1: Drawing a Histogram

Below is a grouped frequency table of the lengths of 71 pieces of string. Construct a histogram of the data.

To construct a histogram, we will need the frequency density for each class. Dividing the frequency of the first class by its width, we get

$\text{frequency density } =\dfrac{8}{20-0} = 0.4$
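The same calculation can be sketched in Python. The function name `frequency_density` and its parameters are illustrative choices rather than anything taken from the worksheet; the only values used are the frequency of 8 and the class boundaries 0 and 20 from the first class.

```python
def frequency_density(frequency, lower, upper):
    """Return frequency divided by class width for a single class."""
    width = upper - lower
    if width <= 0:
        raise ValueError("upper boundary must be greater than lower boundary")
    return frequency / width


# First class of the string-length table: frequency 8, boundaries 0 and 20.
first_class = frequency_density(8, 0, 20)
print(first_class)  # 0.4
```

Applying the same function to each remaining row gives the values for the frequency density column.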

Once we have calculated the frequency density for the remaining groups, it is helpful to add a third column to the table containing the frequency density values, as shown below.

Once this new column is completed, all that remains is to plot the histogram.

With length on the $x$-axis and frequency density on the $y$-axis, each bar has a width equal to its class width and a height equal to its frequency density.

The resulting histogram is shown below

## Example 2: Interpreting Histograms

Below is a histogram showing the times taken to complete a quiz. 44 people took between 0 and 1.5 minutes. Work out how many people took between 3 and 4 minutes.

To answer this question, we’re going to use the information to work out how much 1 small square of area is worth.

The interval from 0 to 1.5 minutes covers all of the first bar and part of the second. From 0 to 1 minute there are $10\times 12 =120$ small squares, and from 1 to 1.5 minutes there are $5\times 20=100$ small squares (marked on the graph below for clarity).

So, in total there are $100+120=220$ small squares between 0 and 1.5 minutes, and the question tells us that this accounts for 44 people. Therefore, 1 person is equal to

$220 \div 44=5\text{ small squares}$.

Now, reading from the graph we get that there are $11 \times 10 = 110$ small squares between 3 and 4 minutes, so given that 5 small squares is one person, there must be

$110 \div 5 = 22$

who took between 3 and 4 minutes to do the quiz.
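As a rough check of the working above, the small-squares method can be written as two short Python functions. The function names are hypothetical; the square counts (120, 100 and 110) and the 44 people are the figures quoted in the example.

```python
def squares_per_person(known_squares, known_people):
    """How many small squares of area stand for one person."""
    return known_squares / known_people


def people_in_region(region_squares, known_squares, known_people):
    """Convert an area in small squares into a number of people."""
    return region_squares / squares_per_person(known_squares, known_people)


known = 10 * 12 + 5 * 20   # squares between 0 and 1.5 minutes
target = 11 * 10           # squares between 3 and 4 minutes

print(squares_per_person(known, 44))        # 5.0
print(people_in_region(target, known, 44))  # 22.0
```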

### Example Questions

In order to draw a histogram, we need to know the frequency density for each row of data. The frequency density is calculated by dividing each frequency by its associated class width.

This means that we need to create a new column on the data table for the frequency densities.

The first row of the table has a plant height from 0 – 10cm and a frequency of 6. As mentioned above, the frequency density is the frequency divided by the band width, so the frequency density for the first row can be calculated as follows:

$\text{Frequency density} = 6 \div 10 = 0.6$

By repeating this process for the remaining four rows, our completed frequency density column will look like the below:

Now we are in a position to draw the histogram. The height will be on the $x$-axis and the frequency density on the $y$-axis. Since the band widths are not consistent (the band width of the 20 – 24 cm category is only 4 cm, whereas the band width of the 30 – 50 cm category is 20 cm), the bars you draw will not all be the same width.

Your completed histogram should look like the below:

The number of values in each class is represented by the area of each bar (and not the height). We have been told that 54 people can hold their breath for at least a minute, so this means that the area of the bars from 60 seconds upwards represents 54 people.

The people who can hold their breath for 1 minute or more are represented by the whole of the last bar (70 – 100 seconds) and the right-hand part of the second-to-last bar (60 – 70 seconds). To work out the area of these two regions, we simply need to count the small squares:

$(5 \times 15) + (15 \times 4) = 75 + 60 = 135$

This is illustrated in red on the histogram below.

If 135 small squares represents 54 people, we can work out how many people one small square represents:

If

$54\text{ people} = 135\text{ small squares}$

then:

$\text{1 person } = \dfrac{135}{54} = 2.5\text{ small squares}$

Now that we know that 1 person is represented by 2.5 small squares, we need to work out how many small squares there are between 20 and 40 seconds.

The number of small squares between 20 and 40 is:

$(5 \times 32) + (5 \times 20) = 160 + 100 = 260$

This is illustrated in green on the graph below.

Therefore, the number of people who can hold their breath for between 20 and 40 seconds is:

$\dfrac{260}{2.5} = 104\text{ people}$
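The whole solution can also be done in one proportional step, sketched below in Python. Each region is written as a pair of factors copied from the working above; which factor is the width and which is the height is an assumption, but it does not affect the area. The helper name `total_squares` is illustrative.

```python
def total_squares(rectangles):
    """Add up the areas of (side, side) rectangles measured in small squares."""
    return sum(a * b for a, b in rectangles)


known_area = total_squares([(5, 15), (15, 4)])    # 60 seconds and above
target_area = total_squares([(5, 32), (5, 20)])   # 20 to 40 seconds

# Scale the known 54 people by the ratio of the two areas.
people = target_area * 54 / known_area
print(known_area, target_area, people)  # 135 260 104.0
```

Scaling by the ratio of areas avoids rounding the intermediate value of 2.5 small squares per person, which matters when that value is not a tidy decimal.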

a) Since we are taking data from the histogram, we can see the frequency density and the band width, but we need to work out how many riders (the frequency) rode for 30 kilometres or less.

The key formula when we are dealing with histograms is:

$\text{Frequency density} = \dfrac{\text{frequency}}{\text{bandwidth}}$

If we need to work out the frequency, then we simply need to rearrange this formula:

If

$\text{Frequency density} = \dfrac{\text{frequency}}{\text{bandwidth}}$

then

$\text{Frequency} = \text{ frequency density}\times\text{ bandwidth}$

The number of riders (the frequency) who rode between 0 and 20 kilometres can be calculated as follows:

$4\times20 = 80\text{ riders}$

The number of riders (the frequency) who rode between 20 and 30 kilometres can be calculated as follows:

$10\times10 = 100\text{ riders}$

Therefore the number of riders who rode between 0 and 30 kilometres is:

$80+100=180\text{ riders}$
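The rearranged formula can be sketched in Python as follows. The tuple layout `(lower, upper, density)` and the function name are assumptions made for illustration; the boundaries and frequency densities are the ones read from the histogram in part a).

```python
def frequency(density, lower, upper):
    """Frequency = frequency density multiplied by band width."""
    return density * (upper - lower)


# Bars covering journeys of 30 kilometres or less: (lower, upper, density).
bars = [(0, 20, 4), (20, 30, 10)]

riders = [frequency(density, lower, upper) for lower, upper, density in bars]
print(riders, sum(riders))  # [80, 100] 180
```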

b) In order to work out the mean journey length, we need to work out how many riders there are in total. In order to do this, we need to work out how many riders rode between 0 – 20 kilometres, 20 – 30 kilometres, 30 – 54 kilometres etc.

We already know from the previous question that 80 riders rode between 0 and 20 kilometres and that a further 100 riders rode between 20 and 30 kilometres.

In the 30 – 54 kilometres category, we have a band width of 24 kilometres and a frequency density of 2, so the number of riders can be calculated as follows:

$24\times2 = 48\text{ riders}$

In the 54 – 70 kilometres category, we have a band width of 16 kilometres and a frequency density of 9, so the number of riders can be calculated as follows:

$16\times9 = 144\text{ riders}$

In the 70 – 90 kilometres category, we have a band width of 20 kilometres and a frequency density of 6, so the number of riders can be calculated as follows:

$20\times6 = 120\text{ riders}$

Although we now know exactly how many riders rode in each distance category, we cannot know exactly how far each rider rode since we are dealing with grouped data. In the 0 – 20 kilometres category, any of the 80 riders could have cycled 1 kilometre or 19 kilometres. What we have to do is assume that each cyclist rode the midpoint of their distance category (this is why the result is an estimated mean and not an exact mean).

The easiest thing to do is to tabulate our data, with one column for the midpoint of each distance category, another for the frequency (number of riders) and another for the midpoint multiplied by the frequency. This last column gives the total distance travelled by all the riders in that category combined; to work out the mean, we divide the total distance travelled by all riders by the total number of riders.

The tabulated data should look like the below:

The total of the frequency column is the total number of riders. The total of the ‘midpoint multiplied by frequency’ column is the total distance travelled by all of the riders. Therefore the estimated mean can be calculated as follows:

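Taking midpoints of 10, 25, 42, 62 and 80 kilometres, the ‘midpoint multiplied by frequency’ column sums to $800 + 2,500 + 2,016 + 8,928 + 9,600 = 23,844$ kilometres, and the frequency column sums to $80 + 100 + 48 + 144 + 120 = 492$ riders, so:

$\text{Estimated mean} = 23,844\text{ kilometres} \div 492\text{ riders} \approx 48.5\text{ kilometres}$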
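The tabulation can be reproduced with a short Python sketch, which is a convenient way to check the column totals. The function name and the `(lower, upper, density)` layout are illustrative; the category boundaries and frequency densities are those stated in the working above, and each midpoint is taken as halfway between its boundaries.

```python
def estimated_mean(bars):
    """Estimate the mean of grouped data from histogram bars.

    Each bar is (lower, upper, frequency_density). Every value in a class
    is assumed to sit at the class midpoint, so the result is an estimate.
    """
    total_frequency = 0
    total_distance = 0
    for lower, upper, density in bars:
        freq = density * (upper - lower)   # frequency = density x band width
        midpoint = (lower + upper) / 2
        total_frequency += freq
        total_distance += midpoint * freq
    return total_distance / total_frequency, total_frequency, total_distance


bars = [(0, 20, 4), (20, 30, 10), (30, 54, 2), (54, 70, 9), (70, 90, 6)]
mean, riders, distance = estimated_mean(bars)
print(riders, distance, round(mean, 1))
```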

a) In this question, it would appear that we have inadequate information in the table to complete the histogram, and inadequate information in the histogram to complete the table. There is also no frequency density scale on the $y$-axis to help us.

On the histogram, a bar representing the 20 – 40 cm fish caught has been drawn, but there is no data in the table for this category. However, the 10 – 20 cm fish length category has been drawn on the histogram and we also know that this represents a frequency of 22 fish. With this information, we can work out the frequency density allowing us to work out the frequency density scale on the $y$-axis.

$\text{Frequency density} = \dfrac{\text{frequency}}{\text{bandwidth}}$

Therefore, the frequency density for the 10 – 20 cm fish length category is:

$22\div10 = 2.2$

This tells us that the top of the bar for the 10 – 20 cm fish length category stops at 2.2. Therefore, we can now see that every small square on the $y$-axis represents a frequency density change of 0.1, so we can now label the rest of the $y$-axis accordingly.
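Calibrating an unlabelled axis in this way can be sketched in Python. The bar height of 22 small squares is an assumption inferred from the statement that the bar stops at 2.2 with each square worth 0.1; it is not given directly in the question. The function name is also illustrative.

```python
def density_per_square(frequency, band_width, bar_height_squares):
    """Frequency density represented by one small square on the y-axis."""
    density = frequency / band_width
    return density / bar_height_squares


# 10 - 20 cm category: frequency 22, band width 10 cm,
# bar assumed to be 22 small squares tall.
scale = density_per_square(22, 10, 22)
print(round(scale, 3))  # 0.1
```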

In order to complete the rest of the histogram, we need to work out the frequency densities for the length categories which have not already been drawn on the histogram.

The frequency density for the 0 – 4 cm length category can be calculated as follows:

$\text{Frequency density} = 32 \div 4 = 8$

The frequency density for the 20 – 40 cm length category can be calculated as follows:

$\text{Frequency density} = 42 \div 20 = 2.1$

The frequency density for the 40 – 45 cm length category can be calculated as follows:

$\text{Frequency density} = 30 \div 5 = 6$

The frequency density for the 55 – 70 cm length category can be calculated as follows:

$\text{Frequency density} = 9 \div 15 = 0.6$

Now that we have worked out the frequency density for each length category, we can now plot them on the histogram, with a result similar to the below:

b) For this part of the question, we need to fill in the gaps in the frequency column of the table. In order to do this, we will need to take a frequency density reading from the histogram for the 2 length categories in question.

Reading from the histogram, we see that the frequency density for the 4 – 10 cm category is 3.5, and the frequency density for the 45 – 55 cm category is 4.6. All we need to do is rearrange the frequency density formula so that we can work out the frequency.

Since

$\text{Frequency density} = \dfrac{\text{frequency}}{\text{bandwidth}}$

then

$\text{Frequency} =\text{frequency density}\times\text{bandwidth}$

Therefore, the frequency for the 4 – 10 cm length category can be calculated as follows:

$3.5\times6=21$

The frequency for the 45 – 55 cm length category can be calculated as follows:

$4.6\times10=46$

a) The key piece of information in this question is that 15 bags of flour weigh between 35 and 40 pounds. What we need to do is look and see what area of the histogram this represents. The area of the 35 – 40 pounds bar (do not accidentally work out the area of the entire 30 – 40 pounds bar!) can be calculated as follows:

$3\times25\text{ small squares} = 75\text{ small squares}$

We can therefore conclude that 15 bags of flour is represented by 75 small squares.

If

$15\text{ bags} = 75\text { small squares}$

then

$1\text{ bag} = 5\text { small squares}$

All we need to do now is work out how many small squares there are from 85 pounds upwards.

Between 85 and 95 pounds there are 50 small squares, and between 95 and 100 pounds, there are a further 150 small squares, giving us a total of 200 small squares.

Since 5 small squares represents a single bag of flour, then 200 squares represents 40 bags of flour.

b) The answer to part a) can only be an estimate because we are dealing with grouped data. We have assumed that the number of bags that weigh between 85 and 95 pounds is $\frac{2}{5}$ of the number of bags that weigh between 70 and 95 pounds. At one extreme, it is possible that all of these bags weigh less than 85 pounds and, at the other extreme, it is possible that they all weigh more than 85 pounds.

c) We know from the question that there are 200 bags of flour in total. Therefore the median weight of a bag of flour is the weight of the 100th bag (since 100 is half of 200). We will therefore need to work out which weight band the 100th bag of flour falls into. This is going to be difficult (impossible) at this stage since we do not know how many bags of flour are in the 30 – 40 pound category, the 40 – 55 pound category etc.

We know from part a) that 15 bags of flour weigh between 35 and 40 pounds. Since this is half of the total area of the 30 – 40 pound category, the number of bags between 30 and 40 pounds is:

$15\times2 = 30\text{ bags}$

In the 40 – 55 pound category, the area is 3 times the 35 – 40 pound strip, so this represents:

$15\times3 = 45\text{ bags}$

So far we have accounted for the first 75 bags of flour $(30+45=75)$ so haven’t reached the 100th bag of flour yet.

The 55 – 65 pound category has the same width as the 30 – 40 pound category. If we compare the two areas, the 55 – 65 pound category is 50 small squares larger than the 30 – 40 pound category. Therefore, once we know how many bags an area of 50 small squares represents, we can add this to 30 (the number of bags represented by the 30 – 40 pound category).

We know from the first question that 5 small squares corresponds to 1 bag, so 50 small squares will correspond to 10 bags.

Therefore the 55 – 65 pound category corresponds to 40 bags.

Since there are 30 bags in the 30 – 40 pound category and a further 45 bags in the 40 – 55 pound category, there are 75 bags that have a weight between 30 and 55 pounds. Therefore the 55 – 65 pound category accounts for the 76th bag to the 115th bag (115 since there are 75 bags between 30 and 55 pounds and 40 bags between 55 and 65 pounds). We are trying to locate the weight of the 100th bag, so we know it must be in the 55 to 65 pound weight category.

We are now in a position to calculate the estimated weight of the 100th bag (this is the hard bit!).

By subtracting the 75 bags that weigh less than 55 pounds from 100, we can work out that the 100th bag will be the 25th of the 40 bags between 55 and 65 pounds. We can write this as $\frac{25}{40}$. So where in the weight category does this fall? It will fall $\frac{25}{40}$ of the way between 55 – 65 pounds. Since this is a weight category of 10 pounds, we will need to perform the following calculation:

$\dfrac{25}{40}\times10=6.25\text{ pounds}$

Since the category starts at 55 pounds, the estimated weight of the median bag (the 100th bag) is $55+6.25=61.25 \text{ pounds}$.

(This last part seems complicated, but only because the fraction is not that easy. If there were 20 bags in the 55 – 65 pound category, and it was the 10th bag in this category that represented the median, since the 10th bag in the category is exactly half way through the 20 bags in the category, then its estimated weight would simply be half way between 55 and 65 pounds, so would therefore have a weight of 60 pounds.)
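The interpolation step can be sketched as a small Python function. The function and parameter names are illustrative, and the sketch assumes that bags are spread evenly across the category, which is the same assumption the written method makes.

```python
def estimated_median(lower, width, cumulative_before, class_frequency, position):
    """Estimate the value at a given position by linear interpolation.

    lower             -- lower boundary of the class containing the position
    width             -- class width
    cumulative_before -- number of items in all earlier classes
    class_frequency   -- number of items in this class
    position          -- position of the item wanted (e.g. 100 for the 100th)
    """
    rank_in_class = position - cumulative_before
    if not 0 <= rank_in_class <= class_frequency:
        raise ValueError("position does not fall inside this class")
    return lower + rank_in_class / class_frequency * width


# 100th bag: 75 bags below 55 pounds, 40 bags in the 55 - 65 pound category.
print(estimated_median(55, 10, 75, 40, 100))  # 61.25

# Simpler case from the note above: 10th of 20 bags in the same category.
print(estimated_median(55, 10, 0, 20, 10))    # 60.0
```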
