Grouped Data

A Level: AQA, Edexcel, OCR, OCR 2022

Grouped data can be represented in a histogram or a frequency polygon. We can use a histogram to estimate the mean, median and standard deviation of a data set.

Estimation from Histograms

(Note: for guidance on how to draw histograms, see Presenting Data.)

Since histograms collate data, it may seem impossible to answer questions such as how many data points are greater than $9$, unless $9$ is a class boundary. We can, however, estimate the answers to these questions by assuming that the frequency is evenly distributed across each class. Here is how to do it:

Example: Approximately how many values are greater than $12$ in this histogram?

Draw a vertical line at $12$ on the $x$-axis. This will split the second block. Since area represents frequency in a histogram, the area to the right of the line is our estimate. In this case, the part of the second block to the right of the line extends from $12$ to $20$, with a height of $2$, so it contributes a frequency of $8\times 2=16$. The third block has a width of $10$ and a height of $3$, so it contributes a frequency of $10\times 3=30$. In total, we estimate that $16+30=46$ values are greater than $12$.
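
The same area calculation can be sketched in Python. This is only an illustration: the function name `estimate_count_above` is made up for this sketch, and because the histogram itself is not reproduced here, the first block and the lower boundary of the second block ($10$) are assumptions. Only the values $12$ to $20$ with height $2$ and a third block of width $10$ with height $3$ come from the example.

```python
def estimate_count_above(threshold, classes, densities):
    """Estimate how many values exceed `threshold` in a histogram.

    classes   -- list of (lower, upper) class boundaries
    densities -- frequency density (bar height) of each class
    Assumes frequency is spread evenly across each class.
    """
    total = 0.0
    for (lower, upper), height in zip(classes, densities):
        # Width of the part of this bar that lies to the right of the threshold.
        width_right = max(0.0, upper - max(lower, threshold))
        total += width_right * height  # area = estimated frequency
    return total


# ASSUMED data: the first block (0 to 10, height 1) and the boundary at 10
# are hypothetical; the other values follow the worked example.
classes = [(0, 10), (10, 20), (20, 30)]
densities = [1, 2, 3]

print(estimate_count_above(12, classes, densities))  # 46.0
```

Blocks entirely to the left of the line contribute a width of zero, so they drop out of the total automatically.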

Frequency Polygon

A frequency polygon is another way to represent grouped data. It is a line graph joining the points with co-ordinates (midpoint of class, frequency).

Example:

The midpoints are $5,14,21,25,33$, so we plot the points:

$(5,9)$

$(14,15)$

$(21,17)$

$(25,9)$

$(33,4)$

and connect them with straight lines.
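
As a rough sketch of how these points are obtained, the Python snippet below computes each midpoint and pairs it with its frequency. The class boundaries are an assumption: the original table is not reproduced here, so they were chosen only so that the midpoints come out as $5,14,21,25,33$. The frequencies are the ones plotted above, and `polygon_points` is a name made up for this sketch.

```python
def polygon_points(classes, frequencies):
    """Return the (midpoint, frequency) points of a frequency polygon."""
    return [((lower + upper) / 2, f)
            for (lower, upper), f in zip(classes, frequencies)]


# ASSUMED class boundaries (hypothetical, chosen to match the midpoints).
classes = [(0, 10), (10, 18), (18, 24), (24, 26), (26, 40)]
frequencies = [9, 15, 17, 9, 4]

points = polygon_points(classes, frequencies)
print(points)
```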

Estimating the Mean and Standard Deviation from a Histogram

Previously, when we used frequency tables to find the mean and standard deviation, we looked at $x$, $fx$ and $fx^{2}$. While we clearly still have $f$, it is not obvious how we should get $x$. This is where the idea of midpoints comes in again.

To estimate the mean and standard deviation from a histogram, first turn the histogram into a table, then add a column of the midpoints of each class labelled $x$. Then, create columns $fx$ and $fx^{2}$ and find the totals of all of the columns. Finally, use these totals in the formulas for mean and standard deviation.

Recall the formulas:

$\text{mean}=\dfrac{\sum{fx}}{\sum{f}}$

$\text{variance}=\dfrac{\sum{fx^{2}}}{\sum{f}}-\text{mean}^{2}$

$\text{standard deviation}=\sqrt{\text{variance}}$
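
The table method can be sketched in Python as follows. The function name `grouped_mean_sd` is hypothetical. The data are the blocks implied by the worked answer in the Example Questions section (blocks from $0$ to $5$, $5$ to $20$ and $20$ to $30$ with frequencies $160$, $420$ and $120$), reconstructed from that answer rather than read from a table.

```python
import math


def grouped_mean_sd(classes, frequencies):
    """Estimate mean, variance and standard deviation of grouped data.

    Each class is represented by its midpoint x, then the usual
    sum(f), sum(fx) and sum(fx^2) totals are used.
    """
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    sum_f = sum(frequencies)
    sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    sum_fx2 = sum(f * x ** 2 for f, x in zip(frequencies, midpoints))

    mean = sum_fx / sum_f
    variance = sum_fx2 / sum_f - mean ** 2  # uses the unrounded mean
    return mean, variance, math.sqrt(variance)


# Blocks and frequencies reconstructed from the worked answer (an assumption).
classes = [(0, 5), (5, 20), (20, 30)]
frequencies = [160, 420, 120]

mean, variance, sd = grouped_mean_sd(classes, frequencies)
print(round(mean, 1), round(variance, 1), round(sd, 2))  # 12.4 49.6 7.04
```

Note that the variance is computed from the unrounded mean; rounding the mean before squaring it would change the final answer.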

Estimating the Median from a Histogram

To estimate the median from a histogram we use linear interpolation. This is where we assume that within each block, the frequency is evenly spaced.

To find the median, first find $\sum{f}$ and divide it by $2$ to find the position of the median (since this is an estimate, if we obtain a decimal we can treat it as if it is a whole number position). Then, find which block the position falls into. Then, within that block, find where it lies.

For example, if the median is the $7$th value in a block that contains $10$ values and has a width of $5$, then you would add $\dfrac{7\times 5}{10}=3.5$ to the lower bound of the block to find the median.
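
A minimal Python sketch of linear interpolation is given below. The names `interpolate_in_class` and `estimate_median` are made up for this sketch, the lower bound of $20$ in the first call is hypothetical, and the blocks in the second call are reconstructed from the worked answer in the Example Questions section (an assumption, since the table is not shown).

```python
def interpolate_in_class(lower, width, freq, position_in_class):
    """Value reached `position_in_class` values into a class,
    assuming the class's values are evenly spread across its width."""
    return lower + position_in_class * width / freq


def estimate_median(classes, frequencies):
    """Estimate the median of grouped data by linear interpolation."""
    position = sum(frequencies) / 2  # used as-is, even if not whole
    cumulative = 0
    for (lower, upper), freq in zip(classes, frequencies):
        if freq > 0 and cumulative + freq >= position:
            # The median lies in this class.
            return interpolate_in_class(
                lower, upper - lower, freq, position - cumulative
            )
        cumulative += freq
    raise ValueError("no data")


# 7th value of a block with 10 values and width 5; lower bound 20 is hypothetical.
print(interpolate_in_class(20, 5, 10, 7))  # 23.5

# Blocks reconstructed from the Example Questions answer (an assumption).
classes = [(0, 5), (5, 20), (20, 30)]
frequencies = [160, 420, 120]
print(round(estimate_median(classes, frequencies), 2))  # 11.79
```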

Example 1: Estimating the Mean and Standard Deviation from a Histogram

Find the mean and standard deviation of the data in the histogram below.

[6 marks]

Step 1: Create a table of the data from the histogram.

Step 2: Add columns for the midpoint ($x$), $fx$ and $fx^{2}$.

Step 3: Use the formulas to find the mean and standard deviation.

\begin{aligned}\text{mean}&=\dfrac{\sum{fx}}{\sum{f}}\\[1.2em]&=\dfrac{138.5}{31}=4.47\\[1.2em]\text{variance}&=\dfrac{\sum{fx^{2}}}{\sum{f}}-\text{mean}^{2}\\[1.2em]&=\dfrac{759.75}{31}-\left(\dfrac{138.5}{31}\right)^{2}\\[1.2em]&=4.55\\[1.2em]\text{standard deviation}&=\sqrt{\text{variance}}\\[1.2em]&=\sqrt{4.55}\\[1.2em]&=2.13\end{aligned}
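
The arithmetic can be checked from the column totals alone ($\sum f=31$, $\sum fx=138.5$, $\sum fx^{2}=759.75$). The short Python check below is only a sketch, and it also suggests why the mean should be kept unrounded when it is squared: squaring the rounded value $4.47$ gives a variance of about $4.53$ rather than $4.55$.

```python
import math

# Column totals taken from the worked example.
sum_f, sum_fx, sum_fx2 = 31, 138.5, 759.75

mean = sum_fx / sum_f
variance = sum_fx2 / sum_f - mean ** 2            # unrounded mean
sd = math.sqrt(variance)

# Same calculation, but with the mean rounded to 2 d.p. before squaring.
variance_rounded_mean = sum_fx2 / sum_f - round(mean, 2) ** 2

print(round(mean, 2), round(variance, 2), round(sd, 2))  # 4.47 4.55 2.13
print(round(variance_rounded_mean, 2))                   # 4.53
```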

Example 2: Estimating the Median from a Histogram

Find the median of the data in the histogram from the previous example.

[3 marks]

There are $31$ data points, so the median is the $15.5$th data point. We can treat the decimal as if it were a whole-number position for our estimate. There are $15$ data points in the first two blocks, so the median falls $0.5$ data points into the third block. This block contains $8$ data points and has a width of $1$, so the median lies $\dfrac{0.5\times 1}{8}=0.0625$ into the block. The block starts at $5$, so our estimate of the median is $5+0.0625=5.0625$.

Example Questions

\begin{aligned}\text{mean}&=\dfrac{\sum{fx}}{\sum{f}}\\[1.2em]&=\dfrac{162}{18}\\[1.2em]&=9\\[1.2em]\text{variance}&=\dfrac{\sum{fx^{2}}}{\sum{f}}-\text{mean}^{2}\\[1.2em]&=\dfrac{\sum{fx^{2}}}{\sum{f}}-9^{2}\\[1.2em]&=\dfrac{2430}{18}-81\\[1.2em]&=135-81\\[1.2em]&=54\end{aligned}

a) A line at $15$ would split the second block. To the right of $15$, this block has a width of $5$ and a height of $28$, giving a frequency of $5\times 28=140$. The third block has a width of $10$ and a height of $12$, giving a frequency of $10\times 12=120$. Overall, the estimated number of values greater than $15$ is $140+120=260$.

b)

c) Step 1: Using the table from part b), create a table containing totals, midpoints, $fx$ and $fx^{2}$.

Step 2: Use the formulas to find the mean and standard deviation.

$\text{mean}=\dfrac{\sum{fx}}{\sum{f}}=\dfrac{8650}{700}=12.4$ (3 s.f.)

$\text{variance}=\dfrac{\sum{fx^{2}}}{\sum{f}}-\text{mean}^{2}=\dfrac{141625}{700}-\left(\dfrac{8650}{700}\right)^{2}=49.6$ (3 s.f.)

$\text{standard deviation}=\sqrt{\text{variance}}=\sqrt{49.6}=7.04$ (3 s.f.)

d) The median is the $\dfrac{700}{2}=350$th value, which falls within the second block. Since $160$ values are in the first block, this is the $350-160=190$th value of the second block. The second block has a width of $15$ and a frequency of $420$, so the $190$th value lies

$\dfrac{190\times 15}{420}=\dfrac{95}{14}$

into the block. Adding the lower boundary of the second block, which is $5$ (the width of the first block), gives $5+\dfrac{95}{14}=\dfrac{165}{14}\approx 11.8$, which is our estimate of the median.
