# Outliers and Interquartile Range


## Interquartile Range

Recall: The range is the highest value minus the lowest value.

The range is a measure of variation (how spread out the data is). It is severely affected by extreme values and outliers. To handle this problem, we introduce the interquartile range.

## Quartiles and the Interquartile Range

Quartiles are values that split the data into four, in the same way that the median splits the data into two (in fact, the median is the second quartile).

Recall: To find the median, we find $\dfrac{n}{2}$, where $n$ is the number of data points (the total frequency). If this is a whole number, the median is the average of this term and the term above. If this is not a whole number, we round it up to find the position of the median term.

We find quartiles in very similar ways.

The first (or lower) quartile is calculated from $\dfrac{n}{4}$. If this is a whole number then the first quartile is the average of this term and the term above. If this is not a whole number then we round the number up to find the position of the first quartile.

The third (or upper) quartile is calculated from $\dfrac{3n}{4}$. If this is a whole number then the third quartile is the average of this term and the term above. If this is not a whole number then we round the number up to find the position of the third quartile.

Note: We always round up to find the position of the quartile, even if $\dfrac{n}{4}$ or $\dfrac{3n}{4}$ would usually be rounded down.

Finally we define the interquartile range:

$\text{interquartile range}=\text{third quartile}-\text{first quartile}$
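The positional rule above can be sketched in a few lines of Python. This is only an illustration of the method described on this page: the function names and the small data set are made up for the example, and other textbooks and software packages use slightly different quartile conventions that can give different answers.

```python
def value_at_fraction(sorted_data, numerator, denominator):
    """Value at position numerator*n/denominator, using the rule on this page."""
    n = len(sorted_data)
    product = numerator * n
    if product % denominator == 0:
        # Whole number: average this term and the term above.
        k = product // denominator
        return (sorted_data[k - 1] + sorted_data[k]) / 2
    # Not a whole number: always round up to get the position.
    k = product // denominator + 1
    return sorted_data[k - 1]


def median(data):
    return value_at_fraction(sorted(data), 1, 2)


def first_quartile(data):
    return value_at_fraction(sorted(data), 1, 4)


def third_quartile(data):
    return value_at_fraction(sorted(data), 3, 4)


def interquartile_range(data):
    return third_quartile(data) - first_quartile(data)


# Illustrative data set (not taken from an exam question).
sample = [2, 3, 5, 8, 9, 12, 15, 20]
print(first_quartile(sample))       # 4.0  (average of 2nd and 3rd terms)
print(third_quartile(sample))       # 13.5 (average of 6th and 7th terms)
print(interquartile_range(sample))  # 9.5
```

Whole-number arithmetic is used to test whether $\dfrac{n}{4}$ or $\dfrac{3n}{4}$ is a whole number, which avoids any floating-point rounding surprises.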

## Outliers

The interquartile range provides a method for dealing with outliers. Because it is the range of the middle half of the data, it is not affected by any outliers. It is therefore sensible to define an outlier as a data point that lies more than a certain multiple of the interquartile range below the first quartile or above the third quartile. An exam question might, for example, provide a data set and ask you to calculate the interquartile range and find any outliers.

Note: The particular multiplier you need to use to identify any outliers will be given to you in the question.

## Example 1: The Interquartile Range

Consider the data set $1,4,5,5,6,6,6,6,7,10,12$. What is the interquartile range?

[2 marks]

There are $11$ data points. $\dfrac{11}{4}=2.75$, so the first quartile is in the third position, which is $5$.

$\dfrac{3\times 11}{4}=8.25$, so the third quartile is in the $9$th position, which is $7$. So the interquartile range is $7-5=2$.

## Example 2: Outliers

Consider the data set $21,34,35,39,41,42,44$. A data point is said to be an outlier if it is more than $1.5$ times the interquartile range above the third quartile or below the first quartile. Identify any outliers.

[5 marks]

There are $7$ data points. $\dfrac{7}{4}=1.75$, so the first quartile is in the second position, which is $34$.

$\dfrac{3\times 7}{4}=5.25$, so the third quartile is in the $6$th position, which is $42$. So the interquartile range is $42-34=8$.

Calculate boundaries for outliers: $1.5\times 8=12$ so the lower boundary is $34-12=22$ and the upper boundary is $42+12=54$. The data value $21$ falls below the lower boundary, so it is an outlier. No value lies above the upper boundary, so there are no other outliers.
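The working for this example can be checked with a short Python sketch. The helper names `quartile` and `find_outliers` are hypothetical, chosen for this illustration; the code assumes the quartile rule from this page and the "more than" definition of an outlier given in the question, with the multiplier passed in as a parameter.

```python
def quartile(sorted_data, k):
    """k-th quartile (k = 1 or 3) using the positional rule from this page."""
    n = len(sorted_data)
    product = k * n
    if product % 4 == 0:
        # Whole number: average this term and the term above.
        pos = product // 4
        return (sorted_data[pos - 1] + sorted_data[pos]) / 2
    # Not a whole number: round up to the next position.
    pos = product // 4 + 1
    return sorted_data[pos - 1]


def find_outliers(data, multiplier):
    """Return (lower boundary, upper boundary, list of outliers)."""
    ordered = sorted(data)
    q1 = quartile(ordered, 1)
    q3 = quartile(ordered, 3)
    iqr = q3 - q1
    lower = q1 - multiplier * iqr
    upper = q3 + multiplier * iqr
    # Strict inequalities: a value exactly on a boundary is not counted.
    outliers = [x for x in ordered if x < lower or x > upper]
    return lower, upper, outliers


lower, upper, outliers = find_outliers([21, 34, 35, 39, 41, 42, 44], 1.5)
print(lower, upper, outliers)  # 22.0 54.0 [21]
```

Keeping the multiplier as an argument reflects the fact that the question tells you which multiple of the interquartile range to use.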

## Example Questions

There are $8$ data points. $\dfrac{8}{4}=2$ so for the lower quartile we average the second and third points, which is $\dfrac{11+21}{2}=16$. $\dfrac{3\times 8}{4}=6$ so for the upper quartile we average the sixth and seventh points, which is $\dfrac{49+51}{2}=50$. So the interquartile range is $50-16=34$.

There are $100$ data points.

$\dfrac{100}{4}=25$ so the first quartile is the average of the $25$th and $26$th data points, both of which are $3$; so the first quartile is $3$.

$\dfrac{3\times 100}{4}=75$ so the third quartile is the average of the $75$th and $76$th data points, both of which are $6$; so the third quartile is $6$.

Note: a cumulative frequency table makes it easy to see which value a given data point takes.

The interquartile range is $6-3=3$.
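The cumulative frequency idea in the note can be sketched in Python as follows. The frequency table here is hypothetical (it is not the table from the question); it was simply chosen to contain $100$ data points, and `value_at_position` is a made-up helper name.

```python
from itertools import accumulate


def value_at_position(freq_table, position):
    """Value of the data point at a 1-based position in the ordered data."""
    values = sorted(freq_table)
    # Running totals of the frequencies, i.e. the cumulative frequencies.
    cumulative = accumulate(freq_table[v] for v in values)
    for value, running_total in zip(values, cumulative):
        if position <= running_total:
            return value
    raise ValueError("position exceeds the total frequency")


# Hypothetical table: value -> frequency (frequencies sum to 100).
table = {1: 5, 2: 12, 3: 20, 4: 18, 5: 15, 6: 14, 7: 10, 8: 6}

# 100/4 = 25 is a whole number, so average the 25th and 26th data points.
q1 = (value_at_position(table, 25) + value_at_position(table, 26)) / 2
# 3*100/4 = 75 is a whole number, so average the 75th and 76th data points.
q3 = (value_at_position(table, 75) + value_at_position(table, 76)) / 2
print(q1, q3, q3 - q1)  # 3.0 6.0 3.0
```

The first cumulative frequency that reaches or passes the required position identifies the value of that data point.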

There are 20 data points.

$\dfrac{20}{4}=5$ so the first quartile is the average of the fifth and sixth data points, which is $\dfrac{46+48}{2}=47$.

$\dfrac{3\times 20}{4}=15$ so the third quartile is the average of the $15$th and $16$th data points, which is $\dfrac{56+58}{2}=57$.

The interquartile range is $57-47=10$.

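So our lower boundary is $47-1.5\times10=32$ and our upper boundary is $57+1.5\times10=72$.

$4$ and $16$ lie below the lower boundary, while $81$ and $99$ lie above the upper boundary. The value $72$ lies exactly on the upper boundary, so if, as in Example 2, an outlier must lie strictly beyond a boundary, it is not an outlier and there are four outliers overall.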
