Outliers and Interquartile Range

Interquartile Range

Recall: The range is equal to the highest value minus the lowest value.

The range is a measure of variation (how spread out the data is). It is affected severely by extreme values and outliers. To handle this problem, we introduce the interquartile range.
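As a rough illustration of how strongly a single extreme value affects the range, consider the sketch below. The data values and the helper name data_range are made up for this illustration and do not come from an exam question.

```python
def data_range(values):
    """Range = highest value minus lowest value."""
    return max(values) - min(values)

# Illustrative data only (an assumption, not taken from a question).
typical = [34, 35, 39, 41, 42, 44]
with_extreme = typical + [120]  # the same data plus one extreme value

print(data_range(typical))       # 10
print(data_range(with_extreme))  # 86
```

One extra data point changes the range from 10 to 86, even though the rest of the data is unchanged.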

Quartiles and the Interquartile Range

Quartiles are values that split the data into four, in the same way that the median splits the data into two (in fact, the median is the second quartile).

Recall: To find the median, we calculate \dfrac{n}{2}, where n is the number of data points. If this is a whole number, the median is the average of the value in this position and the value in the next position. If it is not a whole number, we round up to find the position of the median.

We find quartiles in very similar ways.

The first (or lower) quartile is calculated from \dfrac{n}{4}. If this is a whole number, the first quartile is the average of the value in this position and the value in the next position. If it is not a whole number, we round up to find the position of the first quartile.

The third (or upper) quartile is calculated from \dfrac{3n}{4}. If this is a whole number, the third quartile is the average of the value in this position and the value in the next position. If it is not a whole number, we round up to find the position of the third quartile.

Note: We always round up to find the position of the quartile, even if \dfrac{n}{4} or \dfrac{3n}{4} would usually be rounded down.

Finally we define the interquartile range:

\text{interquartile range}=\text{third quartile}-\text{first quartile}
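The position rule above can be sketched in code. This is a minimal sketch, assuming the data is a list of numbers; the function names (value_at_fraction, lower_quartile and so on) are hypothetical labels chosen here, and the eight-value list is illustrative data. Note that statistical software often uses different quartile conventions, so library results may not match this rule.

```python
def value_at_fraction(sorted_data, numerator, denominator):
    """Value at position numerator*n/denominator, counting from 1."""
    n = len(sorted_data)
    if (numerator * n) % denominator == 0:
        # Whole number: average the value in this position and the next one.
        pos = numerator * n // denominator
        return (sorted_data[pos - 1] + sorted_data[pos]) / 2
    # Not a whole number: always round the position up.
    pos = -(-(numerator * n) // denominator)  # ceiling division
    return sorted_data[pos - 1]

def median(data):
    return value_at_fraction(sorted(data), 1, 2)

def lower_quartile(data):
    return value_at_fraction(sorted(data), 1, 4)

def upper_quartile(data):
    return value_at_fraction(sorted(data), 3, 4)

def interquartile_range(data):
    return upper_quartile(data) - lower_quartile(data)

# Illustrative data with n = 8, so n/4 and 3n/4 are whole numbers.
eight_values = [2, 4, 4, 7, 9, 10, 12, 15]
print(lower_quartile(eight_values))       # 4.0
print(upper_quartile(eight_values))       # 11.0
print(interquartile_range(eight_values))  # 7.0
```

Integer arithmetic is used to test whether the position is a whole number, which avoids any floating-point rounding when deciding whether to average or round up.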

Outliers

The interquartile range also gives a way to identify outliers. Because it is the range of the middle half of the data, it is not affected by extreme values. It is therefore sensible to define an outlier as a data point that lies more than a certain multiple of the interquartile range below the first quartile or above the third quartile. An exam question might, for example, give a data set and ask you to calculate the interquartile range and identify any outliers.

Note: The particular multiplier you need to use to identify any outliers will be given to you in the question.
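In symbols, writing k for the multiplier given in the question (the symbol k is a label chosen here for illustration), the boundaries for outliers could be written as:

\text{lower boundary}=\text{first quartile}-k\times\text{interquartile range}

\text{upper boundary}=\text{third quartile}+k\times\text{interquartile range}

A data point below the lower boundary or above the upper boundary is then an outlier.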

Example 1: The Interquartile Range

Consider the data set 1,4,5,5,6,6,6,6,7,10,12. What is the interquartile range?

[2 marks]

There are 11 data points. \dfrac{11}{4}=2.75, so the first quartile is in the third position, which is 5.

\dfrac{3\times 11}{4}=8.25, so the third quartile is in the 9th position, which is 7. So the interquartile range is 7-5=2.

Example 2: Outliers

Consider the data set 21,34,35,39,41,42,44. A data point is said to be an outlier if it is more than 1.5 times the interquartile range above the third quartile or below the first quartile. Identify any outliers.

[5 marks]

There are 7 data points. \dfrac{7}{4}=1.75, so the first quartile is in the second position, which is 34.

\dfrac{3\times 7}{4}=5.25, so the third quartile is in the 6th position, which is 42. So the interquartile range is 42-34=8.

Calculate boundaries for outliers: 1.5\times 8=12 so the lower boundary is 34-12=22 and the upper boundary is 42+12=54. The data value 21 falls below the lower boundary, so it is an outlier. There are no other outliers.
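The method in this example can be sketched in code as a check. This is a minimal sketch, assuming the quartile rule from earlier on this page and the "more than" definition stated in the question; the function names quartile and find_outliers are hypothetical.

```python
def quartile(sorted_data, numerator):
    """Quartile at position numerator*n/4 (counting from 1)."""
    n = len(sorted_data)
    if (numerator * n) % 4 == 0:
        # Whole number: average this value and the next one.
        pos = numerator * n // 4
        return (sorted_data[pos - 1] + sorted_data[pos]) / 2
    # Not a whole number: round the position up.
    pos = -(-(numerator * n) // 4)
    return sorted_data[pos - 1]

def find_outliers(data, multiplier):
    """Return (lower boundary, upper boundary, list of outliers)."""
    s = sorted(data)
    q1, q3 = quartile(s, 1), quartile(s, 3)
    iqr = q3 - q1
    lower = q1 - multiplier * iqr
    upper = q3 + multiplier * iqr
    # Strict inequalities: a value exactly on a boundary is not counted.
    outliers = [x for x in s if x < lower or x > upper]
    return lower, upper, outliers

lower, upper, outliers = find_outliers([21, 34, 35, 39, 41, 42, 44], 1.5)
print(outliers)  # [21]
```

The multiplier is a parameter because, as noted above, the question specifies which multiple of the interquartile range to use.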

Example Questions

There are 8 data points. \dfrac{8}{4}=2 so for the lower quartile we average the second and third points, which is \dfrac{11+21}{2}=16. \dfrac{3\times 8}{4}=6 so for the upper quartile we average the sixth and seventh points, which is \dfrac{49+51}{2}=50. So the interquartile range is 50-16=34.

There are 100 data points.

\dfrac{100}{4}=25 so the first quartile is the average of the 25th and 26th data points, both of which are 3; so the first quartile is 3.

\dfrac{3\times 100}{4}=75 so the third quartile is the average of the 75th and 76th data points, both of which are 6; so the third quartile is 6.

Note: A cumulative frequency table makes it easy to see which value the data point in a given position takes.

The interquartile range is 6-3=3.
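To illustrate the note about cumulative frequency, here is a minimal sketch of finding quartiles from a frequency table. The table below is hypothetical (it is not the data from this question), and the helper name value_at is an assumption made for this sketch.

```python
from bisect import bisect_left
from itertools import accumulate

# Hypothetical frequency table: values[i] occurs frequencies[i] times.
values = [1, 2, 3, 4, 5]
frequencies = [3, 5, 6, 4, 2]

# Running totals of the frequencies: [3, 8, 14, 18, 20].
cumulative = list(accumulate(frequencies))

def value_at(position):
    """Value of the data point in the given position (counting from 1)."""
    # Find the first value whose cumulative frequency reaches the position.
    return values[bisect_left(cumulative, position)]

n = cumulative[-1]  # 20 data points, so n/4 and 3n/4 are whole numbers

# Whole-number positions: average the value in this position and the next.
first_quartile = (value_at(n // 4) + value_at(n // 4 + 1)) / 2
third_quartile = (value_at(3 * n // 4) + value_at(3 * n // 4 + 1)) / 2

print(first_quartile, third_quartile, third_quartile - first_quartile)  # 2.0 4.0 2.0
```

Here the 5th and 6th data points both fall in the group whose cumulative frequency first reaches 8, so both equal 2; the 15th and 16th both fall in the group reaching 18, so both equal 4.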

There are 20 data points.

\dfrac{20}{4}=5 so the first quartile is the average of the fifth and sixth data points, which is \dfrac{46+48}{2}=47

\dfrac{3\times 20}{4}=15 so the third quartile is the average of the 15th and 16th data points, which is \dfrac{56+58}{2}=57

The interquartile range is 57-47=10.

So our lower boundary is 47-1.5\times 10=32 and our upper boundary is 57+1.5\times 10=72.

4 and 16 lie below the lower boundary, while 81 and 99 lie above the upper boundary. The value 72 lies exactly on the upper boundary, so it is an outlier only if the question counts values on the boundary. This gives five outliers if boundary values count, and four if an outlier must lie strictly beyond a boundary.
