How do you test whether a value is an outlier, name the shape of a data set, and write a full description of a distribution?
Determine outliers using the interquartile range, describe and interpret the shape and features of a distribution (symmetry, skewness, modality, centre, spread and outliers) and compare data displays using these features
A focused answer to the HSC Maths Standard 2 dot point on outliers and describing distributions. The 1.5 times IQR outlier test with lower and upper fences, telling symmetric from positively and negatively skewed data, unimodal versus bimodal shape, and writing a full describe-the-distribution answer covering shape, centre, spread and outliers, with worked Australian examples.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
NESA wants you to do two linked jobs. First, apply a definite rule to decide whether an extreme value is an outlier: the $1.5 \times \text{IQR}$ test, which builds a lower and an upper "fence" from the quartiles and flags anything beyond them. Second, describe a distribution in words, naming its shape (symmetric, or skewed left or right), its modality (one peak or two), its centre, its spread, and any outliers. Almost every Data Analysis question that shows a graph or a data set ends with "describe the distribution" or "is this an outlier", so these are among the most reliable marks in the module. The arithmetic is light; the marks are won by stating the fence you test against, showing the comparison, and using the right vocabulary for the shape.
The answer
There are two skills here, and they share one idea: the middle of the data is steady, and you measure everything relative to it. Outliers are found by stepping a fixed distance ($1.5 \times \text{IQR}$) out from the quartiles. Shape is read from how the data sits around its centre - balanced (symmetric) or lopsided (skewed).
The interquartile range, quickly
The interquartile range is the spread of the middle half of the data:
$$\text{IQR} = Q_3 - Q_1$$
where $Q_1$ is the lower quartile (a quarter of the way through the ordered data) and $Q_3$ is the upper quartile (three quarters of the way through). The IQR ignores the extreme top and bottom quarters, so it is not distorted by a single wild value - which is exactly why the outlier test is built on it.
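As a rough illustration of the definition above, the sketch below finds the quartiles by splitting the ordered data into a lower and an upper half and taking the median of each. This "median of halves" convention (leaving the middle value out of both halves when the count is odd) is an assumption; calculators and software can use other conventions and give slightly different quartiles. The function names and the data are made up for illustration.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) by the median-of-halves method.

    Assumption: with an odd number of values, the middle value is
    left out of both halves. Needs at least two values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd.
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(upper)


def iqr(values):
    """Interquartile range: Q3 minus Q1."""
    q1, q3 = quartiles(values)
    return q3 - q1


# Hypothetical data, invented for this example only.
scores = [2, 4, 5, 7, 8, 10, 11, 13]
q1, q3 = quartiles(scores)
print(q1, q3, iqr(scores))  # prints: 4.5 10.5 6.0
```

Because the IQR uses only the two quartiles, replacing the largest score with a much bigger number would leave the result unchanged.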
The outlier test
An outlier is a value that lies unusually far from the rest of the data. The standard test draws two fences:
- lower fence $= Q_1 - 1.5 \times \text{IQR}$,
- upper fence $= Q_3 + 1.5 \times \text{IQR}$.
Any value below the lower fence or above the upper fence is an outlier. Everything between the fences is treated as ordinary. The number line below shows the fences built out from the quartiles, with two values flagged because they fall beyond them.
The two outliers in the diagram are labelled with their values, not just coloured, so they are identifiable even in black and white. Notice the test treats high and low extremes the same way: always check both fences, because a question may hide a low outlier while you stare at an obvious high one.
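The fence test can be sketched in a few lines. The version below takes the quartiles as given (as an exam question usually does), builds both fences, and lists every value beyond them. The function names and the data set are hypothetical, chosen only to show one low and one high outlier.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) for the 1.5 x IQR test."""
    step = 1.5 * (q3 - q1)
    return q1 - step, q3 + step


def find_outliers(values, q1, q3):
    """List the values strictly below the lower fence or above the upper fence."""
    lower, upper = fences(q1, q3)
    return [v for v in values if v < lower or v > upper]


# Hypothetical data and quartiles, invented for this example only.
data = [3, 21, 23, 24, 25, 26, 28, 30, 52]
q1, q3 = 22, 29

print(fences(q1, q3))               # prints: (11.5, 39.5)
print(find_outliers(data, q1, q3))  # prints: [3, 52]
```

Both ends are checked in one pass, which mirrors the advice to test the low extreme as well as the high one. A value sitting exactly on a fence is not flagged, because the test asks for values beyond the fence.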
Shape: symmetry and skew
The shape of a distribution is how the data sits around its centre. There are three shapes you must name on sight:
- Symmetric: the data is balanced about the centre, so the left and right halves are near mirror images. The mean and median are roughly equal.
- Positively skewed (skewed to the right): most data is bunched at the low end with a long tail stretching to the right. The few large values pull the mean above the median.
- Negatively skewed (skewed to the left): most data is bunched at the high end with a long tail stretching to the left. The few small values pull the mean below the median.
The skew is named for the direction the tail points, which trips up many students: a right-pointing tail is positive skew even though the bulk of the data is on the left. The three smooth curves below show the shapes side by side, with the mean and median marked so you can see how skew pulls them apart.
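The mean-versus-median relationship can be turned into a rough numerical hint for the shape. The sketch below is only a heuristic, not a formal test: it compares the gap between mean and median with the range, and the `tolerance` of 5% is an arbitrary, hypothetical cut-off. A graph of the data is still the proper way to judge shape.

```python
from statistics import mean, median


def skew_hint(values, tolerance=0.05):
    """Suggest a shape from how far the mean sits from the median.

    tolerance is a hypothetical threshold: the mean-median gap is
    compared with this fraction of the range.
    """
    spread = max(values) - min(values)
    if spread == 0:
        return "roughly symmetric"
    gap = (mean(values) - median(values)) / spread
    if gap > tolerance:
        return "positively skewed"   # mean pulled above the median
    if gap < -tolerance:
        return "negatively skewed"   # mean pulled below the median
    return "roughly symmetric"


# Hypothetical data sets, invented for this example only.
print(skew_hint([40, 42, 45, 47, 50, 52, 55, 120]))  # long right tail
print(skew_hint([10, 60, 70, 72, 75, 78, 80]))       # long left tail
print(skew_hint([1, 2, 3, 4, 5]))                    # balanced
```

The first set has one large value dragging the mean up, the second has one small value dragging it down, and the third is balanced about its centre.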
Modality: how many peaks
Modality counts the clear peaks in the data:
- Unimodal: one clear peak (one mode). Most single-group data is unimodal.
- Bimodal: two clear, separate peaks. Two peaks almost always means two groups have been combined - for example heights of male and female students, or sales on weekdays versus weekends. When you see bimodal data, the useful comment is that the data may be better split and described as two groups.
A set with no clear peak (all bars about level) is sometimes called uniform, but unimodal and bimodal are the two you will name most.
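Counting peaks can be sketched from the bar heights of a histogram or the frequencies in a table. The version below is a simplified illustration: it counts every bar that is higher than both neighbours, so a small bump counts as a peak, and deciding whether a peak is "clear" still takes judgement. The function names and frequencies are hypothetical, and the list is assumed to be non-empty.

```python
def count_peaks(frequencies):
    """Count bars that are higher than the bars on both sides."""
    # Collapse runs of equal heights so a flat-topped peak counts once.
    heights = [f for i, f in enumerate(frequencies)
               if i == 0 or f != frequencies[i - 1]]
    # Pad with zeros so the first and last bars can also be peaks.
    padded = [0] + heights + [0]
    return sum(1 for i in range(1, len(padded) - 1)
               if padded[i - 1] < padded[i] > padded[i + 1])


def modality(frequencies):
    """Name the modality from a non-empty list of bar heights."""
    if len(set(frequencies)) == 1:
        return "uniform"  # all bars level, no clear peak
    peaks = count_peaks(frequencies)
    return {1: "unimodal", 2: "bimodal"}.get(peaks, "multimodal")


# Hypothetical frequency lists, invented for this example only.
print(modality([1, 3, 7, 4, 2]))           # prints: unimodal
print(modality([2, 6, 9, 4, 1, 5, 8, 3]))  # prints: bimodal
print(modality([4, 4, 4, 4]))              # prints: uniform
```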
Writing a full "describe the distribution" answer
When a question says "describe the distribution", markers expect a checklist, not a vibe. Cover four features, in this order:
- Shape - symmetric, positively skewed, or negatively skewed (and mention bimodal if there are two peaks).
- Centre - quote the median (preferred when the data is skewed or has an outlier) or the mean, with its value.
- Spread - quote the IQR (preferred when skewed) or the range, with its value.
- Outliers - state any outliers (ideally justified by the $1.5 \times \text{IQR}$ test) and whether they are kept.
A reliable sentence frame is: "The distribution is [shape], centred at [median] with a spread (IQR) of [value], and [has one outlier at .../ has no outliers]." Pairing median with IQR is the safe choice, because both resist outliers; pair mean with standard deviation only when the data is roughly symmetric.
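The sentence frame can be filled in mechanically once the four features are known. The sketch below simply slots values into the frame; the function name and the example values are hypothetical, and the shape is passed in as a phrase because naming it is a judgement made from the graph.

```python
def describe(shape, median, iqr, outliers):
    """Fill in the describe-the-distribution sentence frame."""
    if not outliers:
        tail = "has no outliers"
    elif len(outliers) == 1:
        tail = f"has one outlier at {outliers[0]}"
    else:
        listed = ", ".join(str(v) for v in outliers)
        tail = f"has {len(outliers)} outliers at {listed}"
    return (f"The distribution is {shape}, centred at {median} "
            f"with a spread (IQR) of {iqr}, and {tail}.")


# Hypothetical values, invented for this example only.
print(describe("positively skewed", 25, 7, [52]))
print(describe("roughly symmetric", 14, 6, []))
```

The first call produces "The distribution is positively skewed, centred at 25 with a spread (IQR) of 7, and has one outlier at 52." In a written answer you would also add the units.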
How exam questions ask about outliers and shape
The wording maps straight onto a method:
- "Is [value] an outlier?" or "Determine whether ... is an outlier" - run the $1.5 \times \text{IQR}$ test: find the IQR, find the relevant fence, then state the comparison and conclusion.
- "Show that [value] is an outlier" - the answer is already known, so the marks are entirely in the working: fence calculation plus the comparison.
- "Describe the shape" or "What is the shape of the distribution?" - name symmetric, positive skew or negative skew (the tail names the skew), and add modality if there are two peaks.
- "Describe the distribution" - the full four-part answer: shape, centre, spread, outliers.
- "Which measure of centre is more appropriate?" - the median if the data is skewed or has an outlier, because it resists extreme values; otherwise the mean.
- "Compare the two distributions" - compare like with like: centre against centre and spread against spread, using the median and IQR, then note shape and outliers.
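The "compare like with like" rule for two distributions can be sketched as a pair of statements, one for centre and one for spread. The function name, group names and values below are hypothetical, and the wording of each sentence is just one reasonable phrasing; shape and outliers would still be noted separately.

```python
def compare(name_a, median_a, iqr_a, name_b, median_b, iqr_b):
    """Return (centre sentence, spread sentence) comparing two groups."""
    if median_a == median_b:
        centre = f"{name_a} and {name_b} have the same median ({median_a})."
    else:
        # Order the groups so the higher median comes first.
        (hi, hi_m), (lo, lo_m) = sorted(
            [(name_a, median_a), (name_b, median_b)],
            key=lambda pair: pair[1], reverse=True)
        centre = f"{hi} has the higher centre (median {hi_m} versus {lo_m})."

    if iqr_a == iqr_b:
        spread = f"{name_a} and {name_b} have the same spread (IQR {iqr_a})."
    else:
        # Order the groups so the smaller IQR comes first.
        (tight, tight_i), (wide, wide_i) = sorted(
            [(name_a, iqr_a), (name_b, iqr_b)],
            key=lambda pair: pair[1])
        spread = f"{tight} is more consistent (IQR {tight_i} versus {wide_i})."
    return centre, spread


# Hypothetical summary values, invented for this example only.
centre, spread = compare("Team X", 14, 6, "Team Y", 17, 3)
print(centre)  # prints: Team Y has the higher centre (median 17 versus 14).
print(spread)  # prints: Team Y is more consistent (IQR 3 versus 6).
```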
Exam-style practice questions
Practice questions written in the style of NESA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
2022 HSC-style, 3 marks. A data set has a lower quartile of and an upper quartile of . The largest value is . Determine whether is an outlier, showing your working.
A full-mark response computes the IQR, , then the upper fence, .
It then states the comparison explicitly: , therefore is an outlier.
Markers award one mark for the IQR, one for correctly applying the $1.5 \times \text{IQR}$ rule to get the upper fence, and one for the comparison and conclusion. A bare "yes" with no fence shown scores poorly, even if the answer is correct.
2021 HSC-style, 4 marks. The histogram of weekly earnings for a group of workers is bunched at the lower end with a long tail stretching to the right, where a few workers earn much more. (a) Name the shape of the distribution. (b) State whether the mean or the median is the larger measure of centre, and explain why. (c) State which measure better represents a typical worker, with a reason.
Part (a): the distribution is positively skewed (skewed to the right) - the long tail points to the right.
Part (b): the mean is larger than the median, because the small number of very high earners in the right tail pull the mean up, while the median (the middle position) is barely affected.
Part (c): the median better represents a typical worker, because it is resistant to the few extreme high incomes that distort the mean.
Markers reward the correct shape name, the correct mean-versus-median direction WITH the tail reasoning, and a justified choice of the median for a typical value. Naming the skew the wrong way (a common slip) loses the part (a) and part (b) marks together.
2023 HSC-style, 4 marks. A set of daily maximum temperatures has a five-number summary with minimum , , median , and maximum . (a) Show that is an outlier. (b) Describe the distribution, referring to shape, centre and spread.
Part (a): IQR ; upper fence ; since , the value is an outlier.
Part (b): shape is positively skewed (the upper tail is stretched by the high value); centre is a median of degrees (preferred over the mean because of the outlier); spread is an IQR of degrees for the middle half, with a full range of degrees inflated by the outlier.
Markers award the outlier test (fence plus comparison), then one mark each for a correctly justified shape, centre and spread. Quoting the median rather than the mean for the centre of a skewed set is part of what is rewarded here.
Practice questions
Original practice questions graded from foundation to exam level, each with a full worked solution. Try them before revealing the solution.
Foundation, 2 marks. For a data set the lower quartile is and the upper quartile is . (a) Find the interquartile range. (b) Find the lower and upper outlier fences using the $1.5 \times \text{IQR}$ rule.
Part (a) - interquartile range. The IQR is the upper quartile minus the lower quartile:
Part (b) - the two fences. The lower fence sits below and the upper fence sits above . Here , so
Any value below or above would be flagged as an outlier. (Check: the fences sit one full step outside each quartile, so they should straddle the quartiles symmetrically, and and are each away from and .)
Foundation, 2 marks. A data set is positively skewed (skewed to the right). (a) State which is larger, the mean or the median. (b) State on which side the long tail of the data lies.
Part (a) - mean versus median. In a positively skewed set the few large values in the tail pull the mean upward, while the median (a position, not a total) barely moves. So the mean is greater than the median.
Part (b) - the tail. Positive skew means the data is stretched out towards the high (positive) end, so the long tail points to the right. The bulk of the data is bunched on the left with a few large values trailing off to the right. (Memory hook: the skew is named for the direction the tail points, so "positive / right skew" has its tail on the right.)
Foundation, 1 mark. A histogram of student heights shows two clear, separate peaks. State the modality of the distribution and suggest what the two peaks might represent.
Count the peaks. Two clear, separate peaks means the distribution is bimodal.
Interpret the peaks. Two peaks usually signals two groups combined into one data set. For heights, a plausible explanation is that the data mixes two subgroups, for example male and female students, each clustering around its own typical height. (A single-peak set is unimodal; no clear peak is sometimes called uniform.)
Core, 3 marks. The number of goals scored by a netball team across games, in order, is . The five-number summary gives and . Use the $1.5 \times \text{IQR}$ rule to test whether is an outlier.
Find the IQR. Subtract the quartiles:
Find the upper fence. A high value is tested against the upper fence, . Here , so
Compare and conclude. The value is greater than the upper fence , so
means is an outlier by the rule. (For completeness the lower fence is , and the smallest value is above , so there is no low outlier. State the fence you cross, then the comparison: that is the line markers reward.)
Core, 3 marks. The waiting times (in minutes) at a clinic for patients, in order, are , with and . Test both ends for outliers using the $1.5 \times \text{IQR}$ rule and list any outliers.
Find the IQR. Subtract the quartiles:
Find both fences. With :
Compare each extreme value. The smallest value is and the largest is :
so both fall outside their fences. The outliers are minutes and minutes. (Check: every other value lies between and , so exactly two points are flagged, one at each end. Always test both fences, not just the obvious big value.)
Exam, 5 marks. The times (in minutes) for commuters to travel to work, in order, are . For this data , the median is and . (a) Find the IQR. (b) Use the $1.5 \times \text{IQR}$ rule to test whether is an outlier. (c) Describe the distribution, commenting on shape, centre, spread and outliers.
Part (a) - the IQR. Subtract the quartiles:
Part (b) - test the value . A high value is tested against the upper fence. With :
Since
the value is an outlier.
Part (c) - describe the distribution. Work through the four features in order:
- Shape: ignoring the outlier the values rise fairly evenly, but the long upper tail (one value far above the rest) makes the data positively skewed (skewed to the right).
- Centre: the median is minutes; the median is the better measure of centre here because the outlier would inflate the mean.
- Spread: the IQR is minutes (the middle half of commuters are within a minute band); the full range is minutes, stretched by the outlier.
- Outliers: there is one outlier at minutes, well above the upper fence of ; this is a genuine value (a very long commute, perhaps a transport delay) rather than an error, so it should be kept but noted.
So the travel times are positively skewed with a median of minutes, an IQR of minutes, and one high outlier at minutes. (Check: the median sits inside the quartiles and as it must, and only the single value crosses a fence.)
Exam, 6 marks. A teacher records two class quiz results out of . Class A, in order, is with , median , . Class B, in order, is with , median , . (a) Test Class A for outliers using the $1.5 \times \text{IQR}$ rule. (b) Describe the shape of each class. (c) Write one or two sentences comparing the two classes' centre and spread.
Part (a) - test Class A for outliers. First the IQR:
With , the fences are
The smallest value is and the largest is , and
so both extremes lie inside the fences: Class A has no outliers.
Part (b) - shape of each class. For Class A the values are spread fairly evenly and the mean and median are close, so the shape is roughly symmetric. For Class B the marks bunch up near the top (the maximum is ) with a tail of lower marks trailing down to , so Class B is negatively skewed (skewed to the left).
Part (c) - compare centre and spread. Class B has the higher centre (median versus ), so Class B performed better overall. Class B is also more consistent: its IQR is , smaller than Class A's IQR of , so Class B's middle marks are more tightly clustered. (Check: comparing like with like, both statements use the median for centre and the IQR for spread, which is the safe pairing when a set may be skewed.)
Related dot points
- Calculate measures of central tendency, including the mean, median and mode, for both raw data and data presented in a frequency table
A focused answer to the HSC Maths Standard 2 dot point on the mean, median and mode. Finding all three from a raw list, the mean and mode from a frequency table, the mean from grouped data using class centres, and choosing the most appropriate measure when the data is skewed or has an outlier, with worked Australian examples.
- Calculate measures of spread, including the range, quartiles and interquartile range, and the population standard deviation using technology
A focused answer to the HSC Maths Standard 2 dot point on measures of spread. The range, the quartiles and interquartile range, the five-number summary, the population standard deviation from a calculator, and how to compare the spread of two data sets, with worked Australian examples.
- Construct and interpret box-and-whisker plots and use them, including parallel (side-by-side) box plots, to compare data sets in terms of centre, spread, skewness and outliers
A focused answer to the HSC Maths Standard 2 dot point on box-and-whisker plots. Building a box plot from the five-number summary, flagging an outlier with the 1.5 times IQR rule, drawing parallel box plots, and comparing two groups by centre, spread, skew and outliers, with worked Australian examples.
- Display and interpret numerical data using dot plots and stem-and-leaf plots, including back-to-back stem-and-leaf plots, and describe the clusters, gaps, outliers and shape of the data
A focused answer to the HSC Maths Standard 2 dot point on dot plots and stem-and-leaf plots. How to construct and read each display, how to build a back-to-back stem-and-leaf plot to compare two groups, and how to describe clusters, gaps, outliers and the shape of a distribution, with worked Australian examples.