How do scatterplots reveal the form, direction and strength of the relationship between two variables?
Construct and interpret scatterplots to describe the relationship between two variables in bivariate data
A focused answer to the HSC Maths Standard 2 dot point on scatterplots. Reading form, direction and strength of association, identifying outliers, and worked Australian examples using ABS-style economic and demographic data.
What this dot point is asking
NESA wants you to construct a scatterplot from a table of bivariate data, describe the form, direction and strength of the relationship between the two variables, and identify outliers and possible causes for them.
The answer
What a scatterplot is
A scatterplot is a graph of one variable against another with each data point plotted as a single dot. By convention:
- $x$-axis: the independent (or explanatory) variable.
- $y$-axis: the dependent (or response) variable.
For each $(x, y)$ pair in the data, plot one dot at that location. Do not connect the dots.
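As a rough illustration of the plotting rule above, the sketch below places one mark per $(x, y)$ pair on a character grid and joins nothing. The function name `text_scatter` and the data are hypothetical, invented for this example. It assumes the $x$ values are not all equal and the $y$ values are not all equal.

```python
def text_scatter(pairs, width=40, height=12):
    """Return a rough character-grid scatterplot of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in pairs:
        # Scale each value to a column (x) and a row (y) of the grid
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width

# Hypothetical data: (hours studied, test mark)
data = [(1, 52), (2, 55), (3, 61), (4, 60), (5, 68), (6, 74), (7, 79)]
plot = text_scatter(data)
print(plot)
```

In an exam you would draw this by hand on grid paper; the sketch only mirrors the same rule of one dot per pair.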
Describing the relationship
Three descriptors:
- Form. Linear (points cluster around a straight line), non-linear (curve), or no clear pattern.
- Direction. Positive (as $x$ increases, $y$ tends to increase), negative (as $x$ increases, $y$ tends to decrease), or no association.
- Strength. Strong (points cluster tightly around a line or curve), moderate, weak, or no association.
State all three when describing a scatterplot. Markers expect explicit use of these words.
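Direction and strength of a linear relationship can also be checked numerically with Pearson's correlation coefficient $r$. The sketch below is one possible way to do that, not a prescribed method. The cut-off values (0.75, 0.5, 0.25) are assumptions chosen for illustration, and the data are hypothetical. Form still has to be judged from the plot itself, because $r$ only measures linear association.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe(xs, ys):
    """Return r and a short direction/strength label (assumed cut-offs)."""
    r = pearson_r(xs, ys)
    size = abs(r)
    if size >= 0.75:      # assumed threshold, for illustration only
        strength = "strong"
    elif size >= 0.5:     # assumed threshold
        strength = "moderate"
    elif size >= 0.25:    # assumed threshold
        strength = "weak"
    else:
        return r, "no clear linear association"
    direction = "positive" if r > 0 else "negative"
    return r, f"{strength} {direction}"

# Hypothetical data: hours studied and test mark
hours = [1, 2, 3, 4, 5, 6, 7]
marks = [52, 55, 61, 60, 68, 74, 79]
r, label = describe(hours, marks)
print(round(r, 2), label)
```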
Outliers
An outlier is a point that lies far from the bulk of the data. Outliers can dramatically affect the calculated correlation coefficient and the regression line.
Causes:
- Data error. Wrong digit entered, wrong unit.
- Genuine atypical case. A real but rare observation (e.g. a household with unusual circumstances).
- Subgroup effect. Two distinct populations on the same plot.
Decide whether to remove an outlier based on cause. Errors should be corrected or removed; genuine atypical cases should usually be reported and kept in.
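To see how much a single outlier can shift the results, the sketch below fits a least-squares line and computes $r$ for a small hypothetical data set, first without and then with one suspect point. The helper name `fit` and all values are invented for illustration.

```python
from math import sqrt

def fit(points):
    """Return (gradient, intercept, r) for the least-squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept, sxy / sqrt(sxx * syy)

# Hypothetical data: (hours studied, test mark)
clean = [(1, 52), (2, 55), (3, 61), (4, 60), (5, 68), (6, 74), (7, 79)]
with_outlier = clean + [(8, 20)]  # hypothetical data-entry error

m_clean, c_clean, r_clean = fit(clean)
m_out, c_out, r_out = fit(with_outlier)
print("without outlier: gradient", round(m_clean, 2), "r", round(r_clean, 2))
print("with outlier:    gradient", round(m_out, 2), "r", round(r_out, 2))
```

With these made-up numbers, one extra point is enough to turn a strong positive relationship into almost no linear relationship, which is why the cause of an outlier should be investigated before deciding what to do with it.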
Correlation does not imply causation
A strong positive correlation between $x$ and $y$ does not prove that $x$ causes $y$. A third variable may explain both. Standard 2 expects you to mention this whenever a worded question invites a causal claim.
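The third-variable idea can be sketched with made-up numbers. In the example below, both ice cream sales and pool visits are built from daily temperature plus some scatter, and neither is calculated from the other. All names and values are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical third variable: daily maximum temperature (degrees C)
temps = [18, 21, 24, 27, 30, 33, 36]

# Both variables depend on temperature, not on each other (invented scatter)
ice_cream_sales = [12 * t + d for t, d in zip(temps, [5, -8, 3, -4, 9, -6, 2])]
pool_visits = [3 * t + d for t, d in zip(temps, [-4, 6, -2, 5, -7, 3, 1])]

r = pearson_r(ice_cream_sales, pool_visits)
print(round(r, 2))
```

The correlation between sales and visits comes out strong and positive, yet by construction neither causes the other. Temperature explains both.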
Reading scatterplots quickly
Pattern checklist:
- Straight line going up-right: positive linear.
- Straight line going down-right: negative linear.
- Curve: non-linear.
- Cloud with no pattern: no association.
- Two distinct clouds: subgroup effect, often best modelled as two separate relationships.
Past exam questions, worked
Real questions from past NESA papers on this dot point, with our answer explainer.
2022 HSC Q14 (3 marks). A scatterplot of $x$ (years of schooling) and $y$ (weekly income, \$) is shown. Describe the relationship between $x$ and $y$.
Form: linear (points cluster around a straight line).
Direction: positive (as years of schooling increase, weekly income tends to increase).
Strength: strong (points cluster closely; little scatter around the line).
Markers reward all three descriptors and the use of the words "linear", "positive", "strong" with brief justification from the plot.
2023 HSC Q16 (4 marks). A scatterplot of household income ($x$, in \$000 per year) and television viewing ($y$, hours per week) includes an outlier. Describe the relationship and discuss the outlier.
Form: roughly linear (the trend is steady, not curved).
Direction: negative (higher income associated with less TV).
Strength: moderate (not all points sit tightly on a line, but a clear trend).
Outlier: the point at (\$30 000, 5) shows only 5 hours of TV per week despite low income, well below the trend line. Possible explanations include working multiple jobs, having young children, or another lifestyle factor. Correct or remove the outlier only if it turns out to be a data error; if it is a genuine atypical case, keep it in the analysis and report it.
Markers reward all three descriptors, identification of the outlier, and a brief sensible interpretation.
Related dot points
- Calculate and interpret Pearson's correlation coefficient using statistical technology, including the sign and magnitude
A focused answer to the HSC Maths Standard 2 dot point on Pearson's correlation coefficient. What $r$ measures, how to interpret its sign and magnitude, the limitations of $r$ in non-linear relationships, and how to compute it using calculator statistics functions.
- Find and use the equation of the least-squares regression line to model a linear relationship between two variables
A focused answer to the HSC Maths Standard 2 dot point on the least-squares regression line. The equation of the regression line, finding the gradient and intercept using calculator statistics functions, interpreting the gradient in context, and worked Australian examples.
- Distinguish between interpolation and extrapolation when using a regression line, and assess the reliability of predictions
A focused answer to the HSC Maths Standard 2 dot point on interpolation vs extrapolation. The reliability of predictions inside and outside the data range, examples of when extrapolation breaks down, and Australian-context worked examples.