How do we measure the strength and direction of a relationship between two variables?
Display bivariate data in a scatterplot and describe the association using form, direction, strength and the correlation coefficient r.
How to read a scatterplot for form, direction and strength, interpret the correlation coefficient r and r squared, and avoid concluding causation from correlation.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
You must plot and describe bivariate data, interpret the value of r, and explain why correlation does not prove causation.
Describing a scatterplot
Plot the explanatory variable on the horizontal axis and the response variable on the vertical axis. Then describe three features:
- Form: is the pattern roughly linear, or curved?
- Direction: positive (both rise together) or negative (one rises as the other falls)?
- Strength: how closely do the points follow the pattern, from weak to strong?
Also note any outliers that sit well away from the main pattern.
The correlation coefficient r
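For a linear relationship, the Pearson correlation coefficient r measures the strength and direction with a single number between -1 and +1. The sign gives the direction; values close to -1 or +1 indicate a strong linear relationship, and values close to 0 indicate a weak one.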
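To see what the calculator is doing when it reports r, here is a minimal Python sketch of the standard Pearson formula, r = Sxy / sqrt(Sxx × Syy). The function name pearson_r and the sample data are assumptions made up for illustration; they are not part of the SACE materials, and in an exam you would read r from your calculator.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data (illustrative helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-deviations from the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Undefined (division by zero) if either variable is constant
    return sxy / math.sqrt(sxx * syy)

# Made-up data for illustration only
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]
r = pearson_r(hours, score)
print(round(r, 3))
```

Because the made-up scores rise almost in a straight line with hours, the result is positive and close to +1.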
The coefficient of determination
Squaring r gives the coefficient of determination r squared, often written as a percentage. It tells you the proportion of the variation in the response variable explained by the linear relationship with the explanatory variable.
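As a quick arithmetic check, the sketch below squares a few values of r and reports each as a percentage. The values of r and the helper name percent_explained are chosen for illustration only. Note that the sign of r is lost on squaring, and that a moderate r explains less of the variation than it might first appear.

```python
def percent_explained(r):
    """Return r squared as a percentage, rounded to 1 decimal place."""
    return round(r ** 2 * 100, 1)

# Illustrative values of r, not taken from any data set
for r in [0.9, -0.9, 0.5]:
    print(f"r = {r:+.1f} -> about {percent_explained(r)}% of the variation explained")
```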
Correlation is not causation
A strong r value shows the two variables move together, but it does not prove one causes the other. The link might run the other way, or a third confounding variable might drive both. Ice-cream sales and drowning rates correlate strongly, but neither causes the other; warm weather drives both.
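The ice-cream example can be sketched numerically. Every figure below is invented purely to illustrate confounding (it is not real data), and pearson_r is a hypothetical helper. Sales and drownings are each listed against temperature, with no direct link between them, yet they still show a strong correlation with each other.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation for paired lists (illustrative helper)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical weekly figures, invented for this illustration
temperature = [18, 21, 24, 27, 30, 33, 36]            # degrees C (the confounder)
ice_cream_sales = [110, 135, 150, 180, 205, 215, 250]
drownings = [1, 2, 2, 3, 4, 4, 5]

r_sales_drownings = pearson_r(ice_cream_sales, drownings)
r_temp_sales = pearson_r(temperature, ice_cream_sales)
r_temp_drownings = pearson_r(temperature, drownings)

print(f"sales vs drownings:       r = {r_sales_drownings:.2f}")
print(f"temperature vs sales:     r = {r_temp_sales:.2f}")
print(f"temperature vs drownings: r = {r_temp_drownings:.2f}")
```

All three correlations come out strong and positive, but only the two involving temperature reflect a plausible causal story.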
Exam-style practice questions
Practice questions written in the style of SACE Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
2019 SACE Stage 2, 2 marks
Data on the life expectancy of Australian males by year of birth has been fitted with a linear model. Describe the strength and nature of the relationship between the variables.
Worked answer
The coefficient of determination for these data is r squared = 0.927, so r is about +0.963.
Describe two things: strength and direction (nature).
Strength: r is very close to +1, so there is a strong (or very strong) linear correlation between year of birth and life expectancy.
Nature/direction: the correlation is positive, meaning that as the year of birth increases, the life expectancy of Australian males also tends to increase.
Award 1 mark for the strength (strong), justified by reference to the closeness of r squared to 1, and 1 mark for the direction (positive), justified by the increasing trend (positive r).
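The step from r squared to r in this answer can be checked numerically. The sketch below takes the square root of 0.927 and attaches the positive sign, on the assumption (from the context of the question) that life expectancy rises with year of birth, so the fitted line has a positive slope.

```python
import math

r_squared = 0.927          # value quoted in the worked answer
slope_is_positive = True   # assumption: life expectancy rises with year of birth

# r always has the same sign as the slope of the least squares line
r = math.sqrt(r_squared) if slope_is_positive else -math.sqrt(r_squared)
print(f"r is about {r:+.3f}")  # prints: r is about +0.963
```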
2021 SACE Stage 2, 2 marks
Temperature T and relative humidity H over 24 hours give the least squares regression line H = -4.68T + 141.85 with r squared = 0.929. Interpret the slope of this least squares regression line in the context of the problem.
Worked answer
The slope is -4.68, and it is attached to the variable T (temperature). In context, the slope is the predicted change in H for each one-unit increase in T.
Interpretation: for every 1 degree C increase in temperature, the relative humidity is predicted to decrease by 4.68 percentage points (relative humidity is itself measured in %).
Award 1 mark for the direction and size (a decrease of 4.68 per degree) and 1 mark for stating it in context, naming both variables with units (per 1 degree C rise in temperature, humidity falls by 4.68%). The negative sign is essential.
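To make the slope interpretation concrete, this short sketch evaluates the regression line from the question at two temperatures one degree apart. The temperatures 20 and 21 degrees C are chosen for illustration only, and predicted_humidity is a hypothetical helper name.

```python
def predicted_humidity(temp_c):
    """Regression line from the question: H = -4.68T + 141.85."""
    return -4.68 * temp_c + 141.85

# Any two temperatures 1 degree apart give the same change in predicted H
h_at_20 = predicted_humidity(20)
h_at_21 = predicted_humidity(21)
change = h_at_21 - h_at_20
print(round(h_at_20, 2), round(h_at_21, 2), round(change, 2))
```

The change in predicted humidity equals the slope, -4.68, whichever starting temperature is used.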
2023 SACE Stage 2, 1 mark
Table 2 shows the number of long-footed potoroos against time in months during a breeding program. Using a linear model, calculate Pearson's correlation coefficient (r) for the relationship between time in months and the number of potoroos.
Worked answer
Enter the paired data into a calculator's linear regression and read off r directly.
For these data the linear regression gives r of approximately +0.99.
So Pearson's correlation coefficient is about 0.99. The single mark is for the correct value, which should be a positive number very close to 1, reflecting the strong positive linear trend (the potoroo population rises steadily over time).