How do we measure the strength and direction of a relationship between two variables?

Display bivariate data in a scatterplot and describe the association using form, direction, strength and the correlation coefficient r.

How to read a scatterplot for form, direction and strength, interpret the correlation coefficient r and r squared, and avoid concluding causation from correlation.

Generated by Claude Opus 4.76 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this dot point is asking
  2. Describing a scatterplot
  3. The correlation coefficient r
  4. The coefficient of determination
  5. Correlation is not causation

What this dot point is asking

You must plot and describe bivariate data, interpret the value of r, and explain why correlation does not prove causation.

Describing a scatterplot

Plot the explanatory variable on the horizontal axis and the response variable on the vertical axis. Then describe three features:

  • Form: is the pattern roughly linear, or curved?
  • Direction: positive (both rise together) or negative (one rises as the other falls)?
  • Strength: how closely do the points follow the pattern, from weak to strong?

Also note any outliers that sit well away from the main pattern.

The correlation coefficient r

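For a linear relationship, the Pearson correlation coefficient r measures the strength and direction with a single number between -1 and +1. The sign gives the direction, and the closer r is to -1 or +1, the stronger the linear relationship; values near 0 indicate little or no linear relationship.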
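In an exam r is read off a calculator, but it can help to see where the number comes from. The sketch below computes r from its standard definition, the sum of products of deviations divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the study-hours data are made up for illustration and are not taken from any exam paper.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up illustrative data: hours of study vs test score
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]
r = pearson_r(hours, score)
print(f"r = {r:.3f}")  # strong positive: r = 0.993
```

Reversing the order of one list (so one variable falls as the other rises) flips the sign of r without changing its size.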

The coefficient of determination

Squaring r gives the coefficient of determination r^2, often written as a percentage. It tells you the proportion of the variation in the response variable explained by the linear relationship with the explanatory variable.
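The "proportion of variation explained" reading can be checked numerically. The sketch below fits a least squares line to a small made-up data set and compares r^2 with 1 minus the ratio of leftover (residual) variation to total variation in the response. The function name `explained_proportion` and the data are assumptions for illustration only.

```python
def explained_proportion(xs, ys):
    """Return (r^2, 1 - SS_resid / SS_total) for the least squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)  # total variation in the response
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left over after the line has made its predictions
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = sxy ** 2 / (sxx * syy)
    return r_squared, 1 - ss_resid / syy

# Made-up illustrative data: hours of study vs test score
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]
r_squared, explained = explained_proportion(hours, score)
print(f"r^2 = {r_squared:.3f}, explained = {explained:.3f}")  # both 0.987
```

For these made-up numbers, about 98.7% of the variation in score is explained by the linear relationship with hours.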

Correlation is not causation

A strong r shows the two variables move together, but it does not prove one causes the other. The link might run the other way, or a third confounding variable might drive both. Ice-cream sales and drowning rates correlate strongly, but neither causes the other; warm weather drives both.

Exam-style practice questions

Practice questions written in the style of SACE Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

2019 SACE Stage 2, 2 marks
Data on the life expectancy of Australian males by year of birth has been fitted with a linear model. Describe the strength and nature of the relationship between the variables.

The coefficient of determination for these data is r squared = 0.927, so r is about +0.963. Take the positive square root because life expectancy rises with year of birth.

Describe two things: strength and direction (nature).

Strength: r is very close to +1, so there is a strong (or very strong) linear correlation between year of birth and life expectancy.

Nature/direction: the correlation is positive, meaning that as the year of birth increases, the life expectancy of Australian males also tends to increase.

Award 1 mark for the strength (strong), justified by how close r squared is to 1, and 1 mark for the direction (positive), justified by the upward trend in the data.
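The step from r squared to r can be sketched in a few lines. Squaring loses the sign, so the direction has to come from the trend of the data or the slope of the fitted line. The helper name `r_from_r_squared` and its `slope_sign` parameter are hypothetical; the two r squared values are the ones quoted in the practice questions on this page.

```python
from math import sqrt

def r_from_r_squared(r_squared, slope_sign):
    """Recover r from r^2; slope_sign is +1 for an upward trend, -1 for a downward trend."""
    if not 0 <= r_squared <= 1:
        raise ValueError("r squared must lie between 0 and 1")
    if slope_sign not in (1, -1):
        raise ValueError("slope_sign must be +1 or -1")
    return slope_sign * sqrt(r_squared)

# Life expectancy rises with year of birth, so take the positive root
r_life = r_from_r_squared(0.927, +1)
# Humidity falls as temperature rises (slope -4.68), so take the negative root
r_humidity = r_from_r_squared(0.929, -1)

print(f"{r_life:+.3f}")      # +0.963
print(f"{r_humidity:+.3f}")  # -0.964
```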

2021 SACE Stage 2, 2 marks
Temperature T and relative humidity H over 24 hours give the least squares regression line H = -4.68T + 141.85 with r squared = 0.929. Interpret the slope of this least squares regression line in the context of the problem.

The slope is -4.68, and it is attached to the variable T (temperature). In context, the slope is the predicted change in H for each one-unit increase in T.

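Interpretation: for every 1 degree C increase in temperature, the relative humidity is predicted to decrease by 4.68 percentage points (relative humidity is itself measured in %, so this is a drop of 4.68 in the reading, not 4.68% of its current value).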

Award 1 mark for the direction and size (a decrease of 4.68 per degree) and 1 mark for stating it in context, naming both variables with units (per 1 degree C rise in temperature, humidity falls by 4.68%). The negative sign is essential.
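A quick way to check a slope interpretation is to substitute two temperatures that are 1 degree apart into the regression line and subtract. The sketch below does this with the equation from the question; the temperatures 20 and 21 degrees C are arbitrary choices and are assumed to lie within the range of the recorded data, which the question does not state.

```python
def predicted_humidity(temp_c):
    """Predicted relative humidity (%) from the line H = -4.68T + 141.85."""
    return -4.68 * temp_c + 141.85

# Assumed temperatures, chosen only to illustrate a 1 degree C rise
h_at_20 = predicted_humidity(20)
h_at_21 = predicted_humidity(21)
change = h_at_21 - h_at_20

print(f"H at 20 C: {h_at_20:.2f}")  # 48.25
print(f"H at 21 C: {h_at_21:.2f}")  # 43.57
print(f"change:    {change:.2f}")   # -4.68
```

Any pair of temperatures 1 degree apart gives the same change, which is what makes the slope a per-degree rate.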

2023 SACE Stage 2, 1 mark
Table 2 shows the number of long-footed potoroos against time in months during a breeding program. Using a linear model, calculate Pearson's correlation coefficient (r) for the relationship between time in months and the number of potoroos.

Enter the paired data into a calculator's linear regression and read off r directly.

For these data the linear regression gives r approximately equal to 0.99.

So Pearson's correlation coefficient is about 0.99. The single mark is for the correct value, which should be a positive number very close to 1, reflecting the strong positive linear trend (the potoroo population rises steadily over time).