What does Pearson's correlation coefficient measure, and how is it interpreted?
Calculate and interpret Pearson's correlation coefficient using statistical technology, including the sign and magnitude
A focused answer to the HSC Maths Standard 2 dot point on Pearson's correlation coefficient: what $r$ measures, how to interpret its sign and magnitude, the limitations of $r$ for non-linear relationships, and how to compute it using calculator statistics functions.
What this dot point is asking
NESA wants you to interpret Pearson's correlation coefficient $r$ for a bivariate dataset, distinguish between its sign and magnitude, recognise the limitation of $r$ for non-linear data, and compute it from a dataset using the calculator's statistics functions.
The answer
What $r$ measures
Pearson's correlation coefficient $r$ measures the strength and direction of the linear relationship between two variables. It is bounded: $-1 \le r \le 1$.
- $r = 1$: perfect positive linear relationship.
- $r = -1$: perfect negative linear relationship.
- $r = 0$: no linear relationship.
- Sign indicates direction; magnitude indicates strength.
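To make the boundary cases concrete, here is a minimal Python sketch of $r$ computed from deviations about the means. The function name `pearson_r` and the two straight-line datasets are illustrative assumptions, not part of the syllabus; in the exam you read $r$ off the calculator.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2 * x + 1 for x in xs])     # points on a rising line
r_down = pearson_r(xs, [10 - 3 * x for x in xs])  # points on a falling line
print(round(r_up, 6), round(r_down, 6))
```

Points lying exactly on a rising line give $r = 1$, and points lying exactly on a falling line give $r = -1$, whatever the steepness of the line.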
Strength descriptors
Standard 2 uses approximate verbal labels:
| $\lvert r \rvert$ range | Strength |
|---|---|
| IMATH_12 - IMATH_13 | Very weak |
| IMATH_14 - IMATH_15 | Weak |
| IMATH_16 - IMATH_17 | Moderate |
| IMATH_18 - IMATH_19 | Strong |
| IMATH_20 - IMATH_21 | Very strong |
These are rough; markers accept reasonable adjacent labels.
Sign interpretation
- $r > 0$: positive linear. As $x$ increases, $y$ tends to increase.
- $r < 0$: negative linear. As $x$ increases, $y$ tends to decrease.
- $r \approx 0$: no linear pattern. The data may still have a strong non-linear pattern.
Important caveat: linear only
Pearson's $r$ only detects linear association. A dataset that follows a parabolic curve perfectly can give $r$ close to zero, even though the relationship is deterministic. Always look at the scatterplot first.
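A short Python sketch of this caveat, assuming points placed exactly on the parabola $y = x^2$ with $x$ values symmetric about zero. The helper `pearson_r` and the chosen $x$ values are illustrative assumptions.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Every point lies exactly on y = x^2, so y is fully determined by x.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_parabola = pearson_r(xs, ys)
print(f"r = {r_parabola:.4f}")
```

The falling left half of the curve cancels the rising right half, so $r$ comes out as zero here. If the $x$ values covered only one side of the turning point, $r$ would not be near zero, which is another reason to check the scatterplot.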
Computing on a calculator
NESA-approved scientific calculators include statistics-mode (STAT) functions. The procedure is typically:
- Switch to statistics mode (e.g. MODE 2 STAT 2-VAR).
- Enter the $(x, y)$ pairs.
- Calculate $r$ from the statistics-result menu.
You will not be asked to compute $r$ by hand. Read it off the calculator after entering the data.
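If you want to sanity-check a calculator result, the same three steps can be mirrored in Python. This is a sketch only: the five data pairs are made up for illustration, and the assumption that a calculator works from running totals such as $\sum x$, $\sum x^2$ and $\sum xy$ may not match every model.

```python
from math import sqrt

# Step 2 equivalent: enter the (x, y) pairs.
# Made-up illustrative data, not taken from any exam paper.
pairs = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]

# Running totals for two-variable statistics.
n = len(pairs)
sum_x = sum(x for x, _ in pairs)
sum_y = sum(y for _, y in pairs)
sum_x2 = sum(x * x for x, _ in pairs)
sum_y2 = sum(y * y for _, y in pairs)
sum_xy = sum(x * y for x, y in pairs)

# Step 3 equivalent: compute r from the totals.
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator
print(f"n = {n}, r = {r:.4f}")
```

For these pairs $r \approx 0.77$, a positive linear association. A calculator fed the same five pairs should agree to the displayed precision.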
Correlation versus causation
A strong correlation does not prove causation. There are three possible explanations for a strong $r$ between $x$ and $y$:
- $x$ causes $y$.
- $y$ causes $x$ (reverse causation).
- A third variable causes both ($x$ and $y$ are both effects of a common cause).
The classic example: ice cream sales and drownings are positively correlated. Hot weather causes both, but neither causes the other.
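The pattern can be sketched with invented numbers. Everything in the block below is hypothetical: the temperatures, sales and drowning counts are made up so that both quantities rise with temperature, and `pearson_r` is an illustrative helper.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented data for six days; both series were chosen to rise with temperature.
temperature = [18, 22, 25, 28, 31, 35]        # degrees Celsius
ice_cream = [120, 150, 190, 210, 260, 300]    # sales
drownings = [1, 2, 2, 3, 4, 5]                # incidents

r_ice_drown = pearson_r(ice_cream, drownings)
r_temp_ice = pearson_r(temperature, ice_cream)
r_temp_drown = pearson_r(temperature, drownings)
print(f"ice cream vs drownings:   r = {r_ice_drown:.2f}")
print(f"temperature vs ice cream: r = {r_temp_ice:.2f}")
print(f"temperature vs drownings: r = {r_temp_drown:.2f}")
```

All three coefficients are strongly positive. The value of $r$ cannot say which explanation applies; that judgement comes from the context, not from the number.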
Past exam questions, worked
Real questions from past NESA papers on this dot point, with our answer explainer.
2022 HSC Q18 (3 marks)
A dataset of pairs gives Pearson's correlation coefficient $r$. Interpret this value.
The negative sign means the relationship is inverse: as $x$ increases, $y$ tends to decrease.
The magnitude is close to $1$, indicating a strong linear relationship.
Overall, this value of $r$ indicates a strong, negative, linear association between the two variables.
Markers reward identification of sign (direction), magnitude (strength) and the linear qualifier.
2023 HSC Q18 (3 marks)
Two datasets, A and B, are presented, each with its own value of $r$. Describe the relationship in each, and explain why a low $r$ does not necessarily mean no relationship.
Dataset A: strong positive linear relationship. As $x$ increases, $y$ increases, with points closely clustered around the line.
Dataset B: very weak or no linear relationship. The points show essentially no straight-line pattern.
Pearson's $r$ measures only linear association, so a low value of $r$ rules out only a straight-line pattern. A scatterplot may show a strong non-linear pattern (for example, a parabolic or other U-shaped curve), in which case $r$ can be near zero despite a clear relationship. Always look at the scatterplot before relying on $r$.
Markers reward describing both datasets correctly with sign, strength and linear qualifier, and the caveat that a low $r$ does not preclude non-linear patterns.
Related dot points
- Construct and interpret scatterplots to describe the relationship between two variables in bivariate data
A focused answer to the HSC Maths Standard 2 dot point on scatterplots. Reading form, direction and strength of association, identifying outliers, and worked Australian examples using ABS-style economic and demographic data.
- Find and use the equation of the least-squares regression line to model a linear relationship between two variables
A focused answer to the HSC Maths Standard 2 dot point on the least-squares regression line. The equation of the line, finding the gradient and intercept using calculator statistics functions, interpreting the gradient in context, and worked Australian examples.
- Distinguish between interpolation and extrapolation when using a regression line, and assess the reliability of predictions
A focused answer to the HSC Maths Standard 2 dot point on interpolation vs extrapolation. The reliability of predictions inside and outside the data range, examples of when extrapolation breaks down, and Australian-context worked examples.