
NSW Maths Standard 2 syllabus dot point

What does Pearson's correlation coefficient measure, and how is it interpreted?

Calculate and interpret Pearson's correlation coefficient using statistical technology, including the sign and magnitude

A focused answer to the HSC Maths Standard 2 dot point on Pearson's correlation coefficient. It covers what $r$ measures, how to interpret its sign and magnitude, the limitations of $r$ for non-linear relationships, and how to compute it using calculator statistics functions.

Generated by Claude Opus. Reviewed by Better Tuition Academy. 7 min answer.


What this dot point is asking

NESA wants you to interpret Pearson's correlation coefficient $r$ for a bivariate dataset, distinguish between the sign and magnitude, recognise the limitation of $r$ for non-linear data, and compute it from a dataset using the calculator's statistics functions.

The answer

[Figure: Three scatterplots side by side showing different correlation coefficients. Left: tight upward-sloping points, $r \approx 0.95$ (strong positive). Middle: looser upward trend, $r \approx 0.5$ (moderate positive). Right: random cloud of points, $r \approx 0$ (no linear pattern).]

What $r$ measures

Pearson's correlation coefficient $r$ measures the strength and direction of the linear relationship between two variables. It is bounded:

$$-1 \le r \le 1.$$

  • $r = 1$ : perfect positive linear relationship.
  • $r = -1$ : perfect negative linear relationship.
  • $r = 0$ : no linear relationship.
  • Sign indicates direction; magnitude indicates strength.
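The bounds can be checked numerically. The following is a minimal sketch, assuming the usual textbook definition of $r$ (sum of products of deviations divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and the two small datasets are made up for illustration; the syllabus itself only expects you to read $r$ from a calculator.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from the textbook definition (deviations from the means)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from each mean.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Illustrative data only: points lying exactly on a straight line.
xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2 * x + 1 for x in xs])     # rising line
r_down = pearson_r(xs, [10 - 3 * x for x in xs])  # falling line
print(r_up, r_down)
```

Points exactly on a rising line give $r = 1$, and points exactly on a falling line give $r = -1$, regardless of how steep the line is.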

Strength descriptors

Standard 2 uses approximate verbal labels:

IMATH_11 range Strength
IMATH_12 - IMATH_13 Very weak
IMATH_14 - IMATH_15 Weak
IMATH_16 - IMATH_17 Moderate
IMATH_18 - IMATH_19 Strong
IMATH_20 - IMATH_21 Very strong

These are rough; markers accept reasonable adjacent labels.

Sign interpretation

  • $r > 0$ : positive linear. As $x$ increases, $y$ tends to increase.
  • $r < 0$ : negative linear. As $x$ increases, $y$ tends to decrease.
  • $r \approx 0$ : no linear pattern. May still have a strong non-linear pattern.

Important caveat: linear only

Pearson's $r$ only detects linear association. A dataset that follows a parabolic curve perfectly can give $r$ close to zero, even though the relationship is deterministic. Always look at the scatterplot first.
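A quick way to see this caveat is to feed a perfect parabola into the formula. This is a sketch under stated assumptions: `pearson_r` is a hypothetical helper implementing the standard definition, and the data ($y = x^2$ on $x$-values symmetric about zero) is chosen purely for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from the standard deviations-from-the-mean definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Illustrative data: every point lies exactly on the parabola y = x^2.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)  # the falling left half and rising right half cancel out
```

Here $y$ is completely determined by $x$, yet $r$ comes out as zero because the downward trend on the left cancels the upward trend on the right.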

Computing $r$ on a calculator

NESA-approved scientific calculators include a statistics (STAT) mode. The procedure is typically:

  1. Switch to statistics mode (e.g. MODE 2 STAT 2-VAR).
  2. Enter the $(x, y)$ pairs.
  3. Calculate $r$ from the statistics-result menu.

You will not be asked to compute $r$ by hand. Read it off the calculator after entering the data.
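To check a calculator result, the same three steps can be mirrored in a few lines of Python. This is only a sketch: the data pairs are invented for illustration, and the sums-based formula used here is an algebraically equivalent form of the definition of $r$, not a claim about how any particular calculator works internally.

```python
from math import sqrt

# Step 2 equivalent: enter the (x, y) pairs (illustrative values only).
pairs = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]

# Running totals of the kind a two-variable statistics mode reports.
n = len(pairs)
sum_x = sum(x for x, _ in pairs)
sum_y = sum(y for _, y in pairs)
sum_xy = sum(x * y for x, y in pairs)
sum_x2 = sum(x * x for x, _ in pairs)
sum_y2 = sum(y * y for _, y in pairs)

# Step 3 equivalent: compute r from the totals.
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 4))  # 0.7746
```

If the calculator and the script disagree, the most likely cause is a data-entry slip, so re-check the pairs before anything else.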

Correlation versus causation

A strong correlation does not prove causation. Three possibilities for a strong $r$:

  • $x$ causes $y$.
  • $y$ causes $x$ (reverse causation).
  • A third variable causes both ($x$ and $y$ are both effects of a common cause).

The classic example: ice cream sales and drownings are positively correlated. Hot weather causes both, but neither causes the other.
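The common-cause case can be imitated with synthetic numbers. The sketch below is purely illustrative: the temperatures, the coefficients and the small wobble patterns are all invented, and `pearson_r` is a hypothetical helper implementing the standard definition. Both series are built from temperature alone, with no link between them.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from the standard deviations-from-the-mean definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Synthetic daily temperatures (degrees Celsius), invented for this sketch.
temps = list(range(15, 35))

# Each series depends only on temperature plus a small fixed wobble.
ice_cream_sales = [10 * t + (3 if i % 2 == 0 else -3) for i, t in enumerate(temps)]
drownings = [0.3 * t + ((i % 3) - 1) for i, t in enumerate(temps)]

r_sales_drownings = pearson_r(ice_cream_sales, drownings)
print(round(r_sales_drownings, 2))
```

The two series come out strongly positively correlated even though neither appears in the formula for the other; temperature is the only thing they share.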

Past exam questions, worked

Real questions from past NESA papers on this dot point, with our answer explainer.

2022 HSC Q18, 3 marks. A dataset of $20$ pairs gives Pearson's correlation coefficient $r = -0.86$. Interpret this value.

The negative sign means the relationship is inverse: as $x$ increases, $y$ tends to decrease.

The magnitude $|r| = 0.86$ is close to $1$, indicating a strong linear relationship.

Overall, $r = -0.86$ indicates a strong, negative, linear association between the two variables.

Markers reward identification of sign (direction), magnitude (strength) and the linear qualifier.

2023 HSC Q18, 3 marks. Two datasets are presented. Dataset A has $r = 0.95$. Dataset B has $r = 0.05$. Describe the relationship in each, and explain why a low $r$ does not necessarily mean no relationship.

Dataset A: strong positive linear relationship. As $x$ increases, $y$ tends to increase, with points closely clustered around a straight line.

Dataset B: very weak or no linear relationship. The points show essentially no straight-line pattern.

A low value of $r$ means only that the linear association is weak. A scatterplot may still show a strong non-linear pattern (for example, a parabolic or other U-shaped curve), in which case Pearson's $r$ can be near zero despite a clear relationship. Always look at the scatterplot before relying on $r$.

Markers reward describing both datasets correctly with sign, strength and linear qualifier, and the caveat that a low $r$ does not preclude non-linear patterns.

Related dot points