ExamExplained
NSW · Maths Standard 2

What does Pearson's correlation coefficient measure, and how is it interpreted?

Calculate and interpret Pearson's correlation coefficient using statistical technology, including the sign and magnitude

A focused answer to the HSC Maths Standard 2 dot point on Pearson's correlation coefficient. What $r$ measures, how to read its sign and magnitude, the strength scale, the non-linear limitation, computing it on a calculator, and worked Australian examples.

Reviewed by: AI editorial process; not yet individually human-reviewed


What this dot point is asking

NESA wants you to do four things with Pearson's correlation coefficient $r$ for a bivariate dataset (a set of paired values, like height and weight for each person). Interpret what $r$ tells you. Tell the sign apart from the magnitude (the size). Know that $r$ does not work well for non-linear data. And compute $r$ from a dataset using the calculator's statistics functions. The number on its own earns no marks: the marks are for what it tells you about the relationship, said in the right words.

The answer

[Figure: Four scatterplots with different correlation coefficients. Four mini scatterplots side by side: a tight rising cloud with $r \approx 0.95$ (very strong positive), a looser rising cloud with $r \approx 0.5$ (moderate positive), a tight falling cloud with $r \approx -0.85$ (strong negative), and a random cloud with $r \approx 0$ (no linear pattern).]

What $r$ measures

Pearson's correlation coefficient $r$ measures the strength and direction of the linear relationship between two variables. It is a single number that summarises a whole scatterplot, and it is bounded:

$$-1 \le r \le 1.$$

  • $r = 1$: perfect positive linear relationship (every point exactly on a rising line).
  • $r = -1$: perfect negative linear relationship (every point exactly on a falling line).
  • $r = 0$: no linear relationship.
  • The sign ($+$ or $-$) gives the direction; the magnitude $|r|$ (how close to $1$) gives the strength.

The gallery above shows the same idea four times. The closer the points crowd around a single straight line, the closer $|r|$ is to $1$. The looser the cloud, the closer $r$ is to $0$. The direction of the slope sets the sign.
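The page does not state a formula for $r$, and Standard 2 does not require one. As an illustration only, the sketch below uses the standard textbook definition (sum of products of deviations, divided by the square root of the product of the two sums of squared deviations). The function name `pearson_r` and the two small datasets are made up for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists (illustrative helper, not from the syllabus)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Made-up data: every point lies exactly on a straight line.
rising = pearson_r([1, 2, 3, 4], [3, 5, 7, 9])    # points on y = 2x + 1
falling = pearson_r([1, 2, 3, 4], [9, 7, 5, 3])   # points on y = 11 - 2x
print(rising, falling)
```

Points exactly on a rising line give $r = 1$ and points exactly on a falling line give $r = -1$, matching the two "perfect" cases in the list.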

Strength descriptors

Standard 2 uses approximate verbal labels for $|r|$:

| $|r|$ range | Strength |
| $0.0$ to $0.2$ | Very weak |
| $0.2$ to $0.4$ | Weak |
| $0.4$ to $0.6$ | Moderate |
| $0.6$ to $0.8$ | Strong |
| $0.8$ to $1.0$ | Very strong |

These bands are rough, and markers accept a reasonable adjacent label near a boundary (for example calling $r = 0.61$ "moderate to strong"). Use the magnitude only, so $r = -0.9$ is very strong even though it is negative.
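The table can be written as a small lookup. This is a minimal sketch, not an official marking rule: the function names `strength_label` and `interpret` are hypothetical, and placing a boundary value such as $0.4$ in the upper band is an assumption, since the table's bands share their endpoints.

```python
def strength_label(r):
    """Approximate verbal label for |r|, following the bands in the table."""
    size = abs(r)  # strength depends on the magnitude only, never the sign
    # Assumption: a value exactly on a boundary goes to the upper band.
    if size < 0.2:
        return "very weak"
    if size < 0.4:
        return "weak"
    if size < 0.6:
        return "moderate"
    if size < 0.8:
        return "strong"
    return "very strong"

def interpret(r):
    """One-sentence interpretation naming strength, direction and 'linear'."""
    if r == 0:
        return "no linear relationship"
    direction = "positive" if r > 0 else "negative"
    return f"a {strength_label(r)}, {direction}, linear relationship"

print(interpret(-0.9))
print(interpret(0.61))
```

Near a boundary the printed label is only a starting point; a hedged phrase such as "moderate to strong" is still acceptable.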

Sign and magnitude are separate questions

This is the single most examined idea on this dot point, so it is worth seeing it isolated. The two plots below have exactly the same closeness of fit, so they are equally strong; only the sign differs, which flips the direction.

[Figure: Same magnitude, opposite sign. Two scatterplots with the same closeness of fit but opposite slopes: one rising with $r = +0.8$ (positive: $y$ rises), one falling with $r = -0.8$ (negative: $y$ falls). Same magnitude ($0.8$): equally strong. The sign only flips the direction.]

So when a question asks you to compare two coefficients, compare their magnitudes for strength and their signs for direction separately. $r = -0.9$ is a stronger relationship than $r = 0.3$, even though one is negative; the $-0.9$ cloud hugs its line much more tightly.

The linear-only limitation

Pearson's $r$ only detects linear association, meaning a straight-line trend. A dataset that follows a curve perfectly can still give $r$ close to zero. This happens because the rises and falls of the curve cancel out when you measure the straight-line trend. The plot below is a perfect U-shape: a textbook strong relationship, yet $r \approx 0$.

[Figure: A strong non-linear pattern with $r$ near zero. A scatterplot whose points form a clear U-shaped curve, yet $r \approx 0$ because the relationship is not linear: the rise on the right cancels the fall on the left. $r$ measures linear association only.]

This is why the scatterplot comes first. A small $r$ rules out a straight-line trend, but it does not rule out a curved one. Always look at the plot before trusting the number.
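A quick way to see the cancellation is to compute $r$ for points that lie exactly on a curve. The sketch below uses made-up values on $y = x^2$ (an assumption chosen for illustration, not data from the page) and a hypothetical helper `pearson_r` based on the standard definition.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

xs = [-3, -2, -1, 0, 1, 2, 3]
u_shape = [x ** 2 for x in xs]          # perfect curve: y is fully determined by x
straight = [2 * x + 1 for x in xs]      # perfect line, for contrast

r_curve = pearson_r(xs, u_shape)
r_line = pearson_r(xs, straight)
print(r_curve, r_line)
```

The curve gives $r = 0$ while the line gives $r = 1$, even though in both cases $y$ is completely predictable from $x$.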

Computing $r$ on a calculator

NESA-approved scientific calculators include statistics-mode (STAT) functions. The procedure is typically:

  1. Switch to statistics mode (e.g. MODE 2 STAT, then a 2-variable option).
  2. Enter the $(x, y)$ pairs into the two lists.
  3. Read $r$ from the regression-results menu.

You will not be asked to compute $r$ by hand. The marks come from entering the data correctly and interpreting the value, not from the arithmetic. Different calculator models label and reach the result in different ways. So practise on the exact calculator you will take into the exam, and clear old data before entering a new dataset.
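The three steps can be mirrored in a few lines of code. This is a sketch under assumptions: it uses the running-totals form of the standard formula, which is one common way to compute $r$, and it is not a description of how any particular calculator model works internally. The data are the seedling pairs from the practice questions further down the page.

```python
from math import sqrt

# Seedling data from the practice questions: water (litres), height (cm).
pairs = [(1, 12), (2, 19), (3, 23), (4, 31), (5, 35)]

# Step 1: start from cleared totals (the equivalent of clearing old data).
n = sum_x = sum_y = sum_xx = sum_yy = sum_xy = 0

# Step 2: enter each (x, y) pair, updating the running totals.
for x, y in pairs:
    n += 1
    sum_x += x
    sum_y += y
    sum_xx += x * x
    sum_yy += y * y
    sum_xy += x * y

# Step 3: read r from the totals.
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
r = numerator / denominator
print(round(r, 2))
```

Skipping step 1 in this sketch would leave the previous dataset's totals in place and give a wrong $r$, which is the same reason for clearing the calculator before each new dataset.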

Reading the sign off a scatterplot

If you only have the plot (no number), you can still state the sign and a rough magnitude:

  • Cloud rises to the right: $r > 0$.
  • Cloud falls to the right: $r < 0$.
  • Tight band: $|r|$ near $1$. Loose cloud: $|r|$ near $0$. Round, tiltless blob: $r \approx 0$.

This is the same reading you did for direction and strength on the scatterplot, now phrased as the sign and magnitude of $r$.

Correlation versus causation

A strong correlation does not prove causation. There are three ways to get a strong $r$:

  • $x$ causes $y$.
  • $y$ causes $x$ (reverse causation).
  • a third variable causes both, so $x$ and $y$ move together as effects of a common cause.

The classic example: ice-cream sales and drownings are positively correlated. Hot weather drives both, but neither causes the other. In the exam, if a worded question invites a causal claim, state that $r$ shows association only and use cautious language.

How exam questions ask about $r$

  • "Interpret $r = \dots$" Give strength (from $|r|$), direction (from the sign) and the word "linear", in one sentence: "a strong, negative, linear relationship".
  • "Compare the correlation in datasets A and B." Compare magnitudes for strength and signs for direction. The larger $|r|$ is the stronger relationship regardless of sign.
  • "Explain why a low $r$ does not mean no relationship." Because $r$ measures only linear association; a curved (non-linear) pattern can give $r \approx 0$. Look at the scatterplot.
  • "Calculate $r$ for this data." Enter the pairs in statistics mode and read $r$ off; quote it to two decimal places.
  • "Does this prove $x$ causes $y$?" No: correlation is not causation; a third variable may be responsible.

Edge cases worth knowing

  • $r$ near a band boundary. Quote the value and give the nearest sensible label; a value like $0.79$ sits right on the strong/very strong line, so "strong, almost very strong" is fine.
  • $r$ exactly $0$ on a clear curve. The relationship is deterministic but non-linear; report that $r$ misses it and point to the scatterplot.
  • A high $|r|$ from a tiny sample. Two points with different $x$-values and different $y$-values always give $|r| = 1$, and three points can land near a line by accident. A large $|r|$ from very few pairs is not strong evidence.
  • Restricted range. If the data only covers a narrow slice of $x$, $r$ can look weaker than the true relationship over the full range. Standard 2 will not test this directly, but it is why the data range matters.
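The tiny-sample edge case is easy to check numerically. In this sketch the two-point datasets are made up for illustration and `pearson_r` is a hypothetical helper using the standard definition; any two points with different $x$-values and different $y$-values define a straight line, so the fit is perfect by construction.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Two unrelated-looking pairs of points (made-up values).
r_falling = pearson_r([1, 2], [5, 3])      # y goes down as x goes up
r_rising = pearson_r([10, 40], [7, 8])     # y goes up as x goes up
print(r_falling, r_rising)
```

Both results have magnitude $1$, which says nothing about the underlying relationship: it only reflects that two points always fit a line.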

Exam-style practice questions

Practice questions written in the style of NESA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

2022 HSC-style · 3 marks
A dataset of $20$ pairs gives Pearson's correlation coefficient $r = -0.86$. Interpret this value.
Worked answer:

The negative sign means the relationship is inverse: as $x$ increases, $y$ tends to decrease.

The magnitude $|r| = 0.86$ is close to $1$ and lies in the $0.8$ to $1.0$ band, indicating a very strong linear relationship.

Overall, $r = -0.86$ indicates a very strong, negative, linear association between the two variables.

Markers reward identification of sign (direction), magnitude (strength) and the linear qualifier.

2023 HSC-style · 3 marks
Two datasets are presented. Dataset A has $r = 0.95$. Dataset B has $r = 0.05$. Describe the relationship in each, and explain why a low $r$ does not necessarily mean no relationship.
Worked answer:

Dataset A: very strong positive linear relationship. As $x$ increases, $y$ increases, with points closely clustered around the line.

Dataset B: very weak or no linear relationship. The points show essentially no straight-line pattern.

Pearson's $r$ measures only linear association, so a low value does not rule out a relationship. A scatterplot may show a strong non-linear pattern (for example, a parabola or other U-shape), in which case $r$ can be near zero despite a clear relationship. Always look at the scatterplot before relying on $r$.

Markers reward describing both datasets correctly with sign, strength and linear qualifier, and the caveat that low $r$ does not preclude non-linear patterns.

Practice questions

Original practice questions graded from foundation to exam level, each with a full worked solution. Try them before revealing the solution.

foundation · 2 marks
A study of $15$ pairs of data gives Pearson's correlation coefficient $r = 0.83$. (a) State the direction of the relationship. (b) State its strength. (c) Write a one-sentence interpretation that names strength, direction and the word "linear".
Worked solution:
Read the sign for direction (a)
The value $r = 0.83$ is positive, so the relationship is positive: as one variable increases, the other tends to increase.
Read the magnitude for strength (b)
The size is $|r| = 0.83$, which falls in the $0.8$ to $1.0$ band, so the relationship is very strong.
Write the full interpretation (c)
Combine all three parts: $r = 0.83$ indicates a very strong, positive, linear relationship between the two variables.
Check
The sentence names strength (very strong), direction (positive) and includes the word "linear", which is exactly what markers reward.

foundation · 3 marks
A scatterplot of daily maximum temperature against the number of cold drinks sold shows points that rise steadily to the right and lie in a fairly tight band close to a straight line. (a) State whether $r$ is positive or negative. (b) Estimate whether $|r|$ is close to $0$ or close to $1$. (c) Give a single value of $r$ that would be consistent with this description.
Worked solution:
Read the direction from the slope (a)
The cloud rises to the right (drink sales go up as temperature goes up), so the trend is positive and $r > 0$.
Read the strength from the spread (b)
The points lie in a fairly tight band close to a straight line, so the linear fit is strong and $|r|$ is close to $1$, not close to $0$.
Choose a consistent value (c)
A tight rising band matches a large positive value, for example $r = 0.9$. (Any value from about $0.85$ to $0.95$ is reasonable.)
Check
Positive sign matches the rising cloud and a magnitude near $0.9$ matches the tight band, so $r = 0.9$ fits the description.

foundation · 4 marks
A gardener waters five seedlings with different amounts of water and records the height after two weeks. The water $x$ (in litres) and height $y$ (in centimetres) are: $(1, 12)$, $(2, 19)$, $(3, 23)$, $(4, 31)$, $(5, 35)$. (a) Use the statistics mode of your calculator to find $r$, correct to two decimal places. (b) Describe the relationship in words.
Worked solution:
Enter the data into statistics mode (a)
Clear any old data, switch to 2-variable statistics mode, and enter the five pairs into the $x$ and $y$ lists in order: $x = 1, 2, 3, 4, 5$ and $y = 12, 19, 23, 31, 35$.
Read the coefficient (a)
From the regression-results menu the calculator returns $r = 0.99$ ($0.9947$ to four decimal places).
Describe the relationship (b)
The sign is positive and $|r| = 0.99$ is in the $0.8$ to $1.0$ band, so this is a very strong, positive, linear relationship: more water is associated with greater height.
Check
Both $x$ and $y$ increase together and the points lie almost exactly on a line, so a value very close to $+1$ is expected, agreeing with $r = 0.99$.
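Hand calculation is not required in the exam, but the arithmetic behind the calculator value can be laid out as a check. The working below assumes the standard deviation-based formula for $r$; the symbols $S_{xx}$, $S_{yy}$ and $S_{xy}$ are notation introduced here for the sums of squared deviations and products of deviations, not notation from the syllabus.

```latex
\begin{align*}
\bar{x} &= \tfrac{1+2+3+4+5}{5} = 3, \qquad \bar{y} = \tfrac{12+19+23+31+35}{5} = 24 \\
S_{xx} &= \sum (x - \bar{x})^2 = 4 + 1 + 0 + 1 + 4 = 10 \\
S_{yy} &= \sum (y - \bar{y})^2 = 144 + 25 + 1 + 49 + 121 = 340 \\
S_{xy} &= \sum (x - \bar{x})(y - \bar{y}) = 24 + 5 + 0 + 7 + 22 = 58 \\
r &= \frac{S_{xy}}{\sqrt{S_{xx}\, S_{yy}}} = \frac{58}{\sqrt{3400}} \approx 0.9947
\end{align*}
```

Rounded to two decimal places this is $0.99$, in agreement with the calculator.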

foundation · 3 marks
Two studies are reported. Study P has $r = -0.78$ and study Q has $r = 0.41$. (a) Which study shows the stronger linear relationship? (b) Which shows a negative relationship? (c) Explain why a negative $r$ can still describe a stronger relationship than a positive one.
Worked solution:
Compare magnitudes for strength (a)
Strength depends only on the size $|r|$. Here $|-0.78| = 0.78$ and $|0.41| = 0.41$, and $0.78 > 0.41$, so study P shows the stronger linear relationship.
Compare signs for direction (b)
The sign gives the direction. Study P has a negative value, so study P shows the negative relationship.
Explain the separation (c)
Sign and magnitude answer different questions: the sign is only the direction, while the magnitude is the strength. A value of $-0.78$ sits closer to $-1$ than $0.41$ sits to $1$, so its points hug the line more tightly and the relationship is stronger, even though it slopes downward.
Check
Study P is both stronger (larger magnitude) and negative, which is consistent: strength and direction are read separately.

core · 4 marks
A student records the hours of revision $x$ and the test mark $y$ (out of $80$) for six classmates: $(1, 45)$, $(2, 52)$, $(3, 49)$, $(4, 63)$, $(5, 58)$, $(6, 70)$. (a) Find $r$ using your calculator, correct to two decimal places. (b) Interpret the value in context.
Worked solution:
Enter the pairs into statistics mode (a)
Clear old data, then enter $x = 1, 2, 3, 4, 5, 6$ and $y = 45, 52, 49, 63, 58, 70$ as six $(x, y)$ pairs.
Read the coefficient (a)
The calculator gives $r = 0.90$ to two decimal places ($0.8999$ to four decimal places).
Interpret in context (b)
The sign is positive and $|r| = 0.90$ lies in the $0.8$ to $1.0$ band, so there is a very strong, positive, linear relationship: more revision hours are associated with higher test marks.
Check
The marks generally climb as revision hours rise, with only small dips, so a large positive value near $0.9$ is sensible.

core · 4 marks
The daily recreational screen time $x$ (in hours) and the nightly sleep $y$ (in hours) for six teenagers are: $(1, 9)$, $(2, 7.5)$, $(3, 8)$, $(4, 6)$, $(5, 6.5)$, $(6, 5)$. (a) Find $r$ to two decimal places. (b) Describe the direction and strength. (c) State what the negative sign means in plain words.
Worked solution:
Enter the data (a)
In 2-variable statistics mode enter $x = 1, 2, 3, 4, 5, 6$ and $y = 9, 7.5, 8, 6, 6.5, 5$.
Read the coefficient (a)
The calculator returns $r = -0.92$ ($-0.9221$ to four decimal places).
Describe direction and strength (b)
The sign is negative, so the direction is negative; $|r| = 0.92$ is in the $0.8$ to $1.0$ band, so the relationship is very strong. Overall this is a very strong, negative, linear relationship.
Explain the sign (c)
The negative sign means the variables move in opposite directions: as screen time increases, nightly sleep tends to decrease.
Check
Sleep hours fall as screen time rises, so a strong negative value near $-0.9$ is expected, matching $r = -0.92$.

core · 4 marks
A health survey records hours spent outdoors per week $x$ and a vitamin D score $y$ for eight people: $(1, 3)$, $(2, 5)$, $(3, 4)$, $(4, 7)$, $(5, 6)$, $(6, 9)$, $(7, 7)$, $(8, 10)$. (a) Find $r$ to two decimal places. (b) Give the strength band. (c) Explain why $r$ is not exactly $1$ even though the trend is clearly upward.
Worked solution:
Enter the eight pairs (a)
In statistics mode enter $x = 1, 2, 3, 4, 5, 6, 7, 8$ and $y = 3, 5, 4, 7, 6, 9, 7, 10$.
Read the coefficient (a)
The calculator gives $r = 0.89$ ($0.8919$ to four decimal places).
Give the strength band (b)
The magnitude $|r| = 0.89$ lies in the $0.8$ to $1.0$ band, so the linear relationship is very strong and positive.
Explain why it is not exactly $1$ (c)
A coefficient of exactly $1$ needs every point to sit on one straight line. Here the points rise overall but zig-zag slightly (for example $y$ dips from $5$ down to $4$, and from $9$ down to $7$), so the fit is very strong but not perfect, giving $r = 0.89$ rather than $1$.
Check
The upward trend matches the positive sign, and the small wobbles explain why the value is high but below $1$.

core · 3 marks
A teacher wonders whether shoe size is linked to a spelling-test mark. For seven students the shoe size $x$ and mark $y$ are: $(1, 6)$, $(2, 9)$, $(3, 5)$, $(4, 8)$, $(5, 5)$, $(6, 9)$, $(7, 6)$. (a) Find $r$ to two decimal places. (b) Interpret the result. (c) Does this mean shoe size has no effect on spelling skill?
Worked solution:
Enter the data (a)
In statistics mode enter $x = 1, 2, 3, 4, 5, 6, 7$ and $y = 6, 9, 5, 8, 5, 9, 6$.
Read the coefficient (a)
The calculator returns $r = 0.00$ ($0.0000$ to four decimal places).
Interpret the result (b)
With $|r| = 0.00$ the value sits in the $0.0$ to $0.2$ band, so there is essentially no linear relationship between shoe size and spelling mark.
Answer the effect question (c)
A value near $0$ tells us there is no straight-line link in this sample; it does not "prove" anything about cause. We would not expect shoe size to affect spelling, and the near-zero $r$ is consistent with two unrelated variables.
Check
The marks bounce up and down with no upward or downward drift as shoe size grows, so a coefficient close to $0$ is exactly what we expect.
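For readers who want to see where the zero comes from, the working below is a sketch using the standard shortcut $S_{xy} = \sum xy - n\,\bar{x}\,\bar{y}$ for the numerator of $r$. The symbol $S_{xy}$ is notation introduced here, and none of this arithmetic is required in the exam.

```latex
\begin{align*}
\bar{x} &= \tfrac{1+2+3+4+5+6+7}{7} = 4, \qquad \bar{y} = \tfrac{6+9+5+8+5+9+6}{7} = \tfrac{48}{7} \\
\sum xy &= 6 + 18 + 15 + 32 + 25 + 54 + 42 = 192 \\
S_{xy} &= \sum xy - n\,\bar{x}\,\bar{y} = 192 - 7 \times 4 \times \tfrac{48}{7} = 192 - 192 = 0
\end{align*}
```

Because the numerator of $r$ is zero, $r = 0$ for this dataset whatever the spread of the marks.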

exam · 5 marks
A used-car dealer records the age $x$ (in years) and the price $y$ (in thousands of dollars) of seven cars of the same model: $(2, 30)$, $(3, 24)$, $(4, 27)$, $(5, 18)$, $(6, 16)$, $(7, 12)$, $(8, 9)$. (a) Find $r$ to two decimal places. (b) Interpret the value fully. (c) A salesperson says "this proves that getting older causes a car to lose value". Comment on this claim.
Worked solution:
Enter the data into statistics mode (a)
Clear old data, then enter $x = 2, 3, 4, 5, 6, 7, 8$ and $y = 30, 24, 27, 18, 16, 12, 9$ as seven pairs.
Read the coefficient (a)
The calculator gives $r = -0.97$ ($-0.9658$ to four decimal places).
Interpret fully (b)
The sign is negative, so the direction is negative; $|r| = 0.97$ is in the $0.8$ to $1.0$ band, so the relationship is very strong. In context: there is a very strong, negative, linear relationship, so older cars of this model tend to be cheaper.
Comment on the causal claim (c)
A strong $r$ shows association, not proof of cause. While age plausibly contributes here, $r$ alone cannot establish causation: other factors (kilometres driven, condition, demand) move with age and also affect price. State the link as an association and use cautious language.
Check
Price falls steadily as age rises, so a value very close to $-1$ is expected, agreeing with $r = -0.97$.

exam · 5 marks
A school investigates whether class attendance is linked to the final exam mark. For eight classes the average attendance $x$ (in days per term) and average mark $y$ (out of $100$) are: $(5, 42)$, $(8, 55)$, $(12, 51)$, $(15, 68)$, $(18, 72)$, $(22, 79)$, $(26, 88)$, $(30, 95)$. (a) Find $r$ to two decimal places. (b) Describe the relationship. (c) Explain what a student should look at before trusting this value.
Worked solution:
Enter the eight pairs (a)
In statistics mode enter $x = 5, 8, 12, 15, 18, 22, 26, 30$ and $y = 42, 55, 51, 68, 72, 79, 88, 95$.
Read the coefficient (a)
The calculator gives $r = 0.98$ ($0.9798$ to four decimal places).
Describe the relationship (b)
The sign is positive and $|r| = 0.98$ is in the $0.8$ to $1.0$ band, so there is a very strong, positive, linear relationship: higher attendance is associated with higher exam marks.
State the check before trusting it (c)
Always look at the scatterplot first. A high $r$ measures only linear association, so a curved pattern or a single outlier could distort the picture. Confirm the cloud genuinely follows a straight-line trend before relying on $r = 0.98$.
Check
Marks climb steadily as attendance rises with only small wobbles, so a value very close to $+1$ is sensible.

exam · 6 marks
Two data sets are collected. Data set A pairs an advertising spend $x$ with weekly sales $y$: $(1, 11)$, $(2, 13)$, $(3, 18)$, $(4, 21)$, $(5, 27)$, $(6, 30)$. Data set B pairs the hours of machine downtime $x$ with units produced $y$: $(1, 30)$, $(2, 22)$, $(3, 24)$, $(4, 18)$, $(5, 15)$, $(6, 9)$. (a) Find $r$ for each set, correct to two decimal places. (b) State which relationship is stronger and justify your answer. (c) Describe each relationship in one sentence.
Worked solution:
Find $r$ for data set A (a)
Enter $x = 1, 2, 3, 4, 5, 6$ and $y = 11, 13, 18, 21, 27, 30$ in statistics mode; the calculator gives $r_A = 0.99$ ($0.9929$ to four decimal places).
Find $r$ for data set B (a)
Clear the lists, then enter $x = 1, 2, 3, 4, 5, 6$ and $y = 30, 22, 24, 18, 15, 9$; the calculator gives $r_B = -0.96$ ($-0.9613$ to four decimal places).
Compare magnitudes (b)
Strength is set by the size only: $|r_A| = 0.99$ and $|r_B| = 0.96$. Since $0.99 > 0.96$, data set A has the stronger linear relationship; the negative sign of $r_B$ plays no part in the comparison.
Describe each set (c)
Data set A: a very strong, positive, linear relationship (more advertising is associated with more sales). Data set B: a very strong, negative, linear relationship (more downtime is associated with fewer units produced).
Check
A rises together so its sign is positive, B falls so its sign is negative, and both clouds hug their lines tightly, so two large magnitudes are expected with A slightly larger.
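The comparison in this question can be cross-checked in code. This is a minimal sketch, assuming the standard definition of $r$; the helper `pearson_r` and the variable names are hypothetical, and the data are the two sets given in the question.

```python
from math import sqrt

def pearson_r(pairs):
    """Pearson's r for a list of (x, y) pairs (illustrative helper)."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

set_a = [(1, 11), (2, 13), (3, 18), (4, 21), (5, 27), (6, 30)]  # advertising vs sales
set_b = [(1, 30), (2, 22), (3, 24), (4, 18), (5, 15), (6, 9)]   # downtime vs units

r_a = pearson_r(set_a)
r_b = pearson_r(set_b)

# Strength is compared on magnitudes; the sign is only used for direction.
stronger = "A" if abs(r_a) > abs(r_b) else "B"
print(round(r_a, 2), round(r_b, 2), stronger)
```

Comparing `r_a > r_b` directly would also pick A here, but only by luck: with the signs swapped it would pick the wrong set, which is why the comparison uses `abs`.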

exam · 5 marks
A science class measures a quantity $y$ at seven settings of a control $x$ and records: $(1, 20)$, $(2, 11)$, $(3, 5)$, $(4, 2)$, $(5, 5)$, $(6, 11)$, $(7, 20)$. (a) Find $r$ to two decimal places. (b) The class concludes "there is no relationship between $x$ and $y$". Explain why this conclusion is wrong. (c) State the lesson about using $r$.
Worked solution:
Enter the data (a)
In statistics mode enter $x = 1, 2, 3, 4, 5, 6, 7$ and $y = 20, 11, 5, 2, 5, 11, 20$.
Read the coefficient (a)
The calculator returns $r = 0.00$ ($0.0000$ to four decimal places).
Explain why the conclusion is wrong (b)
A value of $r = 0$ means there is no linear relationship, not no relationship at all. Plotting the points shows a clear U-shape (a strong non-linear pattern): $y$ falls to a minimum at $x = 4$ and then rises symmetrically. The fall on the left cancels the rise on the right, forcing $r$ to $0$ even though the pattern is obvious.
State the lesson (c)
Pearson's $r$ measures linear association only, so always look at the scatterplot before trusting it. A small $r$ rules out a straight-line trend but not a curved one.
Check
The $y$ values are symmetric about $x = 4$, so by symmetry the linear trend must cancel to give $r = 0$, confirming the calculator value.
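The symmetry argument can be written out explicitly. The working below is a sketch that assumes the standard definition of $r$, whose numerator is the sum of products of deviations; the symbol $S_{xy}$ is notation introduced here and the arithmetic is not required in the exam.

```latex
\begin{align*}
\bar{x} &= 4, \qquad \sum (x - \bar{x}) = (-3) + (-2) + (-1) + 0 + 1 + 2 + 3 = 0 \\
\sum (x - \bar{x})\,y &= (-3)(20) + (-2)(11) + (-1)(5) + (0)(2) + (1)(5) + (2)(11) + (3)(20) = 0 \\
S_{xy} &= \sum (x - \bar{x})(y - \bar{y}) = \sum (x - \bar{x})\,y - \bar{y} \sum (x - \bar{x}) = 0 - 0 = 0
\end{align*}
```

Each term on the left of $x = 4$ is cancelled by its mirror image on the right, so the numerator of $r$ is zero and $r = 0$.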