Year 12: Statistical Analysis

NSW Maths Standard 2: syllabus dot point

How do scatterplots reveal the form, direction and strength of the relationship between two variables?

Construct and interpret scatterplots to describe the relationship between two variables in bivariate data

A focused answer to the HSC Maths Standard 2 dot point on scatterplots. Reading form, direction and strength of association, identifying outliers, and worked Australian examples using ABS-style economic and demographic data.

Generated by Claude Opus. Reviewed by Better Tuition Academy. 7 min answer.

What this dot point is asking

NESA wants you to construct a scatterplot from a table of bivariate data, describe the form, direction and strength of the relationship between the two variables, and identify outliers and possible causes for them.

The answer

[Figure: scatterplot showing a strong positive linear association. Approximately 18 data points cluster along an upward-sloping line.]

What a scatterplot is

A scatterplot is a graph of one variable against another with each data point plotted as a single dot. By convention:

  • x-axis: the independent (or explanatory) variable.
  • y-axis: the dependent (or response) variable.

For each (x, y) pair in the data, plot one dot at that location. Do not connect the dots.
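The plotting rule above can be sketched in a few lines of Python. This is only an illustration: the function name `text_scatter`, the grid size and the study-hours data are all made up for this example, and in the exam you would draw the plot by hand on the grid provided. The sketch assumes the x values are not all equal and the y values are not all equal.

```python
def text_scatter(points, width=20, height=8):
    """Return a text scatterplot: one '*' per (x, y) pair, no connecting lines."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        # Scale x to a column and y to a row of the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    return ["|" + "".join(r) for r in grid] + ["+" + "-" * width]

# Hypothetical data: x = hours of study (independent), y = test mark in % (dependent).
data = [(1, 52), (2, 55), (3, 61), (4, 60), (5, 68), (6, 71), (7, 75), (8, 80)]
for line in text_scatter(data):
    print(line)
```

Each pair becomes exactly one mark, with the independent variable running left to right and the dependent variable running bottom to top.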

Describing the relationship

Three descriptors:

  • Form. Linear (points cluster around a straight line), non-linear (curve), or no clear pattern.
  • Direction. Positive (as x increases, y tends to increase), negative (as x increases, y tends to decrease), or no association.
  • Strength. Strong (points cluster tightly around a line or curve), moderate, weak, or no association.

State all three when describing a scatterplot. Markers expect explicit use of these words.
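As a rough cross-check on a description made by eye, direction and strength can be read off Pearson's correlation coefficient r. The sketch below is a minimal illustration only: the helper names, the data and especially the cut-offs (0.25, 0.5, 0.75) are assumptions chosen for this example, not an official NESA scale. Note that r only measures linear association, so the form still has to be judged from the plot itself.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe(r):
    """Turn r into words. The cut-offs are an assumed rule of thumb."""
    size = abs(r)
    if size < 0.25:
        return "no clear linear association"
    strength = "strong" if size >= 0.75 else "moderate" if size >= 0.5 else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} linear"

# Hypothetical data: hours of study and test mark (%).
hours = [1, 2, 3, 4, 5, 6, 7, 8]
marks = [52, 55, 61, 60, 68, 71, 75, 80]
r = pearson_r(hours, marks)
print(round(r, 2), describe(r))
```

For this made-up data set r is close to 1, which matches what the plot would show: points sitting tightly around an upward-sloping line.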

Outliers

An outlier is a point that lies far from the bulk of the data. Outliers can dramatically affect the calculated correlation coefficient and the regression line.

Causes:

  • Data error. Wrong digit entered, wrong unit.
  • Genuine atypical case. A real but rare observation (e.g. a household with unusual circumstances).
  • Subgroup effect. Two distinct populations on the same plot.

Decide whether to remove an outlier based on cause. Errors should be corrected or removed; genuine atypical cases should usually be reported and kept in.
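To see how much a single outlier can pull the regression line, the sketch below fits a least-squares line to a small made-up data set twice: once with a suspect point included and once without it. The function name `least_squares_line`, the data and the suspect point (9, 20) are all hypothetical, chosen only to make the effect visible.

```python
def least_squares_line(points):
    """Return (gradient, intercept) of the least-squares line of y on x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept

# Hypothetical data: hours of study and test mark (%).
clean = [(1, 52), (2, 55), (3, 61), (4, 60), (5, 68), (6, 71), (7, 75), (8, 80)]
with_outlier = clean + [(9, 20)]  # assumed suspect point, e.g. a mistyped mark

m_clean, c_clean = least_squares_line(clean)
m_all, c_all = least_squares_line(with_outlier)
print("without outlier: gradient", round(m_clean, 2))
print("with outlier:    gradient", round(m_all, 2))
```

With these numbers the gradient is about 4 marks per hour without the suspect point and slightly negative with it, so one point is enough to reverse the apparent direction. That is why the cause of an outlier should be investigated before deciding what to do with it.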

Correlation does not imply causation

A strong positive correlation between x and y does not prove that x causes y. A third variable may explain both. Standard 2 expects you to mention this whenever a worded question invites a causal claim.

Reading scatterplots quickly

Pattern checklist:

  • Straight line going up-right: positive linear.
  • Straight line going down-right: negative linear.
  • Curve: non-linear.
  • Cloud with no pattern: no association.
  • Two distinct clouds: subgroup effect, often best modelled as two separate relationships.

Past exam questions, worked

Real questions from past NESA papers on this dot point, with our answer explainer.

2022 HSC Q14 (3 marks). A scatterplot of x (years of schooling) and y (weekly income, $) shows the points clustered closely around an upward-sloping line. Describe the form, direction and strength of the relationship between x and y.

Worked answer:

Form: linear (points cluster around a straight line).

Direction: positive (as years of schooling increase, weekly income tends to increase).

Strength: strong (points cluster closely; little scatter around the line).

Markers reward all three descriptors and the use of the words "linear", "positive", "strong" with brief justification from the plot.

2023 HSC Q16 (4 marks). A scatterplot of household income (x, in $000 per year) against hours of recreational TV viewing per week (y) for 50 Australian households shows a moderately strong downward trend. There is one obvious outlier at ($30 000 per year, 5 hours per week). Describe the relationship and discuss the outlier.

Worked answer:

Form: roughly linear (the trend is steady, not curved).

Direction: negative (higher income associated with less TV).

Strength: moderate (not all points sit tightly on a line, but a clear trend).

Outlier: at ($30 000, 5), a household watching only 5 hours per week despite a low income, well below the trend line. Possible explanations include working multiple jobs, having young children, or another lifestyle factor. If the point turns out to be a data error, it should be corrected or removed; if it is a genuine but atypical household, it should usually be reported and kept in the analysis.

Markers reward all three descriptors, identification of the outlier, and a brief sensible interpretation.
