How is the least-squares regression line calculated, and how is it used to model a linear relationship between two variables?
Find and use the equation of the least-squares regression line to model a linear relationship between two variables
A focused answer to the HSC Maths Standard 2 dot point on the least-squares regression line: the equation $y = mx + c$, finding the gradient and intercept using calculator statistics functions, interpreting the gradient in context, and worked Australian examples.
What this dot point is asking
NESA wants you to find the equation of the least-squares regression line using calculator statistics functions, write it in $y = mx + c$ form, use it to predict $y$ from $x$, and interpret the gradient and intercept in the context of the worded problem.
The answer
The least-squares regression line
For bivariate data, the least-squares regression line is the straight line that minimises the sum of the squared vertical distances from the data points to the line. It is the standard "best-fit" line for linear association.
The equation has the form:
$$y = mx + c$$
where $m$ is the gradient and $c$ is the $y$-intercept.
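For readers who want to see what "least squares" means in symbols, here is a sketch in standard notation for the line $y = mx + c$. The index $i$, the count $n$, the means $\bar{x}$ and $\bar{y}$, and the name $S$ are notation assumed here, not taken from the syllabus wording. The line minimises $S$, and the usual closed-form result for $m$ and $c$ follows:

```latex
S(m, c) = \sum_{i=1}^{n} \bigl(y_i - (m x_i + c)\bigr)^2
\qquad
m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
c = \bar{y} - m\,\bar{x}
```

The second formula shows that the line always passes through the point of means $(\bar{x}, \bar{y})$.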
Finding the line on a calculator
You will not be asked to compute the gradient and intercept by hand. The procedure on a NESA-approved scientific calculator:
- Enter statistics mode (typically MODE STAT 2-VAR or similar).
- Enter the $(x, y)$ pairs into the statistical lists.
- Read off $m$ (sometimes labelled $b$ or B) and $c$ (sometimes labelled $a$ or A) from the regression-result menu.
- Read off $r$ (correlation coefficient) at the same time.
Different calculator models label these differently. Practise on the exact model you will use in the exam.
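To check calculator output away from the exam, the same three numbers can be computed with a few lines of Python. This is a minimal sketch, not a description of how any calculator works internally; the function name `least_squares` and the data (hours studied against marks) are made up for illustration.

```python
from math import sqrt


def least_squares(xs, ys):
    """Return (m, c, r) for the least-squares line y = m*x + c."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx              # gradient
    c = mean_y - m * mean_x    # y-intercept
    r = sxy / sqrt(sxx * syy)  # Pearson's correlation coefficient
    return m, c, r


# Hypothetical data, made up for illustration only.
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 68]
m, c, r = least_squares(hours, marks)
print(f"y = {m:.2f}x + {c:.2f}, r = {r:.3f}")
```

Entering the same five pairs into a calculator's two-variable statistics mode should give matching values, which makes this a quick way to confirm you are reading the right menu items.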
Predicting $y$ from $x$
Substitute the $x$ value into the line equation. This is the model's predicted $y$ at that $x$. The actual value may be slightly different; the line is the best linear fit, not a guarantee.
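The gap between an actual value and the line's prediction can be made concrete with a short sketch. The line ($m = 2.5$, $c = 10$) and the observation below are hypothetical, chosen only to illustrate the idea, and `predict` is a helper name introduced here.

```python
def predict(m, c, x):
    """Predicted y at x for the line y = m*x + c."""
    return m * x + c


# Hypothetical line and observation, chosen only for illustration.
m, c = 2.5, 10.0
x_value, actual_y = 4, 21.0

predicted_y = predict(m, c, x_value)
residual = actual_y - predicted_y  # actual minus predicted
print(predicted_y, residual)
```

Here the model predicts 20 while the observed value is 21, so this point sits 1 unit above the line.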
Interpreting the gradient
The gradient $m$ has units of ($y$ units) per ($x$ unit). For every one-unit increase in $x$, $y$ changes by $m$ units on average.
Always include the word "average" or "on average" and the units in your answer. Markers reward this explicitly.
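The "per unit" reading follows directly from the equation. A short derivation, writing $\hat{y}(x)$ (notation assumed here) for the value that the line $y = mx + c$ predicts at $x$:

```latex
\hat{y}(x + 1) - \hat{y}(x) = \bigl(m(x + 1) + c\bigr) - (mx + c) = m
```

The intercept cancels, so the predicted change for one extra unit of $x$ is the gradient alone, whatever the starting value of $x$.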
Interpreting the $y$-intercept
The intercept $c$ is the predicted $y$ value when $x = 0$. In context this is sometimes meaningful (e.g. base salary at zero years of experience) and sometimes extrapolation (e.g. predicted food spending at zero income).
If $x = 0$ lies well outside the dataset, comment that the intercept is an extrapolation and may be unreliable.
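Whether the intercept is an extrapolation comes down to checking $x = 0$ against the range of $x$ values in the data. A minimal sketch, with a hypothetical helper `is_extrapolation` and made-up weekly incomes:

```python
def is_extrapolation(x, xs):
    """True when x lies outside the range of the observed x values."""
    return x < min(xs) or x > max(xs)


# Hypothetical weekly incomes (dollars), made up for illustration.
incomes = [800, 950, 1200, 1500, 2100]
print(is_extrapolation(0, incomes))     # x = 0 is below the smallest income
print(is_extrapolation(1000, incomes))  # 1000 sits inside the data range
```

This check only flags that a value is outside the data range; judging how far outside is still a matter of reading the context.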
When to use the line
The least-squares line is appropriate when:
- The scatterplot suggests an approximately linear relationship.
- The correlation coefficient $r$ shows a moderate or stronger association.
- There are no extreme outliers distorting the fit.
If the scatterplot is clearly non-linear, the line will be a poor model even if $r$ is not zero.
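A small experiment shows why a non-zero correlation coefficient alone is not enough. The sketch below fits a line to made-up data that follow the curve $y = x^2$ exactly, then lists the residuals (actual minus predicted). The data and variable names are assumptions for illustration.

```python
from math import sqrt

# Made-up data that follow a curve exactly: y = x squared.
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

m = sxy / sxx              # gradient of the fitted line
c = mean_y - m * mean_x    # y-intercept of the fitted line
r = sxy / sqrt(sxx * syy)  # correlation coefficient

# Residual = actual y minus the line's predicted y.
residuals = [y - (m * x + c) for x, y in zip(xs, ys)]
print(f"r = {r:.2f}")
print(residuals)
```

The correlation coefficient comes out close to 1, yet the residuals are positive at both ends and negative in the middle. That systematic pattern is the sign of a curved relationship, which is why the scatterplot should be checked before the line is trusted.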
Past exam questions, worked
Real questions from past NESA papers on this dot point, with our answer explainer.
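2022 HSC Q21, 4 marks
For a dataset of $(x, y)$ pairs, calculator output gives the gradient $m$ and intercept $c$ for the least-squares regression line. Write the equation, predict $y$ at the given $x$, and interpret the gradient.
Equation: $y = mx + c$, with the calculator's values substituted for $m$ and $c$.
At the given $x$: substitute it into the equation to get the predicted $y$.
Interpretation: for every increase of $1$ in $x$, $y$ increases by $m$ (on average, according to the model).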
Markers reward the equation, the substitution, and an interpretation of the gradient that uses the word "average" or "on average" to acknowledge the model is a best fit, not exact.
2023 HSC Q21, 3 marks
A linear model of weekly food spending ($y$, \$) against weekly household income ($x$, \$) is $y = 0.18x + 95$. Interpret the gradient and the $y$-intercept in this context.
Gradient $0.18$: for every extra dollar of weekly income, the household spends, on average, an additional \$0.18 on food. Equivalently, about $18\%$ of each additional dollar of income goes to food.
$y$-intercept $95$: a household with zero income is predicted to spend \$95 per week on food. However, \$0 income is well outside the dataset, so the intercept may not be reliable (extrapolation).
Markers reward the gradient interpretation in context with units, the intercept interpretation in context, and a brief caveat about extrapolation for the intercept.
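The two interpretations in the 2023 answer can be checked numerically. This sketch uses the model $y = 0.18x + 95$ from the question; the function name `food_spending` and the two income values are arbitrary choices made here for illustration.

```python
def food_spending(income):
    """Predicted weekly food spending (dollars) from the model y = 0.18x + 95."""
    return 0.18 * income + 95


# The income values below are arbitrary inputs chosen for illustration.
at_zero = food_spending(0)  # the y-intercept (an extrapolation)
extra = food_spending(1100) - food_spending(1000)  # effect of $100 more income
print(round(at_zero, 2), round(extra, 2))
```

An extra \$100 of weekly income raises predicted food spending by \$18, which is the gradient of \$0.18 per dollar scaled up by 100.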
Related dot points
- Construct and interpret scatterplots to describe the relationship between two variables in bivariate data
A focused answer to the HSC Maths Standard 2 dot point on scatterplots. Reading form, direction and strength of association, identifying outliers, and worked Australian examples using ABS-style economic and demographic data.
- Calculate and interpret Pearson's correlation coefficient using statistical technology, including the sign and magnitude
A focused answer to the HSC Maths Standard 2 dot point on Pearson's correlation coefficient. What $r$ measures, how to interpret its sign and magnitude, the limitations of $r$ in non-linear relationships, and how to compute it using calculator statistics functions.
- Distinguish between interpolation and extrapolation when using a regression line, and assess the reliability of predictions
A focused answer to the HSC Maths Standard 2 dot point on interpolation vs extrapolation. The reliability of predictions inside and outside the data range, examples of when extrapolation breaks down, and Australian-context worked examples.