When a scatterplot is curved, how do the squared, log and reciprocal transformations straighten the data so a least-squares line can be fitted?

Recognise non-linear association from a scatterplot and residual plot, apply the squared, logarithmic or reciprocal transformation to the explanatory or response variable to linearise the data, fit a least-squares line to the transformed data, and use it to predict

A focused answer to the VCE General Mathematics Unit 3 Data analysis key-knowledge point on data transformation. Spotting curvature, the circle-of-transformations idea, applying the squared, log and reciprocal transformations, fitting a line to transformed data, and predicting back.

Generated by Claude Opus 4.77 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this dot point is asking
  2. Spotting that a transformation is needed
  3. Choosing the transformation
  4. Fitting and predicting with a transformed model
  5. Reading the transformed equation back
  6. Why this matters for the exams

What this dot point is asking

VCAA wants you to handle bivariate data whose scatterplot is curved rather than linear. A least-squares line should only be fitted to a linear relationship, so when the residual plot shows a clear pattern you first transform one of the variables, using a squared, logarithmic or reciprocal transformation, to straighten the data. You then fit the least-squares line to the transformed data and use it to predict, remembering to undo the transformation at the end. This is the natural follow-on from correlation and regression.

Spotting that a transformation is needed

A visibly curved scatterplot, or a residual plot with a clear arch or U-shape rather than random scatter, signals that a straight line is the wrong model. Rather than abandon regression, you re-express one variable so that the relationship becomes linear.
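
As a rough illustration of what a patterned residual plot looks like in numbers, here is a minimal Python sketch. The data are made up for the example (they follow y = x^2 exactly), and `least_squares` is a hypothetical helper written here, not a calculator or library function.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up curved data: y = x^2 for x = 1..7 (an assumption for illustration).
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [x ** 2 for x in xs]

# Fit a straight line to the untransformed data and compute the residuals.
a, b = least_squares(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print(residuals)
```

The residuals start positive, dip negative in the middle and return to positive at the end. That U-shape, rather than random scatter about zero, is the cue to transform.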

Choosing the transformation

The direction the curve bends tells you which transformation to apply. Stretching the high end of the $x$-axis (squaring $x$) or compressing it (log or reciprocal of $x$) shifts points to straighten the bulge. In the exam you are usually told which transformation to apply, or you pick the one that gives the better $r^2$ on the transformed data.
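
The "pick the better r^2" approach can be sketched in Python as follows. This is only an illustration under stated assumptions: the data are made up so that they follow y = 3 + 2 log10(x), and `r_squared` is a hypothetical helper defined here.

```python
import math

def r_squared(xs, ys):
    """Coefficient of determination for the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

# Made-up data following y = 3 + 2*log10(x) (an assumption for illustration).
xs = [1, 2, 4, 8, 16, 32]
ys = [3 + 2 * math.log10(x) for x in xs]

# Candidate transformations of the explanatory variable.
candidates = {
    "x^2": [x ** 2 for x in xs],
    "log10(x)": [math.log10(x) for x in xs],
    "1/x": [1 / x for x in xs],
}

# Score each transformed data set and keep the one with the highest r^2.
scores = {name: r_squared(tx, ys) for name, tx in candidates.items()}
best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: r^2 = {score:.4f}")
print("best:", best)
```

Because the data were built from a log relationship, the log transformation scores highest here. With real data the gap between candidates is usually smaller, so check the residual plot of the transformed data as well.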

Fitting and predicting with a transformed model

Once a variable is transformed, treat the transformed quantity as a new variable and fit the least-squares line as normal.
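
A minimal sketch of "treat the transformed quantity as a new variable", assuming a squared transformation on the explanatory variable. The data are made up (they follow y = 2 + 0.5 x^2 exactly) and `least_squares` is a hypothetical helper, so this shows the workflow rather than any particular exam data set.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up data following y = 2 + 0.5*x^2 (an assumption for illustration).
xs = [1, 2, 3, 4, 5]
ys = [2 + 0.5 * x ** 2 for x in xs]

# Step 1: build the new explanatory variable x^2.
x_squared = [x ** 2 for x in xs]

# Step 2: fit the least-squares line of y on x^2, exactly as for any other variable.
a, b = least_squares(x_squared, ys)

# Step 3: predict by transforming the input first, then substituting.
x_new = 3
prediction = a + b * x_new ** 2
print(a, b, prediction)
```

The fitted line is y = a + b × x^2, so the value substituted into the equation is always x^2, never x itself.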

Reading the transformed equation back

The fitted equation already contains the transformation, so prediction is just careful substitution. If the transformation was on the response variable, for example $\sqrt{y} = a + bx$ (that is, $y^{1/2} = a + bx$) or the same equation with $\sqrt{y}$ replaced by $\log_{10} y$, you must undo it at the end: square both sides, or raise $10$ to the power of each side. Always check whether the transformation sits on $x$ or on $y$ before predicting.
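
Undoing a transformation on the response variable can be sketched like this. The coefficients `a` and `b` and the input `x` are hypothetical values chosen so the arithmetic is easy to follow, and the function names are made up for this example.

```python
def predict_sqrt_y(a, b, x):
    """Model sqrt(y) = a + b*x: undo the square root by squaring."""
    return (a + b * x) ** 2

def predict_log_y(a, b, x):
    """Model log10(y) = a + b*x: undo the log by raising 10 to the power."""
    return 10 ** (a + b * x)

def predict_reciprocal_y(a, b, x):
    """Model 1/y = a + b*x: undo the reciprocal by taking the reciprocal again."""
    return 1 / (a + b * x)

# Hypothetical coefficients and input: a + b*x = 1 + 0.5*4 = 3 in every model.
a, b, x = 1, 0.5, 4
y_from_sqrt = predict_sqrt_y(a, b, x)
y_from_log = predict_log_y(a, b, x)
y_from_reciprocal = predict_reciprocal_y(a, b, x)
print(y_from_sqrt, y_from_log, y_from_reciprocal)
```

The same right-hand side value of 3 gives very different predictions depending on which transformation is being undone, which is why identifying the transformed variable comes first.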

Why this matters for the exams

Transformation questions appear most years and reward students who keep track of which variable was transformed and who undo it correctly when predicting. They build directly on correlation and least-squares regression: the residual plot is the trigger, the transformation is the fix, and the prediction is the payoff. Show the transformed value explicitly in your working so a marker can follow each step.

Exam-style practice questions

Practice questions written in the style of VCAA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

2023 VCAA, 1 mark

A scatterplot of tree height (m) against age (years) is linearised using a logarithm (base 10) transformation applied to the variable age. The equation of the least squares line is height = -3.8 + 12.6 × log10(age). Using this equation, the age, in years, of a tree with a height of 8.52 m is closest to
A. 7.9
B. 8.9
C. 9.1
D. 9.5
E. 9.9

Worked answer

Substitute height = 8.52 into the transformed equation and solve for age.

8.52 = -3.8 + 12.6 × log10(age).

12.6 × log10(age) = 8.52 + 3.8 = 12.32, so log10(age) = 12.32 / 12.6 = 0.97778.

age = 10^0.97778 = 9.50 years.

This is closest to 9.5, so the answer is D. Remember to undo the log by raising 10 to the power of each side.
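
The arithmetic can be checked with a few lines of Python. The intercept, slope and height come from the question; the variable names are just labels chosen for this sketch.

```python
# Coefficients and height are taken from the question above.
intercept, slope = -3.8, 12.6
height = 8.52

# Rearrange height = intercept + slope * log10(age) to isolate log10(age).
log_age = (height - intercept) / slope

# Undo the base-10 log by raising 10 to that power.
age = 10 ** log_age
print(round(log_age, 5), round(age, 2))
```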

2025 VCAA, 1 mark

A squared transformation is applied to the variable doctors (number per 1000 people) when modelling life expectancy in years, life. The equation of the least squares line fitted to this transformed data is of the form life = a + b × (doctors)^2. Using this equation, the predicted life, in years, for a country with two doctors per 1000 people is closest to
A. 73.6
B. 74.0
C. 74.5
D. 74.9

Worked answer

Using the data table, the squared transformation creates a new explanatory variable (doctors)^2. Fitting a least squares line of life on (doctors)^2 with a calculator gives, to four significant figures, life = 63.12 + 2.842 × (doctors)^2.

To predict for two doctors per 1000 people, substitute doctors = 2, so (doctors)^2 = 4.

life = 63.12 + 2.842 × 4 = 63.12 + 11.37 = 74.49, or about 74.5 years.

This is closest to 74.5, so the answer is C. The key step is squaring the value before multiplying by the slope.
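
A short Python check of this substitution, assuming the fitted coefficients quoted in the worked answer. The `options` dictionary simply copies the answer choices from the question.

```python
# Fitted coefficients are the ones quoted in the worked answer above.
a, b = 63.12, 2.842
doctors = 2

# Square the explanatory value first, then multiply by the slope.
life = a + b * doctors ** 2

# Pick the answer option closest to the predicted value.
options = {"A": 73.6, "B": 74.0, "C": 74.5, "D": 74.9}
closest = min(options, key=lambda k: abs(options[k] - life))
print(round(life, 1), closest)
```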