
Topic 1: Bivariate data analysis - how do we straighten a curved relationship so a least-squares line can be fitted?

Apply a square, logarithmic or reciprocal transformation to one variable to linearise a non-linear association, fit a least-squares line to the transformed data, use the transformed equation to predict, and choose the transformation that best straightens the scatter

A focused answer to the QCE General Mathematics Unit 3 dot point on data transformation. Covers when to transform, the square, log and reciprocal transformations, how to fit and use a least-squares line on transformed data, and how to predict by back-substituting, with arithmetic-verified worked examples for IA2 and the external assessment.

Generated by Claude Opus 4.7 | 6 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed


What this dot point is asking

QCAA wants you to handle bivariate data whose scatter is clearly curved, where fitting a straight line directly would be wrong. The fix is to transform one of the variables (square it, take its logarithm, or take its reciprocal) so the relationship straightens out, then fit a least-squares line to the transformed data and use that line to predict. You also have to choose which transformation does the best straightening. This is the natural follow-on from residual analysis in Unit 3 Topic 1 and is regularly tested in IA1, IA2 and the external assessment.

The answer

Why transform at all

Least-squares regression only describes a straight-line relationship. When a scatterplot or residual plot shows a smooth curve, a straight line fitted to the raw data predicts badly in a systematic way: it over-predicts in some parts of the range and under-predicts in others. Rather than abandon regression, you change the scale of one variable so the curve becomes a line. This is called linearising the data.
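As a rough illustration of why a straight line fails on curved data, the sketch below fits a least-squares line to a small made-up data set and prints the residuals. The data values and the helper name `least_squares` are assumptions chosen for this example (the points were generated from $y = 3 + 2x^2$); in class you would do the same fit on CAS.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Made-up curved data (assumption): generated from y = 3 + 2x^2.
xs = [1, 2, 3, 4, 5]
ys = [5, 11, 21, 35, 53]

# Fit a straight line directly to the raw (untransformed) data.
a, b = least_squares(xs, ys)

# Residual = actual value minus predicted value.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

print(f"line: y = {a:.1f} + {b:.1f}x")
print("residuals:", [round(r, 1) for r in residuals])
```

For this data the residuals are positive at both ends and negative in the middle. That curved pattern in the residuals is the signal that a transformation is needed.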

The three transformations

In General Mathematics you choose from three transformations, applied to either the explanatory variable $x$ or the response variable $y$.

  • The squared transformation ($x^2$ or $y^2$) stretches the upper end of a variable. It straightens data that curves upward more and more steeply.
  • The logarithmic transformation ($\log x$ or $\log y$) compresses the upper end. It straightens data that rises quickly then flattens, or data that grows by a roughly constant percentage.
  • The reciprocal transformation ($1/x$ or $1/y$) strongly compresses large values and is used for data that drops steeply and then levels off towards an asymptote.
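The three transformations can be sketched in a few lines of code to show how each one rescales the same values. The function name `transform` and the sample values are assumptions made for illustration, and the logarithm is taken to be base 10.

```python
import math


def transform(values, kind):
    """Apply one of the three transformations to a list of values."""
    if kind == "square":
        return [v ** 2 for v in values]
    if kind == "log":
        # Base-10 logarithm (assumed); only defined for positive values.
        return [math.log10(v) for v in values]
    if kind == "reciprocal":
        # Only defined for non-zero values.
        return [1 / v for v in values]
    raise ValueError(f"unknown transformation: {kind}")


# Made-up values (assumption) spanning a wide range.
values = [1, 10, 100]

for kind in ("square", "log", "reciprocal"):
    print(kind, transform(values, kind))
```

Squaring spreads the large values further apart, the logarithm pulls them into evenly spaced steps, and the reciprocal squeezes them towards zero.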

Choosing the transformation

You pick the transformation that makes the transformed scatterplot look most like a straight line. In practice you compare residual plots or the value of $r^2$ for the candidate transformations and select the one with the most random residuals and the highest $r^2$. The transformation can be applied to either axis; sometimes squaring $x$ works while sometimes taking $\log y$ works, so test rather than guess.
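A minimal sketch of the "test rather than guess" step: compute $r^2$ for several candidate transformations of $x$ and pick the largest. The data are made up (generated from $y = 3 + 2x^2$), and the names `r_squared` and `candidates` are assumptions for this example. It compares $r^2$ only, so you would still check the residual plot of the winner.

```python
import math


def r_squared(xs, ys):
    """Coefficient of determination for a least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


# Made-up curved data (assumption): generated from y = 3 + 2x^2.
xs = [1, 2, 3, 4, 5]
ys = [5, 11, 21, 35, 53]

# Candidate transformations of the explanatory variable.
candidates = {
    "raw x": xs,
    "x squared": [x ** 2 for x in xs],
    "log x": [math.log10(x) for x in xs],
    "1/x": [1 / x for x in xs],
}

# r^2 of y against each transformed version of x.
results = {name: r_squared(tx, ys) for name, tx in candidates.items()}
for name, value in results.items():
    print(f"{name:10s} r^2 = {value:.4f}")

best = max(results, key=results.get)
print("best candidate:", best)
```

Because this data was built from a squared relationship, the squared transformation gives the highest $r^2$. With real data the winner will rarely be this clear-cut.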

Fitting and predicting

Once transformed, treat the new variable exactly like ordinary data: fit the least-squares line on CAS. The fitted equation is written in terms of the transformed variable, for example

$$\hat{y} = a + b x^2 \qquad \text{or} \qquad \widehat{\log y} = a + b x.$$

To predict, substitute the value into the transformed equation, then undo any transformation on the response variable. If you transformed $y$ to $\log y$, you must take the antilog (raise $10$ to the power of the predicted value) at the end to return to the original units.
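The fit-then-back-transform routine can be sketched as follows for a $\log y$ transformation. The data are made up so that $\log_{10} y = 1 + 0.5x$ holds exactly, and the helper name `least_squares` and the prediction point $x = 3$ are assumptions for this example.

```python
import math


def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Made-up data (assumption): y grows tenfold for every 2 units of x.
xs = [0, 2, 4, 6]
ys = [10, 100, 1000, 10000]

# Step 1: transform the response variable to log10(y).
log_ys = [math.log10(y) for y in ys]

# Step 2: fit the least-squares line log10(y) = a + b*x.
a, b = least_squares(xs, log_ys)

# Step 3: substitute x into the transformed equation.
x_new = 3
predicted_log_y = a + b * x_new

# Step 4: undo the transformation with the antilog (10 to the power).
predicted_y = 10 ** predicted_log_y

print(f"log10(y) = {a:.2f} + {b:.2f}x")
print(f"predicted log10(y) at x = {x_new}: {predicted_log_y:.2f}")
print(f"predicted y at x = {x_new}: {predicted_y:.1f}")
```

Stopping at step 3 is the common mistake: the value $2.5$ is a prediction for $\log y$, not for $y$. Only after step 4 is the answer (about $316$) back in the original units.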