Why does a strong association not prove that one variable causes the other?

Distinguish association from causation, identify confounding and coincidence, and place bivariate analysis within the statistical investigation process.

How to separate association from causation, explain confounding and coincidental correlation, and work through the four-step statistical investigation process for bivariate data.

Generated by Claude Opus 4.7 (6 min answer)

Reviewed by: AI editorial process; not yet individually human-reviewed

What this dot point is asking

You must explain why association is not causation, name the alternatives (confounding and coincidence), and describe where bivariate analysis sits in the statistical investigation process.

Association versus causation

Association means the values of two variables tend to change together; for a linear relationship, its strength and direction are measured by the correlation coefficient r. Causation means that changing one variable actually produces a change in the other. A correlation is consistent with causation but does not prove it.

The three explanations for a correlation

When two variables correlate, there are three possible explanations, and your answer should consider all three.

  • Causation. One variable genuinely affects the other.
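  • Confounding. A lurking third variable drives both. For example, ice-cream sales and drownings can rise together because hot weather increases both.
  • Coincidence. The correlation arose by chance, which is more likely when the sample is small.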

Only a properly designed experiment that controls other variables can establish causation; observational bivariate data alone cannot.
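The three explanations above can be illustrated with a small simulation. This is only a sketch on invented data: the lurking variable z, the noise levels, the sample sizes and the 0.7 cut-off are all assumptions chosen for illustration, not values from the syllabus.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(1)  # fixed seed so repeated runs give the same numbers

# Confounding: a hypothetical lurking variable z drives both x and y.
# Note that x never appears in the rule that generates y.
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]
x_observed = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]
r_confounded = pearson_r(x_observed, y)

# Experiment: x is assigned at random, so it cannot depend on z.
x_assigned = [rng.gauss(0, 1) for _ in range(n)]
r_experiment = pearson_r(x_assigned, y)


def share_of_strong_r(sample_size, trials=2000, cutoff=0.7):
    """Fraction of samples of two unrelated variables with |r| above cutoff."""
    strong = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(sample_size)]
        ys = [rng.gauss(0, 1) for _ in range(sample_size)]
        if abs(pearson_r(xs, ys)) > cutoff:
            strong += 1
    return strong / trials


# Coincidence: unrelated variables, small samples versus larger samples.
share_small = share_of_strong_r(5)
share_large = share_of_strong_r(50)

print(f"observational r (confounded): {r_confounded:.2f}")
print(f"r after random assignment:    {r_experiment:.2f}")
print(f"share of |r| > 0.7 with n=5:  {share_small:.3f}")
print(f"share of |r| > 0.7 with n=50: {share_large:.3f}")
```

In this sketch the observational correlation is strong even though x has no effect on y, and it disappears once x is assigned at random. Strong chance correlations between unrelated variables turn up regularly with five data points but almost never with fifty.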

The statistical investigation process

Bivariate analysis is taught inside the four-step statistical investigation process, which frames every data question.

  • Pose a question about a possible relationship between two variables.
  • Collect appropriate data, deciding which variable is explanatory and which is the response.
  • Analyse the data with a scatterplot, the correlation coefficient, the least-squares line and a residual plot.
  • Interpret the results in context and communicate them, stating limitations such as extrapolation and the association-causation distinction.
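The analyse step can be sketched in a few lines of code. The data set below (hours studied and test marks) is invented for illustration, and the variable names are assumptions; in class the same quantities would normally come from a calculator or spreadsheet.

```python
# Hypothetical data: hours studied (explanatory) and test mark (response).
hours = [1, 2, 3, 4, 5, 6, 7, 8]
marks = [52, 55, 61, 60, 68, 71, 75, 78]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(marks) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in hours)
syy = sum((y - mean_y) ** 2 for y in marks)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, marks))

# Least-squares line: predicted mark = intercept + slope * hours.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Correlation coefficient.
r = sxy / (sxx * syy) ** 0.5

# Residual = observed value minus predicted value.
# Plotting these against hours gives the residual plot.
residuals = [y - (intercept + slope * x) for x, y in zip(hours, marks)]

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, r = {r:.3f}")
for x, e in zip(hours, residuals):
    print(f"hours = {x}: residual = {e:+.2f}")
```

For this invented data the slope is about 3.81 marks per hour and r is about 0.99. A residual plot with no clear pattern would support using a linear model; a curved pattern would suggest the line is not appropriate.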

Communicating limitations

A complete interpretation reports the strength and direction of the association, the proportion of variation in the response variable explained by the explanatory variable ($r^2$), the reliability of any prediction (interpolation versus extrapolation), and a clear statement that association does not establish causation.
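As a rough sketch of how these checks fit together, the code below labels a prediction as interpolation or extrapolation and converts r into a percentage of variation explained. The fitted line, the value of r and the data range are hypothetical numbers chosen for illustration, and describe_prediction is a made-up helper name.

```python
def describe_prediction(x_new, x_min, x_max, slope, intercept):
    """Return the predicted response and whether x_new lies inside the data range."""
    predicted = intercept + slope * x_new
    kind = "interpolation" if x_min <= x_new <= x_max else "extrapolation"
    return predicted, kind


# Hypothetical results from an earlier analysis (assumed values).
r = 0.988
slope, intercept = 3.81, 47.86
x_min, x_max = 1, 8  # range of the explanatory variable in the data

direction = "positive" if r > 0 else "negative"
percent_explained = 100 * r ** 2

print(f"Direction of association: {direction}")
print(f"Variation in the response explained: {percent_explained:.1f}%")

for x_new in (5, 12):
    predicted, kind = describe_prediction(x_new, x_min, x_max, slope, intercept)
    print(f"x = {x_new}: predicted {predicted:.1f} ({kind})")

print("Note: association alone does not establish causation.")
```

A prediction flagged as extrapolation should be reported as unreliable, because nothing in the data shows that the linear trend continues outside the observed range.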