Why does a strong association not prove that one variable causes the other?
Distinguish association from causation, identify confounding and coincidence, and place bivariate analysis within the statistical investigation process.
How to separate association from causation, explain confounding and coincidental correlation, and work through the four-step statistical investigation process for bivariate data.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
You must explain why association is not causation, name the alternatives (confounding and coincidence), and describe where bivariate analysis sits in the statistical investigation process.
Association versus causation
Association means the values of two variables tend to change together; for a linear relationship, the strength and direction of that tendency are measured by the correlation coefficient r. Causation means changing one variable actually produces a change in the other. A strong correlation is evidence consistent with causation, but it is not proof of it.
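To make "measured by the correlation coefficient" concrete, here is a minimal sketch of how Pearson's r can be computed from paired data. The function name `pearson_r` and the study-hours data are hypothetical, made up purely for illustration; in practice a calculator or statistics package would normally do this step.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours of study and test score for six students.
hours = [1, 2, 3, 4, 5, 6]
score = [52, 55, 61, 64, 70, 71]

r = pearson_r(hours, score)
print(f"r = {r:.3f}")
```

A value of r close to +1 here describes a strong positive linear association in this made-up sample. It says nothing about whether extra study caused the higher scores.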
The three explanations for a correlation
When two variables correlate, there are three possible explanations, and your answer should consider all three.
- Causation. One variable genuinely affects the other.
- Confounding. A lurking third variable drives both.
- Coincidence. The correlation arose by chance, especially with a small sample.
Only a properly designed experiment that controls other variables can establish causation; observational bivariate data alone cannot.
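The confounding and coincidence explanations can both be illustrated with a small simulation. The sketch below is an assumption-laden toy, not real data: the variable roles (temperature, ice-cream sales, swimming accidents), the noise levels, the sample sizes and the 0.8 cutoff are all invented for illustration. The partial correlation used to "hold z fixed" is a standard formula, but it goes beyond what this dot point requires.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(1)  # fixed seed so the run is repeatable

# --- Confounding -----------------------------------------------------
# z (say, daily temperature) drives both x and y.
# x and y have no direct effect on each other.
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 0.5) for zi in z]  # e.g. ice-cream sales
y = [zi + rng.gauss(0, 0.5) for zi in z]  # e.g. swimming accidents

r_xy = pearson_r(x, y)
r_xz = pearson_r(x, z)
r_yz = pearson_r(y, z)

# Partial correlation: association between x and y with z held fixed.
r_xy_given_z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

print(f"r(x, y)           = {r_xy:.2f}")
print(f"r(x, y) given z   = {r_xy_given_z:.2f}")

# --- Coincidence -----------------------------------------------------
def share_of_strong_r(sample_size, trials=2000, cutoff=0.8):
    """Fraction of trials where two unrelated samples give |r| > cutoff."""
    strong = 0
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(sample_size)]
        b = [rng.gauss(0, 1) for _ in range(sample_size)]
        if abs(pearson_r(a, b)) > cutoff:
            strong += 1
    return strong / trials

small = share_of_strong_r(5)
large = share_of_strong_r(50)
print(f"share of |r| > 0.8 with n = 5:  {small:.3f}")
print(f"share of |r| > 0.8 with n = 50: {large:.3f}")
```

In the first part, x and y should show a strong correlation even though neither affects the other, and that correlation should shrink towards zero once z is accounted for. In the second part, unrelated variables should produce a "strong" r noticeably more often with 5 points than with 50, which is why a small sample makes coincidence a more plausible explanation.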
The statistical investigation process
Bivariate analysis is taught inside the four-step statistical investigation process, which frames every data question.
- Pose a question about a possible relationship between two variables.
- Collect appropriate data, deciding which variable is explanatory and which is the response.
- Analyse the data with a scatterplot, the correlation coefficient, the least-squares line and a residual plot.
- Interpret the results in context and communicate them, stating limitations such as extrapolation and the association-causation distinction.
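The Analyse step can be sketched in a few lines. The example below fits a least-squares line and lists the residuals that a residual plot would display. The helper name `least_squares` and the study-hours data are hypothetical; the sketch assumes hours is the explanatory variable and score is the response.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # line passes through the mean point
    return slope, intercept

# Hypothetical data: explanatory variable (hours) and response (score).
hours = [1, 2, 3, 4, 5, 6]
score = [52, 55, 61, 64, 70, 71]

slope, intercept = least_squares(hours, score)
print(f"score = {intercept:.2f} + {slope:.2f} * hours")

# Residual = actual value - predicted value, one per data point.
predicted = [intercept + slope * x for x in hours]
residuals = [y - p for y, p in zip(score, predicted)]
for x, e in zip(hours, residuals):
    print(f"hours = {x}: residual = {e:+.2f}")
```

Residuals scattered above and below zero with no clear pattern support a linear model, while a curved pattern suggests the relationship is not linear.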
Communicating limitations
A complete interpretation reports the strength and direction of the association, the proportion of variation in the response variable explained by the explanatory variable (the coefficient of determination, r²), the reliability of any prediction (interpolation versus extrapolation), and a clear statement that association does not establish causation.
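Two of these reporting items reduce to simple checks, sketched below. The function names `percent_explained` and `prediction_type` are hypothetical, as are the value r = 0.9 and the data range. The sketch assumes a prediction counts as interpolation when the explanatory value lies within the observed range of the data.

```python
def percent_explained(r):
    """Percentage of variation in the response explained, from r (r squared x 100)."""
    return 100 * r ** 2

def prediction_type(x_new, observed_xs):
    """Classify a prediction by whether x_new lies inside the observed range."""
    lowest, highest = min(observed_xs), max(observed_xs)
    if lowest <= x_new <= highest:
        return "interpolation"
    return "extrapolation"

# Hypothetical values for illustration only.
r = 0.9
hours = [1, 2, 3, 4, 5, 6]

print(f"{percent_explained(r):.0f}% of the variation in the response is explained")
print(prediction_type(3.5, hours))  # inside the data range
print(prediction_type(10, hours))   # outside the data range
```

A prediction flagged as extrapolation should be reported as unreliable, because there is no data to show that the linear trend continues beyond the observed range.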