How do we put a number on the strength and direction of a linear association?
Calculate and interpret Pearson's correlation coefficient r and the coefficient of determination r squared, and state their limitations.
How to calculate Pearson's r with technology, read its sign and size, convert to the coefficient of determination, interpret the proportion of variation explained, and respect the limits of both measures.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
You must compute $r$ (almost always with technology), describe what its sign and size mean, find and interpret $r^2$, and know when neither number should be trusted.
Reading Pearson's r
The correlation coefficient satisfies $-1 \le r \le 1$.
- The sign matches the direction: positive for a positive association, negative for a negative one.
- The size (distance from zero) matches the strength: values near $-1$ or $1$ are strong, values near $0$ are weak.
In practice you read $r$ from your calculator's regression output after entering the two lists; you are rarely asked to compute it by hand.
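To see what the technology is doing behind the scenes, here is a minimal Python sketch that computes $r$ from two lists using the standard sums of deviations. The function name `pearson_r` and the data (hours studied and test marks) are invented for illustration; a calculator's regression output reports the same quantity.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied (explanatory) and test mark (response)
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 70, 72]

r = pearson_r(hours, marks)
print(round(r, 3))
```

For this made-up data set the result is positive and close to 1, which you would describe as a strong positive linear association.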
The coefficient of determination
The coefficient of determination is $r^2$. It is the fraction (or percentage) of the variation in the response variable that is explained by the linear relationship with the explanatory variable.
Because it is squared, $r^2$ is always between $0$ and $1$ and loses the sign. You quote the direction from $r$ and the explained proportion from $r^2$.
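The conversion from the correlation coefficient to the coefficient of determination can be sketched in a few lines of Python. The value of `r` below is a hypothetical calculator reading, not a result from any real data set, and the wording of the printed sentence is only one acceptable way to phrase the interpretation.

```python
r = -0.8  # hypothetical value read from a calculator's regression output

# Squaring removes the sign, so the direction must be read from r itself
r_squared = r ** 2
percent_explained = 100 * r_squared

direction = "positive" if r > 0 else "negative" if r < 0 else "no linear"

print(f"Direction of association: {direction}")
print(
    f"r^2 = {r_squared:.2f}, so about {percent_explained:.0f}% of the variation "
    "in the response variable is explained by the linear relationship "
    "with the explanatory variable."
)
```

Note that a correlation of 0.8 with the opposite sign would give the same coefficient of determination, which is why the direction is always quoted from the correlation coefficient.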
Limitations of both measures
Both $r$ and $r^2$ describe only linear association and only over the range of the data.
- A curved relationship can give a small $r$ even though the variables are strongly related; the relationship is just not linear.
- An outlier can pull $r$ towards or away from zero, so always check the scatterplot.
- Neither number proves causation. A strong $r$ between two variables can be produced by a lurking third variable.
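The first two limitations can be demonstrated with a short Python sketch. Both data sets are invented for illustration, and `pearson_r` is a hypothetical helper written here rather than a library function. The first set follows a perfect curve yet has a correlation of zero; the second shows a single outlier turning a weak correlation into a strong one.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Curved relationship: y is exactly x squared, so the variables are
# perfectly related, but not linearly
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [x ** 2 for x in curve_x]
r_curve = pearson_r(curve_x, curve_y)

# Weakly associated points, then the same points plus one outlier at (15, 15)
base_x = [1, 2, 3, 4, 5]
base_y = [3, 1, 4, 2, 3]
r_without = pearson_r(base_x, base_y)
r_with = pearson_r(base_x + [15], base_y + [15])

print(f"Curved data:      r = {r_curve:.2f}")
print(f"Without outlier:  r = {r_without:.2f}")
print(f"With outlier:     r = {r_with:.2f}")
```

In both cases a scatterplot would reveal the problem immediately, which is the reason to plot the data before trusting either number.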