Topic 3: Statistical inference
Understand the distribution of the sample mean, apply the central limit theorem to describe its shape, mean and standard deviation, and use these to compute probabilities for sample means drawn from a population
A focused answer to the QCE Specialist Mathematics Unit 4 dot point on the sampling distribution of the mean. Covers the mean and standard error of the sample mean, the central limit theorem, standardising to compute probabilities, and how sample size affects spread, with a verified worked example and the standard-error mistake QCAA markers watch for.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
QCAA wants you to understand that the sample mean is itself a random variable with its own distribution, to state and apply the central limit theorem, and to use the resulting normal model to compute probabilities about sample means. This is the foundation of statistical inference, assessed in IA3 and the external assessment, and it precedes confidence-interval work.
The answer
The sample mean is random
If you take a random sample of size $n$ from a population and compute its mean $\bar{X}$, a different sample gives a different mean. So $\bar{X}$ is a random variable. Its distribution is called the sampling distribution of the mean, and it has its own mean and standard deviation.
Mean and standard error
If the population has mean $\mu$ and standard deviation $\sigma$, then for a sample of size $n$:
$$E(\bar{X}) = \mu, \qquad \mathrm{sd}(\bar{X}) = \frac{\sigma}{\sqrt{n}}$$
The standard deviation of the sample mean, $\sigma/\sqrt{n}$, is called the standard error. It is smaller than the population standard deviation whenever $n > 1$, and it shrinks as $n$ grows: larger samples give more reliable estimates of $\mu$. Crucially, the divisor is $\sqrt{n}$, not $n$.
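The standard error formula can be checked with a short simulation. The sketch below is illustrative only: the population values (mean 50, standard deviation 10), the sample size of 25 and the helper name `standard_error` are assumptions chosen for the demonstration, not part of the syllabus.

```python
import math
import random
import statistics

def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Hypothetical normal population (illustrative values only).
MU, SIGMA, N = 50.0, 10.0, 25

rng = random.Random(0)  # fixed seed so the run is repeatable

# Draw 5000 samples of size N and record the mean of each one.
sample_means = [
    statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(5000)
]

print("theoretical standard error:", standard_error(SIGMA, N))  # 10 / sqrt(25) = 2.0
print("simulated sd of sample means:", statistics.stdev(sample_means))
print("simulated mean of sample means:", statistics.fmean(sample_means))
```

The simulated standard deviation of the sample means should land close to 2, not close to the population value of 10 and not close to 10/25 = 0.4, which is what dividing by $n$ instead of $\sqrt{n}$ would give.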
The central limit theorem
The central limit theorem states that for a sufficiently large sample size $n$, the distribution of the sample mean is approximately normal,
$$\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right) \quad \text{(approximately)},$$
regardless of the shape of the original population. If the population is already normal, $\bar{X}$ is exactly normal for any $n$. A common rule of thumb takes $n \geq 30$ as large enough for the approximation when the population is not too skewed.
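To see the theorem at work on a clearly non-normal population, the following sketch draws samples from an exponential distribution (strongly right-skewed) and checks how often the sample mean falls within one standard error of the population mean. The choice of population, the sample size of 40 and the number of trials are all assumptions made for illustration.

```python
import math
import random
import statistics

rng = random.Random(1)   # fixed seed so the run is repeatable
N = 40                   # sample size (illustrative)
TRIALS = 5000            # number of repeated samples
MU = SIGMA = 1.0         # an exponential population with rate 1 has mean 1 and sd 1

se = SIGMA / math.sqrt(N)

# Mean of each of TRIALS samples drawn from the skewed population.
means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(N))
    for _ in range(TRIALS)
]

# Proportion of sample means within one standard error of mu.
within_one_se = sum(abs(m - MU) <= se for m in means) / TRIALS

# What a normal model predicts for the same interval (about 0.683).
z = statistics.NormalDist()
normal_prediction = z.cdf(1) - z.cdf(-1)

print("simulated proportion:", within_one_se)
print("normal prediction:   ", normal_prediction)
```

If the normal approximation is reasonable, the two printed proportions should be close, even though individual observations from this population are far from normally distributed.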
Standardising to find probabilities
To compute a probability for $\bar{X}$, standardise using the standard error:
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
This has the standard normal distribution $N(0, 1)$, so probabilities follow from the normal model. The only change from a single-observation calculation is dividing by $\sigma/\sqrt{n}$ rather than $\sigma$.
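The standardising step can be sketched in a few lines of Python using the standard library's `statistics.NormalDist`. The function name `prob_mean_greater` and the numbers in the example (population mean 100, standard deviation 15, sample size 36) are hypothetical and serve only to show the mechanics.

```python
import math
from statistics import NormalDist

def prob_mean_greater(x: float, mu: float, sigma: float, n: int) -> float:
    """P(X-bar > x) under the normal model for the sample mean."""
    se = sigma / math.sqrt(n)        # standard error, not sigma itself
    z = (x - mu) / se                # standardise with the standard error
    return 1 - NormalDist().cdf(z)   # upper-tail area of the standard normal

# Hypothetical example: mu = 100, sigma = 15, n = 36, so se = 2.5 and z = 1.6.
p = prob_mean_greater(104, mu=100, sigma=15, n=36)
print(round(p, 4))  # about 0.055
```

Replacing `se` with `sigma` in the second line of the function would reproduce the single-observation calculation, which is exactly the error to avoid when the question is about a sample mean.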
Effect of sample size
Because the standard error is $\sigma/\sqrt{n}$, quadrupling the sample size halves the spread of $\bar{X}$. This is why larger samples produce tighter estimates and narrower confidence intervals: the sampling distribution concentrates around $\mu$.
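A quick numerical check of the quadrupling rule, assuming an illustrative population standard deviation of 10:

```python
import math

SIGMA = 10.0  # hypothetical population standard deviation

# Each sample size is four times the previous one.
sample_sizes = [25, 100, 400]
errors = [SIGMA / math.sqrt(n) for n in sample_sizes]

for n, se in zip(sample_sizes, errors):
    print(f"n = {n:>3}: standard error = {se}")
# n =  25: standard error = 2.0
# n = 100: standard error = 1.0
# n = 400: standard error = 0.5
```

Each fourfold increase in $n$ halves the standard error, so halving the spread twice takes sixteen times as much data.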
Exam-style practice questions
Practice questions written in the style of QCAA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
2024 QCAA, 5 marks
The height of Year 12 students at a school is normally distributed, with a mean height of 168.6 cm and standard deviation of 12.7 cm. The heights of a random sample of 20 of these students are recorded.
a) Explain why it can be assumed that the sample means for random samples of the heights of students from this school are normally distributed.
b) Determine the probability that the mean height of this sample will be greater than 170 cm.
There is a 75% probability that the mean height of this sample will lie within +/- h cm of the population mean.
c) Determine P(X-bar >= 168.6 + h).
d) Use your result from c) to determine the value of h.
Worked answer
The sampling distribution has mean mu = 168.6 and standard error sigma/sqrt(n) = 12.7/sqrt(20) = 2.840 cm.
a) [1 mark] Because the underlying population of heights is itself normally distributed, the distribution of the sample mean is normal for any sample size (it does not need to rely on a large n via the central limit theorem here).
b) [2 marks] Standardise: z = (170 - 168.6) / 2.840 = 0.493. P(X-bar > 170) = P(Z > 0.493) = 0.31.
c) [1 mark] The central 75% lies within +/- h, so the two tails together carry 25%, i.e. 12.5% in each tail. Hence P(X-bar >= 168.6 + h) = 0.125.
d) [1 mark] The upper critical z for a 0.125 upper tail is z = 1.150. So h = z * standard error = 1.150 * 2.840 = 3.27 cm.
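The arithmetic in this worked answer can be re-checked with the standard library. This is only a verification sketch using the values given in the question; the variable names are arbitrary.

```python
import math
from statistics import NormalDist

MU, SIGMA, N = 168.6, 12.7, 20        # values from the question
se = SIGMA / math.sqrt(N)             # standard error of the sample mean
xbar = NormalDist(mu=MU, sigma=se)    # model for X-bar (exact, since heights are normal)

p_b = 1 - xbar.cdf(170)               # part b: P(X-bar > 170)
p_c = (1 - 0.75) / 2                  # part c: one tail of the remaining 25%
h = xbar.inv_cdf(1 - p_c) - MU        # part d: distance from mu to the 87.5th percentile

print(f"se = {se:.3f}, P(X-bar > 170) = {p_b:.3f}, tail = {p_c}, h = {h:.2f}")
```

The printed values should agree with the hand calculation to the stated rounding.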
2023 QCAA, 7 marks
The travel time for students attending a certain university is assumed to be normally distributed, with a population mean of 25.2 minutes and standard deviation of 4.7 minutes. Travel times are collected from a random sample of 120 of these students and used to calculate a sample mean, X-bar, in minutes.
a) Determine P(24.5 <= X-bar <= 25.9).
b) Given P(X-bar <= k) = 0.8, determine the value of k.
Travel times are collected from a second random sample and used to calculate a second sample mean.
c) Given P(X-bar <= 24.6) = 0.05, determine the number of students in the second sample.
Worked answer
For the first sample, standard error = 4.7 / sqrt(120) = 0.4291 minutes; mean = 25.2.
a) [2 marks] z-values: (24.5 - 25.2)/0.4291 = -1.631 and (25.9 - 25.2)/0.4291 = 1.631. P(-1.631 <= Z <= 1.631) = 0.897.
b) [1 mark] For a lower area of 0.8, z = 0.8416. So k = 25.2 + 0.8416 * 0.4291 = 25.56 minutes.
c) [4 marks] For the second sample, P(X-bar <= 24.6) = 0.05 puts 24.6 at the 5th percentile, z = -1.645.
So 24.6 = 25.2 + (-1.645)(4.7/sqrt(n)), giving -0.6 = -1.645 * 4.7 / sqrt(n).
sqrt(n) = 1.645 * 4.7 / 0.6 = 12.886, so n = 12.886^2 = 166.05 (2 d.p.). The sample size must be a whole number, so n = 166.
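As with the previous question, the results can be re-checked in code. The sketch below uses only the values from the question; part c rearranges 24.6 = mu + z * sigma / sqrt(n) for n, and the variable names are arbitrary.

```python
import math
from statistics import NormalDist

MU, SIGMA = 25.2, 4.7                      # values from the question
xbar = NormalDist(mu=MU, sigma=SIGMA / math.sqrt(120))

p_a = xbar.cdf(25.9) - xbar.cdf(24.5)      # part a: central probability
k = xbar.inv_cdf(0.8)                      # part b: 80th percentile of X-bar

z = NormalDist().inv_cdf(0.05)             # part c: 5th percentile of Z (negative)
n_exact = (z * SIGMA / (24.6 - MU)) ** 2   # solve for n before rounding
n = round(n_exact)                         # sample size must be a whole number

print(f"P = {p_a:.3f}, k = {k:.2f}, n_exact = {n_exact:.2f}, n = {n}")
```

Because the code uses an unrounded z-value rather than 1.645, `n_exact` differs slightly in the second decimal place from the hand calculation, but it rounds to the same whole number.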