
How is the sample mean distributed across repeated samples, and why does the central limit theorem make it approximately normal?

The distribution of the sample mean $\bar{X}$ as a random variable, its mean and standard deviation (the standard error), the effect of sample size, and the central limit theorem giving the approximate normality of $\bar{X}$ for large samples

A focused answer to the VCE Specialist Mathematics Unit 4 key-knowledge point on the distribution of the sample mean: its mean and standard error, the effect of sample size, and the central limit theorem, with a verified worked example.

Generated by Claude Opus 4.7. 6 min answer.

Reviewed by: AI editorial process; not yet individually human-reviewed


Jump to a section
  1. What this dot point is asking
  2. The sample mean is a random variable
  3. Mean and standard error
  4. The effect of sample size
  5. The central limit theorem
  6. Examples in context
  7. Try this

What this dot point is asking

VCAA wants you to understand that the sample mean $\bar{X}$ is itself a random variable with its own distribution, to know its mean and standard deviation (the standard error), to see how increasing the sample size sharpens that distribution, and to state and use the central limit theorem that makes $\bar{X}$ approximately normal for large samples. This underpins confidence intervals and hypothesis testing.

The sample mean is a random variable

Take a random sample $X_1, X_2, \dots, X_n$ of size $n$ from a population. The sample mean

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

depends on which sample is drawn, so it varies from sample to sample: it is a random variable with its own distribution, called the sampling distribution of the mean.
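To see this variation concretely, here is a minimal Python sketch. The population (rolls of a fair six-sided die), the sample size of 10 and the helper name `sample_mean` are illustrative assumptions, not part of the dot point.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable


def sample_mean(n):
    """Draw n independent rolls of a fair die and return their mean."""
    return statistics.fmean(random.randint(1, 6) for _ in range(n))


# Five separate samples of size 10 give five values of the sample mean.
means = [sample_mean(10) for _ in range(5)]
print(means)
```

Each run of `sample_mean(10)` uses a different sample, so the printed values differ from one another even though the population never changes.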

Mean and standard error

Suppose the population has mean $\mu$ and standard deviation $\sigma$, and the observations are independent. Using the rules for linear combinations of independent random variables:

$$E(\bar{X}) = \mu, \qquad \text{Var}(\bar{X}) = \frac{\sigma^2}{n}, \qquad \text{sd}(\bar{X}) = \frac{\sigma}{\sqrt{n}}.$$

The mean of $\bar{X}$ equals the population mean, so $\bar{X}$ is an unbiased estimator of $\mu$. Its standard deviation, called the standard error, is $\frac{\sigma}{\sqrt{n}}$, smaller than the population standard deviation by a factor of $\sqrt{n}$.
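For reference, the steps behind these results can be written out as follows. This is a sketch that assumes each observation has mean $\mu$ and variance $\sigma^2$; the variance line also relies on independence.

```latex
\begin{aligned}
E(\bar{X}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(X_i)
            = \frac{1}{n}\cdot n\mu = \mu, \\
\text{Var}(\bar{X}) &= \frac{1}{n^2}\sum_{i=1}^{n} \text{Var}(X_i)
            = \frac{1}{n^2}\cdot n\sigma^2 = \frac{\sigma^2}{n}, \\
\text{sd}(\bar{X}) &= \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}.
\end{aligned}
```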

The effect of sample size

Because the standard error is $\frac{\sigma}{\sqrt{n}}$, larger samples give a sample mean clustered more tightly around $\mu$. The shrinking is by $\frac{1}{\sqrt{n}}$, not $\frac{1}{n}$: to halve the standard error you must quadruple the sample size. This diminishing return is why very precise estimates need large samples.
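A short Python sketch makes the square-root scaling visible. The value sigma = 12 and the function name `standard_error` are chosen only for illustration.

```python
import math


def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)


sigma = 12.0  # hypothetical population standard deviation
for n in (25, 100, 400):
    # Each step quadruples n, so each standard error is half the previous one.
    print(n, standard_error(sigma, n))
```

Going from 25 to 400 multiplies the sample size by 16 but divides the standard error by only 4.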

The central limit theorem

The central limit theorem states that, for a sufficiently large sample size $n$, the distribution of the sample mean is approximately normal:

$$\bar{X} \approx N\!\left(\mu, \frac{\sigma^2}{n}\right),$$

regardless of the shape of the population distribution. So even if the population is skewed or discrete, the sample mean of a large sample behaves normally. This is what lets us use normal-based methods (confidence intervals, hypothesis tests) for the mean. If the population is itself normal, $\bar{X}$ is exactly normal for any $n$.
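The theorem can be checked informally by simulation. The sketch below is an illustration under stated assumptions: the skewed population is exponential with rate 1 (mean 1, standard deviation 1), and the sample size of 50 and 10 000 repetitions are arbitrary choices.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

n = 50             # size of each sample (illustrative choice)
trials = 10_000    # number of repeated samples (illustrative choice)
mu, sigma = 1.0, 1.0  # an exponential population with rate 1 has mean 1 and sd 1

# One sample mean per trial, each from a fresh sample of size n.
means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(trials)
]

se = sigma / math.sqrt(n)              # theoretical standard error
observed_mean = statistics.fmean(means)
observed_sd = statistics.stdev(means)

# For a normal distribution, about 68% of values lie within one sd of the mean.
within_one_se = sum(abs(m - mu) <= se for m in means) / trials

print(observed_mean, observed_sd, se, within_one_se)
```

The simulated sample means should centre near 1 with spread near the theoretical standard error, and the proportion within one standard error should be close to the normal value of about 0.68, even though the population itself is strongly skewed.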

Examples in context

Example 1. Quadrupling $n$ from $25$ to $100$ changes the standard error from $\frac{\sigma}{5}$ to $\frac{\sigma}{10}$, halving it.

Example 2. For a normal population $\bar{X}$ is exactly normal for any $n$, so no large-sample approximation is needed.

Try this

Q1. State the mean and standard error of $\bar{X}$ for $\mu = 20$, $\sigma = 4$, $n = 16$. [2 marks]

  • Cue. Mean $20$, standard error $\frac{4}{\sqrt{16}} = 1$.

Q2. By what factor must $n$ increase to halve the standard error? [1 mark]

  • Cue. A factor of $4$.

Q3. State the central limit theorem in one sentence. [2 marks]

  • Cue. For large $n$, $\bar{X}$ is approximately normal with mean $\mu$ and standard error $\frac{\sigma}{\sqrt{n}}$, whatever the population shape.

Exam-style practice questions

Practice questions written in the style of VCAA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

2025 VCAA, 1 mark

The volume of water, V mL, consumed by a student during a school day is normally distributed with a mean of 1000 mL and a standard deviation of 80 mL. Write down the mean and standard deviation of the sampling distribution for the average volume of water consumed by randomly selected samples of 25 students. Give your answers in millilitres.

For the sample mean X-bar of a sample of size n drawn from a population with mean mu and standard deviation sigma, the sampling distribution has mean mu and standard deviation (the standard error) sigma / sqrt(n).

Here mu = 1000, sigma = 80 and n = 25.

Mean of the sampling distribution: E(X-bar) = mu = 1000 mL.

Standard deviation of the sampling distribution: sigma / sqrt(n) = 80 / sqrt(25) = 80 / 5 = 16 mL.

So the average volume for samples of 25 has mean 1000 mL and standard deviation 16 mL.

2023 VCAA, 2 marks

The waiting time, Xw minutes, for a train is normally distributed with a mean of 8 minutes and a standard deviation of 3 minutes. The probability that, for 12 randomly chosen work days, the average waiting time is between 7 minutes 45 seconds and 8 minutes 30 seconds is equivalent to Pr(a < Z < b), where Z ~ N(0, 1). Find the values of a and b.

Because the population is normal, the sample mean of n = 12 waiting times is exactly normally distributed, with mean 8 and standard error sigma / sqrt(n) = 3 / sqrt(12) = sqrt(3) / 2 (about 0.8660).

Convert each bound to a z-score using z = (x - mu) / standard error. First write the times as decimals: 7 minutes 45 seconds = 7.75 minutes and 8 minutes 30 seconds = 8.5 minutes.

Lower bound: a = (7.75 - 8) / (sqrt(3) / 2) = (-0.25) / 0.8660 = -sqrt(3) / 6, which is about -0.289.

Upper bound: b = (8.5 - 8) / (sqrt(3) / 2) = 0.5 / 0.8660 = sqrt(3) / 3, which is about 0.577.

So a = -sqrt(3) / 6 (about -0.289) and b = sqrt(3) / 3 (about 0.577).
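As a numerical check of these bounds, here is a minimal sketch using Python's standard-library `statistics.NormalDist`. The variable names are illustrative, and the final probability is extra context that the question does not ask for.

```python
import math
from statistics import NormalDist

mu, sigma, n = 8.0, 3.0, 12
se = sigma / math.sqrt(n)   # standard error of the mean, sqrt(3) / 2

a = (7.75 - mu) / se        # z-score for 7 minutes 45 seconds
b = (8.5 - mu) / se         # z-score for 8 minutes 30 seconds

# Probability that a standard normal variable lies between a and b.
z = NormalDist()            # mean 0, standard deviation 1
prob = z.cdf(b) - z.cdf(a)

print(round(a, 3), round(b, 3))
```

The two printed values should agree with the exact forms -sqrt(3) / 6 and sqrt(3) / 3 found above.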

2023 VCAA, 1 mark

A company accountant knows that the amount owed on any individual unpaid invoice is normally distributed with a mean of $800 and a standard deviation of $200. What is the probability, correct to three decimal places, that in a random sample of 16 unpaid invoices the total amount owed is more than $13 500?

A. 0.087
B. 0.191
C. 0.413
D. 0.587
E. 0.809

The answer is B.

The total T of 16 independent invoices is normally distributed. Its mean is 16 × 800 = 12 800, and its variance is 16 × 200^2 = 640 000, so its standard deviation is sqrt(640 000) = 800.

Standardise the value 13 500: z = (13 500 - 12 800) / 800 = 700 / 800 = 0.875.

Then Pr(T > 13 500) = Pr(Z > 0.875) = 1 - 0.8092 = 0.191 (to three decimal places), which is option B.
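The same calculation can be reproduced with Python's standard-library `statistics.NormalDist`. This is a sketch for checking the arithmetic, with variable names chosen for illustration.

```python
import math
from statistics import NormalDist

n, mu, sigma = 16, 800.0, 200.0

# The total of n independent normal amounts is normal with
# mean n * mu and standard deviation sqrt(n) * sigma.
total = NormalDist(mu=n * mu, sigma=math.sqrt(n) * sigma)

p = 1 - total.cdf(13_500)   # Pr(T > 13 500)
print(round(p, 3))
```

Note that the standard deviation of a total grows by a factor of sqrt(n), whereas the standard deviation of a mean shrinks by the same factor.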