How do we test a claim about a population mean using a sample, and what does the p value tell us about the strength of the evidence?
Hypothesis testing for a population mean, the null and alternative hypotheses, one-tailed and two-tailed tests, the test statistic and its p value, the comparison with a significance level, the decision and its interpretation, and the meaning of Type I and Type II errors
A focused answer to the VCE Specialist Mathematics Unit 4 key-knowledge point on hypothesis testing for a mean. It covers null and alternative hypotheses, one-tailed and two-tailed tests, the test statistic and p value, the decision, and Type I and Type II errors, with a verified worked example.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
VCAA wants you to carry out a hypothesis test for a population mean: state the null and alternative hypotheses, choose a one-tailed or two-tailed test, compute the test statistic and its $p$ value, compare the $p$ value with the significance level, and state the decision and its meaning. You also need to understand Type I and Type II errors.
Setting up the hypotheses
The null hypothesis $H_0: \mu = \mu_0$ states the value being tested, usually a status quo or claimed value. The alternative hypothesis $H_1$ states what we suspect instead:
- two-tailed: $H_1: \mu \neq \mu_0$, when a difference in either direction matters;
- one-tailed: $H_1: \mu > \mu_0$ or $H_1: \mu < \mu_0$, when only one direction is of interest.
The choice of tails is made before seeing the data, from the question's wording.
The test statistic and p value
Assuming $H_0$ is true, the sample mean $\bar{X}$ is approximately normal with mean $\mu_0$ and standard error $\sigma/\sqrt{n}$. The standardised test statistic is
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}},$$
the number of standard errors between the observed sample mean and the hypothesised mean. The $p$ value is the probability, computed under $H_0$, of getting a result at least as extreme as the one observed:
- one-tailed: $p = \Pr(Z \geq z)$ or $p = \Pr(Z \leq z)$, in the direction of $H_1$;
- two-tailed: $p = 2\Pr(Z \geq |z|)$, doubling to count both tails.
A small $p$ value means the observed data would be unlikely if $H_0$ were true, which is evidence against $H_0$.
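As a rough illustration, the test statistic and the three kinds of p value can be computed with Python's standard library. This is a minimal sketch that assumes the population standard deviation is known; the function name `z_test` and its `tail` argument are made up for this example, and the numbers are borrowed from the koala practice question further down the page.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, tail):
    """Return (z, p) for a z-test of H0: mu = mu0 with known sigma.

    tail is "lower" (H1: mu < mu0), "upper" (H1: mu > mu0)
    or "two" (H1: mu != mu0). Hypothetical helper for illustration.
    """
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if tail == "lower":
        p = std_normal.cdf(z)
    elif tail == "upper":
        p = 1 - std_normal.cdf(z)
    elif tail == "two":
        # double the tail area beyond |z| to count both tails
        p = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'lower', 'upper' or 'two'")
    return z, p


# Koala question values: mu0 = 12, sigma = 1, n = 40, sample mean 11.6
z, p_lower = z_test(11.6, 12, 1, 40, "lower")
_, p_two = z_test(11.6, 12, 1, 40, "two")
print(round(z, 3), round(p_lower, 4), round(p_two, 4))
```

For the same data the two-tailed p value is twice the one-tailed value, which is why the choice of tails has to be fixed before looking at the sample.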
The decision
Compare $p$ with the chosen significance level $\alpha$ (commonly $\alpha = 0.05$):
- if $p < \alpha$: reject $H_0$ in favour of $H_1$ (the result is statistically significant);
- if $p \geq \alpha$: do not reject $H_0$ (insufficient evidence).
State the conclusion in context, not just "reject" or "not reject".
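The decision rule can be sketched as a small function. The name `decide` is hypothetical, and treating a p value exactly equal to the significance level as "do not reject" is an assumption of this sketch. The first call uses the p value and 1% level from the koala practice question on this page; the second uses a made-up p value of 0.03.

```python
def decide(p_value, alpha=0.05):
    """Compare a p value with the significance level and word the decision."""
    if p_value < alpha:
        return "reject H0: the result is statistically significant"
    return "do not reject H0: insufficient evidence against it"


print(decide(0.0057, alpha=0.01))  # koala question: p value below the 1% level
print(decide(0.03, alpha=0.01))    # hypothetical p value above the 1% level
```

The returned string is only the statistical decision; an exam answer still needs the conclusion stated in the context of the question.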
Type I and Type II errors
Because the decision is based on a sample, it can be wrong:
- a Type I error is rejecting $H_0$ when it is actually true; its probability is $\alpha$;
- a Type II error is failing to reject $H_0$ when it is actually false.
Lowering $\alpha$ reduces the Type I error rate but raises the chance of a Type II error.
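The trade-off between the two error types can be checked numerically. The sketch below computes the Type II error probability for a lower-tail test at two significance levels. It borrows the hypothesised mean 12, standard deviation 1 and sample size 40 from the koala practice question on this page; the "true" mean of 11.6 is purely an assumption for illustration, and `type_ii_error` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, mu0, true_mean, sigma, n):
    """Pr(fail to reject H0) for a lower-tail z-test when the mean is really true_mean."""
    se = sigma / sqrt(n)
    # H0 is rejected when the sample mean falls below this critical value.
    critical_mean = mu0 + NormalDist().inv_cdf(alpha) * se
    # Type II error: the sample mean lands at or above the critical value
    # even though the population mean is true_mean, not mu0.
    return 1 - NormalDist(true_mean, se).cdf(critical_mean)


for alpha in (0.05, 0.01):
    beta = type_ii_error(alpha, mu0=12, true_mean=11.6, sigma=1, n=40)
    print(f"alpha = {alpha}: Type II error probability = {beta:.3f}")
```

With everything else held fixed, the smaller significance level gives the larger Type II error probability, because the rejection region shrinks.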
Examples in context
Example 1. For a two-tailed test with , , so reject $H_0$.
Example 2. Setting $\alpha = 0.01$ instead of $\alpha = 0.05$ makes rejection harder, lowering the Type I error rate.
Try this
Q1. Write $H_0$ and a two-tailed $H_1$ for testing whether a mean equals . [2 marks]
- Cue. , .
Q2. Compute the test statistic for , , , . [2 marks]
- Cue. .
Q3. Define a Type I error. [1 mark]
- Cue. Rejecting $H_0$ when it is actually true.
Exam-style practice questions
Practice questions written in the style of VCAA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
2023 VCAA [1 mark]
It is thought that the mean mass of adult male koalas in a forest is 12 kg. The ranger thinks that the true mean mass is less than this and decides to apply a one-tailed statistical test. Write down the null hypothesis, H0, and the alternative hypothesis, H1, for the test.
Worked answer
The null hypothesis is the "no change" claim about the population mean, stated with equality. The alternative hypothesis reflects what the ranger suspects.
Since the ranger thinks the true mean is less than 12 kg, this is a one-tailed (lower-tail) test.
H0: mu = 12 (the mean mass is 12 kg).
H1: mu < 12 (the mean mass is less than 12 kg).
The mean stated in the problem always goes into H0; the direction of the inequality in H1 ("less than") is what makes the test one-tailed.
2023 VCAA [1 mark]
The ranger applies a one-tailed test at the 1% level of significance, assuming the mass of adult male koalas is normally distributed with a mean of 12 kg and a standard deviation of 1 kg. A random sample of 40 koalas gives a sample mean of 11.6 kg. Find the p value for the test correct to four decimal places.
Worked answer
Under H0 the sample mean X-bar is normally distributed with mean 12 and standard error sigma / sqrt(n) = 1 / sqrt(40) = 0.15811.
The test statistic is z = (x-bar - mu) / standard error = (11.6 - 12) / 0.15811 = -0.4 / 0.15811 = -2.530.
Because H1 is mu < 12, the p value is the lower-tail probability:
p = Pr(X-bar < 11.6) = Pr(Z < -2.530) = 0.0057 (to four decimal places).
Since 0.0057 is less than the 1% significance level (0.01), this would lead to rejecting H0.
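The arithmetic above can be checked without standardising, by working directly with the distribution of the sample mean under H0. This is a quick check using Python's standard library, not part of the exam method; the variable names are chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

# Distribution of the sample mean under H0: mu = 12, with sigma = 1 and n = 40.
sample_mean_dist = NormalDist(mu=12, sigma=1 / sqrt(40))

# Lower-tail probability, because H1 is mu < 12.
p_value = sample_mean_dist.cdf(11.6)

print(round(p_value, 4))  # 0.0057
print(p_value < 0.01)     # True, so H0 is rejected at the 1% level
```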
2025 VCAA [1 mark]
The volume of water dispensed into Apa bottles is normally distributed with a mean of 750 mL and a standard deviation of 5 mL. After a service, a random sample of 50 bottles gave a sample mean of 748 mL. The company claims the mean volume is now less than 750 mL. For a one-tailed test at the 1% level of significance, determine the p value for this test. Give your answer correct to four decimal places.
Worked answer
Under H0 (mu = 750), the sample mean X-bar is normal with mean 750 and standard error sigma / sqrt(n) = 5 / sqrt(50) = 0.70711.
The test statistic is z = (748 - 750) / 0.70711 = -2 / 0.70711 = -2.828.
The alternative hypothesis is mu < 750, so the p value is the lower-tail probability:
p = Pr(X-bar < 748) = Pr(Z < -2.828) = 0.0023 (to four decimal places).
As 0.0023 is less than 0.01, H0 is rejected: the evidence supports the company's claim that the mean volume is now less than 750 mL.
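The same answer can be reproduced through the standardised route, computing z first and then the lower-tail area under the standard normal curve. This is a short check using Python's standard library; the variable names are chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

# Values from the question: H0 mean, population sd, sample size, sample mean.
mu0, sigma, n, sample_mean = 750, 5, 50, 748

standard_error = sigma / sqrt(n)
z = (sample_mean - mu0) / standard_error

# Lower-tail probability, because H1 is mu < 750.
p_value = NormalDist().cdf(z)

print(round(standard_error, 5), round(z, 3), round(p_value, 4))
```

Keeping z unrounded until the final step avoids rounding drift in the fourth decimal place of the p value.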