

How do we test a claim about a population mean using a sample, and what does the p value tell us about the strength of the evidence?

Hypothesis testing for a population mean, the null and alternative hypotheses, one-tailed and two-tailed tests, the test statistic and its $p$ value, the comparison with a significance level, the decision and its interpretation, and the meaning of Type I and Type II errors

A focused answer to the VCE Specialist Mathematics Unit 4 key-knowledge point on hypothesis testing for a mean. It covers null and alternative hypotheses, one-tailed and two-tailed tests, the test statistic and $p$ value, the decision, and Type I and Type II errors, with a verified worked example.

Generated by Claude Opus 4.8, 6 min answer, updated 2026-05-29

Reviewed by: AI editorial process; not yet individually human-reviewed


Quick answer

A hypothesis test for a mean starts with a null hypothesis $H_0: \mu = \mu_0$ and an alternative $H_1$ (one-tailed $\mu > \mu_0$ or $\mu < \mu_0$, or two-tailed $\mu \ne \mu_0$). Assuming $H_0$, the test statistic $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ measures how many standard errors the sample mean is from $\mu_0$. The $p$ value is the probability, under $H_0$, of a result at least as extreme as observed (doubled for a two-tailed test). If $p$ is below the significance level $\alpha$ (often $0.05$), reject $H_0$; otherwise do not reject. A Type I error is rejecting a true $H_0$; a Type II error is failing to reject a false $H_0$.

Jump to a section

What this dot point is asking
Setting up the hypotheses
The test statistic and p value
The decision
Type I and Type II errors
Examples in context
Try this

What this dot point is asking

VCAA wants you to carry out a hypothesis test for a population mean: state the null and alternative hypotheses, choose a one-tailed or two-tailed test, compute the test statistic and its $p$ value, compare with the significance level, and state the decision and its meaning. You also need to understand Type I and Type II errors.

Setting up the hypotheses

The null hypothesis $H_0: \mu = \mu_0$ states the value being tested, usually a status quo or claimed value. The alternative hypothesis $H_1$ states what we suspect instead:

two-tailed: $H_1: \mu \ne \mu_0$, when a difference in either direction matters;
one-tailed: $H_1: \mu > \mu_0$ or $H_1: \mu < \mu_0$, when only one direction is of interest.

The choice of tails is made before seeing the data, from the question's wording.

The test statistic and p value

Assuming $H_0$ is true, the sample mean is approximately normal with mean $\mu_0$ and standard error $\frac{\sigma}{\sqrt{n}}$. The standardised test statistic is

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}},$$

the number of standard errors between the observed sample mean and the hypothesised mean. The $p$ value is the probability, computed under $H_0$, of getting a result at least as extreme as the one observed:

one-tailed: $p = P(Z > z)$ or $P(Z < z)$ in the direction of $H_1$;
two-tailed: $p = 2\,P(Z > |z|)$, doubling to count both tails.

A small $p$ value means the observed data would be unlikely if $H_0$ were true, which is evidence against $H_0$.
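The formulas above can be sketched in Python using only the standard library. This is a minimal illustration, not a prescribed method: the function names `z_statistic` and `p_value` and the tail labels `"upper"`, `"lower"` and `"two"` are assumptions made for this example, and the population standard deviation is assumed known.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(x_bar, mu_0, sigma, n):
    """Number of standard errors between the sample mean and mu_0."""
    standard_error = sigma / sqrt(n)
    return (x_bar - mu_0) / standard_error


def p_value(z, tail):
    """p value for a z statistic.

    tail is 'upper' (H1: mu > mu_0), 'lower' (H1: mu < mu_0)
    or 'two' (H1: mu != mu_0). These labels are hypothetical.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "upper":
        return 1 - std_normal.cdf(z)
    if tail == "lower":
        return std_normal.cdf(z)
    if tail == "two":
        # Double the tail area beyond |z| to count both tails.
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'upper', 'lower' or 'two'")
```

For instance, `p_value(z_statistic(27, 25, 6, 36), "two")` evaluates the two-tailed $p$ value for a sample mean two standard errors above $\mu_0$.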

The decision

Compare $p$ with the chosen significance level $\alpha$ (commonly $0.05$):

if $p < \alpha$: reject $H_0$ in favour of $H_1$ (the result is statistically significant);
if $p \ge \alpha$: do not reject $H_0$ (insufficient evidence).

State the conclusion in context, not just "reject" or "not reject".
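The decision rule is simple enough to write as a short function. The sketch below is illustrative only; the name `decide` and the returned strings are assumptions for this example. It makes the boundary case explicit: when $p$ equals $\alpha$ exactly, the rule above says do not reject.

```python
def decide(p, alpha=0.05):
    """Apply the rule: reject H0 only when p is strictly below alpha."""
    if not 0 <= p <= 1:
        raise ValueError("p must be a probability between 0 and 1")
    return "reject H0" if p < alpha else "do not reject H0"


print(decide(0.03))        # reject H0
print(decide(0.07))        # do not reject H0
print(decide(0.03, 0.01))  # do not reject H0
```

The function only returns the statistical decision; the conclusion still has to be written in the context of the question.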

Type I and Type II errors

Because the decision is based on a sample, it can be wrong:

a Type I error is rejecting $H_0$ when it is actually true; its probability is $\alpha$;
a Type II error is failing to reject $H_0$ when it is actually false.

For a fixed sample size, lowering $\alpha$ reduces the Type I error rate but raises the chance of a Type II error.
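The trade-off can be seen numerically. The sketch below assumes an upper-tailed test with known $\sigma$ and a specific true mean, because the Type II error probability depends on how far the true mean is from $\mu_0$. The function name `type_ii_probability` and the numbers used ($\mu_0 = 500$, true mean $504$, $\sigma = 8$, $n = 16$) are illustrative assumptions, not data from the syllabus.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_probability(mu_0, mu_true, sigma, n, alpha):
    """P(do not reject H0) for an upper-tailed z test when the mean is mu_true."""
    std_normal = NormalDist()
    # H0 is rejected when z exceeds this critical value.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the true mean, z is normal with this mean and standard deviation 1.
    shift = (mu_true - mu_0) / (sigma / sqrt(n))
    return std_normal.cdf(z_crit - shift)


beta_5_percent = type_ii_probability(500, 504, 8, 16, alpha=0.05)
beta_1_percent = type_ii_probability(500, 504, 8, 16, alpha=0.01)
print(f"alpha = 0.05: Type II probability = {beta_5_percent:.3f}")
print(f"alpha = 0.01: Type II probability = {beta_1_percent:.3f}")
```

With the sample size held fixed, the stricter level $\alpha = 0.01$ gives the larger Type II error probability.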

Worked example

A one-tailed test

A machine is set to fill bottles with a mean of $500\ \text{mL}$, with known standard deviation $\sigma = 8\ \text{mL}$. A sample of $n = 16$ bottles has mean $\bar{x} = 503\ \text{mL}$. Test at the $5\%$ level whether the machine overfills.

Step 1: State the hypotheses

Overfilling means a mean above $500$, so the alternative hypothesis points upward, giving a one-tailed upper test. The null hypothesis always states the claimed value with equality.

$$H_0: \mu = 500, \qquad H_1: \mu > 500 \ (\text{one-tailed, upper}).$$

Step 2: Compute the standard error

The standard error is the standard deviation of the sampling distribution of $\bar{x}$ under $H_0$. Dividing $\sigma$ by $\sqrt{n}$ accounts for the fact that a sample mean varies less than individual observations.

$$\frac{\sigma}{\sqrt{n}} = \frac{8}{\sqrt{16}} = \frac{8}{4} = 2\ \text{mL}.$$

Step 3: Compute the test statistic

The test statistic measures how many standard errors the observed sample mean lies from the hypothesised mean. A large positive value provides evidence against $H_0$ in the direction of $H_1$.

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{503 - 500}{2} = \frac{3}{2} = 1.5.$$

Step 4: Find the p value

Because $H_1$ is one-tailed upper, the $p$ value is the probability of observing a $z$-score of $1.5$ or greater under $H_0$. We use only the upper tail and do not double it.

$$p = P(Z > 1.5) = 1 - 0.9332 = 0.0668.$$

Step 5: State the decision

Compare $p$ with the significance level $\alpha = 0.05$ and interpret in the context of the problem. Since $p = 0.0668 > \alpha = 0.05$, we do not reject $H_0$. There is insufficient evidence at the $5\%$ level to conclude the machine overfills. Note that $z = 1.5$ gives a $p$ value just above $0.05$, so the result is close to but not statistically significant; had the test been two-tailed, $p$ would have been $2 \times 0.0668 = 0.1336$, also not significant.
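The five steps can be checked with a few lines of Python. This is a sketch using the standard library's `statistics.NormalDist` in place of normal tables or a CAS; the variable names are chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

# Values from the worked example.
mu_0, sigma, n, x_bar, alpha = 500, 8, 16, 503, 0.05

standard_error = sigma / sqrt(n)       # Step 2
z = (x_bar - mu_0) / standard_error    # Step 3
p = 1 - NormalDist().cdf(z)            # Step 4: upper tail only
decision = "reject H0" if p < alpha else "do not reject H0"  # Step 5

print(f"SE = {standard_error}, z = {z}, p = {p:.4f}, {decision}")
# SE = 2.0, z = 1.5, p = 0.0668, do not reject H0
```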

Examples in context

Example 1. For a two-tailed test with $z = 2.1$, $p = 2\,P(Z > 2.1) = 2(0.0179) = 0.0358 < 0.05$, so reject $H_0$.

Example 2. Setting $\alpha = 0.01$ instead of $0.05$ makes rejection harder, lowering the Type I error rate.

Try this

Q1. Write $H_0$ and a two-tailed $H_1$ for testing whether a mean equals $20$. [2 marks]

Cue. $H_0: \mu = 20$, $H_1: \mu \ne 20$.

Q2. Compute the test statistic for $\bar{x} = 27$, $\mu_0 = 25$, $\sigma = 6$, $n = 36$. [2 marks]

Cue. $z = \frac{27 - 25}{6/6} = \frac{2}{1} = 2$.

Q3. Define a Type I error. [1 mark]

Cue. Rejecting $H_0$ when it is actually true.

Exam-style practice questions

Practice questions written in the style of VCAA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

2023 VCAA. It is thought that the mean mass of adult male koalas in a forest is 12 kg. The ranger thinks that the true mean mass is less than this and decides to apply a one-tailed statistical test. Write down the null hypothesis, $H_0$, and the alternative hypothesis, $H_1$, for the test. [1 mark]

Worked answer

The null hypothesis is the "no change" claim about the population mean, stated with equality. The alternative hypothesis reflects what the ranger suspects.

Since the ranger thinks the true mean is less than 12 kg, this is a one-tailed (lower-tail) test.

$H_0: \mu = 12$ (the mean mass is 12 kg).

$H_1: \mu < 12$ (the mean mass is less than 12 kg).

The mean stated in the problem always goes into $H_0$; the direction of the inequality in $H_1$ ("less than") is what makes the test one-tailed.

2023 VCAA. The ranger applies a one-tailed test at the 1% level of significance, assuming the mass of adult male koalas is normally distributed with a mean of 12 kg and a standard deviation of 1 kg. A random sample of 40 koalas gives a sample mean of 11.6 kg. Find the $p$ value for the test correct to four decimal places. [1 mark]

Worked answer

Under $H_0$ the sample mean $\bar{X}$ is normally distributed with mean $12$ and standard error $\frac{\sigma}{\sqrt{n}} = \frac{1}{\sqrt{40}} \approx 0.15811$.

The test statistic is $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{11.6 - 12}{0.15811} = \frac{-0.4}{0.15811} \approx -2.530$.

Because $H_1$ is $\mu < 12$, the $p$ value is the lower-tail probability:

$$p = \Pr(\bar{X} < 11.6) = \Pr(Z < -2.530) \approx 0.0057 \ (\text{to four decimal places}).$$

Since $0.0057$ is less than the $1\%$ significance level ($0.01$), this would lead to rejecting $H_0$.

2025 VCAA. The volume of water dispensed into Apa bottles is normally distributed with a mean of 750 mL and a standard deviation of 5 mL. After a service, a random sample of 50 bottles gave a sample mean of 748 mL. The company claims the mean volume is now less than 750 mL. For a one-tailed test at the 1% level of significance, determine the $p$ value for this test. Give your answer correct to four decimal places. [1 mark]

Worked answer

Under $H_0$ ($\mu = 750$), the sample mean $\bar{X}$ is normal with mean $750$ and standard error $\frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{50}} \approx 0.70711$.

The test statistic is $z = \frac{748 - 750}{0.70711} = \frac{-2}{0.70711} \approx -2.828$.

The alternative hypothesis is $\mu < 750$, so the $p$ value is the lower-tail probability:

$$p = \Pr(\bar{X} < 748) = \Pr(Z < -2.828) \approx 0.0023 \ (\text{to four decimal places}).$$

As $0.0023 < 0.01$, we reject $H_0$: there is sufficient evidence at the $1\%$ level that the mean volume is now less than 750 mL, which supports the company's claim.
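Both lower-tail $p$ values in these practice questions can be reproduced with a short script. This is a sketch using Python's standard library rather than the CAS calculator used in the exam; the helper name `lower_tail_p` is an assumption for this example.

```python
from math import sqrt
from statistics import NormalDist


def lower_tail_p(x_bar, mu_0, sigma, n):
    """p value for H1: mu < mu_0 with known sigma."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    return NormalDist().cdf(z)


koala_p = lower_tail_p(11.6, 12, 1, 40)
bottle_p = lower_tail_p(748, 750, 5, 50)

print(f"koalas:  p = {koala_p:.4f}")   # koalas:  p = 0.0057
print(f"bottles: p = {bottle_p:.4f}")  # bottles: p = 0.0023
```

Carrying the unrounded $z$ value through to the final step, as the function does, avoids rounding error in the fourth decimal place.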