
How do the mean and variance of linear combinations of random variables behave, and how do we use a sample mean to test a hypothesis about a population mean?

Linear combinations of independent random variables and their mean and variance, the distribution of the sample mean $\bar{X}$, the construction of confidence intervals for a population mean, and hypothesis testing for the mean using a $p$ value

A focused answer to the VCE Specialist Mathematics Unit 4 key-knowledge point on linear combinations and statistical inference. Mean and variance of linear combinations, the distribution of the sample mean, confidence intervals, and hypothesis testing with p values, with a verified worked example.

Generated by Claude Opus 4.77 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this dot point is asking
  2. Mean and variance of linear combinations
  3. The distribution of the sample mean
  4. Confidence intervals for a population mean
  5. Hypothesis testing for a mean
  6. Why independence and squaring matter
  7. Examples in context
  8. Try this

What this dot point is asking

VCAA wants you to find the mean and variance of linear combinations of independent random variables, to use the resulting distribution of the sample mean $\bar{X}$, to construct a confidence interval for a population mean, and to carry out a hypothesis test for a mean by computing a test statistic and a $p$ value. This is the statistical inference strand of Specialist, distinct from the proportion-based inference in Mathematical Methods.

Mean and variance of linear combinations

For any random variables, expectation is linear:

$$E(aX + bY) = aE(X) + bE(Y),$$

whether or not $X$ and $Y$ are independent. Variance, however, behaves differently. For independent $X$ and $Y$,

$$\mathrm{Var}(aX + bY) = a^2\mathrm{Var}(X) + b^2\mathrm{Var}(Y).$$

The coefficients are squared, and the cross term vanishes because independence makes the covariance zero. Note that even for a difference, variances add: $\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$, since $(-1)^2 = 1$.
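The two rules can be checked numerically. The following is a minimal Python sketch, not a prescribed method: the helper name `linear_combination` and the values $E(X) = 3$, $\mathrm{Var}(X) = 4$, $E(Y) = 1$, $\mathrm{Var}(Y) = 9$ are hypothetical, chosen only for illustration, and the variance line is valid only under the independence assumption.

```python
from math import sqrt


def linear_combination(coeffs, means, variances):
    """Mean and variance of sum(a_i * X_i), assuming the X_i are independent."""
    # Expectation is linear: coefficients carry through unchanged.
    mean = sum(a * m for a, m in zip(coeffs, means))
    # Variance uses squared coefficients; no covariance terms because
    # independence is assumed.
    variance = sum(a**2 * v for a, v in zip(coeffs, variances))
    return mean, variance


# Hypothetical values: E(X) = 3, Var(X) = 4, E(Y) = 1, Var(Y) = 9.
# The difference X - Y has coefficients 1 and -1.
diff_mean, diff_var = linear_combination([1, -1], [3, 1], [4, 9])
print(diff_mean, diff_var, sqrt(diff_var))
```

With these values the mean of $X - Y$ is $3 - 1 = 2$, while the variance is $4 + 9 = 13$: the means subtract but the variances still add.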

The distribution of the sample mean

Take a random sample $X_1, X_2, \dots, X_n$ of independent observations from a population with mean $\mu$ and standard deviation $\sigma$. The sample mean is $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$. Applying the linear-combination rules:

$$E(\bar{X}) = \mu, \qquad \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}, \qquad \mathrm{sd}(\bar{X}) = \frac{\sigma}{\sqrt{n}}.$$

So the sample mean is unbiased for $\mu$, and its spread shrinks as $n$ grows. By the central limit theorem, for large $n$ the distribution of $\bar{X}$ is approximately normal, $\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$, regardless of the population's shape. The quantity $\frac{\sigma}{\sqrt{n}}$ is the standard error of the mean.
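A short simulation can make these results concrete. The sketch below is illustrative only: the exponential population (so $\mu = 1$, $\sigma = 1$), the sample size of 30, the 20,000 replicates and the seed are all assumptions chosen for the demonstration, not part of the syllabus.

```python
import random
from math import sqrt
from statistics import fmean, stdev

rng = random.Random(2024)  # fixed seed so the run is repeatable

n = 30               # sample size (hypothetical)
replicates = 20_000  # number of simulated samples (hypothetical)

# Population: exponential with rate 1, so mu = 1 and sigma = 1.
# It is strongly skewed, which makes it a useful test of the CLT.
mu, sigma = 1.0, 1.0

# Draw many samples of size n and record each sample mean.
sample_means = [
    fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(replicates)
]

observed_mean = fmean(sample_means)
observed_sd = stdev(sample_means)
standard_error = sigma / sqrt(n)

print(f"mean of sample means: {observed_mean:.3f} (theory {mu})")
print(f"sd of sample means:   {observed_sd:.3f} (theory {standard_error:.3f})")
```

The simulated mean and standard deviation of $\bar{X}$ should land close to $\mu$ and $\sigma/\sqrt{n}$, even though the underlying population is far from normal.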

Confidence intervals for a population mean

An approximate $C\%$ confidence interval for $\mu$, when $\sigma$ is known and $n$ is large, is

$$\bar{x} \pm z\,\frac{\sigma}{\sqrt{n}},$$

where $z$ is the standard normal value capturing the central $C\%$. For a 95% interval, $z \approx 1.96$. The interval is a range of plausible values for $\mu$; the confidence level refers to the long-run proportion of such intervals that would contain the true mean, not the probability that $\mu$ lies in this one fixed interval.
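As a rough sketch of the calculation, the interval can be computed with Python's standard-library `statistics.NormalDist`. The function name `mean_confidence_interval` is hypothetical; the inputs $\bar{x} = 50$, $\sigma = 10$, $n = 25$ are the values used in Example 2 on this page.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(x_bar, sigma, n, level=0.95):
    """Approximate confidence interval for mu, assuming sigma is known."""
    # z cuts off the central `level` of the standard normal,
    # i.e. it is the (0.5 + level / 2) quantile.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sigma / sqrt(n)  # z times the standard error
    return x_bar - margin, x_bar + margin


low, high = mean_confidence_interval(x_bar=50, sigma=10, n=25)
print(round(low, 2), round(high, 2))  # 46.08 53.92
```

Raising `level` to 0.99 widens the interval, because a larger $z$ is needed to capture more of the distribution.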

Hypothesis testing for a mean

A test of the mean compares a null hypothesis $H_0: \mu = \mu_0$ against an alternative. The steps:

  1. State $H_0: \mu = \mu_0$ and the alternative $H_1$ (one-sided, $\mu > \mu_0$ or $\mu < \mu_0$, or two-sided, $\mu \neq \mu_0$).
  2. Compute the test statistic $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, which measures how many standard errors the observed mean sits from $\mu_0$.
  3. Find the $p$ value: the probability, assuming $H_0$ is true, of observing a sample mean at least as extreme as $\bar{x}$.
  4. Compare the $p$ value with the significance level $\alpha$ (commonly $0.05$). If $p < \alpha$, reject $H_0$; otherwise do not reject it.
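The four steps above can be sketched in Python using the standard library. This is one possible layout under stated assumptions, not a required method: the function name `z_test`, the `alternative` labels, and the data ($\bar{x} = 104$, $\mu_0 = 100$, $\sigma = 12$, $n = 36$) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu0, sigma, n, alternative="two-sided"):
    """Return (z, p) for a test of H0: mu = mu0 with known sigma."""
    # Step 2: distance of the sample mean from mu0, in standard errors.
    z = (x_bar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    # Step 3: tail probability in the direction(s) named by H1.
    if alternative == "greater":      # H1: mu > mu0
        p = 1 - std_normal.cdf(z)
    elif alternative == "less":       # H1: mu < mu0
        p = std_normal.cdf(z)
    else:                             # H1: mu != mu0
        p = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p


# Hypothetical data: sample mean 104 from n = 36, known sigma = 12.
alpha = 0.05
z, p = z_test(x_bar=104, mu0=100, sigma=12, n=36)
# Step 4: compare p with alpha.
decision = "reject H0" if p < alpha else "do not reject H0"
print(round(z, 2), round(p, 4), decision)  # 2.0 0.0455 reject H0
```

The same data tested one-sided with `alternative="greater"` gives half the two-sided $p$ value, which is why the form of $H_1$ must be fixed before looking at the data.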

Why independence and squaring matter

The single most error-prone step is the variance rule. Means combine linearly with their coefficients, but variances combine with the squares of the coefficients, and only when the variables are independent. This is why doubling a measurement ($2X$) quadruples its variance, and why averaging $n$ independent values divides the variance by $n$ rather than by $\sqrt{n}$. Keeping variance and standard deviation distinct (one is the square of the other) prevents most slips in this strand.
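Written out, the averaging claim is a direct application of the variance rule with every coefficient equal to $\frac{1}{n}$. The derivation assumes the $X_i$ are independent and share the same variance $\sigma^2$:

```latex
\begin{aligned}
\mathrm{Var}(\bar{X})
  &= \mathrm{Var}\!\left(\tfrac{1}{n}X_1 + \tfrac{1}{n}X_2 + \dots + \tfrac{1}{n}X_n\right) \\
  &= \tfrac{1}{n^2}\mathrm{Var}(X_1) + \tfrac{1}{n^2}\mathrm{Var}(X_2) + \dots + \tfrac{1}{n^2}\mathrm{Var}(X_n) \\
  &= n \cdot \frac{\sigma^2}{n^2} = \frac{\sigma^2}{n}, \\
\mathrm{sd}(\bar{X})
  &= \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}.
\end{aligned}
```

The $\sqrt{n}$ appears only at the last step, when the variance is converted to a standard deviation.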

Examples in context

Example 1. Variance of a sum. If $X$ and $Y$ are independent with $\mathrm{Var}(X) = 4$ and $\mathrm{Var}(Y) = 9$, then $\mathrm{Var}(2X + Y) = 4(4) + 1(9) = 25$, so $\mathrm{sd} = 5$.

Example 2. Confidence interval. A sample of $n = 25$ with $\bar{x} = 50$ and known $\sigma = 10$ gives standard error $\frac{10}{5} = 2$, so a 95% interval is $50 \pm 1.96(2) = 50 \pm 3.92$, that is $(46.08, 53.92)$.

Try this

Q1. $X$ and $Y$ are independent with $\mathrm{Var}(X) = 5$, $\mathrm{Var}(Y) = 3$. Find $\mathrm{Var}(X - Y)$. [2 marks]

  • Cue. Variances add: $5 + 3 = 8$.

Q2. A sample of $n = 36$ from a population with $\sigma = 12$ has mean $\bar{x} = 70$. State the standard error of the mean. [1 mark]

  • Cue. $\frac{12}{\sqrt{36}} = \frac{12}{6} = 2$.

Q3. For a test of $H_0: \mu = 100$ against $H_1: \mu \neq 100$, the test statistic is $z = 2$. State the $p$ value. [2 marks]

  • Cue. Two-sided: $p = 2\,P(Z > 2) \approx 2(0.0228) = 0.0456$.

Exam-style practice questions

Practice questions written in the style of VCAA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

2023 VCAA [2 marks]

The time, Xc minutes, taken to drive to a station is normal with mean 20 and standard deviation 6. The waiting time, Xw minutes, for a train is normal with mean 8 and standard deviation 3. The time, Xt minutes, on the train is normal with mean 12 and standard deviation 5. The three times are independent. Find the mean and standard deviation of the total time, in minutes, it takes to travel from home to the city.

The total time is the linear combination T = Xc + Xw + Xt.

Mean: for a sum, means add (regardless of independence). E(T) = 20 + 8 + 12 = 40 minutes.

Variance: because the three times are independent, variances add. Var(T) = 6^2 + 3^2 + 5^2 = 36 + 9 + 25 = 70.

Standard deviation: sd(T) = sqrt(Var(T)) = sqrt(70), which is about 8.37 minutes.

So the total travel time has mean 40 minutes and standard deviation sqrt(70) minutes (approximately 8.37 minutes). A common error is to add the standard deviations (6 + 3 + 5 = 14); only the variances add.
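The arithmetic in this worked answer can be checked with a few lines of Python. This is only a verification sketch; the dictionary keys and variable names are illustrative labels, and the variance sum relies on the independence stated in the question.

```python
from math import sqrt

# Means and standard deviations (minutes) from the question.
means = {"drive": 20, "wait": 8, "train": 12}
sds = {"drive": 6, "wait": 3, "train": 5}

total_mean = sum(means.values())
# Independent times: add variances (squared sds), never the sds themselves.
total_var = sum(sd**2 for sd in sds.values())
total_sd = sqrt(total_var)

# The common error, shown only for comparison.
naive_sd = sum(sds.values())

print(total_mean, total_var, round(total_sd, 2), naive_sd)  # 40 70 8.37 14
```

The incorrect figure of 14 overstates the spread considerably compared with the correct value of about 8.37.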