How do the mean and variance of linear combinations of random variables behave, and how do we use a sample mean to test a hypothesis about a population mean?
Linear combinations of independent random variables and their mean and variance, the distribution of the sample mean $\bar{X}$, the construction of confidence intervals for a population mean, and hypothesis testing for the mean using a $p$ value
A focused answer to the VCE Specialist Mathematics Unit 4 key-knowledge point on linear combinations and statistical inference. Mean and variance of linear combinations, the distribution of the sample mean, confidence intervals, and hypothesis testing with p values, with a verified worked example.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
VCAA wants you to find the mean and variance of linear combinations of independent random variables, to use the resulting distribution of the sample mean $\bar{X}$, to construct a confidence interval for a population mean, and to carry out a hypothesis test for a mean by computing a test statistic and a $p$ value. This is the statistical inference strand of Specialist, distinct from the proportion-based inference in Mathematical Methods.
Mean and variance of linear combinations
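For any random variables $X$ and $Y$ and constants $a$ and $b$, expectation is linear:
$$E(aX + bY) = aE(X) + bE(Y),$$
whether or not $X$ and $Y$ are independent. Variance, however, behaves differently. For independent $X$ and $Y$,
$$\operatorname{Var}(aX + bY) = a^2\operatorname{Var}(X) + b^2\operatorname{Var}(Y).$$
The coefficients are squared, and the cross term vanishes because independence makes the covariance zero. Note that even for a difference, variances add: $\operatorname{Var}(X - Y) = \operatorname{Var}(X) + \operatorname{Var}(Y)$, since $(-1)^2 = 1$.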
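The two rules can be checked numerically. The sketch below is illustrative only: the helper name `linear_combination` and the means and variances passed to it are assumed values, not taken from the text.

```python
from math import sqrt

def linear_combination(a, mean_x, var_x, b, mean_y, var_y):
    """Return (mean, variance) of a*X + b*Y for independent X and Y."""
    mean = a * mean_x + b * mean_y            # expectation is linear
    variance = a**2 * var_x + b**2 * var_y    # coefficients are squared
    return mean, variance

# Assumed illustrative values: X has mean 10 and variance 4,
# Y has mean 3 and variance 9, and X and Y are independent.
mean_d, var_d = linear_combination(1, 10, 4, -1, 3, 9)   # the difference X - Y
sd_d = sqrt(var_d)

print(mean_d)   # means subtract: 10 - 3
print(var_d)    # variances add: 4 + 9
print(sd_d)     # standard deviation is the square root of the variance
```

The difference has mean 7 but variance 13, not 4 - 9: the coefficient -1 is squared before it multiplies the variance of Y.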
The distribution of the sample mean
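Take a random sample of $n$ independent observations $X_1, X_2, \ldots, X_n$ from a population with mean $\mu$ and standard deviation $\sigma$. The sample mean is $\bar{X} = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$. Applying the linear-combination rules:
$$E(\bar{X}) = \mu, \qquad \operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n}, \qquad \operatorname{sd}(\bar{X}) = \frac{\sigma}{\sqrt{n}}.$$
So the sample mean is unbiased for $\mu$, and its spread shrinks as $n$ grows. By the central limit theorem, for large $n$ the distribution of $\bar{X}$ is approximately normal, $\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$, regardless of the population's shape. The quantity $\frac{\sigma}{\sqrt{n}}$ is the standard error of the mean.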
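A short simulation can make the shrinking spread of the sample mean concrete. This is a sketch under assumptions: the uniform population on (0, 1), the sample size of 25, and the number of trials are all chosen for illustration and are not part of the dot point.

```python
import random
import statistics

random.seed(0)    # fixed seed so the run is repeatable

n = 25            # sample size (assumed for illustration)
trials = 20_000   # number of simulated samples (assumed)

# Assumed population: uniform on (0, 1), with mean 0.5 and variance 1/12.
sample_means = [
    statistics.fmean([random.random() for _ in range(n)])
    for _ in range(trials)
]

observed_mean = statistics.fmean(sample_means)
observed_var = statistics.pvariance(sample_means)

print(observed_mean)           # should be close to the population mean 0.5
print(observed_var)            # should be close to (1/12) / n
print((1 / 12) / n)            # theoretical variance of the sample mean
```

Even though the population here is flat rather than bell-shaped, the simulated sample means cluster around 0.5 with variance near the population variance divided by the sample size.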
Confidence intervals for a population mean
An approximate confidence interval for $\mu$, when $\sigma$ is known and $n$ is large, is
$$\bar{x} \pm z\,\frac{\sigma}{\sqrt{n}},$$
where $z$ is the standard normal value capturing the central probability equal to the chosen confidence level. For a 95% interval, $z \approx 1.96$. The interval is a range of plausible values for $\mu$; the confidence level refers to the long-run proportion of such intervals that would contain the true mean, not the probability that $\mu$ lies in this one fixed interval.
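The interval calculation can be sketched in a few lines using the standard library's `statistics.NormalDist`. The function name `mean_confidence_interval` and the sample figures are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(x_bar, sigma, n, level=0.95):
    """Approximate confidence interval for a mean with known sigma."""
    # z cuts off the central `level` of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sigma / sqrt(n)   # z times the standard error
    return x_bar - margin, x_bar + margin

# Assumed illustrative sample: mean 50, known sigma 10, sample size 100.
low, high = mean_confidence_interval(x_bar=50, sigma=10, n=100)
print(round(low, 2), round(high, 2))
```

With these assumed figures the standard error is 1, so the 95% interval is roughly 50 plus or minus 1.96.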
Hypothesis testing for a mean
A test of the mean compares a null hypothesis against an alternative. The steps:
- State $H_0: \mu = \mu_0$ and the alternative $H_1$ (one-sided, $\mu > \mu_0$ or $\mu < \mu_0$, or two-sided $\mu \neq \mu_0$).
- Compute the test statistic $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, which measures how many standard errors the observed mean sits from $\mu_0$.
- Find the $p$ value: the probability, assuming $H_0$ is true, of observing a sample mean at least as extreme as $\bar{x}$.
- Compare the $p$ value with the significance level $\alpha$ (commonly $0.05$). If $p < \alpha$, reject $H_0$; otherwise do not reject it.
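The steps above can be sketched as a small function. The name `z_test`, its `alternative` argument, and the sample figures are assumptions for illustration; the sketch assumes a known population standard deviation and a sample large enough for the normal approximation.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu_0, sigma, n, alternative="two-sided"):
    """Return (z, p) for a test of a mean with known sigma."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))   # standard errors from mu_0
    std_normal = NormalDist()
    if alternative == "greater":             # H1: mu > mu_0
        p = 1 - std_normal.cdf(z)
    elif alternative == "less":              # H1: mu < mu_0
        p = std_normal.cdf(z)
    elif alternative == "two-sided":         # H1: mu != mu_0
        p = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p

# Assumed illustrative data: sample mean 52, mu_0 = 50, sigma = 10, n = 100.
z, p = z_test(x_bar=52, mu_0=50, sigma=10, n=100)
alpha = 0.05
print(z, round(p, 4))
print("reject H0" if p < alpha else "do not reject H0")
```

Here the observed mean sits two standard errors above the hypothesised mean, and the two-sided p value of about 0.0455 falls just below 0.05.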
Why independence and squaring matter
The single most error-prone step is the variance rule. Means combine linearly with their coefficients, but variances combine with the squares of the coefficients, and only when the variables are independent. This is why doubling a measurement ($2X$) quadruples its variance, and why averaging $n$ independent values divides the variance by $n$: each value carries the coefficient $\frac{1}{n}$, which is squared to $\frac{1}{n^2}$, and there are $n$ such terms. Keeping variance and standard deviation distinct (one is the square of the other) prevents most slips in this strand.
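Written out as a derivation, with $X_1, \ldots, X_n$ assumed independent and each having variance $\sigma^2$, the averaging result follows directly from the squared-coefficient rule:

$$\operatorname{Var}(\bar{X}) = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i) = \frac{1}{n^2}\cdot n\sigma^2 = \frac{\sigma^2}{n}.$$

Taking the square root gives the standard deviation $\frac{\sigma}{\sqrt{n}}$, so the variance falls by a factor of $n$ while the standard deviation falls by a factor of $\sqrt{n}$.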
Examples in context
Example 1. Variance of a sum. If and are independent with and , then , so .
Example 2. Confidence interval. A sample of with and known gives standard error , so a 95% interval is , that is .
Try this
Q1. and are independent with , . Find . [2 marks]
- Cue. Variances add: .
Q2. A sample of from a population with has mean . State the standard error of the mean. [1 mark]
- Cue. .
Q3. For a test of against , the test statistic is . State the value. [2 marks]
- Cue. Two-sided: .
Exam-style practice questions
Practice questions written in the style of VCAA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
2023 VCAA, 2 marks. The time, Xc minutes, taken to drive to a station is normal with mean 20 and standard deviation 6. The waiting time, Xw minutes, for a train is normal with mean 8 and standard deviation 3. The time, Xt minutes, on the train is normal with mean 12 and standard deviation 5. The three times are independent. Find the mean and standard deviation of the total time, in minutes, it takes to travel from home to the city.
The total time is the linear combination T = Xc + Xw + Xt.
Mean: for a sum, means add (regardless of independence). E(T) = 20 + 8 + 12 = 40 minutes.
Variance: because the three times are independent, variances add. Var(T) = 6^2 + 3^2 + 5^2 = 36 + 9 + 25 = 70.
Standard deviation: sd(T) = sqrt(Var(T)) = sqrt(70), which is about 8.37 minutes.
So the total travel time has mean 40 minutes and standard deviation sqrt(70) minutes (approximately 8.37 minutes). A common error is to add the standard deviations (6 + 3 + 5 = 14); only the variances add.
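The arithmetic in this worked answer can be checked with a few lines of Python. The variable names are illustrative; the means and standard deviations are those given in the question.

```python
from math import sqrt

# Means and standard deviations of the drive, wait, and train times (minutes).
means = [20, 8, 12]
sds = [6, 3, 5]

total_mean = sum(means)                  # means add
total_var = sum(s**2 for s in sds)       # variances add under independence
total_sd = sqrt(total_var)               # convert back to a standard deviation

print(total_mean, total_var, round(total_sd, 2))
print(sum(sds))   # the common error: adding standard deviations directly
```

The last line prints 14, the incorrect value obtained by adding standard deviations, for comparison with the correct value of about 8.37.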