What is the normal distribution, and how does the empirical rule give the percentage of data within 1, 2 and 3 standard deviations?
Recognise the features of the normal distribution and apply the empirical 68-95-99.7 rule
A focused answer to the HSC Maths Standard 2 dot point on the normal distribution. The bell-shaped curve, the empirical 68-95-99.7 rule built band by band, mean and standard deviation as the two parameters, how the questions are worded, and worked Australian examples for heights, exam marks and manufacturing quality control.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
NESA wants three things. First, recognise the features of the normal distribution: it is symmetric, bell-shaped, and set by its mean and standard deviation. Second, apply the empirical 68-95-99.7 rule to find the percentage of data inside standard-deviation bands. Third, use this to solve practical problems. This is one of the most reliably examined ideas in the Statistical Analysis module. The marks are easy once you stop thinking in raw numbers and start thinking in standard deviations.
The answer
The normal distribution
The normal distribution (or bell curve) is a smooth curve that shows how data is spread out. It turns up naturally whenever a quantity is built from many small, separate effects added together. Common examples are adult heights, exam scores across a large group, repeated measurement errors, and the fill weight of mass-produced bottles. Its shape is set entirely by two numbers:
- $\mu$ (mu): the mean, which sits under the peak and is the centre of symmetry.
- $\sigma$ (sigma): the standard deviation, which controls the width. A small $\sigma$ gives a tall narrow curve; a large $\sigma$ gives a short wide one.
Here is the insight that makes every question easy to handle. Once you know $\mu$ and $\sigma$, you know the whole curve. Every normal curve is the same shape, just stretched or shifted. So a question never really cares about the raw value $x$. It only cares about how many standard deviations $x$ sits from the mean. That count is the only thing that decides a percentage.
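The "same shape, just stretched or shifted" claim can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the two sets of parameters (heights with mean 170 and standard deviation 8, marks with mean 60 and standard deviation 12) are made up for illustration and do not come from any exam question.

```python
from statistics import NormalDist

def percent_within(mu, sigma, k):
    """Percentage of a normal distribution within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return 100 * (dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma))

# Hypothetical parameters, chosen only for illustration.
heights = percent_within(170, 8, 1)
marks = percent_within(60, 12, 1)

# Both print as 68.3: the raw units differ, the SD count does not.
print(round(heights, 1), round(marks, 1))
```

Two very different data sets give the same percentage because both endpoints sit exactly one standard deviation from the mean.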
Key properties to be able to state:
- Symmetric about the mean, so the left and right halves mirror each other.
- Mean $=$ median $=$ mode (a consequence of the symmetry and single peak).
- Highest at $x = \mu$, falling away smoothly on both sides and never quite touching the axis.
- The total area under the curve is $1$, because it is a probability density: an area is a proportion of the data.
The empirical rule, band by band
The empirical rule is the main result here. Do not just memorise three separate numbers. Instead, build the rule up one band at a time, because that is exactly how the harder questions are set out step by step. The diagrams below shade one more standard-deviation band at each stage.
Stage 1, the central 68%. Go one standard deviation either side of the mean, from $\mu - \sigma$ to $\mu + \sigma$. About 68% of all values fall in this central band. This is the bulk of the data, clustered near the mean.
Stage 2, out to 95%. Widen the band to two standard deviations either side, $\mu - 2\sigma$ to $\mu + 2\sigma$. Now about 95% of values are captured. The extra strip on each side (from $1\sigma$ to $2\sigma$ out) adds 13.5% per side, which is how 68% grows to 95%.
Stage 3, out to 99.7%. Go three standard deviations either side, $\mu - 3\sigma$ to $\mu + 3\sigma$, and you have about 99.7% of the data. Almost nothing lies beyond 3 standard deviations: only 0.3% in total, split between the two tails.
Stage 4, the half-band percentages. Because the curve is symmetric, each band splits evenly about the mean, so it is worth knowing the percentage of each individual strip. From the mean outward each side reads 34%, then 13.5%, then 2.35%, then 0.15% in the far tail. These are the pieces you add and subtract to answer any "between these two values" question.
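The strip percentages are not extra facts to memorise: they follow from the three band percentages by subtracting and halving. A short derivation, assuming the rounded values 68%, 95% and 99.7%:

$$
\begin{aligned}
\text{mean to } 1\sigma &: \tfrac{1}{2}(68\%) = 34\% \\
1\sigma \text{ to } 2\sigma &: \tfrac{1}{2}(95\% - 68\%) = 13.5\% \\
2\sigma \text{ to } 3\sigma &: \tfrac{1}{2}(99.7\% - 95\%) = 2.35\% \\
\text{beyond } 3\sigma &: \tfrac{1}{2}(100\% - 99.7\%) = 0.15\%
\end{aligned}
$$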
The tails, by symmetry
The single most useful follow-up move is turning "inside the band" into "outside the band", then halving for one tail. If 68% is inside $\mu \pm \sigma$, then 32% is outside, and by symmetry each tail carries half:
- Above $\mu + \sigma$ (or below $\mu - \sigma$): 16%.
- Above $\mu + 2\sigma$ (or below $\mu - 2\sigma$): 2.5%.
- Above $\mu + 3\sigma$ (or below $\mu - 3\sigma$): 0.15%.
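The "subtract from the whole, then halve" move can be written as a few lines of Python. This is a minimal sketch using the rounded empirical-rule percentages; the names `INSIDE` and `one_tail_percent` are illustrative only.

```python
# Band percentages from the empirical rule (rounded values).
INSIDE = {1: 68.0, 2: 95.0, 3: 99.7}

def one_tail_percent(k):
    """Percentage above mu + k*sigma (equivalently, below mu - k*sigma)."""
    outside = 100.0 - INSIDE[k]  # everything outside the band
    return outside / 2           # symmetry: half of it in each tail

for k in (1, 2, 3):
    print(k, round(one_tail_percent(k), 2))
```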
Common standard-deviation regions
This table is the empirical rule in its most usable form. Every "what percentage" question is built by adding or subtracting these strips.
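| Region | Percentage |
|---|---|
| Within $\mu \pm \sigma$ | 68% |
| Within $\mu \pm 2\sigma$ | 95% |
| Within $\mu \pm 3\sigma$ | 99.7% |
| Between $\mu$ and $\mu + \sigma$ | 34% |
| Between $\mu + \sigma$ and $\mu + 2\sigma$ | 13.5% |
| Between $\mu + 2\sigma$ and $\mu + 3\sigma$ | 2.35% |
| Above $\mu + 3\sigma$ | 0.15% |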
As a check, the strips on one side of the mean add to a half: $34\% + 13.5\% + 2.35\% + 0.15\% = 50\%$.
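The rule's percentages are rounded. As a rough sanity check, the sketch below compares the one-sided strips with values computed from Python's standard-library `statistics.NormalDist` for the standard normal curve (mean 0, standard deviation 1). The variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# One-sided strips as (inner edge in SDs, outer edge in SDs, rule percentage).
# None stands for "no outer edge" (the far tail).
strips = [(0, 1, 34.0), (1, 2, 13.5), (2, 3, 2.35), (3, None, 0.15)]

for lo, hi, rule in strips:
    upper = 1.0 if hi is None else z.cdf(hi)
    computed = 100 * (upper - z.cdf(lo))
    print(f"from {lo} SD: rule {rule}%, computed {computed:.2f}%")

rule_total = sum(rule for _, _, rule in strips)
print(round(rule_total, 2))  # the rule's strips on one side add to 50
```

The computed strips differ slightly from the rule's (the third strip most visibly) because the rule rounds the two-standard-deviation band down to 95%. The rounded values are the ones this dot point works with.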
Applying the rule: the three-step method
To find the percentage of data in a range:
- Express each endpoint as a number of standard deviations from the mean, using $\frac{x - \mu}{\sigma}$. You are turning raw values into the only currency that matters.
- Sketch the curve, mark the mean and the two endpoints, and shade the region you want.
- Add or subtract the strips from the region table to total the shaded area.
The sketch is not an optional flourish: it stops the most common error, which is forgetting which strips lie inside the region you were actually asked about.
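The three steps can be sketched in Python as follows. This is an illustration under stated assumptions, not a substitute for the hand method: it uses the rounded strip percentages, it only accepts endpoints that are whole numbers of standard deviations from the mean, and the example data (mean 60, standard deviation 10) is hypothetical.

```python
# One-sided strip percentages, keyed by the strip's inner edge in SDs from the mean.
STRIPS = {0: 34.0, 1: 13.5, 2: 2.35}

def sd_count(x, mu, sigma):
    """Step 1: signed number of standard deviations x lies from the mean."""
    return (x - mu) / sigma

def percent_below(k):
    """Percentage below mu + k*sigma for whole k from -3 to 3, built from the strips."""
    if k != int(k) or abs(k) > 3:
        raise ValueError("the empirical rule needs whole SD counts from -3 to 3")
    k = int(k)
    inner = sum(STRIPS[i] for i in range(abs(k)))  # strips between the mean and k
    return 50.0 + inner if k >= 0 else 50.0 - inner

def percent_between(a, b, mu, sigma):
    """Step 3: shaded area between raw values a < b."""
    return percent_below(sd_count(b, mu, sigma)) - percent_below(sd_count(a, mu, sigma))

# Hypothetical example: mean 60, standard deviation 10, values between 50 and 80.
print(round(percent_between(50, 80, 60, 10), 2))  # 81.5
```

Here 50 is one standard deviation below the mean and 80 is two above, so the shaded region is the strips 34% + 34% + 13.5%. Step 2, the sketch of the curve, has no code equivalent and still needs to be drawn by hand.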
If the endpoints are not whole numbers of standard deviations from the mean, the empirical rule cannot give an exact answer. Standard 2 then expects z-scores and the standard normal table (the next dot point); the empirical rule is the special case where the endpoints land exactly on $\mu \pm \sigma$, $\mu \pm 2\sigma$ or $\mu \pm 3\sigma$.
When the normal distribution applies
The rule only works when the data is actually (approximately) normal. The model fits when:
- Natural variation is at play: adult heights and weights, exam scores in a large cohort, IQ scores.
- Measurement error accumulates: repeated readings of the same fixed quantity.
- Manufacturing tolerances apply: fill weights, component dimensions on a production line.
It does not fit obviously skewed data (household incomes, house prices, reaction times), where a few extreme values pull one tail out. Applying the 68-95-99.7 percentages to skewed data is a conceptual error that costs marks.
How exam questions ask about the normal distribution
The wording changes but the task is always "turn endpoints into standard deviations, then add strips". Learn the translations:
- "What percentage lie between $a$ and $b$?" Convert both endpoints to standard deviations from the mean, then sum the strips between them.
- "What percentage are more than / greater than $x$?" A tail question. Find how many SDs $x$ is above the mean, take the inside-band percentage, subtract from 100%, then halve for the single tail (or read the tail directly: 16%, 2.5%, 0.15%).
- "What percentage are less than $x$?" Everything to the left: 50% for the whole left half, plus or minus the strips between the mean and $x$.
- "How many students / bottles / people ...?" Find the percentage, then multiply by the total. Give a whole number for a count of people.
- "Between what two values does a given middle percentage lie?" Run the rule backwards: for example, the middle 95% is $\mu \pm 2\sigma$, so compute those two values.
- "Is it unusual to score above $x$?" Code for "how far into the tail is it": a value beyond $\mu + 2\sigma$ (top 2.5%) is uncommon; beyond $\mu + 3\sigma$ (top 0.15%) is rare.
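Two of these translations, running the rule backwards and turning a percentage into a count, reduce to one line of arithmetic each. A minimal Python sketch, with hypothetical data (mean 60, standard deviation 10, a cohort of 200) and illustrative function names:

```python
def middle_band(mu, sigma, k):
    """Values bounding the central band from mu - k*sigma to mu + k*sigma."""
    return mu - k * sigma, mu + k * sigma

def expected_count(percent, total):
    """Turn a percentage of the data into a whole-number count."""
    return round(percent / 100 * total)

# Hypothetical data: mean 60, standard deviation 10, cohort of 200.
low, high = middle_band(60, 10, 2)   # k = 2 gives the middle 95%
print(low, high)                     # 40 80
print(expected_count(2.5, 200))      # 5, the expected number above 80
```

Python's `round` sends exact halves to the nearest even number, so a borderline count such as 2.5 is worth rounding by hand.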
Exam-style practice questions
Practice questions written in the style of NESA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
2022 HSC-style, 3 marks
The heights of Year 12 boys at a Sydney school are normally distributed with mean cm and standard deviation cm. What percentage are taller than cm?
Worked answer:
cm is standard deviations above the mean.
By the empirical rule, of values lie within standard deviations of the mean, so lie outside this band.
By symmetry, half of that () is above cm.
So about of Year 12 boys are taller than cm.
Markers reward the number of standard deviations from the mean, the application of inside, and the halving by symmetry.
2021 HSC-style, 4 marks
A factory produces bags of rice with a mean weight of kg and standard deviation g. The weights are normally distributed. (a) What percentage of bags weigh between g and g? (b) What percentage weigh between g and g?
Worked answer:
Convert the mean to grams: g.
(a) g is SD below the mean ( g), and g is SD above. By the empirical rule, of bags lie in this range.
(b) g is SD below, and g is SD above the mean. By the empirical rule, of bags lie in this range.
Markers reward both: the identification of how many SDs each endpoint is from the mean, and the empirical rule percentage.
Practice questions
Original practice questions graded from foundation to exam level, each with a full worked solution. Try them before revealing the solution.
Foundation, 1 mark
A set of data is normally distributed. State the approximate percentage of values that lie within (a) 1 standard deviation of the mean, (b) 2 standard deviations of the mean, (c) 3 standard deviations of the mean.
Worked solution:
Recall the empirical rule. For any normal distribution the central bands carry fixed percentages, measured outward from the mean $\mu$.
State each band.
(a) Within 1 standard deviation, $\mu - \sigma$ to $\mu + \sigma$: about 68%.
(b) Within 2 standard deviations, $\mu - 2\sigma$ to $\mu + 2\sigma$: about 95%.
(c) Within 3 standard deviations, $\mu - 3\sigma$ to $\mu + 3\sigma$: about 99.7%.
Answer: 68%, 95% and 99.7%.
Foundation, 1 mark
The lifetimes of a brand of light globe are normally distributed with mean hours and standard deviation hours. How many standard deviations above the mean is a lifetime of hours?
Worked solution:
Set up the standard-deviation count. Find how far the value sits above the mean, then divide by one standard deviation: .
Substitute the numbers.
Answer: hours is standard deviations above the mean.
Foundation, 2 marks
The time a fully charged phone battery lasts is normally distributed with mean hours and standard deviation hours. What percentage of batteries last more than hours?
Worked solution:
Express the cutoff in standard deviations.
Take the tail by symmetry. The empirical rule puts 68% within 1 standard deviation, so 32% lies outside the band. By symmetry that splits into two equal tails, so the upper tail above $\mu + \sigma$ is 16%.
Answer: about 16% of batteries last more than hours.
Foundation, 2 marks
Annual rainfall in a town is normally distributed with mean mm and standard deviation mm. What percentage of years have rainfall between mm and mm?
Worked solution:
Locate each endpoint on the standard-deviation scale.
Read the single strip between them. The band from to is one strip of the curve, worth .
Answer: about of years have rainfall between mm and mm.
Core, 2 marks
A mill packs flour into bags whose weights are normally distributed with mean g and standard deviation g. What percentage of bags weigh between g and g?
Worked solution:
Convert each endpoint to standard deviations.
Identify the band. From $\mu - \sigma$ to $\mu + \sigma$ is the central band of the empirical rule.
Read the percentage. The central band within 1 standard deviation carries about 68% of the data.
Answer: about 68% of bags weigh between g and g.
Core, 3 marks
Marks in a class test are normally distributed with mean and standard deviation . What percentage of students score above ?
Worked solution:
Express the boundary in standard deviations.
Find the outside-band percentage. The empirical rule puts within standard deviations, so lies outside the band.
Halve for the single tail. By symmetry the splits evenly between the two tails, so above is .
Answer: about of students score above .
Core, 3 marks
The heights of Year 12 students at a school are normally distributed with mean cm and standard deviation cm. What percentage of students are shorter than cm?
Worked solution:
Express the cutoff in standard deviations.
Find the outside-band percentage. With inside standard deviations, lies outside the band, split equally between the two tails.
Take the lower tail. Below is one tail, so .
Answer: about of students are shorter than cm.
Core, 4 marks
In a cohort of students, marks in an examination are normally distributed with mean and standard deviation . How many students score between and ?
Worked solution:
Convert each endpoint to standard deviations.
Read the band percentage. From to is the central band, about of students.
Turn the percentage into a count. Multiply by the cohort size:
Answer: about students score between and .
Exam, 4 marks
A factory packs rice into bags whose weights are normally distributed with mean g and standard deviation g. A batch contains bags. (a) What percentage of bags weigh less than g? (b) How many bags in the batch is this?
Worked solution:
Part (a): express the cutoff in standard deviations.
Take the lower tail. With within standard deviations, is outside, split evenly, so below is .
So for part (a), about of bags weigh less than g.
Part (b): convert the percentage to a count.
Answer: (a) about ; (b) about bags.
Exam, 4 marks
A manufacturer tests light globes whose lifetimes are normally distributed with mean hours and standard deviation hours. (a) What percentage of globes last longer than hours? (b) Estimate how many of the globes this represents.
Worked solution:
Part (a): express the boundary in standard deviations.
Take the upper tail. Since lies within standard deviations, is outside, and halving for the single tail gives above .
So for part (a), about of globes last longer than hours.
Part (b): convert the percentage to a count.
Answer: (a) about ; (b) about globes.
Exam, 5 marks
Marks in an HSC trial examination are approximately normally distributed with mean and standard deviation . A school enters students. (a) What percentage of students score between and ? (b) How many students would be expected to score above ?
Worked solution:
Part (a): convert each endpoint to standard deviations.
Sum the strips between the boundaries. From $\mu - \sigma$ to $\mu + 2\sigma$ is three strips: the left half-band 34%, the right half-band 34%, and the next strip out on the right 13.5%:
$34\% + 34\% + 13.5\% = 81.5\%$
So for part (a), about 81.5% of students score between and .
Part (b): read the upper tail, then count. Above is the top . Applying this to the students:
Answer: (a) about ; (b) about students.
