ExamExplained
NSW · Maths Extension 1
§-Syllabus dot point

When and how do we use the normal distribution to approximate binomial probabilities?

Use the normal approximation $X \sim N(np, np(1 - p))$ to approximate binomial probabilities for large $n$

A focused answer to the HSC Maths Extension 1 dot point on the normal approximation of the binomial. The rule of thumb $np \ge 5$ and $n(1 - p) \ge 5$, continuity correction, standardising and computing approximate probabilities, with worked examples.

Reviewed by: AI editorial process; not yet individually human-reviewed


What this dot point is asking

NESA wants you to recognise when the binomial distribution can be approximated by a normal distribution, write down the approximating normal $N(np, np(1 - p))$, apply the continuity correction, and compute approximate probabilities using z-scores.

The answer

The result

If $X \sim B(n, p)$ with $n$ "large" (rule of thumb: both $np \ge 5$ and $n(1 - p) \ge 5$), then

$$X \approx N(\mu, \sigma^2) \quad \text{with} \quad \mu = np, \quad \sigma^2 = np(1 - p).$$

This is a consequence of the central limit theorem: the binomial is a sum of $n$ independent identically distributed Bernoulli trials, and sums of many i.i.d. random variables tend to a normal distribution.
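As a quick check that the approximating normal shares the binomial's mean and variance, the sketch below sums both directly from the pmf and compares them with $np$ and $np(1 - p)$. The function names are illustrative only; they do not come from the syllabus or any library.

```python
import math

def binomial_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_mean_and_variance(n, p):
    """Mean and variance of B(n, p), summed directly from the pmf."""
    pmf = [binomial_pmf(n, p, k) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pmf))
    variance = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, variance

n, p = 100, 0.5
mean, variance = binomial_mean_and_variance(n, p)
print(mean, n * p)                # both should be very nearly 50
print(variance, n * p * (1 - p))  # both should be very nearly 25
```

Any small discrepancy between the two columns is floating-point rounding, not a feature of the distributions.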

Watching the bell curve appear, stage by stage

The approximation is not a coincidence: as the number of trials grows, the binomial bars settle into the shape of a bell curve with the same mean and standard deviation. These panels overlay the matching normal $N(np, np(1 - p))$ on the bars of $B(n, 0.5)$ for growing $n$.

Stage 1, a small $n$ is already close. With $n = 10$ the bars are wide and chunky, yet the bell curve $N(5, 2.5)$ already passes through their tops. The fit is rough in the tails but the centre is good.

[Figure: normal curve over binomial bars for $n = 10$, $p = 0.5$ (horizontal axis $k$). Stage 1: the bars are chunky but the bell curve $N(5, 2.5)$ already tracks their tops.]

Stage 2, more trials, narrower bars. At $n = 30$ (mean $15$, standard deviation $2.74$) there are more, thinner bars and they hug the curve much more closely. This is the central limit theorem taking hold: summing more Bernoulli trials pulls the shape toward the normal.

[Figure: normal curve over binomial bars for $n = 30$, $p = 0.5$ (horizontal axis $k$). Stage 2: mean $15$, sd $2.74$. The bars are narrower and hug the curve.]

Stage 3, a close fit you can compute with. By $n = 50$ (mean $25$, standard deviation $3.54$) the bars and the curve are almost indistinguishable, so the area under the normal curve is a reliable stand-in for a sum of binomial probabilities. This is exactly when replacing a long pmf sum with one or two $z$-lookups is justified.

[Figure: normal curve over binomial bars for $n = 50$, $p = 0.5$ (horizontal axis $k$). Stage 3: mean $25$, sd $3.54$. The fit is close enough to read areas off the curve.]
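The three stages can also be checked numerically. The sketch below measures, for each $n$, the largest vertical gap between the top of a binomial bar and the matching normal curve at the same $k$. The helper name `largest_gap` is made up for this illustration, and the gap is only one rough way to measure how well the bars fit the curve.

```python
import math

def binomial_pmf(n, p, k):
    """Height of the binomial bar at k for B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x, mu, sigma):
    """Height of the N(mu, sigma^2) bell curve at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def largest_gap(n, p):
    """Largest vertical gap between a bar top and the curve at the same k."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return max(
        abs(binomial_pmf(n, p, k) - normal_pdf(k, mu, sigma))
        for k in range(n + 1)
    )

for n in (10, 30, 50):
    print(n, largest_gap(n, 0.5))  # the gap should shrink as n grows
```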

When the approximation works

The approximation is good when:

  • $n$ is large.
  • $p$ is not too close to $0$ or $1$ (which makes the distribution very skewed).

The HSC rule of thumb is $np \ge 5$ and $n(1 - p) \ge 5$. With $n = 100$, this works for $0.05 \le p \le 0.95$. With $n = 20$, it works for $0.25 \le p \le 0.75$.
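The rule of thumb is easy to turn into a check. This is a minimal sketch; the function name `normal_approximation_ok` and the `threshold` parameter are assumptions made for the example.

```python
def normal_approximation_ok(n, p, threshold=5):
    """Rule of thumb: both np and n(1 - p) must be at least the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(normal_approximation_ok(100, 0.5))  # np = 50 and n(1 - p) = 50
print(normal_approximation_ok(20, 0.25))  # np = 5, right on the boundary
print(normal_approximation_ok(20, 0.1))   # np = 2, so the rule fails
```

For values of $p$ that sit exactly on the boundary, floating-point rounding can matter; working with fractions avoids that if it is a concern.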

When $p$ is very small and $n$ is large, the Poisson approximation is more appropriate, but that is beyond HSC Extension 1.

Continuity correction

The binomial is discrete; the normal is continuous. To improve the approximation, adjust the boundary by $\pm 0.5$.

For $X \sim B(n, p)$ and $k$ an integer,

$$P(X \le k) \approx P\!\left( Z \le \frac{k + 0.5 - \mu}{\sigma} \right),$$

$$P(X \ge k) \approx P\!\left( Z \ge \frac{k - 0.5 - \mu}{\sigma} \right),$$

$$P(X = k) \approx P\!\left( \frac{k - 0.5 - \mu}{\sigma} \le Z \le \frac{k + 0.5 - \mu}{\sigma} \right).$$

The half-unit shift accounts for the fact that the binomial $X = k$ corresponds to the interval $[k - 0.5, k + 0.5]$ in the continuous picture. The diagram below zooms in on the bars near $k = 55$: the bar for $X = 55$ stretches across the continuous interval $[54.5, 55.5]$, so $P(X \le 55)$ keeps the whole of that bar and the boundary the normal curve should use is $55.5$, not $55$.

[Figure: the continuity correction. A close-up of binomial bars near $k = 55$ (horizontal axis $k$, bars at $53$ to $57$). The bar for $X = 55$ spans the continuous interval $[54.5, 55.5]$, so for $P(X \le 55)$ the boundary sits at $55.5$, the upper edge of the $X = 55$ bar.]

The direction is the part to get right: for an upper bound ("at most $k$", "$\le k$") push the boundary out to $k + 0.5$; for a lower bound ("at least $k$", "$\ge k$") push it out to $k - 0.5$. Either way you are growing the interval to include the whole of the boundary bar.
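The three cases can be summarised as a small lookup. The sketch below is illustrative only: the function name `corrected_interval` and the labels `"at_most"`, `"at_least"` and `"exactly"` are invented for this example.

```python
def corrected_interval(kind, k):
    """Continuous interval (low, high) that replaces a binomial event about the integer k."""
    if kind == "at_most":    # P(X <= k): keep the whole bar at k, so go up to k + 0.5
        return (float("-inf"), k + 0.5)
    if kind == "at_least":   # P(X >= k): keep the whole bar at k, so start at k - 0.5
        return (k - 0.5, float("inf"))
    if kind == "exactly":    # P(X = k): just the bar at k
        return (k - 0.5, k + 0.5)
    raise ValueError("kind must be 'at_most', 'at_least' or 'exactly'")

print(corrected_interval("at_most", 55))   # upper boundary 55.5
print(corrected_interval("at_least", 60))  # lower boundary 59.5
print(corrected_interval("exactly", 55))   # the interval [54.5, 55.5]
```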

Standardising

For a normally distributed $X$ with mean $\mu$ and standard deviation $\sigma$, the standardised value is

$$Z = \frac{X - \mu}{\sigma}.$$

Look up the probability in a standard normal table, or use the $P(Z \le z)$ value supplied in the question if a table is not provided.
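Outside the exam room, a table lookup can be reproduced with the error function from Python's standard `math` module, using the identity $P(Z \le z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z / \sqrt{2})\right)$. The function names below are chosen for this sketch.

```python
import math

def z_score(x, mu, sigma):
    """Standardise x against a normal with mean mu and standard deviation sigma."""
    return (x - mu) / sigma

def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal Z, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

z = z_score(59.5, 50, 5)
print(z, round(standard_normal_cdf(z), 4))  # a table gives about 0.9713 for z = 1.9
```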

When to use the approximation

For HSC Extension 1, use it when:

  • $n$ is large enough that summing pmf values is tedious.
  • The question asks for a numerical answer (not exact).
  • The question explicitly says "using the normal approximation".

Otherwise, compute the binomial directly.

How exam questions ask about the normal approximation

These questions are heavily signposted, so the marks are in following the standard sequence without skipping a step:

  • "Using the normal approximation, estimate $P(\ldots)$": the explicit instruction to approximate. Quote $\mu = np$ and $\sigma = \sqrt{np(1 - p)}$, apply the continuity correction, standardise, and look up the $z$-value.
  • "Explain why the normal approximation is appropriate / valid": check and state both conditions, $np \ge 5$ and $n(1 - p) \ge 5$. This is usually a 1 mark justification.
  • "A coin is tossed $100$ times ... probability of at least $60$ heads": a large-$n$ "at least" or "at most" probability that would need a huge pmf sum, the classic cue to approximate. Use the continuity-corrected boundary ($59.5$ here for $X \ge 60$).
  • "State the mean and standard deviation of [the count]": just $np$ and $\sqrt{np(1 - p)}$; sometimes the lead-in to a later approximation part.
  • A given table value such as "use $P(Z \le 1.9) \approx 0.9713$": a strong hint about which $z$-score you should be standardising to; if your $z$ does not match, recheck the continuity correction and the arithmetic.

Exam-style practice questions

Practice questions written in the style of NESA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

HSC-style · 4 marks
A fair coin is tossed $100$ times. Using the normal approximation with a continuity correction, estimate the probability of getting at least $60$ heads. Use $P(Z \le 1.9) \approx 0.9713$.
Worked answer

Let $X \sim B(100, 0.5)$. Then $\mu = np = 50$ and $\sigma^2 = np(1 - p) = 25$, so $\sigma = 5$.

Both $np = 50 \ge 5$ and $n(1 - p) = 50 \ge 5$, so the approximation is valid.

For $P(X \ge 60)$ apply the continuity correction with $60 - 0.5 = 59.5$:

$$P(X \ge 60) \approx P\!\left( Z \ge \frac{59.5 - 50}{5} \right) = P(Z \ge 1.9).$$

$P(Z \ge 1.9) = 1 - P(Z \le 1.9) \approx 1 - 0.9713 = 0.0287$.
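To see how close this estimate is, the sketch below compares it with the exact binomial sum for the same question. It is a check under stated assumptions rather than part of the expected exam working: the function names are illustrative, and the normal probability is computed with `math.erf` instead of a table.

```python
import math

def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal Z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def exact_at_least(n, p, k):
    """Exact P(X >= k) for X ~ B(n, p), summing the pmf from k to n."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

def approx_at_least(n, p, k):
    """Normal approximation to P(X >= k) with the continuity correction k - 0.5."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    z = (k - 0.5 - mu) / sigma
    return 1 - standard_normal_cdf(z)

exact = exact_at_least(100, 0.5, 60)
approx = approx_at_least(100, 0.5, 60)
print(approx, exact, abs(approx - exact))  # the two should differ by less than 0.001
```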

HSC-style · 3 marks
A biased die shows a six with probability $\frac{1}{6}$. It is rolled $180$ times. State the mean and standard deviation of the number of sixes, and explain why the normal approximation is appropriate.
Worked answer

Let $X \sim B(180, \tfrac{1}{6})$ count the sixes.

Mean: $\mu = np = 180 \times \tfrac{1}{6} = 30$.

Variance: $\sigma^2 = np(1 - p) = 180 \times \tfrac{1}{6} \times \tfrac{5}{6} = 25$, so $\sigma = 5$.

The approximation is appropriate because $np = 30 \ge 5$ and $n(1 - p) = 150 \ge 5$, so the binomial is well approximated by $N(30, 25)$.
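The arithmetic above can be confirmed exactly by keeping $p$ as a fraction instead of a decimal. This is a small sketch using Python's standard `fractions` module; the function name `binomial_summary` is invented for the example.

```python
from fractions import Fraction
import math

def binomial_summary(n, p):
    """Return (mean, variance, standard deviation, rule-of-thumb check) for B(n, p)."""
    mean = n * p
    variance = n * p * (1 - p)
    sd = math.sqrt(float(variance))
    valid = mean >= 5 and n * (1 - p) >= 5
    return mean, variance, sd, valid

mean, variance, sd, valid = binomial_summary(180, Fraction(1, 6))
print(mean, variance, sd, valid)  # mean 30, variance 25, sd 5.0, rule satisfied
```

Using `Fraction(1, 6)` keeps the mean and variance exact, so no rounding creeps into the justification step.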
