Statistical Analysis (ME-S1)


When and how do we use the normal distribution to approximate binomial probabilities?

Use the normal approximation $X \sim N(np, np(1 - p))$ to approximate binomial probabilities for large $n$

A focused answer to the HSC Maths Extension 1 dot point on the normal approximation of the binomial. It covers the rule of thumb $np \ge 5$ and $n(1 - p) \ge 5$, the continuity correction, standardising, and computing approximate probabilities, with worked examples.

Generated by Claude Opus. Reviewed by Better Tuition Academy. 7 min answer.


What this dot point is asking

NESA wants you to recognise when the binomial distribution can be approximated by a normal distribution, write down the approximating normal $N(np, np(1 - p))$, apply the continuity correction, and compute approximate probabilities using z-scores.

The answer

The result

If $X \sim B(n, p)$ with $n$ "large" (rule of thumb: both $np \ge 5$ and $n(1 - p) \ge 5$), then

$$X \approx N(\mu, \sigma^2) \quad \text{with} \quad \mu = np, \quad \sigma^2 = np(1 - p).$$

This is a consequence of the central limit theorem: the binomial is a sum of $n$ independent identically distributed Bernoulli trials, and sums of many i.i.d. random variables tend to a normal distribution.
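The sum-of-Bernoulli-trials picture can be checked numerically. The sketch below is illustrative only: the function names, the sample size and the seed are assumptions made for this example. It builds binomial values by adding up individual trials and compares the simulated mean and variance with $np$ and $np(1 - p)$.

```python
import random
from statistics import fmean, pvariance


def simulate_binomial(n, p, rng):
    """Draw one B(n, p) value as a sum of n Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulated_moments(n, p, samples=5000, seed=0):
    """Return the sample mean and variance of simulated B(n, p) values."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    draws = [simulate_binomial(n, p, rng) for _ in range(samples)]
    return fmean(draws), pvariance(draws)


n, p = 100, 0.5
mean, var = simulated_moments(n, p)
print("theory:    ", n * p, n * p * (1 - p))
print("simulation:", round(mean, 2), round(var, 2))
```

The simulated values should land close to the theoretical $\mu = 50$ and $\sigma^2 = 25$, which are the parameters of the approximating normal.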

When the approximation works

The approximation is good when:

  • $n$ is large.
  • $p$ is not too close to $0$ or $1$ (which makes the distribution very skewed).

The HSC rule of thumb is $np \ge 5$ and $n(1 - p) \ge 5$. With $n = 100$, the rule is satisfied for $0.05 \le p \le 0.95$. With $n = 20$, it is satisfied only for $0.25 \le p \le 0.75$.
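The rule of thumb is easy to turn into a quick check. The following is a minimal sketch, with hypothetical helper names `rule_of_thumb_ok` and `valid_p_range` chosen for this example. The second helper rearranges $np \ge 5$ and $n(1 - p) \ge 5$ into $5/n \le p \le 1 - 5/n$.

```python
def rule_of_thumb_ok(n, p):
    """True when both np >= 5 and n(1 - p) >= 5."""
    return n * p >= 5 and n * (1 - p) >= 5


def valid_p_range(n):
    """Return (lowest p, highest p) allowed by the rule, or None if no p works."""
    if n < 10:
        # 5/n would exceed 1/2, so no p can satisfy both conditions at once
        return None
    low = 5 / n
    return low, 1 - low


print(rule_of_thumb_ok(20, 0.5))  # both np and n(1 - p) equal 10
print(rule_of_thumb_ok(20, 0.1))  # np = 2, which is below 5
print(valid_p_range(20))
```

Values of $p$ sitting exactly on a boundary can be affected by floating-point rounding, so treat the check as a guide rather than a strict test.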

When $p$ is very small and $n$ is large, the Poisson approximation is more appropriate, but that is beyond HSC Extension 1.

Continuity correction

The binomial is discrete; the normal is continuous. To improve the approximation, adjust the boundary by $\pm 0.5$.

For $X \sim B(n, p)$ and $k$ an integer,

$$P(X \le k) \approx P\!\left( Z \le \frac{k + 0.5 - \mu}{\sigma} \right),$$

$$P(X \ge k) \approx P\!\left( Z \ge \frac{k - 0.5 - \mu}{\sigma} \right),$$

$$P(X = k) \approx P\!\left( \frac{k - 0.5 - \mu}{\sigma} \le Z \le \frac{k + 0.5 - \mu}{\sigma} \right).$$

The half-unit shift accounts for the fact that the binomial event $X = k$ corresponds to the interval $[k - 0.5, k + 0.5]$ in the continuous picture.
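To see how much the correction helps, the sketch below compares the exact binomial value of $P(X \le k)$ with the normal approximation, with and without the half-unit shift. The function names are assumptions made for this example, and the standard normal CDF is computed from `math.erf` rather than read from a table.

```python
from math import comb, erf, sqrt


def phi(z):
    """Standard normal CDF, P(Z <= z), written in terms of the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p), summing the pmf term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p, correction=True):
    """Normal approximation to P(X <= k), optionally with continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    shift = 0.5 if correction else 0.0
    return phi((k + shift - mu) / sigma)


n, p, k = 100, 0.5, 55
print("exact:             ", binom_cdf(k, n, p))
print("with correction:   ", normal_approx_cdf(k, n, p))
print("without correction:", normal_approx_cdf(k, n, p, correction=False))
```

For these illustrative values the corrected approximation should sit much closer to the exact probability than the uncorrected one.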

Standardising

For a normally distributed $X$ with mean $\mu$ and standard deviation $\sigma$, the standardised value is

$$Z = \frac{X - \mu}{\sigma}.$$

Look up the probability in a standard normal table. If a table is not provided, the empirical rule (about 68%, 95% and 99.7% of values lie within 1, 2 and 3 standard deviations of the mean) gives rough estimates.
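As a worked illustration (the numbers are chosen for this example and do not come from a past paper), take $X \sim B(100, 0.5)$ and approximate $P(X \ge 60)$. Both $np = 50$ and $n(1 - p) = 50$ are at least $5$, so the approximation is reasonable.

$$\mu = np = 50, \qquad \sigma = \sqrt{np(1 - p)} = \sqrt{25} = 5.$$

$$P(X \ge 60) \approx P\!\left( Z \ge \frac{60 - 0.5 - 50}{5} \right) = P(Z \ge 1.9) = 1 - P(Z \le 1.9) \approx 1 - 0.9713 = 0.0287.$$

The value $0.9713$ is the standard normal table entry for $z = 1.90$.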

When to use the approximation

For HSC Extension 1, use it when:

  • $n$ is large enough that summing pmf values is tedious.
  • The question asks for a numerical answer (not exact).
  • The question explicitly says "using the normal approximation".

Otherwise, compute the binomial directly.
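For small $n$ the direct calculation is short. The sketch below uses illustrative values $n = 10$ and $p = 0.2$, where $np = 2$ fails the rule of thumb, and sums the binomial pmf exactly. The function names are assumptions made for this example.

```python
from math import comb


def binom_pmf(k, n, p):
    """Exact P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """Exact P(X <= k), adding the pmf values from 0 up to k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


# P(X <= 1) = 0.8^10 + 10 * 0.2 * 0.8^9
print(round(binom_cdf(1, 10, 0.2), 4))  # 0.3758
```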
