How does the sample proportion behave as a random variable, and how do we use its normal approximation?

Use the sample proportion $\hat{p} = X/n$ as a random variable, with mean $p$ , variance $pq/n$ and standard deviation $\sqrt{pq/n}$ , and apply its normal approximation $\hat{p} \sim N(p,\, pq/n)$

A focused answer to the HSC Maths Extension 1 dot point on sample proportions. What $\hat{p} = X/n$ is, why it is a random variable, its mean $p$ , variance $pq/n$ and standard deviation $\sqrt{pq/n}$ , the normal approximation $\hat{p} \sim N(p, pq/n)$ , computing probabilities about $\hat{p}$ , the effect of sample size, and expected-range reasoning for polling and quality control.

Generated by Claude Opus 4.8. 14 min answer. Updated 2026-06-21.

Reviewed by: AI editorial process; not yet individually human-reviewed

What this dot point is asking

NESA wants you to treat the sample proportion $\hat{p} = X/n$ as a random variable in its own right, know its mean $p$ , its variance $\dfrac{pq}{n}$ and its standard deviation $\sqrt{\dfrac{pq}{n}}$ , and use the normal approximation $\hat{p} \sim N\!\left(p,\, \dfrac{pq}{n}\right)$ to compute probabilities about how close a survey's estimate is likely to be to the true value. This is the one slice of the statistics chapter that is uniquely Extension 1, and it is where binomial theory turns into real polling and quality-control reasoning.

The answer

What a sample proportion is

Run a binomial experiment: $n$ independent Bernoulli trials, each a success with probability $p$ . Let $X \sim B(n, p)$ count the successes. The population proportion $p$ is a fixed (usually unknown) number, for example the true fraction of Australians who support a policy. The sample proportion

$\hat{p} = \frac{X}{n}$

is the fraction of your sample that were successes. If you poll $n = 2000$ voters and $X = 700$ say they support a party, your sample proportion is $\hat{p} = \dfrac{700}{2000} = 0.35$ , an estimate of the true $p$ .

The crucial idea is that $\hat{p}$ is itself a random variable. Run the survey again with a fresh random sample and you get a different $X$ , hence a different $\hat{p}$ . So $\hat{p}$ has its own distribution, mean and spread, and those are exactly what tell us how trustworthy a single survey is.

Why the mean is $p$ and the variance is $pq/n$

These are not new results to memorise blindly; they fall straight out of the binomial mean and variance you already know, $E(X) = np$ and $\operatorname{Var}(X) = npq$ , using the scaling rules $E(aX) = aE(X)$ and $\operatorname{Var}(aX) = a^2 \operatorname{Var}(X)$ with the constant $a = \dfrac{1}{n}$ :

$E(\hat{p}) = E\!\left(\frac{X}{n}\right) = \frac{1}{n} E(X) = \frac{np}{n} = p,$

$\operatorname{Var}(\hat{p}) = \operatorname{Var}\!\left(\frac{X}{n}\right) = \frac{1}{n^2}\operatorname{Var}(X) = \frac{npq}{n^2} = \frac{pq}{n},$

$\text{SD}(\hat{p}) = \sqrt{\frac{pq}{n}} = \frac{\sqrt{npq}}{n} = \frac{\text{SD}(X)}{n}.$

The mean result, $E(\hat{p}) = p$ , is the whole reason surveys work: on average the sample proportion equals the population proportion, so $\hat{p}$ is an unbiased estimator of $p$ . The variance result is the whole reason large samples are better: the variance carries a factor of $\dfrac{1}{n}$ , so the spread shrinks as $n$ grows. Note the scale check too: $\hat{p}$ is a proportion (a number between $0$ and $1$ ), so its SD $\sqrt{pq/n}$ is also a small fraction, never the large count-sized spread $\sqrt{npq}$ that $X$ has.
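As a quick sanity check on the derivation above, the mean and variance of $\hat{p}$ can also be computed by brute force from the binomial probabilities. The sketch below is illustrative only: the function name `phat_moments` is made up for this page, and exact fractions are used so that the comparison with $p$ and $pq/n$ has no rounding error.

```python
from fractions import Fraction
from math import comb

def phat_moments(n, p):
    """Return (mean, variance) of p_hat = X/n, summed directly over the binomial pmf."""
    q = 1 - p
    values = [Fraction(x, n) for x in range(n + 1)]           # possible values of p_hat
    probs = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]  # P(X = x)
    mean = sum(v * w for v, w in zip(values, probs))
    var = sum((v - mean) ** 2 * w for v, w in zip(values, probs))
    return mean, var

n, p = 6, Fraction(1, 2)            # the fair coin tossed six times
mean, var = phat_moments(n, p)
print(mean, var)                     # brute-force values: 1/2 1/24
print(p, p * (1 - p) / n)            # formula values p and pq/n: 1/2 1/24
```

Both lines print the same pair, which is what the scaling rules predict.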

The distribution of $\hat{p}$ is the binomial, restretched

Because $\hat{p} = X/n$ , each value of $\hat{p}$ corresponds to exactly one value of $X$ and carries the same probability:

$P\!\left(\hat{p} = \frac{x}{n}\right) = P(X = x) = \binom{n}{x} p^x q^{\,n-x}.$

So the probability graph of $\hat{p}$ is just the probability graph of $X$ with the horizontal axis relabelled (squashed from the integers $0,1,\ldots,n$ onto the fractions $0, \tfrac1n, \ldots, 1$ ). Nothing about the probabilities changes, only the scale of the axis. That is why every binomial tool you have still applies: to find a probability about $\hat{p}$ exactly, convert it into the matching statement about $X$ and sum binomial terms; to find it quickly for large $n$ , use the normal approximation below.
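The "convert to $X$ and sum binomial terms" route can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the helper name `exact_prob_phat` and the choice to pass the interval endpoints as fractions are assumptions made here to keep the comparisons exact.

```python
from fractions import Fraction
from math import comb

def exact_prob_phat(n, p, lo, hi):
    """P(lo <= p_hat <= hi), found by summing P(X = x) over the matching counts x."""
    q = 1 - p
    total = 0.0
    for x in range(n + 1):
        # p_hat = x/n is one of the fractions 0, 1/n, ..., 1
        if lo <= Fraction(x, n) <= hi:
            total += comb(n, x) * p**x * q**(n - x)
    return total

# Fair coin, six tosses: P(1/3 <= p_hat <= 2/3) is the same event as P(2 <= X <= 4)
print(exact_prob_phat(6, 0.5, Fraction(1, 3), Fraction(2, 3)))  # 50/64 = 0.78125
```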

The normal approximation $\hat{p} \sim N(p,\, pq/n)$

For large $n$ the binomial is well approximated by a normal distribution (the central limit theorem). Dividing through by $n$ carries that approximation over to $\hat{p}$ : it becomes approximately normal with the same mean and variance we just derived,

$\hat{p} \sim N\!\left(p,\ \frac{pq}{n}\right) \qquad\text{when } np \ge 5 \text{ and } nq \ge 5.$

The validity conditions are the same $np \ge 5$ , $nq \ge 5$ rule used for approximating $X$ itself, because $\hat{p}$ and $X$ have identical probabilities. The picture below is the sampling distribution of $\hat{p}$ for a national poll with true support $p = 0.4$ and $n = 1000$ : a bell curve centred exactly on the true value $p$ , with standard deviation $\sqrt{pq/n} \approx 0.0155$ . The shaded band is the central $\pm 2$ standard deviations, the "expected range" almost every poll will fall inside.
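The numbers quoted for the $p = 0.4$ , $n = 1000$ poll can be reproduced with Python's standard-library `statistics.NormalDist`. The sketch below is a hedged illustration: the wrapper `phat_normal` is a made-up name, and it simply refuses to build the approximation when the $np \ge 5$ , $nq \ge 5$ check fails.

```python
from math import sqrt
from statistics import NormalDist

def phat_normal(n, p):
    """Approximating normal N(p, pq/n) for p_hat, after checking np >= 5 and nq >= 5."""
    q = 1 - p
    if n * p < 5 or n * q < 5:
        raise ValueError("normal approximation not valid: need np >= 5 and nq >= 5")
    return NormalDist(mu=p, sigma=sqrt(p * q / n))

dist = phat_normal(1000, 0.4)
sd = dist.stdev
lo, hi = dist.mean - 2 * sd, dist.mean + 2 * sd   # the +/- 2 SD "expected range"
prob = dist.cdf(hi) - dist.cdf(lo)

print(round(sd, 4))                  # standard deviation of p_hat
print(round(lo, 3), round(hi, 3))    # ends of the expected range
print(round(prob, 4))                # chance a poll lands inside it
```

The standard deviation comes out at about $0.0155$ and the band captures roughly $95\%$ of polls, in line with the description above.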

The effect of sample size $n$

The mean of $\hat{p}$ is always $p$ , no matter the sample size, so a bigger sample does not move the centre, it sharpens it. Since

$\text{SD}(\hat{p}) = \sqrt{\frac{pq}{n}},$

the spread is inversely proportional to $\sqrt{n}$ . To halve the standard deviation you must quadruple the sample. The overlay below fixes $p = 0.5$ and stacks the sampling distributions for $n = 100$ , $n = 400$ and $n = 1600$ : each fourfold jump in $n$ halves the standard deviation (from $0.05$ to $0.025$ to $0.0125$ ) and the curve becomes correspondingly taller and tighter around $0.5$ .

This is the link to confidence-style "expected range" reasoning. Because $\hat{p}$ is roughly normal, about $95\%$ of the time a single survey's $\hat{p}$ lands within $\pm 2$ standard deviations of $p$ (more precisely $\pm 1.96$ SD). Read backwards, that interval is the "margin of error": if a poll of $n = 1000$ reports $\hat{p} = 0.40$ , you can say the true value is very likely within $2 \times 0.0155 \approx 0.031$ , i.e. roughly $0.40 \pm 0.03$ . The smaller you want that margin, the larger $n$ has to be, and because of the $\sqrt{n}$ , shrinking the margin by half costs four times the sample.
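The square-root law is easy to see numerically. The following sketch (function names `sd_phat` and `margin_of_error` are illustrative, and the multiplier $z = 2$ follows the rough "two standard deviations" rule used above) tabulates the SD for the three sample sizes and shows that quadrupling a poll from $1000$ to $4000$ only halves the margin.

```python
from math import sqrt

def sd_phat(n, p):
    """Standard deviation of the sample proportion, sqrt(pq/n)."""
    return sqrt(p * (1 - p) / n)

def margin_of_error(n, p, z=2.0):
    """Half-width of the expected range: z standard deviations of p_hat."""
    return z * sd_phat(n, p)

# Each fourfold jump in n halves the SD (p = 0.5)
for n in (100, 400, 1600):
    print(n, round(sd_phat(n, 0.5), 4))

# Quadrupling the poll size halves the margin of error (p = 0.4)
print(round(margin_of_error(1000, 0.4), 3))
print(round(margin_of_error(4000, 0.4), 3))
```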

Exact versus approximate

For small $n$ , do not reach for the normal curve: the values of $\hat{p}$ are too few and chunky. Instead convert the question about $\hat{p}$ into the matching question about $X = n\hat{p}$ and sum binomial terms exactly. For large $n$ (both $np \ge 5$ and $nq \ge 5$ ), the normal approximation $N(p, pq/n)$ replaces a long sum with one or two $z$ -lookups. A continuity correction of $\pm \dfrac{1}{2n}$ (half of the step $\dfrac1n$ between adjacent $\hat{p}$ values) can be applied, but for the large samples typical of polling it is negligible and is usually dropped, which is the convention this page follows.
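To see how much the continuity correction matters, the sketch below compares the exact binomial sum with the normal approximation, with and without the $\pm \dfrac{1}{2n}$ adjustment. It is an illustration under assumptions: the function name `compare_methods` is invented here, the interval is given as a range of counts $x_{\text{lo}}$ to $x_{\text{hi}}$ , and the floating-point binomial sum is only suitable for moderate $n$ such as the values used.

```python
from math import comb, sqrt
from statistics import NormalDist

def compare_methods(n, p, x_lo, x_hi):
    """P(x_lo/n <= p_hat <= x_hi/n) three ways: exact, plain normal, corrected normal."""
    q = 1 - p
    # Exact: sum the binomial terms for the matching counts
    exact = sum(comb(n, x) * p**x * q**(n - x) for x in range(x_lo, x_hi + 1))
    dist = NormalDist(mu=p, sigma=sqrt(p * q / n))
    # Plain normal approximation (the convention this page follows)
    plain = dist.cdf(x_hi / n) - dist.cdf(x_lo / n)
    # Continuity correction: widen each end by half a step, 1/(2n)
    half_step = 1 / (2 * n)
    corrected = dist.cdf(x_hi / n + half_step) - dist.cdf(x_lo / n - half_step)
    return exact, plain, corrected

# Small sample: 20 tosses of a fair coin, 0.40 <= p_hat <= 0.60
print([round(v, 4) for v in compare_methods(20, 0.5, 8, 12)])
# Polling-sized sample: n = 1000, p = 0.4, 0.37 <= p_hat <= 0.43
print([round(v, 4) for v in compare_methods(1000, 0.4, 370, 430)])
```

For the small sample the uncorrected value is noticeably off, while for the poll-sized sample all three values agree to within about half a percentage point.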

How exam questions ask about sample proportions

The wording is the tell. Map the phrase to the move:

"Write down / find the sample proportion": just compute $\hat{p} = X/n$ from the given count. A $1$ mark opener.
"State the mean and standard deviation of $\hat{p}$ ": quote $E(\hat{p}) = p$ and $\text{SD}(\hat{p}) = \sqrt{pq/n}$ (note the square root, and that the SD is a small fraction, not $\sqrt{npq}$ ).
"Show that $E(\hat{p}) = p$ / $\operatorname{Var}(\hat{p}) = pq/n$ ": derive from $E(X) = np$ , $\operatorname{Var}(X) = npq$ using $\hat{p} = X/n$ and the scaling rules $E(aX)=aE(X)$ , $\operatorname{Var}(aX)=a^2\operatorname{Var}(X)$ .
"Use the normal approximation to find $P(\hat{p} \ldots)$ " or "estimate the probability the sample proportion is between ...": confirm $np \ge 5$ and $nq \ge 5$ , write $\hat{p} \sim N(p, pq/n)$ , standardise with $z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$ , look up.
"within $0.0X$ of the true value": this is $P(|\hat{p} - p| \le 0.0X)$ , a symmetric interval $p \pm 0.0X$ ; standardise both ends and use $2\,P(Z \le z) - 1$ .
"What sample size is needed so the estimate is within ... with probability ...": set $z \sqrt{pq/n} \le \text{margin}$ and solve for $n$ , using the worst case $p = 0.5$ if $p$ is unknown.

Exam technique

Lead with the model in one line: " $\hat{p} \sim N\!\left(p, \dfrac{pq}{n}\right)$ with $p = \ldots$ , $n = \ldots$ ", then state $\text{SD} = \sqrt{pq/n}$ as a decimal before you standardise, because markers look for the SD written out. Standardise with $z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$ (divide by the standard deviation, never the variance) and keep $z$ to two decimals to match the table. Convert tails with $P(Z \ge z) = 1 - P(Z \le z)$ and a symmetric "within" interval with $2 P(Z \le z) - 1$ . Always check $np \ge 5$ and $nq \ge 5$ first and say so, since it is an easy mark; if the conditions fail, state that the approximation is invalid and use the exact binomial instead. For a "minimum sample size" part, take $p = 0.5$ (so $pq$ is largest) when $p$ is unknown, and round $n$ up.

Worked example

State the distribution of the sample proportion for a coin tossed six times

A fair coin is tossed $6$ times. Let $X$ be the number of heads and $\hat{p} = X/6$ the sample proportion of heads. Find the possible values of $\hat{p}$ , its mean, variance and standard deviation, and compare the spread of $\hat{p}$ with the spread of $X$ .

Values: $X$ takes $0,1,\ldots,6$ , so $\hat{p} = X/6$ takes the $7$ values $0, \tfrac16, \tfrac26, \tfrac36, \tfrac46, \tfrac56, 1$ . Each carries the same probability as the matching $X$ , for instance $P(\hat{p} = \tfrac36) = P(X = 3) = \binom{6}{3}(0.5)^6 = \tfrac{20}{64}$ .
Mean: With $p = 0.5$ , $E(\hat{p}) = p = 0.5$ .
Variance: With $q = 0.5$ and $n = 6$ ,
$\operatorname{Var}(\hat{p}) = \frac{pq}{n} = \frac{0.25}{6} = 0.041\overline{6} \approx 0.0417.$
Standard deviation: $\text{SD}(\hat{p}) = \sqrt{0.0417} \approx 0.2041$ .
Compare with $X$: For $X$ , $\text{SD}(X) = \sqrt{npq} = \sqrt{6 \times 0.25} = \sqrt{1.5} \approx 1.2247$ . Dividing by $n = 6$ gives $1.2247 / 6 \approx 0.2041$ , matching $\text{SD}(\hat{p})$ exactly, confirming $\text{SD}(\hat{p}) = \text{SD}(X)/n$ .

Estimate a polling probability with the normal approximation

A poll of $n = 1000$ voters estimates support for a referendum whose true level is $p = 0.40$ . Use the normal approximation to find the probability that the poll's sample proportion $\hat{p}$ lies between $0.37$ and $0.43$ (that is, within $0.03$ of the true value). Use $P(Z \le 1.94) \approx 0.9738$ .

Check the conditions. $np = 1000 \times 0.40 = 400 \ge 5$ and $nq = 1000 \times 0.60 = 600 \ge 5$ , so the approximation is valid.

Parameters. With $q = 0.60$ ,

$\text{mean} = p = 0.40, \qquad \operatorname{Var}(\hat{p}) = \frac{pq}{n} = \frac{0.24}{1000} = 0.00024,$

$\text{SD}(\hat{p}) = \sqrt{0.00024} \approx 0.01549.$

So $\hat{p} \sim N(0.40,\, 0.00024)$ .

Standardise the symmetric interval: Both endpoints are $0.03$ from the mean:
$z = \frac{0.43 - 0.40}{0.01549} \approx 1.94, \qquad \frac{0.37 - 0.40}{0.01549} \approx -1.94.$
Read the probability: $P(0.37 \le \hat{p} \le 0.43) \approx P(-1.94 \le Z \le 1.94) = 2\,P(Z \le 1.94) - 1 \approx 2(0.9738) - 1 = 0.9476.$
Final answer: about $0.948$ . A poll of $1000$ has roughly a $95\%$ chance of landing within $3$ percentage points of the truth, which is the basis of the familiar " $\pm 3\%$ margin of error".

Estimate a quality-control probability

A bottling line's true defective rate is $p = 0.05$ . An inspector samples $n = 400$ bottles. Using the normal approximation, estimate the probability that the sample defective proportion $\hat{p}$ exceeds $0.07$ . Use $P(Z \le 1.84) \approx 0.9671$ .

Check the conditions: $np = 400 \times 0.05 = 20 \ge 5$ and $nq = 400 \times 0.95 = 380 \ge 5$ , valid.
Parameters: With $q = 0.95$ ,
$\text{mean} = p = 0.05, \qquad \operatorname{Var}(\hat{p}) = \frac{pq}{n} = \frac{0.0475}{400} = 0.00011875,$

$\text{SD}(\hat{p}) = \sqrt{0.00011875} \approx 0.01090.$
Standardise: $z = \frac{0.07 - 0.05}{0.01090} \approx 1.84.$
Read the probability: $P(\hat{p} > 0.07) \approx P(Z > 1.84) = 1 - P(Z \le 1.84) \approx 1 - 0.9671 = 0.0329.$
Final answer: about $0.033$ . There is roughly a $3\%$ chance an in-spec line shows a sample defective rate above $7\%$ just from sampling variation, so a single high sample is not on its own proof the line has drifted.
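The same upper-tail calculation can be checked without a table. This is a minimal sketch, assuming a helper named `upper_tail_phat` (not part of any syllabus or library) built on the standard-library `statistics.NormalDist`; because it does not round $z$ to two decimals, its answer can differ from the table-based one in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_phat(n, p, threshold):
    """Return (z, P(p_hat > threshold)) under the normal approximation N(p, pq/n)."""
    q = 1 - p
    if n * p < 5 or n * q < 5:
        raise ValueError("normal approximation not valid: need np >= 5 and nq >= 5")
    sd = sqrt(p * q / n)            # divide by the SD, never the variance
    z = (threshold - p) / sd
    return z, 1 - NormalDist().cdf(z)

z, tail = upper_tail_phat(400, 0.05, 0.07)
print(round(z, 2), round(tail, 3))
```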

Derive the mean and variance of the sample proportion

Starting only from $X \sim B(n, p)$ with $E(X) = np$ and $\operatorname{Var}(X) = npq$ , show that $E(\hat{p}) = p$ and $\operatorname{Var}(\hat{p}) = \dfrac{pq}{n}$ .

Set up: By definition $\hat{p} = \dfrac{X}{n} = \dfrac{1}{n}X$ , a constant $\dfrac1n$ times the random variable $X$ .
Mean (using $E(aX) = aE(X)$ ): $E(\hat{p}) = E\!\left(\frac{1}{n}X\right) = \frac{1}{n}E(X) = \frac{1}{n}\cdot np = p.$
Variance (using $\operatorname{Var}(aX) = a^2\operatorname{Var}(X)$ ): $\operatorname{Var}(\hat{p}) = \operatorname{Var}\!\left(\frac{1}{n}X\right) = \frac{1}{n^2}\operatorname{Var}(X) = \frac{1}{n^2}\cdot npq = \frac{pq}{n}.$
Conclusion: $\hat{p}$ is unbiased ( $E(\hat{p}) = p$ ) and its variance $\dfrac{pq}{n} \to 0$ as $n \to \infty$ , so the estimate concentrates on $p$ for large samples.

Find the sample size for a target margin of error

A pollster wants a $95\%$ chance that the sample proportion is within $0.02$ of the true value $p$ . Find the smallest sample size, taking the worst case $p = 0.5$ and using $z = 1.96$ .

Translate the requirement: "Within $0.02$ with probability $0.95$ " means $1.96 \times \text{SD}(\hat{p}) \le 0.02$ , where $\text{SD}(\hat{p}) = \sqrt{pq/n}$ .
Worst case: $pq$ is largest at $p = 0.5$ ( $pq = 0.25$ ), which gives the largest required $n$ and is safe for any true $p$ :
$1.96\sqrt{\frac{0.25}{n}} \le 0.02.$
Solve: Squaring,
$1.96^2 \cdot \frac{0.25}{n} \le 0.0004 \;\Longrightarrow\; n \ge \frac{3.8416 \times 0.25}{0.0004} = 2401.$
Final answer: $n = 2401$ . This is exactly why national polls quoting a $\pm 2\%$ margin sample around $2000$ to $2500$ people.
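The rearrangement $n \ge \dfrac{z^2 pq}{\text{margin}^2}$ can be wrapped in a small function. The sketch below is illustrative: `min_sample_size` is a made-up name, and the inputs are passed as decimal strings and converted to exact fractions, a deliberate choice so that a borderline case like $2401$ is not pushed up to $2402$ by floating-point rounding.

```python
from fractions import Fraction
from math import ceil

def min_sample_size(margin, z="1.96", p="0.5"):
    """Smallest n with z * sqrt(pq/n) <= margin, i.e. n >= z^2 * pq / margin^2."""
    margin, z, p = Fraction(margin), Fraction(z), Fraction(p)
    return ceil(z**2 * p * (1 - p) / margin**2)   # round up to guarantee the bound

print(min_sample_size("0.02"))   # 2401, the worked example above
print(min_sample_size("0.01"))   # 9604: half the margin costs four times the sample
```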

Common traps

Using $\sqrt{npq}$ instead of $\sqrt{pq/n}$: That is the SD of the count $X$ , not of the proportion $\hat{p}$ . The proportion's SD is the small fraction $\sqrt{pq/n} = \sqrt{npq}/n$ . Mixing them is the single most common error.
Dividing by the variance, not the standard deviation: Standardise with $z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$ . The denominator is the SD, not $pq/n$ .
Forgetting the validity check: Always confirm $np \ge 5$ and $nq \ge 5$ before approximating. If they fail (small $n$ or extreme $p$ ), the normal curve is a poor fit, use the exact binomial.
Thinking a bigger sample shifts the centre: It does not. $E(\hat{p}) = p$ for every $n$ . A larger sample only shrinks the spread (in proportion to $1/\sqrt{n}$ ); it does not change what $\hat{p}$ is centred on.
Linear thinking about sample size: Halving the margin of error needs four times the sample, not twice, because the SD carries $\sqrt{n}$ , not $n$ .

Practice questions

Original practice questions graded from foundation to exam level, each with a full worked solution. Try them before reading the solution.

Foundation (3 marks). A market-research firm surveys $50$ randomly chosen Sydney commuters and finds that $18$ used a train at least once last week. Write down the sample proportion $\hat{p}$ . If the true population proportion is $p = 0.4$ , state the mean, variance and standard deviation of $\hat{p}$ for a sample of this size.

Sample proportion: With $X = 18$ successes out of $n = 50$ ,
$\hat{p} = \frac{X}{n} = \frac{18}{50} = 0.36.$
Mean: $E(\hat{p}) = p = 0.4$ .
Variance: With $q = 1 - p = 0.6$ ,
$\operatorname{Var}(\hat{p}) = \frac{pq}{n} = \frac{0.4 \times 0.6}{50} = \frac{0.24}{50} = 0.0048.$
Standard deviation: $\sqrt{0.0048} \approx 0.0693$ .

So this one survey produced an estimate of $0.36$ , and the estimator $\hat{p}$ is centred on the true value $0.4$ with a spread of about $0.069$ .

Foundation (3 marks). A fair coin is tossed $4$ times and $\hat{p}$ is the proportion of heads. List the possible values of $\hat{p}$ , then find its mean and standard deviation.

Possible values: $X$ can be $0,1,2,3,4$ , so $\hat{p} = X/4$ takes the $5$ values
$0,\ \tfrac14,\ \tfrac12,\ \tfrac34,\ 1 \quad\text{i.e.}\quad 0,\ 0.25,\ 0.5,\ 0.75,\ 1.$
Mean: $E(\hat{p}) = p = 0.5$ .
Standard deviation: With $p = q = 0.5$ and $n = 4$ ,
$\operatorname{Var}(\hat{p}) = \frac{pq}{n} = \frac{0.25}{4} = 0.0625, \qquad \text{SD} = \sqrt{0.0625} = 0.25.$

Note how few values $\hat{p}$ has and how large the spread is: with only $4$ trials a single survey tells you very little about $p$ .

Core (4 marks). A streaming service estimates that a new show has a true national audience share of $p = 0.20$ . A ratings panel of $n = 625$ households is sampled. Using the normal approximation to $\hat{p}$ , estimate the probability that the panel's sample proportion lies between $0.18$ and $0.23$ . Use $P(Z \le 1.88) \approx 0.9699$ and $P(Z \le 1.25) \approx 0.8944$ .

Set up the model. Here $p = 0.20$ , $q = 0.80$ , $n = 625$ . Check the conditions: $np = 125 \ge 5$ and $nq = 500 \ge 5$ , so the normal approximation is valid.

Parameters of the approximating normal.

$\text{mean} = p = 0.20, \qquad \operatorname{Var}(\hat{p}) = \frac{pq}{n} = \frac{0.16}{625} = 0.000256,$

$\text{SD} = \sqrt{0.000256} = 0.016.$

So $\hat{p} \approx N(0.20,\, 0.000256)$ .

Standardise both endpoints.

$z_{\text{lower}} = \frac{0.18 - 0.20}{0.016} = -1.25, \qquad z_{\text{upper}} = \frac{0.23 - 0.20}{0.016} = 1.875 \approx 1.88.$

Read the probability.

$P(0.18 \le \hat{p} \le 0.23) \approx P(-1.25 \le Z \le 1.88) = P(Z \le 1.88) - P(Z \le -1.25).$

The lower endpoint is negative, so use symmetry: $P(Z \le -1.25) = 1 - P(Z \le 1.25) = 1 - 0.8944 = 0.1056$ . Hence

$P(-1.25 \le Z \le 1.88) \approx 0.9699 - 0.1056 = 0.8643.$

Answer. About $0.864$ , so roughly an $86\%$ chance the panel's share lands in $[0.18, 0.23]$ .

Core (4 marks). A bottling plant claims its true defective rate is $p = 0.02$ . A quality inspector samples $n = 900$ bottles. Find the standard deviation of the sample proportion $\hat{p}$ , then use the normal approximation to estimate the probability that the inspector observes a sample defective rate greater than $0.03$ . Use $P(Z \le 2.14) \approx 0.9838$ .

Conditions: $np = 900 \times 0.02 = 18 \ge 5$ and $nq = 900 \times 0.98 = 882 \ge 5$ , so the approximation is valid.
Standard deviation: With $q = 0.98$ ,
$\operatorname{Var}(\hat{p}) = \frac{pq}{n} = \frac{0.02 \times 0.98}{900} = \frac{0.0196}{900} \approx 0.0000218,$

$\text{SD} = \sqrt{0.0000218} \approx 0.004667.$
Standardise: For $\hat{p} > 0.03$ ,
$z = \frac{0.03 - 0.02}{0.004667} \approx 2.14.$

Read the probability.

$P(\hat{p} > 0.03) \approx P(Z > 2.14) = 1 - P(Z \le 2.14) \approx 1 - 0.9838 = 0.0162.$

Answer. About $0.016$ . So even though the claimed rate is only $2\%$ , there is roughly a $1.6\%$ chance a clean batch shows a sample rate above $3\%$ purely by sampling variation, worth remembering before raising an alarm.

Exam (5 marks). A polling company wants to estimate the proportion $p$ of voters supporting a referendum. It requires a $95\%$ chance that its sample proportion $\hat{p}$ falls within $0.02$ of the true value $p$ . Taking the worst case $p = 0.5$ , and using $z = 1.96$ for the central $95\%$ of a normal distribution, find the smallest sample size $n$ the company should use.

Translate the requirement: "Within $0.02$ of $p$ with probability $0.95$ " means the half-width of the central $95\%$ interval of $\hat{p}$ must be at most $0.02$ :
$1.96 \times \text{SD}(\hat{p}) \le 0.02, \qquad \text{SD}(\hat{p}) = \sqrt{\frac{pq}{n}}.$
Substitute the worst case: The product $pq$ is largest at $p = 0.5$ , where $pq = 0.25$ . Using $p = 0.5$ gives the most demanding (largest) $n$ , which is safe for any true $p$ :
$1.96 \sqrt{\frac{0.25}{n}} \le 0.02.$
Solve for $n$: Square both sides:
$1.96^2 \cdot \frac{0.25}{n} \le 0.02^2 \;\Longrightarrow\; n \ge \frac{1.96^2 \times 0.25}{0.02^2} = \frac{3.8416 \times 0.25}{0.0004} = 2401.$
Answer: The company needs $n = 2401$ voters (round up to guarantee the bound). This is the textbook "margin of error $\pm 2\%$ " sample size for a national poll, and it explains why such polls quote samples of roughly $2000$ to $2500$ people.

Exam (5 marks). Two opinion polls estimate the same true support level $p = 0.45$ . Poll A samples $n_A = 400$ voters; Poll B samples $n_B = 1600$ . (a) Compare the standard deviations of $\hat{p}$ for the two polls. (b) Using the normal approximation, estimate for each poll the probability that $\hat{p}$ falls within $0.03$ of $p$ . Use $P(Z \le 1.21) \approx 0.8869$ and $P(Z \le 2.41) \approx 0.9920$ .

(a) Standard deviations. With $p = 0.45$ , $q = 0.55$ so $pq = 0.2475$ .

$\text{SD}_A = \sqrt{\frac{0.2475}{400}} \approx 0.02487, \qquad \text{SD}_B = \sqrt{\frac{0.2475}{1600}} \approx 0.01244.$

Because $n_B = 4 n_A$ and the SD has $\sqrt{n}$ in the denominator, quadrupling the sample halves the standard deviation: $\text{SD}_A / \text{SD}_B = \sqrt{1600/400} = 2$ .

(b) Probability within $0.03$ for Poll A.

$z_A = \frac{0.03}{0.02487} \approx 1.21,$

$P(|\hat{p} - p| \le 0.03) \approx P(-1.21 \le Z \le 1.21) = 2 \times 0.8869 - 1 = 0.7738.$

Probability within $0.03$ for Poll B.

$z_B = \frac{0.03}{0.01244} \approx 2.41,$

$P(|\hat{p} - p| \le 0.03) \approx 2 \times 0.9920 - 1 = 0.9840.$

Answer. Poll A has about a $77\%$ chance of landing within $0.03$ of the truth; Poll B about $98\%$ . Halving the standard deviation sharply tightens the estimate, which is why larger samples are worth the cost.
