
How do we describe a discrete random variable and summarise its distribution with mean and variance?

Define a discrete random variable by its probability distribution, and calculate the expected value, variance and standard deviation

A focused answer to the HSC Maths Advanced dot point on discrete random variables. Probability distributions, expected value, variance, standard deviation, and linear transformations of a discrete random variable, with worked examples.

Generated by Claude Opus 4.814 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed


What this dot point is asking

NESA wants you to recognise a discrete random variable, check that its probability distribution is valid, compute the expected value and variance from the distribution, and apply the linear-transformation rules to $aX + b$. Everything starts from the probability distribution, the list of values with their probabilities, so reading and validating that table is the first marked move in almost every question.

The answer

[Figure: A discrete probability distribution. A spike graph of the distribution of X over the values 0, 1, 2 and 3 with probabilities 0.1, 0.4, 0.3 and 0.2; horizontal axis x, vertical axis P(X = x). Step 1: each spike height is P(X = x). The four probabilities sum to 0.1 + 0.4 + 0.3 + 0.2 = 1, so the distribution is valid.]

Discrete random variables and their distributions

A discrete random variable $X$ takes a countable list of values $x_1, x_2, \dots, x_n$ with probabilities $p_i = P(X = x_i)$. The list of values with their probabilities is the probability distribution of $X$. For it to be valid, two conditions must hold:

  • $0 \le p_i \le 1$ for every $i$ (each is a genuine probability),
  • $\sum_i p_i = 1$ (something must happen).

The spike graph above is the natural picture: each value sits on the horizontal axis and the height of its spike is its probability, so the heights are the $p_i$ and they must add to $1$. The probability that $X$ falls in some set is the sum of $p_i$ for the values in that set. For example, $P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2)$ if $X$ takes integer values from $0$.
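As a rough sketch of the two validity checks and the "add the relevant probabilities" idea, the snippet below stores the distribution from the spike graph as a Python dictionary. The function name `is_valid_distribution` and the tolerance are illustrative choices, not part of the syllabus.

```python
import math

def is_valid_distribution(dist, tol=1e-9):
    """Check both conditions: every p in [0, 1] and the p's sum to 1."""
    in_range = all(0 <= p <= 1 for p in dist.values())
    sums_to_one = math.isclose(sum(dist.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Distribution from the spike graph: value -> probability
dist = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}

valid = is_valid_distribution(dist)

# P(X <= 2) is the sum of the probabilities of the values in that set
p_at_most_2 = sum(p for x, p in dist.items() if x <= 2)

print(valid)                  # True
print(round(p_at_most_2, 2))  # 0.8
```

The tolerance is there only because decimals such as 0.1 are not stored exactly in floating point; by hand you would simply add the entries.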

Expected value

The expected value (or mean) of $X$ is the long-run average value if we repeated the experiment many times. It is the weighted sum

$$E(X) = \mu = \sum_i x_i \, p_i.$$

The expected value need not be one of the values $X$ can actually take; it is a balance point, not an outcome.
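A minimal sketch of the weighted sum, using the spike-graph distribution and exact fractions so the arithmetic matches hand working. The helper name `expected_value` is an illustrative choice.

```python
from fractions import Fraction

def expected_value(dist):
    """Weighted sum of values, each multiplied by its probability."""
    return sum(x * p for x, p in dist.items())

# Spike-graph distribution written with exact fractions
dist = {
    0: Fraction(1, 10),
    1: Fraction(4, 10),
    2: Fraction(3, 10),
    3: Fraction(2, 10),
}

mean = expected_value(dist)
print(float(mean))   # 1.6
print(mean in dist)  # False: the mean is not one of the values X can take
```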

Expected value of a function of $X$

For any function $g$,

$$E(g(X)) = \sum_i g(x_i) \, p_i.$$

The most common case is $g(x) = x^2$, which gives

$$E(X^2) = \sum_i x_i^2 \, p_i.$$

This is the quantity you build to find the variance, so it is worth setting up as its own column of working.
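To illustrate, here is a small sketch in which the function $g$ is passed in as an argument, again using the spike-graph distribution. The helper name `expect` is hypothetical.

```python
from fractions import Fraction

def expect(dist, g=lambda x: x):
    """E(g(X)): apply g to each value, then take the weighted sum."""
    return sum(g(x) * p for x, p in dist.items())

dist = {
    0: Fraction(1, 10),
    1: Fraction(4, 10),
    2: Fraction(3, 10),
    3: Fraction(2, 10),
}

e_x = expect(dist)                     # g(x) = x gives the ordinary mean
e_x2 = expect(dist, lambda x: x ** 2)  # g(x) = x^2

print(float(e_x), float(e_x2))  # 1.6 3.4
```

Note that $E(X^2) = 3.4$ is not the same as $[E(X)]^2 = 2.56$; the square goes on each value before weighting.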

Variance and standard deviation

The variance of $X$ measures spread around the mean. By definition it is the expected squared deviation,

$$\text{Var}(X) = \sigma^2 = E((X - \mu)^2) = \sum_i (x_i - \mu)^2 p_i,$$

which is algebraically equivalent to (and almost always harder to compute than)

$$\text{Var}(X) = E(X^2) - [E(X)]^2.$$

The standard deviation is $\sigma = \sqrt{\text{Var}(X)}$, in the same units as $X$, which is why it is the spread measure you can compare directly against the mean.
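The sketch below computes the variance of the spike-graph distribution both ways, to show that the definition and the computational formula agree. Variable names are illustrative.

```python
from fractions import Fraction
import math

dist = {
    0: Fraction(1, 10),
    1: Fraction(4, 10),
    2: Fraction(3, 10),
    3: Fraction(2, 10),
}

mean = sum(x * p for x, p in dist.items())  # E(X)

# Definition: expected squared deviation from the mean
var_definition = sum((x - mean) ** 2 * p for x, p in dist.items())

# Computational formula: E(X^2) - [E(X)]^2
var_shortcut = sum(x ** 2 * p for x, p in dist.items()) - mean ** 2

std_dev = math.sqrt(var_shortcut)

print(var_definition, var_shortcut)  # 21/25 21/25
print(round(std_dev, 4))             # 0.9165
```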

Linear transformations

If $Y = aX + b$ for constants $a$ and $b$,

$$E(Y) = a E(X) + b, \qquad \text{Var}(Y) = a^2 \text{Var}(X), \qquad \sigma_Y = |a| \sigma_X.$$

Shifting $X$ by $b$ slides the mean but leaves the spread untouched; scaling by $a$ multiplies the mean by $a$ and the standard deviation by $|a|$ (and the variance by $a^2$). These rules let you find the mean and variance of $Y$ without rebuilding any sums.
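One way to convince yourself of these rules is to rebuild the sums once and compare. The sketch below transforms every value of the spike-graph distribution and checks the result against the rules; the constants $a = -2$ and $b = 5$ are arbitrary illustrative choices, with a negative $a$ picked to show why the standard deviation uses $|a|$.

```python
from fractions import Fraction

def mean_and_variance(dist):
    """Return (E(X), Var(X)) using Var(X) = E(X^2) - [E(X)]^2."""
    mean = sum(x * p for x, p in dist.items())
    variance = sum(x ** 2 * p for x, p in dist.items()) - mean ** 2
    return mean, variance

dist_x = {
    0: Fraction(1, 10),
    1: Fraction(4, 10),
    2: Fraction(3, 10),
    3: Fraction(2, 10),
}

a, b = -2, 5  # illustrative constants, not from any exam question

# Y = aX + b takes the transformed values with the same probabilities
dist_y = {a * x + b: p for x, p in dist_x.items()}

mean_x, var_x = mean_and_variance(dist_x)
mean_y, var_y = mean_and_variance(dist_y)

print(mean_y == a * mean_x + b)  # True
print(var_y == a ** 2 * var_x)   # True
```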

Reading a distribution and finding its mean, stage by stage

The two diagrams here use the distribution $P(X=0)=0.1$, $P(X=1)=0.4$, $P(X=2)=0.3$, $P(X=3)=0.2$.

Stage 1, read the distribution and check it is valid. Whether it arrives as a two-row table or as the spike graph above, the first move is the same: confirm the probabilities are between $0$ and $1$ and sum to $1$. Here $0.1 + 0.4 + 0.3 + 0.2 = 1$, so the distribution is valid and you can build calculation columns from it. (If a constant were involved, you would solve $\sum p_i = 1$ for it first.)

Stage 2, find the expected value as the balance point. The mean is the weighted sum $E(X) = 0(0.1) + 1(0.4) + 2(0.3) + 3(0.2) = 1.6$. Picture the probabilities as weights placed along the axis: $E(X)$ is the point where the bar would balance, marked by the fulcrum below. Note that $1.6$ is not one of the values $X$ can take, which is exactly what "balance point, not an outcome" means.

[Figure: The expected value is the balance point. The same spike graph with a triangular fulcrum marking the mean of X at 1.6 on the axis, the balance point of the probability weights; horizontal axis x, vertical axis P(X = x). Step 2: E(X) = Σ x P(X = x) = 0(0.1) + 1(0.4) + 2(0.3) + 3(0.2) = 1.6. It is the balance point of the distribution, not a value X must take.]

Presenting a distribution as a table

In the exam a discrete distribution is usually laid out as a two-row table: the values $x_i$ on top and the probabilities $p_i$ underneath. Reading it correctly is the first marked step. Check the probabilities sum to $1$ (solve for any unknown if a constant is involved), then build the calculation columns you need: $x_i p_i$ for the mean and $x_i^2 p_i$ for $E(X^2)$. Laying the work out in columns keeps the arithmetic tidy and is exactly what markers look for.
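The column layout can be mimicked in code. This is only a sketch of the bookkeeping for the spike-graph distribution, with one tuple per row of working; the column order and formatting are illustrative.

```python
# Spike-graph distribution: value -> probability
dist = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}

# One row per value: (x, p, x*p, x^2*p)
rows = [(x, p, x * p, x ** 2 * p) for x, p in dist.items()]

print(f"{'x':>3} {'p':>6} {'x*p':>6} {'x^2*p':>7}")
for x, p, xp, x2p in rows:
    print(f"{x:>3} {p:>6.2f} {xp:>6.2f} {x2p:>7.2f}")

# Column totals give E(X) and E(X^2); the variance follows from them
mean = sum(row[2] for row in rows)
e_x2 = sum(row[3] for row in rows)
variance = e_x2 - mean ** 2

print(f"E(X) = {mean:.2f}, E(X^2) = {e_x2:.2f}, Var(X) = {variance:.2f}")
```

The last line should report a mean of 1.60, $E(X^2)$ of 3.40 and a variance of 0.84, matching the totals you would write under each column by hand.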

Interpreting expected value and variance

The expected value is the balance point of the distribution: if you placed the probabilities as weights along a number line, $E(X)$ is where it would balance, as the fulcrum in the diagram shows. The variance measures how widely the values spread around that balance point, in squared units, and the standard deviation brings it back to the original units so it can be compared with the mean. A small standard deviation means the outcomes cluster tightly around the mean; a large one means they are spread out. This interpretation is what justifies the linear-transformation rules: shifting every value left or right slides the balance point but leaves the spread untouched, while stretching the scale stretches both.

Why $E(X^2) - \mu^2$ is the practical formula

The definition $\text{Var}(X) = \sum (x_i - \mu)^2 p_i$ is conceptually clear but arithmetically painful because it subtracts $\mu$ inside every term. The equivalent $E(X^2) - \mu^2$ is almost always faster: build one extra column of $x_i^2 p_i$, sum it, and subtract the square of the mean. The two formulas are algebraically identical, so use the second to compute and quote the first to explain.
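For completeness, the equivalence follows by expanding the square and using $\sum_i x_i p_i = \mu$ and $\sum_i p_i = 1$:

```latex
\begin{aligned}
\text{Var}(X) &= \sum_i (x_i - \mu)^2 p_i \\
              &= \sum_i x_i^2 p_i - 2\mu \sum_i x_i p_i + \mu^2 \sum_i p_i \\
              &= E(X^2) - 2\mu \cdot \mu + \mu^2 \cdot 1 \\
              &= E(X^2) - \mu^2 .
\end{aligned}
```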

How exam questions ask about discrete random variables

  • "Show that the table is a valid probability distribution" or "find the value of $k$." Check $0 \le p_i \le 1$ and solve $\sum p_i = 1$ for any unknown.
  • "Find $P(X \le 2)$" or "$P(X \ge 1)$." Add the relevant $p_i$; for "at least" it is often quicker to use $1 - P(\text{the rest})$.
  • "Find the expected value / mean." Compute the weighted sum $\sum x_i p_i$, showing the products.
  • "Find the variance / standard deviation." Build $E(X^2) = \sum x_i^2 p_i$, then $\text{Var}(X) = E(X^2) - \mu^2$, then square-root for $\sigma$.
  • "Let $Y = aX + b$. Find $E(Y)$ and $\text{Var}(Y)$ / $\sigma_Y$." Apply $E(Y) = aE(X) + b$, $\text{Var}(Y) = a^2\text{Var}(X)$, $\sigma_Y = |a|\sigma_X$.
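As a quick illustration of the "at least" shortcut in the list above, the sketch below finds $P(X \ge 1)$ for the spike-graph distribution both by adding the relevant probabilities and by using the complement. Variable names are illustrative.

```python
from fractions import Fraction

dist = {
    0: Fraction(1, 10),
    1: Fraction(4, 10),
    2: Fraction(3, 10),
    3: Fraction(2, 10),
}

# Direct: add P(X = 1) + P(X = 2) + P(X = 3)
p_direct = sum(p for x, p in dist.items() if x >= 1)

# Complement: 1 - P(X = 0), a single subtraction
p_complement = 1 - dist[0]

print(p_direct, p_complement)  # 9/10 9/10
```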

Edge cases worth knowing

  • An unknown probability via the sum. If one entry is missing or given as $k$, find it from $\sum p_i = 1$ before any mean or variance work.
  • The mean is not an attainable value. $E(X)$ is a balance point, so a fair die has mean $3.5$ even though you can never roll $3.5$. Do not "round it to a face".
  • A symmetric distribution. If the probabilities are symmetric about a central value, that value is the mean immediately, with no weighted sum required.
  • Negative-looking variance. Variance is a sum of squared terms times probabilities, so it can never be negative; a negative result signals an arithmetic slip, usually $[E(X)]^2$ confused with $E(X^2)$.
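Two of these edge cases can be sketched in a few lines. The first distribution below is hypothetical, invented only for illustration: $P(X = x)$ is assumed to be $k$, $2k$, $3k$, $4k$ for $x = 0, 1, 2, 3$. The second is the fair die mentioned above.

```python
from fractions import Fraction

# Hypothetical table: P(X = x) = coefficient * k (made up for illustration)
coefficients = {0: 1, 1: 2, 2: 3, 3: 4}

# Sum of probabilities must be 1, so (1 + 2 + 3 + 4) k = 1
k = Fraction(1, sum(coefficients.values()))
dist = {x: c * k for x, c in coefficients.items()}
mean = sum(x * p for x, p in dist.items())

# Fair die: each face has probability 1/6
die = {face: Fraction(1, 6) for face in range(1, 7)}
die_mean = sum(face * p for face, p in die.items())

print(k, mean)          # 1/10 2
print(float(die_mean))  # 3.5, not a face of the die
```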

Exam-style practice questions

Practice questions written in the style of NESA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

2022 HSC Q24 · 4 marks

The discrete random variable $X$ has probability distribution $P(X = 0) = 0.2$, $P(X = 1) = 0.5$, $P(X = 2) = 0.2$, $P(X = 3) = 0.1$. Find $E(X)$ and $\text{Var}(X)$.

$E(X) = \sum x P(X = x) = 0 \cdot 0.2 + 1 \cdot 0.5 + 2 \cdot 0.2 + 3 \cdot 0.1 = 0 + 0.5 + 0.4 + 0.3 = 1.2$.

For the variance, first compute $E(X^2) = 0 \cdot 0.2 + 1 \cdot 0.5 + 4 \cdot 0.2 + 9 \cdot 0.1 = 0.5 + 0.8 + 0.9 = 2.2$.

$\text{Var}(X) = E(X^2) - [E(X)]^2 = 2.2 - 1.44 = 0.76$.

Markers reward the explicit weighted sum for $E(X)$, the use of $E(X^2) - \mu^2$ for the variance, and clean arithmetic.
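If you want to check this kind of arithmetic after an attempt, a short script along these lines reproduces the three numbers in the worked answer. Exact fractions are used so there is no rounding to worry about.

```python
from fractions import Fraction

# Distribution from the question
dist = {
    0: Fraction(2, 10),
    1: Fraction(5, 10),
    2: Fraction(2, 10),
    3: Fraction(1, 10),
}

mean = sum(x * p for x, p in dist.items())       # E(X)
e_x2 = sum(x ** 2 * p for x, p in dist.items())  # E(X^2)
variance = e_x2 - mean ** 2                      # E(X^2) - [E(X)]^2

print(float(mean), float(e_x2), float(variance))  # 1.2 2.2 0.76
```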

2021 HSC Q25 · 3 marks

A discrete random variable $X$ has $E(X) = 5$ and $\text{Var}(X) = 4$. Let $Y = 3X - 2$. Find $E(Y)$ and the standard deviation of $Y$.

Linearity of expectation: $E(Y) = E(3X - 2) = 3E(X) - 2 = 3(5) - 2 = 13$.

Variance scales by the square of the coefficient and is unchanged by adding a constant: $\text{Var}(Y) = \text{Var}(3X - 2) = 9\,\text{Var}(X) = 36$.

Standard deviation: $\sigma_Y = \sqrt{36} = 6$.

Markers expect explicit use of $E(aX + b) = aE(X) + b$ and $\text{Var}(aX + b) = a^2 \text{Var}(X)$, with the standard deviation as the positive square root.
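The same working can be written as a few lines of code, which also confirms that taking the square root of $\text{Var}(Y)$ agrees with the rule $\sigma_Y = |a|\sigma_X$. This is a sketch using only the numbers given in the question.

```python
import math

mean_x, var_x = 5, 4  # given in the question
a, b = 3, -2          # Y = 3X - 2

mean_y = a * mean_x + b                  # E(Y) = a E(X) + b
var_y = a ** 2 * var_x                   # Var(Y) = a^2 Var(X)
sd_y = math.sqrt(var_y)                  # positive square root
sd_y_rule = abs(a) * math.sqrt(var_x)    # sigma_Y = |a| sigma_X

print(mean_y, var_y, sd_y)  # 13 36 6.0
```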
