How do you summarise, display and describe the distribution of a single numerical or categorical variable in VCE General Mathematics?

Display and describe the distribution of a numerical variable using a histogram, dot plot, stem plot or boxplot, summarise it with measures of centre and spread, and identify outliers using the lower and upper fences

A focused answer to the VCE General Mathematics Unit 3 Data analysis key-knowledge point on univariate data. It covers choosing displays, describing shape, centre and spread, computing the five-number summary, and finding outliers with the 1.5 × IQR fence rule.

Generated by Claude Opus 4.7 · 6 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

Jump to a section
  1. What this dot point is asking
  2. Choosing a display
  3. Describing shape
  4. The five-number summary and the boxplot
  5. Centre and spread for symmetric data
  6. Reading the standard deviation
  7. Why this matters for the exams

What this dot point is asking

VCAA wants you to take a single variable, choose an appropriate display, and describe its distribution clearly. For a numerical variable you describe four things in order: shape, centre, spread, and outliers. You must also be able to construct a five-number summary and a boxplot, and to test for outliers using the fence rule. This is the foundation of the entire Data analysis core, and it appears every year on both exams.

Choosing a display

A categorical variable (e.g. eye colour, transport mode) is summarised with a frequency table and shown with a bar chart. A numerical variable (e.g. height, time) is shown with a dot plot, stem plot, histogram or boxplot.

  • Dot plots and stem plots keep every value and suit small data sets.
  • Histograms group data into intervals and suit large data sets.
  • Boxplots compress the data into five numbers and are ideal for comparing groups.
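
For the categorical case, a frequency table can be sketched in a few lines of Python. The survey responses below are made up for illustration and are not from any VCAA paper.

```python
from collections import Counter

# Hypothetical survey responses for the categorical variable "transport mode".
responses = ["car", "bus", "car", "walk", "bus", "car", "train", "car"]

counts = Counter(responses)
total = len(responses)

# Print each category with its frequency and percentage frequency,
# most common category first.
for category, count in counts.most_common():
    print(f"{category:<6} {count:>2} {100 * count / total:5.1f}%")
```

The counts in this table are the bar heights you would draw on a bar chart.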

Describing shape

A distribution is symmetric if it balances about its centre. It is positively skewed if it has a tail to the right (toward larger values) and negatively skewed if it has a tail to the left. Skew matters because it tells you which summary statistics to trust: for skewed data or data with outliers, report the median and IQR; for roughly symmetric data with no outliers, the mean and standard deviation are appropriate.
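
To see why the median is preferred when outliers are present, here is a minimal Python sketch. The travel times are hypothetical; the last value plays the role of an unusually long trip.

```python
from statistics import mean, median

# Hypothetical travel times in minutes; 60 is an unusually long trip.
times = [12, 14, 15, 15, 16, 17, 18, 60]
without_outlier = times[:-1]

# Compare the mean and median with and without the extreme value.
print("without 60:", round(mean(without_outlier), 2), median(without_outlier))
print("with 60:   ", mean(times), median(times))
```

Adding the single large value pulls the mean from about 15.3 up to 20.875, while the median only moves from 15 to 15.5. That is what it means for the median to be resistant.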

The five-number summary and the boxplot

The five-number summary is the minimum, first quartile $Q_1$, median $M$, third quartile $Q_3$, and maximum. The interquartile range is

$$\mathrm{IQR} = Q_3 - Q_1.$$

The IQR measures the spread of the middle 50 percent of the data and is resistant to outliers. The boxplot draws a box from $Q_1$ to $Q_3$ with the median marked inside, and whiskers extending to the most extreme values that are not outliers. A value is an outlier if it lies below the lower fence, $Q_1 - 1.5 \times \mathrm{IQR}$, or above the upper fence, $Q_3 + 1.5 \times \mathrm{IQR}$.
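
The five-number summary and the 1.5 × IQR fences can be sketched in Python as follows. The data set is hypothetical, and the quartile convention is an assumption: each quartile is taken as the median of the lower or upper half of the ordered data, leaving out the middle value when the number of values is odd. Other software may use a different convention and give slightly different quartiles.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + (n % 2):]  # values above the median position
    return data[0], median(lower), median(data), median(upper), data[-1]

def fences(q1, q3):
    """Return (lower fence, upper fence) from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values):
    """Return the values lying outside the fences."""
    _, q1, _, q3, _ = five_number_summary(values)
    low, high = fences(q1, q3)
    return [x for x in values if x < low or x > high]

# Hypothetical data set of 11 values.
data = [2, 5, 6, 7, 8, 9, 10, 11, 12, 13, 30]
print(five_number_summary(data))  # (2, 6, 9, 12, 30)
print(fences(6, 12))              # (-3.0, 21.0)
print(find_outliers(data))        # [30]
```

Here the IQR is 12 - 6 = 6, so the fences sit at -3 and 21. Only 30 lies outside them, and the upper whisker of the boxplot would stop at 13.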

Centre and spread for symmetric data

For roughly symmetric numerical data the mean is

$$\bar{x} = \frac{\sum x}{n},$$

and the standard deviation $s$ measures the typical distance of values from the mean. When data is symmetric and bell-shaped, the 68-95-99.7 percent rule applies: about 68 percent of values lie within one standard deviation of the mean, about 95 percent within two, and about 99.7 percent within three. This rule is examined directly and is a fast source of marks on Exam 1.
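
As a quick illustration of the rule, the sketch below computes the three intervals for a given mean and standard deviation. The function name and the height figures (mean 170 cm, standard deviation 8 cm) are hypothetical.

```python
def empirical_intervals(mean, sd):
    """Intervals holding about 68%, 95% and 99.7% of a bell-shaped distribution."""
    return {
        pct: (mean - k * sd, mean + k * sd)
        for k, pct in ((1, 68), (2, 95), (3, 99.7))
    }

# Hypothetical example: heights with mean 170 cm and standard deviation 8 cm.
intervals = empirical_intervals(170, 8)
for pct, (low, high) in intervals.items():
    print(f"about {pct}% of values lie between {low} and {high}")
```

For these figures the intervals are 162 to 178, 154 to 186 and 146 to 194. By symmetry, about 2.5 percent of values lie above 186 and about 2.5 percent lie below 154.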

Reading the standard deviation

The standard deviation is best read from a calculator in this course, but you should know its behaviour. A larger $s$ means more spread. If every value is increased by a constant, the mean shifts by that constant but $s$ is unchanged. If every value is multiplied by a constant $k$, the mean is multiplied by $k$ and $s$ is multiplied by $|k|$. These transformation rules are tested on Exam 1; a calculator may help on Exam 2, but you should be able to reason through them by hand.
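
The shift and scale rules can be checked numerically. The sketch below uses a small hypothetical data set and Python's `statistics.stdev`, which computes the sample standard deviation.

```python
from statistics import mean, stdev

# Hypothetical data set.
data = [4, 8, 6, 10, 12]

shifted = [x + 5 for x in data]   # add a constant to every value
scaled = [-3 * x for x in data]   # multiply every value by k = -3

print(mean(data), round(stdev(data), 4))
print(mean(shifted), round(stdev(shifted), 4))  # mean up by 5, s unchanged
print(mean(scaled), round(stdev(scaled), 4))    # mean times -3, s times 3
```

A negative multiplier is used on purpose: the mean changes sign (8 becomes -24), but the standard deviation is multiplied by 3, not -3, because a spread can never be negative.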

Why this matters for the exams

Univariate analysis is assumed knowledge for the rest of Data analysis. Boxplots reappear when you compare two or more groups, the 68-95-99.7 percent rule underpins standardised scores, and the idea of a resistant summary returns when you fit regression lines. Master the four-part description and the fence rule first, because the rest of the core is built on them.

Exam-style practice questions

Practice questions written in the style of VCAA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

2023 VCAA, 2 marks

A sample of 15 oysters has its weight, in grams, recorded. The 15 weights are 12.9, 11.4, 17.4, 6.8, 9.6, 15.5, 9.7, 7.0, 12.6, 12.5, 10.1, 10.6, 13.0, 8.1 and 14.1. Five oysters were graded large (weights 12.9, 17.4, 15.5, 13.0 and 14.1). Determine, in grams:
i. the mean weight of all the oysters in this sample;
ii. the median weight of the large oysters in this sample.

Worked answer

i. The mean is the sum of all 15 weights divided by 15.

sum = 171.3 g, so mean = 171.3 / 15 = 11.42 g (1 mark).

ii. The median of the five large oysters is the middle value once they are ordered.

Ordered large weights: 12.9, 13.0, 14.1, 15.5, 17.4. With five values the median is the 3rd value, so median = 14.1 g (1 mark).

The median of an odd number of values is a single data value, so do not average a pair here.
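
If you want to check this arithmetic away from a CAS calculator, a short Python sketch using the weights from the question is:

```python
from statistics import mean, median

# The 15 oyster weights (grams) from the question.
weights = [12.9, 11.4, 17.4, 6.8, 9.6, 15.5, 9.7, 7.0,
           12.6, 12.5, 10.1, 10.6, 13.0, 8.1, 14.1]
# The five oysters graded large.
large = [12.9, 17.4, 15.5, 13.0, 14.1]

mean_all = round(mean(weights), 2)
median_large = median(large)  # median() sorts the values internally

print(mean_all, median_large)
```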

2025 VCAA, 2 marks

For a sample of 20 homes, the standard deviation of the sale price of the 10 houses is $300 911. The 10 apartment prices, in dollars, are 350 000, 490 000, 500 000, 620 000, 720 000, 830 000, 875 000, 995 000, 1 100 000 and 1 520 000.
i. Find the standard deviation of the apartment sale prices, to the nearest whole number.
ii. Comment on the relative spread of the sale prices of houses compared with apartments in this sample.

Worked answer

i. Enter the 10 apartment prices into your calculator's statistics function and read off the sample standard deviation.

standard deviation of apartment prices = $346 466 (to the nearest dollar) (1 mark).

ii. Compare the two standard deviations. The apartments have a standard deviation of about $346 466, which is larger than the houses' $300 911.

So the sale prices of apartments are more spread out (more variable) than the sale prices of houses in this sample (1 mark). The larger standard deviation means greater spread.
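
The calculator result in part i can be cross-checked with a short Python sketch. It uses `statistics.stdev`, which divides by n - 1 and so matches the sample standard deviation read from a calculator's statistics function.

```python
from statistics import stdev

# The 10 apartment sale prices (dollars) from the question.
apartments = [350_000, 490_000, 500_000, 620_000, 720_000,
              830_000, 875_000, 995_000, 1_100_000, 1_520_000]
houses_sd = 300_911  # given in the question

apartments_sd = round(stdev(apartments))

print(apartments_sd)
print("apartments more spread out:", apartments_sd > houses_sd)
```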