
How do you classify data, choose the right display, and describe the shape, centre and spread of a distribution?

Types of data (categorical and numerical), appropriate graphical displays, and describing a numerical distribution in terms of shape, centre and spread using the mean, median, range, interquartile range and standard deviation

A focused answer to the VCE General Mathematics Unit 3 data analysis key knowledge on classifying data, choosing graphical displays, and describing numerical distributions by shape, centre and spread with the mean, median, range, IQR and standard deviation.

Generated by Claude Opus 4.7. 6 min answer.

Reviewed by: AI editorial process; not yet individually human-reviewed


Jump to a section
  1. What this dot point is asking
  2. Choosing a display
  3. Describing shape
  4. Centre and spread
  5. Putting it together in the exam

What this dot point is asking

VCAA wants you to look at a dataset, decide whether each variable is categorical or numerical, choose a graph that suits that data type, and then describe a numerical distribution in plain language using shape, centre and spread. This is the foundation of the whole data analysis module, so the exam rewards getting the vocabulary exactly right and choosing the correct measure of centre and spread for the shape you see.

Choosing a display

The display you choose depends on the data type.

  • Categorical data is summarised with a frequency table and shown with a bar chart (the bars have gaps between them). A two-way frequency table is used when comparing two categorical variables.
  • Numerical data is shown with a histogram, a dot plot, or a stem-and-leaf plot. A histogram has no gaps between bars because the horizontal axis is a continuous number line.
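As a small illustration of summarising categorical data, the sketch below builds a frequency table with counts and percentages. The function name `frequency_table` and the sample of home types are made up for this example and are not from any exam paper.

```python
from collections import Counter

def frequency_table(values):
    """Return {category: (count, percentage)} for a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {
        category: (count, round(100 * count / total, 1))
        for category, count in counts.items()
    }

# Hypothetical sample of home types (made-up data for illustration only).
home_types = ["house", "apartment", "house", "house", "apartment"]
table = frequency_table(home_types)

for category, (count, percent) in table.items():
    print(f"{category}: {count} ({percent}%)")
```

Each row of the table corresponds to one bar of a bar chart, with the count (or percentage) giving the bar height.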

A common exam instruction is to read frequencies or percentages off a histogram, so be comfortable reading the height of each bar as the frequency (or percentage) for that interval. The tallest bars show where the data clusters.
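Reading a histogram amounts to counting how many values fall in each interval. The sketch below does that counting for a made-up dataset, assuming intervals of width 5 starting at 0 and assuming each interval includes its lower boundary but not its upper one. The helper name `histogram_counts` is hypothetical.

```python
def histogram_counts(values, start, width, bins):
    """Count the values in each interval [start + i*width, start + (i+1)*width)."""
    counts = [0] * bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < bins:  # values outside the plotted range are ignored
            counts[index] += 1
    return counts

# Made-up data: number of messages sent by 10 students.
messages = [1, 3, 3, 4, 5, 5, 6, 7, 9, 28]
counts = histogram_counts(messages, start=0, width=5, bins=6)
percentages = [100 * c / len(messages) for c in counts]

for i, (count, percent) in enumerate(zip(counts, percentages)):
    low = 5 * i
    print(f"{low} to <{low + 5}: {count} values ({percent:.0f}%)")
```

Here the first two intervals hold 9 of the 10 values, and the single value in the last interval sits well away from the rest.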

Describing shape

The shape of a numerical distribution is described as one of:

  • Symmetric: roughly a mirror image about the centre.
  • Positively skewed: a tail stretching to the right (towards larger values).
  • Negatively skewed: a tail stretching to the left (towards smaller values).

You also note any outliers, which are values sitting well away from the main body of data.

Centre and spread

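The centre is a single typical value. The two measures are the mean \(\bar{x}\) and the median \(M\).

\[\bar{x} = \frac{\sum x}{n}\]

The median is the middle value when the data is ordered. For \(n\) values, the median is at position \(\frac{n+1}{2}\). When \(n\) is even, this position falls halfway between two values, so the median is the average of those two values.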
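The two measures of centre can be checked with Python's standard `statistics` module. This is a minimal sketch on made-up data; the helper `median_position` is a hypothetical name that simply evaluates the position formula.

```python
from statistics import mean, median

def median_position(n):
    """Position of the median in an ordered list of n values (counting from 1)."""
    return (n + 1) / 2

# Made-up data: number of messages sent by 10 students.
messages = [1, 3, 3, 4, 5, 5, 6, 7, 9, 28]
ordered = sorted(messages)
n = len(ordered)

x_bar = mean(ordered)           # sum of the values divided by n
M = median(ordered)             # averages the two middle values when n is even
position = median_position(n)   # 5.5, i.e. halfway between the 5th and 6th values

print(f"mean = {x_bar}, median = {M}, median position = {position}")
```

The single large value of 28 pulls the mean up to 7.1 while the median stays at 5, which is why the median is the safer measure of centre for data like this.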

The spread describes how scattered the data is. The three measures are:

  • Range \(= \text{maximum} - \text{minimum}\).
  • Interquartile range \(\text{IQR} = Q_3 - Q_1\), the spread of the middle 50 percent.
  • Standard deviation \(s\), the typical distance of a value from the mean.
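The three measures of spread can be sketched as follows on made-up data. Two assumptions are worth flagging: the quartiles are taken as the medians of the lower and upper halves (leaving out the overall median when \(n\) is odd), and `statistics.stdev` is used because it divides by \(n - 1\), which is assumed here to match the \(s\) your calculator reports. Other software may use a different quartile convention and give slightly different values.

```python
from statistics import median, stdev

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: when n is odd, the overall median is left out of both halves.
    """
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("need at least two values to find quartiles")
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return median(lower), median(upper)

# Made-up data: number of messages sent by 10 students.
messages = [1, 3, 3, 4, 5, 5, 6, 7, 9, 28]

data_range = max(messages) - min(messages)
q1, q3 = quartiles(messages)
iqr = q3 - q1
s = stdev(messages)  # sample standard deviation (divides by n - 1)

print(f"range = {data_range}, Q1 = {q1}, Q3 = {q3}, IQR = {iqr}, s = {s:.2f}")
```

The range and the standard deviation are both inflated by the value 28, while the IQR only depends on the middle half of the data.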

The standard deviation is found with technology in this course, but you must know what it means. A larger \(s\) means the values are more spread out from the mean. For symmetric, bell-shaped data the 68-95-99.7 percent rule applies: about 68 percent of values lie within one standard deviation of the mean, about 95 percent within two, and about 99.7 percent within three.
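The 68-95-99.7 percent rule turns a mean and standard deviation into three intervals. The sketch below computes them for a hypothetical bell-shaped variable; the function name `empirical_intervals` and the values 170 cm and 8 cm are invented for illustration.

```python
def empirical_intervals(mean, sd):
    """Intervals expected to hold about 68%, 95% and 99.7% of values.

    Only meaningful for roughly symmetric, bell-shaped data.
    """
    return {
        percent: (mean - k * sd, mean + k * sd)
        for k, percent in enumerate((68, 95, 99.7), start=1)
    }

# Hypothetical heights with mean 170 cm and standard deviation 8 cm.
intervals = empirical_intervals(170, 8)

for percent, (low, high) in intervals.items():
    print(f"about {percent}% of values lie between {low} cm and {high} cm")
```

For these numbers, about 95 percent of heights would be expected to lie between 154 cm and 186 cm.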

Putting it together in the exam

A full description answers four things: shape, centre, spread, and outliers. For skewed data, or data with outliers, report the median and IQR, because the tail and the outliers pull the mean and standard deviation away from the typical values. A strong sentence reads: the distribution is positively skewed with a median of 5 messages, an IQR of 4 messages, and an outlier at 28 messages. Stating the units every time is an easy way to pick up marks. When the data is symmetric instead, swap to the mean and standard deviation and report both with units.
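One way to practise the four-part description is to treat it as a template. The sketch below is a hypothetical helper, not a VCAA marking scheme: it picks the mean and standard deviation for a symmetric shape and the median and IQR otherwise, and always attaches the units.

```python
def describe(shape, stats, unit, outliers=()):
    """Build a one-sentence description: shape, centre, spread, outliers.

    stats is a dict with the keys "mean", "sd", "median" and "iqr".
    """
    if shape == "symmetric":
        centre = f"a mean of {stats['mean']} {unit}"
        spread = f"a standard deviation of {stats['sd']} {unit}"
    else:
        centre = f"a median of {stats['median']} {unit}"
        spread = f"an IQR of {stats['iqr']} {unit}"

    if outliers:
        listed = ", ".join(f"{value} {unit}" for value in outliers)
        noun = "an outlier" if len(outliers) == 1 else "outliers"
        tail = f", and {noun} at {listed}"
    else:
        tail = ", and no outliers"

    return f"The distribution is {shape} with {centre}, {spread}{tail}."

# Summary values chosen to reproduce the example sentence in the text.
summary = {"mean": 7.1, "sd": 7.7, "median": 5, "iqr": 4}
sentence = describe("positively skewed", summary, "messages", outliers=[28])
print(sentence)
```

Calling `describe("symmetric", ...)` with the same dictionary would report the mean and standard deviation instead, which mirrors the swap described above.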

Exam-style practice questions

Practice questions written in the style of VCAA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

2025 VCAA, 1 mark
A sample of 20 homes sold in an inner Melbourne suburb is recorded, with the type of home being either an apartment or a house. State whether the variable type is numerical, nominal or ordinal.
Worked answer

The variable type takes the labels apartment or house. These are names of categories, not numbers, so type is a categorical variable.

Categorical variables split into nominal (categories with no natural order) and ordinal (categories with a natural order, such as small, medium, large). Apartment and house cannot be ranked, so there is no natural order.

The answer is nominal. The mark requires the single correct word nominal, not just categorical.

2025 VCAA, 1 mark
A sample of 20 homes is recorded with their sale price in dollars. Find the median, in dollars, of the variable price. (The 20 prices, in dollars, range from 350 000 to 1 540 000.)
Worked answer

The median of 20 ordered values is the average of the 10th and 11th values.

Ordering all 20 sale prices and locating the middle pair, the 10th value is $920 000 and the 11th value is also $920 000.

median = (920 000 + 920 000) / 2 = $920 000.

For an even sample size, always average the two central values; do not just pick one. The mark is for the correct value $920 000.

2025 VCAA, 1 mark
The preferred car colour (black, silver or white) is recorded for a sample of female and male car buyers in a two-way frequency table. Which one of the following is the most appropriate way to graphically display the data shown in the table?
A. a histogram
B. a back-to-back stem plot
C. parallel boxplots
D. a segmented bar chart
Worked answer

Both variables here are categorical: preferred car colour (black, silver, white) and sex (female, male). The display must compare the distribution of one categorical variable across the categories of another.

A histogram and a stem plot are for numerical data, and parallel boxplots compare a numerical variable across groups, so options A, B and C are ruled out.

A segmented (or side-by-side) bar chart is the standard display for two categorical variables, so the answer is D.