How do you classify data as categorical or numerical, and how does the type of data decide which display is appropriate?
Classify data relating to a single random variable as categorical (nominal or ordinal) or numerical (discrete or continuous), and select and use an appropriate graphical display for the data type
A focused answer to the HSC Maths Standard 2 dot point on classifying data. Categorical (nominal versus ordinal) against numerical (discrete versus continuous), the questions that decide each branch, and choosing an appropriate display for each data type, with a classification tree and worked Australian examples.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
NESA wants you to look at any variable and place it correctly in the data-type family: first decide whether it is categorical or numerical, then pick the right sub-type (nominal or ordinal for categorical, discrete or continuous for numerical). The second half of the dot point is the payoff: the type of data decides which graph is appropriate, so a wrong classification leads to a wrong display. The skill is tested constantly, because almost every Data Analysis question begins with data that you must read correctly before you can summarise or graph it. The arithmetic is trivial here; the marks live in two clean decisions and in justifying your choice of display.
The answer
Every variable answers two questions in turn. Question one: is each value a label, or is it a number you can count or measure and do arithmetic with? Labels make the variable categorical; counts and measurements make it numerical. Question two depends on the first answer. If categorical, ask whether the categories have a natural order: ordered means ordinal, unordered means nominal. If numerical, ask whether the values are counted in whole steps or measured on a continuous scale: counted means discrete, measured means continuous. Together the two questions form a classification tree with four end points: categorical nominal, categorical ordinal, numerical discrete and numerical continuous.
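The two questions can be sketched as a small decision function. This is a minimal illustration rather than anything NESA prescribes: the names `DataType` and `classify` and the three yes/no flags are assumptions made for the example, and the answers to the flags still come from your own judgement about the variable.

```python
from enum import Enum


class DataType(Enum):
    NOMINAL = "categorical nominal"
    ORDINAL = "categorical ordinal"
    DISCRETE = "numerical discrete"
    CONTINUOUS = "numerical continuous"


def classify(is_label: bool, has_natural_order: bool = False,
             is_counted: bool = False) -> DataType:
    """Run the two questions in turn and return the data type."""
    if is_label:
        # Question two for categorical data: is there a natural order?
        return DataType.ORDINAL if has_natural_order else DataType.NOMINAL
    # Question two for numerical data: counted in whole steps, or measured?
    return DataType.DISCRETE if is_counted else DataType.CONTINUOUS


print(classify(is_label=True).value)                          # eye colour
print(classify(is_label=True, has_natural_order=True).value)  # clothing size
print(classify(is_label=False, is_counted=True).value)        # number of siblings
print(classify(is_label=False).value)                         # height
```

Each call answers question one first and only then looks at the flag that belongs to that branch, which mirrors the order in which the questions should be asked in an exam answer.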
Categorical data: nominal versus ordinal
A categorical variable sorts each item into a named group. The values are labels, not amounts, so arithmetic on them is meaningless: there is no "average eye colour". Categorical data splits in two:
- Nominal categories are just different names with no natural order. Eye colour, suburb, blood type, favourite sport and nationality are nominal. You could list the categories in any order without losing meaning.
- Ordinal categories have a natural order even though they are still labels. Clothing sizes (S, M, L, XL), survey responses (strongly disagree to strongly agree), school year groups and movie ratings (poor to excellent) are ordinal. Sorting them from lowest to highest makes sense.
The test is simple: if you can rank the categories sensibly, the variable is ordinal; if ranking them would be arbitrary, it is nominal.
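A short sketch of the ranking test, assuming some made-up clothing sizes: an ordinal variable comes with an order that has to be supplied, because sorting the labels alphabetically does not recover it. The names `SIZE_ORDER` and `sort_ordinal` are hypothetical helpers for this example.

```python
SIZE_ORDER = ["S", "M", "L", "XL"]  # natural order, smallest to largest


def sort_ordinal(values, order):
    """Sort labels by their position in the stated natural order."""
    rank = {category: position for position, category in enumerate(order)}
    return sorted(values, key=rank.__getitem__)


sizes = ["L", "S", "XL", "M", "S"]  # hypothetical survey responses
sorted_sizes = sort_ordinal(sizes, SIZE_ORDER)

print(sorted_sizes)   # ['S', 'S', 'M', 'L', 'XL']
print(sorted(sizes))  # ['L', 'M', 'S', 'S', 'XL'] (alphabetical, not the natural order)
```

For a nominal variable such as eye colour there is no list like `SIZE_ORDER` to write down, which is the same test expressed in code.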
Numerical data: discrete versus continuous
A numerical variable takes values that are genuine numbers you can count or measure and do arithmetic with (a mean, for instance). Numerical data also splits in two:
- Discrete data is counted in whole steps, so only separated values are possible. The number of siblings, cars in a household, goals in a match or rooms in a house are discrete: you cannot have a fractional number of siblings.
- Continuous data is measured on a scale and can take any value in a range, limited only by the precision of the instrument. Height, mass, time, temperature and volume are continuous: the same height can be recorded to the nearest centimetre, to one decimal place or to two, depending on the instrument.
The test is whether the values come in countable jumps (discrete) or can sit anywhere on a number line (continuous).
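To illustrate "limited only by the precision of the instrument", the sketch below records one height at three levels of precision. The height value is hypothetical and chosen only for the example.

```python
true_height_cm = 172.4681  # hypothetical height, for illustration only

# The same measurement recorded by instruments of increasing precision
readings = {decimals: round(true_height_cm, decimals) for decimals in (0, 1, 2)}

for decimals, reading in readings.items():
    print(f"{decimals} decimal place(s): {reading} cm")
```

Every reading is a valid record of the same continuous quantity. A count such as the number of siblings has no finer reading to move to, which is what makes it discrete.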
Choosing an appropriate display
The type of data decides which graph is appropriate, and a marker checks that your display fits the type:
- Categorical (nominal or ordinal): a column (bar) graph to compare group frequencies, a sector (pie) graph when the categories are parts of one whole, or a divided bar graph. For ordinal data, keep the categories in their natural order along the axis so the trend is readable. The bars in a column graph for categorical data are drawn with gaps, because the categories are separate.
- Numerical discrete (small range): a dot plot (one dot per observation stacked above each value) or a column/frequency graph, which show every count and any clustering.
- Numerical continuous (or a large range): a histogram of data grouped into class intervals, which reveals the shape of the distribution (its centre, spread and skew). Unlike a column graph, a histogram's bars touch, because the scale is continuous.
- A stem-and-leaf plot suits numerical data when you want to keep the actual data values visible.
Two displays to avoid: a sector graph for continuous data (there are no natural slices), and a histogram for nominal categories (there is no numerical scale to group).
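The matching of display to data type can be summarised as a lookup table. This is a simplified sketch of the list above, not an official NESA table: the name `APPROPRIATE_DISPLAYS` and the helper `is_appropriate` are assumptions for the example, and discrete data with a large range is left out (it is treated like continuous data in the list above).

```python
# Displays named in this section for each data type (a simplification)
APPROPRIATE_DISPLAYS = {
    "categorical nominal": {"column graph", "sector graph", "divided bar graph"},
    "categorical ordinal": {"column graph", "sector graph", "divided bar graph"},
    "numerical discrete": {"dot plot", "column graph", "stem-and-leaf plot"},
    "numerical continuous": {"histogram", "stem-and-leaf plot"},
}


def is_appropriate(data_type: str, display: str) -> bool:
    """Return True if the display is listed for the given data type."""
    return display in APPROPRIATE_DISPLAYS[data_type]


print(is_appropriate("categorical nominal", "column graph"))   # True
print(is_appropriate("numerical continuous", "sector graph"))  # False
print(is_appropriate("categorical nominal", "histogram"))      # False
```

The two `False` results are the two displays to avoid: a sector graph for continuous data and a histogram for nominal categories.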
How exam questions ask about classifying data
The command words are predictable, and each points to the same routine:
- "Classify the variable ..." or "What type of data is ...?" means run the two questions: categorical or numerical first, then the sub-branch.
- "... categorical or numerical" asks only for the first split; "... discrete or continuous" tells you the variable is already numerical and wants the second split.
- "Choose / suggest an appropriate display (or graph) for ..." means name a graph that fits the data type and justify it by naming the type ("a column graph, because the data is categorical").
- "Explain why [a given graph] is not suitable ..." wants the type-mismatch reason (for example, "a sector graph needs categories that are parts of a whole, but continuous heart-rate data has no natural categories").
- "Give an example of a ... variable" asks you to invent one of the named type, so keep a ready example of each of the four types.
Exam-style practice questions
Practice questions written in the style of NESA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
2022 HSC-style, 3 marks. A researcher records three variables about each car in a car park: its make (Toyota, Ford, Mazda, other), its number of seats, and its fuel efficiency in litres per km. Classify each variable as categorical or numerical, and where it is numerical state whether it is discrete or continuous.
Worked answer:
Make: categorical (the values are labels). No discrete/continuous sub-branch is required for a categorical variable, though noting "nominal" is a bonus.
Number of seats: numerical and discrete (a whole-number count).
Fuel efficiency: numerical and continuous (measured on a scale, can take any value in a range).
Markers award one mark per correct full classification. A common error is calling the number of seats continuous; it is a count, so it is discrete. Stating only "categorical/numerical" without the discrete/continuous sub-branch on the numerical variables loses the detail marks.
2021 HSC-style, 4 marks. A school surveys students on their year group (Year 7 to Year 12), their preferred canteen meal (sushi, wrap, pie, salad), and their daily screen time in hours. (a) Classify each variable fully. (b) For preferred canteen meal, name an appropriate graph and justify your choice.
Worked answer:
Part (a): Year group is categorical ordinal (the years have a natural order). Preferred meal is categorical nominal (unordered labels). Daily screen time is numerical continuous (measured on a scale).
Part (b): A column (bar) graph or sector graph is appropriate because preferred meal is categorical; a column graph shows one bar per meal with height equal to the frequency, making the most popular meal easy to read. A sector graph is acceptable if the response is justified as showing each meal as a share of the whole.
Markers award a mark for each correct classification (with the ordinal/nominal distinction expected), and marks in part (b) for naming a valid categorical display AND giving a reason tied to the data type. Naming a histogram or dot plot for the meal data loses the mark, since those are for numerical data.
2023 HSC-style, 3 marks. Explain the difference between a discrete and a continuous numerical variable, giving one example of each from a sport of your choice. Then state, with a reason, whether a histogram or a dot plot is the more natural display for a discrete variable over a small range of whole numbers.
Worked answer:
Difference: a discrete variable is counted in whole-number steps (it can only take separated values), while a continuous variable is measured on a scale and can take any value within a range.
Examples (sport-dependent, e.g. cricket): discrete - the number of wickets taken; continuous - a bowler's delivery speed in km/h. Any valid count and any valid measurement earn the marks.
Display: a dot plot is the more natural choice for a discrete variable over a small whole-number range, because each whole-number value gets its own stack of dots, showing every individual value and the most common count clearly; a histogram groups data into intervals, which is better suited to continuous data or a large range.
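The reason a dot plot suits a small whole-number range can be seen in a quick text version. This is a sketch with hypothetical wicket counts, drawn sideways (one row per value) rather than with dots stacked above an axis; `dot_plot` is a helper invented for the example.

```python
from collections import Counter


def dot_plot(values):
    """Return one text row per whole-number value, with one '*' per observation."""
    counts = Counter(values)
    rows = []
    for value in range(min(values), max(values) + 1):
        rows.append(f"{value} | {'*' * counts[value]}".rstrip())
    return rows


wickets = [0, 1, 1, 2, 2, 2, 4]  # hypothetical wickets taken in seven matches
plot = dot_plot(wickets)
print("\n".join(plot))
```

Every observation stays visible, the most common count has the longest row, and the empty row for 3 shows a gap that grouping into intervals could hide.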
Markers reward a correct, clearly worded distinction, one valid example of each type, and a justified display choice. Defining the terms the wrong way round, or giving a measured quantity as the "discrete" example, loses marks.
Practice questions
Original practice questions graded from foundation to exam level, each with a full worked solution. Try them before revealing the solution.
Foundation, 2 marks. Classify each of the following variables as categorical or numerical. (a) The eye colour of a student. (b) The number of siblings a student has.
Worked solution:
Ask the deciding question: is the value a label or a number you can count or measure?
Part (a) eye colour. The values are labels such as brown, blue and green. They place each student into a named group, so eye colour is a categorical variable.
Part (b) number of siblings. The value is a genuine whole-number count that you can do arithmetic with, so the number of siblings is a numerical variable. (Check: you can find a mean number of siblings, but a mean eye colour makes no sense, which confirms one is numerical and the other categorical.)
Foundation, 3 marks. For each numerical variable below, state whether it is discrete or continuous. (a) The number of cars in a household. (b) The mass of a watermelon in kilograms. (c) The number of goals scored in a netball match.
Worked solution:
Ask the deciding question: is the variable counted in whole steps (discrete) or measured on a scale that can take any value (continuous)?
Part (a) number of cars. You count cars in whole numbers; a household cannot own a fraction of a car, so this is discrete.
Part (b) mass of a watermelon. Mass is measured and can take any value in a range, recorded to as many decimal places as the scales allow, so this is continuous.
Part (c) number of goals. Goals are counted in whole numbers, so this is discrete. (Check: the two counts can only land on whole numbers, while the mass can sit anywhere between two whole numbers, which is exactly the discrete versus continuous split.)
Foundation, 2 marks. Classify each categorical variable as nominal or ordinal. (a) A student's favourite sport (cricket, netball, soccer). (b) A movie rating (poor, fair, good, excellent).
Worked solution:
Ask the deciding question: do the categories have a natural order, or are they just different names?
Part (a) favourite sport. Cricket, netball and soccer are simply different names with no natural ranking, so the variable is nominal.
Part (b) movie rating. Poor, fair, good and excellent have a clear order from worst to best, so the variable is ordinal. (Check: you could sensibly sort the ratings from lowest to highest, but sorting the sports into an order would be meaningless, which is the nominal versus ordinal test.)
Core, 4 marks. A survey records four variables about each Year 11 student: their postcode, their T-shirt size (S, M, L, XL), their height in centimetres, and the number of languages they speak. Classify each variable fully (categorical nominal, categorical ordinal, numerical discrete, or numerical continuous).
Worked solution:
Work each variable through the two questions in turn: first categorical or numerical, then the sub-branch.
Postcode. Although it is written with digits, a postcode is a label for an area; adding or averaging postcodes is meaningless. So it is categorical, and the postcodes have no natural order, so it is categorical nominal.
T-shirt size. The values S, M, L, XL are labels, so the variable is categorical, and they run in a clear order from smallest to largest, so it is categorical ordinal.
Height in centimetres. Height is measured and can take any value in a range, so it is numerical and continuous.
Number of languages. This is a count in whole numbers, so it is numerical and discrete. (Check: the postcode is the classic trap - digits do not make it numerical, because the deciding test is whether arithmetic on the value is meaningful, and averaging postcodes is not.)
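The postcode trap can be made concrete with a small sketch. The data values are hypothetical, and storing postcodes as text is a design choice assumed here so that arithmetic on them cannot happen by accident.

```python
from statistics import mean

languages_spoken = [1, 2, 1, 3]       # hypothetical counts: numerical discrete
postcodes = ["2000", "2150", "2770"]  # hypothetical labels, stored as text

mean_languages = mean(languages_spoken)
print(mean_languages)  # 1.75, a meaningful average of a count

try:
    mean(postcodes)
    postcode_mean_allowed = True
except TypeError:
    # Text labels cannot be averaged, which matches the classification
    postcode_mean_allowed = False
print(postcode_mean_allowed)  # False
```

Had the postcodes been stored as integers, the average would have computed without complaint and still meant nothing, so the check has to be made by the person classifying the variable, not by the software.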
Core, 4 marks. A student has collected data and must choose a display for each. State an appropriate graph for each variable and justify your choice. (a) The proportion of a household budget spent on rent, food, transport and other. (b) The number of pets owned by each student in a class (a whole-number count). (c) The favourite takeaway cuisine of a group of people (Thai, Italian, Indian, other).
Worked solution:
Match the display to the data type: categorical data suits column, sector and divided bar graphs; numerical data suits dot plots, stem-and-leaf plots, histograms and frequency tables.
Part (a) household budget shares. The categories are parts of one whole (the total budget), so a sector (pie) graph is appropriate because it shows each category as a slice of the total. A divided bar graph would work equally well.
Part (b) number of pets. This is discrete numerical data over a small range, so a dot plot (or a column/frequency graph) is appropriate: one dot per student stacked above each count shows the distribution and any clustering.
Part (c) favourite cuisine. This is nominal categorical data, so a column (bar) graph is appropriate, with one bar per cuisine and the height showing the frequency. (Check: each chosen graph matches its data type - a slice-of-the-whole for budget shares, stacked dots for a small count, and separated bars for unordered categories.)
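For part (a), "a slice of the total" means each sector takes the same fraction of 360 degrees as its category takes of the budget. The sketch below assumes hypothetical budget percentages, since the question gives no figures; `sector_angles` is a helper invented for the example.

```python
def sector_angles(percentages):
    """Convert each percentage share of the whole to a sector angle in degrees."""
    return {category: share * 360 / 100 for category, share in percentages.items()}


budget = {"rent": 40, "food": 25, "transport": 15, "other": 20}  # hypothetical shares
angles = sector_angles(budget)

print(angles)
print(sum(angles.values()))  # 360.0
```

The angles add to a full circle only because the shares add to 100 per cent, which is why a sector graph needs categories that are parts of one whole.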
Exam, 5 marks. A health researcher records, for each patient in a study: blood type (A, B, AB, O); pain level reported as none, mild, moderate or severe; resting heart rate in beats per minute; and the number of GP visits in the past year. (a) Classify all four variables fully. (b) For blood type and for resting heart rate, name one appropriate graphical display and justify each choice. (c) Explain why a sector graph would be a poor choice for resting heart rate.
Worked solution:
Part (a) classify all four. Apply the two questions to each variable.
- Blood type: the values are unordered labels, so it is categorical nominal.
- Pain level: none, mild, moderate and severe are labels with a natural order, so it is categorical ordinal.
- Resting heart rate: measured on a scale and able to take any value in a range, so it is numerical continuous.
- Number of GP visits: a whole-number count, so it is numerical discrete.
Part (b) appropriate displays. Blood type is categorical, so a column graph (one bar per blood type, height equal to the frequency) clearly compares the four groups. Resting heart rate is continuous numerical, so a histogram with the data grouped into equal-width class intervals of beats per minute shows the shape of the distribution.
Part (c) why a sector graph fails for heart rate. A sector graph splits a whole into categories, but resting heart rate is continuous with hundreds of distinct values, so it has no small set of natural slices; a pie of near-unique values would be unreadable and would hide the shape (centre, spread, skew) that a histogram reveals. (Check: each display matches its type, and the explanation turns on the fact that continuous data has no natural categories to slice, which is the whole reason type drives the display.)
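Grouping into class intervals, as used for the histogram in part (b), can be sketched as follows. The heart rates, the starting value of 50 and the interval width of 10 are all hypothetical choices for the example, and each interval is taken to include its lower end but not its upper end.

```python
from collections import Counter


def class_interval_counts(values, start, width):
    """Count the values falling in each interval [low, low + width)."""
    # Assumes every value is at least `start`.
    counts = Counter(int((value - start) // width) for value in values)
    return {
        (start + k * width, start + (k + 1) * width): counts[k]
        for k in range(max(counts) + 1)
    }


heart_rates = [58, 62, 64, 67, 71, 72, 75, 78, 83, 91]  # hypothetical, in bpm
table = class_interval_counts(heart_rates, start=50, width=10)

for (low, high), frequency in table.items():
    print(f"{low} to under {high}: {frequency}")
```

Ten near-unique values collapse into five touching intervals, and the frequencies show where the data centres, which a sector graph of the individual values could not do.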
Exam, 5 marks. A council survey asks residents: their suburb; how satisfied they are with parks (very dissatisfied, dissatisfied, neutral, satisfied, very satisfied); their exact age in years and months; and how many times they visited a park last month. (a) Classify each variable fully. (b) The council wants a single graph that shows the satisfaction responses in order from least to most satisfied. State an appropriate display and explain why the order matters here but would not matter for suburb. (c) The number of park visits took only a small range of whole-number values. Recommend a display and justify it.
Worked solution:
Part (a) classify each. Run each variable through the two questions.
- Suburb: unordered labels, so categorical nominal.
- Satisfaction: ordered labels from very dissatisfied to very satisfied, so categorical ordinal.
- Age in years and months: measured on a continuous scale (age increases smoothly, and "years and months" records part-years), so numerical continuous.
- Number of park visits: a whole-number count, so numerical discrete.
Part (b) display for satisfaction, and why order matters. A column graph with the bars placed in the natural order from very dissatisfied to very satisfied is appropriate, because satisfaction is ordinal: keeping the categories in order lets a reader see the trend from unhappy to happy at a glance. Suburb is nominal, so its bars can sit in any order (for example tallest first) without losing meaning, since there is no underlying ranking to preserve.
Part (c) display for park visits. The visits are discrete numerical data over a small whole-number range, so a dot plot (or a column graph of the frequencies) is appropriate: one dot per resident above each count shows the most common number of visits, any gaps, and any unusually high values. (Check: ordering matters only when the categories themselves are ordered, which is precisely what separates ordinal from nominal, and a small whole-number range is exactly where a dot plot is at its clearest.)
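The point in part (b) about bar order can be sketched with frequency tables. The responses and suburb labels below are hypothetical; only the five-point scale comes from the question.

```python
from collections import Counter

SCALE = ["very dissatisfied", "dissatisfied", "neutral", "satisfied", "very satisfied"]

responses = ["satisfied", "neutral", "satisfied", "very satisfied",
             "dissatisfied", "satisfied", "neutral"]  # hypothetical answers
suburbs = ["Suburb A", "Suburb B", "Suburb A", "Suburb C", "Suburb A", "Suburb B"]

# Ordinal: bars follow the scale, and an unused category still keeps its place
response_counts = Counter(responses)
satisfaction_bars = [(category, response_counts[category]) for category in SCALE]

# Nominal: no ranking to preserve, so tallest-first is one acceptable order
suburb_bars = Counter(suburbs).most_common()

print(satisfaction_bars)
print(suburb_bars)
```

Sorting the satisfaction bars tallest-first would scramble the scale and hide the trend, while the same sort on suburbs loses nothing.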
Related dot points
- Investigate sampling techniques, including census, simple random, systematic and stratified sampling, and identify the target population and sources of bias in data collection
A focused answer to the HSC Maths Standard 2 dot point on data collection and sampling. Census versus sample, defining the target population, simple random, systematic and stratified sampling, choosing a sample size, designing a stratified sample with correct proportions, and spotting bias in a survey, with worked Australian examples.
- Display and interpret numerical data using dot plots and stem-and-leaf plots, including back-to-back stem-and-leaf plots, and describe the clusters, gaps, outliers and shape of the data
A focused answer to the HSC Maths Standard 2 dot point on dot plots and stem-and-leaf plots. How to construct and read each display, how to build a back-to-back stem-and-leaf plot to compare two groups, and how to describe clusters, gaps, outliers and the shape of a distribution, with worked Australian examples.
- Organise, interpret and display data into appropriate tabular and graphical representations including frequency distribution tables, both ungrouped and grouped using class intervals and class centres, and cumulative frequency
A focused answer to the HSC Maths Standard 2 dot point on frequency tables. Tallying raw data into a frequency table, grouping data into class intervals, finding the class centre, and building the cumulative frequency column, with worked Australian examples and the totals checked so the cumulative frequency ends at the sample size.
- Display categorical and numerical data using a range of statistical graphs, including column graphs, sector graphs, line graphs, divided bar graphs and Pareto charts, and interpret the displays
A focused answer to the HSC Maths Standard 2 dot point on statistical graphs. Column and bar graphs, sector (pie) graphs with each angle computed as a fraction of 360 degrees, line graphs, divided bar graphs, and the Pareto chart with bars sorted descending and a cumulative percentage line for the 80/20 read, with worked Australian examples.