
Inquiry Question 1: How do machine learning systems work?

Compare supervised, unsupervised and reinforcement learning, and identify a typical application of each

A focused answer to the HSC Software Engineering Module 3 dot point on learning paradigms. It covers supervised classification and regression, unsupervised clustering, reinforcement learning, applications of each, worked examples, and the traps markers look for.

Generated by Claude Opus 4.8 · 10 min answer · Updated 2026-07-03

Reviewed by: AI editorial process; not yet individually human-reviewed


In plain English

Imagine three different ways to teach someone to sort a box of mixed lollies. In the first, you sit next to them and correct every choice ("this one is a snake, this one is a bear") over and over, until they can sort new lollies correctly on their own. That is supervised learning: learning from labelled examples. In the second, you just hand them the box and say "group these however makes sense to you", with no answer key at all, and they end up sorting by colour or by shape. That is unsupervised learning: finding structure with no labels. In the third, you swap the lollies for a video game controller and say "figure out the best strategy to win, I'll only tell you your score after each round", so they improve purely by trial, error and the score they get back. That is reinforcement learning: learning from reward. All three are still "learning", but what each is given to learn from (correct answers, raw examples, or a reward signal) is completely different, and that is exactly what NESA wants you to be able to compare.

What this dot point is asking

NESA wants you to compare the three main paradigms of machine learning, identify what kind of training data each needs, and give a typical application of each.

The answer

Supervised learning

The training data has both features and labels. The algorithm learns the function from features to label. Two sub-categories:

Classification: the label is a category. Spam vs not spam. Cat vs dog. The output is a class.
Regression: the label is a number. Predicting house prices, exam marks, stock prices. The output is a continuous value.

Common algorithms: logistic regression, decision trees, random forests, gradient-boosted trees, support vector machines, neural networks.

Applications:

Email spam filtering (classification).
Image classification (classification).
House price prediction (regression).
Disease diagnosis from medical imaging (classification).
Forecasting daily power demand (regression).
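
The regression sub-category can be sketched in a few lines. This is a minimal illustration only: the data is synthetic (generated by `make_regression`), and `LinearRegression` is an assumed choice of model, not one named in the list above. The point is that the label is a number, so the predictions are continuous values, not class names.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): 200 examples,
# 3 numeric features, and one numeric label per example.
X, y = make_regression(n_samples=200, n_features=3, noise=10.0, random_state=0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression()
model.fit(X_train, y_train)          # features AND labels, as in any supervised task
predictions = model.predict(X_test)  # continuous values, not categories

# Regression is scored by how far off the numbers are, not by "right or wrong".
mae = mean_absolute_error(y_test, predictions)
print(f"Mean absolute error: {mae:.2f}")
```

Compared with a classifier, only the type of label and the evaluation metric change; the fit-then-predict workflow stays the same.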

Unsupervised learning

The training data has features but no labels. The algorithm finds structure in the data on its own. Sub-categories:

Clustering: group similar examples together. K-means is the textbook algorithm.
Dimensionality reduction: compress many features into a few. Principal Component Analysis (PCA), t-SNE.
Anomaly detection: flag examples that are very different from the rest.
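
As a rough sketch of dimensionality reduction, the snippet below compresses 30 features into 2 with PCA. The choice of dataset (scikit-learn's built-in breast cancer data, with its labels deliberately ignored) and the scaling step are assumptions made for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Features only: 569 examples with 30 features each. The labels are never loaded.
X = load_breast_cancer().data

# Scale first so that features measured in large units do not dominate.
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)   # no labels passed anywhere

print(X.shape, "->", X_2d.shape)
print("Share of variance kept:", pca.explained_variance_ratio_.sum())
```

The two new columns are combinations of the original 30 features, which makes the data easy to plot at the cost of some lost detail.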

Applications:

Customer segmentation in marketing.
Recommendation engines (people similar to you also liked...).
Fraud detection (transactions that look unusual).
Topic modelling on a corpus of documents.

Reinforcement learning

An agent learns by interacting with an environment, receiving rewards or penalties for its actions. The algorithm learns a policy that maximises long-term reward.

Vocabulary:

Agent: the learner.
Environment: the world the agent acts in.
State: a snapshot of the environment.
Action: a choice the agent makes.
Reward: feedback after each action.
Policy: the strategy the agent learns.

Applications:

Game playing (AlphaGo, self-play chess engines such as AlphaZero, video games).
Robotics (a robot learning to walk).
Autonomous driving decisions.
Resource scheduling and operations research.
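
The vocabulary above can be made concrete with a toy problem. The sketch below is an illustration under stated assumptions, not a production method: the environment is a made-up five-square corridor, the reward values are arbitrary, and tabular Q-learning is just one simple reinforcement learning algorithm chosen for brevity.

```python
import random

N_STATES = 5          # state: the agent's position, 0..4; position 4 is the goal
ACTIONS = [-1, +1]    # action 0 = step left, action 1 = step right

def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True      # reward for reaching the goal
    return next_state, -0.01, False       # small penalty for every other step

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # Q-table: the agent's current estimate of each action's value in each state.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise take the best-known action.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Q-learning update: nudge the estimate towards reward + discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train()
# Policy: the preferred action in each non-goal state.
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)  # ['right', 'right', 'right', 'right']
```

Nobody tells the agent that "right" is correct. It works this out from the rewards alone, which is the key contrast with supervised learning.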

A worked Python example: supervised classification

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

data = load_breast_cancer()
X = data.data  # 30 features per tumour
y = data.target  # 0 = malignant, 1 = benign

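# Hold back 20% of the examples so the model is tested on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The labels (y_train) are passed to fit: this is what makes it supervised.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# Compare the predictions with the held-back true labels.
print(classification_report(y_test, predictions))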

A worked Python example: unsupervised clustering

from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# No labels passed to fit.
model = KMeans(n_clusters=4, random_state=42, n_init=10)
clusters = model.fit_predict(X)

# `clusters` assigns each example to one of 4 groups, learned from structure alone.

The two examples use different datasets, but the contrast that matters is in the method calls: KMeans.fit_predict(X) takes only features, while RandomForestClassifier.fit(X, y) takes features and labels.

How to choose

Situation	Paradigm
You have labelled examples and want to predict the label for new data	Supervised
You have unlabelled examples and want to discover groups or anomalies	Unsupervised
You have an agent that can act in an environment and receive rewards	Reinforcement

Most real systems combine paradigms. A recommendation engine might use unsupervised clustering to discover taste groups, supervised regression to predict ratings, and reinforcement learning to optimise long-term engagement.

Worked example

For each problem, identify the appropriate learning paradigm and justify.

(a) A bank wants to decide whether to approve a new credit card application.
(b) A streaming service wants to group its users into "taste tribes" without knowing in advance how many tribes there are.
(c) A delivery company wants its drones to learn the most efficient route through a warehouse.
(d) An electricity network wants to identify unusual meter readings that might indicate a fault, without any past examples labelled "fault".

(a) Supervised classification. Historical applications have known outcomes (default / repaid). Train a classifier to predict default probability for new applications.

Marker's note: full marks need the paradigm named AND the labelled outcome column identified specifically (default/repaid), not just "past data".

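(b) Unsupervised clustering. There are no pre-defined tribes. Hierarchical clustering, or k-means run with several candidate values of k (k-means needs the number of clusters chosen up front), finds groups in the viewing history; the marketing team interprets and names them afterwards.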

Marker's note: emphasise that neither the groups nor their NUMBER is fixed in advance. This is the detail that distinguishes it from classification into a small number of preset categories.
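
Because the number of tribes in (b) is unknown, one common approach is to try several cluster counts and score each one. The sketch below is an illustration under assumptions: the data is synthetic (four well-separated blobs standing in for viewing-history features), and the silhouette score is one of several possible ways to compare cluster counts.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for user features: 4 well-separated groups by construction.
X, _ = make_blobs(
    n_samples=200,
    centers=[(0, 0), (10, 0), (0, 10), (10, 10)],
    cluster_std=0.5,
    random_state=0,
)

scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    # Silhouette score: higher means tighter, better-separated clusters.
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print("Best number of clusters:", best_k)
```

On this deliberately tidy data the best score falls at k = 4. Real viewing data is messier, so the score is a guide that a human still has to interpret.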

(c) Reinforcement learning. The drone takes actions (turn left, turn right, lift) and receives rewards for reaching the package quickly and penalties for collisions. Over many trials it learns a policy that maximises its long-term reward.

Marker's note: name at least the action and reward explicitly; "it learns over time" alone is too vague for full marks.

(d) Unsupervised anomaly detection. With no labelled "fault" examples to train against, the system instead models what normal meter readings look like and flags readings that deviate significantly from that pattern.

Marker's note: candidates often default to "supervised" here because "detection" sounds like classification; the giveaway is "without any past examples labelled fault" - no labels means it cannot be supervised.
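
The idea in (d), modelling what "normal" looks like and flagging departures from it, can be sketched without any labels. Everything specific here is an assumption for illustration: the readings are made up, and the modified z-score (based on the median, with the commonly used cut-off of 3.5) is just one simple way to measure how unusual a value is.

```python
import numpy as np

# Hypothetical meter readings in kWh. Nothing is labelled "fault".
readings = np.array(
    [20.1, 19.8, 20.4, 21.0, 19.5, 20.2, 45.0, 19.9, 20.3, 0.5, 20.0, 19.7]
)

# Describe "normal" with the median and the median absolute deviation (MAD).
# These are robust statistics, so extreme readings barely shift the baseline.
median = np.median(readings)
mad = np.median(np.abs(readings - median))

# Modified z-score: how many robust "standard deviations" from normal.
robust_z = 0.6745 * (readings - median) / mad

anomalies = np.flatnonzero(np.abs(robust_z) > 3.5)
print(anomalies)            # [6 9]
print(readings[anomalies])  # [45.   0.5]
```

The two flagged readings were never labelled as faults. They are flagged only because they sit far from the pattern set by the other readings.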

Common traps

Calling clustering a kind of classification: Clustering has no predefined labels. Classification does. They look superficially similar but are different paradigms.
Confusing reinforcement learning with supervised learning by example: RL learns from a reward signal received after acting, which is often delayed and never states what the correct action was. Supervised learning learns from the correct answer supplied with each training example.
Treating unsupervised learning as easier because there are no labels: It is often harder. There is no objective right answer to compare against, so evaluating model quality is qualitative or relies on indirect metrics.
Naming the paradigm without the specifics: "This is supervised learning" alone rarely earns full marks. Markers want the features and the label (or, for RL, the agent, environment and reward) named specifically for the scenario given.
Forgetting semi-supervised and self-supervised learning: Real-world problems often fall in between. Large language models use self-supervised pretraining (predict the next word) on unlabelled text. This is not in scope for HSC but worth knowing exists.

Exam-style practice questions

Practice questions written in the style of NESA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

2025 HSC (5 marks): Compare supervised and unsupervised learning. For each, describe an example application and identify what the training data looks like.

Worked answer:

Supervised learning trains on labelled examples - each row of training data has both features and the known correct answer (label). The model learns the function from features to label.

Training data: features plus labels. Example: medical scans labelled by a radiologist as "tumour" or "no tumour".
Application: classifying chest x-rays as showing pneumonia. The hospital collects thousands of scans, each labelled by a doctor, and trains a model to classify new scans.
Output: a prediction (the label) for new examples.

Unsupervised learning trains on unlabelled examples - only features, no answers. The algorithm finds structure on its own.

Training data: features only. Example: customer purchase histories with no segments attached.
Application: customer segmentation in retail. The algorithm clusters customers into groups with similar buying habits without anyone defining the groups in advance.
Output: a structure (clusters, dimensions, anomalies) derived from the data.

The key difference is the availability of labels. Supervised problems come with a known answer for every training example; unsupervised problems require the algorithm to find patterns in the features alone. Labelling is expensive, so unsupervised methods are common when labels are unavailable.

Markers reward both definitions, the labels vs no labels distinction, a concrete application for each, and recognising that both are still learning patterns from data.

Practice questions

Original practice questions graded from foundation to exam level, each with a full worked solution. Try them before revealing the solution.

Foundation (2 marks): State two differences between supervised and unsupervised learning.

Worked solution:

Any two of the following (1 mark each):

Supervised learning trains on features plus labels; unsupervised learning trains on features only.
Supervised learning predicts a known type of answer for new data; unsupervised learning discovers structure with no predefined answer to check against.
Supervised learning is usually evaluated with an accuracy-style metric against known labels; unsupervised learning is harder to evaluate because there is no ground truth.

Marking criteria: 1 mark per correctly stated and distinct difference, to a maximum of 2.

Foundation (3 marks): A veterinary clinic has 10,000 past records of animals with known outcomes (recovered / did not recover) after a given treatment, and wants to predict the outcome for a new animal. Identify the learning paradigm and name the features and the label for this problem.

Worked solution:

This is a supervised learning problem (specifically classification), because each historical record already has a known correct outcome attached.

Features (inputs): variables such as species, age, weight, treatment given, and vital signs at admission.

Label (the answer column): the recorded outcome, "recovered" or "did not recover".

Marking criteria: 1 mark for correctly identifying supervised learning, 1 mark for correctly identifying plausible features, 1 mark for correctly identifying the label as the recorded outcome.

Core (4 marks): The table below shows the results of clustering 300 customers into groups using purchase history, with no prior group definitions supplied to the algorithm.

| Cluster | Customers | Average spend/month | Average items/order |
|---|---|---|---|
| 1 | 120 | $45 | 2.1 |
| 2 | 90 | $210 | 6.8 |
| 3 | 90 | $30 | 1.2 |

(a) Explain why this is an unsupervised learning task rather than supervised.
(b) Suggest a business name and one use for Cluster 2.

Worked solution:

(a) Why unsupervised. No customer was pre-labelled with a segment name before clustering; the algorithm was given only features (spend, items per order) and grouped customers by similarity in those features on its own. This is the defining trait of unsupervised learning: structure is discovered, not predicted against a known answer.

(b) Cluster 2. With the highest average spend ($210/month) and largest average order size (6.8 items), Cluster 2 could reasonably be labelled "high-value bulk buyers". A retailer could use this group to target premium loyalty offers or early access to new stock, since losing these customers would have a disproportionate revenue impact.

Marking criteria: 1 mark for correctly identifying no pre-existing labels, 1 mark for linking this to the definition of unsupervised learning, 1 mark for a sensible cluster name matching the data, 1 mark for a plausible business use of that cluster.

Core (4 marks): A games studio wants an in-game character to learn to navigate a maze it has never seen, improving over many attempts based only on whether it reaches the exit faster or hits a wall.

(a) Identify the learning paradigm.
(b) Identify the agent, environment, action and reward in this scenario.

Worked solution:

(a) Paradigm. This is reinforcement learning: the character improves through trial and error, guided by a feedback signal from acting in an environment, not from a dataset of correct paths supplied in advance.

(b) Components.

Agent: the in-game character.
Environment: the maze.
Action: a movement choice, e.g. move forward, turn left, turn right.
Reward: positive feedback for reaching the exit sooner, negative feedback (penalty) for hitting a wall.

Marking criteria: 1 mark for correctly identifying reinforcement learning, 1 mark each for correctly identifying the agent, environment and action/reward pair (up to 3 marks).

Exam (5 marks): Explain why choosing the wrong learning paradigm for a problem can waste significant development effort. Illustrate with a scenario where a team incorrectly tries to solve an unsupervised problem using a supervised approach.

Worked solution:

Choosing the wrong paradigm wastes effort because the required inputs are fundamentally different: supervised learning needs a labelled dataset before training can even begin, while unsupervised learning does not. If a team assumes a problem is supervised when it is not, they will spend time and money labelling data that either does not need to exist or cannot be produced correctly.

Scenario. A telecommunications company wants to discover previously unknown patterns of fraudulent account behaviour it has never seen before. A team incorrectly treats this as supervised learning: they ask staff to manually label thousands of past accounts as "fraud" or "not fraud" so a classifier can be trained. This fails for two reasons: (1) genuinely novel fraud patterns, by definition, are not represented in the historical labels, so a classifier trained on past fraud only recognises repeats of already-known fraud, and (2) the labelling effort itself is expensive and slow, and may be inconsistent between staff members (label bias).

The correct approach is unsupervised anomaly detection: cluster or model normal account behaviour from unlabelled data, then flag accounts that deviate significantly from the normal pattern, without needing anyone to have already labelled them as fraudulent. This detects new fraud types the supervised approach would have missed entirely, while avoiding the wasted labelling effort.

Marking criteria: 1 mark for explaining that supervised learning requires labelled data before training can start, 1 mark for identifying the wasted labelling effort as the direct cost, 1 mark for a coherent scenario showing a team wrongly forcing an unsupervised problem into a supervised approach, 1 mark for explaining why the supervised approach specifically fails (cannot detect novel patterns absent from past labels), 1 mark for correctly naming the better paradigm (unsupervised anomaly detection).

Exam (6 marks): Justify the design of a self-driving delivery robot's software using at least two of the three learning paradigms (supervised, unsupervised, reinforcement), explaining what role each paradigm plays and what training data or feedback each would need.

Worked solution:

A realistic self-driving delivery robot combines multiple paradigms because no single paradigm covers every sub-task well.

Supervised learning for perception: The robot needs to classify objects in its camera feed, such as pedestrians, other vehicles, kerbs and obstacles. This is a supervised classification problem: the training data is a large set of labelled images (each object type marked by human annotators), and the model learns to predict the object category for new camera frames in real time. Without accurate labels for training, the robot cannot reliably tell a rubbish bin from a pedestrian.
Reinforcement learning for navigation policy: Deciding how to move (accelerate, brake, steer) to reach the delivery point efficiently while avoiding collisions is naturally a reinforcement learning problem. The robot (agent) takes actions in its environment (the street), and receives reward signals, positive for reaching the destination quickly and safely, negative for collisions, jerky movements, or leaving the footpath. Over many simulated and real trials the robot learns a policy that balances speed and safety, something that would be extremely hard to hand-code as fixed rules for every possible street layout.
Optional unsupervised role: The company could also use unsupervised clustering on delivery route data (with no predefined labels) to discover naturally occurring "zones" of similar traffic and pedestrian density, which then informs how cautious the reinforcement learning policy should be in each zone.
Judgement: Perception (supervised) and control (reinforcement) solve different kinds of problems: one classifies a static image against a known answer, the other learns a sequential decision strategy from ongoing feedback. A single paradigm could not do both well. Combining them, with unsupervised methods as an optional extra layer for pattern discovery, reflects how real autonomous systems are typically engineered.

Marking criteria: 1 mark for correctly assigning supervised learning to perception/classification, 1 mark for correctly describing its training data (labelled images), 1 mark for correctly assigning reinforcement learning to navigation/control, 1 mark for correctly describing agent/environment/reward for that role, 1 mark for a coherent justification of why a single paradigm is insufficient, 1 mark for a concluding judgement tying the design together.