ExamExplained
NSW · Software Engineering

Inquiry Question 1: How do machine learning systems work?

Compare supervised, unsupervised and reinforcement learning, and identify a typical application of each

A focused answer to the HSC Software Engineering Module 3 dot point on learning paradigms. Supervised classification and regression, unsupervised clustering, reinforcement learning, applications of each, worked examples, and the traps markers look for.

Reviewed by: AI editorial process; not yet individually human-reviewed


What this dot point is asking

NESA wants you to compare the three main paradigms of machine learning, identify what kind of training data each needs, and give a typical application of each.

The answer

Supervised learning

The training data has both features and labels. The algorithm learns the function from features to label. Two sub-categories:

  • Classification: the label is a category. Spam vs not spam. Cat vs dog. The output is a class.
  • Regression: the label is a number. Predicting house prices, exam marks, stock prices. The output is a continuous value.

Common algorithms: logistic regression, decision trees, random forests, gradient-boosted trees, support vector machines, neural networks.
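
To make the classification versus regression split concrete, here is a minimal pure-Python sketch. It deliberately uses two very simple techniques, 1-nearest-neighbour and a least-squares straight line, rather than the algorithms listed above. All data values and the names `classify_nearest` and `fit_line` are invented for illustration.

```python
# Toy data, invented for illustration only.
# Classification: features = (hours_studied, hours_slept), label = a category.
labelled_students = [
    ((1.0, 4.0), "fail"),
    ((2.0, 5.0), "fail"),
    ((6.0, 7.0), "pass"),
    ((8.0, 8.0), "pass"),
]


def classify_nearest(features, training_data):
    """1-nearest-neighbour: copy the label of the closest training example."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    closest = min(training_data, key=lambda row: squared_distance(row[0], features))
    return closest[1]


# Regression: feature = floor area (m^2), label = a number (price in $1000s).
areas = [50.0, 80.0, 100.0, 150.0]
prices = [300.0, 450.0, 550.0, 800.0]


def fit_line(xs, ys):
    """Least-squares straight line through the points; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# Classification output is a class; regression output is a continuous value.
print(classify_nearest((7.0, 7.0), labelled_students))
slope, intercept = fit_line(areas, prices)
print(slope * 120.0 + intercept)
```

Both functions learn from labelled rows. The only difference is the type of the label, and therefore the type of the prediction.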

Applications:

  • Email spam filtering (classification).
  • Image classification (classification).
  • House price prediction (regression).
  • Disease diagnosis from medical imaging (classification).
  • Forecasting daily power demand (regression).

Unsupervised learning

The training data has features but no labels. The algorithm finds structure in the data on its own. Sub-categories:

  • Clustering: group similar examples together. K-means is the textbook algorithm.
  • Dimensionality reduction: compress many features into a few. Principal Component Analysis (PCA), t-SNE.
  • Anomaly detection: flag examples that are very different from the rest.
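
As a rough illustration of anomaly detection without labels, the sketch below flags values that sit far from the mean, measured in standard deviations (a z-score). This is one simple technique among many. The transaction amounts, the threshold of 2.0 and the name `flag_anomalies` are assumptions made for the example.

```python
from statistics import mean, stdev


def flag_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    centre = mean(values)
    spread = stdev(values)
    if spread == 0:
        return []  # every value is identical, so nothing stands out
    return [v for v in values if abs(v - centre) / spread > threshold]


# Invented transaction amounts: no row is labelled "fraud" or "not fraud".
transactions = [42.0, 38.5, 45.0, 40.0, 39.0, 41.5, 43.0, 950.0, 44.0, 37.0]
print(flag_anomalies(transactions))
```

Nothing in the input says which transaction is suspicious. The outlier is found purely from how different it is from the rest.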

Applications:

  • Customer segmentation in marketing.
  • Recommendation engines (people similar to you also liked...).
  • Fraud detection (transactions that look unusual).
  • Topic modelling on a corpus of documents.

Figure: Supervised, unsupervised and reinforcement learning compared as three pipelines. Supervised learning: labelled training data (features plus known answers) feeds a training algorithm that produces a model able to predict labels for new data. Unsupervised learning: unlabelled training data (features only) feeds a structure-finding algorithm that produces discovered structure such as clusters or anomalies. Reinforcement learning: an agent takes actions in an environment and receives a reward and a new state in a repeating loop, gradually improving its policy to maximise long-term reward.

Reinforcement learning

An agent learns by interacting with an environment, receiving rewards or penalties for its actions. The algorithm learns a policy that maximises long-term reward.

Vocabulary:

  • Agent: the learner.
  • Environment: the world the agent acts in.
  • State: a snapshot of the environment.
  • Action: a choice the agent makes.
  • Reward: feedback after each action.
  • Policy: the strategy the agent learns.
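
The vocabulary above can be mapped onto a few lines of code. The sketch below uses tabular Q-learning, which is one common reinforcement learning algorithm chosen here for illustration and not named in the syllabus text. The five-cell corridor, the reward values and the hyperparameters are all hypothetical.

```python
import random

N_STATES = 5        # environment: corridor cells 0..4; cell 4 is the goal
GOAL = N_STATES - 1
ACTIONS = (-1, +1)  # actions: step left or step right


def step(state, action):
    """Environment dynamics: return (new_state, reward, done)."""
    new_state = min(max(state + action, 0), GOAL)
    if new_state == GOAL:
        return new_state, 1.0, True   # reward for reaching the goal
    return new_state, -0.01, False    # small penalty for every other step


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Agent: learn a value for each (state, action) pair by trial and error."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            new_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(new_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate towards reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = new_state
            if done:
                break
    return q


q_table = train()
# Policy: in each non-goal state, pick the action with the highest learned value.
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(GOAL)}
print(policy)
```

No dataset of correct moves is supplied. The agent only ever sees states and rewards, and the policy emerges from repeating the loop.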

Applications:

  • Game playing (AlphaGo, AlphaZero-style chess engines, video games).
  • Robotics (a robot learning to walk).
  • Autonomous driving decisions.
  • Resource scheduling and operations research.

A worked Python example: supervised classification

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

data = load_breast_cancer()
X = data.data  # 30 features per tumour
y = data.target  # 0 = malignant, 1 = benign

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

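# Train on the labelled training split: features AND labels are passed to fit.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Predict labels for unseen examples, then compare them with the true labels.
predictions = model.predict(X_test)
print(classification_report(y_test, predictions))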

A worked example: unsupervised clustering

from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# No labels passed to fit.
model = KMeans(n_clusters=4, random_state=42, n_init=10)
clusters = model.fit_predict(X)

# `clusters` assigns each example to one of 4 groups, learned from structure alone.

The two examples use different datasets, but the key contrast is in the calls: the KMeans call model.fit_predict(X) takes only features, while the RandomForestClassifier call model.fit(X_train, y_train) takes features and labels.
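
For intuition about what a k-means call does internally, here is a minimal pure-Python sketch of the assign-then-update loop. It initialises centroids from the first k points, a simplifying assumption that differs from scikit-learn's default k-means++ initialisation. The points and the function name `k_means` are invented for the example.

```python
def k_means(points, k, iterations=10):
    """Very small k-means for 2D points; returns (centroids, clusters)."""
    centroids = list(points[:k])  # naive initialisation (assumption for this sketch)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its assigned points.
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Invented, unlabelled 2D points forming two obvious groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = k_means(points, k=2)
print(clusters)
```

The function never receives a label. The groups come only from distances between feature values.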

How to choose

| Situation | Paradigm |
|---|---|
| You have labelled examples and want to predict the label for new data | Supervised |
| You have unlabelled examples and want to discover groups or anomalies | Unsupervised |
| You have an agent that can act in an environment and receive rewards | Reinforcement |

Most real systems combine paradigms. A recommendation engine might use unsupervised clustering to discover taste groups, supervised regression to predict ratings, and reinforcement learning to optimise long-term engagement.

Exam-style practice questions

Practice questions written in the style of NESA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

2025 HSC (5 marks): Compare supervised and unsupervised learning. For each, describe an example application and identify what the training data looks like.

Supervised learning trains on labelled examples - each row of training data has both features and the known correct answer (label). The model learns the function from features to label.

  • Training data: features plus labels. Example: medical scans labelled by a radiologist as "tumour" or "no tumour".
  • Application: classifying chest x-rays as showing pneumonia. The hospital collects thousands of scans, each labelled by a doctor, and trains a model to classify new scans.
  • Output: a prediction (the label) for new examples.

Unsupervised learning trains on unlabelled examples - only features, no answers. The algorithm finds structure on its own.

  • Training data: features only. Example: customer purchase histories with no segments attached.
  • Application: customer segmentation in retail. The algorithm clusters customers into groups with similar buying habits without anyone defining the groups in advance.
  • Output: a structure (clusters, dimensions, anomalies) derived from the data.

The key difference is the availability of labels. Supervised problems have an answer; unsupervised problems require the algorithm to find patterns in the data alone. Labelling is expensive, so unsupervised methods are common when labels are unavailable.

Markers reward both definitions, the labels vs no labels distinction, a concrete application for each, and recognising that both are still learning patterns from data.

Practice questions

Original practice questions graded from foundation to exam level, each with a full worked solution. Try them before revealing the solution.

Foundation (2 marks): State two differences between supervised and unsupervised learning.

Any two of the following (1 mark each):

  • Supervised learning trains on features plus labels; unsupervised learning trains on features only.
  • Supervised learning predicts a known type of answer for new data; unsupervised learning discovers structure with no predefined answer to check against.
  • Supervised learning is usually evaluated with an accuracy-style metric against known labels; unsupervised learning is harder to evaluate because there is no ground truth.

Marking criteria: 1 mark per correctly stated and distinct difference, to a maximum of 2.

Foundation (3 marks): A veterinary clinic has 10,000 past records of animals with known outcomes (recovered / did not recover) after a given treatment, and wants to predict the outcome for a new animal. Identify the learning paradigm and name the features and the label for this problem.

This is a supervised learning problem (specifically classification), because each historical record already has a known correct outcome attached.

Features (inputs): variables such as species, age, weight, treatment given, and vital signs at admission.

Label (the answer column): the recorded outcome, "recovered" or "did not recover".
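
In code, the split between features and label is usually just a choice of which column to hold out. The sketch below assumes a hypothetical record layout. The field names and values are invented and do not come from any real clinic dataset.

```python
# Invented example records; "outcome" is the label column.
records = [
    {"species": "dog", "age_years": 4, "weight_kg": 18.5, "treatment": "A", "outcome": "recovered"},
    {"species": "cat", "age_years": 12, "weight_kg": 3.9, "treatment": "A", "outcome": "did not recover"},
    {"species": "dog", "age_years": 7, "weight_kg": 30.2, "treatment": "B", "outcome": "recovered"},
]

LABEL_COLUMN = "outcome"


def split_features_and_label(record):
    """Separate one record into its input features and its known answer."""
    features = {key: value for key, value in record.items() if key != LABEL_COLUMN}
    return features, record[LABEL_COLUMN]


pairs = [split_features_and_label(r) for r in records]
X = [features for features, _ in pairs]  # inputs the model learns from
y = [label for _, label in pairs]        # known answers the model learns to predict
print(X[0])
print(y)
```

Because every historical record has a value in the label column, the problem is supervised. A new animal would arrive with features only, and the model would predict the missing outcome.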

Marking criteria: 1 mark for correctly identifying supervised learning, 1 mark for correctly identifying plausible features, 1 mark for correctly identifying the label as the recorded outcome.

Core (4 marks): The table below shows the results of clustering 300 customers into groups using purchase history, with no prior group definitions supplied to the algorithm.

| Cluster | Customers | Average spend/month | Average items/order |
|---|---|---|---|
| 1 | 120 | $45 | 2.1 |
| 2 | 90 | $210 | 6.8 |
| 3 | 90 | $30 | 1.2 |

(a) Explain why this is an unsupervised learning task rather than supervised.
(b) Suggest a business name and one use for Cluster 2.

(a) Why unsupervised. No customer was pre-labelled with a segment name before clustering; the algorithm was given only features (spend, items per order) and grouped customers by similarity in those features on its own. This is the defining trait of unsupervised learning: structure is discovered, not predicted against a known answer.

(b) Cluster 2. With the highest average spend ($210/month) and largest average order size (6.8 items), Cluster 2 could reasonably be labelled "high-value bulk buyers". A retailer could use this group to target premium loyalty offers or early access to new stock, since losing these customers would have a disproportionate revenue impact.
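
The claim about disproportionate revenue impact can be checked with quick arithmetic on the table. This sketch assumes that customers multiplied by average monthly spend is a fair approximation of each cluster's monthly revenue.

```python
# (customers, average spend per month in dollars), copied from the question's table.
clusters = {1: (120, 45), 2: (90, 210), 3: (90, 30)}

revenue = {c: customers * spend for c, (customers, spend) in clusters.items()}
total_revenue = sum(revenue.values())
total_customers = sum(customers for customers, _ in clusters.values())

customer_share = {c: customers / total_customers for c, (customers, _) in clusters.items()}
spend_share = {c: r / total_revenue for c, r in revenue.items()}

for c in clusters:
    print(f"Cluster {c}: {customer_share[c]:.0%} of customers, {spend_share[c]:.0%} of spend")
```

Under that assumption, Cluster 2 is 30% of customers (90 of 300) but about 70% of monthly spend ($18,900 of $27,000), which supports naming it the high-value group.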

Marking criteria: 1 mark for correctly identifying no pre-existing labels, 1 mark for linking this to the definition of unsupervised learning, 1 mark for a sensible cluster name matching the data, 1 mark for a plausible business use of that cluster.

Core (4 marks): A games studio wants an in-game character to learn to navigate a maze it has never seen, improving over many attempts based only on whether it reaches the exit faster or hits a wall.

(a) Identify the learning paradigm.
(b) Identify the agent, environment, action and reward in this scenario.

(a) Paradigm. This is reinforcement learning: the character improves through trial and error, guided by a feedback signal from acting in an environment, not from a dataset of correct paths supplied in advance.

(b) Components.

  • Agent: the in-game character.
  • Environment: the maze.
  • Action: a movement choice, e.g. move forward, turn left, turn right.
  • Reward: positive feedback for reaching the exit sooner, negative feedback (penalty) for hitting a wall.
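
One way to express this reward signal in code is a small per-step penalty, a larger penalty for hitting a wall, and a bonus for reaching the exit. The numeric values and event names below are hypothetical, and a real game would tune them.

```python
# Hypothetical reward values, chosen only to illustrate the idea.
STEP_PENALTY = -0.1   # each ordinary move costs a little, so faster runs score higher
WALL_PENALTY = -1.0   # hitting a wall is penalised more heavily
EXIT_REWARD = 10.0    # reaching the exit is the main positive signal


def reward_for(event):
    """Map one thing that happened in the maze to a reward value."""
    if event == "exit":
        return EXIT_REWARD
    if event == "wall":
        return WALL_PENALTY
    return STEP_PENALTY


def episode_return(events):
    """Total reward collected over one attempt at the maze."""
    return sum(reward_for(event) for event in events)


fast_run = ["move", "move", "exit"]
slow_run = ["move", "wall", "move", "move", "exit"]
print(episode_return(fast_run), episode_return(slow_run))
```

With these values the shorter, wall-free attempt earns a higher total than the longer one, which is the signal that lets the character improve over many attempts.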

Marking criteria: 1 mark for correctly identifying reinforcement learning, 1 mark each for correctly identifying the agent, environment and action/reward pair (up to 3 marks).

Exam (5 marks): Explain why choosing the wrong learning paradigm for a problem can waste significant development effort. Illustrate with a scenario where a team incorrectly tries to solve an unsupervised problem using a supervised approach.

Choosing the wrong paradigm wastes effort because the required inputs are fundamentally different: supervised learning needs a labelled dataset before training can even begin, while unsupervised learning does not. If a team assumes a problem is supervised when it is not, they will spend time and money labelling data that either does not need to exist or cannot be produced correctly.

Scenario. A telecommunications company wants to discover patterns of fraudulent account behaviour that it has never seen before. A team incorrectly treats this as supervised learning: they ask staff to manually label thousands of past accounts as "fraud" or "not fraud" so a classifier can be trained. This fails for two reasons: (1) genuinely novel fraud patterns are, by definition, not represented in the historical labels, so a classifier trained on past fraud only recognises repeats of already-known fraud; and (2) the labelling effort itself is expensive and slow, and may be inconsistent between staff members (label noise).

The correct approach is unsupervised anomaly detection: cluster or model normal account behaviour from unlabelled data, then flag accounts that deviate significantly from the normal pattern, without needing anyone to have already labelled them as fraudulent. This detects new fraud types the supervised approach would have missed entirely, while avoiding the wasted labelling effort.

Marking criteria: 1 mark for explaining that supervised learning requires labelled data before training can start, 1 mark for identifying the wasted labelling effort as the direct cost, 1 mark for a coherent scenario showing a team wrongly forcing an unsupervised problem into a supervised approach, 1 mark for explaining why the supervised approach specifically fails (cannot detect novel patterns absent from past labels), 1 mark for correctly naming the better paradigm (unsupervised anomaly detection).

Exam (6 marks): Justify the design of a self-driving delivery robot's software using at least two of the three learning paradigms (supervised, unsupervised, reinforcement), explaining what role each paradigm plays and what training data or feedback each would need.

A realistic self-driving delivery robot combines multiple paradigms because no single paradigm covers every sub-task well.

Supervised learning for perception

The robot needs to classify objects in its camera feed, such as pedestrians, other vehicles, kerbs and obstacles. This is a supervised classification problem: the training data is a large set of labelled images (each object type marked by human annotators), and the model learns to predict the object category for new camera frames in real time. Without accurate labels for training, the robot cannot reliably tell a rubbish bin from a pedestrian.

Reinforcement learning for navigation policy

Deciding how to move (accelerate, brake, steer) to reach the delivery point efficiently while avoiding collisions is naturally a reinforcement learning problem. The robot (agent) takes actions in its environment (the street) and receives reward signals: positive for reaching the destination quickly and safely, negative for collisions, jerky movements, or leaving the footpath. Over many simulated and real trials the robot learns a policy that balances speed and safety, something that would be extremely hard to hand-code as fixed rules for every possible street layout.

Optional unsupervised role

The company could also use unsupervised clustering on delivery route data (with no predefined labels) to discover naturally occurring "zones" of similar traffic and pedestrian density, which then informs how cautious the reinforcement learning policy should be in each zone.

Judgement

Perception (supervised) and control (reinforcement) solve different kinds of problems: one classifies a static image against a known answer, while the other learns a sequential decision strategy from ongoing feedback. A single paradigm could not do both well. Combining them, with unsupervised methods as an optional extra layer for pattern discovery, reflects how real autonomous systems are typically engineered.

Marking criteria: 1 mark for correctly assigning supervised learning to perception/classification, 1 mark for correctly describing its training data (labelled images), 1 mark for correctly assigning reinforcement learning to navigation/control, 1 mark for correctly describing agent/environment/reward for that role, 1 mark for a coherent justification of why a single paradigm is insufficient, 1 mark for a concluding judgement tying the design together.
