Inquiry Question 2: How can the security of a developed solution be evaluated?
Apply input validation, sanitisation and output encoding to defend against injection attacks
A focused answer to the HSC Software Engineering Module 1 dot point on input validation. Allow-list vs deny-list, sanitisation, output encoding, parameterised queries, the worked SQL injection example, and the traps markers look for.
What this dot point is asking
NESA wants you to define three related but distinct techniques (input validation, sanitisation and output encoding) and apply each to defend against injection attacks. You need to know when to use which, and why parameterised queries, which play the role of output encoding for SQL, are the primary defence against SQL injection.
The answer
Input validation
Check that incoming data matches an expected format before processing it. Two approaches:
- Allow-list (preferred): specify exactly what is permitted, reject everything else. "Username must be 3-32 letters, digits or underscores."
- Deny-list: specify what is forbidden, allow everything else. Brittle because attackers find creative bypasses.
Validate at the server, not just in the browser. Browser validation improves UX but does nothing for an attacker who calls your API directly.
import re

USERNAME_PATTERN = re.compile(r"^[a-zA-Z0-9_]{3,32}$")

def validate_username(username):
    # Allow-list check: the whole string must be 3-32 letters, digits or underscores.
    if not USERNAME_PATTERN.fullmatch(username):
        raise ValueError("Invalid username format")
    return username
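To see why a deny-list is brittle, the sketch below compares a hypothetical deny-list check with the allow-list rule. The function names and the list of blocked strings are assumptions made for illustration, not a recommended filter.

```python
import re

# Hypothetical deny-list: block a few strings that "look dangerous".
BLOCKED = ["<script", "--", " or "]

def deny_list_ok(text):
    lowered = text.lower()
    return not any(bad in lowered for bad in BLOCKED)

# Allow-list: 3-32 letters, digits or underscores, nothing else.
ALLOWED = re.compile(r"[a-zA-Z0-9_]{3,32}")

def allow_list_ok(text):
    return ALLOWED.fullmatch(text) is not None

payload = "'OR'1'='1"  # no spaces, so the " or " entry never matches

print(deny_list_ok(payload))   # True: the payload slips past the deny-list
print(allow_list_ok(payload))  # False: quotes and '=' are simply not permitted
```

The deny-list fails because its author had to predict every dangerous spelling in advance. The allow-list never needs to know what an attack looks like.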
Sanitisation
Transform input to make it safe for downstream use. Removes or escapes unwanted characters but does not reject the request.
import html

def sanitise_for_display(text):
    # Escape &, <, > and quotes so the browser renders them as text, not markup.
    return html.escape(text)

print(sanitise_for_display("<script>alert(1)</script>"))
# &lt;script&gt;alert(1)&lt;/script&gt;
Sanitisation is a useful defence in depth but is fragile when used as the only defence. Different output contexts (HTML attribute, JavaScript string, SQL value, URL) have different escape rules.
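A small sketch of that fragility, assuming a page that places a user-supplied link inside an href attribute. The variable names and the is_safe_link helper are hypothetical. HTML escaping leaves a javascript: URL untouched because it contains none of the characters that html.escape rewrites.

```python
import html

user_url = "javascript:alert(1)"  # attacker-supplied "profile link"
escaped = html.escape(user_url)

# No <, >, & or quotes in the input, so escaping changes nothing.
print(escaped == user_url)  # True

# The escaped value is still dangerous in a URL context.
link = f'<a href="{escaped}">profile</a>'
print(link)

def is_safe_link(url):
    # Hypothetical allow-list on the URL scheme, the check this context needs.
    return url.startswith(("https://", "http://"))

print(is_safe_link(user_url))  # False
```

The escape rule that is correct for an HTML body is the wrong tool here; the URL context needs its own check.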
Output encoding
Transform data at the boundary where it is written to a target context. The encoding depends on the context:
- HTML body: HTML-encode <, >, &, " and '.
- HTML attribute: HTML-encode plus quote the attribute.
- JavaScript string: JavaScript-encode and never trust user input as code.
- SQL: do not encode - use parameterised queries.
- URL parameter: URL-encode.
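As a rough illustration of context-dependent encoding, the sketch below runs one value through two standard-library encoders. The sample string and variable names are assumptions; only the HTML and URL contexts are shown.

```python
import html
from urllib.parse import quote

value = 'Tom & "Jerry" <b>'

# HTML context: html.escape also escapes quotes by default (quote=True).
html_body = html.escape(value)

# URL parameter context: percent-encode everything that is not unreserved.
url_param = quote(value, safe="")

print(html_body)  # Tom &amp; &quot;Jerry&quot; &lt;b&gt;
print(url_param)  # Tom%20%26%20%22Jerry%22%20%3Cb%3E
```

The same input produces two different safe forms, which is why encoding belongs at the output boundary rather than at input time.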
The big example: parameterised queries
A vulnerable login query:
def login(username, password):
    query = f"SELECT * FROM users WHERE name = '{username}' AND pass = '{password}'"
    return db.execute(query).fetchone()
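Submitting admin as the username and ' OR '1'='1 as the password turns the query into:
SELECT * FROM users WHERE name = 'admin' AND pass = '' OR '1'='1'
Because AND binds more tightly than OR, the condition is true for every row, so fetchone() returns the first user in the table (often the admin) regardless of password.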
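To see the effect, here is a minimal, deliberately unsafe sketch using Python's built-in sqlite3 module with an in-memory database. The table contents and the name vulnerable_login are assumptions for illustration, and passwords are stored in plain text only to mirror the vulnerable query.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, pass TEXT)")
db.execute("INSERT INTO users VALUES ('admin', 's3cret')")
db.execute("INSERT INTO users VALUES ('alice', 'hunter2')")

def vulnerable_login(username, password):
    # Deliberately unsafe: user input is pasted straight into the SQL text.
    query = f"SELECT * FROM users WHERE name = '{username}' AND pass = '{password}'"
    return db.execute(query).fetchall()

# A wrong password matches nothing.
print(vulnerable_login("admin", "wrong"))  # []

# The injected OR '1'='1' makes the WHERE clause true for every row.
print(len(vulnerable_login("admin", "' OR '1'='1")))  # 2
```

fetchall() is used here so the full match set is visible; with fetchone() the caller would receive the first of those rows and treat the login as successful.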
The fix:
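import bcrypt  # third-party package

def login(username, password):
    # The username is bound to the ? placeholder, never pasted into the SQL text.
    query = "SELECT id, pass_hash FROM users WHERE name = ?"
    row = db.execute(query, (username,)).fetchone()
    # Assumes rows allow access by column name and pass_hash is stored as bytes.
    if row and bcrypt.checkpw(password.encode(), row["pass_hash"]):
        return row["id"]
    return None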
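The database driver binds the value to the ? placeholder as data, so it is never parsed as SQL. No string concatenation, no escape rules, no injection. The placeholder syntax varies by driver; ? is the style used by Python's sqlite3 module.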
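A minimal sketch of parameter binding in isolation, again assuming sqlite3 and an in-memory table. The find_user name and the schema are hypothetical, and password checking is left out to keep the focus on the placeholder.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users (name) VALUES (?)", ("admin",))

def find_user(username):
    # The SQL text is fixed; the value is supplied separately as a bound parameter.
    query = "SELECT id, name FROM users WHERE name = ?"
    return db.execute(query, (username,)).fetchone()

print(find_user("admin"))             # (1, 'admin')
print(find_user("admin' OR '1'='1"))  # None: no user has that literal name
```

The injection string is compared against the name column as an ordinary value, so the quote characters have no special meaning.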
Defence in depth
Real systems combine all three:
- Validate at input: reject obviously malformed data early.
- Use parameterised queries for SQL (and equivalent techniques for other languages).
- Encode at output for HTML, JavaScript, URL contexts.
- Apply Content Security Policy headers to limit damage if XSS slips through.
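The layers above can be sketched together as follows. This is an illustrative outline, not a complete application: the schema, the function names and the SECURITY_HEADERS dictionary are assumptions, and a real web framework would attach the header to each response.

```python
import html
import re
import sqlite3

USERNAME_PATTERN = re.compile(r"[a-zA-Z0-9_]{3,32}")

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, bio TEXT)")

def save_profile(username, bio):
    # 1. Validate at input: reject malformed usernames outright.
    if not USERNAME_PATTERN.fullmatch(username):
        raise ValueError("Invalid username format")
    # 2. Parameterised query: values are bound, never concatenated.
    db.execute("INSERT INTO users (name, bio) VALUES (?, ?)", (username, bio))

def render_profile(username):
    row = db.execute(
        "SELECT name, bio FROM users WHERE name = ?", (username,)
    ).fetchone()
    if row is None:
        return "<p>No such user</p>"
    # 3. Encode at output: the bio is stored raw and escaped for HTML here.
    return f"<h1>{html.escape(row[0])}</h1><p>{html.escape(row[1])}</p>"

# 4. Hypothetical response header limiting where scripts may load from.
SECURITY_HEADERS = {"Content-Security-Policy": "default-src 'self'"}

save_profile("alice_01", "<script>alert(1)</script>")
print(render_profile("alice_01"))
# <h1>alice_01</h1><p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

Each layer covers a different failure: validation stops malformed names, binding stops SQL injection, and output encoding stops the stored script from running in the browser.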
Past exam questions, worked
Real questions from past NESA papers on this dot point, with our answer explainer.
2024 HSC, 5 marks: Distinguish between input validation, sanitisation and output encoding. Show how each technique defends against an SQL injection attack on a login form.
Input validation checks that data matches an expected format before any further processing. A username field might require 3-32 characters, alphanumeric only. If the input fails the check, the request is rejected. Validation works best as an allow-list (specify what is allowed) rather than a deny-list (specify what is blocked).
Sanitisation transforms input to make it safe for downstream use - for example, stripping or escaping characters that would have a special meaning. Validation rejects bad input; sanitisation modifies it.
Output encoding transforms data at the point it is written to a target context (HTML, SQL, shell). Encoding HTML entities (< becomes &lt;) prevents reflected XSS; SQL parameter binding prevents SQL injection.
For an SQL injection attack via the login form:
- Validation: accept only usernames that match the allow-list pattern, which rules out quote and semicolon characters. Defends in depth but is not the primary fix.
- Sanitisation: strip or escape SQL metacharacters. Fragile because escape rules differ by database.
- Output encoding (parameterised queries): pass the username as a bound parameter so the database driver never interprets it as SQL. This is the primary defence against SQL injection.
Markers reward the three definitions, the distinction (validation rejects, sanitisation transforms, encoding adapts to context), and identifying parameterised queries as the canonical defence.