Inquiry Question 1: How are large-scale software solutions developed and managed?

Set up continuous integration and deployment pipelines that build, test and release software automatically

A focused answer to the HSC Software Engineering Module 4 dot point on CI/CD. It covers build, test and deploy automation, GitHub Actions, a worked pipeline example, and the traps markers look for.

Generated by Claude Opus 4.8. 5 min answer. Updated 2026-06-19.

Reviewed by: AI editorial process; not yet individually human-reviewed

What this dot point is asking

NESA wants you to define continuous integration and continuous deployment, describe what a typical pipeline does at each stage, and explain why teams adopt CI/CD.

The answer

Definitions

Continuous integration (CI): every code change is automatically built and tested as soon as it is committed or proposed via a pull request.
Continuous delivery: every change that passes CI is automatically prepared for release; humans choose when to deploy.
Continuous deployment: every change that passes CI is automatically deployed to production. No human gate.

CI is close to universal in professional teams; continuous delivery is common; continuous deployment is mostly used by mature teams with deep test coverage.
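The difference between delivery and deployment comes down to one decision, which can be sketched in a few lines of Python. This is an illustrative model only: the names Mode and should_deploy are hypothetical and do not belong to any CI tool.

```python
from enum import Enum


class Mode(Enum):
    DELIVERY = "continuous delivery"
    DEPLOYMENT = "continuous deployment"


def should_deploy(mode: Mode, ci_passed: bool, human_approved: bool = False) -> bool:
    """Decide whether a change goes to production (hypothetical helper)."""
    if not ci_passed:
        return False  # nothing ships without a green CI run
    if mode is Mode.DEPLOYMENT:
        return True  # no human gate
    return human_approved  # delivery: release-ready, but a person decides


print(should_deploy(Mode.DELIVERY, ci_passed=True))
print(should_deploy(Mode.DEPLOYMENT, ci_passed=True))
```

In both modes a failed CI run blocks the release; the only thing that changes is whether a person has to say yes.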

The pipeline

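A typical CI/CD pipeline runs its stages in a fixed order: build, lint and type check, test, package the release artefact, deploy to staging, smoke test, deploy to production. Any failure halts the pipeline.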

Each stage either passes (continue) or fails (stop, notify, do not deploy).
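The halt-on-first-failure rule can be sketched as a small Python runner. This is a minimal illustration, not how any particular CI service is implemented; the stage names and the lambdas standing in for real commands are assumptions.

```python
from typing import Callable

# Each stage is a (name, check) pair; check returns True on success.
Stage = tuple[str, Callable[[], bool]]


def run_pipeline(stages: list[Stage]) -> tuple[bool, list[str]]:
    """Run stages in order and stop at the first failure."""
    log: list[str] = []
    for name, check in stages:
        if check():
            log.append(f"{name}: passed")
        else:
            log.append(f"{name}: FAILED, pipeline halted")
            return False, log  # later stages never run
    return True, log


# Hypothetical stages: the lambdas stand in for real build and test commands.
stages: list[Stage] = [
    ("build", lambda: True),
    ("unit tests", lambda: False),  # simulate a failing test run
    ("deploy", lambda: True),       # never reached
]

passed, log = run_pipeline(stages)
print(passed)
print("\n".join(log))
```

Because the runner returns as soon as a check fails, the deploy stage is never attempted, which is the property that keeps broken code out of production.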

Tools

GitHub Actions: YAML pipelines stored in .github/workflows/, free for public repos.
GitLab CI: YAML pipelines in .gitlab-ci.yml.
CircleCI, Travis CI: hosted CI services.
Jenkins: open-source automation server that teams host themselves.
Argo CD, Flux: declarative continuous deployment for Kubernetes.

A worked GitHub Actions pipeline

A ci.yml for a Python project:

name: CI
on:
  push:
    branches: [main]
  pull_request:

jobs:
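  test:
    runs-on: ubuntu-latest
    # Postgres service container so the integration tests have a database on localhost:5432
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: test
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps: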
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install pytest ruff mypy

      - name: Lint
        run: ruff check .

      - name: Type check
        run: mypy src/

      - name: Unit tests
        run: pytest tests/unit -v

      - name: Integration tests
        run: pytest tests/integration -v
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/test

The same commands (ruff, mypy, pytest) run locally and in CI, so developers can reproduce a CI failure on their own machine.

A worked deployment pipeline

A deploy.yml that runs after the CI workflow succeeds on a push to main:

name: Deploy
on:
  workflow_run:
    workflows: [CI]  # needs: cannot point at a job in another workflow file
    types: [completed]
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    # Deploy only if the CI run was for a push and it passed
    if: ${{ github.event.workflow_run.event == 'push' && github.event.workflow_run.conclusion == 'success' }}
    env:
      SHA: ${{ github.event.workflow_run.head_sha }}  # the commit CI tested
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.workflow_run.head_sha }}

      - name: Build container image
        run: docker build -t app:$SHA .

      - name: Push to registry
        run: |
          echo "${{ secrets.REGISTRY_TOKEN }}" | docker login -u user --password-stdin
          docker push app:$SHA

      - name: Deploy to staging
        run: |
          kubectl set image deployment/app app=app:$SHA -n staging
          kubectl rollout status deployment/app -n staging

      - name: Smoke test staging
        run: ./scripts/smoke-test.sh https://staging.example.com

      - name: Deploy to production
        run: |
          kubectl set image deployment/app app=app:$SHA -n prod
          kubectl rollout status deployment/app -n prod

Quality gates

CI is the place to enforce standards across the team:

Tests pass.
Lint passes. Consistent style across the codebase.
Type checker passes. Catches whole classes of bugs.
Test coverage threshold. "No PR drops coverage below 80 percent".
Security scans. SAST, dependency scanning.
No secrets committed. Secret scanning.

Each rule is automated. Humans review the changes; the pipeline enforces the rules.
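As a rough sketch of how these gates combine, the Python below collects the result of each check and reports which gates block the merge. The CheckResults fields and the gate_failures helper are hypothetical names chosen for illustration; the 80 percent threshold comes from the example rule above.

```python
from dataclasses import dataclass


@dataclass
class CheckResults:
    tests_passed: bool
    lint_passed: bool
    types_passed: bool
    coverage_percent: float
    secrets_found: int


def gate_failures(results: CheckResults, min_coverage: float = 80.0) -> list[str]:
    """Return the names of the gates that block the merge (empty means mergeable)."""
    failures: list[str] = []
    if not results.tests_passed:
        failures.append("tests")
    if not results.lint_passed:
        failures.append("lint")
    if not results.types_passed:
        failures.append("type check")
    if results.coverage_percent < min_coverage:
        failures.append("coverage")
    if results.secrets_found > 0:
        failures.append("secret scan")
    return failures


pr = CheckResults(True, True, True, coverage_percent=79.5, secrets_found=0)
print(gate_failures(pr))
```

In a real pipeline each gate is usually its own step that exits non-zero on failure; for coverage, pytest-cov's --cov-fail-under option is one way to do that.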

Rolling deployments and rollback

A production deployment should be safe. Patterns:

Rolling deployment: replace instances one at a time. No downtime if the new version is healthy.
Blue-green deployment: two parallel environments. Switch traffic from blue to green when green is ready.
Canary deployment: send a small fraction of traffic to the new version. Promote if metrics look good.
Feature flags: deploy the code with the feature disabled. Enable later for selected users.

Rollback should be one click or one command. Tag every release, retain the previous artefact, monitor key metrics for 5-15 minutes after deploy.
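Canary releases and percentage-based feature flags both need a stable way to decide which users see the new behaviour. One common approach, sketched here under the assumption that users have a string ID, is to hash the ID into a bucket from 0 to 99. The function names and the salt value are illustrative.

```python
import hashlib


def bucket(user_id: str, salt: str = "rollout") -> int:
    """Map a user to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def in_rollout(user_id: str, percent: int) -> bool:
    """True if this user falls inside the first `percent` buckets."""
    return bucket(user_id) < percent


users = [f"user-{n}" for n in range(1000)]
exposed = sum(in_rollout(u, 10) for u in users)
print(f"{exposed} of {len(users)} users see the new version at 10 percent")
```

Because the bucket depends only on the ID, a user who is inside the 10 percent group stays inside it when the rollout widens to 50 percent, so nobody flips back and forth between versions.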

Monitoring

Deployments are not done when the new code is running. Watch:

Error rates.
Latency.
Business metrics (signups, transactions).
User reports.

Roll back if anything regresses.
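A post-deploy check can be sketched as a comparison between metrics from before and after the release. The metric names, the sample numbers and the 10 percent tolerance below are assumptions made for illustration, not recommended values.

```python
HIGHER_IS_WORSE = {"error_rate", "p95_latency_ms"}
HIGHER_IS_BETTER = {"signups_per_min"}


def regressions(
    baseline: dict[str, float],
    current: dict[str, float],
    tolerance: float = 0.10,
) -> list[str]:
    """List the metrics that moved the wrong way by more than `tolerance`."""
    regressed: list[str] = []
    for name, before in baseline.items():
        after = current[name]
        if name in HIGHER_IS_WORSE and after > before * (1 + tolerance):
            regressed.append(name)
        elif name in HIGHER_IS_BETTER and after < before * (1 - tolerance):
            regressed.append(name)
    return regressed


baseline = {"error_rate": 0.002, "p95_latency_ms": 300.0, "signups_per_min": 20.0}
current = {"error_rate": 0.002, "p95_latency_ms": 450.0, "signups_per_min": 19.0}

bad = regressions(baseline, current)
print("roll back" if bad else "keep", bad)
```

Here latency has risen by half, so the check flags it even though the error rate is unchanged; user reports would still need a human to read them.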

Benefits

Fast feedback: a developer learns within minutes whether their change passes.
Confidence: every change is tested before it ships.
More frequent releases: from quarterly to daily to many-per-day.
Smaller changes: each release is smaller, so failures are less catastrophic.
Consistency: the build runs in a clean environment, eliminating "works on my machine".
Documentation: the pipeline is the build documentation.

Worked example

A team is deploying a new version of an e-commerce site for a Black Friday sale. Describe a CI/CD strategy that minimises risk.

Pre-deploy: full test suite must pass in CI. Load tests against staging show the new version handles 3x normal traffic.
Canary deployment: deploy to one of ten production instances, taking 10 percent of traffic. Monitor error rate and checkout success rate for 15 minutes.
Promote if green: extend to 50 percent of instances. Monitor for 30 minutes.
Full rollout: extend to 100 percent.
Feature flag for the sale-specific page: the Black Friday landing page is behind a flag. Flip the flag at the sale start time, with the ability to flip back if metrics regress.
Rollback rehearsed: the team has practised rolling back to the previous version. The runbook is one command.
War room: dev, product and ops watch dashboards during the launch window. Pre-agreed rollback criteria (error rate > 1 percent, checkout success < 95 percent).

Marker's note: strong answers sequence the strategy (load test, then canary, then staged promotion, then full rollout), name a concrete metric and threshold for each go/no-go decision, and separate the deployment mechanism (canary) from the release mechanism (feature flag), since the two are commonly confused.
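The go/no-go logic in this strategy can be written down as a short Python sketch. The thresholds are the pre-agreed rollback criteria from the answer above; the function names and the sample observations are hypothetical.

```python
ROLLOUT_STEPS = [10, 50, 100]   # percent of production traffic on the new version
MAX_ERROR_RATE = 0.01           # roll back if error rate > 1 percent
MIN_CHECKOUT_SUCCESS = 0.95     # roll back if checkout success < 95 percent


def is_go(error_rate: float, checkout_success: float) -> bool:
    """Apply the pre-agreed rollback criteria to one monitoring window."""
    return error_rate <= MAX_ERROR_RATE and checkout_success >= MIN_CHECKOUT_SUCCESS


def staged_rollout(observations: list[tuple[float, float]]) -> tuple[int, str]:
    """observations[i] is (error_rate, checkout_success) measured at step i."""
    if len(observations) != len(ROLLOUT_STEPS):
        raise ValueError("need one observation per rollout step")
    for percent, (error_rate, checkout_success) in zip(ROLLOUT_STEPS, observations):
        if not is_go(error_rate, checkout_success):
            return 0, f"rolled back at {percent} percent"
    return ROLLOUT_STEPS[-1], "fully rolled out"


healthy = [(0.002, 0.98), (0.003, 0.97), (0.002, 0.98)]
bad_at_half = [(0.002, 0.98), (0.015, 0.97), (0.001, 0.99)]
print(staged_rollout(healthy))
print(staged_rollout(bad_at_half))
```

Writing the criteria as constants before launch is the point: during the sale nobody has to argue about whether 1.5 percent errors is acceptable.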

Exam-style practice questions

Practice questions written in the style of NESA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

2025 HSC, 5 marks. Describe what continuous integration and continuous deployment mean, and explain how a CI/CD pipeline benefits a small software team.

Worked answer:

Continuous integration (CI) is the practice of merging every developer's work into a shared main branch many times a day, with each merge triggering an automated build and test run. Every commit is verified against the test suite so problems are caught within minutes of being introduced, not days later when the next release is cut.

Continuous deployment (CD) extends CI by automatically deploying every change that passes CI into production. (Continuous delivery is the related practice where every change is releasable but humans choose when to deploy.) The pipeline removes the manual gate between code passing tests and code reaching users.

A typical pipeline:

Developer pushes a commit or opens a pull request.
CI fetches the code, installs dependencies and builds the project.
CI runs the test suite (unit, integration, sometimes end-to-end).
CI runs quality checks: linter, type checker, security scanner.
If all checks pass and the PR is approved, the change is merged.
CD builds the release artefact (Docker image, executable, deployable bundle).
CD deploys to staging automatically, runs smoke tests, then to production.

Benefits for a small team:

Every change is tested. Confidence to ship.
Bugs are caught while the change is fresh in the author's mind.
Releasing becomes routine rather than risky, so teams release more often.
The pipeline is the documentation: anyone can read it and see how the project is built.
The "works on my machine" problem disappears, because the build runs in a clean environment every time.

Markers reward both definitions, the pipeline stages, and at least two specific benefits (fast feedback, fewer release-day bugs, consistent environments, more frequent releases).

Practice questions

Original practice questions graded from foundation to exam level, each with a full worked solution. Try them before revealing the solution.

Foundation, 2 marks. State the key difference between continuous delivery and continuous deployment.

Worked solution:

Continuous delivery automatically prepares every change that passes CI for release, but a human decides when to actually deploy it. Continuous deployment removes that human decision: every change that passes CI is deployed to production automatically.

Marking criteria: 1 mark for correctly describing continuous delivery (human gate remains), 1 mark for correctly describing continuous deployment (no human gate).

Foundation, 3 marks. List, in order, three stages that occur before a change is allowed to merge into main in a typical CI pipeline, and state what happens if any one of them fails.

Worked solution:

Three stages in order (any valid subset): build/install dependencies, lint and type check, run the automated test suite (unit and/or integration).

If any stage fails, the pipeline halts at that point, the failure is reported to the author, and the change is blocked from merging until it is fixed.

Marking criteria: 1 mark for three stages in a plausible order, 1 mark for stating the pipeline halts, 1 mark for stating the change is blocked from merging (not just "an error shows").

Core, 4 marks. Explain the difference between a canary deployment and a blue-green deployment, and identify one advantage the canary approach has over blue-green for detecting a subtle bug.

Worked solution:

A canary deployment sends only a small fraction of production traffic (for example, 10 percent) to the new version while the rest of the traffic still uses the old version, and the team monitors metrics before gradually increasing the new version's traffic share.

A blue-green deployment keeps two complete environments (blue = old, green = new) and switches ALL traffic from blue to green at once, once the green environment is judged ready.

Advantage of canary for a subtle bug: because only a small slice of real traffic hits the new version at first, a subtle bug that only shows up under real user load affects a small fraction of users and is detected from live metrics before it can affect everyone, whereas blue-green exposes 100 percent of users to the new version the instant the switch happens.

Marking criteria: 1 mark for correctly describing canary, 1 mark for correctly describing blue-green, 2 marks for a valid, specific advantage of canary tied to limiting blast radius or early detection under real load.

Core, 5 marks. The table below shows deployment data for a team over three months after they adopted CI/CD.

| Month | Deployments per week | Median time to detect a bad deploy (minutes) | Rollbacks |
|---|---|---|---|
| Month 1 | 1 | 240 | 3 |
| Month 2 | 8 | 40 | 2 |
| Month 3 | 22 | 12 | 1 |

Using the data, explain the relationship between deployment frequency and the team's ability to detect and recover from a bad deploy, and suggest one CI/CD practice that would explain this trend.

Worked solution:

As deployment frequency rose sharply (1 to 8 to 22 per week), the median time to detect a bad deploy fell sharply (240 to 40 to 12 minutes) and the number of rollbacks needed fell (3 to 2 to 1). This is the opposite of what intuition might suggest (more deploys, more risk); instead, more frequent, smaller deployments make each individual change easier to monitor and attribute a regression to, so problems are caught and rolled back faster and less often.

Practice explaining the trend: adopting continuous monitoring immediately after each deploy (watching error rate and latency dashboards for a fixed window, for example 15 minutes) alongside smaller, more frequent releases means each deployment carries less new code, so when a metric regresses it is obvious which deploy caused it, cutting detection time and reducing the chance a bad change survives long enough to need a full rollback.

Marking criteria: 1 mark for correctly reading the trend in each column, 1 mark for explaining why smaller/more frequent deploys narrow down the cause faster, 1 mark for linking this to reduced detection time, 1 mark for linking it to reduced rollback count, 1 mark for a specific, valid CI/CD practice (not just "test more").
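The trend can be made more concrete with a little arithmetic on the table. The sketch below assumes four weeks per month and that the Rollbacks column is a monthly count, neither of which the question states, so treat the rates as rough.

```python
WEEKS_PER_MONTH = 4  # assumption: the question does not say how long a month is

# (month, deployments per week, median minutes to detect, rollbacks)
table = [
    ("Month 1", 1, 240, 3),
    ("Month 2", 8, 40, 2),
    ("Month 3", 22, 12, 1),
]

# Share of deployments that had to be rolled back in each month.
rollback_rates = {
    month: rollbacks / (per_week * WEEKS_PER_MONTH)
    for month, per_week, _detect, rollbacks in table
}

# Fractional drop in detection time from Month 1 to Month 3.
detect_drop = 1 - table[-1][2] / table[0][2]

for month, rate in rollback_rates.items():
    print(f"{month}: {rate:.1%} of deployments rolled back")
print(f"Median detection time fell by {detect_drop:.0%}")
```

Under those assumptions the share of deployments needing a rollback falls from three in four to roughly one in ninety, which is a stronger statement than the raw rollback count alone.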

Core, 3 marks. Explain why a CI pipeline that takes 45 minutes to run is a problem for a development team, and suggest one way to reduce this.

Worked solution:

A 45-minute pipeline breaks a developer's flow: they either wait idle for feedback, losing productive time, or context-switch to other work and lose focus when the result finally comes back, either way slowing down how many changes the team can safely ship per day.

Reduction strategy: parallelise independent stages (for example running unit tests and lint simultaneously in separate jobs rather than sequentially), or use test-impact analysis to only run the tests affected by the changed files rather than the full suite every time.

Marking criteria: 1 mark for identifying the flow/productivity cost, 1 mark for linking it to fewer safe releases per day, 1 mark for a valid, specific reduction strategy (parallelising or test-impact analysis, not just "make it faster").
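The gain from parallelising can be estimated with simple arithmetic: stages run one after another take the sum of their durations, while independent stages run side by side take as long as the slowest one. The per-stage minutes below are invented so that they add up to the 45 minutes in the question, and the sketch assumes there are enough runners for every stage to start at once.

```python
# Hypothetical stage durations in minutes (they sum to the 45 in the question).
durations = {
    "lint": 3,
    "type check": 4,
    "unit tests": 10,
    "integration tests": 28,
}

sequential = sum(durations.values())  # one stage after another
parallel = max(durations.values())    # all independent stages start together

print(f"sequential: {sequential} min, parallel: {parallel} min")
```

With these numbers the slowest stage sets the floor, so the next improvement would have to come from splitting or trimming the integration tests rather than adding more parallel jobs.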

Exam, 7 marks. Evaluate whether a small three-person startup building an early-stage product should adopt full continuous deployment (no human gate before production) rather than continuous delivery (human approves each release).

Worked solution:

This is a 7-mark EVALUATE: markers reward a judgement supported by contrasted evidence for a small, early-stage team specifically, not a generic CI/CD description.

Case for continuous deployment: A three-person team has limited time to spend on manual release ceremonies; if changes are small and the test suite is trustworthy, automatic deployment lets the team ship many times a day, get real user feedback fast, and avoid a manual step becoming a bottleneck when there is no dedicated release manager. Feature flags let risky changes be deployed dark (disabled) and enabled independently of the deploy itself, decoupling "deployed" from "live for users", which reduces the practical risk of removing the human gate.
Case against continuous deployment: An early-stage product typically has thin, evolving test coverage, because the team is still discovering the right abstractions and has not had time to build a deep regression suite; deploying automatically on a shaky test suite risks shipping broken behaviour straight to real (possibly paying) users with no human sanity check. A small team also has limited on-call capacity to respond immediately if an automatic deploy goes wrong, so a brief human check before release (continuous delivery) costs little time for a three-person team but adds a meaningful safety margin.
Judgement: For most early-stage startups, continuous delivery is the better fit initially: keep every passing change release-ready, but require one human click before it goes live, because test coverage is not yet mature enough to trust an automatic gate. As the team matures its test suite, adds feature flags and builds confidence in its monitoring and rollback speed, it can graduate to full continuous deployment for lower-risk parts of the system while keeping delivery-only gates for higher-risk areas, which is a middle path many mature companies use.

Marker's note: top-band answers (1) weigh both sides specifically for a small, early-stage team rather than giving a generic CI/CD essay, (2) identify test coverage maturity as the deciding factor, (3) mention feature flags as a risk-reducing technique that changes the calculus, and (4) end with a qualified, non-absolute judgement (a path to graduate) rather than a flat yes or no.