How are errors identified, quantified and discussed in a Unit 4 AoS 3 investigation?

Evaluate the validity, reliability, precision and accuracy of the student-designed investigation, identify sources of error, and propose improvements grounded in the data

A focused VCE Biology Unit 4 AoS 3 answer on evaluating the investigation. Defines validity, reliability, precision and accuracy in VCAA's sense; categorises sources of error (random, systematic, gross); walks through worked examples of error analysis on enzyme and ecology investigations.

Generated by Claude Opus 4.7
9 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

What this sub-topic is asking

VCAA's Key Science Skill 5 expects you to evaluate the data and method. The evaluation has to use VCAA's four terms (validity, reliability, precision, accuracy) correctly, identify the specific sources of error in your investigation, and link the limitations back to the conclusion you can defend. This page covers the four terms, the error categories, and how to write the evaluation section that markers reward.

The answer

The evaluation is the most rewarded section of the poster after the discussion. Many investigations have plausible methods and reasonable conclusions; the ones that score in the top band have evaluations that name specific limitations, quantify their impact where possible, and propose realistic improvements.

The four VCAA terms

Validity
Whether the method actually tests the hypothesis. A pH investigation with no buffering is not a valid test of enzyme activity at different pH (the pH drifts during the reaction). A measurement of "how fast plants grow" by counting leaves is not a valid measure of biomass (leaf area, dry mass, or height would be more valid). Validity is about the link between what you measured and the construct you wanted.
Reliability
The consistency of the measurement. If repeating the procedure produces similar results, the measurement is reliable. Reliability is improved by replication (multiple trials per condition), standardised method, and controlling variables. A single trial per condition cannot be assessed for reliability.
Precision
How close repeated measurements are to one another. High precision means the values cluster tightly. Precision is independent of accuracy: a balance that reads 50.000 g every time for a 25 g mass is precise (its readings agree to 0.001 g) but inaccurate.
Accuracy
How close a measurement is to the true value. Accuracy is improved by calibration (zeroing balances, calibrating thermometers, validating pH probes with reference buffers) and by minimising systematic bias.
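
The difference between precision and accuracy can be made concrete with a small calculation. The sketch below is illustrative only: the two sets of balance readings are hypothetical, and using the mean offset as an accuracy indicator and the sample standard deviation as a precision indicator is one common convention, not a VCAA requirement.

```python
from statistics import mean, stdev

def describe_readings(readings, true_value):
    """Return (bias, spread) for a set of repeated readings.

    bias   = mean reading minus the true value (an accuracy indicator)
    spread = sample standard deviation (a precision indicator)
    """
    return mean(readings) - true_value, stdev(readings)

# Hypothetical readings of a 25.000 g reference mass (illustrative numbers only).
true_mass = 25.000
balance_a = [25.412, 25.409, 25.411, 25.410]  # tight cluster, but offset
balance_b = [24.6, 25.5, 24.9, 25.1]          # centred near 25 g, but scattered

bias_a, spread_a = describe_readings(balance_a, true_mass)
bias_b, spread_b = describe_readings(balance_b, true_mass)

print(f"Balance A: bias {bias_a:+.3f} g, spread {spread_a:.3f} g")
print(f"Balance B: bias {bias_b:+.3f} g, spread {spread_b:.3f} g")
```

In this made-up data set, balance A is the more precise instrument and balance B gives the more accurate mean, which is the distinction the two definitions draw.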

The three error categories

Random error
Unpredictable fluctuations: human reaction time when starting a stopwatch, small variations in mixing, small variations in lighting or temperature. Reduced by averaging over many replicates. Random error affects precision.
Systematic error
A consistent bias in one direction: an uncalibrated balance reading 0.5 g high, a stopwatch that runs slow, a pH probe that reads 0.2 units high. Reduced by calibration and by checking instruments against a known standard. Systematic error affects accuracy.
Gross error
A one-off mistake: misreading the instrument, contaminating a sample, transposing a number when recording. Reduced by carefully recording at the time of measurement and by double-checking. Gross errors usually appear as outliers that should be investigated, not discarded silently.
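
A short simulation can show why replication reduces random error but leaves systematic error untouched. This is a sketch under stated assumptions: the water-bath temperature, the 0.5 degree offset and the Gaussian scatter are all hypothetical values chosen for illustration, and real scatter need not be Gaussian.

```python
import random
from statistics import mean

def simulate_readings(true_value, n, random_sd, offset, rng):
    """Simulate n readings: true value + constant offset + Gaussian scatter."""
    return [true_value + offset + rng.gauss(0, random_sd) for _ in range(n)]

rng = random.Random(42)  # fixed seed so the run is repeatable
true_temp = 37.0   # hypothetical water-bath temperature, degrees C
offset = 0.5       # hypothetical thermometer that reads 0.5 degrees high
random_sd = 0.3    # hypothetical reading-to-reading scatter

for n in (3, 30, 3000):
    readings = simulate_readings(true_temp, n, random_sd, offset, rng)
    error_in_mean = mean(readings) - true_temp
    print(f"n={n:4d}: mean = {mean(readings):.2f}, error in mean = {error_in_mean:+.2f}")
```

As the number of replicates grows, the mean settles near 37.5 rather than 37.0: the scatter averages out, while the calibration offset stays in the result.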

Anomalies and outliers

A data point that lies far from the rest of the data needs investigation, not silent removal. The standard approach:

  1. Check the logbook for any recorded oddity in that trial (apparatus problem, contamination, timing miss).
  2. If a cause is identified, document the cause and exclude or rerun with explanation.
  3. If no cause is identified, retain the point and acknowledge it in the discussion. Strong investigations may apply a statistical outlier rule (for example, the 1.5 x IQR rule) and report the test result.

Quietly deleting a data point because it disagrees with the hypothesis is a research-integrity issue. The logbook trail should make any exclusion defensible.
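
The 1.5 x IQR rule mentioned above can be sketched in a few lines. The bubble counts are hypothetical, and the quartile convention (Python's "inclusive" method) is an assumption: different textbooks and calculators compute quartiles slightly differently, so the fences can shift a little. The function only flags points for investigation; it does not justify removing them.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR].

    Returns (flagged_values, (lower_fence, upper_fence)).
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    flagged = [v for v in values if v < lower or v > upper]
    return flagged, (lower, upper)

# Hypothetical oxygen bubble counts per minute from seven replicate trials.
bubble_counts = [12, 14, 13, 15, 13, 14, 27]

flagged, (lower, upper) = iqr_outliers(bubble_counts)
print(f"fences: {lower} to {upper}")
print(f"flagged for investigation: {flagged}")
```

A flagged value is the starting point for the logbook check in step 1, not a reason to delete the trial.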

Quantifying error where possible

A higher-band evaluation quantifies error rather than describing it qualitatively. Examples:

  • A manual stopwatch contributes approximately +/- 0.2 seconds per timing; over a 30-second measurement that is around 0.7 percent.
  • An analytical balance reads to +/- 0.001 g; on a 10 g sample that is 0.01 percent.
  • A 10 mL graduated pipette typically has a tolerance of +/- 0.05 mL; on a 10 mL aliquot that is 0.5 percent.
  • A pH probe is typically +/- 0.05 pH units; on a 4 to 8 pH range the relative error is small but the consequence for an enzyme assay can still be substantial.

Propagating these into the final result (or at least noting the dominant error source) is what separates an evaluation from a list of caveats.
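
The percentage figures above, and the idea of finding the dominant source, can be checked with a short calculation. This is a sketch, not a prescribed method: it assumes a hypothetical investigation that uses all three instruments, and it combines the percentages in quadrature, which is only appropriate when the sources are independent and each scales the final result proportionally. Simply adding the percentages gives a more cautious upper bound.

```python
from math import sqrt

def percent_uncertainty(absolute_uncertainty, measured_value):
    """Express an absolute uncertainty as a percentage of the measured value."""
    return 100 * absolute_uncertainty / measured_value

# Instrument figures quoted in the list above.
sources = {
    "stopwatch": percent_uncertainty(0.2, 30),   # +/- 0.2 s on a 30 s timing
    "balance": percent_uncertainty(0.001, 10),   # +/- 0.001 g on a 10 g sample
    "pipette": percent_uncertainty(0.05, 10),    # +/- 0.05 mL on a 10 mL aliquot
}

# The largest percentage is the dominant source of measurement uncertainty.
dominant = max(sources, key=sources.get)

# Assumption: independent sources, each scaling the result proportionally,
# so the percentage uncertainties combine in quadrature.
combined = sqrt(sum(p ** 2 for p in sources.values()))

for name, pct in sources.items():
    print(f"{name:10s} {pct:.2f} %")
print(f"dominant source: {dominant}")
print(f"combined (quadrature): {combined:.2f} %")
```

Under these assumptions the stopwatch dominates, so an improvement aimed at timing (for example a data logger) would reduce the overall uncertainty more than a better balance would.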

Examples in context

Example 1. A bioinformatics investigation has no instrument error but still needs evaluation. A sequence-comparison investigation using UniProt has no apparatus precision issues. Its limitations are validity (does percent sequence identity actually measure relatedness, given that some proteins are highly conserved across distant species?) and sample bias (the species sampled may not represent the diversity of the protein family). The evaluation should address these specifically rather than transplant lab-style error talk.

Example 2. The catalase pH curve as a teaching exemplar. Enzyme kinetics investigations are typical Unit 4 AoS 3 candidates. The error analysis on a pH curve is a good test of the evaluation skill because the dominant errors (temperature drift, pH buffer accuracy, enzyme freshness, bubble-counting variation) are specific and quantifiable. Many examiner reports across years cite this kind of investigation as one where the evaluation can be done at top-band depth.

Try this

Q1. Distinguish between validity and reliability with a biological example of each. [4 marks]

  • Cue. Validity = method tests the hypothesis (e.g. using leaf area to measure plant growth is more valid than counting leaves). Reliability = consistency of measurement on repeats (e.g. five replicate trials at the same condition give similar results).

Q2. A student investigates the effect of light intensity on the rate of photosynthesis using oxygen bubble counting from elodea. Identify two sources of systematic error and one source of random error, and propose one improvement for each. [6 marks]

  • Cue. Systematic: light from the room adding to the measured intensity (improvement: shield the apparatus from ambient light or work in a darkened room); temperature drift during the trial day (improvement: use a water bath). Random: bubble-counting variation due to different bubble sizes (improvement: measure the volume of oxygen collected, for example with a gas syringe, instead of counting bubbles).

Q3. A student reports peak enzyme activity at pH 7 based on three trials per pH. The pH 6 mean is higher than expected and the standard deviation is large. Critique the data and recommend next steps. [4 marks]

  • Cue. Large SD at pH 6 reduces confidence in the mean. Check the logbook for any methodological oddity at pH 6 (buffer freshness, liver freshness, timing). Recommend repeating pH 6 with additional replicates and verified buffer; consider whether the pH 6 buffer was correctly mixed.

Exam-style practice questions

Practice questions written in the style of VCAA exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

2023 VCAA [1 mark]
Students completed their investigation and analysed their results. They suggested their results were affected by systematic errors. Systematic errors
A. result in a spread of readings.
B. affect the precision of a measurement.
C. are easy to identify and eliminate.
D. cause readings to differ from the true value by a consistent amount each time.

The answer is D.

A systematic error shifts every reading in the same direction by the same (or proportional) amount, for example an uncalibrated balance that always reads 0.2 g high. Because the offset is consistent, it affects accuracy (how close readings are to the true value), not precision (how close repeated readings are to each other).

Why the others are wrong: A and B describe random error, which produces scatter and reduces precision. C is false: systematic errors are often hard to detect precisely because the data still look consistent and repeatable, so you usually need calibration against a known standard to reveal them.

Exam tip: repeating the experiment does NOT remove a systematic error - it just gives you a consistently wrong mean. Only fixing the cause (recalibration, corrected technique) removes it.

2023 VCAA [1 mark]
Students designed a controlled experiment. After they had performed the experiment, another group of students gave them feedback suggesting that they should modify the experiment to improve the accuracy of their results. A change that the first group of students could make to improve the accuracy of their results could include
A. ignoring outlying results.
B. repeating the experiment many times.
C. carefully calibrating the equipment used.
D. having many people take the measurements.

The answer is C.

Accuracy is how close a measurement is to the true value. Calibrating the equipment against a known standard removes systematic offsets and brings readings closer to the true value, so it improves accuracy.

Why the others are wrong: B (repeating many times) improves reliability and the precision of the mean, but if there is a systematic error the mean is still inaccurate. A (ignoring outliers) is poor practice unless an outlier has a documented cause, and it does not address accuracy. D (many people measuring) tends to introduce more variation between observers, which can reduce precision rather than improve accuracy.

Watch the wording: VCAA separates accuracy (closeness to true value) from reliability/precision (consistency of repeats). The key term in the stem tells you which one to target.

2017 VCAA [1 mark]
During the experiment, the student measured the varying pH levels using a digital pH meter. The student calibrated the meter using a pH 7 buffer solution. The reason the student calibrated the pH meter was to
A. ensure a random error would not influence the results.
B. eliminate the effect of all uncontrolled variables.
C. enable the use of the instrument with precision.
D. allow the pH to be measured accurately.

The answer is D.

Calibrating against a known pH 7 buffer sets the instrument's reference point so its readings match true pH values. This targets a systematic error in the instrument and therefore improves the accuracy (closeness to the true value) of every subsequent measurement.

Why the others are wrong: A is incorrect because calibration corrects a consistent (systematic) offset, not the random scatter between repeats. B is too broad - calibration only addresses the instrument, not every uncontrolled variable in the experiment. C confuses precision (consistency of repeats) with accuracy; calibration does not make the readings more closely clustered, it makes them closer to the true value.
