How do visual and multimodal texts construct meaning through image, layout and design?

Analyse how visual and multimodal texts construct perspectives and position viewers through visual and design choices

A focused answer to the WACE Year 12 English Unit 4 dot point on visual and multimodal texts. It covers how framing, salience, gaze, colour and layout carry meaning, how words and images interact, and how to write visual analysis with the same rigour as language analysis.

Generated by Claude Opus 4.7

6 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed

What this dot point is asking

WACE English includes visual and multimodal texts, and the Comprehending section regularly presents images, advertisements, cartoons or texts that combine words and pictures. Students who treat the image as a backdrop to the words lose marks. This dot point asks you to read visual and design choices analytically, applying the same discipline of naming a feature and arguing its effect that you use on prose, and to understand how the different modes in a multimodal text work together.

Visual texts have their own metalanguage

Just as written texts have syntax and diction, visual texts have a vocabulary of choices you can name precisely.

  • Framing and composition: what is included, what is cropped out, and how the elements are arranged.
  • Salience: what draws the eye first, through size, contrast or placement.
  • Gaze and angle: where a depicted figure looks, and whether the viewer looks up at, down on, or level with the subject.
  • Colour: the palette and its connotations, including warmth, coldness and saturation.
  • Layout: in texts with words and images, how the two are arranged and which leads.

Using this vocabulary accurately lets you argue effect rather than describe the picture.

A camera angle is a position

Visual choices construct perspective in both the literal and the analytical sense. A low angle looking up at a figure positions the viewer to feel small before them, lending the subject power. A high angle looking down does the reverse, diminishing the subject and placing the viewer in the position of power. A direct gaze meeting the viewer demands engagement; an averted gaze invites the viewer to observe unseen. None of this is neutral. Each choice positions the viewer to feel and judge in a particular way, and naming the choice is the start of the analysis.

A strong analytical paragraph names visual features with accurate metalanguage and argues the position each constructs, treating the image exactly as it would a written text.

Multimodal texts coordinate their modes

In a text that combines words and images, the meaning lives partly in how the modes interact. Words can anchor an ambiguous image toward one meaning, an image can undercut or ironise the words that accompany it, and layout decides which mode the reader meets first. Analysing a multimodal text means reading the relationship between modes, not just each mode alone.

A reliable analytical frame

Build the point around this chain: the visual choice of [framing, salience, gaze, colour or layout] positions the viewer to [response] by [how], constructing a perspective in which [view]. The frame keeps your visual analysis as rigorous as your language analysis.

How this maps to the exam

The Comprehending section regularly includes a visual or multimodal text, often paired with a written one, and the marks reward genuine visual analysis rather than description. Reading the relationship between modes is also useful in Responding when studied texts are films or graphic works, where image and word are inseparable.