How and why is data compressed for storage and transmission between digital systems?
Explain how lossless and lossy data compression reduce the size of data for efficient storage and transmission, and the trade-offs each makes
A focused answer to the QCE Digital Solutions Unit 4 dot point on compression. It covers the difference between lossless and lossy compression, how run-length and dictionary methods work, common formats, and the trade-offs between size, quality and speed in data exchange.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this dot point is asking
Unit 4 covers how compression, encryption and hashing are used in the storage and transfer of data. This dot point is the compression part: QCAA wants you to explain how data compression reduces file size, distinguish lossless from lossy compression, and discuss the trade-offs. Smaller data transmits faster and costs less to store and send, which matters directly to the efficiency of a data exchange, so this sits alongside encryption as a core data-handling concept.
Why compress data
Compression makes data smaller so it takes less storage space and transmits faster across a network. In a data exchange, smaller payloads mean lower bandwidth use, faster responses and lower cost, which are all non-functional benefits you can justify. The trade-offs are processing time, since compressing and decompressing both use CPU, and, for lossy methods, a loss of quality.
Lossless compression
Lossless compression reduces size by removing redundancy in a way that lets the exact original be reconstructed. No information is lost, so it is essential for text, code, spreadsheets and any data where an exact copy is required. Two ideas QCAA expects you to understand:
- Run-length encoding (RLE): replaces runs of the same value with a count and the value. The sequence AAAAA becomes 5A. It works well on data with long runs and poorly on varied data, where the added counts can make the output larger than the input.
- Dictionary methods: build a table of repeated patterns and replace each occurrence with a short code (the idea behind the LZ-family algorithms used in ZIP and GIF).
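The run-length idea above can be sketched in a few lines of Python. This is a minimal illustration rather than a production encoder: the function names `rle_encode` and `rle_decode` are made up for this example, and it assumes the input text contains no digits, so that counts and values cannot be confused when decoding.

```python
from itertools import groupby


def rle_encode(text: str) -> str:
    """Replace each run of a repeated character with its count and the character."""
    return "".join(f"{len(list(group))}{char}" for char, group in groupby(text))


def rle_decode(encoded: str) -> str:
    """Rebuild the original text by repeating each character 'count' times."""
    result = []
    count = ""
    for ch in encoded:
        if ch.isdigit():
            count += ch          # counts may have more than one digit
        else:
            result.append(ch * int(count))
            count = ""
    return "".join(result)


print(rle_encode("AAAAA"))   # 5A
print(rle_encode("ABCDE"))   # 1A1B1C1D1E
print(rle_decode(rle_encode("AAAABBBCCD")) == "AAAABBBCCD")  # True
```

The second call shows the weakness on varied data: five characters become ten. The third shows that decoding gives back exactly what went in, which is what makes the method lossless.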
Common lossless formats include ZIP and GZIP for general files, PNG for images and FLAC for audio.
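To see the "exact original" property in practice, the sketch below uses Python's built-in `zlib` module, which implements the DEFLATE method also used by ZIP and GZIP. The sample CSV data is invented for illustration; the exact compressed size will depend on the data and the library version.

```python
import zlib

# Highly repetitive sample data (hypothetical student records)
original = ("student_id,name,grade\n" + "1001,Alex,A\n" * 500).encode("utf-8")

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
print(restored == original)  # True: every byte is recovered
```

Because the data is so repetitive, the compressed version is a small fraction of the original, yet decompression returns a byte-for-byte identical copy.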
Lossy compression
Lossy compression achieves much smaller sizes by permanently discarding data that the human eye or ear is unlikely to miss. The original cannot be perfectly restored. It suits photos, music and video, where a small, fast file matters more than a bit-perfect copy.
- JPEG discards fine colour and brightness detail in images.
- MP3 and AAC discard sounds masked by louder ones.
- H.264 video (commonly stored in MP4 files) discards spatial and temporal detail.
The compression level is usually adjustable, trading more size reduction for more visible or audible loss.
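A toy way to see the adjustable trade-off is quantisation: rounding values to a coarser grid so there are fewer distinct values to store. Real codecs such as JPEG and MP3 are far more sophisticated, so treat this only as a simplified sketch of the principle. The `quantise` and `max_error` functions, the sample values and the step sizes are all invented for this example.

```python
def quantise(samples, step):
    """Round each sample to the nearest multiple of step, discarding fine detail."""
    return [round(s / step) * step for s in samples]


def max_error(original, approx):
    """Largest difference between an original sample and its approximation."""
    return max(abs(a - b) for a, b in zip(original, approx))


samples = [12, 13, 14, 17, 201, 203, 198, 91]

mild = quantise(samples, 10)    # small step: less loss, more distinct values
harsh = quantise(samples, 50)   # large step: more loss, fewer distinct values

print(mild, max_error(samples, mild))     # [10, 10, 10, 20, 200, 200, 200, 90] 4
print(harsh, max_error(samples, harsh))   # [0, 0, 0, 0, 200, 200, 200, 100] 17
```

The harsher setting leaves longer runs of identical values, which compress better, but the error grows, and in neither case can the original samples be recovered from the quantised ones.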
Choosing between lossless and lossy
The decision rule is simple and examinable:
- If an exact copy is required (text, code, financial data, medical images for diagnosis), use lossless.
- If perceptual quality is enough and size or speed dominates (web photos, streamed music), use lossy.
Compression ratio expresses the saving as original size divided by compressed size. For example, a 10 MB file compressed to 2 MB has a 5 to 1 ratio, which is an 80% reduction. Higher ratios save more but, for lossy methods, lose more quality.
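The arithmetic for the 10 MB example can be checked with a short calculation. The function names here are illustrative only, and both sizes are assumed to be in the same unit.

```python
def compression_ratio(original_size, compressed_size):
    """Original size divided by compressed size (5.0 means '5 to 1')."""
    return original_size / compressed_size


def space_saving(original_size, compressed_size):
    """Fraction of the original size that was removed."""
    return 1 - compressed_size / original_size


ratio = compression_ratio(10, 2)
saving = space_saving(10, 2)

print(f"{ratio:.0f} to 1")   # 5 to 1
print(f"{saving:.0%}")       # 80%
```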
Compression in data exchange
In a data exchange, compression usually happens before transmission and decompression after receipt. HTTP can compress response bodies (for example with GZIP) transparently. When you design a prototype that moves large data, choosing an appropriate compression method and justifying it against the requirement (exact copy needed or not, bandwidth limited or not) is a strong design decision. Compression also interacts with encryption: data is normally compressed first, then encrypted, because encrypted data has little redundancy left to compress.
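The sketch below imitates the compress, send, decompress pattern using Python's built-in `gzip` module, with an invented JSON payload standing in for exchanged data. It also illustrates the ordering point about encryption: `os.urandom` is used here as a stand-in for encrypted output, on the assumption that well-encrypted data looks like random bytes. No real encryption or network transfer is performed.

```python
import gzip
import json
import os

# Sender: build a repetitive JSON payload (hypothetical records) and compress it
payload = json.dumps(
    [{"id": i, "status": "active"} for i in range(1000)]
).encode("utf-8")
sent = gzip.compress(payload)

# Receiver: decompress to recover the exact payload
received = gzip.decompress(sent)
print(len(payload), "bytes uncompressed,", len(sent), "bytes sent")
print(received == payload)  # True

# Stand-in for ciphertext: random bytes of the same length barely compress at all
random_like = os.urandom(len(payload))
recompressed = gzip.compress(random_like)
print(len(random_like), "random bytes ->", len(recompressed), "bytes after gzip")
```

The structured payload shrinks substantially, while the random-looking data does not shrink and typically grows slightly, which is why compressing after encryption gains little.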
How this appears in assessment
The external exam can ask you to compare lossless and lossy compression, choose a method for a scenario, or explain a trade-off. In IA3 you may justify compressing exchanged data for efficiency. Be ready to name a method, state whether it is lossless or lossy, and explain the size-versus-quality trade-off in context.