Explain operant conditioning, including reinforcement, punishment and schedules of reinforcement, using Thorndike and Skinner.

How operant conditioning shapes voluntary behaviour through consequences, including positive and negative reinforcement, punishment, and schedules of reinforcement, from Thorndike to Skinner.

Generated by Claude Opus 4.8 | 9 min answer | Updated 2026-06-02

Reviewed by: AI editorial process; not yet individually human-reviewed

What this dot point is asking

You need to explain how consequences shape voluntary behaviour, define reinforcement and punishment, and describe schedules of reinforcement, using the key researchers. This is a core external-exam topic.

The basic principle

Operant conditioning concerns voluntary behaviour that operates on the environment to produce consequences. Edward Thorndike's law of effect stated that behaviours followed by satisfying consequences are more likely to recur, and those followed by unpleasant consequences less likely.

B. F. Skinner built on the law of effect using the operant chamber (Skinner box), in which animals learned to press a lever for food. He distinguished reinforcement from punishment.

Reinforcement and punishment

Reinforcement increases behaviour; punishment decreases it. Each can be positive (adding something) or negative (removing something):

Positive reinforcement - adding a pleasant consequence (praise, food) to increase a behaviour.
Negative reinforcement - removing an unpleasant stimulus to increase a behaviour (pressing the off button stops a loud alarm, so pressing it becomes more likely).
Positive punishment - adding an unpleasant consequence (a fine) to decrease a behaviour.
Negative punishment - removing a pleasant stimulus (taking away phone privileges) to decrease a behaviour.
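
The four terms above come from two independent questions: was a stimulus added or removed, and does the behaviour increase or decrease? A minimal Python sketch of that two-by-two rule follows. The function name and the input labels are hypothetical, chosen only for illustration.

```python
def classify_consequence(stimulus_change: str, behaviour_change: str) -> str:
    """Name an operant consequence from two observations.

    stimulus_change: "added" or "removed" (what happened to the stimulus).
    behaviour_change: "increases" or "decreases" (the effect on the behaviour).
    These labels are illustrative assumptions, not standard terminology.
    """
    # "Positive" and "negative" describe adding or removing, not good or bad.
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    # Reinforcement strengthens a behaviour; punishment weakens it.
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behaviour_change]
    return f"{sign} {kind}"


print(classify_consequence("added", "increases"))    # praise: positive reinforcement
print(classify_consequence("removed", "increases"))  # alarm stops: negative reinforcement
print(classify_consequence("added", "decreases"))    # a fine: positive punishment
print(classify_consequence("removed", "decreases"))  # phone taken: negative punishment
```

Answering the two questions separately helps avoid the common error of treating negative reinforcement as a form of punishment.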

Shaping and primary versus secondary reinforcers

Shaping - reinforcing successive approximations to a target behaviour to build complex behaviours step by step.
Primary reinforcers satisfy biological needs (food, water); secondary (conditioned) reinforcers gain their power through association (money, tokens, grades).

Schedules of reinforcement

How often, and how predictably, reinforcement is delivered affects how quickly a behaviour is learned and how well it resists extinction.

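Continuous reinforcement - every response is reinforced. Learning is fast, but the behaviour extinguishes rapidly once reinforcement stops.
Fixed ratio - reinforcement after a set number of responses (every 5th lever press). Produces a high response rate, typically with a brief pause after each reinforcer.
Variable ratio - reinforcement after an unpredictable number of responses. Produces the highest response rate and the greatest resistance to extinction (the basis of gambling).
Fixed interval - reinforcement for the first response after a set time has elapsed. Responding tends to slow after each reinforcer and speed up as the next one approaches.
Variable interval - reinforcement for the first response after an unpredictable period of time, producing slow, steady responding.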
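
The difference between ratio schedules (based on a count of responses) and interval schedules (based on elapsed time) can be made concrete with a small simulation. The Python sketch below is illustrative only: the function names, the one-response-per-second assumption and the uniform random ranges used for the variable schedules are all assumptions, not part of any standard procedure.

```python
import random
from typing import Callable, List


def run_ratio(num_responses: int, next_requirement: Callable[[], int]) -> List[int]:
    """Reinforce after a required NUMBER of responses.

    Returns the response numbers (1-based) that earned a reinforcer.
    """
    reinforced = []
    needed = next_requirement()
    count = 0
    for response in range(1, num_responses + 1):
        count += 1
        if count >= needed:
            reinforced.append(response)
            count = 0
            needed = next_requirement()
    return reinforced


def run_interval(response_times: List[float], next_wait: Callable[[], float]) -> List[float]:
    """Reinforce the first response made after a required TIME has elapsed.

    Returns the times (in seconds) of the responses that earned a reinforcer.
    """
    reinforced = []
    available_at = next_wait()
    for t in response_times:
        if t >= available_at:
            reinforced.append(t)
            available_at = t + next_wait()
    return reinforced


rng = random.Random(0)  # seeded so repeated runs give the same variable schedules
one_per_second = list(range(1, 31))  # assumed: one response each second for 30 s

continuous = run_ratio(5, lambda: 1)       # every response: [1, 2, 3, 4, 5]
fixed_ratio = run_ratio(20, lambda: 5)     # every 5th response: [5, 10, 15, 20]
variable_ratio = run_ratio(20, lambda: rng.randint(1, 9))   # averages about 5
fixed_interval = run_interval(one_per_second, lambda: 10)   # [10, 20, 30]
variable_interval = run_interval(one_per_second, lambda: rng.uniform(5, 15))

for name, result in [
    ("continuous", continuous),
    ("fixed ratio", fixed_ratio),
    ("variable ratio", variable_ratio),
    ("fixed interval", fixed_interval),
    ("variable interval", variable_interval),
]:
    print(f"{name}: {result}")
```

In the fixed schedules the reinforced responses fall at regular, predictable points, while in the variable schedules they do not. Note also that under an interval schedule, responding faster does not earn reinforcers any sooner, whereas under a ratio schedule it does.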

Why it matters

Operant conditioning underpins behaviour management, token economies, education, animal training and habit formation. Variable-ratio schedules explain why gambling and some app designs are so persistent.

Psychologists generally regard reinforcement as more effective than punishment for shaping behaviour, and evaluating why is worth marks. Punishment only suppresses a behaviour without teaching what to do instead, its effect often fades once the punisher is absent, and it can produce fear, aggression or avoidance of the punisher rather than the behaviour. Reinforcing a desirable alternative behaviour gives clearer, more durable learning. Timing also matters: consequences that follow immediately and consistently shape behaviour far more effectively than delayed or inconsistent ones, which is why real-world attempts to change behaviour often fail when the consequence is distant from the action.

Exam-style practice questions

Practice questions written in the style of SACE Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

SACE 2018 (4 marks). Jack spends a lot of time on his phone playing games. Using psychological terminology, explain how operant conditioning could lead Jack to spend more time playing games on his phone.

Worked answer

Four marks: identify the behaviour, the consequence, the type of reinforcement and the resulting change in behaviour, using correct terms.

Behaviour and consequence: Jack plays a game (the operant behaviour) and the game delivers points, rewards, level-ups or a sense of achievement (the consequence) shortly after he plays.
Positive reinforcement: because a pleasant stimulus (the in-game reward) is added following the behaviour, the consequence acts as a positive reinforcer.
Effect on behaviour: reinforcement increases the probability that Jack will repeat the behaviour, so he plays more often and for longer.
Schedule: games typically reward players on a variable-ratio schedule (rewards after an unpredictable number of plays), which produces high, persistent rates of responding and makes the gaming behaviour resistant to extinction. Naming the schedule lifts the response into the top band.

SACE 2018 (4 marks). Identify the schedule of reinforcement in each case. (i) Misha the cat sits at the fridge every time she is hungry, but is only fed at 6 pm. (ii) Shaun occasionally buys a lottery ticket. (iii) Maria gets pocket money after washing all three family cars. (iv) A teacher randomly calls on every student to answer one question during each lesson.

Worked answer

Four marks: one mark for each correctly identified schedule.

(i) Fixed interval: reinforcement (food) is delivered for the first response after a set, predictable time period has elapsed (6 pm each day).

(ii) Variable ratio: reinforcement (a win) follows an unpredictable, varying number of responses (lottery purchases), which sustains gambling behaviour.

(iii) Fixed ratio: reinforcement (pocket money) is delivered after a set number of responses are completed (washing all three cars).

(iv) Variable interval: reinforcement (being called on) occurs for a response after varying, unpredictable amounts of time during the lesson.

Distinguishing ratio (based on number of responses) from interval (based on time) is the key marking discriminator.

SACE 2019 (2 marks). State two similarities between classical conditioning and operant conditioning.

Worked answer

Two marks: one mark for each genuine similarity.

Both are forms of associative learning in which a relatively permanent change in behaviour results from experience, rather than from maturation or fatigue.
Both show the same secondary processes, including acquisition, extinction, spontaneous recovery, stimulus generalisation and discrimination.

Other acceptable points: both rely on environmental stimuli or consequences, and both were established through controlled animal research (Pavlov for classical, Thorndike and Skinner for operant). Avoid listing differences here, as the question asks specifically for similarities.