Contingency and Contiguity

Operant Conditioning:

Causal Factors and Explanations


Conditioning and Learning

Conditioning is the process by which an activity originates or is changed through reacting to an encountered situation, provided that the change in activity cannot be explained on the basis of native tendencies, maturation, or temporary states (Hilgard, 1956).

Conditioning is the learning of relations among events so as to allow the organism to represent its environment (Rescorla, 1988).

S-R

S-S

WHAT PRODUCES CONDITIONING: CONTIGUITY OR CONTINGENCY?

The power of reinforcement to shape and sustain operant behavior is pervasive. So, too, is its potential as a tool in a wide range of practical applications. But what is it about the response-reinforcer relation that produces conditioning?

Operant Conditioning

S+ R O*

WHAT PRODUCES CONDITIONING: CONTIGUITY OR CONTINGENCY?

Two answers have been offered:

– Contiguity: temporal proximity between response and reinforcer. This is a temporal relationship; temporal contiguity refers to the delivery of the reinforcer immediately after the response.

– Contingency: probabilistic relation of reinforcement with responding and with not responding. This is a causal relationship; response-reinforcer contingency refers to the extent to which the response is necessary and sufficient for the occurrence of the reinforcer.

Evidence for Contiguity

Generally, as delay between response and reinforcer increases, rate of operant responding decreases.





Temporal contiguity is thus necessary for conditioning to occur.

But, is temporal contiguity also sufficient?

Can’t tell because most studies require responses to produce reinforcers.

Evidence for Contiguity

Skinner’s (1948) superstition project: Studied 8 hungry pigeons. Food was given every 15 sec regardless of the pigeon’s behavior.

6/8 pigeons performed idiosyncratic patterns of unnecessary behavior.

Responding rose as time to food approached.

Why did all of this happen?


Food happened to follow something each pigeon was doing.

Different behaviors were strengthened for different pigeons.

The higher the rate of response, the more likely food would again follow the response.

Responding rose as time to food neared because the probability of a response-food pairing rose the longer the time since the last food.
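
The claim that a higher response rate makes an accidental response-food pairing more likely can be sketched numerically. The following is only an illustration under stated assumptions, not Skinner's analysis: it assumes responses occur at random times at a constant average rate (a Poisson process), and both the example rates and the 2 s "contiguity window" before food are hypothetical values chosen for the sketch.

```python
import math

def p_accidental_pairing(responses_per_sec, window_sec):
    """Probability that at least one response falls in the window just before food.

    Assumption (not from the slides): responses occur at random times at a
    constant average rate (a Poisson process), independent of food delivery.
    """
    return 1.0 - math.exp(-responses_per_sec * window_sec)

# Hypothetical response rates; the 2 s window is also illustrative.
for rate in (0.05, 0.2, 1.0):
    p = p_accidental_pairing(rate, 2.0)
    print(f"rate={rate}/s  P(response just before food)={p:.3f}")
```

Under these assumptions the probability grows with response rate, which is the positive feedback loop described above: a behavior that happens to be strengthened occurs more often, and so is more likely to be followed by food again.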


Skinner thus concluded that the necessary and sufficient condition for operant conditioning was that a reinforcer closely follow a response.

Why is response-reinforcer contingency so effective?

It guarantees response-reinforcer contiguity.


Skinner’s results and conclusions have been questioned.

His results may be difficult to replicate. His conclusions may not be general.

But, beyond the superstition experiment, there may be good evidence to support the importance of response-reinforcer contiguity.

What leads to conditioning?

– Contiguity: stimuli that are close to one another in time and in space become associated

– Co-occurrence: proximity critical

– Contingency: when one stimulus depends on the other, they will become associated

– Information: predictive value critical


Thomas (1981) study on contiguity-promoting schedules.

P(Food|Press) = P(Food|No Press)

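[Figure: Thomas schedule. Two 20 s trials (Trial 1, Trial 2) are shown for two cases, "No Response" and "Subject Responds". A reward (S*) occurs in every trial of both cases; in the "Subject Responds" case each trial also contains a bar press (R). Timeline: presses (p) across four trials, with food (f) delivered in all four.]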


Thomas (1981) study on contiguity-promoting schedules.

P(Food|Press) = P(Food|No Press): no press-food contingency.

But response-food contiguity was promoted by the novel schedule. So too was lever pressing: rats increased lever pressing.


Extra wrinkle of Thomas (1981) study: P(Food|Press) < P(Food|No Press) Thus, negative press-food contingency.

[Figure: Second Thomas schedule. 20 s trials (Trials 1 to 4) are shown for two cases, "No Response" and "Subject Responds". With no response, a reward (S*) occurs in each trial. When the subject responds, a rewarded bar press (R) causes the next 20 s trial to be unrewarded (no S*). Timeline: presses (p) across four trials, with food (f) delivered three times.]


Extra wrinkle of Thomas (1981) study: P(Food|Press) < P(Food|No Press). Thus, negative press-food contingency.

Response-food contiguity was still promoted by the second schedule. So too was lever pressing: rats increased lever pressing.

The power of contiguity is very strong; it can even override the effects of contingency.
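
The logic of the two Thomas (1981) schedules can be sketched as a small bookkeeping exercise. This is a simplified sketch under assumptions that go beyond the slides: it assumes food arrives at the end of a 20 s trial when there is no press, arrives immediately after a press when there is one, and (on the second schedule) that a rewarded press cancels all food in the next trial. The function names `run_session` and `food_count` are hypothetical.

```python
TRIAL_SECONDS = 20  # trial length shown on the slides

def run_session(press_times, cancel_next_after_press=False):
    """Return one (food_delivered, press_to_food_delay) pair per trial.

    press_times: for each trial, seconds into the trial of the first press,
    or None if the animal does not press.
    Assumed rules (not spelled out on the slides): no press -> food at the
    end of the trial; press -> food immediately (delay 0.0).
    """
    results = []
    cancelled = False
    for press in press_times:
        if cancelled:
            results.append((False, None))   # this trial is unrewarded
            cancelled = False
        elif press is None:
            results.append((True, None))    # response-independent food
        else:
            results.append((True, 0.0))     # food contiguous with the press
            cancelled = cancel_next_after_press
    return results

def food_count(results):
    return sum(1 for delivered, _ in results if delivered)

never_press = [None, None, None, None]
always_press = [5, 5, 5, 5]

print(food_count(run_session(never_press)))                                 # first schedule
print(food_count(run_session(always_press)))                                # first schedule
print(food_count(run_session(always_press, cancel_next_after_press=True)))  # second schedule
```

In this sketch, pressing never increases the amount of food (and on the second schedule it reduces it), yet every press-earned food follows the press with zero delay. That is the sense in which the schedules promote contiguity without a positive contingency.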









Operant Conditioning: Contiguity

S+ R O*

O* stamps in S+-R relationship

Pavlov

Hull

Skinner

CONTINGENCY LEARNING

Attempts to assess contingency learning in operant conditioning parallel studies in Pavlovian conditioning.

Operant studies suggest that organisms can distinguish dependence from independence between response and reinforcer.

Cause and effect









CONTINGENCY LEARNING (Figure 8.4)

Contiguity without Contingency

Bird and plane are paired:

              bird    no bird
airplane       10       20
no plane       20       40

prob.(bird | plane) = 10/30 = .33
prob.(bird | no plane) = 20/60 = .33

A quick test for contingency

              S*2     no S*2
S+1            a        b
No S+1         c        d

a·d > c·b: then positive contingency

a·d = c·b: zero contingency

a·d < c·b: then negative contingency

You can have a positive contingency even when pairing is the least frequent possibility.

Example: can you learn that [image: cat] and “cat” are associated?

              hear “cat”   no “cat”    total
see              100          900       1,000
no see           200        9,800      10,000

p(“cat” | see) = 100/1,000 = .10
p(“cat” | no see) = 200/10,000 = .02

Positive contingency
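
The quick test (a·d versus c·b) and the two worked examples above can be checked with a few lines of code. This is a minimal sketch; the function names are hypothetical, and `delta_p` (the difference between the two conditional probabilities) is a common way to express contingency that is added here for illustration and does not appear on the slides.

```python
def quick_test(a, b, c, d):
    """Classify a 2x2 table with the a*d versus c*b cross-product rule.

    a: cue and outcome      b: cue, no outcome
    c: no cue, outcome      d: no cue, no outcome
    """
    if a * d > c * b:
        return "positive"
    if a * d < c * b:
        return "negative"
    return "zero"

def delta_p(a, b, c, d):
    """P(outcome | cue) - P(outcome | no cue); same sign as a*d - c*b."""
    return a / (a + b) - c / (c + d)

# Bird/plane counts from the slide: 10, 20, 20, 40
print(quick_test(10, 20, 20, 40))

# "cat" counts from the slide: 100, 900, 200, 9,800
print(quick_test(100, 900, 200, 9800))
print(round(delta_p(100, 900, 200, 9800), 2))
```

The cross-product rule and the difference in conditional probabilities always agree in sign, because a/(a+b) − c/(c+d) = (a·d − c·b) / ((a+b)(c+d)).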

Learning: Seeking cause and effect relationships


Head turn, mobile:

Infants given positive contingency procedure (Watson, 1967):
– Infants’ head turning increased, plus they smiled when the mobile moved

Infants put on zero contingency procedure:
– Infants’ head turning did not increase, plus they stopped smiling when the mobile moved


Carolyn Rovee-Collier



Infants discriminate response-dependent from response-independent reinforcement: shown by head turning.

Infants differentially enjoy response-dependent and response-independent reinforcers: shown by smiling and cooing.

Both cognition and affect may be changed by control by consequence.

Cause and effect

Learned Helplessness (Seligman)


Phase I - Learning to Escape

Two groups: control dogs and yoked dogs.

• A long-lasting shock is given to both groups every once in a while

• Control dogs can turn the shock off by pushing a panel

• Yoked dogs’ shock turns off too, when the control dog pushes the panel

• Yoked dogs can do nothing themselves to escape the shock

Contiguity or Contingency?

Spot
– Periodically shocked
– Can terminate shock by pressing lever with his nose

Lassie
– Periodically shocked
– Has no control over shocks, but when Spot’s shock is terminated, so is Lassie’s

Phase 2 - Avoidance Learning

• Shock is delivered to one side of the box

• If the dog jumps the hurdle to the other side, there is no shock

Control dogs learn to avoid shock. Yoked dogs don’t.

Yoked dogs have learned that they can’t stop shock. They have learned to be helpless.

Learned Helplessness

Yoked dog seems to have learned that its behavior does not matter:
– It not only fails to learn
– It stops reacting to shock

Phenomenon of learned helplessness strongly suggests that organisms can discriminate response-dependent from response-independent events.

Learned Helplessness

Animals must learn to jump barrier to avoid shock

Results
– Spot learns
– Lassie yelps but eventually becomes passive and accepts shocks

Contingency
– Spot learned his actions matter
– Lassie learned that it was helpless

Contiguity
– Spot learned to press lever
– Lassie learned to act passively

Seligman’s Learned Helplessness Study

Two groups of dogs are exposed to shock
– control group could escape shock
– “no escape” group could NOT escape shock

Later, when escape was possible, “no escape” dogs didn’t even try

Learned that they had NO CONTROL

OPERANT CONDITIONING: WHAT IS LEARNED?

In any operant conditioning study, three events need to be considered:
– Response (R)
– Reinforcer or punisher it produces (O*)
– Stimulus situation in which response occurs (S)
– The three occur in an S-R-O* sequence

What associations among the three elements are formed when an animal learns to make an operant response?

OPERANT CONDITIONING: WHAT IS LEARNED?

R-O* association

Seems to require foresight: acting in accord with future consequences.

Thorndike famously denied that animals know what the consequence of their behavior will be.

Law of effect thus emphasized past consequences.


S+ (R O*)

OPERANT CONDITIONING: WHAT IS LEARNED?

S+-R association

Thorndike’s idea: Situation evokes behavior (S-R). Reinforcers strengthen S-R bond. S+ becomes more likely to evoke R.


OPERANT CONDITIONING: WHAT IS LEARNED?

Two-process theory:
– S-R association (operant)
– S-O* association (Pavlovian)

Sight of the lever not only triggers lever pressing, but it also makes the animal “think” about upcoming food.

Anticipation of reinforcer motivates operant response.

OPERANT CONDITIONING: R-O* Learning

Strongest evidence comes from studies using devaluation procedure (Colwill & Rescorla, 1985).

– Chain pull → sugar water
– Lever press → food pellet
– Food pellet → illness (devaluation)
– Choice: chain pull versus lever press

Rats pull chain much more than press lever.

R-O* association: Colwill & Rescorla (1985)

Training        Devaluation       Test
R1 → O1         O1 → LiCl         R1 and R2
R2 → O2         O2 → nothing

[Figure: mean responses per minute over time for R1 (outcome was devalued) and R2 (outcome not devalued).]

OPERANT CONDITIONING: R-O* Learning

Association of food with illness does not change stimulus aspects of the situation that might generate responses.

Lever press does not occur because it is associated with the chamber (S-R), but because it is associated with the reinforcer (R-O*).

When value of reinforcer is eliminated, so too is impetus for response.

OPERANT CONDITIONING: R-O* Learning

Operant conditioning involves learning to expect responses to produce reward.

Rats not only expect reward, but a particular kind of reward.

Devaluation procedure could not work unless rats had specifically remembered that one response produced food pellets and the other produced sugar water.
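
The inference drawn from the devaluation procedure can be made explicit by contrasting what two accounts would predict. The following is a toy sketch, not the authors' analysis: the function `predicted_rates`, the `habit_strength` parameter, and the 0-to-1 outcome values are all hypothetical, chosen only to show why a selective drop in the devalued response points to R-O* learning.

```python
def predicted_rates(account, outcome_value, habit_strength=1.0):
    """Toy predictions for response rates after devaluation.

    account: "S-R" (responding driven only by the stimulus situation) or
             "R-O" (responding driven by the current value of its outcome).
    outcome_value: current value of each outcome, 0.0 (devalued) to 1.0.
    All numbers are illustrative, not data from the study.
    """
    produces = {"R1": "O1", "R2": "O2"}  # training: R1 -> O1, R2 -> O2
    rates = {}
    for response, outcome in produces.items():
        if account == "S-R":
            rates[response] = habit_strength
        elif account == "R-O":
            rates[response] = habit_strength * outcome_value[outcome]
        else:
            raise ValueError(f"unknown account: {account}")
    return rates

after_devaluation = {"O1": 0.0, "O2": 1.0}  # O1 paired with illness, O2 untouched

print(predicted_rates("S-R", after_devaluation))
print(predicted_rates("R-O", after_devaluation))
```

In this sketch a pure S-R account predicts no difference between R1 and R2, while the R-O account predicts that only the response tied to the devalued outcome drops, which is the pattern described above.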


(S+ ) R (O*)

OPERANT CONDITIONING: S-O* Learning

– Rats trained to panel press.
– A light or noise was always present.
– S1 = sugar water and S2 = food pellets.
– Lever press = sugar water and chain pull = food pellets.
– S1 increased lever pressing, but not chain pulling.
– S2 increased chain pulling, but not lever pressing.

S-O* Learning

OPERANT CONDITIONING: S-O* Learning

For the rat to show these selective increases in responding, it must have learned which stimulus was associated with which reward.

Therefore, this study provides evidence of S-O* associations in operant conditioning (Colwill & Rescorla, 1988).

S-O* association: Colwill & Rescorla (1988)

Sd training       Response training   Test
S1: R1 → O1       R3 → O1             S1: R3 vs R4
S2: R2 → O2       R4 → O2             S2: R3 vs R4

[Figure: mean responses per minute across trials for the "Same outcome" and "Different outcome" responses.]

OPERANT CONDITIONING: S-R Learning

Devaluation studies find a reduction in the response that leads to the devalued reward. But the response is rarely eliminated.

Residual responding may represent behavior triggered by the stimulus situation in which responding was rewarded.

OPERANT CONDITIONING: WHAT IS LEARNED?

Summary statement: Research suggests organisms learn associations between response and reinforcer (R-O*), environmental stimuli and reinforcer (S-O*), and stimuli and response (S-R).

The “simple” process of operant conditioning is not so simple after all.
