
Operant Conditioning I

Volunteer?

Priscilla the Fastidious Pig

• http://www.youtube.com/watch?v=y6tpXC2mwlQ

Thorndike and the Law of Effect

• Rewarded behavior is likely to recur

B. F. Skinner

“Operant conditioning shapes behavior as a sculptor shapes a lump of clay”

It’s all a matter of consequences

Operant Conditioning

• Learning where responses come to be controlled by their consequences

– Classical conditioning = regulating reflexive, involuntary responses

– Operant conditioning = voluntary responses

Skinner = Pigeons

• http://www.youtube.com/watch?v=vGazyH6fQQ4

What the what!? How did he do that?

• http://www.youtube.com/watch?v=TtfQlkGwE2U

• Shaping and reinforcement

– Shaping – an operant technique in which the organism is rewarded for closer and closer approximations of the desired response

Skinner says…

• Organisms tend to repeat responses that are followed by favorable consequences

– Best understood through the idea of reinforcement – a response is strengthened because it leads to rewarding consequences

– Defined AFTER THE FACT

Response: Go to Chipotle for a meal → Rewarding stimulus presented: most delicious meal you will ever have… ever → Tendency to patronize Chipotle increases

Response: Tell jokes → Rewarding stimulus presented: friends laugh → Tendency to tell jokes increases

REINFORCEMENT IN OPERANT CONDITIONING

Primary vs. Conditioned (secondary) Reinforcers

Generalization vs. Discrimination

• Which is which?

1) Kids ask their parents for sweets only when they know the parents are in a good mood.

2) A cat runs into the kitchen whenever a can opener is being used.

Basic Processes in Classical and Operant Conditioning


Overjustification Effect

• Preview of the motivation chapter – What’s your reward for coming to school?

Operant Conditioning II

Reinforcement Schedules

[Figure: 2 × 2 grid of cumulative-response curves – Ratio vs. Interval × Fixed vs. Variable – plotting Cumulative Responses against Time]

Fixed-ratio (FR): Lower resistance to extinction
– Rapid responding; short pause after reinforcement
– Note: higher ratios generate higher response rates

Fixed-interval (FI): Lower resistance to extinction
– Long pause after reinforcement yields a “scalloping effect”
– Note: shorter intervals generate higher rates overall

Variable-ratio (VR): Higher resistance to extinction
– High, steady rate without pauses
– Note: higher ratios generate higher response rates

Variable-interval (VI): Higher resistance to extinction
– Low, steady rate without pauses
– Note: shorter intervals generate higher rates overall

Fixed-ratio schedule

• A rat is reinforced for every tenth lever press

• A salesperson receives a bonus for every fourth gym membership sold


Variable-ratio schedule

• A slot machine in a casino pays off once every six tries on average.

– The number of non-winning responses between payoffs varies greatly from one time to the next.


Fixed-interval schedule

• A man washing his clothes periodically checks to see whether each load is finished.

– The reward (clean clothes) is available only after a fixed time interval.

– The man who checks his laundry before the cycle is complete does not receive the reinforcement of clean clothes… because they’re not done yet.


Variable-interval schedule

• A person wants to win a radio contest, so they call the station and get a busy signal.

– Getting through to the DJ is the reinforcer.

• A rat is reinforced for the first lever press after a 1-minute interval, but the following intervals are 3 min, 2 min, and 4 min (an average of 2 min).


Conclusion:

• Faster responding leads to reinforcement sooner when a ratio schedule is in effect

• Variable schedules tend to generate steadier response rates

– Greater resistance to extinction
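The ratio schedules above can be sketched as tiny response counters: FR delivers a reinforcer after a fixed number of responses, VR after a number that varies around a mean. A minimal illustration in Python (function names are hypothetical; interval schedules would instead gate on elapsed time since the last reinforcer):

```python
import random

def make_fixed_ratio(n):
    # Fixed-ratio (FR-n): deliver a reinforcer on every n-th response.
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True  # reinforcer delivered
        return False
    return respond

def make_variable_ratio(mean, seed=0):
    # Variable-ratio (VR-mean): the number of responses required varies
    # unpredictably around the mean, which is why VR schedules produce a
    # high, steady rate and strong resistance to extinction.
    rng = random.Random(seed)
    state = {"count": 0, "target": rng.randint(1, 2 * mean - 1)}
    def respond():
        state["count"] += 1
        if state["count"] >= state["target"]:
            state["count"] = 0
            state["target"] = rng.randint(1, 2 * mean - 1)
            return True
        return False
    return respond

# FR-10: the rat is reinforced on exactly every tenth lever press.
fr10 = make_fixed_ratio(10)
outcomes = [fr10() for _ in range(30)]
print(outcomes.count(True))  # 3 reinforcers: presses 10, 20, and 30
```

On the VR side, the organism cannot predict which response pays off, which is the slot-machine logic from the earlier example.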

Positive vs. Negative Reinforcement

• Study hint – THEY BOTH HAVE THE WORD REINFORCEMENT IN THEM – IT’S ABOUT REINFORCEMENT

• Positive reinforcement: response is strengthened because it is followed by the presentation of a rewarding stimulus

– Ex: good grades, tasty meals, paychecks, scholarship, promotions, nice clothes, attention, flattery

• Negative reinforcement: occurs when a response is strengthened because it is followed by the removal of an aversive stimulus

– Ex: you rush home in winter to get out of the cold, you clean a house to get rid of a mess, you give in to an argument to avoid an unpleasant situation

REINFORCEMENT IS REINFORCEMENT

• Both positive and negative reinforcement involve a favorable outcome that

STRENGTHENS a response tendency

Positive vs. Negative Reinforcement in a Skinner Box

Behavior → Consequence → Effect

(+) Response: Press lever → Rewarding stimulus presented: food delivered → Tendency to press lever increases

(–) Response: Press lever → Aversive stimulus removed: shock turned off → Tendency to press lever increases

Negative reinforcement applications

1. Escape learning: the organism acquires a response that decreases or ends some aversive stimulation

– Ex: you leave a party where you were getting picked on by peers

2. Avoidance learning: the organism acquires a response that prevents some aversive stimulation from occurring

– Ex: you quit going to parties because of your concern about being picked on

How does avoidance learning present an example of how classical conditioning and operant conditioning work together to regulate behavior?

Ex: Rat, shuttle box, shock

Punishment: Consequences that weaken responses

• Punishment: occurs when an event following a response weakens the tendency to make that response

– Super easy to mix up

– How is this different from negative reinforcement?

Positive Vs. Negative Punishment

• POSITIVE punishment – adding an aversive stimulus

– Spanking, parking tickets

• NEGATIVE punishment – taking away a rewarding stimulus

– Time out, revoking a driver’s license
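The reinforcement and punishment types form a 2 × 2 grid: whether a stimulus is presented or removed, crossed with whether it is rewarding or aversive. A minimal sketch that encodes the grid (the function name and argument strings are hypothetical, chosen just for this illustration):

```python
def classify_consequence(stimulus_change, stimulus_type):
    """Classify an operant consequence.

    stimulus_change: "presented" or "removed"
    stimulus_type:   "rewarding" or "aversive"
    """
    if stimulus_change == "presented":
        # Presenting a reward strengthens the response;
        # presenting an aversive stimulus weakens it.
        return ("positive reinforcement" if stimulus_type == "rewarding"
                else "positive punishment")
    if stimulus_change == "removed":
        # Removing an aversive stimulus strengthens the response;
        # removing a reward weakens it.
        return ("negative reinforcement" if stimulus_type == "aversive"
                else "negative punishment")
    raise ValueError("stimulus_change must be 'presented' or 'removed'")

# Spanking: an aversive stimulus is presented.
print(classify_consequence("presented", "aversive"))  # positive punishment
```

Note that "positive"/"negative" track only whether a stimulus is added or taken away, never whether the outcome is pleasant.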

Punishment examples:

1) If you wear a new outfit and your classmates make fun of it, your behavior will have been punished and your tendency to wear the same clothing will probably decline.

2) If you have a bad meal at a restaurant, your response will have been punished, and you will be less likely to go to that restaurant again.