Pragmatically-guided perceptual learning

Tanya Kraljic, Arty Samuel, Susan Brennan

Adaptation Project mini-Conference, May 7, 2007

1-Minute Background on Speech Perception

Part 1: Perceptual constancy

[Diagram: Speaker, Listener]

Speech sounds (phonemes) differ depending on:

• who is speaking
• what the immediate phonetic context is

And yet… perceptual constancy

1-Minute Background on Speech Perception

Part 2: Solutions?

1. Learn the acoustic invariants as children, then extract those and discard everything else as we’re listening.
   Problem: What acoustic invariants?

2. Represent (learn) every variation that is encountered.
   Problem: memory (if every variant is stored separately), ‘catastrophic interference’ (if you keep changing the same representation)

Getting at the Question: How does the perceptual system decide what to learn?

General idea in perception: Maybe the system tries to learn invariants of the distal objects that produce the stimuli (in this case, that would mean the speaker) and not of the stimuli themselves (in this case, the acoustic signal)

Our hypothesis: Maybe the system tries to learn those aspects of the signal that reflect characteristic properties of the speaker (and therefore are likely to remain stable across contexts and situations)

Specifically: How might it determine which variations are characteristic?

Our test: two kinds of information the system might use:

1. A ‘first impressions’ heuristic: In the absence of any other information, the properties that are present during first encounter are assumed to be representative and stable

2. Pragmatic cues: Information indicating that the variation is incidental (e.g., seeing that the speaker is talking with a pen in her mouth) can override the influence of primacy

What does perceptual learning look like? 2-phase Method

1. Exposure Phase (Lexical Decision Task)

Purpose: To expose participants to a speaker who pronounces a particular sound in an ambiguous way (e.g., /?sʃ/, ambiguous between /s/ and /ʃ/).
Method: The /?sʃ/ occurs in the context of words that cause the sound to be perceived as one or the other phoneme (e.g., dino?aur OR impa?ent).

* Listeners hear both ‘odd’ (dino?aur) and good (legacy) versions of the phonemes from the same speaker *

2. Test Phase (Category Identification)

Purpose: Tests whether perceptual learning has occurred.
Method: Participants hear items from a continuum that ranges from /s/ to /ʃ/, with several ambiguous points in between. They have to label each sound as S or SH.
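The test-phase scoring can be sketched in a few lines of Python. This is only an illustration of the calculation, not the authors' analysis code: the function name `percent_sh_by_step`, the step numbering, and the responses are all hypothetical and were invented for this example.

```python
from collections import defaultdict

def percent_sh_by_step(trials):
    """Return {continuum_step: percent of 'SH' labels} for one listener.

    `trials` is a list of (step, label) pairs, where `step` indexes a
    point on the /s/-to-/ʃ/ continuum and `label` is "S" or "SH".
    """
    counts = defaultdict(lambda: [0, 0])  # step -> [n_SH, n_total]
    for step, label in trials:
        if label not in ("S", "SH"):
            raise ValueError(f"unexpected label: {label!r}")
        counts[step][0] += label == "SH"  # True counts as 1
        counts[step][1] += 1
    return {step: 100.0 * n_sh / n for step, (n_sh, n) in sorted(counts.items())}

# Hypothetical responses (invented for illustration, not data from the study).
example_trials = [
    (1, "S"), (1, "S"),
    (2, "S"), (2, "SH"),
    (3, "SH"), (3, "SH"),
]
result = percent_sh_by_step(example_trials)
```

Averaging such per-step percentages across listeners in each exposure group would give the kind of % SH curves plotted in the results slides.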

Manipulation: 2X2

*All manipulations are during the Exposure phase*

Modality (Audio Only, AudioVisual) X Pronunciation attribute (Characteristic, Incidental)

(really X another 2 - Phoneme: ?S or ?SH)

Pronunciation attribute varies by modality:

AudioOnly modality = Order manipulation (to test the ‘first impressions’ heuristic)

Order     1st half   2nd half   Attribution      Prediction
Odd 1st   dino?aur   legacy     Characteristic   learning
Odd 2nd   legacy     dino?aur   Incidental       no learning
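The order manipulation in the table above can be sketched as a small helper. This is a minimal sketch under stated assumptions: `build_exposure_order` and `PREDICTIONS` are hypothetical names, only the two example words from the slides are used, and fillers, nonwords, and any randomization within each half are left out.

```python
# Predictions copied from the table above: order -> (attribution, prediction).
PREDICTIONS = {
    "odd_first": ("Characteristic", "learning"),
    "odd_second": ("Incidental", "no learning"),
}

def build_exposure_order(odd_items, good_items, order):
    """Place the critical items in the 1st and 2nd half of exposure.

    Hypothetical helper: fillers, nonwords, and within-half
    randomization are deliberately omitted from this sketch.
    """
    if order not in PREDICTIONS:
        raise ValueError(f"unknown order: {order!r}")
    if order == "odd_first":
        first, second = odd_items, good_items
    else:
        first, second = good_items, odd_items
    return {"1st half": list(first), "2nd half": list(second)}

odd_first = build_exposure_order(["dino?aur"], ["legacy"], "odd_first")
odd_second = build_exposure_order(["dino?aur"], ["legacy"], "odd_second")
```

The two conditions contain exactly the same items; only their position in the exposure phase differs, which is what lets the design isolate a primacy effect.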

Results: Audio Modality

[Figure: % SH responses (0 to 100) along the test continuum from /s/ to /ʃ/, plotted separately for ?SH Exposure and ?S Exposure, in two panels]

Odd First: Perceptual learning (F(1,62) = 5.93, p = .018)
Odd Second: No perceptual learning (F(1,62) = .29, p = .59)

Manipulation: 2X2

Pronunciation attribute varies by modality:

AudioVisual modality = Pragmatic manipulation (can it override the ‘first impressions’ heuristic?)

Pragmatic          Order       Attribution      Prediction
No pen in mouth*   odd first   Characteristic   learning
Pen in mouth       odd first   Incidental       no learning

*The No pen in mouth condition is just an AV version of our Audio, Odd-first condition

Example of manipulation:

[Images: speaker with no pen in mouth; speaker with pen in mouth]

Results: AudioVisual Modality

[Figure: % SH responses (0 to 100) along the test continuum from /s/ to /ʃ/, plotted separately for ?SH Exposure and ?S Exposure, in two panels]

No Pen in Mouth: Perceptual learning (F(1,68) = 6.29, p = .015)
Pen in Mouth: No perceptual learning (F(1,68) = .04, p > .05)

Overall results / Conclusions

Results: Same acoustic signal is handled differently depending on whether it is assumed to be a characteristic pronunciation or an incidental (perhaps transient) one

Main effect of phoneme (SH vs. S), no interaction with modality, significant interaction with Pronunciation attribute.

[Figure: % perceptual learning effect (%SH resp - %S resp), y-axis 0 to 10, for the Audio and AudioVisual modalities, with separate series for characteristic and incidental pronunciation]
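A learning-effect score of the kind shown in this figure can be sketched as a difference between exposure groups. The reading used here is an assumption: the axis label "(%SH resp - %S resp)" is taken to mean mean % SH responses after ?SH exposure minus mean % SH responses after ?S exposure. The function name and the scores are hypothetical and are not data from the study.

```python
from statistics import mean

def learning_effect(sh_exposure_scores, s_exposure_scores):
    """Mean % SH responses after ?SH exposure minus that after ?S exposure.

    Assumed interpretation of the figure's axis label; each list holds
    one overall % SH score per listener in that exposure group.
    """
    return mean(sh_exposure_scores) - mean(s_exposure_scores)

# Hypothetical per-listener % SH scores (invented for illustration).
sh_group = [60.0, 55.0, 65.0]
s_group = [50.0, 45.0, 55.0]
effect = learning_effect(sh_group, s_group)
```

Under this reading, a positive score means the ?SH-exposure group labeled more test items as SH than the ?S-exposure group did, and a score near zero corresponds to no perceptual learning.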

Converging Evidence: Our work on idiolectal/dialectal STR shows learning for ?s when it is speaker-driven, but not when it is contextually-driven

Conclusion: Perceptual learning is a powerful mechanism applied conservatively.

Pragmatic information plays an immediate role in guiding learning

Thank you

Design Elaboration

?S
  Audio:        odd 1st, odd 2nd
  AudioVisual:  No Pen, Pen

?SH
  Audio:        odd 1st, odd 2nd
  AudioVisual:  No Pen, Pen
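The full set of exposure conditions can be enumerated with a short Python sketch. The names `PHONEMES`, `CONDITIONS`, and `design_cells` are hypothetical; the attribution for each level follows the prediction tables in the manipulation slides.

```python
PHONEMES = ("?S", "?SH")

# modality -> {level: attribution}, following the manipulation slides.
CONDITIONS = {
    "Audio": {"odd 1st": "Characteristic", "odd 2nd": "Incidental"},
    "AudioVisual": {"No Pen": "Characteristic", "Pen": "Incidental"},
}

def design_cells():
    """List every (phoneme, modality, level, attribution) cell."""
    cells = []
    for phoneme in PHONEMES:
        for modality, levels in CONDITIONS.items():
            for level, attribution in levels.items():
                cells.append((phoneme, modality, level, attribution))
    return cells

cells = design_cells()
for cell in cells:
    print(*cell, sep=" | ")
```

This yields 2 (phoneme) x 2 (modality) x 2 (pronunciation attribute) = 8 cells, half of them characteristic and half incidental.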
