Effectiveness of spatial cues, prosody, and talker characteristics in selective attention C.J. Darwin & R.W. Hukin

Background

• Spatial attention often the focus of studies of the cocktail party effect

• But humans can separate sources that aren’t separated in space

• What other aspects of the speech signal are useful for source separation?
  – Pitch contour?
  – Individual characteristics?
  – A combination of characteristics?

Aims

• Characterize the role of natural prosody in sound source separation

• Characterize the role of vocal-tract size in sound source separation

Methods

• 13 listeners (aged 21-52 years)

• “Could you PLEASE write the word bead/globe down NOW?” / “You’ll ALSO hear the sound bead/globe played HERE”
  – Target word onsets aligned
  – Target word duration matched
  – Similar phrase durations

Methods

• Three pitch conditions
  – Original
  – Together (equalize target word F0s)
  – Apart (shift target word F0s apart)

• Two splicing methods
  – Normal
  – Swapped
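
The Together and Apart manipulations can be pictured as operations on the F0 contours of the two target words. The sketch below is an illustration under stated assumptions, not the authors' resynthesis procedure: the contours are hypothetical arrays in Hz, "Together" is taken to mean moving both words to a shared mean log-F0 level, and the size of the Apart shift (`extra_semitones`) is an invented parameter.

```python
import numpy as np

def mean_log_f0(contour_hz):
    """Mean of the contour on a log2 (octave) scale."""
    return float(np.mean(np.log2(np.asarray(contour_hz, dtype=float))))

def shift_by_semitones(contour_hz, semitones):
    """Transpose a whole F0 contour by a number of semitones."""
    return np.asarray(contour_hz, dtype=float) * 2.0 ** (semitones / 12.0)

def together(f0_a, f0_b):
    """Assumed 'Together': move both contours to their common mean level."""
    target = 0.5 * (mean_log_f0(f0_a) + mean_log_f0(f0_b))
    a = shift_by_semitones(f0_a, 12.0 * (target - mean_log_f0(f0_a)))
    b = shift_by_semitones(f0_b, 12.0 * (target - mean_log_f0(f0_b)))
    return a, b

def apart(f0_a, f0_b, extra_semitones=2.0):
    """Assumed 'Apart': widen the gap between the contours by
    extra_semitones (hypothetical value), half applied to each word."""
    sign = 1.0 if mean_log_f0(f0_a) >= mean_log_f0(f0_b) else -1.0
    a = shift_by_semitones(f0_a, sign * extra_semitones / 2.0)
    b = shift_by_semitones(f0_b, -sign * extra_semitones / 2.0)
    return a, b
```

Working on a log scale keeps the shape of each contour intact while changing only its overall level, which is one plausible reading of "equalize" and "shift apart".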

You will ALSO hear the sound globe played here

You will also hear the sound globe played HERE

Could you please write the word bead down NOW

Could you PLEASE write the word bead down now

You will ALSO hear the sound globe played here

Could you please write the word bead down NOW

Swapped…

Could you PLEASE write the word bead down now

You will also hear the sound globe played HERE

Methods

• Three pitch conditions
  – Original
  – Together (equalize target word F0s)
  – Apart (shift target word F0s apart)

• Two splicing methods
  – Normal (prosodic cues reinforce spatial)
  – Swapped (prosodic cues oppose spatial)

• ITDs
  – 0, ±45.3, ±90.7 µs

• 144 trials heard 5 times each (720 trials)

You will ALSO hear the sound globe played here / Could you please write the word BEAD down now
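
The ITD values used here are whole multiples of about 45.35 µs, which is one sample period at 22.05 kHz (1/22050 s). The slides do not state a sampling rate, so treating the ITDs as whole-sample delays is an assumption. Under that assumption, a minimal sketch of imposing an ITD on a mono signal delays one ear's channel by an integer number of samples; the function names and the sign convention (positive ITD means the left ear leads) are hypothetical.

```python
import numpy as np

FS = 22050  # Hz; assumed sampling rate, not stated in the source

def itd_microseconds(itd_samples, fs=FS):
    """Convert a whole-sample delay into microseconds."""
    return 1e6 * itd_samples / fs

def apply_itd(mono, itd_samples):
    """Return a stereo array with columns [left, right].

    A positive itd_samples delays the right channel (left ear leads);
    a negative value delays the left channel. Both channels are padded
    with zeros so they keep the same length.
    """
    mono = np.asarray(mono, dtype=float)
    delay = abs(int(itd_samples))
    leading = np.concatenate([mono, np.zeros(delay)])
    lagging = np.concatenate([np.zeros(delay), mono])
    if itd_samples >= 0:
        left, right = leading, lagging
    else:
        left, right = lagging, leading
    return np.column_stack([left, right])
```

With this assumed rate, delays of 1 and 2 samples give roughly 45.35 µs and 90.70 µs, close to the ±45.3 and ±90.7 µs conditions.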

Results

ITD = 0

• Normal: listeners select the target word with matching prosody (83%)

• Swapped: lower accuracy (69%)

In the absence of other cues, listeners can use the natural F0 contour to track a sentence

Results

ITD ≠ 0

• Normal: improved accuracy (93%)

• Swapped: chance selection with an ITD of ±45.3 µs

• With an ITD of ±90.7 µs, listeners report the target word that shares the ITD of the target sentence

Results

ITD ≠ 0

• Apart condition strengthens prosodic cues: greater chance of reporting the target word with the same prosody as the target sentence

• Together condition weakens prosodic cues

ITD cues dominate, but natural prosody can help direct listeners’ attention

Experiment 2

• Changed spectral envelope by 15%
  – Formant frequencies changed
  – Voice source characteristics changed
  – F0 unchanged

• Produced 2 apparently different talkers

• ITD 0, ±45.3, ±90.7, ±181.4 µs

• Different vocal tract sizes have a large effect

• Even with large ITDs and the swapped condition, listeners prefer the original target word (73%)

Experiment 3

• Fixed ITD ±90.7 µs

• Vocal tract size changes of ±2, ±4, ±8, ±15%
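
As a rough illustration of what these percentages mean for the speech spectrum, the sketch below scales a set of formant frequencies by each size change. It assumes that a size change of p% corresponds to multiplying formant frequencies by (1 + p/100), that the sign convention is as written in the code, and it uses made-up formant values. None of these details come from the slides, and the actual manipulation also changed voice source characteristics.

```python
# Hypothetical formant frequencies (Hz) for an illustrative vowel; not from the source.
EXAMPLE_FORMANTS_HZ = (500.0, 1500.0, 2500.0)

# Magnitudes of the size changes listed for Experiment 3, in percent.
SIZE_CHANGES_PERCENT = (2, 4, 8, 15)

def scale_formants(formants_hz, percent):
    """Scale every formant by (1 + percent/100); an assumed reading of
    a spectral-envelope change of the given percentage."""
    factor = 1.0 + percent / 100.0
    return tuple(f * factor for f in formants_hz)

def scaled_conditions(formants_hz=EXAMPLE_FORMANTS_HZ):
    """Map each signed percentage to its scaled formant frequencies."""
    table = {}
    for p in SIZE_CHANGES_PERCENT:
        for signed in (p, -p):
            table[signed] = scale_formants(formants_hz, signed)
    return table
```

Calling `scaled_conditions()` returns one entry per signed condition, which makes it easy to compare how far the smaller changes move the formants relative to the ±15% change.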

• A ±8% size difference is comparable to that between males and females

• Vocal tract size changes below ±8% produce little significant change

Conclusions

• Natural prosodic variations more effectively override spatial cues than monotone F0

• Vocal tract size changes ≥ average male/female differences can override spatial cues

Things to consider

• Natural cues?

• Natural setting?

• In a natural environment, are these cues ever pitted against one another?

• What are listeners really attending to? Can we really conclude that more attention is being paid to ITD than to prosody?

But is the vocal tract modification of realistic proportions?