Cocktail Party: Shinn-Cunningham et al. CNS Colloquium, 14 April 2006
Communicating in a Natural Cocktail Party: Relating Human and Animal Behavior to Neural Response
Barbara Shinn-Cunningham
Boston University Auditory Neuroscience Laboratory
Erol Ozmeral
Erick Gallun
Gin Best
Michele Dent
Liz McClaine
Kamal Sen
Rajiv Narayan
With funding from ONR, AFOSR, & NIH
(Cocktail Party by SLAW, Maniscalco Gallery)
The everyday acoustic environment is full of competition and clutter
The “Cocktail Party Problem”
Penguins and other social birds suffer from the cocktail party problem
(thanks to M. Dent)
Penguins recognize their mates and offspring amidst thousands of birds
Chicks identify parents from 11 m, even when the call is 6 dB below the level of the background noise
Zebra finches learn to make a call by listening to a tutor… while in a large colony
Zebra finches are a model system for studying
• vocal production learning
• hierarchical encoding of complex signals (e.g., “bird’s own song” neurons; Narayan et al., 2006)
How do we figure out what is in the world from the sound mixtures we hear?
[Figure: air pressure versus time (sec)]
Syllables / words heard as units
Confusions occur between sources (streaming over time)
Frequency analysis breaks sound into parallel channels
When sounds overlap in their spectral content, neural responses are a mixture
[Figure: mechanical vibration in air (air pressure versus time, sec) becomes neural firing (electrical spikes) in the auditory nerve, in channels ordered from low to high frequency]
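The channel idea can be sketched in a few lines of Python. This is a minimal illustration, not a cochlear model: it assumes three made-up rectangular bands and a plain DFT, and the names `band_energies` and `tone` are hypothetical. Two tones that fall in different bands excite separate channels; two tones that fall in the same band land in one channel, where their energy is mixed.

```python
import math

def band_energies(signal, sample_rate, band_edges_hz):
    """Sum DFT power into bands; a crude stand-in for parallel frequency channels."""
    n = len(signal)
    energies = [0.0] * (len(band_edges_hz) - 1)
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        power = (re * re + im * im) / (n * n)
        for b in range(len(energies)):
            if band_edges_hz[b] <= freq < band_edges_hz[b + 1]:
                energies[b] += power
    return energies

def tone(freq_hz, sample_rate, n):
    """Unit-amplitude sine tone."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

SAMPLE_RATE = 8000          # Hz (illustrative)
N = 400                     # samples, so DFT bins are 20 Hz apart
EDGES = [0, 1000, 2000, 4000]  # hypothetical channel edges in Hz

# Tones in different channels: each channel carries one source.
separate = [a + b for a, b in zip(tone(500, SAMPLE_RATE, N), tone(3000, SAMPLE_RATE, N))]
separate_energies = band_energies(separate, SAMPLE_RATE, EDGES)

# Tones in the same channel: that channel carries a mixture of both sources.
overlapping = [a + b for a, b in zip(tone(500, SAMPLE_RATE, N), tone(700, SAMPLE_RATE, N))]
overlapping_energies = band_energies(overlapping, SAMPLE_RATE, EDGES)
```

In the first case the low and high channels each hold the energy of one tone; in the second case the low channel holds the summed energy of both, so nothing in that channel's output alone says which source contributed what.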
Spectrotemporal structure of sound is critical
(in contrast with simple, “traditional” stimuli)
Supports segregation of competing source “units”
• harmonicity
• common onsets
• comodulation
Reduces likelihood of spectrotemporal overlap (important elements are unlikely to be masked)
Moreover, the important information is contained in the spectrotemporal structure of sound.
BUT…
removing linguistic/semantic effects may tease out different contributing mechanisms
This project uses birdsongs from male zebra finches
Can compare results with avian behavior (Dent lab) and neurophysiological responses (Sen lab)
Spectrotemporal structure supports segregation
Listeners were trained to identify individual bird songs
Moe, Uno, Toro, Nibbles, Junior
A quick test for you…
What is this?
Moe, Uno, Toro, Nibbles, Junior
What is this?
Short-term “units” segregate, but streaming errors occur for similar sources
What is this?
Moe, Uno, Toro, Nibbles, Junior
[Audio examples: Moe + Nibbles; Uno + noise]
Three maskers, to tease apart different types of interference
Why these maskers?
Masker: Noise | Mod. Noise | Chorus
Spectrotemporal structure: Dense, much overlap | More sparse, temporal control for chorus | Sparse, key target features audible
Main type of interference: Reduce audibility | ? | Cause streaming confusions
Effect of spatial separation: Improve audibility through acoustic better-ear effects | ? | Allow spatial attention to combat confusions
Better-ear effects: the Target-to-Masker Energy Ratio improves with separation
[Figure: separated versus co-located configurations]
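A minimal sketch of the better-ear calculation, assuming the target and masker waveforms at each ear are available separately. The 0.5 amplitude scaling that stands in for head shadow and the four-sample signals are made-up values for illustration, not measurements from this study, and the function names are hypothetical.

```python
import math

def energy(samples):
    """Total energy of a waveform (sum of squared samples)."""
    return sum(x * x for x in samples)

def tmr_db(target, masker):
    """Target-to-masker energy ratio in decibels."""
    return 10.0 * math.log10(energy(target) / energy(masker))

def better_ear_tmr_db(target_left, masker_left, target_right, masker_right):
    """TMR at whichever ear has the more favorable ratio."""
    return max(tmr_db(target_left, masker_left), tmr_db(target_right, masker_right))

# Toy waveforms with equal energy (illustrative only).
target = [1.0, -1.0, 1.0, -1.0]
masker = [1.0, 1.0, -1.0, -1.0]

# Co-located: target and masker reach both ears with the same gains.
colocated = better_ear_tmr_db(target, masker, target, masker)

# Separated: assume the head attenuates the masker at the far (right) ear
# by a factor of 0.5 in amplitude (hypothetical head-shadow value).
shadowed_masker = [0.5 * x for x in masker]
separated = better_ear_tmr_db(target, masker, target, shadowed_masker)
```

With these toy values the co-located ratio is 0 dB, and halving the masker amplitude at one ear raises the better-ear ratio by about 6 dB. No neural processing is involved; the gain is purely acoustic.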
Binaural effects: Interaural decorrelation causes masked signal to be audible
Running cross-correlation output for 500-Hz channel
(simple model of brainstem processing in Medial Superior Olive)
Important for signals below about 1500 Hz, but the birdsongs have a lot of high-frequency information
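A rough sketch of how a running interaural correlation could reveal a masked target. It assumes a zero-lag normalized correlation in fixed windows, a noise masker that is identical at the two ears, and a 500-Hz tone target presented with opposite polarity at the two ears. A fuller model would first band-pass each ear signal and evaluate a range of interaural delays. All signal parameters and function names here are illustrative assumptions.

```python
import math
import random

def normalized_correlation(left, right):
    """Zero-lag correlation between ear signals, normalized to the range [-1, 1]."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den

def running_correlation(left, right, window, hop):
    """Normalized correlation computed in successive windows."""
    return [normalized_correlation(left[s:s + window], right[s:s + window])
            for s in range(0, len(left) - window + 1, hop)]

random.seed(0)              # fixed seed so the example is repeatable
SAMPLE_RATE = 8000          # Hz (illustrative)
N = 1600                    # 200 ms of signal

# Masker: the same noise at both ears, so it is perfectly correlated.
masker = [random.gauss(0.0, 1.0) for _ in range(N)]

# Target: quiet 500-Hz tone, present only in samples 800-1199.
target = [0.5 * math.sin(2 * math.pi * 500 * i / SAMPLE_RATE) if 800 <= i < 1200 else 0.0
          for i in range(N)]

# The target has opposite polarity at the two ears (assumed interaural difference).
left = [m + t for m, t in zip(masker, target)]
right = [m - t for m, t in zip(masker, target)]

correlation = running_correlation(left, right, window=200, hop=200)
```

The correlation stays at 1 in windows that contain only the masker and drops in the two windows where the target is present, even though the target is well below the masker level.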
Hypothesized role of spatial attention in complex settings
?
Measure performance with and without spatial separation of target / masker
[Figure: masker positions with and without spatial separation]
Quantify spatial unmasking = improvement in threshold
Spatial unmasking = co-located threshold - separated threshold
[Figure: masker (M) co-located with the target versus separated from it]
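As a worked example of this measure, the sketch below computes spatial unmasking from a pair of thresholds. The threshold values are placeholders chosen for illustration, not results from this study, and the function name is hypothetical.

```python
def spatial_unmasking_db(colocated_threshold_db, separated_threshold_db):
    """Improvement in threshold (dB) when the masker is moved away from the target."""
    return colocated_threshold_db - separated_threshold_db

# Placeholder thresholds (dB re: masker); not data from the talk.
example = spatial_unmasking_db(colocated_threshold_db=3.0, separated_threshold_db=-4.5)
```

A positive value means the listener tolerates a quieter target once the sources are separated; here the placeholder thresholds give 7.5 dB of unmasking.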
For birdsong, better-ear energy effects are large
[Figure: masker (M) configurations with a better ear and with no better ear]
Is there an additional benefit of perceived separation? Diotic versus binaural
Diotic
- Better-ear benefit
- No binaural processing
- Sources perceived at same location
Binaural
- Better-ear benefit
- Maybe binaural processing
- Sources perceived at different locations
Why these maskers?
Masker: Noise | Mod. Noise | Chorus
Main type of interference: Reduce audibility | ? | Cause streaming confusions
Effect of spatial separation: Improve audibility through acoustic better-ear effects | ? | Allow spatial attention to combat confusions
Diotic vs. binaural performance?: Identical | ? | Binaural much better than diotic performance
For co-located target/masker, the chorus caused the most interference
[Figure: human target threshold (dB re: masker) for each masker (Noise, Mod Noise, Chorus); higher thresholds mean worse performance (need louder target)]
Spatial separation causes unmasking due to better-ear acoustics (diotic presentation)
Size of acoustic effect decreases as masker becomes sparser (audibility less of a problem)
[Figure: improvement in target identification with spatial separation; Best et al. 2005]
For dissimilar maskers, there is no added benefit of perceived spatial separation
No advantage from spatial attention or binaural processing (high-frequency content)
[Figure: improvement in target identification with spatial separation; Best et al. 2005]
For a chorus masker, perceived location differences improve identification
Perceived separation adds 10 dB of spatial unmasking, for confusable masker
For dissimilar maskers: no advantage from spatial attention or binaural processing (high-frequency content)
[Figure: improvement in target identification with spatial separation; Best et al. 2005]
Ask listener to identify a song from a random location, occurring at a random time
Five simultaneous, similar sources, every 15 deg
LEDs on the speakers:
- no information
- which speaker
- which time
- or both
For identification of familiar birdsongs in a chorus, when and where both help
For best subjects, when cue less important; they report “pop out” of familiar songs
For identifying digits in time-reversed digits, when doesn’t help
For all subjects, when cue less important; forward digits “pop out” of reversed speech
Ongoing work
How prior knowledge affects spatial attention
The role of visual cuing of spatial attention
Divided auditory attention
Comparisons with visual attention
Modeling spatial release from different interference
Psychophysics shows different maskers cause different forms of perceptual interference
Noise and modulated noise
• reduce audibility of song elements
• are dissimilar from targets
• are easily segregated
• don’t cause confusion
• show spatial release due to acoustic better-ear improvements
Chorus (or reversed speech)
• is sparse enough that overall interference is not as great
• consists of “units” (syllables) like those in the targets
• is hard to segregate from target
• causes confusion between target and masker
• shows spatial release through spatial attention
Comparing to avian behavior
Dent Lab
SUNY Buffalo
Dent Lab: Teaching birds to recognize the songs
PECK left key to begin variable waiting period (2-7 s)
HEAR a call from one of six individuals
RECOGNIZE and CATEGORIZE call
PECK left or right key
Correct: food reward
Incorrect: lights extinguished
Zebra finches and budgerigars learn zebra finch songs
[Figure: two panels of percent correct versus 100-trial sessions. Maddox, Truman, Milo, Zola: average sessions to criterion = 34.25. Bucky, Bundy, Cosmo, Dixie: average sessions to criterion = 22.]
Relative effectiveness of the maskers differs across species
[Figure: target threshold (dB re: masker) for each masker (Noise, Mod Noise, Chorus) in humans, zebra finches, and budgerigars; higher thresholds mean worse performance (need louder target)]
Next stage: measuring whether the birds use spatial attention like humans
Budgerigars exhibit spatial release for noise maskers*
[Figure: Binaural Hearing. Masked threshold (in dB) versus frequency (1.00, 2.00, 2.86, 4.00 kHz) for unilateral and bilateral sound sources, with signal (S) and noise (N)]
*Dent et al., Behav. Neurosci., 1997
Avian psychophysics shows different maskers cause different levels of interference
Degree of interference differs from species to species
Next stage will explore whether effect of spatial separation differs with masker type, as in humans
Comparing to avian physiology
Sen Lab
Biomedical Engineering
Recording from zebra finch forebrain Field L (homologue of primary auditory cortex)
Record neural spike trains in response to multiple copies of clean songs from five birds
Record neural spike trains in response to repetitions of each song embedded in each masker
Each neuron has a set of spectrotemporal features to which it responds
[Figure: frequency versus time tuning of a broadband onset neuron and a narrowband neuron]
Compare clean-song templates to target + masker
Compute single-neuron classification performance
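One way to turn template comparison into a percent-correct score is nearest-template classification. The sketch below is an illustration under stated assumptions: it bins spike times and uses Euclidean distance between the binned counts, which may differ from the distance measure actually used in this work, and the song names, spike times, and function names are all invented.

```python
def bin_spikes(spike_times, duration, bin_width):
    """Convert spike times (s) into a list of spike counts per time bin."""
    n_bins = int(round(duration / bin_width))
    counts = [0] * n_bins
    for t in spike_times:
        counts[min(int(t / bin_width), n_bins - 1)] += 1
    return counts

def distance(a, b):
    """Euclidean distance between two binned spike trains (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def classify(response, templates, duration, bin_width):
    """Return the name of the clean-song template closest to the response."""
    binned = bin_spikes(response, duration, bin_width)
    return min(templates,
               key=lambda name: distance(binned, bin_spikes(templates[name], duration, bin_width)))

def percent_correct(trials, templates, duration, bin_width):
    """Percentage of (true_song, spike_times) trials assigned to the right song."""
    hits = sum(1 for true_song, spikes in trials
               if classify(spikes, templates, duration, bin_width) == true_song)
    return 100.0 * hits / len(trials)

# Invented clean-song templates for one neuron (spike times in seconds).
templates = {
    "song_a": [0.05, 0.15, 0.45, 0.75],
    "song_b": [0.25, 0.35, 0.55, 0.95],
}

# Invented responses to song + masker.
trials = [
    ("song_a", [0.05, 0.45, 0.75]),              # one spike suppressed
    ("song_b", [0.25, 0.35, 0.55, 0.65, 0.95]),  # one spurious spike added
    ("song_a", [0.25, 0.35, 0.55]),              # badly corrupted response
]

score = percent_correct(trials, templates, duration=1.0, bin_width=0.1)
```

The first two invented responses still sit closest to their own templates despite a missing or an extra spike, while the third is misclassified, so the score for this toy set is two out of three.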
The chorus causes the least interference in performance
[Figure: percent correct versus target-to-masker energy ratio (-10 to 10, plus clean) for chorus, mod. noise, and noise maskers]
Narrowband neurons perform better, with larger differences between masker types
[Figure: two panels of percent correct versus target-to-masker energy ratio (-10 to 10, plus clean) for chorus, noise, and mod. noise maskers]
Best single neuron performance is very good
[Figure: best neuron versus average neuron performance for chorus, noise, and mod. noise maskers]
But overall percent correct classification does not describe the kind of interference
Information in neural spike train is in timing / pattern
[Figure: representation of the neuron’s tuning (e.g., features in time-frequency) and the resulting spike output]
Hypothesize that noise masker suppresses spikes to target features
[Figure: time-frequency panels showing target content, masker content, and target / masker mixture]
Hypothesize that modulated noise masker adds extra spikes at noise onsets
[Figure: time-frequency panels showing target content, masker content, and target / masker mixture]
Hypothesize that chorus masker adds spikes, but fewer than the modulated noise
[Figure: time-frequency panels showing target content, masker content, and target / masker mixture]
Wideband neuron example: response to clean song
[Figure: target pressure waveform, spike rate color plot (clean song), and average spike rate, versus time]
Wideband response to song in noise: spike suppression
[Figure: target pressure waveform, average spike rate, and spike rate as a function of SNR, versus time]
Wideband response to song in mod noise: spike addition (some suppression)
[Figure: target pressure waveform, average spike rate, and spike rate as a function of SNR, versus time]
Wideband response to song in chorus: spike suppression!!
[Figure: target pressure waveform, average spike rate, and spike rate as a function of SNR, versus time]
Narrowband response to song in noise: spike suppression
[Figure: target pressure waveform, average spike rate, and spike rate as a function of SNR, versus time]
Narrowband response to song in mod noise: spike addition
[Figure: target pressure waveform, average spike rate, and spike rate as a function of SNR, versus time]
Narrowband response to song in chorus: spike suppression!!
[Figure: target pressure waveform, average spike rate, and spike rate as a function of SNR, versus time]
In between target syllables, all maskers add some spikes
Mod noise adds the most spikes
Noise adds almost as many
The chorus adds the least
[Figure: spike rates between syllables, with the average for clean targets marked]
For Mod Noise, there is no effect of target level
[Figure: spike rates, with the clean average marked]
Within syllables, the rate increases with target level for Chorus and Noise
[Figure: spike rates, with the clean average marked]
Neurophysiology shows neural correlates of different forms of perceptual interference
Single Field L neurons contain enough information to classify complex bird songs
Noise suppresses responses to song features
Modulated noise causes spurious extra spikes
Chorus response shows surprising amount of suppression rather than expected spike addition
Results consistent with extraordinary nonlinearity in response to complex features in chorus (further supported by estimates of linear spectrotemporal receptive field analysis)
Future Work
Compare release from interference with spatial separation in avian species and humans
Measure effects of spatial position of sources on avian forebrain neural responses
Develop awake-behaving neurophysiological preparation to explore attention and single-trial events
Relate human psychophysics to fMRI measures (with David Somers)
Develop model based on neurophysiological results that describes factors affecting listening in complex settings