
Communicating in a Natural Cocktail Party: Relating Human and Animal Behavior to Neural Response


Cocktail Party: Shinn-Cunningham et al. CNS Colloquium, 14 April 2006

Communicating in a Natural Cocktail Party: Relating Human and Animal Behavior to Neural Response

Barbara Shinn-Cunningham

Boston University Auditory Neuroscience Laboratory


Erol Ozmeral

Erick Gallun

Gin Best

Michele Dent

Liz McClaine

Kamal Sen

Rajiv Narayan

With funding from ONR, AFOSR, & NIH

The “Cocktail Party Problem”

The everyday acoustic environment is full of competition and clutter

(Image: Cocktail Party by SLAW, Maniscalco Gallery)


Penguins and other social birds suffer from the cocktail party problem

(thanks to M. Dent)

Penguins recognize their mates and offspring amidst thousands of birds

Chicks identify parents from 11 m -- when call is 6 dB below the level of the background noise

Zebra finches learn to make a call by listening to a tutor… while in a large colony

Zebra finches are a model system for studying
• vocal production learning
• hierarchical encoding of complex signals (e.g., “bird’s own song” neurons; Narayan et al., 2006)

How do we figure out what is in the world from the sound mixtures we hear?

(Figure: waveform of a sound mixture; axes: air pressure versus time (sec))

Syllables / words heard as units

Confusions occur between sources (streaming over time)

Frequency analysis breaks sound into parallel channels

When sounds overlap in their spectral content, neural responses are a mixture

(Figure: mechanical vibration in air, shown as air pressure versus time (sec), becomes neural firing (electrical spikes) in the auditory nerve, in channels running from low to high frequency)
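The mixing point can be illustrated with a minimal sketch. This is not the auditory model behind the slide: it measures energy in one narrow frequency channel with the Goertzel recurrence (a single-bin DFT). The sample rate, tone frequencies, and the helper names `channel_power` and `tone` are all illustrative assumptions. A masker in a different channel leaves the target channel untouched, while a masker in the same channel changes that channel's output.

```python
import math

def channel_power(samples, fs, freq):
    """Energy of `samples` in a narrow channel near `freq` (Hz).

    Goertzel recurrence (single-bin DFT). Hypothetical stand-in for one
    channel of a cochlear-style filterbank.
    """
    n = len(samples)
    k = round(n * freq / fs)                      # nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def tone(freq, fs, n, amp=1.0):
    """Sine tone of `n` samples (illustrative stimulus, not birdsong)."""
    return [amp * math.sin(2.0 * math.pi * freq * t / fs) for t in range(n)]

FS, N = 8000, 800                                 # assumed sample rate and length
target = tone(500, FS, N)
masker_far = tone(2000, FS, N)                    # falls in a different channel
masker_same = tone(500, FS, N, amp=0.5)           # falls in the target's channel

mix_far = [a + b for a, b in zip(target, masker_far)]
mix_same = [a + b for a, b in zip(target, masker_same)]

p_target = channel_power(target, FS, 500)
p_far = channel_power(mix_far, FS, 500)           # about equal to p_target
p_same = channel_power(mix_same, FS, 500)         # target and masker are mixed
```

In this toy case the 500-Hz channel cannot tell target from masker once both occupy it, which is the sense in which the neural response is a mixture.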

Spectrotemporal structure of sound is critical

(in contrast with simple, “traditional” stimuli)

Supports segregation of competing source “units”
• harmonicity
• common onsets
• comodulation

Reduces likelihood of spectrotemporal overlap (important elements are unlikely to be masked)

Moreover, the important information is contained in the spectrotemporal structure of sound.

BUT…

removing linguistic/semantic effects may tease out different contributing mechanisms

This project uses birdsongs from male zebra finches

Can compare results with avian behavior (Dent lab) and neurophysiological responses (Sen lab)

Spectrotemporal structure supports segregation

Listeners were trained to identify individual bird songs

Songs: Moe, Uno, Toro, Nibbles, Junior

A quick test for you…

What is this?

Choices: Moe, Uno, Toro, Nibbles, Junior

Short-term “units” segregate, but streaming errors occur for similar sources

What is this?

Choices: Moe, Uno, Toro, Nibbles, Junior

(Examples played: Moe + Nibbles; Uno + noise)

Three maskers, to tease apart different types of interference

Why these maskers?

Noise
• Spectrotemporal structure: dense, much overlap
• Main type of interference: reduce audibility
• Effect of spatial separation: improve audibility through acoustic better-ear effects

Mod. Noise
• Spectrotemporal structure: more sparse; temporal control for chorus
• Main type of interference: ?
• Effect of spatial separation: ?

Chorus
• Spectrotemporal structure: sparse; key target features audible
• Main type of interference: cause streaming confusions
• Effect of spatial separation: allow spatial attention to combat confusions

Better-ear effects: the Target-to-Masker Energy Ratio improves with separation

(Figure: panels comparing the separated and co-located configurations)
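As a rough illustration of the better-ear idea, the sketch below computes a target-to-masker energy ratio (TMR) in dB at each ear and takes the larger one. The signals and the factor-of-two head-shadow attenuation are made-up values, not measurements from the talk, and the function names are hypothetical.

```python
import math

def energy(signal):
    """Sum of squared samples."""
    return sum(x * x for x in signal)

def tmr_db(target, masker):
    """Target-to-masker energy ratio in dB."""
    return 10.0 * math.log10(energy(target) / energy(masker))

def better_ear_tmr_db(target_l, masker_l, target_r, masker_r):
    """TMR at whichever ear has the more favorable ratio."""
    return max(tmr_db(target_l, masker_l), tmr_db(target_r, masker_r))

def scale(signal, gain):
    """Apply a fixed gain (assumed stand-in for head shadow)."""
    return [gain * x for x in signal]

# Toy signals with equal energy (assumption for illustration only).
target = [1.0, -1.0, 1.0, -1.0]
masker = [1.0, 1.0, -1.0, -1.0]

# Co-located: both ears receive the same target and masker.
colocated_db = better_ear_tmr_db(target, masker, target, masker)

# Separated: the head attenuates the masker at the left ear and the
# target at the right ear (hypothetical gain of 0.5 in amplitude).
separated_db = better_ear_tmr_db(target, scale(masker, 0.5),
                                 scale(target, 0.5), masker)
```

With these toy numbers the better-ear TMR rises by about 6 dB when the sources are separated, purely from acoustics and with no binaural processing.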

Binaural effects: Interaural decorrelation causes masked signal to be audible

Running cross-correlation output for 500-Hz channel (simple model of brainstem processing in Medial Superior Olive)

Important for signals below about 1500 Hz, but the birdsongs have a lot of high-frequency information
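A minimal sketch of the decorrelation cue follows. It is a simplification of the model named on the slide: it computes only a zero-lag normalized correlation in successive windows on raw waveforms, whereas a fuller model would first band-pass filter around 500 Hz and evaluate several interaural lags. The stimulus (a masker identical at the two ears, plus a target that is phase-inverted at one ear and present only in the second half) and all parameter values are assumptions chosen for illustration.

```python
import math

def running_correlation(left, right, window):
    """Normalized zero-lag interaural correlation in successive windows."""
    out = []
    for start in range(0, len(left) - window + 1, window):
        l = left[start:start + window]
        r = right[start:start + window]
        num = sum(a * b for a, b in zip(l, r))
        den = math.sqrt(sum(a * a for a in l) * sum(b * b for b in r))
        out.append(num / den if den > 0 else 0.0)
    return out

FS, N, WINDOW = 8000, 1600, 400        # assumed sample rate, length, window

def sine(freq, t, phase=0.0):
    return math.sin(2.0 * math.pi * freq * t / FS + phase)

left, right = [], []
for t in range(N):
    masker = sine(480, t) + sine(520, t, 1.0)     # identical at both ears
    target = 0.5 * sine(500, t) if t >= N // 2 else 0.0
    left.append(masker + target)
    right.append(masker - target)                 # target inverted at one ear

corr = running_correlation(left, right, WINDOW)
```

In the first two windows the ears receive identical signals and the correlation is essentially 1. Once the target enters, the correlation drops, which is the cue a correlation-based detector could use even when the target adds little energy.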

Hypothesized role of spatial attention in complex settings

?

Measure performance with and without spatial separation of target / masker

(Figure: two configurations, each with a labeled Masker)

Quantify spatial unmasking = improvement in threshold

Spatial unmasking = threshold with co-located masker (M) - threshold with separated masker (M)
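The arithmetic is simple enough to state as code. In this sketch, thresholds are target levels in dB relative to the masker, so a lower threshold means better performance and a positive difference means a benefit of separation. The function name and the example thresholds are hypothetical, not data from the talk.

```python
def spatial_unmasking_db(threshold_colocated_db, threshold_separated_db):
    """Improvement in threshold (dB) when the masker moves away from the target."""
    return threshold_colocated_db - threshold_separated_db

# Hypothetical thresholds in dB re: masker level.
example_db = spatial_unmasking_db(-5.0, -15.0)
```

Here the separated threshold is 10 dB lower than the co-located one, so the spatial unmasking is 10 dB.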

For birdsong, better-ear energy effects are large

(Figure: masker (M) configurations with a better ear and with no better ear)

Is there an additional benefit of perceived separation? Diotic versus binaural

Diotic
- Better-ear benefit
- No binaural processing
- Sources perceived at same location

Binaural
- Better-ear benefit
- Maybe binaural processing
- Sources perceived at different locations

Why these maskers?

Noise
• Main type of interference: reduce audibility
• Effect of spatial separation: improve audibility through acoustic better-ear effects
• Diotic vs. binaural performance? Identical

Mod. Noise
• Main type of interference: ?
• Effect of spatial separation: ?
• Diotic vs. binaural performance? ?

Chorus
• Main type of interference: cause streaming confusions
• Effect of spatial separation: allow spatial attention to combat confusions
• Diotic vs. binaural performance? Binaural much better than diotic performance

For co-located target/masker, the chorus caused the most interference

(Figure: human target thresholds (dB re: masker) for the Noise, Mod Noise, and Chorus maskers; threshold axis from -20 to 10 dB; higher thresholds mean worse performance (need louder target))

Spatial separation causes unmasking due to better-ear acoustics (diotic presentation)

Size of acoustic effect decreases as masker becomes sparser (audibility less of a problem)

(Figure: improvement with spatial separation in the Identify Target task; Best et al. 2005)

For dissimilar maskers, there is no added benefit of perceived spatial separation

No advantage from spatial attention or binaural processing (high-frequency content)

(Figure: improvement with spatial separation in the Identify Target task; Best et al. 2005)

For a chorus masker, perceived location differences improve identification

Perceived separation adds 10 dB of spatial unmasking, for confusable masker

No advantage from spatial attention or binaural processing (high-frequency content)

(Figure: improvement with spatial separation in the Identify Target task; Best et al. 2005)

Ask listener to identify a song from a random location, occurring at a random time

Five simultaneous, similar sources, every 15 deg

LEDs on the speakers:
- no information
- which speaker
- which time
- or both

For identification of familiar birdsongs in a chorus, when and where both help

For best subjects, the “when” cue is less important; they report “pop out” of familiar songs

For identifying digits in time-reversed digits, when doesn’t help

For all subjects, the “when” cue is less important; forward digits “pop out” of reversed speech


Ongoing work

How prior knowledge affects spatial attention

The role of visual cuing of spatial attention

Divided auditory attention

Comparisons with visual attention

Modeling spatial release from different interference

Psychophysics shows different maskers cause different forms of perceptual interference

Noise and modulated noise
• reduce audibility of song elements
• are dissimilar from targets
• are easily segregated
• don’t cause confusion
• show spatial release due to acoustic better-ear improvements

Chorus (or reversed speech)
• is sparse enough that overall interference is not as great
• consists of “units” (syllables) like those in the targets
• is hard to segregate from target
• causes confusion between target and masker
• shows spatial release through spatial attention


Comparing to avian behavior

Dent Lab

SUNY Buffalo

Dent Lab: Teaching birds to recognize the songs

PECK left key to begin variable waiting period (2-7 s)

HEAR a call from one of six individuals

RECOGNIZE and CATEGORIZE call

PECK left or right key

Correct: food reward

Incorrect: lights extinguished

Zebra finches and budgerigars learn zebra finch songs

(Figure: two panels of percent correct (30 to 100) versus 100-trial sessions. Panel with Maddox, Truman, Milo, and Zola: average sessions to criterion = 34.25. Panel with Bucky, Bundy, Cosmo, and Dixie: average sessions to criterion = 22.)

Relative effectiveness of the maskers differs across species

(Figure: target thresholds (dB re: masker) for the Noise, Mod Noise, and Chorus maskers in humans, zebra finches, and budgerigars; threshold axis from -20 to 15 dB; higher thresholds mean worse performance (need louder target))

Next stage: measuring whether the birds use spatial attention like humans

Budgerigars exhibit spatial release for noise maskers*

(Figure, “Binaural Hearing”: masked threshold (in dB, axis from 0 to 35) versus frequency (1.00, 2.00, 2.86, and 4.00 kHz) for unilateral and bilateral sound sources, with the signal (S) and noise (N) arrangements shown)

*Dent et al., Behav. Neurosci., 1997

Avian psychophysics shows different maskers cause different levels of interference

Degree of interference differs from species to species

Next stage will explore whether effect of spatial separation differs with masker type, as in humans


Comparing to avian physiology

Sen Lab

Biomedical Engineering

Recording from zebra finch forebrain Field L (homologue of primary auditory cortex)

Record neural spike trains in response to multiple copies of clean songs from five birds

Record neural spike trains in response to repetitions of each song embedded in each masker

Each neuron has a set of spectrotemporal features to which it responds

(Figure: tuning in frequency versus time for a broadband onset neuron and a narrowband neuron)

Compare clean-song templates to target + masker

Compute single-neuron classification performance
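One way such a template comparison could work is sketched below. The slides do not state the distance measure, so this example assumes a van Rossum-style distance: each spike train is smoothed with a causal exponential kernel, and the squared difference between smoothed traces is summed. The time constant, bin width, spike times, and function names are all invented for illustration, and the song labels are borrowed from the earlier slides only as placeholder names.

```python
import math

def smooth(spike_times, duration, dt=0.001, tau=0.010):
    """Convolve a spike train with a causal exponential kernel (assumed tau)."""
    n = int(round(duration / dt))
    trace = [0.0] * n
    for t_spike in spike_times:
        start = int(round(t_spike / dt))
        for i in range(start, n):
            trace[i] += math.exp(-(i - start) * dt / tau)
    return trace

def distance(train_a, train_b, duration):
    """Summed squared difference between the smoothed spike trains."""
    a, b = smooth(train_a, duration), smooth(train_b, duration)
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(response, templates, duration):
    """Label of the clean-song template nearest to `response`."""
    return min(templates,
               key=lambda label: distance(response, templates[label], duration))

DURATION = 0.2                                    # seconds (assumed)
templates = {                                     # hypothetical clean responses
    "Moe": [0.010, 0.050, 0.120],
    "Uno": [0.030, 0.080, 0.160],
}
# Hypothetical response to "Moe" plus a masker: jittered spikes and one extra.
masked_response = [0.012, 0.049, 0.100, 0.121]

label = classify(masked_response, templates, DURATION)
```

Percent correct for a neuron would then be the fraction of masked trials whose nearest template matches the song that was actually presented.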

The chorus causes the least interference in performance

(Figure: percent correct (0 to 100) versus target-to-masker energy ratio (-10, -5, 0, 5, 10, and clean) for chorus, mod. noise, and noise maskers)

Narrowband neurons perform better, with larger differences between masker types

(Figure: two panels of percent correct (0 to 100) versus target-to-masker energy ratio (-10, -5, 0, 5, 10, and clean) for chorus, noise, and mod. noise maskers)

Best single neuron performance is very good

(Figure: best-neuron and average-neuron performance for chorus, noise, and mod. noise maskers)

But overall percent correct classification does not describe kind of interference

Information in neural spike train is in timing / pattern

(Figure: representation of the neuron’s tuning (e.g., features in time-frequency), plotted as frequency versus time, alongside the spike output)

Hypothesize that noise masker suppresses spikes to target features

(Figure: frequency versus time panels showing target content, masker content, and the target / masker mixture)

Hypothesize that modulated noise masker adds extra spikes at noise onsets

(Figure: frequency versus time panels showing target content, masker content, and the target / masker mixture)

Hypothesize that chorus masker adds spikes, but fewer than the modulated noise

(Figure: frequency versus time panels showing target content, masker content, and the target / masker mixture)

Wideband neuron example: response to clean song

(Figure: target pressure waveform, spike rate color plot (clean song), and average spike rate, all versus time)

Wideband response to song in noise: spike suppression

(Figure: target pressure waveform, spike rate as function of SNR, and average spike rate, all versus time)

Wideband response to song in mod noise: spike addition (some suppression)

(Figure: target pressure waveform, spike rate as function of SNR, and average spike rate, all versus time)

Wideband response to song in chorus: spike suppression!!

(Figure: target pressure waveform, spike rate as function of SNR, and average spike rate, all versus time)

Narrowband response to song in noise: spike suppression

(Figure: target pressure waveform, spike rate as function of SNR, and average spike rate, all versus time)

Narrowband response to song in mod noise: spike addition

(Figure: target pressure waveform, spike rate as function of SNR, and average spike rate, all versus time)

Narrowband response to song in chorus: spike suppression!!

(Figure: target pressure waveform, spike rate as function of SNR, and average spike rate, all versus time)

In between target syllables, all maskers add some spikes

Mod noise adds the most spikes

Noise adds almost as many

The chorus adds the least

(Figure reference line: ave. for clean targets)

For Mod Noise, there is no effect of target level

(Figure reference lines: clean ave.)

Within syllables, the rate increases with target level for Chorus and Noise

(Figure reference lines: clean ave.)
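The within-syllable versus between-syllable comparison can be sketched as follows. This is an illustration of the kind of analysis described, not the lab's code: the syllable boundaries, spike times, and function names are hypothetical.

```python
def rate_in_windows(spike_times, windows):
    """Mean spike rate (spikes/s) over a list of (start, end) windows in seconds."""
    total_time = sum(end - start for start, end in windows)
    count = sum(1 for t in spike_times
                for start, end in windows if start <= t < end)
    return count / total_time

def gaps_between(windows, duration):
    """Complement of sorted, non-overlapping `windows` within [0, duration)."""
    gaps, cursor = [], 0.0
    for start, end in windows:
        if start > cursor:
            gaps.append((cursor, start))
        cursor = end
    if cursor < duration:
        gaps.append((cursor, duration))
    return gaps

DURATION = 0.6                                    # seconds (assumed)
syllables = [(0.1, 0.2), (0.4, 0.5)]              # hypothetical target syllables
gaps = gaps_between(syllables, DURATION)

clean = [0.11, 0.13, 0.15, 0.41, 0.45]            # hypothetical clean response
masked = clean + [0.05, 0.30, 0.55]               # masker adds spikes in the gaps

within_clean = rate_in_windows(clean, syllables)
between_clean = rate_in_windows(clean, gaps)
between_masked = rate_in_windows(masked, gaps)
```

Comparing `between_masked` with `between_clean` gives the spikes added by the masker in the gaps, while repeating the within-syllable measurement at several target levels would show whether the rate grows with level.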

Neurophysiology shows neural correlates of different forms of perceptual interference

Single Field L neurons contain enough information to classify complex bird songs

Noise suppresses responses to song features

Modulated noise causes spurious extra spikes

Chorus response shows surprising amount of suppression rather than expected spike addition

Results consistent with extraordinary nonlinearity in response to complex features in chorus (further supported by estimates of linear spectrotemporal receptive field analysis)


Future Work

Compare release from interference with spatial separation in avian species and humans

Measure effects of spatial position of sources on avian forebrain neural responses

Develop awake-behaving neurophysiological preparation to explore attention and single-trial events

Relate human psychophysics to fMRI measures (with David Somers)

Develop model based on neurophysiological results that describes factors affecting listening in complex settings

Space helps even when all elements are audible… if sources are similar

(Figure: waveform; axes: air pressure versus time (sec))