Page 1: What’re AM and FM?  What are their perceptual roles?   Where to find it?  Implications?

Speech recognition with amplitude and frequency modulations:

Implications for cochlear implant design

• What’re AM and FM?

• What are their perceptual roles?

• Where to find it?

• Implications?

Fan-Gang Zeng

Kaibao Nie

Ginger Stickney

Ying-Yee Kong

Ashish Bhargave

Hongbin Chen

Michael Vongphoe

Janice Chang

Page 2

What is fine structure?

• Rosen’s definition:

– Envelope (5-50 Hz)

– Periodicity (50-500 Hz)

– Fine structure (500-10,000 Hz)

• Hilbert’s definition:

– Temporal envelope

– Fine structure

[Figure: waveform panels labeled Original, AM, Fine Structure, and FM]
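The Hilbert definition on this slide can be illustrated with a short sketch. It builds the analytic signal with an FFT, takes its magnitude as the temporal envelope, and takes the cosine of its phase as the fine structure. The function names, the sampling rate, and the test tone are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real 1-D array (FFT method: zero the negative frequencies)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep DC once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep Nyquist once
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def hilbert_decompose(x):
    """Split x into a temporal envelope and a unit-amplitude fine structure."""
    z = analytic_signal(x)
    envelope = np.abs(z)                  # slowly varying amplitude
    fine_structure = np.cos(np.angle(z))  # rapidly varying carrier
    return envelope, fine_structure

# Hypothetical test signal: a 1000 Hz tone with 10 Hz amplitude modulation.
fs = 16000
t = np.arange(fs) / fs
am = 1.0 + 0.5 * np.sin(2 * np.pi * 10 * t)
x = am * np.cos(2 * np.pi * 1000 * t)
envelope, fine = hilbert_decompose(x)
```

For this tone the recovered envelope matches the imposed modulation, and the product of envelope and fine structure gives back the original waveform.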

Page 3

Little math

• Flanagan (1980) “Parametric coding of speech spectra”

$$s(t) = \sum_{k=1}^{N} A_k(t)\,\cos\!\left[2\pi f_{ck}\,t + 2\pi\int_0^t g_k(\tau)\,d\tau + \theta_k\right]$$

– Discard absolute phase:

$$s(t) = \sum_{k=1}^{N} A_k(t)\,\cos\!\left[2\pi f_{ck}\,t + 2\pi\int_0^t g_k(\tau)\,d\tau\right]$$

– Discard relative phase (i.e., frequency modulation):

$$s(t) = \sum_{k=1}^{N} A_k(t)\,\cos\!\left(2\pi f_{ck}\,t\right)$$
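A minimal sketch of the signal model on this slide: each of N bands contributes A_k(t) cos(2π f_ck t + 2π ∫ g_k dτ + θ_k). Setting the FM term to zero gives the AM-only case with fixed carriers. The function name, the array layout, and the example values are assumptions for illustration, and the integral is approximated with a left Riemann sum.

```python
import numpy as np

def synthesize(am, fm, center_freqs, fs, phases=None):
    """Sum over bands of A_k(t) * cos(2*pi*f_ck*t + 2*pi*int_0^t g_k + theta_k).

    am, fm: arrays of shape (N, T); fm is the frequency deviation g_k(t) in Hz.
    center_freqs: N carrier frequencies f_ck in Hz.
    phases: optional N initial phases theta_k in radians (default 0).
    """
    am = np.asarray(am, dtype=float)
    fm = np.asarray(fm, dtype=float)
    fc = np.asarray(center_freqs, dtype=float)[:, None]
    n_bands, n_samples = am.shape
    t = np.arange(n_samples) / fs
    theta = np.zeros(n_bands) if phases is None else np.asarray(phases, dtype=float)
    # Running integral of g_k from 0 to t (left Riemann sum, zero at t = 0).
    fm_integral = (np.cumsum(fm, axis=1) - fm) / fs
    phase = 2 * np.pi * fc * t + 2 * np.pi * fm_integral + theta[:, None]
    return np.sum(am * np.cos(phase), axis=0)

# Hypothetical two-band example: the first band carries a slow 5 Hz FM.
fs = 8000
n = 4000
t = np.arange(n) / fs
am = np.ones((2, n))
fm = np.vstack([20.0 * np.sin(2 * np.pi * 5 * t), np.zeros(n)])
with_fm = synthesize(am, fm, [500.0, 1500.0], fs)                  # AM + FM
am_only = synthesize(am, np.zeros_like(fm), [500.0, 1500.0], fs)   # FM discarded
```

With a constant deviation of 50 Hz, the band reduces to a pure tone at f_ck + 50 Hz, which is a quick sanity check on the integral term.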

Page 4

Implementation

• Combo of Dudley’s vocoder and Flanagan’s phase vocoder

[Block diagram, labels only: Input; AM filter; Envelope; Compression; FM filter; AM; Output]

Zeng, Nie, Stickney et al. PNAS (2005)
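One band of such a scheme might be sketched as follows. This is not the authors' implementation: the Butterworth filters, the Hilbert-based demodulation, the 50 Hz cutoffs, and the arithmetic band center are all assumptions chosen for illustration, and the compression stage is omitted. The sketch band-pass filters the input, low-passes the envelope to get a slow AM, low-passes the instantaneous-frequency deviation to get a slow FM, and rebuilds the band from the two.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def lowpass(x, cutoff_hz, fs, order=4):
    """Zero-phase Butterworth low-pass filter."""
    sos = butter(order, cutoff_hz, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

def band_am_fm(x, f_lo, f_hi, fs, am_cutoff=50.0, fm_cutoff=50.0):
    """Extract a slow AM and a slow FM from one analysis band (illustrative)."""
    sos = butter(4, [f_lo, f_hi], btype="band", fs=fs, output="sos")
    band = sosfiltfilt(sos, x)
    z = hilbert(band)                                  # analytic signal of the band
    am = np.maximum(lowpass(np.abs(z), am_cutoff, fs), 0.0)
    phase = np.unwrap(np.angle(z))
    inst_freq = np.gradient(phase) * fs / (2 * np.pi)  # instantaneous frequency, Hz
    fc = 0.5 * (f_lo + f_hi)                           # assumed band center
    fm = lowpass(inst_freq - fc, fm_cutoff, fs)        # slow deviation from center
    return am, fm, fc

def resynthesize_band(am, fm, fc, fs):
    """Rebuild the band as am * cos(2*pi*fc*t + 2*pi*integral of fm)."""
    t = np.arange(len(am)) / fs
    fm_integral = (np.cumsum(fm) - fm) / fs
    return am * np.cos(2 * np.pi * fc * t + 2 * np.pi * fm_integral)

# Hypothetical input: a steady 1050 Hz tone analyzed in an 800-1200 Hz band.
fs = 16000
t = np.arange(fs) / fs
x = np.cos(2 * np.pi * 1050 * t)
am, fm, fc = band_am_fm(x, 800.0, 1200.0, fs)
y = resynthesize_band(am, fm, fc, fs)
```

Away from the signal edges, the extracted AM stays near 1 and the FM stays near 50 Hz, the offset of the tone from the assumed 1000 Hz band center. A full processor would repeat this per band and sum the outputs.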

Page 5

Spectra: What does FM encode?

[Figure: spectrogram panels A to G; x-axis: Time (s), 0 to 0.5; y-axis: Frequency (Hz), 0 to 5000; level scale in dB, 0 to -30]

Zeng, Nie, Stickney et al. PNAS (2005)

Page 6

Sentence, speaker, and tone recognition

Zeng, Nie, Stickney et al. PNAS (2005)

Combo:

Target:

Masker:

Page 7

Comparison with previous studies

[Figure: bar chart; x-axis: Condition (CN4, CN8, CS4, CS8, HN4, HN8, HS4, HS8, IN4, IN8, IS4, IS8); y-axis: Percent Correct, 0 to 100; legend: Shannon et al. 1995, Dorman et al. 1997, Zeng et al. 2005]

Zeng, Nie, Stickney et al. PNAS (2005)

Page 8

Spectral resolution and noise type

[Figure: panels A to F. Spectrum panels for Original, AM, and AM+FM; x-axis: Frequency (kHz), 0 to 8; y-axis: Amplitude (dB), 40 to 110; legend: Target, Male, Female. Threshold panels for Speech-shaped noise, Male masker, and Female masker; y-axis: Speech Reception Threshold (dB), -25 to 15; conditions: AM and AM+FM, for NH and CI listeners. Annotation: 30-dB SRT]

Zeng, Nie, Stickney et al. PNAS (2005)

Page 9

Speech recognition in combined hearing

Kong, Stickney, and Zeng JASA (2005)

[Figure: panels S2, S3, S5, and Mean; x-axis: Signal-to-noise Ratio (SNR), 0 to 20; y-axis: Percent correct, 0 to 100; legend: HA, CI, HA+CI. Annotation: 10-dB SRT]

Page 10

FM detection in CIs: Results

[Figure: log-log plots; x-axis: standard frequency (Hz), 10 to 1000; y-axis: Difference limen (Hz), 1 to 1000; legend: upward, downward, and sinusoid, each with a regression line. Inset axes: Time, Frequency]

Chen and Zeng JASA (2004)

Page 11

Summary

Using FM to improve auditory performance:

– Speech cues are not redundant: FM complements AM in speech perception

– FM is important for speech recognition with competing voices as maskers

– FM is important for music and tonal language perception

– FM is a slow version of fine structure that can be perceived and used to improve cochlear implant performance

Page 12

Acknowledgements

• NIH - NIDCD

• Chinese NSF

• Advanced Bionics Corp

• Cochlear Corp

• MED-EL

• Peter Assmann

• Ann Bradlow

• Keli Cao and CG Wei

• Larry Feth

• Ruth Litovsky

• Jones Ackland