USE OF HARMONIC PLUS NOISE MODEL FOR REDUCTION OF SELF LEAKAGE IN ELECTROALARYNGEAL SPEECH

Parveen K. Lehana¹, Prem C. Pandey², Santosh S. Pratapwar², Rockey Gupta¹

¹University of Jammu, India
²IIT Bombay, India

IIT Bombay, lehana@iitb.ac.in

ICSCI 2004, Hyderabad, India, 12-15 Feb '04


ABSTRACT

An artificial larynx is an assistive device that provides excitation to the vocal tract as a substitute for a dysfunctional or removed larynx. The speech generated with an electrolarynx, an external vibrator held against the neck tissue, does not sound natural and is unintelligible most of the time because of the improper shape of the excitation pulses and the background noise caused by sound leakage from the vibrator. The objective of this paper is to enhance the intelligibility of electrolaryngeal speech by reducing the background noise using the harmonic plus noise model (HNM). The alaryngeal speech and the leakage signal are analyzed using HNM, and the average harmonic spectrum of the leakage noise is subtracted from the harmonic magnitude spectrum of the noisy speech in each frame. HNM synthesis is carried out retaining the original phase spectra. Investigations show that the output is more natural and intelligible than both the input speech signal and the enhanced signal obtained from spectral subtraction without HNM analysis and synthesis.


PRESENTATION OVERVIEW

• Introduction
• HNM Analysis / synthesis
• Spectral subtraction with HNM
• Methodology
• Results
• Conclusion & future plan


INTRODUCTION (1/5)

NATURAL SPEECH PRODUCTION

Glottal excitation to vocal tract


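INTRODUCTION (2/5)

If the excitation and the vocal tract transfer function are

$$e(t) = \sum_{k=1}^{K(t)} a_k(t)\,\exp\big(j\phi_k(t)\big)$$

$$H(f;t) = G(f;t)\,\exp\big[j\psi(f;t)\big]$$

then the output speech is

$$s(t) = \sum_{k=1}^{K(t)} a_k(t)\,G\big(f_k(t);t\big)\,e^{\,j\left(\phi_k(t) + \psi(f_k(t);t)\right)}$$

and can be simplified to

$$s(t) = \sum_{k=1}^{K(t)} A_k(t)\,e^{\,j\theta_k(t)}$$

where

$$A_k(t) = a_k(t)\,G\big(f_k(t);t\big) \quad \text{and} \quad \theta_k(t) = \phi_k(t) + \psi\big(f_k(t);t\big)$$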
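As a rough illustration of the simplified sum-of-sinusoids form of the output speech, the sketch below evaluates the real part of the sum for the stationary special case of constant amplitudes and linearly advancing phases. The function name `synthesize_sinusoids`, the sampling rate, and the example amplitudes are assumptions made for illustration; none of them come from the slides.

```python
import math

def synthesize_sinusoids(amps, freqs_hz, phases, fs, n_samples):
    """Real part of sum_k A_k * exp(j * theta_k(t)).

    Assumed special case: A_k is constant and
    theta_k(t) = 2*pi*f_k*t + phase_k (no time variation in f_k).
    """
    out = []
    for n in range(n_samples):
        t = n / fs
        sample = 0.0
        for a, f, p in zip(amps, freqs_hz, phases):
            sample += a * math.cos(2.0 * math.pi * f * t + p)
        out.append(sample)
    return out

# Hypothetical example: three harmonics of a 100 Hz excitation at fs = 8 kHz.
signal = synthesize_sinusoids([1.0, 0.5, 0.25], [100.0, 200.0, 300.0],
                              [0.0, 0.0, 0.0], fs=8000, n_samples=160)
```

With time-varying amplitudes and frequencies, the same loop would take per-sample values and accumulate each phase instead of computing it in closed form.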


INTRODUCTION (3/5)

External electronic larynx (transcervical electrolarynx)

Excitation to vocal tract from external vibrator (creates background noise)


INTRODUCTION (4/5)

External electronic larynx (transcervical electrolarynx)

Leakage path:
- back side of membrane/plate
- improper tissue coupling


INTRODUCTION (5/5)

RESEARCH OBJECTIVE

The objective of this paper is to enhance the intelligibility of electrolaryngeal speech by reducing the background noise using the harmonic plus noise model (HNM).


HNM ANALYSIS / SYNTHESIS (1/3)

HARMONIC PLUS NOISE MODEL (Stylianou, 1995; 2001)

Speech signal divided into:
• harmonic part
• noise part

Harmonic part

$$s(t) = \mathrm{Re} \sum_{l=0}^{L(t)} a_l(t)\,\exp\left\{ j\left[ \int_0^t l\,\omega_0(\sigma)\,d\sigma + \theta_l \right] \right\}$$

Noise part

$$n(t) = w(t)\,\big[h(\tau;t) * b(t)\big]$$

Parameters:
• Max. voiced frequency
• V/UV & pitch
• Harm. ampl. & phases
• Noise parameters
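To make the two parts of the model concrete, here is a minimal synthesis sketch in Python. It assumes the harmonic part is a sum of cosines whose phase is the running integral of the fundamental frequency, and that the noise part is white Gaussian noise passed through a filter and scaled by an envelope. The fixed one-pole filter, the function name `hnm_synthesize`, and all numeric values are assumptions chosen for illustration; the slides do not specify them.

```python
import math
import random

def hnm_synthesize(f0_track, harm_amps, harm_phases, noise_env, ar_coeff, fs, seed=0):
    """Sketch of harmonic part + noise part for one voiced stretch.

    Harmonic part: sum over l of a_l * cos(l * phi(t) + theta_l), with l
    starting at 0 and phi(t) the running integral of the fundamental (rad).
    Noise part: w(t) * (h * b)(t), where b is white Gaussian noise and h is
    an ASSUMED fixed one-pole filter y[n] = b[n] + ar_coeff * y[n-1].
    """
    rng = random.Random(seed)
    phi = 0.0      # integral of omega_0 up to the current sample
    y_prev = 0.0   # filter state for the noise branch
    harmonic, noise = [], []
    for n, f0 in enumerate(f0_track):
        h = sum(a * math.cos(l * phi + th)
                for l, (a, th) in enumerate(zip(harm_amps, harm_phases)))
        harmonic.append(h)
        y = rng.gauss(0.0, 1.0) + ar_coeff * y_prev
        y_prev = y
        noise.append(noise_env[n] * y)
        phi += 2.0 * math.pi * f0 / fs   # advance the phase integral
    total = [h + v for h, v in zip(harmonic, noise)]
    return total, harmonic, noise

# Hypothetical example: constant 100 Hz pitch, DC term zero, two harmonics.
n_samples = 160
total, harmonic, noise = hnm_synthesize(
    [100.0] * n_samples, [0.0, 1.0, 0.5], [0.0, 0.0, 0.0],
    [0.05] * n_samples, ar_coeff=0.5, fs=8000)
```

A full HNM synthesizer would also limit the harmonics to the maximum voiced frequency and update the filter from frame to frame; both are left out here to keep the sketch short.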


ANALYSIS / SYNTHESIS WITH HNM (2/3)

ANALYSIS


ANALYSIS / SYNTHESIS WITH HNM (3/3)

SYNTHESIS


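SPECTRAL SUBTRACTION WITH HNM

The noisy speech is the excitation $e(n)$ passed through the vocal tract response $h_v(n)$, giving the speech component $s(n)$, plus the same excitation passed through the leakage path response $h_l(n)$:

$$x(n) = e(n) * h_v(n) + e(n) * h_l(n)$$

Taking DFT:

$$X_n(e^{j\omega}) = E_n(e^{j\omega})\,\big[H_{vn}(e^{j\omega}) + H_{ln}(e^{j\omega})\big]$$

Assumption: $h_v(n)$ & $h_l(n)$ uncorrelated

$$|X_n(e^{j\omega})|^2 = |E_n(e^{j\omega})|^2\,\big[|H_{vn}(e^{j\omega})|^2 + |H_{ln}(e^{j\omega})|^2\big]$$

During non-speech segment: $s(n) = 0$

$$|X_n(e^{j\omega})|^2 = |L_n(e^{j\omega})|^2 = |E_n(e^{j\omega})|^2\,|H_{ln}(e^{j\omega})|^2$$

$|L(e^{j\omega})|^2$: averaged over many segments

$$|Y_n(k)|^{\gamma} = |X_n(k)|^{\gamma} - \alpha\,|L(k)|^{\gamma}$$

$$|Y'_n(k)|^{\gamma} = \begin{cases} |Y_n(k)|^{\gamma} & \text{if } |Y_n(k)|^{\gamma} > \beta\,|L(k)|^{\gamma} \\ \beta\,|L(k)|^{\gamma} & \text{otherwise} \end{cases}$$

($\alpha$: subtraction, $\beta$: spectral floor, $\gamma$: exp. factors)

Here n is frame index and k is harmonic index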
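The subtraction rule with its spectral floor can be sketched for a single frame as follows. This is a minimal illustration under one assumed reading of the rule: the subtraction factor scales the averaged leakage magnitude, the exponent factor is applied to both magnitudes, and the floor is the spectral floor factor times the leakage term. The function name `spectral_subtract` and the argument names `alpha`, `beta`, and `gamma` are labels chosen here for those three factors.

```python
def spectral_subtract(x_mag, l_mag, alpha=1.0, beta=0.1, gamma=1.0):
    """Per-harmonic subtraction for one frame.

    x_mag: harmonic magnitudes of the noisy frame, indexed by k.
    l_mag: averaged leakage magnitudes at the same harmonics.
    Computes x^gamma - alpha * l^gamma, replaces any value that does not
    exceed beta * l^gamma by that floor, and returns magnitudes
    (the gamma-th root of the result).
    """
    out = []
    for x, l in zip(x_mag, l_mag):
        y_pow = x ** gamma - alpha * l ** gamma
        floor = beta * l ** gamma
        if y_pow <= floor:
            y_pow = floor   # spectral floor avoids negative magnitudes
        out.append(y_pow ** (1.0 / gamma))
    return out

# Hypothetical magnitudes: the third harmonic falls below the leakage level
# and is therefore set to the floor value.
enhanced = spectral_subtract([1.0, 0.5, 0.2], [0.3, 0.3, 0.3])
```

Larger subtraction factors remove more leakage at the cost of distorting weak harmonics, which is why the factors have to be chosen empirically.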


METHODOLOGY

STEPS FOR HNM BASED SPECTRAL SUBTRACTION

• Non speech segments analyzed
• Average harmonic spectrum obtained
• Noisy speech analyzed and average harmonic spectrum of noise subtracted
• Resynthesis with noisy speech phase spectra

For comparison, spectral subtraction using DFT derived magnitude is also carried out.
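The HNM based steps can be sketched end to end in a few functions. This is a simplified illustration, not the authors' implementation. It assumes a known, constant vibrator pitch `f0`, frames that span a whole number of pitch periods, harmonic estimation by projection onto complex exponentials (standing in for full HNM analysis), an exponent factor of 1, and no noise part. All function names and numeric values are hypothetical.

```python
import cmath
import math

def analyze_frame(frame, f0, fs, n_harm):
    """Estimate harmonic amplitudes and phases by projecting the frame onto
    complex exponentials at l * f0 (frame must span whole pitch periods)."""
    n_len = len(frame)
    amps, phases = [], []
    for l in range(1, n_harm + 1):
        w = 2.0 * math.pi * l * f0 / fs
        c = sum(x * cmath.exp(-1j * w * n) for n, x in enumerate(frame)) * 2.0 / n_len
        amps.append(abs(c))
        phases.append(cmath.phase(c))
    return amps, phases

def average_leakage(frames, f0, fs, n_harm):
    """Average harmonic magnitude spectrum over non-speech (leakage-only) frames."""
    sums = [0.0] * n_harm
    for frame in frames:
        amps, _ = analyze_frame(frame, f0, fs, n_harm)
        sums = [s + a for s, a in zip(sums, amps)]
    return [s / len(frames) for s in sums]

def enhance_frame(frame, leak_avg, f0, fs, alpha=1.0, beta=0.1):
    """Subtract the averaged leakage magnitudes, apply the spectral floor, and
    resynthesize the harmonic part with the noisy frame's own phases."""
    amps, phases = analyze_frame(frame, f0, fs, len(leak_avg))
    clean = [max(a - alpha * l, beta * l) for a, l in zip(amps, leak_avg)]
    out = []
    for n in range(len(frame)):
        out.append(sum(a * math.cos(2.0 * math.pi * (k + 1) * f0 * n / fs + p)
                       for k, (a, p) in enumerate(zip(clean, phases))))
    return out, clean

# Hypothetical check: a 100 Hz vibrator sampled at 8 kHz, 160-sample frames.
fs, f0, n_len = 8000, 100.0, 160
w0 = 2.0 * math.pi * f0 / fs
leak = [0.3 * math.cos(w0 * n) for n in range(n_len)]
noisy = [1.3 * math.cos(w0 * n) for n in range(n_len)]  # speech 1.0 + leakage 0.3, in phase
leak_avg = average_leakage([leak, leak], f0, fs, n_harm=2)
enhanced, clean_amps = enhance_frame(noisy, leak_avg, f0, fs)
```

In this toy case the leakage and speech components are in phase, so their magnitudes add and the subtraction recovers the speech amplitude. With real signals the phases differ, and magnitude subtraction is only an approximation.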


RESULTS (1/2)

• Both DFT derived and HNM based harmonic spectrum significantly reduce the background noise
• Both require empirical selection of the parameters
• DFT derived spectral subtraction more effective during non-speech
• HNM based spectral subtraction more effective during speech with less musical noise and enhanced formant structure
• Saving in parameters and processing time in HNM based spectral subtraction


RESULTS (2/2)

a) Recorded speech signal

b) Processed (DFT derived) (α = 2, β = 0.001, and γ = 1)

c) Processed (HNM derived) (α = 1, β = 0.1, and γ = 1)


CONCLUSION

The HNM based method provides effective subtraction of noise during speech and hence can be used for improving the intelligibility of electrolaryngeal speech.

FURTHER PLAN

• QBNE combined with HNM based spectral subtraction
• Phase resynthesis from enhanced magnitude spectrum
• Effect of artificial jitter in pitch on speech quality
