
Page 1: Advances in WP2

Advances in WP2

Nancy Meeting – 6-7 July 2006

www.loquendo.com

Page 2: Advances in WP2


Recent Work on NN Adaptation in WP2

• State of the art LIN adaptation method implemented and experimented on the benchmarks (m12)

• Innovative LHN adaptation method implemented and experimented on the benchmarks (m21)

• Experimental results on benchmark corpora and Hiwire database with LIN and LHN (m21)

• Further advances on new adaptation methods (m24)

Page 3: Advances in WP2

LIN Adaptation

[Diagram: the speaker-independent MLP (SI-MLP) maps the speech signal parameters through the input layer, a 1st and 2nd hidden layer, and the output layer to the emission probabilities of the acoustic-phonetic units. The LIN is a linear layer inserted before the input layer of the frozen SI-MLP and trained on the adaptation data.]

Page 4: Advances in WP2

LHN Adaptation

[Diagram: the same speaker-independent MLP (SI-MLP), mapping speech signal parameters to the emission probabilities of the acoustic-phonetic units. The LHN is a linear layer inserted after a hidden layer of the frozen SI-MLP and trained on the adaptation data.]
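The two insertion points can be sketched in a minimal numpy forward pass. This is an illustration, not the Loquendo implementation: layer sizes, weights, and the sigmoid activation are arbitrary stand-ins. Both adaptation layers are initialized to the identity matrix, so before training the adapted network reproduces the SI-MLP output exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h1, d_h2, d_out = 13, 32, 32, 10   # illustrative sizes

# frozen speaker-independent MLP weights (random stand-ins here)
W1 = rng.normal(size=(d_in, d_h1))
W2 = rng.normal(size=(d_h1, d_h2))
W3 = rng.normal(size=(d_h2, d_out))

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, A_lin=None, A_lhn=None):
    if A_lin is not None:          # LIN: linear transform before the input layer
        x = x @ A_lin
    h1 = sigmoid(x @ W1)
    if A_lhn is not None:          # LHN: linear transform after the 1st hidden layer
        h1 = h1 @ A_lhn
    h2 = sigmoid(h1 @ W2)
    return h2 @ W3                 # pre-softmax emission scores

x = rng.normal(size=(1, d_in))
base = forward(x)
# identity initialization leaves the SI-MLP output unchanged
assert np.allclose(forward(x, A_lin=np.eye(d_in)), base)
assert np.allclose(forward(x, A_lhn=np.eye(d_h1)), base)
```

During adaptation only the inserted matrix (A_lin or A_lhn) is updated by backpropagation, while W1, W2, W3 stay frozen.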

Page 5: Advances in WP2

Results Summary (W.E.R.)

Test set                      baseline   LIN adapted   E.R.    LHN adapted   E.R.
WSJ0 16kHz bigr LM              10.5        9.4       10.5%       8.4       20.0%
WSJ1 Spoke-3 16kHz bigr LM      54.2       46.5       14.2%      30.6       43.5%
HIWIRE 8kHz                     11.6        7.8       32.1%       7.2       37.9%

(E.R. = relative error reduction with respect to the baseline)
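The E.R. columns are relative error reductions over the baseline W.E.R.; a small helper reproduces, for example, the WSJ0 and WSJ1 figures from the table above:

```python
def error_reduction(baseline_wer, adapted_wer):
    """Relative W.E.R. reduction (in %) of the adapted model over the baseline."""
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer

# WSJ0 16kHz bigr LM: LIN (9.4) and LHN (8.4) vs. baseline 10.5
print(round(error_reduction(10.5, 9.4), 1))   # → 10.5
print(round(error_reduction(10.5, 8.4), 1))   # → 20.0

# WSJ1 Spoke-3: LHN (30.6) vs. baseline 54.2
print(round(error_reduction(54.2, 30.6), 1))  # → 43.5
```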

Page 6: Advances in WP2

Papers presented:

• Roberto Gemello, Franco Mana, Stefano Scanzio, Pietro Laface, Renato De Mori, “Adaptation of Hybrid ANN/HMM models using hidden linear transformations and conservative training”, Proc. of ICASSP 2006, Toulouse, France, May 2006

• Dario Albesano, Roberto Gemello, Pietro Laface, Franco Mana, Stefano Scanzio, “Adaptation of Artificial Neural Networks Avoiding Catastrophic Forgetting”, Proc. of IJCNN 2006, Vancouver, Canada, July 2006

Page 7: Advances in WP2

The “Forgetting” problem in ANN Adaptation

• It is well known in connectionist learning that acquiring new information during adaptation can damage previously learned information (catastrophic forgetting)

• This effect must be taken into account when adapting an ANN with a limited amount of data, which does not include enough samples for all the classes

• The “absent” classes may be forgotten during adaptation, since discriminative training (error backpropagation) always assigns zero targets to absent classes

Page 8: Advances in WP2

“Forgetting” in ANN for ASR

• When adapting an ASR ANN/HMM model, this problem can arise when the adaptation set contains no examples of some phonemes, due to the limited amount of adaptation data or the limited vocabulary

• ANN training is discriminative, contrary to that of GMM-HMMs, and absent phonemes are penalized by assigning them a zero target during adaptation

• This induces in the ANN a forgetting of the capability to classify the absent phonemes. Thus, while the HMM models for phonemes with no observations remain un-adapted, the ANN output units corresponding to those phonemes lose their characterization, rather than simply staying not adapted
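The zero-target effect can be seen directly in the gradient of softmax cross-entropy. A toy numpy sketch (the logits and class indices are made up for illustration; the actual networks in this work are larger):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# toy logits for 5 phoneme classes; class 2 is the correct phoneme observed
# in the adaptation data, classes 0 and 4 are absent from the adaptation set
z = np.array([1.0, 0.2, 2.0, 0.1, 0.8])
t = np.zeros(5)
t[2] = 1.0                 # standard policy: zero target for absent classes

p = softmax(z)
grad = p - t               # d(cross-entropy)/d(logits)

# the gradient on the absent classes is positive, so every update step
# lowers their logits and erodes their classification
assert grad[0] > 0 and grad[4] > 0 and grad[2] < 0
```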

Page 9: Advances in WP2

Example of Forgetting

Adaptation examples only of E, U, O (e.g. from the words: uno, due, tre); no examples for the other vowels (A, I, ə). The classes with examples adapt themselves, but tend to invade the classes with no examples, which are partially “forgotten”.

[Figure: two F1–F2 formant-space plots (axes in kHz) of the vowel classes I, E, e, A, U, O, before and after adaptation; after adaptation the classes with examples (E, U, O) have expanded into the regions of the absent vowels.]

Page 10: Advances in WP2

“Conservative” Training

• We have introduced “conservative training” to avoid the forgetting of absent phonemes

• The idea is to avoid a zero target for the absent phonemes, using for them the output of the original NN as target

Let F_P be the set of phonemes present in the adaptation set and F_A the set of absent ones. The targets are assigned according to the following equations:

Standard policy:

  TARGET(f_i) = 0                                          if f_i ∈ F_A
  TARGET(f_i) = 1                                          if f_i ∈ F_P and f_i = correct phoneme
  TARGET(f_i) = 0                                          if f_i ∈ F_P and f_i ≠ correct phoneme

Conservative policy:

  TARGET(f_i) = OUTPUT_ORIGINAL-NN(f_i)                    if f_i ∈ F_A
  TARGET(f_i) = 1 − Σ_{f_j ∈ F_A} OUTPUT_ORIGINAL-NN(f_j)  if f_i ∈ F_P and f_i = correct phoneme
  TARGET(f_i) = 0                                          if f_i ∈ F_P and f_i ≠ correct phoneme
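The two policies can be sketched in numpy. The posterior vector below is made up to reproduce the slide's illustrative values (0.03 and 0.02 for the absent classes), not real model output:

```python
import numpy as np

def standard_targets(n_classes, correct):
    """Standard policy: zero everywhere, one on the correct class."""
    t = np.zeros(n_classes)
    t[correct] = 1.0
    return t

def conservative_targets(orig_posteriors, correct, absent):
    """Conservative policy: absent classes keep the original NN's posterior;
    the correct class gets 1 minus the mass of the absent classes; other
    present classes get zero."""
    t = np.zeros_like(orig_posteriors)
    t[absent] = orig_posteriors[absent]
    t[correct] = 1.0 - orig_posteriors[absent].sum()
    return t

# classes [A1, P1, P2, P3, A2]: correct phoneme is P2 (index 2),
# A1 and A2 (indices 0 and 4) are absent from the adaptation set
post = np.array([0.03, 0.10, 0.80, 0.03, 0.02])   # illustrative posteriors
print(np.round(conservative_targets(post, correct=2, absent=[0, 4]), 2))
# conservative targets: 0.03, 0.00, 0.95, 0.00, 0.02
print(standard_targets(5, correct=2))
# standard targets:     0.00, 0.00, 1.00, 0.00, 0.00
```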

Page 11: Advances in WP2

Conservative Training target assignment policy

Class                     A1     P1     P2     P3     A2
Conservative targets     0.03   0.00   0.95   0.00   0.02
Standard targets         0.00   0.00   1.00   0.00   0.00

(Px: class present in the adaptation set; Ax: absent class; P2 is the class corresponding to the correct phoneme. In the conservative row, the absent classes take the posterior probability computed using the original network.)

Page 12: Advances in WP2

“Conservative” Training

• In this way, the phonemes that are absent from the adaptation set are “represented” by the response given by the original NN

• Thus, the absent phonemes are not “absorbed” by the neighboring present phonemes

• The results of adaptation with conservative training are:
  – Comparable performance on the target environment
  – Preservation of performance on the generalist environment
  – Great improvement of performance in speaker adaptation, when only a few sentences are available

Page 13: Advances in WP2

Adaptation tasks

– Application data adaptation: Directory Assistance
  • 9325 Italian city names
  • 53713 training + 3917 test utterances

– Vocabulary adaptation: Command words
  • 30 command words
  • 6189 training + 3094 test utterances

– Channel-Environment adaptation: Aurora-3
  • 2951 training + 654 test utterances

Page 14: Advances in WP2

Adaptation Results on different tasks (% WER)

Adaptation      Application            Vocabulary       Channel-Environment
Method          Directory Assistance   Command Words    Aurora-3 CH1
No adaptation        14.6                   3.8               24.0
LIN                  11.2                   3.4               11.0
LIN + CT             12.4                   3.4               15.3
LHN                   9.6                   2.1                9.8
LHN + CT             10.1                   2.3               10.4

Page 15: Advances in WP2

Mitigation of Catastrophic Forgetting using Conservative Training

Tests using adapted models on Italian continuous speech (% WER):

                Models adapted on:
Adaptation      Application            Vocabulary       Channel-Environment
Method          Directory Assistance   Command Words    Aurora-3 CH1
LIN                  36.3                  42.7              108.6
LIN + CT             36.5                  35.2               42.1
LHN                  40.6                  63.7              152.1
LHN + CT             40.7                  45.3               44.2

No Adaptation (unadapted baseline): 29.3

Page 16: Advances in WP2

Conclusions

– The new LHN adaptation method, developed within the project, outperforms standard LIN adaptation

– In adaptation tasks with missing classes, Conservative Training reduces the catastrophic forgetting effect, preserving performance on a generic task

Page 17: Advances in WP2


Workplan

• Selection of suitable benchmark databases (m6)

• Baseline set-up for the selected databases (m8)

• LIN adaptation method implemented and experimented on the benchmarks (m12)

• Experimental results on Hiwire database with LIN (m18)

• Innovative NN adaptation methods and algorithms for acoustic modeling and experimental results (m21)

• Further advances on new adaptation methods (m24)

• Unsupervised Adaptation: algorithms and experimentation (m33)