MAXENT 2007 - R. F. Astudillo, D. Kolossa and R. Orglmeister

PROPAGATION OF STATISTICAL INFORMATION THROUGH NON-LINEAR FEATURE EXTRACTIONS FOR ROBUST SPEECH RECOGNITION

Overview:

1. Introduction: Automatic speech recognition.
2. Problem: Imperfect noise suppression.
3. Proposed solution: Uncertainty propagation.
4. Tests & results.
5. Conclusions.

R. F. Astudillo, D. Kolossa and R. Orglmeister - TU-Berlin

Automatic Speech Recognizer (ASR)

• Feature extraction transforms signal into a domain more suitable for recognition.

• The speech recognizer models abstract speech components such as phonemes or triphones and generates a transcription.

• Most speech recognition applications need noise-suppression preprocessing.

Feature Extraction

• Non-linear transformations that imitate the way humans process speech.

• Robust against inter-speaker and intra-speaker variability.

• Mel-cepstral and RASTA-PLP transformations.

Speech Recognition

• Statistical models are used to model speech.

• Hidden Markov models with mixtures of multivariate Gaussians for the emitting states.

Example: Mel-cepstral features

Noise Suppression

• MMSE-LSA Bayesian estimation [Ephraim1985] is one of the most widely used methods.

• Leaves residual noise.

• Introduces artifacts in speech.

• Most methods obtain an estimate of the short-time spectrum (STFT) of the clean signal.

Problem: Imperfect estimation.

Solution: Modeling Uncertainty of Estimation

We model each element of the STFT as a complex Gaussian distribution.

• Mean set equal to the estimated clean value.

• The variance parameter controls the uncertainty.
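
For reference, a circularly symmetric complex Gaussian with these two parameters can be written as below. The symbols are assumptions made for illustration, not taken from the slide: X is one STFT element, \hat{X} its estimated clean value (the mean), and \lambda the variance that carries the uncertainty.

```latex
p(X) = \frac{1}{\pi \lambda}
       \exp\!\left( -\frac{\lvert X - \hat{X} \rvert^{2}}{\lambda} \right)
```

With \lambda close to zero the density concentrates on the estimate \hat{X}; a larger \lambda expresses less trust in the noise-suppression output.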

Propagation of Uncertainty

• We propagate first- and second-order moments of the distributions.

• Correlation between features appears (covariance).

• Resulting uncertainty is combined with statistical model parameters for robust speech recognition.

Approaches to Uncertainty Propagation

Analytic solutions: imply complex calculations; specific to each transformation.

Pseudo-Monte Carlo: Unscented Transform [Julier1996]. Inefficient for a high number of dimensions (e.g. STFT with 256 dim./frame).

► Piecewise propagation: efficient combination of both methods. Valid for many feature extractions (e.g. MELSPEC, MFCC, RASTA-PLP).

Piecewise Uncertainty Propagation

Exemplified with the Mel-cepstral transformation:

1. Modulus extraction (non-linear).
2. Filterbank (linear).
3. Logarithm (non-linear).
4. Discrete cosine transform (linear).
5. Delta and acceleration coefficients (linear).
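
The chain of steps 1 to 4 can be sketched for a single frame as follows. This is a minimal illustration without any uncertainty, assuming NumPy; the names `dct_matrix`, `mel_cepstrum` and `toy_filterbank` are hypothetical, the filterbank values are toy numbers rather than a real Mel filterbank, and step 5 is left out because it needs several consecutive frames.

```python
import numpy as np

def dct_matrix(n_ceps, n_mel):
    """Unnormalized DCT-II basis: one row per cepstral coefficient."""
    k = np.arange(n_ceps)[:, None]
    m = np.arange(n_mel)[None, :]
    return np.cos(np.pi * k * (m + 0.5) / n_mel)

def mel_cepstrum(stft_frame, filterbank, n_ceps):
    """Map one complex STFT frame to cepstral coefficients (steps 1 to 4)."""
    modulus = np.abs(stft_frame)        # 1. modulus extraction (non-linear)
    mel = filterbank @ modulus          # 2. filterbank (linear)
    log_mel = np.log(mel)               # 3. logarithm (non-linear)
    return dct_matrix(n_ceps, mel.shape[0]) @ log_mel  # 4. DCT (linear)

# Toy example: 4 frequency bins and 2 non-overlapping filters (hypothetical values).
frame = np.array([3 + 4j, 3 - 4j, -5 + 0j, 0 + 5j])
toy_filterbank = np.array([[1.0, 1.0, 0.0, 0.0],
                           [0.0, 0.0, 1.0, 1.0]])
ceps = mel_cepstrum(frame, toy_filterbank, n_ceps=2)
```

The split into linear and non-linear stages is what makes a piecewise treatment possible: each stage can receive the propagation method that suits it.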

Propagation through Modulus

• By integrating out the phase of a complex Gaussian distribution we obtain the Rice distribution.

• Mean and variance can be calculated as:

where L is the Laguerre polynomial.
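
For reference, the standard moments of the Rice distribution are given below in assumed notation (not necessarily that of the slide): \hat{X} is the mean and \lambda the variance of the complex Gaussian, so each of the real and imaginary parts has variance \lambda/2; I_0 and I_1 are modified Bessel functions of the first kind.

```latex
\mathrm{E}\{\lvert X \rvert\}
  = \frac{\sqrt{\pi \lambda}}{2}\,
    L_{1/2}\!\left( -\frac{\lvert \hat{X} \rvert^{2}}{\lambda} \right),
\qquad
\mathrm{Var}\{\lvert X \rvert\}
  = \lambda + \lvert \hat{X} \rvert^{2} - \mathrm{E}\{\lvert X \rvert\}^{2},

L_{1/2}(x) = e^{x/2} \left[ (1 - x)\, I_{0}\!\left( -\tfrac{x}{2} \right)
             - x\, I_{1}\!\left( -\tfrac{x}{2} \right) \right]
```

As a sanity check, for \hat{X} = 0 the mean reduces to \sqrt{\pi \lambda}/2, the Rayleigh case.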

Propagation through filterbank

• Each filter output m is a weighted sum of frequency moduli.

• It can be expressed as a matrix multiplication.

• Mean and variance can be calculated as:
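
A sketch of the linear propagation rule in assumed notation: W_{mk} is the weight of frequency bin k in filter m, and \mu_k, \sigma_k^2 are the mean and variance of the k-th modulus. The variance expression assumes the moduli of different bins are uncorrelated.

```latex
\mu_{m} = \sum_{k} W_{mk}\, \mu_{k},
\qquad
\sigma_{m}^{2} = \sum_{k} W_{mk}^{2}\, \sigma_{k}^{2},
\qquad
\Sigma_{M} = W\, \Sigma_{\lvert X \rvert}\, W^{\mathsf{T}}
```

The matrix form on the right also yields the cross-covariances between filter outputs.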

Full Covariance and other linear transformations

• DCT, delta and acceleration can be computed similarly.

• Covariance after filterbank is no longer diagonal.

• Additional computation costs.
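
A minimal sketch of why the covariance stops being diagonal, assuming NumPy; `propagate_linear` and all numbers are hypothetical. Two overlapping filters share the middle frequency bin, so their outputs become correlated even though the input uncertainties are independent.

```python
import numpy as np

def propagate_linear(mean, cov, W):
    """Propagate mean and covariance through the linear map y = W x."""
    return W @ mean, W @ cov @ W.T

# Toy example: 3 frequency bins with independent uncertainties.
modulus_mean = np.array([1.0, 2.0, 3.0])
modulus_cov = np.diag([1.0, 2.0, 3.0])

# Two overlapping filters (hypothetical weights); both use bin 1.
W = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5]])

mel_mean, mel_cov = propagate_linear(modulus_mean, modulus_cov, W)
# mel_cov[0, 1] is non-zero because both filters weight the shared bin.
```

The same function applies to any other linear stage, such as the DCT or the delta and acceleration coefficients, by swapping in the corresponding matrix.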

Propagation through Logarithm

• Non-linear transformation.

• Distribution after filterbank is difficult to model.

• Covariance is not diagonal.

• Dimensionality of the Mel features is much smaller than that of the STFT features.

► Unscented transform can be applied efficiently.

Unscented Transform

• Only 2n+1 points must be propagated.

• Points on the √(n+κ)-th covariance contour and the mean.

• n = feature dimension.

• Example for n = 2

Unscented Transform II

• Mean and covariances are calculated by using weighted averages:

• The parameter κ allows higher moments of the distribution to be considered.
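
A minimal sketch of the unscented transform in its basic form from [Julier1996], applied to the logarithm and assuming NumPy. The function name `unscented_transform`, the use of a Cholesky factor as the matrix square root, and the toy numbers are illustrative choices, not details confirmed by the slides.

```python
import numpy as np

def unscented_transform(mean, cov, func, kappa=0.0):
    """Approximate mean and covariance of func(x) for x with given mean and cov."""
    n = mean.shape[0]
    # Matrix square root of (n + kappa) * cov; its columns are the point offsets.
    root = np.linalg.cholesky((n + kappa) * cov)
    # 2n + 1 sigma points: the mean plus symmetric offsets along each column.
    points = np.vstack([mean, mean + root.T, mean - root.T])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    # Propagate every point through the non-linearity.
    y = np.array([func(p) for p in points])
    # Weighted averages give the transformed mean and covariance.
    y_mean = weights @ y
    diff = y - y_mean
    y_cov = (weights[:, None] * diff).T @ diff
    return y_mean, y_cov

# Toy Mel-domain statistics with correlated features (hypothetical values).
mel_mean = np.array([10.0, 20.0])
mel_cov = np.array([[1.0, 0.5],
                    [0.5, 2.0]])
log_mean, log_cov = unscented_transform(mel_mean, mel_cov, np.log, kappa=0.0)
```

The cost grows with the number of sigma points, 2n+1, which is why the transform is affordable in the Mel domain but not for a 256-dimensional STFT frame. The sigma points must stay positive for the logarithm to be defined, which holds here because the spread is small relative to the means.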

Use of Uncertainty

• After propagation of uncertainty, missing feature techniques or uncertainty decoding may be applied.

• These techniques combine uncertainty and model information to ignore or reestimate noisy features.

[Figure: parameters of the state, feature axis f1]

Use of Uncertainty II

• Modified imputation [Kolossa2005] showed the best performance.

• It reestimates features for state q by maximizing the probability:

• Assuming multivariate Gaussian distributions for uncertainty and model:
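
Under this Gaussian assumption, the maximizer of the product of the two densities has a closed form, sketched below with NumPy. Whether this matches the exact expression shown on the slide is an assumption; the function name `modified_imputation` and its arguments are hypothetical: `x_mean`, `x_cov` describe the propagated feature and its uncertainty, `q_mean`, `q_cov` the Gaussian of state q.

```python
import numpy as np

def modified_imputation(x_mean, x_cov, q_mean, q_cov):
    """Maximizer of N(x; x_mean, x_cov) * N(x; q_mean, q_cov) over x."""
    # Equivalent to (x_cov^-1 + q_cov^-1)^-1 (x_cov^-1 x_mean + q_cov^-1 q_mean),
    # rewritten so that a zero feature uncertainty is handled without inversion.
    diff = x_mean - q_mean
    return q_mean + q_cov @ np.linalg.solve(q_cov + x_cov, diff)

# Toy example with equal uncertainty and model covariance (hypothetical values).
x_mean = np.array([2.0, 0.0])
q_mean = np.array([0.0, 0.0])
identity = np.eye(2)
x_hat = modified_imputation(x_mean, identity, q_mean, identity)
```

With no uncertainty the estimate stays at the observed feature, and with very large uncertainty it moves to the state mean, so unreliable features are effectively replaced by what the model expects.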

Recognition Tests: TI-DIGITS database

% correctly identified words

Test Type             Uncertainty   Wind noise        Street noise
                                    -15dB    5dB      -15dB    5dB
Clean Speech          ( )           98.76
Noisy                 ( )           28.44    87.94    22.87    92.43
MMSE-LSA              ( )           34.78    75.27    36.63    92.43
+Approx. uncertainty                46.68    88.72    22.72    94.90
+Ideal uncertainty                  51.93    94.28    48.53    96.45

• 200 files (20 different speakers).

• Best, second best results.

Conclusions

• Using uncertainty in the Mel-cepstral domain helps compensate for imperfect estimation during noise suppression.

• Piecewise uncertainty propagation is valid for multiple feature extractions.

• Better estimation of uncertainty should improve the results.

Thank You!

Some literature:

[Ephraim1985] Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean-square error log-spectral amplitude estimator," IEEE Transactions on Acoustics, Speech, and Signal Processing 33, 443–445 (1985).

[Julier1996] S. Julier and J. Uhlmann, "A general method for approximating nonlinear transformations of probability distributions," Tech. rep., University of Oxford, UK (1996).

[Kolossa2005] D. Kolossa, A. Klimas, and R. Orglmeister, "Separation and robust recognition of noisy, convolutive speech mixtures using time-frequency masking and missing data techniques," IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 82-85 (2005).
