Communications & Multimedia Signal Processing
Speech Recognition in Noise
Esfandiar Zavarehei
Department of Electronic and Computer Engineering
Brunel University
25 May, 2004
Contents
• The use of formant features in speech recognition
- Variable-Order LP Formant Tracker with Kalman Filtering
- Results
• Kalman De-noising
- Tracking and Filtering the Frequency Trajectories (RASTA)
- How Kalman Filter is applied to de-noising problem
- Advantages of Kalman
Variable-Order LP Formant tracker
[Block diagram; block labels: Pre-Processed Speech, LP Model Pole Extraction, Rule-Based Refinement, LP Order Adjustment, Continuity Measurement, Track History, Kalman Filter, Formant Track]
• Higher order of LP modelling for higher resolution
• Continuity criteria for better classification
• Kalman Filtering for smoother Tracks
Formant Feature (FF) Vectors
• In addition to the frequencies of the poles, their bandwidths and magnitudes are also used
• The HMM models are trained on mono-phones
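As a rough illustration of how pole-based features can be obtained, the sketch below maps z-plane LP poles to frequency and bandwidth using the standard relations f = angle · fs / 2π and B = −ln|z| · fs / π. The function names are hypothetical, and because the slides do not define "magnitude", the pole radius is reported as a stand-in; the actual feature definition may differ.

```python
import cmath
import math


def pole_to_formant(pole, sample_rate):
    """Map one z-plane LP pole to (frequency_hz, bandwidth_hz, radius).

    Hypothetical helper: the radius is only a stand-in for the
    'magnitude' feature, which the slides do not define.
    """
    radius = abs(pole)
    frequency_hz = cmath.phase(pole) * sample_rate / (2.0 * math.pi)
    bandwidth_hz = -math.log(radius) * sample_rate / math.pi
    return frequency_hz, bandwidth_hz, radius


def formant_candidates(poles, sample_rate):
    """Keep one pole of each conjugate pair and sort by frequency."""
    features = [pole_to_formant(p, sample_rate) for p in poles if p.imag > 0]
    return sorted(features)
```

For example, a conjugate pole pair at radius 0.95 and angle 2π·500/8000 yields a single candidate at 500 Hz when the sampling rate is 8 kHz.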
FF vs. MFCC with and without energy component
Mono-phone recognition in Train noise
• FF performs better than MFCC in severely noisy conditions
Robustness of dynamic FF to noise
Mono-phone recognition in Train noise
•Dynamic Features are much more robust to noise
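The slides do not say how the dynamic features are computed. A common choice, assumed here, is the regression-based delta used in many HMM front ends, with edge frames replicated. The function name and the window length of 2 are illustrative only.

```python
def delta_features(frames, window=2):
    """Regression-based delta (dynamic) features.

    frames: list of equal-length feature vectors, one per frame.
    Assumed formula: d_t = sum_n n*(c[t+n] - c[t-n]) / (2*sum_n n^2),
    with the first and last frames replicated at the edges.
    """
    num_frames = len(frames)
    norm = 2.0 * sum(n * n for n in range(1, window + 1))
    deltas = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / norm)
        deltas.append(row)
    return deltas
```

Because each delta is a difference between neighbouring frames, any offset that is constant over the window cancels out, which is one plausible reason for the robustness reported on this slide.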
The use of the Formants for consonant recognition
Mono-phone recognition in Train noise
• Higher recognition rates than vowels at higher SNR
• More sensitive to noise because of the lower energy level
De-noising the speech by filtering frequency trajectories
RelAtive SpecTrA (RASTA) Processing
• Filtering the frequency trajectories of the cubic root of the power spectrum using a fixed IIR filter:

$$H(z) = 0.1\,z^{4}\,\frac{2 + z^{-1} - z^{-3} - 2z^{-4}}{1 - 0.98\,z^{-1}}$$
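A minimal sketch of such a fixed filter, assuming the commonly quoted RASTA band-pass coefficients and the causal form y[n] = 0.98·y[n−1] + 0.1·(2x[n] + x[n−1] − x[n−3] − 2x[n−4]). Dropping the non-causal z⁴ advance only delays the output by four frames. The function names are hypothetical.

```python
def compress(power):
    """Cubic-root compression of one power-spectrum value."""
    return power ** (1.0 / 3.0)


def rasta_filter(trajectory, pole=0.98):
    """Band-pass filter one frequency-bin trajectory over time (frames).

    Causal form of the assumed RASTA filter; the z^4 advance is omitted,
    so the output lags the non-causal version by four frames.
    """
    numerator = (0.2, 0.1, 0.0, -0.1, -0.2)  # 0.1 * (2, 1, 0, -1, -2)
    output = []
    previous = 0.0
    for n in range(len(trajectory)):
        acc = pole * previous
        for k, b in enumerate(numerator):
            if n - k >= 0:
                acc += b * trajectory[n - k]
        output.append(acc)
        previous = acc
    return output
```

The numerator taps sum to zero, so a component that is constant over time decays to zero at the output, while very fast frame-to-frame fluctuations are smoothed by the pole at 0.98.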
The use of FIR filters in RASTA
• Filtering the frequency trajectories of the power spectrum using a bank of non-causal FIR filters
• Not adaptive
• Experimentally derived
Filters’ Impulse Response
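To make "non-causal" concrete, the sketch below applies an odd-length FIR filter centred on the current frame, so each output uses both past and future frames of the trajectory. The experimentally derived taps from the slides are not available, so any taps passed in are placeholders; the function name and the edge handling (replicating the first and last frames) are assumptions.

```python
def noncausal_fir(trajectory, taps):
    """Filter a trajectory with an odd-length FIR filter centred on frame n.

    y[n] = sum_k taps[k] * x[n + half - k], where half = len(taps) // 2.
    Frames beyond either end are replaced by the nearest edge frame.
    """
    if len(taps) % 2 != 1:
        raise ValueError("taps must have odd length to be centred")
    half = len(taps) // 2
    last = len(trajectory) - 1
    output = []
    for n in range(len(trajectory)):
        acc = 0.0
        for k, h in enumerate(taps):
            idx = min(max(n + half - k, 0), last)
            acc += h * trajectory[idx]
        output.append(acc)
    return output
```

In a filter bank, each frequency bin would get its own tap set, which is where the fixed, non-adaptive character noted above comes from.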
Kalman Filtering
• The Kalman filter adaptively updates its gain according to the noise covariance
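A minimal scalar sketch of this adaptation, assuming a first-order autoregressive model for the trajectory (x[t] = a·x[t−1] + w) and an observation y[t] = x[t] + v with a time-varying noise variance. The model order, the parameter values and the function name are illustrative assumptions, not the configuration used in the slides.

```python
def kalman_filter_trajectory(observations, noise_vars, a=0.9,
                             process_var=1.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter over one trajectory.

    observations: noisy trajectory values y[t].
    noise_vars:   observation-noise variance r[t] for each frame.
    a, process_var, x0, p0: assumed AR(1) model and initial state.
    """
    estimate, error_var = x0, p0
    estimates = []
    for y, r in zip(observations, noise_vars):
        # Predict from the previous estimate.
        predicted = a * estimate
        predicted_var = a * a * error_var + process_var
        # The gain shrinks as the noise variance grows.
        gain = predicted_var / (predicted_var + r)
        # Correct the prediction with the new observation.
        estimate = predicted + gain * (y - predicted)
        error_var = (1.0 - gain) * predicted_var
        estimates.append(estimate)
    return estimates
```

When the noise variance for a frame is small the gain approaches 1 and the output follows the observation; when it is large the gain approaches 0 and the output follows the prediction.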
How Kalman Filter is applied to de-noising problem
[Block diagram; block labels: Segment Frequency Bin Trajectory, VAD, Noise Modelling, Prior Noise Model and Trajectory Statistics, Spectral Subtraction, Observation, Predictor, Predicted, Error covariance, Noise Covariance, Mean, Kalman Gain, Estimator, Output, Kalman Filtering, Neighbour Trajectory, Noise Modelling and updating]
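The VAD, noise-modelling and spectral-subtraction stages named in the diagram could be sketched as follows for a single frequency bin. This is only one plausible reading: the recursive averaging, the spectral floor, the parameter values and the function name are all assumptions, and the voice-activity flags are taken as given rather than computed.

```python
def spectral_subtraction(power_trajectory, speech_flags,
                         smoothing=0.9, floor=0.01):
    """Subtract a running noise estimate from one power trajectory.

    power_trajectory: noisy power values of one frequency bin per frame.
    speech_flags:     VAD decision per frame (True means speech present).
    Returns (cleaned, noise_track); both lists have one value per frame.
    """
    noise_est = None
    cleaned, noise_track = [], []
    for power, is_speech in zip(power_trajectory, speech_flags):
        if not is_speech:
            # Update the noise model only in frames the VAD marks as noise.
            if noise_est is None:
                noise_est = power
            else:
                noise_est = smoothing * noise_est + (1.0 - smoothing) * power
        current_noise = 0.0 if noise_est is None else noise_est
        # Floor the result so the estimated power never goes negative.
        cleaned.append(max(power - current_noise, floor * power))
        noise_track.append(current_noise)
    return cleaned, noise_track
```

In the arrangement the diagram suggests, the cleaned values would serve as the observation and the noise track would inform the noise covariance used by the Kalman filtering stage.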
Advantages of Kalman
• A more informed noise reduction
• Combining the prediction and the observation of the frequency trajectory
• Adaptively updating the noise model while filtering the trajectory (in comparison with RASTA)
• Could (and probably should) be combined with spectral subtraction for improved performance