Recent Advances in Speech Dereverberation
Dr.Ir. Emanuël Habets
In collaboration with Dr. Sharon Gannot and Dr. Israel Cohen
Department of Electrical Engineering, Technion - IIT
School of Electrical Engineering, Bar-Ilan University
IBM Speech Technologies Seminar 2008

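Outline
1 Introduction
2 Existing Speech Dereverberation Techniques
3 Proposed Speech Dereverberation Technique
4 Experimental Results
5 Summary and Future Research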
What is reverberation?
Reverberation is the process of multi-path propagation of a sound from its source to a receiver.
Audio Example:
Anechoic Speech.
Reverberant Speech.
example_clean.wav
Motivation for Speech Dereverberation
Signal degradation that is caused by reverberation and ambient noise can decrease the fidelity and intelligibility of speech and the recognition performance of automatic speech recognition systems.
Applications
Automotive: Hands-Free Car Phone Kits.
Health: Hearing Aids, Home-Care.
Mobile: Mobile Phones, Smartphones, PDAs, Mobile Multimedia Systems.
Problem Formulation
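Given the anechoic speech signal s(n) and the acoustic impulse response h(n), we can express the reverberant speech signal as
$$z(n) = \sum_{l=-\infty}^{n} s(l)\, h(n-l).$$
With additive ambient noise v(n), the microphone signal is
$$x(n) = z(n) + v(n).$$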
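The signal model above can be sketched numerically. This is a minimal illustration, not the speaker's code: the function name, the toy impulse response, and the white Gaussian noise are assumptions made for the example.

```python
import numpy as np

def reverberant_microphone_signal(s, h, noise_std=0.0, seed=0):
    """Return (z, x) with z = s convolved with h and x = z + v."""
    rng = np.random.default_rng(seed)
    # Causal convolution, truncated to the length of the source signal.
    z = np.convolve(s, h)[: len(s)]
    # Assumed noise model: white Gaussian noise with standard deviation noise_std.
    v = noise_std * rng.standard_normal(len(s))
    return z, z + v

# Toy data (assumed): a unit impulse stands in for speech,
# and h holds a direct path plus two weaker reflections.
s = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
h = np.array([1.0, 0.0, 0.5, 0.25])
z, x = reverberant_microphone_signal(s, h)
```

With a unit impulse as input, z simply reproduces h, which makes the multi-path structure visible.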
Ultimate Goal
Complete Dereverberation: Given the microphone signals, our objective is to estimate the anechoic speech signal s(n) up to an arbitrary scale and time delay.
Sufficient Goal
Partial Dereverberation: Given the microphone signals, our objective is to estimate a filtered version of the anechoic speech signal s(n). This filter should introduce less reverberation and spectral coloration than a reference acoustic channel.
Challenges
Source Signal:
Acoustic Channel:
Impulse response is very long, i.e., approximately fs · RT60 samples.
Impulse response is nonminimum-phase.
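To give a feel for the length estimate fs · RT60, the short calculation below uses assumed example values (8 kHz sampling, 0.5 s reverberation time). The helper name is hypothetical.

```python
def air_length_in_samples(fs_hz, rt60_s):
    """Approximate impulse response length as fs * RT60, rounded to whole samples."""
    return int(round(fs_hz * rt60_s))

# Assumed example values, not taken from the slides.
taps_8k = air_length_in_samples(8000, 0.5)
taps_16k = air_length_in_samples(16000, 1.0)
print(taps_8k, taps_16k)
```

Even a moderately reverberant room therefore implies thousands of filter coefficients per channel.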
Existing Speech Dereverberation Techniques
In the context of automatic speech/speaker recognition, dereverberation can be integrated into the recognizer.
Speech dereverberation can be performed in the feature domain or in the signal domain.
Class I: Reverberation Cancellation
Block diagram: s(t) → linear system L(t) → x(t) → inverse system L^{-1}(t) → estimate of s(t)
Two distinct approaches:
Estimate s(t) directly, or estimate the parameters of the signal model and the excitation signal, i.e., by treating the parameters of the system L(t) as nuisance parameters.
First, model the linear system L(t). Second, estimate the parameters of the system L(t). Finally, deconvolve x(t) with L^{-1}(t) to recover s(t).
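The final deconvolution step of the second approach can be sketched for a toy case. The example assumes the channel is known exactly, is a short causal FIR filter, and is minimum-phase, none of which holds for real acoustic impulse responses. The function names are hypothetical.

```python
import numpy as np

def apply_fir(h, s):
    """Filter s with the causal FIR channel h (output truncated to len(s))."""
    return np.convolve(s, h)[: len(s)]

def inverse_filter(h, x):
    """Recursively undo x = h * s for a causal FIR h with h[0] != 0.

    Numerically stable only when h is minimum-phase (assumed here).
    """
    s_hat = np.zeros(len(x))
    for n in range(len(x)):
        acc = x[n]
        # Subtract the contribution of already recovered past samples.
        for l in range(1, min(n, len(h) - 1) + 1):
            acc -= h[l] * s_hat[n - l]
        s_hat[n] = acc / h[0]
    return s_hat

rng = np.random.default_rng(0)
s = rng.standard_normal(50)
h = np.array([1.0, 0.5, 0.2])     # toy minimum-phase channel (assumed)
s_hat = inverse_filter(h, apply_fir(h, s))
```

For a nonminimum-phase channel the same recursion diverges, which is one reason exact inversion is hard in practice.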
Null-space of the spatial correlation matrix [Gannot, 2003].
Bayesian parameter estimation techniques to estimate the unknown parameters of the speech and the channel model [Hopgood, 2000].
Problems and Limitations
Insufficiently robust to small changes in the acoustic impulse response (AIR) [Radlovic, 2000].
Channels cannot be identified uniquely when they contain common zeros.
Observation noise causes severe problems.
Some methods require knowledge of the order of the unknown system.
Class II: Reverberation Suppression
Spatial Processing (e.g., delay and sum beamformer).
Linear Prediction Residual Enhancement [Gaubitch et al., 2004-2007].
Spectral Enhancement [Lebart, 2001; Habets, 2004-2007].
Problems and Limitations
A priori knowledge of the source and/or the channel is required.
Only partial dereverberation is possible.
Tendency to introduce speech distortions.
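As an illustration of the spatial processing mentioned above, the following is a minimal delay and sum beamformer sketch. It assumes the direct-path delays are known integer sample counts, which is the kind of a priori knowledge listed as a limitation. All names and data are made up for the example.

```python
import numpy as np

def delay_and_sum(mics, delays):
    """Align each channel by its known integer delay and average.

    mics:   array of shape (M, N), one row per microphone.
    delays: direct-path delay of each microphone in samples (assumed known).
    """
    num_mics, num_samples = mics.shape
    out = np.zeros(num_samples)
    for m in range(num_mics):
        d = delays[m]
        # Advance channel m by d samples so the direct paths line up.
        out[: num_samples - d] += mics[m, d:]
    return out / num_mics

# Toy data (assumed): the same signal arrives with different delays.
s = np.arange(1.0, 11.0)
delays = [0, 2, 3]
mics = np.zeros((3, 10))
for m, d in enumerate(delays):
    mics[m, d:] = s[: 10 - d]
out = delay_and_sum(mics, delays)
```

The aligned direct path adds coherently, while reflections arriving from other directions are only partially attenuated, consistent with the partial dereverberation noted above.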
Proposed Approach
Early reflections introduce spectral coloration.
Late reflections change the waveform’s temporal envelope as exponentially decaying tails are added at sound offsets.
Independent research has shown that speech fidelity and intelligibility are mainly degraded by late reverberation.
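Splitting the convolution at N_ℓ samples, where the late reverberation starts, the received microphone signal x(n) can be expressed as
$$x(n) = \underbrace{\sum_{l=n-N_\ell+1}^{n} s(l)\, h(n-l)}_{z_e(n)} + \underbrace{\sum_{l=-\infty}^{n-N_\ell} s(l)\, h(n-l)}_{z_\ell(n)} + v(n),$$
where z_e(n) is the early speech component and z_ℓ(n) is the late reverberant component. In the short-time Fourier transform (STFT) domain, with time frame index ℓ and frequency bin index k,
$$X(\ell, k) = Z_e(\ell, k) + Z_\ell(\ell, k) + V(\ell, k).$$
We aim at the suppression of late reverberation and noise, i.e., at the estimation of the early speech component Z_e(ℓ, k).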
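The decomposition into early and late components can be checked numerically. The sketch below assumes a known impulse response and a hypothetical split point n_late (in samples), and confirms that the two components add up to the full reverberant signal.

```python
import numpy as np

def split_early_late(s, h, n_late):
    """Return (z_e, z_l): the early and late reverberant components of s * h."""
    taps = np.arange(len(h))
    h_early = np.where(taps < n_late, h, 0.0)   # taps before the split point
    h_late = h - h_early                        # remaining taps
    z_e = np.convolve(s, h_early)[: len(s)]
    z_l = np.convolve(s, h_late)[: len(s)]
    return z_e, z_l

rng = np.random.default_rng(1)
s = rng.standard_normal(200)                    # stand-in for speech (assumed)
h = rng.standard_normal(40) * np.exp(-0.1 * np.arange(40))
z_e, z_l = split_early_late(s, h, n_late=10)
z_full = np.convolve(s, h)[: len(s)]
```

Because convolution is linear, z_e + z_l reproduces the full reverberant signal exactly.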
Polack’s Statistical Reverberation Model
Polack developed a time-domain model where an AIR is described as a realization of a non-stationary stochastic process:
$$h(n) = \begin{cases} b(n)\, e^{-\alpha n}, & n \geq 0; \\ 0, & \text{otherwise}, \end{cases}$$
where b(n) is a white zero-mean Gaussian stationary noise sequence and α is linked to the reverberation time T60 through
$$\alpha \triangleq \frac{3 \ln(10)}{T_{60}\, f_s}.$$
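A realization of this model is easy to draw. The sketch below uses assumed values (T60 = 0.5 s, fs = 8 kHz) and a hypothetical function name. It also checks that the expected energy envelope e^{-2αn} has dropped by 60 dB after T60 seconds, which is what the definition of α encodes.

```python
import numpy as np

def polack_air(t60_s, fs_hz, length, seed=0):
    """Draw one AIR realization h(n) = b(n) * exp(-alpha * n) for n >= 0."""
    alpha = 3.0 * np.log(10.0) / (t60_s * fs_hz)
    rng = np.random.default_rng(seed)
    n = np.arange(length)
    b = rng.standard_normal(length)             # white zero-mean Gaussian noise
    return b * np.exp(-alpha * n), alpha

t60_s, fs_hz = 0.5, 8000                        # assumed example values
h, alpha = polack_air(t60_s, fs_hz, length=4000)
# Decay of the expected energy envelope after T60 seconds, in dB.
decay_db = 10.0 * np.log10(np.exp(-2.0 * alpha * t60_s * fs_hz))
```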
Generalized Statistical Reverberation Model
To model the contribution of the direct path, the AIR h(n) is divided into two segments:
$$h(n) = \begin{cases} h_d(n), & 0 \leq n < N_r; \\ h_r(n) = b_r(n)\, e^{-\alpha n}, & n \geq N_r; \\ 0, & \text{otherwise}. \end{cases}$$
The value N_r is chosen such that h_d(n) contains the direct path and h_r(n) consists of all later reflections.
Late Reverberant Spectral Variance Estimation
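1 The spectral variance of the reverberant signal component z_r(n) is given by
$$\lambda_{z_r}(\ell, k) = e^{-2\alpha(k) R}\, (1 - \kappa(k))\, \lambda_{z_r}(\ell - 1, k) + \kappa(k)\, e^{-2\alpha(k) R}\, \lambda_z(\ell - 1, k),$$
where λ_z(ℓ, k) = E{|Z(ℓ, k)|^2}, R is the number of samples between successive frames, and κ is inversely proportional to the Direct to Reverberation Ratio.
2 The spectral variance of the late reverberant signal component z_ℓ(n) is given by
$$\lambda_{z_\ell}(\ell, k) = e^{-2\alpha(k)(N_\ell - R)}\, \lambda_{z_r}\!\left(\ell - \frac{N_\ell}{R} + 1,\, k\right).$$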
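The two-step estimator can be sketched for a single frequency bin. This is an illustrative reading of the recursion above, not the speaker's implementation: it assumes that n_late is a multiple of the frame shift hop, that λ_z is already available per frame, and that α and κ are known constants. All names are hypothetical.

```python
import numpy as np

def late_reverb_variance(lambda_z, alpha, kappa, hop, n_late):
    """Return (lam_r, lam_late) for one frequency bin.

    lambda_z: spectral variance of the reverberant signal, one value per frame.
    hop:      frame shift R in samples.
    n_late:   N_l in samples (assumed to be a multiple of hop).
    """
    num_frames = len(lambda_z)
    decay = np.exp(-2.0 * alpha * hop)
    lam_r = np.zeros(num_frames)
    # Step 1: first-order recursion for the reverberant component.
    for l in range(1, num_frames):
        lam_r[l] = decay * (1.0 - kappa) * lam_r[l - 1] + kappa * decay * lambda_z[l - 1]
    # Step 2: delay by N_l/R - 1 frames and attenuate over N_l - R samples.
    shift = n_late // hop - 1
    scale = np.exp(-2.0 * alpha * (n_late - hop))
    lam_late = np.zeros(num_frames)
    lam_late[shift:] = scale * lam_r[: num_frames - shift]
    return lam_r, lam_late

# Toy check (assumed values): energy in frame 0 only, kappa = 1.
lambda_z = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
lam_r, lam_late = late_reverb_variance(lambda_z, alpha=0.05, kappa=1.0, hop=2, n_late=6)
```

In this toy case the energy of frame 0 reappears as late reverberation three frames (six samples) later, attenuated by e^{-2α·6}, as the exponential decay model predicts.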
Single-Microphone Spectral Enhancement
Block diagram: x(n) → TF analysis → spectral enhancement → TF synthesis → estimate of z_e(n)
Post-Filter
Various spectral enhancement methods can be used, e.g., spectral subtraction and statistical methods.
We used a statistical method that is based on a Mean Squared Error distortion measure and a Log Spectral Amplitude fidelity criterion. The STFT coefficients of the speech and interference are assumed to be complex Gaussian random variables.
The resulting gain function depends on the a priori and a posteriori Signal to Interference Ratios, and the speech presence probability.
We developed several modifications to improve the joint suppression of ambient noise and late reverberation.
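To show how a gain function can be driven by the interference variance, the sketch below uses a plain Wiener-type gain with a gain floor. This is a deliberately simpler stand-in, not the Log Spectral Amplitude estimator described above, and it ignores the speech presence probability. The interference is taken as the sum of the late reverberant and noise spectral variances. All names and values are assumptions.

```python
import numpy as np

def wiener_like_gain(power_x, lambda_late, lambda_noise, gain_floor=0.1):
    """Spectral gain per time-frequency bin from the total interference variance."""
    interference = lambda_late + lambda_noise
    gamma = power_x / interference              # a posteriori SIR
    xi = np.maximum(gamma - 1.0, 0.0)           # crude a priori SIR estimate
    gain = xi / (1.0 + xi)                      # Wiener-type gain
    return np.maximum(gain, gain_floor)         # floor limits over-suppression

# Two example bins (assumed values): one dominated by speech, one by interference.
power_x = np.array([4.0, 0.5])
gain = wiener_like_gain(power_x, lambda_late=0.5, lambda_noise=0.5)
```

The estimate of the early speech component would then be the gain times X(ℓ, k) in each bin.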
Multi-Microphone Spectral Enhancement
Until now we exploited time diversity and spectral diversity. However, reverberation induces spatial diversity, which can be exploited by using multiple microphones.
The late reverberant spectral variance estimate can be improved using multiple microphones.
The speech presence probability estimation can be improved using spatial information (Mean Squared Coherence).
The multi-microphone Minimum Mean Squared Error (MMSE) estimator can be divided into a Minimum Variance Distortionless Response (MVDR) beamformer and a single-microphone MMSE estimator.
Figure: Multi-microphone MMSE estimator and the equivalent MVDR beamformer and single-microphone MMSE estimator.
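The MVDR stage of this factorization can be sketched with the standard closed-form weights w = Φ^{-1} d / (d^H Φ^{-1} d), computed per frequency bin. The interference covariance matrix Φ and the steering vector d are assumed known here, and the values are toy data.

```python
import numpy as np

def mvdr_weights(phi, d):
    """MVDR weights for one frequency bin.

    phi: (M, M) Hermitian positive definite interference covariance matrix.
    d:   (M,) steering vector towards the desired source.
    """
    phi_inv_d = np.linalg.solve(phi, d)
    return phi_inv_d / (d.conj() @ phi_inv_d)

# Toy case (assumed): spatially white interference, aligned microphones.
d = np.ones(4, dtype=complex)
phi = 2.0 * np.eye(4, dtype=complex)
w = mvdr_weights(phi, d)
response = w.conj() @ d                         # distortionless constraint w^H d
```

For spatially white interference the weights reduce to equal averaging, i.e., the MVDR beamformer coincides with a delay and sum beamformer. The beamformer output would then be passed to the single-microphone estimator.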
Experimental Results
Signal to Noise Ratio: 10 - 30 dB.
Late reverberation starts at N_ℓ/fs = 40 ms.
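The segmental Signal to Interference Ratio (SIRseg) and the Bark Spectral Distortion (BSD) are used as objective measures:
$$\mathrm{SIR}_{\mathrm{seg}} = \frac{1}{|L|} \sum_{\ell \in L} 10 \log_{10} \left( \frac{\sum_{n=\ell R}^{\ell R + N - 1} s^2(n)}{\sum_{n=\ell R}^{\ell R + N - 1} \left(s(n) - \hat{s}(n)\right)^2} \right), \quad (1)$$
$$\mathrm{BSD} = \frac{1}{|L|} \sum_{\ell \in L} \frac{\sum_{k_s=1}^{K_s} \left(L_s(\ell, k_s) - L_{\hat{s}}(\ell, k_s)\right)^2}{\sum_{k_s=1}^{K_s} \left(L_s(\ell, k_s)\right)^2}, \quad (2)$$
where L is the set of evaluated frames, s(n) is the reference signal, ŝ(n) is the signal under test, N is the frame length, and L_s and L_ŝ are the corresponding Bark spectra over K_s bands.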
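A segmental SIR of this kind can be computed as sketched below. The frame length, the frame shift, and the rule for choosing the frame set L (here: every frame with non-zero signal and error energy) are assumptions for the example. The BSD is omitted because it needs a Bark-scale auditory model.

```python
import numpy as np

def segmental_sir(s, s_hat, frame_len, hop):
    """Average per-frame 10*log10(signal energy / error energy) in dB."""
    values = []
    for start in range(0, len(s) - frame_len + 1, hop):
        ref = s[start:start + frame_len]
        err = ref - s_hat[start:start + frame_len]
        num, den = np.sum(ref ** 2), np.sum(err ** 2)
        # Assumed frame selection: skip silent or perfectly matched frames.
        if num > 0 and den > 0:
            values.append(10.0 * np.log10(num / den))
    return float(np.mean(values)) if values else float("nan")

# Toy check (assumed data): a 10 percent amplitude error in every sample.
s = np.ones(8)
s_hat = 0.9 * np.ones(8)
sir_db = segmental_sir(s, s_hat, frame_len=4, hop=4)
```

An amplitude error of 10 percent corresponds to an error energy 100 times below the signal energy, i.e., about 20 dB.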
Figure: Segmental SIRs and BSDs of the unprocessed microphone signal, the processed signal after noise suppression (NS), and the processed signal after joint reverberation and noise suppression (RS+NS). The reverberation time varies between 0.2 and 1 s (SNR = 30 dB, D = 1 m, and N_ℓ/fs = 40 ms).
Figure: Segmental SIRs and BSDs of the unprocessed microphone signal, the processed signal after noise suppression (NS), and the processed signal after joint reverberation and noise suppression (RS+NS). The source-microphone distance varies between 0.25 and 4 m (SNR = 30 dB, T60 = 500 ms, and N_ℓ/fs = 40 ms).
Figure: Segmental SIRs and BSDs of the unprocessed microphone signal, the processed signal after noise suppression (NS), and the processed signal after joint reverberation and noise suppression (RS+NS). The SNR of the received signal varies between 10 and 30 dB (D = 1 m, T60 = 500 ms, and N_ℓ/fs = 40 ms).
Figure: Segmental SIRs and BSDs of the reference microphone signal, the delay and sum beamformer (DSB) signal, and the DSB with post-filter (DSB-PF) signal. The number of microphones ranges from 1 to 9 (D = 1.5 m, T60 = 0.5 s, SNR = 30 dB, and N_ℓ/fs = 40 ms).
Figure: Spectrograms and waveforms of the microphone signal and processed signal (M = 4, D = 1.5 m, T60 = 0.7 s, SNR = 20 dB, and N_ℓ/fs = 48 ms).
Summary and Future Research
Summary
We developed an effective and computationally efficient estimator for the late reverberant spectral variance.
Suppression of late reverberation and ambient noise is possible using spectral enhancement.
Future Research
Optimal fidelity criteria for speech dereverberation?
A suitable technique to equalize the spectral coloration caused by the early reflections needs to be developed. Together with the developed spectral enhancement technique, it can provide a practical solution for speech dereverberation.
Thank you for your attention.
For more information visit www.dereverberation.com and ehabets.dereverberation.com.