AUDIO SIGNAL PROCESSING FOR NEXT-GENERATION MULTIMEDIA COMMUNICATION SYSTEMS

Edited by
YITENG (ARDEN) HUANG, Bell Laboratories, Lucent Technologies
JACOB BENESTY, Université du Québec, INRS-EMT

Kluwer Academic Publishers
Boston/Dordrecht/London




Contents

Preface xi

Contributing Authors xiii

1 Introduction 1
Yiteng (Arden) Huang, Jacob Benesty

1. Multimedia Communications 1
2. Challenges and Opportunities 3
3. Organization of the Book 4

Part I Speech Acquisition and Enhancement

2 Differential Microphone Arrays 11
Gary W. Elko

1. Introduction 11
2. Differential Microphone Arrays 12
3. Array Directional Gain 22
4. Optimal Arrays for Isotropic Fields 24
  4.1 Maximum Directional Gain 24
  4.2 Maximum Directivity Index for Differential Microphones 28
  4.3 Maximum Front-to-Back Ratio 32
  4.4 Minimum Peak Directional Response 37
  4.5 Beamwidth 39
5. Design Examples 39
  5.1 First-Order Designs 40
  5.2 Second-Order Designs 44
  5.3 Third-Order Designs 52
  5.4 Higher-Order designs 58
6. Sensitivity to Microphone Mismatch and Noise 60
7. Conclusions 64


3 Spherical Microphone Arrays for 3D Sound Recording 67
Jens Meyer, Gary W. Elko

1. Introduction 67
2. Fundamental Concept 69
3. The Eigenbeamformer 71
  3.1 Discrete Orthonormality 73
  3.2 The Eigenbeams 73
  3.3 The Modal Coefficients 74
4. Modal-Beamformer 76
  4.1 Combining Unit 76
  4.2 Steering Unit 76
5. Robustness Measure 77
6. Beampattern Design 79
  6.1 Arbitrary Beampattern Design 79
  6.2 Optimum Beampattern Design 79
7. Measurements 83
8. Summary 86
9. Appendix A 89

4 Subband Noise Reduction Methods for Speech Enhancement 91
Eric J. Diethorn

1. Introduction 91
2. Wiener Filtering 94
3. Speech Enhancement by Short-Time Spectral Modification 95
  3.1 Short-Time Fourier Analysis and Synthesis 95
  3.2 Short-Time Wiener Filter 96
  3.3 Power Subtraction 97
  3.4 Magnitude Subtraction 98
  3.5 Parametric Wiener Filtering 99
  3.6 Review and Discussion 100
4. Averaging Techniques for Envelope Estimation 104
  4.1 Moving Average 105
  4.2 Single-Pole Recursion 105
  4.3 Two-Sided Single-Pole Recursion 106
  4.4 Nonlinear Data Processing 107
5. Example Implementation 107
  5.1 Subband Filter Bank Architecture 108
  5.2 A-Posteriori-SNR Voice Activity Detector 109
  5.3 Example 111
6. Conclusion 111

Part II Acoustic Echo Cancellation

5 Adaptive Algorithms for MIMO Acoustic Echo Cancellation 119
Jacob Benesty, Tomas Gänsler, Yiteng (Arden) Huang, Markus Rupp

1. Introduction 120
2. Normal Equations and Identification of a MIMO System 121
  2.1 Normal Equations 121
  2.2 The Nonuniqueness Problem 124
  2.3 The Impulse Response Tail Effect 125
  2.4 Some Different Solutions for Decorrelation 126
3. The Classical and Factorized Multichannel RLS 128
4. The Multichannel Fast RLS 130
5. The Multichannel LMS Algorithm 132
  5.1 Classical Derivation 132
  5.2 Improved Version 133
6. The Multichannel APA 134
  6.1 The Straightforward Multichannel APA 134
  6.2 The Improved Two-Channel APA 135
  6.3 The Improved Multichannel APA 136
7. The Multichannel Exponentiated Gradient Algorithm 137
8. The Multichannel Frequency-domain Adaptive Algorithm 142
9. Conclusions 145

6 Double-Talk Detectors for Acoustic Echo Cancelers 149
Tomas Gänsler, Jacob Benesty

1. Introduction 149
2. Basics of AEC and DTD 152
  2.1 AEC Notations 152
  2.2 The Generic DTD 152
  2.3 A Suggestion to Performance Evaluation of DTDs 153
3. Double-Talk Detection Algorithms 154
  3.1 The Geigel Algorithm 154
  3.2 The Cross-Correlation Method 154
  3.3 The Normalized Cross-Correlation Method 155
  3.4 The Coherence Method 157
  3.5 The Normalized Cross-correlation Matrix 159
  3.6 The Two-Path Model 161
  3.7 DTD Combinations with Robust Statistics 163
4. Comparison of DTDs by Means of the ROC 165
5. Discussion 167

7 The WinEC: A Real-Time Hands-Free Stereo Communication System 171
Tomas Gänsler, Volker Fischer, Eric J. Diethorn, Jacob Benesty

1. Introduction 172
  1.1 Signal model 173
2. System Description 173
  2.1 The Audio Module 173
  2.2 The Network Module 176
  2.3 The Echo Canceler Module 177
3. Algorithms of the Echo Canceler Module 177
  3.1 Adaptive Filter Algorithm 178
4. Residual Echo and Noise Suppression 181
  4.1 Masking Threshold for Residual Echo in Noise 183
  4.2 Analysis of Echo Suppression Requirements 184
  4.3 Noise and Residual Echo Suppression 186
5. Simulations 186
6. Real-Time Tests with Different Modes of Operation 189
  6.1 Point-to-Point Communication 189
  6.2 Multi-Point Communication 189
  6.3 Transatlantic Teleconference in Stereo 190
7. Discussion 191

Part III Sound Source Tracking and Separation

8 Time Delay Estimation 197
Jingdong Chen, Yiteng (Arden) Huang, Jacob Benesty

1. Introduction 198
2. Signal Models 200
  2.1 Ideal Propagation Model 200
  2.2 Multipath Model 201
  2.3 Reverberant Model 202
3. Generalized Cross-Correlation Method 202
4. The Multichannel Cross-Correlation Algorithm 204
  4.1 Spatial Prediction Technique 204
  4.2 Time Delay Estimation Using Spatial Prediction 207
  4.3 Other Information from the Spatial Correlation Matrix 208
5. Adaptive Eigenvalue Decomposition Algorithm 211
6. Adaptive Multichannel Time Delay Estimation 213
  6.1 Principle 213
  6.2 Time-Domain Multichannel LMS Approach 214
  6.3 Frequency-Domain Adaptive Algorithms 215
7. Experiments 219
  7.1 Experimental Setup 219
  7.2 Performance Measure 220
  7.3 Experimental Results 221
8. Conclusions 223

9 Source Localization 229
Yiteng (Arden) Huang, Jacob Benesty, Gary W. Elko

1. Introduction 230
2. Source Localization Problem 232
3. Measurement Model and Cramer-Rao Lower Bound for Source Localization 234
4. Maximum Likelihood Estimator 235
5. Least Squares Estimators 236
  5.1 The Least Squares Error Criteria 237
  5.2 Spherical Intersection (SX) Estimator 239
  5.3 Spherical Interpolation (SI) Estimator 239
  5.4 Linear-Correction Least Squares Estimator 240
6. Example System Implementation 246
7. Source Localization Examples 247
8. Conclusions 249

10 Blind Source Separation for Convolutive Mixtures: A Unified Treatment 255
Herbert Büchner, Robert Aichner, Walter Kellermann

1. Introduction 256
2. Generic Block Time-Domain BSS Algorithm 259
  2.1 Matrix Notation for Convolutive Mixtures 259
  2.2 Cost Function and Algorithm Derivation 261
  2.3 Equivariance Property and Natural Gradient 263
  2.4 Special Cases and Links to Known Time-Domain Algorithms 265
3. Generic Frequency-Domain BSS Algorithm 271
  3.1 General Frequency-Domain Formulation 271
  3.2 Natural Gradient in the Frequency Domain 276
  3.3 Special Cases and Links to Known Frequency-Domain Algorithms 277
4. Weighting Function 284
  4.1 Off-line Implementation 285
  4.2 On-line Implementation 285
  4.3 Block-on-Line Implementation 286
5. Experiments and Results 286
6. Conclusions 289

Part IV Audio Coding and Realistic Soun

11 Audio Coding 297
Gerald Schüler

1. Introduction 297
2. Psycho-Acoustics 298
3. Filter Banks 300
  3.1 Polyphase Formulation 301
  3.2 Modulated Filter Banks 302
  3.3 Block Switching 308
4. Current and Basic Coder Structures 309
5. Stereo Coding 311
6. Low Delay Audio Coding 314
7. Conclusions 321

12 Sound Field Synthesis 323
Sascha Spors, Heinz Teutsch, Achim Kuntz, Rudolf Rabenstein

1. Introduction 324
2. Rendering of Sound Fields with Wave Field Synthesis 325
  2.1 Physical Foundation of Wave Field Synthesis 325
  2.2 Wave Field Synthesis Based Sound Reproduction 327
3. Model-based and Data-Based Rendering 329
  3.1 Data-Based Rendering 329
  3.2 Model-Based Rendering 330
  3.3 Hybrid Approach 331
4. Wave Field Analysis 331
5. Loudspeaker and Listening Room Compensation 333
  5.1 Listening Room Compensation 334
  5.2 Loudspeaker Compensation 337
6. Description of a Sound Field Transmission System 339
  6.1 Acquisition of Source Signals 339
  6.2 Sound Stage Reproduction Using Wave Field Synthesis 341
7. Summary 342

13 Virtual Spatial Sound 345
Carlos Avendano

1. Introduction 345
  1.1 Scope 347
2. Spatial Hearing 348
  2.1 Interaural Coordinate System 348
  2.2 Interaural Differences 349
  2.3 Spectral Cues 351
  2.4 Distance Cues 352
  2.5 Dynamic Cues 353
3. Acoustics of Spatial Sound 353
  3.1 The HRTF 353
  3.2 Room Acoustics 357
4. Virtual Spatial Sound Systems 358
  4.1 HRTF Measurement 358
  4.2 HRTF Modelling 360
  4.3 Virtual Spatial Sound Rendering 363
5. Conclusions 366

Index 371