
Speech & Audio Processing - Part–II

Digital Audio Signal Processing

Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven

homes.esat.kuleuven.be/~moonen/

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 2

Speech & Audio Processing

•  Part-I (H. Van hamme)
–  speech recognition
–  speech coding (+ audio coding)
–  speech synthesis (TTS)
•  Part-II (M. Moonen): Digital Audio Signal Processing
–  microphone array processing
–  noise, echo, and feedback cancellation
–  (de)reverberation
–  active noise control, 3D audio
PS: selection of topics



Digital Audio Signal Processing

•  Aims/scope
•  Case study: Hearing instruments
•  Overview
•  Prerequisites
•  Lectures/course material/literature
•  Exercise sessions/project
•  Exam


Aims/Scope

Aim is 2-fold:
•  Speech & audio per se
–  S & A industry in Belgium/Europe/…
•  Basic signal processing theory/principles:
–  Optimal filters
–  Adaptive filter algorithms (APA, Filtered-X LMS, ...)
–  Kalman filters (linear/nonlinear)
–  etc.

[Figure: photographs labelled 1921 and 2007 (Oticon)]

Case Study: Hearing Instruments 1/12

→ Hearing Aids (HAs)
•  Audio input/audio output (‘microphone-processing-loudspeaker’)
•  ‘Amplifier’, but so much more than an amplifier!
•  History:
–  Horns/trumpets/…
–  ‘Desktop’ HAs (1900)
–  Wearable HAs (1930)
–  Digital HAs (1980)
•  State-of-the-art:
–  MHz clock speeds
–  Millions of arithmetic operations/sec, …
–  Multiple microphones


[Figure: Alessandro Volta (1745-1827); cochlear implant with electrical stimulation for low frequency and for high frequency (© Cochlear Ltd)]

→ Cochlear Implants (CIs)
•  Audio input/electrode stimulation output
•  Stimulation strategy + preprocessing similar to HAs
•  History:
–  Volta’s experiment…
–  First implants (1960)
–  Commercial CIs (1970-1980)
–  Digital CIs (1980)
•  State-of-the-art:
–  MHz clock speeds, Mops/sec, …
–  Multiple microphones

→ Other: Bone-anchored HAs, middle ear implants, …

[Figure label: Intra-cochlear electrode]


•  Hearing loss types:
–  conductive
–  sensorineural
–  mixed
•  One in six adults (Europe) … and still increasing
•  Typical causes:
–  aging
–  exposure to loud sounds
–  …


[Source: Lapperre]


Hearing impairment: Dynamic range & audibility

[Figure: level scale from 0dB to 100dB, normal-hearing subjects vs. hearing-impaired subjects]


Hearing impairment: Dynamic range & audibility

Dynamic range compression (DRC) (… rather than ‘amplification’)

[Figure: DRC curve, input level (dB) vs. output level (dB), both axes from 0dB to 100dB]

Design: multiband DRC, attack time, release time, …
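
The DRC design parameters above can be illustrated with a minimal single-band Python sketch. It assumes a hearing threshold raised to 50dB and simple one-pole gain smoothing; the function names, the 50dB figure, and the attack/release coefficients are illustrative assumptions, not values from the course.

```python
def compress_level(input_db, impaired_threshold_db=50.0, max_db=100.0):
    """Static DRC curve: map the 0..max_db input range onto the narrower
    impaired_threshold_db..max_db output range (assumed values)."""
    slope = (max_db - impaired_threshold_db) / max_db  # 0.5 means 2:1 compression
    return impaired_threshold_db + slope * input_db


def smooth_gain(target_gains_db, attack=0.5, release=0.05):
    """Track a target gain quickly when the gain must drop (attack)
    and slowly when it may rise again (release)."""
    smoothed, gain = [], target_gains_db[0]
    for target in target_gains_db:
        coeff = attack if target < gain else release
        gain += coeff * (target - gain)
        smoothed.append(gain)
    return smoothed


# Soft sounds receive a large gain, loud sounds almost none.
levels_db = [0.0, 60.0, 100.0]
gains_db = [compress_level(level) - level for level in levels_db]
```

A multiband DRC would apply such a curve per frequency band, each with its own threshold and time constants.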


Hearing impairment: Audibility vs speech intelligibility
•  Audibility does not imply intelligibility
•  Hearing-impaired subjects need a 5..10dB larger signal-to-noise ratio (SNR) for speech understanding in noisy environments
•  Need for noise reduction (= speech enhancement) algorithms:
–  State-of-the-art: monaural 2-microphone adaptive noise reduction
–  Near future: binaural noise reduction (see below)
–  Not-so-near future: multi-node noise reduction (see below)


[Figure: SNR (0dB to 20dB) vs. hearing loss (dB, 3-freq-average; 30 to 90)]


HA technology requirements
•  Small form factor (cf. user acceptance)
•  Low power: 1…5mW (cf. battery lifetime ≈ 1 week)
•  Low processing delay: 10msec (cf. synchronization with lip reading)

DSP challenges in hearing instruments
•  Dynamic range compression (see above)
•  Dereverberation: undo filtering (‘echo-ing’) by room acoustics
•  Feedback cancellation
•  Noise reduction



DSP Challenges: Feedback Cancellation
•  Problem statement: Loudspeaker signal is fed back into microphone, then amplified and played back again
•  Closed loop system may become unstable (howling)
•  Similar to feedback problem in public address systems (for the musicians amongst you)


[Figure: feedback cancellation block diagram with labels ‘Model’, ‘F’, ‘-’]

Similar to echo cancellation in GSM handsets, Skype,… but more difficult due to signal correlation
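
The howling problem can be reproduced with a toy closed loop. The sketch below assumes a frequency-flat amplifier gain and a feedback path that is a pure delay with attenuation; all names and numbers are hypothetical. Under this simplified model the loop is unstable once the loop gain `gain * feedback` reaches 1.

```python
def closed_loop_peak(gain, feedback, delay=5, n_samples=200):
    """Feed a unit impulse into the microphone of a toy hearing-aid loop and
    return the peak |loudspeaker output| over the last 20 samples."""
    y = [0.0] * n_samples
    for k in range(n_samples):
        x = 1.0 if k == 0 else 0.0                       # external sound: one impulse
        fed_back = feedback * y[k - delay] if k >= delay else 0.0
        y[k] = gain * (x + fed_back)                     # amplifier acts on mic signal
    return max(abs(v) for v in y[-20:])


stable_peak = closed_loop_peak(gain=10.0, feedback=0.05)   # loop gain 0.5: dies out
howling_peak = closed_loop_peak(gain=10.0, feedback=0.2)   # loop gain 2.0: grows
```

A feedback canceller would subtract an estimate of `fed_back` from the microphone signal, which lowers the effective loop gain and allows more amplification before howling sets in.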



DSP Challenges: Noise reduction
Multimicrophone ‘beamforming’, typically with 2 microphones, e.g. ‘directional’ front microphone and ‘omnidirectional’ back microphone

“filter-and-sum” the microphone signals


Binaural hearing: Binaural auditory cues
•  ITD (interaural time difference)
•  ILD (interaural level difference)
•  Binaural cues (ITD: f < 1500Hz, ILD: f > 2000Hz) used for
–  Sound localization
–  Noise reduction = ‘Binaural unmasking’ (‘cocktail party’ effect) 0-5dB
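
To give a feel for the size of these cues, here is a small sketch under a crude free-field assumption: two ears 17cm apart, no head shadow or diffraction, speed of sound 343 m/s. The function names and this two-point model are illustrative assumptions; real ITDs and ILDs depend on the head and on frequency.

```python
import math


def itd_seconds(angle_deg, ear_spacing_m=0.17, speed_of_sound=343.0):
    """Far-field ITD for a source at angle_deg away from straight ahead."""
    return ear_spacing_m * math.sin(math.radians(angle_deg)) / speed_of_sound


def ild_db(left_amplitude, right_amplitude):
    """ILD as the amplitude ratio between the two ear signals, in dB."""
    return 20.0 * math.log10(left_amplitude / right_amplitude)


itd_side = itd_seconds(90.0)     # source fully to one side: about half a millisecond
ild_double = ild_db(2.0, 1.0)    # twice the amplitude at the left ear: about 6dB
```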


[Figure labels: ITD, ILD, signal]


Binaural hearing aids
•  Two hearing aids (L&R) with wireless link & cooperation
•  Opportunities:
–  More signals (e.g. 2*2 microphones)
–  Better sensor spacing (17cm instead of 1cm)
•  Constraints: power/bandwidth/delay of wireless link
–  ..10kbit/s: coordinate program settings, parameters, …
–  ..300kbit/s: exchange 1 or more (compressed) audio signals
•  Challenges:
–  Improved localization through cue preservation
–  Improved noise reduction + benefit from binaural unmasking
–  Signal selection/filtering, audio coding, synchronisation, …



Future: Multi-node noise reduction – sensor networks




Overview
General speech communication set-up:
–  background ‘noise’ → noise suppression, source separation
–  far-end echoes → acoustic echo cancellation
–  reverberation → de-reverberation/deconvolution
Applications:
•  teleconferencing/teleclassing
•  hands-free telephony
•  hearing aids, etc.


Overview : Lecture-2

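Microphone Array Processing
Spatial filtering - Beamforming
Fixed vs. adaptive beamforming
Example: filter-and-sum beamformer
Application: hearing aids

[Figure: filter-and-sum beamformer. A source S(ω) arrives from direction θ; microphone m, at distance d_m along the array, sees a path difference d_m cos θ and delivers Y_m(ω,θ), m = 1…M; each Y_m(ω,θ) is filtered by F_m(ω) and the outputs are summed (Σ) into Z(ω,θ)]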
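
A delay-and-sum beamformer is the simplest special case of filter-and-sum: each filter F_m(ω) is a pure phase shift that undoes the propagation delay for a chosen look direction. The sketch below evaluates its response at one frequency. The function name, the sign convention for the delays, the 343 m/s speed of sound, and the example geometry are assumptions for illustration.

```python
import cmath
import math


def beam_response(freq_hz, theta_deg, mic_positions_m, steer_deg, c=343.0):
    """|Z(w, theta)| for a unit source S(w) = 1 and delay-and-sum filters
    steered towards steer_deg (far-field, linear array)."""
    omega = 2.0 * math.pi * freq_hz
    z = 0j
    for d_m in mic_positions_m:
        tau = d_m * math.cos(math.radians(theta_deg)) / c     # delay seen at mic m
        tau_0 = d_m * math.cos(math.radians(steer_deg)) / c   # delay to compensate
        y_m = cmath.exp(-1j * omega * tau)                    # Y_m(w, theta)
        f_m = cmath.exp(1j * omega * tau_0) / len(mic_positions_m)  # F_m(w)
        z += f_m * y_m
    return abs(z)


mics = [0.0, 0.01]                                            # two mics, 1cm apart
front = beam_response(4000.0, 0.0, mics, steer_deg=0.0)       # look direction
back = beam_response(4000.0, 180.0, mics, steer_deg=0.0)      # opposite direction
```

With only 1cm between the microphones, the attenuation towards the back is modest at 4kHz, which hints at why more general filters F_m(ω) and larger sensor spacing are of interest.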



Overview : Lecture-3
Noise Reduction
‘microphone_signal[k] = speech[k] + noise[k]’
•  Single-microphone noise reduction
–  Spectral Subtraction Methods (spectral filtering)
–  Iterative methods based on speech modeling (Wiener & Kalman Filters)
•  Multi-microphone noise reduction
–  Beamforming revisited
–  Optimal filtering approach: spectral+spatial filtering
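
As a hedged illustration of the spectral subtraction idea, the sketch below computes a per-frequency-bin gain from the noisy-signal power and an estimate of the noise power, under the additive model quoted above. It shows one common power-subtraction variant with a spectral floor; the function name, the floor value, and the example numbers are assumptions.

```python
import math


def spectral_subtraction_gain(noisy_power, noise_power, floor=0.01):
    """Per-bin gain sqrt(max(1 - Pn/Py, floor)) applied to the noisy spectrum."""
    gains = []
    for p_y, p_n in zip(noisy_power, noise_power):
        ratio = 1.0 - p_n / p_y if p_y > 0.0 else floor
        gains.append(math.sqrt(max(ratio, floor)))   # floor avoids negative power
    return gains


# Three bins: speech-dominated, equal power, noise-dominated.
gains = spectral_subtraction_gain([10.0, 1.0, 0.5], [1.0, 1.0, 1.0])
```

In practice the powers would come from short-time spectra of successive frames, with the noise power estimated during speech pauses.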



Acoustic Echo Cancellation
Adaptive filtering problem:
•  non-stationary/wideband/… speech signals
•  non-stationary/long/… acoustic channels

–  Adaptive filtering algorithms
–  AEC Control
–  AEC Post-processing
–  Stereo AEC
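
The adaptive filtering problem can be sketched with NLMS, chosen here only because it is the simplest common algorithm, not because the course prescribes it. The toy set-up assumes a white far-end signal, a 4-tap echo path, and no near-end speech or noise; real acoustic channels are far longer and the signals are non-stationary speech. All names and numbers are hypothetical.

```python
import random


def nlms_echo_canceller(far_end, mic, n_taps, mu=0.5, eps=1e-8):
    """Adapt w so that w * far_end matches the echo in mic; return w and residual."""
    w = [0.0] * n_taps
    buf = [0.0] * n_taps                       # most recent far-end samples
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        echo_estimate = sum(wi * xi for wi, xi in zip(w, buf))
        e = d - echo_estimate                  # echo-cancelled output
        norm = sum(xi * xi for xi in buf) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, buf)]
        residual.append(e)
    return w, residual


random.seed(0)
room = [0.6, -0.3, 0.2, 0.1]                   # hypothetical echo path
far_end = [random.gauss(0.0, 1.0) for _ in range(2000)]
mic = [sum(room[i] * far_end[k - i] for i in range(len(room)) if k - i >= 0)
       for k in range(len(far_end))]
w, residual = nlms_echo_canceller(far_end, mic, n_taps=4)
```

Near-end speech would disturb the update, which is one reason AEC control (e.g. double-talk handling) appears as a separate topic.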




Acoustic Feedback Cancellation
•  Ex: Hearing aids
•  Ex: PA systems
•  Correlation between filter input (‘x’) and near-end signal (‘n’)
•  Fixes: noise injection, pitch shifting, notch filtering, ...

[Figure label: amplifier]



Reverb & De-reverberation
‘microphone_signal[k] = filter*speech[k] (+ noise[k])’
•  Reverb = effect of acoustic channel in between speaker and microphone(s)
•  Reverb has an impact on coding, speech recognition, etc.
•  Single-microphone de-reverberation
–  Cepstrum techniques
•  Multi-microphone de-reverberation:
–  Estimation of acoustic impulse responses
–  Inverse-filtering method
–  Matched filtering
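
The convolution model above, and the idea of inverse filtering, can be shown on a toy example. The sketch assumes a known, noise-free, minimum-phase 2-tap channel so that a simple recursive inverse is stable; measured room responses are typically much longer and not minimum phase, which is part of the motivation for multi-microphone methods. Names and values are hypothetical.

```python
def convolve(h, s):
    """microphone_signal = filter * speech (linear convolution)."""
    out = [0.0] * (len(h) + len(s) - 1)
    for i, h_i in enumerate(h):
        for k, s_k in enumerate(s):
            out[i + k] += h_i * s_k
    return out


def inverse_filter(h, mic):
    """Recursive inverse: s[k] = (mic[k] - sum_{i>=1} h[i] * s[k-i]) / h[0].
    Stable only if h is minimum phase (assumed here)."""
    s_hat = []
    for k, y in enumerate(mic):
        past = sum(h[i] * s_hat[k - i] for i in range(1, len(h)) if k - i >= 0)
        s_hat.append((y - past) / h[0])
    return s_hat


room = [1.0, 0.5]                      # direct path plus one reflection
speech = [1.0, 2.0, 3.0]
mic = convolve(room, speech)           # reverberant microphone signal
recovered = inverse_filter(room, mic)  # speech followed by a trailing zero
```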




Active Noise Control
•  Solution based on ‘filtered-X LMS’
•  Application: active headsets/ear defenders
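
A minimal filtered-X LMS sketch follows. It assumes a white reference signal, short FIR primary and secondary paths, and a perfect model of the secondary path (loudspeaker to error microphone); every path, name, and step size below is hypothetical. The distinguishing step is that the reference is filtered by the secondary-path model before it enters the LMS update.

```python
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def fxlms(reference, disturbance, secondary, secondary_est, n_taps=4, mu=0.01):
    """Adapt control filter w so that the anti-noise, after the secondary
    path, cancels the disturbance at the error microphone."""
    w = [0.0] * n_taps
    x_buf = [0.0] * n_taps                  # reference samples seen by w
    y_buf = [0.0] * len(secondary)          # anti-noise entering the true path
    xs_buf = [0.0] * len(secondary_est)     # reference entering the path model
    xf_buf = [0.0] * n_taps                 # filtered reference ("filtered-X")
    errors = []
    for x, d in zip(reference, disturbance):
        x_buf = [x] + x_buf[:-1]
        y_buf = [dot(w, x_buf)] + y_buf[:-1]
        e = d - dot(secondary, y_buf)       # residual at the error microphone
        xs_buf = [x] + xs_buf[:-1]
        xf_buf = [dot(secondary_est, xs_buf)] + xf_buf[:-1]
        w = [wi + mu * e * xfi for wi, xfi in zip(w, xf_buf)]
        errors.append(e)
    return w, errors


random.seed(1)
primary = [0.0, 0.0, 0.8, 0.4]              # noise source -> error mic
secondary = [0.0, 1.0, 0.5]                 # loudspeaker -> error mic
x = [random.gauss(0.0, 1.0) for _ in range(5000)]
d = [sum(primary[i] * x[k - i] for i in range(len(primary)) if k - i >= 0)
     for k in range(len(x))]
w, errors = fxlms(x, d, secondary, secondary_est=secondary)
```

For these paths the ideal control filter is a gain of 0.8 with one sample of delay, so `w` should approach `[0, 0.8, 0, 0]` and the residual should decay.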


Overview : Lecture-7bis

3D Audio & Loudspeaker Arrays
•  Binaural synthesis
–  … with headphones: head-related transfer functions (HRTF)
–  … with 2+ loudspeakers (‘sweet spot’): crosstalk cancellation
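
Crosstalk cancellation with two loudspeakers can be viewed, per frequency, as inverting the 2x2 matrix of loudspeaker-to-ear transfer functions so that each binaural signal reaches only its own ear. The sketch below does this at a single frequency; the matrix values are made up, and a practical design would also need regularization and causal filters.

```python
def crosstalk_canceller(h):
    """Inverse of the 2x2 transfer matrix h (rows: ears, columns: loudspeakers)."""
    (h_ll, h_lr), (h_rl, h_rr) = h
    det = h_ll * h_rr - h_lr * h_rl
    if abs(det) < 1e-12:
        raise ValueError("transfer matrix is (nearly) singular at this frequency")
    return [[h_rr / det, -h_lr / det],
            [-h_rl / det, h_ll / det]]


def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Hypothetical symmetric set-up: direct paths 1, crosstalk paths 0.4 - 0.2j.
h = [[1.0 + 0j, 0.4 - 0.2j],
     [0.4 - 0.2j, 1.0 + 0j]]
c = crosstalk_canceller(h)
ear_signals = matmul2(h, c)     # should be close to the identity matrix
```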




Case Study: Signal Processing in Cochlear Implants

1hr lecture by Cochlear Ltd (to be scheduled)


Aims/Scope (revisited)

Aim is 2-fold:
•  Speech & audio per se
•  Basic signal processing theory/principles:
–  Optimal filtering / Kalman filters (linear/nonlinear)
   here: speech enhancement
   other: automatic control, spectral estimation, ...
–  Advanced adaptive filter algorithms
   here: acoustic echo cancellation
   other: digital communications, ...
–  Filtered-X LMS
   here: 3D audio
   other: active noise/vibration control



Lectures

Lectures: 7*2hrs + 1*1hr
–  PS: Time budget = (15hrs)*4 = 60 hrs

Course Material: Slides
–  Use version 2013-2014!
–  Download from DASP webpage: http://homes.esat.kuleuven.be/~dspuser/dasp/


Prerequisites

•  H197 Signals & Systems (JVDW)
•  HJ09 Digital Signal Processing (I) (PW)
   signal transforms, sampling, multi-rate, DFT, …
•  HC63 DSP-CIS (MM)
   filter design, filter banks, optimal & adaptive filters



Literature

Literature (general) (available in DSP-CIS library)
•  Simon Haykin, ‘Adaptive Filter Theory’ (Prentice Hall 1996)
•  P.P. Vaidyanathan, ‘Multirate Systems and Filter Banks’ (Prentice Hall 1993)

Literature (specialized) (some available in DSP-CIS library)
•  S.L. Gay & J. Benesty, ‘Acoustic Signal Processing for Telecommunication’ (Kluwer 2000)
•  M. Kahrs & K. Brandenburg (Eds), ‘Applications of Digital Signal Processing to Audio and Acoustics’ (Kluwer 1998)
•  B. Gold & N. Morgan, ‘Speech and Audio Signal Processing’ (Wiley 2000)


Exercise Sessions/Project

Acoustic source localization
–  Direction-of-arrival estimation
–  Noise reduction
–  Echo cancellation
–  Simulated set-up

[Figure label: Direction-of-arrival θ]
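
One simple way to estimate a direction of arrival with two microphones is to find the delay that maximizes their cross-correlation and convert it to an angle. The sketch below is an illustration under stated assumptions, not the project's prescribed method: far-field source, no noise or reverberation, an integer-sample delay, 16kHz sampling, 17cm spacing, and the angle measured from the array axis.

```python
import math
import random


def estimate_delay(mic1, mic2, max_lag):
    """Return the lag (in samples) at which mic2 best matches mic1."""
    best_lag, best_corr = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        corr = sum(mic1[k] * mic2[k + lag]
                   for k in range(len(mic1)) if 0 <= k + lag < len(mic2))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag


def doa_degrees(lag, fs_hz, mic_spacing_m, speed_of_sound=343.0):
    """Convert a delay into an angle via delay = d * cos(theta) / c."""
    cos_theta = lag / fs_hz * speed_of_sound / mic_spacing_m
    cos_theta = max(-1.0, min(1.0, cos_theta))       # guard against rounding
    return math.degrees(math.acos(cos_theta))


random.seed(2)
fs_hz, spacing_m, true_lag = 16000, 0.17, 4           # assumed set-up
source = [random.gauss(0.0, 1.0) for _ in range(1000)]
mic1 = source
mic2 = [0.0] * true_lag + source[:-true_lag]          # mic2 hears it 4 samples later
lag = estimate_delay(mic1, mic2, max_lag=8)
theta = doa_degrees(lag, fs_hz, spacing_m)            # roughly 60 degrees
```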



Acoustic Source Localization Project

•  Runs over 4 weeks (non-consecutive)
•  Each week
–  1 PC/Matlab session (supervised, 2.5hrs)
–  2 ‘Homework’ sessions (unsupervised, 2*2.5hrs)
PS: Time budget = 4*(2.5hrs+5hrs) = 30 hrs
•  ‘Deliverables’ after week 2 & 4
•  Grading: based on deliverables, evaluated during sessions
•  TAs: guiliano.bernardi@esat (English+Italian), alexander.bertrand@esat (English+Dutch)
PS: groups of 2


Acoustic Source Localization Project

Work Plan
–  Week 1: Design Matlab simulation set-up
–  Week 2: Direction-of-arrival (DoA) estimation *deliverable*
–  Week 3: DoA estimation + noise reduction
–  Week 4: DoA estimation + echo cancellation *deliverable*

..be there!


Exam

•  Oral exam, with preparation time
•  Open book
•  Grading:
     7 for question-1
     7 for question-2
   + 6 for project
   ___
   = 20


September Retake Exam

•  Oral exam, with preparation time
•  Open book
•  Grading:
     7 for question-1
     7 for question-2
   + 6 for question-3 (related to project work)
   ___
   = 20


Website

1)  TOLEDO
2)  http://homes.esat.kuleuven.be/~dspuser/dasp/
•  Contact: guiliano.bernardi@esat
•  Slides (use ‘version 2013-2014’!)
•  Schedule
•  DSP-library
•  FAQs (send questions to marc.moonen@esat)


Questions?

1)  Ask teaching assistant (during exercise sessions)
2)  E-mail questions to teaching assistant or marc.moonen@esat
3)  Make appointment: marc.moonen@esat, ESAT Room 01.69
