
Speech & Audio Processing - Part–II

Digital Audio Signal Processing

Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven

homes.esat.kuleuven.be/~moonen/

Digital Audio Signal Processing: Introduction Version 2013-2014 Lecture-1: Introduction p. 2

Speech & Audio Processing

•  Part-I (H. Van hamme)
   –  speech recognition
   –  speech coding (+audio coding)
   –  speech synthesis (TTS)
•  Part-II (M. Moonen): Digital Audio Signal Processing
   –  microphone array processing
   –  noise-, echo-, feedback-cancellation
   –  (de)reverberation
   –  active noise control, 3D audio
PS: selection of topics


Digital Audio Signal Processing

•  Aims/scope
•  Case study: Hearing instruments
•  Overview
•  Prerequisites
•  Lectures/course material/literature
•  Exercise sessions/project
•  Exam


Aims/Scope

Aim is 2-fold :
•  Speech & audio per se
   S & A industry in Belgium/Europe/…
•  Basic signal processing theory/principles :
   –  Optimal filters
   –  Adaptive filter algorithms (APA, Filtered-X LMS,..)
   –  Kalman filters (linear/nonlinear)
   –  etc...


[Figure: photos labelled 1921 and 2007 (Oticon)]

Case Study: Hearing Instruments 1/12

→ Hearing Aids (HAs)
•  Audio input/audio output (‘microphone-processing-loudspeaker’)
•  ‘Amplifier’, but so much more than an amplifier!!
•  History:
   –  Horns/trumpets/…
   –  ‘Desktop’ HAs (1900)
   –  Wearable HAs (1930)
   –  Digital HAs (1980)
•  State-of-the-art:
   –  MHz’s clock speed
   –  Millions of arithmetic operations/sec, …
   –  Multiple microphones


[Figure labels: Alessandro Volta 1745-1827; © Cochlear Ltd]

Case Study: Hearing Instruments 2/12

→ Cochlear Implants (CIs)
•  Audio input/electrode stimulation output
•  Stimulation strategy + preprocessing similar to HAs
•  History:
   –  Volta’s experiment…
   –  First implants (1960)
   –  Commercial CIs (1970-1980)
   –  Digital CIs (1980)
•  State-of-the-art:
   –  MHz’s clock speed, Mops/sec, …
   –  Multiple microphones

→ Other: Bone anchored HAs, middle ear implants, …

[Figure labels: intra-cochlear electrode; electrical stimulation for low frequency; electrical stimulation for high frequency]


•  Hearing loss types:
   –  conductive
   –  sensorineural
   –  mixed
•  One in six adults (Europe) …and still increasing
•  Typical causes:
   –  aging
   –  exposure to loud sounds
   –  …

Case Study: Hearing Instruments 3/12

[Source: Lapperre]


Hearing impairment : Dynamic range & audibility

Case Study: Hearing Instruments 4/12

[Figure: level axis from 0dB to 100dB; panels: normal hearing subjects, hearing impaired subjects]


Hearing impairment : Dynamic range & audibility
Dynamic range compression (DRC) (…rather than ‘amplification’)

Case Study: Hearing Instruments 5/12

[Figure: level axis from 0dB to 100dB; DRC curve with axes input level (dB) and output level (dB), each from 0dB to 100dB]

Design: multiband DRC, attack time, release time, …
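
The input-to-output level mapping behind DRC can be sketched as a static, single-band compression curve. This is only an illustration: the function name `drc_output_level` and the threshold, ratio and gain values are assumptions, not values from the slides, and the design aspects listed above (multiband splitting, attack and release time) are left out.

```python
def drc_output_level(input_db, threshold_db=40.0, ratio=3.0, gain_db=20.0):
    """Static compression curve: output level (dB) for a given input level (dB).

    Below the threshold the signal receives a fixed gain (linear region).
    Above it, every `ratio` dB of additional input yields only 1 dB of
    additional output. All default values are hypothetical.
    """
    if input_db <= threshold_db:
        return input_db + gain_db
    return threshold_db + gain_db + (input_db - threshold_db) / ratio


if __name__ == "__main__":
    # Soft sounds are amplified fully, loud sounds much less, so a wide
    # input range is squeezed into a narrower output range.
    for level in (20.0, 40.0, 70.0, 100.0):
        print(f"{level:5.1f} dB in -> {drc_output_level(level):5.1f} dB out")
```

A practical hearing-aid compressor would apply such a curve per frequency band and smooth the level estimate over time, which is where the attack and release times come in.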


Hearing impairment : Audibility vs speech intelligibility
•  Audibility does not imply intelligibility
•  Hearing impaired subjects need 5..10dB larger signal-to-noise ratio (SNR) for speech understanding in noisy environments
•  Need for noise reduction (=speech enhancement) algorithms:
   –  State-of-the-art: monaural 2-microphone adaptive noise reduction
   –  Near future: binaural noise reduction (see below)
   –  Not-so-near future: multi-node noise reduction (see below)

Case Study: Hearing Instruments 6/12

[Figure: SNR (0dB to 20dB) versus hearing loss (dB, 3-freq-average; 30 to 90)]


HA technology requirements
•  Small form factor (cfr. user acceptance)
•  Low power: 1…5mW (cfr. battery lifetime ≈ 1 week)
•  Low processing delay: 10msec (cfr. synchronization with lip reading)

DSP challenges in hearing instruments
•  Dynamic range compression (cfr. supra)
•  Dereverberation: undo filtering (‘echo-ing’) by room acoustics
•  Feedback cancellation
•  Noise reduction

Case Study: Hearing Instruments 7/12


DSP Challenges: Feedback Cancellation
•  Problem statement: Loudspeaker signal is fed back into microphone, then amplified and played back again
•  Closed loop system may become unstable (howling)
•  Similar to feedback problem in public address systems (for the musicians amongst you)

Case Study: Hearing Instruments 8/12

[Figure: feedback model; labels: Model, F, −]

Similar to echo cancellation in GSM handsets, Skype,… but more difficult due to signal correlation


DSP Challenges: Noise reduction
Multimicrophone ‘beamforming’, typically with 2 microphones, e.g. ‘directional’ front microphone and ‘omnidirectional’ back microphone

Case Study: Hearing Instruments 9/12

“filter-and-sum” the microphone signals


Binaural hearing: Binaural auditory cues
•  ITD (interaural time difference)
•  ILD (interaural level difference)
•  Binaural cues (ITD: f < 1500Hz, ILD: f > 2000Hz) used for
   –  Sound localization
   –  Noise reduction = ‘Binaural unmasking’ (‘cocktail party’ effect) 0-5dB
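
To give a feel for the size of the ITD cue, the sketch below evaluates a simple free-field model in which the two ears are treated as point receivers and the head itself is ignored. The model, the function name `itd_seconds`, the 0.17 m ear spacing and the speed of sound are all illustrative assumptions rather than material from the slides.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def itd_seconds(azimuth_deg, ear_spacing_m=0.17):
    """Interaural time difference for a far-field source.

    azimuth_deg is measured from straight ahead (0 degrees), positive
    towards one side. Free-field approximation: the extra path length to
    the far ear is ear_spacing * sin(azimuth); head shadow and
    diffraction are ignored.
    """
    return ear_spacing_m * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND


if __name__ == "__main__":
    for azimuth in (0, 30, 60, 90):
        print(f"azimuth {azimuth:3d} deg -> ITD {itd_seconds(azimuth) * 1e6:7.1f} us")
```

Under these assumptions the ITD stays well below one millisecond for any direction, so a wireless link between the two devices has to preserve timing quite accurately if this cue is to survive.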

Case Study: Hearing Instruments 10/12

[Figure labels: ITD, ILD, signal]


Binaural hearing aids
•  Two hearing aids (L&R) with wireless link & cooperation
•  Opportunities:
   –  More signals (e.g. 2*2 microphones)
   –  Better sensor spacing (17cm instead of 1cm)
•  Constraints: power/bandwidth/delay of wireless link
   –  ..10kbit/s: coordinate program settings, parameters,…
   –  ..300kbit/s: exchange 1 or more (compressed) audio signals
•  Challenges:
   –  Improved localization through cue preservation
   –  Improved noise reduction + benefit from binaural unmasking
   –  Signal selection/filtering, audio coding, synchronisation, …

Case Study: Hearing Instruments 11/12


Future: Multi-node noise reduction – sensor networks

Case Study: Hearing Instruments 12/12


Overview
General speech communication set-up :
-  background ‘noise’ → noise suppression, source separation
-  far-end echoes → acoustic echo cancellation
-  reverberation → de-reverberation/deconvolution
Applications :
•  teleconferencing/teleclassing
•  hands-free telephony
•  hearing aids, etc..


Overview : Lecture-2

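Microphone Array Processing
•  Spatial filtering - Beamforming
•  Fixed vs. adaptive beamforming
•  Example filter-and-sum beamformer (application: hearing aids)

[Figure: filter-and-sum beamformer. A source S(ω) arrives from direction θ; the microphone signals Y_1(ω,θ), Y_2(ω,θ), …, Y_m(ω,θ), …, Y_M(ω,θ) pass through the filters F_1(ω), F_2(ω), …, F_m(ω), …, F_M(ω) and are summed (Σ) into the output Z(ω,θ). Microphone m is at distance d_m, giving a path difference d_m cos θ.]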
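
The filter-and-sum structure in the diagram can be illustrated with its simplest special case, a delay-and-sum beamformer, in which each filter F_m(ω) is a pure phase shift that undoes the propagation delay for a chosen look direction. The sketch below is a minimal example under stated assumptions: a far-field source, a linear array with microphone m at distance d_m from the reference microphone, a path difference d_m cos θ, and a particular sign convention for the delays. The function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def propagation_delays(mic_positions_m, theta_deg):
    """Relative delay (s) at each microphone for a far-field source at angle theta."""
    cos_theta = math.cos(math.radians(theta_deg))
    return [d * cos_theta / SPEED_OF_SOUND for d in mic_positions_m]


def delay_and_sum_response(mic_positions_m, freq_hz, look_deg, theta_deg):
    """Magnitude |Z/S| of a delay-and-sum beamformer steered to look_deg.

    Y_m = S * exp(-j*omega*tau_m(theta)) models the microphone signals and
    F_m = exp(+j*omega*tau_m(look)) / M are the (assumed) steering filters;
    the output is Z = sum_m F_m * Y_m.
    """
    omega = 2.0 * math.pi * freq_hz
    look = propagation_delays(mic_positions_m, look_deg)
    actual = propagation_delays(mic_positions_m, theta_deg)
    num_mics = len(mic_positions_m)
    z = sum(
        cmath.exp(1j * omega * t_look) / num_mics * cmath.exp(-1j * omega * t_act)
        for t_look, t_act in zip(look, actual)
    )
    return abs(z)


if __name__ == "__main__":
    mics = [0.0, 0.01]  # two microphones 1 cm apart (illustrative)
    for theta in (0, 45, 90, 135, 180):
        gain = delay_and_sum_response(mics, 4000.0, look_deg=0.0, theta_deg=theta)
        print(f"theta {theta:3d} deg -> |Z/S| = {gain:.3f}")
```

With only two closely spaced microphones the response varies slowly with θ, which is one reason practical designs use more general filters F_m(ω) than pure delays.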


Overview : Lecture-3

Noise Reduction
‘microphone_signal[k] = speech[k] + noise[k]’
•  Single-microphone noise reduction
   –  Spectral Subtraction Methods (spectral filtering)
   –  Iterative methods based on speech modeling (Wiener & Kalman Filters)
•  Multi-microphone noise reduction
   –  Beamforming revisited
   –  Optimal filtering approach : spectral+spatial filtering
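
As an illustration of spectral subtraction, the sketch below computes a per-frequency-bin gain from the power spectrum of one noisy frame and an estimate of the noise power spectrum. It shows one common variant (power subtraction with a spectral floor) and is not necessarily the formulation used in the lecture; the function name, the floor value and the assumption that a noise estimate is already available are all illustrative.

```python
import math


def spectral_subtraction_gains(noisy_power, noise_power, floor=0.05):
    """Per-bin amplitude gains for power spectral subtraction.

    noisy_power: |Y(bin)|^2 of the current frame (speech + noise)
    noise_power: estimated noise power per bin (assumed given, e.g.
                 averaged over speech-free frames)
    floor:       minimum gain, limits 'musical noise' (hypothetical value)
    """
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        if p_noisy <= 0.0:
            gains.append(floor)
            continue
        # Estimated clean power is p_noisy - p_noise, clipped at zero.
        gain_squared = max(1.0 - p_noise / p_noisy, 0.0)
        gains.append(max(math.sqrt(gain_squared), floor))
    return gains


if __name__ == "__main__":
    noisy = [4.0, 1.0, 0.5]   # illustrative per-bin powers
    noise = [1.0, 1.0, 1.0]
    print(spectral_subtraction_gains(noisy, noise))
```

Each gain multiplies the corresponding bin of the noisy spectrum, after which the frame is transformed back to the time domain; bins dominated by noise are attenuated down to the floor.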


Overview : Lecture-4

Acoustic Echo Cancellation
Adaptive filtering problem:
•  non-stationary/wideband/… speech signals
•  non-stationary/long/… acoustic channels

Topics:
•  Adaptive filtering algorithms
•  AEC Control
•  AEC Post-processing
•  Stereo AEC
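
The adaptive filtering problem can be made concrete with a normalized LMS (NLMS) echo canceller, one of the simplest adaptive algorithms. The sketch below is a toy set-up under stated assumptions: a short, fixed, hypothetical echo path, a white-noise far-end signal and no near-end speech. Real acoustic channels are far longer and time-varying, which is what motivates more advanced algorithms and control logic. All names and parameter values are illustrative.

```python
import random


def nlms_echo_canceller(far_end, mic, num_taps=8, step=0.5, eps=1e-8):
    """Adapt an FIR echo-path estimate with NLMS.

    far_end: loudspeaker (reference) samples
    mic:     microphone samples containing the echo
    Returns the final filter weights and the echo-cancelled error signal.
    """
    weights = [0.0] * num_taps
    buffer = [0.0] * num_taps  # most recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, mic):
        buffer = [x] + buffer[:-1]
        echo_estimate = sum(w * b for w, b in zip(weights, buffer))
        error = d - echo_estimate
        energy = sum(b * b for b in buffer) + eps  # normalization term
        weights = [w + step * error * b / energy for w, b in zip(weights, buffer)]
        errors.append(error)
    return weights, errors


def simulate_echo(far_end, echo_path):
    """Convolve the far-end signal with a (hypothetical) echo path."""
    return [
        sum(h * far_end[k - i] for i, h in enumerate(echo_path) if k - i >= 0)
        for k in range(len(far_end))
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    far = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    mic = simulate_echo(far, [0.5, -0.3, 0.1])
    w, e = nlms_echo_canceller(far, mic)
    print("estimated echo path:", [round(v, 3) for v in w])
    print("final residual echo:", abs(e[-1]))
```

In this idealized case the weights converge to the true echo path. With speech as the far-end signal and a near-end talker present, convergence is slower and adaptation has to be controlled.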


Overview : Lecture-5

Acoustic Feedback Cancellation
•  Ex: Hearing aids
•  Ex: PA systems
•  correlation between filter input (‘x’) and near-end signal (‘n’)
•  fixes : noise injection, pitch shifting, notch filtering, ...

[Figure label: amplifier]


Overview : Lecture-6

Reverb & De-reverberation
‘microphone_signal[k] = filter*speech[k] (+ noise[k])’
•  Reverb = effect of acoustic channel in between speaker and microphone(s)
•  Reverb has an impact on coding, speech recognition, etc.
•  Single-microphone de-reverberation
   –  Cepstrum techniques
•  Multi-microphone de-reverberation:
   –  Estimation of acoustic impulse responses
   –  Inverse-filtering method
   –  Matched filtering


Overview : Lecture-7

Active Noise Control
•  Solution based on ‘filtered-X LMS’
•  Application : active headsets/ear defenders


Overview : Lecture-7bis

3D Audio & Loudspeaker Arrays
•  Binaural synthesis
   –  …with headphones: head related transfer functions (HRTF)
   –  …with 2+ loudspeakers (‘sweet spot’): crosstalk cancellation


Overview : Lecture-8

Case Study: Signal Processing in Cochlear Implants

1-hour lecture by Cochlear Ltd (to be scheduled)


Aims/Scope (revisited)

Aim is 2-fold :
•  Speech & audio per se
•  Basic signal processing theory/principles :
   –  Optimal filtering / Kalman filters (linear/nonlinear)
      here : speech enhancement
      other : automatic control, spectral estimation, ...
   –  Advanced adaptive filter algorithms
      here : acoustic echo cancellation
      other : digital communications, ...
   –  Filtered-X LMS
      here : 3D audio
      other : active noise/vibration control


Lectures

Lectures: 7*2hrs + 1*1hr
–  PS: Time budget = (15hrs)*4 = 60 hrs

Course Material: Slides
–  Use version 2013-2014 !
–  Download from DASP webpage: http://homes.esat.kuleuven.be/~dspuser/dasp/


Prerequisites

•  H197 Signals & Systems (JVDW)
•  HJ09 Digital Signal Processing (I) (PW)
   signal transforms, sampling, multi-rate, DFT, …
•  HC63 DSP-CIS (MM)
   filter design, filter banks, optimal & adaptive filters


Literature

Literature (general) (available in DSP-CIS library)
•  Simon Haykin, ‘Adaptive Filter Theory’ (Prentice Hall 1996)
•  P.P. Vaidyanathan, ‘Multirate Systems and Filter Banks’ (Prentice Hall 1993)

Literature (specialized) (some available in DSP-CIS library)
•  S.L. Gay & J. Benesty, ‘Acoustic Signal Processing for Telecommunication’ (Kluwer 2000)
•  M. Kahrs & K. Brandenburg (Eds), ‘Applications of Digital Signal Processing to Audio and Acoustics’ (Kluwer 1998)
•  B. Gold & N. Morgan, ‘Speech and Audio Signal Processing’ (Wiley 2000)


Exercise Sessions/Project

Acoustic source localization
–  Direction-of-arrival estimation
–  Noise reduction
–  Echo cancellation
–  Simulated set-up

[Figure label: direction-of-arrival θ]
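
One basic way to estimate a direction-of-arrival θ with two microphones is to find the time delay that maximizes the cross-correlation of the two signals and convert it to an angle. The sketch below illustrates that idea only and is not the method prescribed for the project (which uses Matlab). The far-field model, the angle convention (θ measured from the array axis, so that delay = d cos θ / c) and all names are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def estimate_delay_samples(x1, x2, max_lag):
    """Lag (in samples) by which x2 trails x1, via the cross-correlation peak."""
    best_lag, best_value = 0, float("-inf")
    n = len(x1)
    for lag in range(-max_lag, max_lag + 1):
        value = 0.0
        for k in range(n):
            j = k + lag
            if 0 <= j < n:
                value += x1[k] * x2[j]
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag


def doa_degrees(delay_samples, fs_hz, mic_spacing_m):
    """Convert an inter-microphone delay to an angle from the array axis."""
    cos_theta = SPEED_OF_SOUND * delay_samples / (fs_hz * mic_spacing_m)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against estimation errors
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    # Toy signals: a single pulse that reaches microphone 2 three samples late.
    x1 = [0.0] * 50
    x2 = [0.0] * 50
    x1[20] = 1.0
    x2[23] = 1.0
    lag = estimate_delay_samples(x1, x2, max_lag=10)
    print("delay:", lag, "samples")
    print("DoA:", doa_degrees(lag, fs_hz=16000.0, mic_spacing_m=0.1), "degrees")
```

The angular resolution of this approach is limited by the sampling rate and the microphone spacing, since the delay is only resolved to whole samples.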


•  Runs over 4 weeks (non-consecutive)
•  Each week
   –  1 PC/Matlab session (supervised, 2.5hrs)
   –  2 ‘Homework’ sessions (unsupervised, 2*2.5hrs)
   PS: Time budget = 4*(2.5hrs+5hrs) = 30 hrs
•  ‘Deliverables’ after week 2 & 4
•  Grading: based on deliverables, evaluated during sessions
•  TAs:
   –  guiliano.bernardi@esat (English+Italian)
   –  alexander.bertrand@esat (English+Dutch)
PS: groups of 2

Acoustic Source Localization Project


Work Plan

–  Week 1: Design Matlab simulation set-up
–  Week 2: Direction-of-arrival (DoA) estimation *deliverable*
–  Week 3: DoA estimation + noise reduction
–  Week 4: DoA estimation + echo cancellation *deliverable*

Acoustic Source Localization Project

..be there !


•  Oral exam, with preparation time
•  Open book
•  Grading
      7 for question-1
      7 for question-2
   +  6 for project
   ___
   = 20

Exam


•  Oral exam, with preparation time
•  Open book
•  Grading
      7 for question-1
      7 for question-2
   +  6 for question-3 (related to project work)
   ___
   = 20

September Retake Exam


Website

1)  TOLEDO
2)  http://homes.esat.kuleuven.be/~dspuser/dasp/
    •  Contact: guiliano.bernardi@esat
    •  Slides (use ‘version 2013-2014’ !!)
    •  Schedule
    •  DSP-library
    •  FAQs (send questions to marc.moonen@esat)


Questions?

1)  Ask teaching assistant (during exercise sessions)
2)  E-mail questions to teaching assistant or marc.moonen@esat
3)  Make appointment: marc.moonen@esat, ESAT Room 01.69