Pattern Recognition

Gerhard Schmidt

Christian-Albrechts-Universität zu Kiel, Faculty of Engineering, Institute of Electrical and Information Engineering, Digital Signal Processing and System Theory

Part 1: Introduction and Motivation

Digital Signal Processing and System Theory | Pattern Recognition | Introduction Slide 2

Introduction and Motivation

Contents of the Lecture „Pattern Recognition“

❑ Speech and audio signal paths in a car

❑ Contents of the lecture

❑ Boundary conditions of the lecture (exercises, exam, etc.)

❑ Notation used in the lecture

❑ Literature

❑ Example of medical, speech, and audio signal processing

Speech and Audio Signal Paths in a Car – Part 1

Into the car

Out of the car

Within the car

Speech and Audio Signal Paths in a Car – Part 2

Signal processing in the „receiving path“

Signal processing for enhancing the communication quality and the sound impression

Signal processing in the „sending path“

Speech dialog system and phone

Music and audio sources

Contents of the Lecture (Entire Term)

❑ Preprocessing for improving the „noise robustness“

    ❑ Single-channel noise suppression

    ❑ Beamforming

❑ Pattern recognition (using speech and speaker recognition as an example)

    ❑ Basics of speech production

    ❑ Feature extraction

    ❑ Codebook generation

    ❑ Generation of Gaussian mixture models (GMMs)

    ❑ Hidden Markov models (HMMs)

❑ Enhancing the playback of audio signals

    ❑ Extending the bandwidth of speech signals (as application of codebooks)

    ❑ Loudspeaker equalization

Boundary Conditions of the Lecture

❑ ECTS points

    ❑ 4 credit points

❑ Oral examination

    ❑ About 20 minutes per student

    ❑ After the term

❑ Talks (part of the exercise)

    ❑ About 10 minutes of talk plus 5 minutes of discussion

    ❑ Topics are available now

❑ Lecture slides

    ❑ Printed at the beginning of each lecture

    ❑ On the internet at dss.tf.uni-kiel.de

Notation – Part 1

Scalars:

❑ Signals:

❑ Impulse responses (time-variant, with a coefficient index):

❑ Example for a (real) convolution:

Vectors (boldface and lowercase):

❑ Signal vectors:

❑ Impulse response vectors (time-variant):

❑ Example for a real convolution:

Matrices (boldface and uppercase):
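
The formulas of this slide are not preserved in the text, so the following is only a sketch of how the scalar and the vector form of a real convolution relate. The symbols x(n), h_i and y(n), the filter length N, the function names, and the example values are assumptions chosen for illustration, and the impulse response is kept time-invariant for brevity.

```python
import numpy as np

def convolve_scalar(x, h, n):
    """Scalar form: y(n) = sum over i of h_i * x(n - i)."""
    y = 0.0
    for i in range(len(h)):          # i is the coefficient index
        if n - i >= 0:               # samples before n = 0 are taken as zero
            y += h[i] * x[n - i]
    return y

def convolve_vector(x, h, n):
    """Vector form: y(n) = h^T x(n) with x(n) = [x(n), x(n-1), ..., x(n-N+1)]^T."""
    N = len(h)
    x_vec = np.array([x[n - i] if n - i >= 0 else 0.0 for i in range(N)])
    return float(h @ x_vec)          # inner product of the two vectors

x = np.array([1.0, 2.0, 3.0, 4.0])   # example signal (assumed values)
h = np.array([0.5, 0.25])            # example impulse response (assumed values)

y_scalar = [convolve_scalar(x, h, n) for n in range(len(x))]
y_vector = [convolve_vector(x, h, n) for n in range(len(x))]
```

Both forms compute the same output samples; the vector form only collects the last N input samples into one signal vector so that the sum becomes an inner product.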

Notation – Part 2

Random variables and processes:

❑ Notation:

❑ Probability density function:

❑ Stationary random processes:

❑ Expected values of stationary random processes:

No differences between deterministic signals and random processes – different writing styles:

Notation – Part 3

Auto and cross correlation for real, stationary random processes:

❑ Auto-correlation function:

❑ Cross-correlation function:

❑ (Auto) power spectral density:

❑ (Cross) power spectral density:
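
The formulas for these four quantities are not preserved in the text. As a reminder, the usual textbook definitions for real, stationary random processes can be sketched as follows. The symbols s_xx, s_xy, S_xx, S_xy, the lag κ, the normalized frequency Ω, and the sign convention of the lag are assumptions and may differ from the notation used in the lecture.

```latex
\begin{align}
  s_{xx}(\kappa) &= \mathrm{E}\{\, x(n)\, x(n+\kappa) \,\}, \\
  s_{xy}(\kappa) &= \mathrm{E}\{\, x(n)\, y(n+\kappa) \,\}, \\
  S_{xx}(\Omega) &= \sum_{\kappa=-\infty}^{\infty} s_{xx}(\kappa)\, e^{-j\Omega\kappa}, \\
  S_{xy}(\Omega) &= \sum_{\kappa=-\infty}^{\infty} s_{xy}(\kappa)\, e^{-j\Omega\kappa}.
\end{align}
```

Because the processes are stationary, the correlation functions depend only on the lag κ and not on the time index n. The power spectral densities are the discrete-time Fourier transforms of the corresponding correlation functions.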

Notation – Part 4

Stationary white noise:

❑ Auto-correlation function:

❑ Auto power spectral density:
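
A small numerical check can illustrate the two properties listed for stationary white noise: the auto-correlation function is ideally a scaled unit impulse, and the auto power spectral density is flat. The sketch below assumes zero-mean Gaussian noise with a variance of 2 and a simple biased time-average estimator; none of these choices, nor the names used, come from the slide.

```python
import numpy as np

def estimate_autocorrelation(x, max_lag):
    """Biased time-average estimate of the auto-correlation for lags 0..max_lag."""
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / n for k in range(max_lag + 1)])

rng = np.random.default_rng(seed=0)
sigma2 = 2.0                                   # assumed noise variance
x = rng.normal(0.0, np.sqrt(sigma2), size=100_000)

max_lag = 5
s_xx = estimate_autocorrelation(x, max_lag)    # close to [sigma2, 0, 0, 0, 0, 0]

# Power spectral density from the estimated, symmetric auto-correlation:
# S_xx(Omega) = s_xx(0) + 2 * sum_{k >= 1} s_xx(k) * cos(Omega * k)
omega = np.linspace(0.0, np.pi, 64)
lags = np.arange(1, max_lag + 1)
S_xx = s_xx[0] + 2.0 * np.cos(np.outer(omega, lags)) @ s_xx[1:]
```

With a finite number of samples, the estimates at non-zero lags are small but not exactly zero, so the estimated spectrum is only approximately flat around the variance.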

Literature – Part 1

Statistical signal theory:

❑ E. Hänsler: Statistische Signale: Grundlagen und Anwendungen, Springer, 2001 (in German)

❑ A. Papoulis: Probability, Random Variables, and Stochastic Processes, McGraw-Hill, 1965

Noise suppression, beamforming, adaptive filters:

❑ E. Hänsler, G. Schmidt: Acoustic Echo and Noise Control, Wiley, 2004

❑ S. Haykin: Adaptive Filter Theory, Prentice Hall, 2002

❑ A. Sayed: Fundamentals of Adaptive Filtering, Wiley, 2004

Application examples for speech processing:

❑ E. Hänsler, G. Schmidt: Topics in Acoustic Echo and Noise Control, Springer, 2006

❑ B. Iser, et al.: Bandwidth Extension of Speech Signals, Springer, 2008

❑ E. Hänsler, G. Schmidt: Speech and Audio Processing in Adverse Environments, Springer, 2008

❑ J. Benesty, et al.: Speech Enhancement, Springer, 2005

Literature – Part 2

Speech processing:

❑ L. R. Rabiner, R. W. Schafer: Digital Processing of Speech Signals, Prentice Hall, 1978

❑ P. Vary, U. Heute, W. Hess: Digitale Sprachsignalverarbeitung, Teubner, 1998 (in German)

❑ P. Vary, R. Martin: Digital Speech Transmission, Wiley, 2006

❑ L. R. Rabiner, R. W. Schafer: Introduction to Digital Speech Processing, Now, 2008

❑ B. Pfister, T. Kaufmann: Sprachverarbeitung, Springer, 2008 (in German)

Audio processing:

❑ U. Zölzer: DAFX – Digital Audio Effects, Wiley, 2002

❑ E. Larsen, R. M. Aarts: Audio Bandwidth Extension, Wiley, 2004

❑ M. Talbot-Smith: Audio Engineer's Reference Book, Focal Press, 1998

Application Examples from Medical, Speech, and Audio Processing – Part 1

Hands-free telephony:

❑ Echo cancellation as well as noise and residual echo suppression

❑ Double talk and barge-in (interrupting a speech dialog system)

Medical signal processing:

❑ Brain computer interfaces

Speech recognition:

❑ Applications for a mobile phone

Audio signal processing:

❑ Loudspeaker equalization

❑ Demo of KiRAT (Kiel Real-time Audio Toolkit)

Application Examples from Medical, Speech, and Audio Processing – Part 2

Example 1

Hands-Free Telephony

Application Examples from Medical, Speech, and Audio Processing – Part 3

Hands-free telephony – a basic system:

[Block diagram with the blocks „Echo cancellation“ and „Noise and residual echo suppression“, a summation point, and the signal y(n)]
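
To make the echo cancellation block more concrete, the following sketch adapts a filter that estimates the echo of the loudspeaker signal and subtracts it from the microphone signal. It assumes an NLMS adaptive filter, which the slide does not specify, and it covers only the echo cancellation part, not the noise and residual echo suppression. The function name, the filter length, the step size, the room impulse response, and the reading of y(n) as the microphone signal are all hypothetical choices for illustration.

```python
import numpy as np

def nlms_echo_canceller(x, y, num_taps=8, step_size=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of x from the microphone signal y.

    Returns the error signal e(n) and the final filter estimate.
    """
    h_hat = np.zeros(num_taps)        # estimated impulse response
    x_vec = np.zeros(num_taps)        # signal vector [x(n), ..., x(n-N+1)]
    e = np.zeros(len(y))
    for n in range(len(y)):
        x_vec = np.roll(x_vec, 1)     # shift the delay line by one sample
        x_vec[0] = x[n]
        echo_estimate = h_hat @ x_vec
        e[n] = y[n] - echo_estimate   # output after echo cancellation
        # Normalized LMS update of the filter coefficients
        h_hat = h_hat + step_size * e[n] * x_vec / (x_vec @ x_vec + eps)
    return e, h_hat

rng = np.random.default_rng(seed=1)
x = rng.normal(size=5000)                       # loudspeaker signal (white noise here)
h_true = np.array([0.6, -0.3, 0.2, 0.1])        # hypothetical echo path
echo = np.convolve(x, h_true)[:len(x)]
y = echo + 0.001 * rng.normal(size=len(x))      # microphone signal: echo plus weak noise

e, h_hat = nlms_echo_canceller(x, y)
```

In this toy setup there is no local speech, so the error signal should decay towards the level of the background noise once the filter has converged. Real speech signals and double talk make the adaptation considerably harder than this white-noise example suggests.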

Application Examples from Medical, Speech, and Audio Processing – Part 4

Transmission to the communication partner (channel delay: about 180 ms)

Remote communication partner

Received signal („Hearing channel“ of the remote communication partner)

Initial filter convergence:

❑ Adaptation at the beginning of the call

❑ Without Wiener filter

❑ With Wiener filter

Enclosure dislocations:

Stereo signals (16 kHz):

❑ Left: Received signal of the remote communication partner

❑ Right: Sent signal of the remote communication partner

Double talk:

❑ Both partners speak simultaneously

Application Examples from Medical, Speech, and Audio Processing – Part 7

Example 2

Pattern Recognition for Medical Applications

Application Examples from Medical, Speech, and Audio Processing – Part 8

❑ Electroencephalography (EEG)

❑ Magnetoencephalography (MEG)

❑ Electrocardiography (ECG)

❑ Magnetocardiography (MCG)

Application Examples from Medical, Speech, and Audio Processing – Part 9a

What are these measures good for?

❑ Helping medical doctors to distinguish better between diseases

    ❑ „Conventional“ measures

    ❑ Establishment of so-called early biomarkers

❑ To localize areas of interest in the heart or in the brain

    ❑ Networks that cause epileptic seizures, etc.

    ❑ Unwanted „excitation channels“ in the heart

❑ Brain-computer interfaces

    ❑ Control of electronic devices for handicapped people

Application Examples from Medical, Speech, and Audio Processing – Part 9b

Application Examples from Medical, Speech, and Audio Processing – Part 10

Example 3

Speech Recognition

Application Examples from Medical, Speech, and Audio Processing – Part 11

Video from/with:

❑ Raymond Brückner (SVOX)

❑ Andreas Löw (SVOX)

❑ Patrick Langer (SVOX)

Link to video

Application Examples from Medical, Speech, and Audio Processing – Part 12

Example 4

Audio Signal Processing

Application Examples from Medical, Speech, and Audio Processing – Part 13

Summary and Outlook

Summary:

❑ Speech and audio signal paths in a car

❑ Contents of the lecture

❑ Boundary conditions of the lecture (exercises, exam, etc.)

❑ Notation used in the lecture

❑ Literature

❑ Example of medical, speech, and audio signal processing

Next week:

❑ Noise suppression
