Pattern Recognition Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Institute of Electrical and Information Engineering Digital Signal Processing and System Theory Part 1: Introduction and Motivation

Pattern Recognition

Gerhard Schmidt

Christian-Albrechts-Universität zu Kiel, Faculty of Engineering, Institute of Electrical and Information Engineering, Digital Signal Processing and System Theory

Part 1: Introduction and Motivation

Digital Signal Processing and System Theory | Pattern Recognition | Introduction Slide 2

Introduction and Motivation

Contents of the Lecture „Pattern Recognition“

❑ Speech and audio signal paths in a car

❑ Contents of the lecture

❑ Boundary conditions of the lecture (exercises, exam, etc.)

❑ Notation used in the lecture

❑ Literature

❑ Example of medical, speech, and audio signal processing

Speech and Audio Signal Paths in a Car – Part 1

Into the car

Out of the car

Within the car

Speech and Audio Signal Paths in a Car – Part 2

Signal processing in the „receiving path“

Signal processing for enhancing the communication quality and the sound impression

Signal processing in the „sending path“

Speech dialog system and phone

Music and audio sources

Contents of the Lecture (Entire Term)

❑ Preprocessing for improving the „noise robustness“
  ❑ Single-channel noise suppression
  ❑ Beamforming

❑ Pattern recognition (using speech and speaker recognition as an example)
  ❑ Basics of speech production
  ❑ Feature extraction
  ❑ Codebook generation
  ❑ Generation of Gaussian mixture models (GMMs)
  ❑ Hidden Markov models (HMMs)

❑ Enhancing the playback of audio signals
  ❑ Extending the bandwidth of speech signals (as application of codebooks)
  ❑ Loudspeaker equalization

Boundary Conditions of the Lecture

❑ ECTS points
  ❑ 4 credit points

❑ Oral examination
  ❑ About 20 minutes per student
  ❑ After the term

❑ Talks (part of the exercise)
  ❑ About 10 minutes of talk plus 5 minutes of discussion
  ❑ Topics are available from now on

❑ Lecture slides
  ❑ Printed at the beginning of each lecture
  ❑ On the internet at dss.tf.uni-kiel.de

Notation – Part 1

Scalars:

❑ Signals:

❑ Impulse responses (time-variant, with coefficient index):

❑ Example for a (real) convolution:

Vectors (boldface and lowercase):

❑ Signal vectors:

❑ Impulse response vectors (time-variant):

❑ Example for a real convolution:

Matrices (boldface and uppercase):
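
The formulas of this slide did not survive extraction. As a hedged sketch only, assume the common convention that a time-variant FIR impulse response with N coefficients h_i(n) acts on a signal x(n) as y(n) = sum over i of h_i(n) x(n-i), and that the same output can be written as the inner product of an impulse response vector with a signal vector holding the N most recent samples. The function names, the array layout `h[n, i]`, and the random test data below are illustrative assumptions, not the lecture's notation.

```python
import numpy as np

def convolve_scalar(x, h):
    """Scalar form: y(n) = sum_i h[n, i] * x(n - i), with x(n) = 0 for n < 0."""
    num_samples, num_coeffs = h.shape
    y = np.zeros(num_samples)
    for n in range(num_samples):
        for i in range(num_coeffs):
            if n - i >= 0:
                y[n] += h[n, i] * x[n - i]
    return y

def convolve_vector(x, h):
    """Vector form: y(n) = h(n)^T x(n), x(n) = [x(n), x(n-1), ..., x(n-N+1)]^T."""
    num_samples, num_coeffs = h.shape
    # Prepend zeros so that the signal vector is defined for the first samples
    x_padded = np.concatenate([np.zeros(num_coeffs - 1), x])
    y = np.zeros(num_samples)
    for n in range(num_samples):
        x_vec = x_padded[n:n + num_coeffs][::-1]  # newest sample first
        y[n] = h[n] @ x_vec
    return y

rng = np.random.default_rng(seed=0)
x = rng.standard_normal(50)          # example signal
h = rng.standard_normal((50, 4))     # one coefficient vector per time index n

y_scalar = convolve_scalar(x, h)
y_vector = convolve_vector(x, h)
```

Both functions compute the same output; for a time-invariant impulse response (identical rows in `h`) the result matches the first samples of `np.convolve`.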

Notation – Part 2

Random variables and processes:

❑ Notation:

❑ Probability density function:

❑ Stationary random processes:

❑ Expected values of stationary random processes:

No differences between deterministic signals and random processes – different writing styles:
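
The expressions belonging to this slide were lost in extraction. The following is a minimal sketch of the standard textbook definitions for a real random process x(n) with probability density function f_x(x, n). The symbols f_x, m_x, and sigma_x are assumptions and may differ from the notation used in the lecture.

```latex
\begin{align}
  \text{Stationarity (first order):}\quad
    & f_x(x, n) = f_x(x) \quad \text{for all } n, \\
  \text{Mean:}\quad
    & m_x = \mathrm{E}\{x(n)\}
        = \int_{-\infty}^{\infty} x \, f_x(x) \, \mathrm{d}x, \\
  \text{Variance:}\quad
    & \sigma_x^2 = \mathrm{E}\{(x(n) - m_x)^2\}
        = \int_{-\infty}^{\infty} (x - m_x)^2 \, f_x(x) \, \mathrm{d}x .
\end{align}
```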

Notation – Part 3

Auto and cross correlation for real, stationary random processes:

❑ Auto-correlation function:

❑ Cross-correlation function:

❑ (Auto) power spectral density:

❑ (Cross) power spectral density:
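
The formulas for these four quantities are missing from the extracted text. As a sketch under stated assumptions, one common set of definitions for real, stationary processes x(n) and y(n) is given below. The symbols s_xx, s_xy, S_xx, S_xy, the lag kappa, and the normalized frequency Omega are assumptions, and the sign convention of the lag varies between textbooks.

```latex
\begin{align}
  s_{xx}(\kappa) &= \mathrm{E}\{x(n)\, x(n+\kappa)\}, \\
  s_{xy}(\kappa) &= \mathrm{E}\{x(n)\, y(n+\kappa)\}, \\
  S_{xx}(\Omega) &= \sum_{\kappa=-\infty}^{\infty} s_{xx}(\kappa)\, e^{-j\Omega\kappa}, \\
  S_{xy}(\Omega) &= \sum_{\kappa=-\infty}^{\infty} s_{xy}(\kappa)\, e^{-j\Omega\kappa}.
\end{align}
```

Because the processes are stationary, the correlation functions depend only on the lag and not on the time index n.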

Notation – Part 4

Stationary white noise:

❑ Auto-correlation function:

❑ Auto power spectral density:
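
The two expressions are not present in the extracted text. For zero-mean stationary white noise with variance sigma squared, the usual result is an auto-correlation function that equals sigma squared at lag zero and vanishes at all other lags, together with a constant auto power spectral density of sigma squared. The sketch below checks this numerically; the estimator functions, the block length, and the value of `sigma` are illustrative assumptions.

```python
import numpy as np

def estimate_acf(x, max_lag):
    """Biased time-average estimate of the auto-correlation for lags 0..max_lag."""
    num_samples = len(x)
    return np.array([np.dot(x[:num_samples - k], x[k:]) / num_samples
                     for k in range(max_lag + 1)])

def estimate_psd(x, block_len):
    """Averaged periodogram over non-overlapping blocks (rectangular window)."""
    num_blocks = len(x) // block_len
    blocks = x[:num_blocks * block_len].reshape(num_blocks, block_len)
    return np.mean(np.abs(np.fft.rfft(blocks, axis=1)) ** 2, axis=0) / block_len

rng = np.random.default_rng(seed=1)
sigma = 2.0                                  # assumed standard deviation
x = sigma * rng.standard_normal(200_000)     # zero-mean white Gaussian noise

acf = estimate_acf(x, max_lag=5)             # near sigma**2 at lag 0, near 0 elsewhere
psd = estimate_psd(x, block_len=1000)        # fluctuates around sigma**2 in every bin
```

The estimates only approach the ideal values as the number of samples grows; individual frequency bins of `psd` still scatter around the constant level.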

Literature – Part 1

Statistical signal theory:

❑ E. Hänsler: Statistische Signale: Grundlagen und Anwendungen, Springer, 2001 (in German)

❑ A. Papoulis: Probability, Random Variables, and Stochastic Processes, McGraw-Hill, 1965

Noise suppression, beamforming, adaptive filters:

❑ E. Hänsler, G. Schmidt: Acoustic Echo and Noise Control, Wiley, 2004

❑ S. Haykin: Adaptive Filter Theory, Prentice Hall, 2002

❑ A. Sayed: Fundamentals of Adaptive Filtering, Wiley, 2004

Application examples for speech processing:

❑ E. Hänsler, G. Schmidt: Topics in Acoustic Echo and Noise Control, Springer, 2006

❑ B. Iser, et al.: Bandwidth Extension of Speech Signals, Springer, 2008

❑ E. Hänsler, G. Schmidt: Speech and Audio Processing in Adverse Environments, Springer, 2008

❑ J. Benesty, et al.: Speech Enhancement, Springer, 2005

Literature – Part 2

Speech processing:

❑ L. R. Rabiner, R. W. Schafer: Digital Processing of Speech Signals, Prentice Hall, 1978

❑ P. Vary, U. Heute, W. Hess: Digitale Sprachsignalverarbeitung, Teubner, 1998 (in German)

❑ P. Vary, R. Martin: Digital Speech Transmission, Wiley, 2006

❑ L. R. Rabiner, R. W. Schafer: Introduction to Digital Speech Processing, Now, 2008

❑ B. Pfister, T. Kaufmann: Sprachverarbeitung, Springer, 2008 (in German)

Audio processing:

❑ U. Zölzer: DAFX – Digital Audio Effects, Wiley, 2002

❑ E. Larsen, R. M. Aarts: Audio Bandwidth Extension, Wiley, 2004

❑ M. Talbot-Smith: Audio Engineer’s Reference Book, Focal Press, 1998

Application Examples from Medical, Speech, and Audio Processing – Part 1

Hands-free telephony:

❑ Echo cancellation as well as noise and residual echo suppression

❑ Double talk and barge-in (interrupting a speech dialog system)

Medical signal processing:

❑ Brain computer interfaces

Speech recognition:

❑ Applications for a mobile phone

Audio signal processing:

❑ Loudspeaker equalization

❑ Demo of KiRAT (Kiel Real-time Audio Toolkit)

Application Examples from Medical, Speech, and Audio Processing – Part 2

Example 1

Hands-Free Telephony

Application Examples from Medical, Speech, and Audio Processing – Part 3

Hands-free telephony – a basic system:

[Block diagram: echo cancellation, summation point, noise and residual echo suppression, signal y(n)]
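
The block diagram of this slide is not recoverable from the text. As a rough illustration of the echo cancellation block only, the sketch below uses a normalized least-mean-square (NLMS) adaptive filter. NLMS is a common choice for this task, but it is an assumption here, since the slide does not name an algorithm. The function name, its parameters, and the simulated echo path `h_room` are hypothetical, and the microphone signal contains only echo (no local speech, no noise).

```python
import numpy as np

def nlms_echo_canceller(x, y, num_coeffs=32, step_size=0.5, eps=1e-6):
    """NLMS sketch: returns e(n) = y(n) - h_hat(n)^T x(n) and the final h_hat."""
    h_hat = np.zeros(num_coeffs)       # estimated echo path
    x_vec = np.zeros(num_coeffs)       # most recent loudspeaker samples
    e = np.zeros(len(y))
    for n in range(len(y)):
        x_vec = np.roll(x_vec, 1)      # shift the signal vector by one sample
        x_vec[0] = x[n]
        e[n] = y[n] - h_hat @ x_vec    # subtract the estimated echo
        # Normalized gradient step; eps avoids division by zero
        h_hat += step_size * e[n] * x_vec / (x_vec @ x_vec + eps)
    return e, h_hat

rng = np.random.default_rng(seed=0)
x = rng.standard_normal(5000)                               # loudspeaker signal
h_room = rng.standard_normal(32) * 0.5 ** np.arange(32)     # hypothetical echo path
y = np.convolve(x, h_room)[:len(x)]                         # microphone signal: echo only

e, h_hat = nlms_echo_canceller(x, y)
```

In this idealized setting the error signal decays towards zero as the filter converges. With local speech or background noise present, the residual would not vanish, which is where the noise and residual echo suppression stage comes in.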

Application Examples from Medical, Speech, and Audio Processing – Part 4

Transmission to the communication partner (channel delay: about 180 ms)

Remote communication partner

Received signal („Hearing channel“ of the remote communication partner)

Initial filter convergence:

Adaptation at the beginning of the call

Without Wiener filter

With Wiener filter

Enclosure dislocations:

Stereo signals (16 kHz):

Left: Received signal of the remote communication partner

Right: Sent signal of the remote communication partner

Double talk:

Both partners speak simultaneously

Application Examples from Medical, Speech, and Audio Processing – Part 7

Example 2

Pattern Recognition for Medical Applications

Application Examples from Medical, Speech, and Audio Processing – Part 8

Electro-encephalography (EEG)

Magneto-encephalography (MEG)

Electro-cardiography (ECG)

Magneto-cardiography (MCG)

Application Examples from Medical, Speech, and Audio Processing – Part 9a

What are these measures good for?

❑ Helping medical doctors to distinguish better between diseases
  ❑ „Conventional“ measures
  ❑ Establishment of so-called early biomarkers

❑ To localize areas of interest in the heart or in the brain
  ❑ Networks that cause epileptic seizures, etc.
  ❑ Unwanted „excitation channels“ in the heart

❑ Brain-computer interfaces
  ❑ Control of electronic devices for handicapped people

Application Examples from Medical, Speech, and Audio Processing – Part 9b

Application Examples from Medical, Speech, and Audio Processing – Part 10

Example 3

Speech Recognition

Application Examples from Medical, Speech, and Audio Processing – Part 11

Video from/with:

❑ Raymond Brückner (SVOX)

❑ Andreas Löw (SVOX)

❑ Patrick Langer (SVOX)

Link to video

Application Examples from Medical, Speech, and Audio Processing – Part 12

Example 4

Audio Signal Processing

Application Examples from Medical, Speech, and Audio Processing – Part 13

Summary and Outlook

Summary:

❑ Speech and audio signal paths in a car

❑ Contents of the lecture

❑ Boundary conditions of the lecture (exercises, exam, etc.)

❑ Notation used in the lecture

❑ Literature

❑ Example of medical, speech, and audio signal processing

Next week:

❑ Noise suppression