Speech Coding Basics: A Tutorial. Mahdi Amiri. Supervisor: Dr. H. R. Rabiee. April 2009, Sharif University of Technology.


Page 1: Speech Coding Basics

Speech Coding Basics: A Tutorial

Mahdi Amiri

Supervisor: Dr. H. R. Rabiee

April 2009

Sharif University of Technology

Page 2: Speech Coding Basics

Page 2 of 30 Speech Coding Basics

Speech Coding: A Road Map

PCM, DPCM, ADPCM, LPC, CELP

Page 3: Speech Coding Basics

Pulse-code Modulation (PCM): Basics

Digital representation of an analog signal: sampling and quantization.

Parameters:
– Sampling rate (samples per second)
– Quantization levels (bits per sample)

Page 4: Speech Coding Basics

Pulse-code Modulation (PCM): Why Call it PCM?

[Figure: 4-bit PCM]
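The 4-bit PCM idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is a sine wave normalized to [-1, 1], and the helper names `pcm_encode` and `pcm_decode` are hypothetical, not taken from the slides. Each sample is mapped to one of 16 levels and written as a 4-bit code word.

```python
import math

def pcm_encode(samples, bits):
    """Map samples in [-1.0, 1.0] to integer codes 0 .. 2**bits - 1."""
    levels = 2 ** bits
    codes = []
    for x in samples:
        x = max(-1.0, min(1.0, x))             # clip to the quantizer range
        code = int((x + 1.0) / 2.0 * levels)   # index of the uniform step
        codes.append(min(code, levels - 1))    # x == 1.0 belongs to the top level
    return codes

def pcm_decode(codes, bits):
    """Reconstruct each sample at the midpoint of its quantization interval."""
    step = 2.0 / (2 ** bits)
    return [-1.0 + (c + 0.5) * step for c in codes]

# Illustrative input: one cycle of a sine wave, 16 samples per cycle.
signal = [math.sin(2 * math.pi * n / 16) for n in range(16)]
codes = pcm_encode(signal, bits=4)
words = [format(c, "04b") for c in codes]      # the 4-bit code words
decoded = pcm_decode(codes, bits=4)
```

With 4 bits the step size is 2/16 = 0.125, so the reconstruction error of any in-range sample is at most half a step.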

Page 5: Speech Coding Basics

Pulse-code Modulation (PCM): Bit per Second (bit/s)

How to choose a proper…
– Sampling rate
• 8 kHz?
– Quantization level
• 8 bit/sample?

Bit rate for 8000 Hz, 8-bit PCM:
– 64 kbit/s
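The 64 kbit/s figure is simply the sampling rate multiplied by the bits per sample. A minimal sketch of that arithmetic follows; the function name `pcm_bit_rate` and the `channels` parameter are assumptions added for illustration, and the audio CD line uses the well-known 44,100 Hz, 16-bit, stereo format for comparison.

```python
def pcm_bit_rate(sampling_rate_hz, bits_per_sample, channels=1):
    """Bit rate of uncompressed PCM in bit/s."""
    return sampling_rate_hz * bits_per_sample * channels

telephone = pcm_bit_rate(8000, 8)         # 8000 samples/s * 8 bit = 64000 bit/s
cd_audio = pcm_bit_rate(44100, 16, 2)     # 16-bit stereo audio CD
```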

Page 6: Speech Coding Basics

Pulse-code Modulation (PCM): Sampling Rate

Human Hearing Frequency Range
– 20 Hz to 20 kHz
– Play with “HearTest” to test your hearing
– Most people will find that their hearing is most sensitive around 1-4 kHz and that it is less sensitive at high and low frequencies.

Page 7: Speech Coding Basics

Pulse-code Modulation (PCM): Hearing Range

Page 8: Speech Coding Basics

Pulse-code Modulation (PCM): Sampling Rate

Human Vocal Range
– Normal: 80 Hz to 1100 Hz
– Charles Kellogg (14 kHz) (not verified)
– Guinness Book of Records
• Female: Georgia Brown (eight octaves, 25087 Hz)
• Male: Tim Storms (six octaves)

Page 9: Speech Coding Basics

Pulse-code Modulation (PCM): Common Sampling Rates

– 8,000 Hz: telephone, adequate for human speech
– 11,025 Hz
– 22,050 Hz: radio
– 32,000 Hz: miniDV digital video camcorder, DAT (LP mode)
– 44,100 Hz: audio CD, also most commonly used with MPEG-1 audio (VCD, SVCD, MP3)
– 48,000 Hz: digital sound used for miniDV, digital TV, DVD, DAT, films and professional audio
– 96,000 or 192,000 Hz: DVD-Audio, some LPCM DVD tracks, BD-ROM (Blu-ray Disc) audio tracks, and HD-DVD (High-Definition DVD) audio tracks
– 2.8224 MHz: SACD, 1-bit sigma-delta modulation process known as Direct Stream Digital, co-developed by Sony and Philips

Page 10: Speech Coding Basics

Pulse-code Modulation (PCM): Quantization Levels

Want to prevent human ear fatigue by minimizing quantization noise.

Signal-to-Noise Ratio = 6.02B dB, where B is the number of bits per sample. That is, SNR is approximately 6 dB per bit.
– 16-bit => 96 dB
– Above 36 dB is required
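The 6 dB per bit rule can be checked numerically. The sketch below is an illustration under assumptions, not part of the slides: it feeds a uniformly distributed full-range test signal through a uniform quantizer and measures the resulting SNR. For that input the measured value should land close to 6.02B dB; a full-scale sine would come out about 1.76 dB higher. The names `quantize` and `measured_snr_db` are hypothetical.

```python
import math
import random

def quantize(x, bits):
    """Uniform mid-rise quantizer for x in [-1, 1)."""
    step = 2.0 / (2 ** bits)
    return (math.floor(x / step) + 0.5) * step

def measured_snr_db(bits, n=20000, seed=0):
    """Measure signal power over quantization-noise power, in dB."""
    rng = random.Random(seed)
    signal_power = 0.0
    noise_power = 0.0
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)        # assumed test signal: full-range, uniform
        e = x - quantize(x, bits)         # quantization error
        signal_power += x * x
        noise_power += e * e
    return 10.0 * math.log10(signal_power / noise_power)

snr_by_bits = {b: measured_snr_db(b) for b in (6, 8, 12, 16)}
```

By this rule, 6 bits per sample is roughly the point where the 36 dB requirement is met.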

Page 11: Speech Coding Basics

Pulse-code Modulation (PCM): Good to Know

The average person cannot tell the difference between audio encoded at a bitrate above 192 kbit/s and the original CD/WAV.

Even if your headphones seal really well around your ears, they will probably only give you about 20 to 25 dB of insulation from external sound.

Page 12: Speech Coding Basics

Pulse-code Modulation (PCM): Images

Page 13: Speech Coding Basics

Pulse-code Modulation (PCM): μ-law, A-law

Nonuniform quantizers: difficult to make, expensive.

Solution: companding, i.e. compress the signal, apply a uniform quantizer, then expand.

Page 14: Speech Coding Basics

Pulse-code Modulation (PCM): μ-law, A-law

Page 15: Speech Coding Basics

Pulse-code Modulation (PCM): μ-law, A-law

– μ-law: North America and Japan
– A-law: Europe
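The compress, quantize, expand chain can be sketched as follows. The slides do not give the formula, so this uses the standard continuous μ-law curve, F(x) = sgn(x) ln(1 + μ|x|) / ln(1 + μ), with μ = 255 as an assumption. Practical telephone codecs use a piecewise-linear approximation of this curve, so treat the code as an illustration of the principle rather than a codec implementation. All function names are hypothetical.

```python
import math

MU = 255.0  # assumed value, as used for 8-bit mu-law telephony

def mu_compress(x, mu=MU):
    """Compress x in [-1, 1] with the continuous mu-law curve."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_expand(y, mu=MU):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def uniform_quantize(y, bits):
    """Uniform quantizer on [-1, 1], reconstructing at interval midpoints."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(int((y + 1.0) / step), levels - 1)
    return -1.0 + (index + 0.5) * step

def companded_quantize(x, bits=8):
    """Compress, quantize uniformly, then expand."""
    return mu_expand(uniform_quantize(mu_compress(x), bits))

# A quiet sample is reproduced more accurately with companding.
quiet = 0.01
err_uniform = abs(quiet - uniform_quantize(quiet, 8))
err_companded = abs(quiet - companded_quantize(quiet, 8))
```

The design point is that small amplitudes, which dominate speech, get finer effective steps, while loud samples get coarser ones.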

Page 16: Speech Coding Basics

Differential PCM (DPCM): Idea

Page 17: Speech Coding Basics

Differential PCM (DPCM): Basic Scheme

General Predictive Coding

Delta Modulation (DM): the predictor $\sum_i a_i x_{n-i}$ reduces to a one-sample delay, $z^{-1}$.

Problem?

Page 18: Speech Coding Basics

Differential PCM (DPCM): Better Structure
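The slide figures are not available in this transcript, so the following is a sketch under an assumption: that the better structure is the usual closed-loop arrangement, in which the encoder predicts from the reconstructed signal the decoder will have rather than from the original input. That keeps quantization errors from accumulating. The predictor here is simply the previous reconstructed sample, and the names `dpcm_encode` and `dpcm_decode` are hypothetical.

```python
import math

def dpcm_encode(samples, step):
    """Closed-loop DPCM with a previous-sample predictor."""
    codes = []
    prediction = 0.0                      # decoder starts from the same state
    for x in samples:
        diff = x - prediction             # prediction error
        code = round(diff / step)         # quantized prediction error
        codes.append(code)
        prediction += code * step         # track what the decoder will rebuild
    return codes

def dpcm_decode(codes, step):
    """Rebuild the signal by accumulating the quantized differences."""
    out = []
    prediction = 0.0
    for code in codes:
        prediction += code * step
        out.append(prediction)
    return out

signal = [math.sin(2 * math.pi * n / 64) for n in range(128)]
step = 1.0 / 64
codes = dpcm_encode(signal, step)
decoded = dpcm_decode(codes, step)
max_error = max(abs(a - b) for a, b in zip(signal, decoded))
largest_code = max(abs(c) for c in codes)
```

For this slowly varying input the difference codes stay small, while coding the samples directly with the same step size would need values up to 64. That smaller range is where the bit saving comes from.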

Page 19: Speech Coding Basics

Adaptive DPCM (ADPCM): Idea

Problem?

Page 20: Speech Coding Basics

Adaptive DPCM (ADPCM): Size of Quantization Step

ADM: $\Delta[n] = M \, \Delta[n-1]$, with $P = 2$, $Q = \tfrac{1}{2}$ and

$$M = \begin{cases} P > 1 & \text{if } c[n] = c[n-1] \\ Q < 1 & \text{if } c[n] \neq c[n-1] \end{cases}$$
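The step-size rule can be sketched as a one-bit adaptive delta modulator. This is an illustration under assumptions: the initial step `step0` and the clamping limits `min_step` and `max_step` are not in the slides and are added only to keep the step in a sensible range. Setting P = Q = 1 turns the same code into plain delta modulation with a fixed step, which is used here for comparison.

```python
import math

def next_step(step, bit, prev_bit, P, Q, min_step, max_step):
    """Apply step[n] = M * step[n-1], with M = P on equal bits, else Q."""
    if prev_bit is None:
        return step
    M = P if bit == prev_bit else Q
    return min(max(step * M, min_step), max_step)

def adm_encode(samples, step0=0.01, P=2.0, Q=0.5, min_step=1e-3, max_step=0.5):
    """Return the bit stream c[n] in {+1, -1} and the encoder's reconstruction."""
    bits, recon = [], []
    estimate, step, prev_bit = 0.0, step0, None
    for x in samples:
        bit = 1 if x >= estimate else -1
        step = next_step(step, bit, prev_bit, P, Q, min_step, max_step)
        estimate += bit * step
        bits.append(bit)
        recon.append(estimate)
        prev_bit = bit
    return bits, recon

def adm_decode(bits, step0=0.01, P=2.0, Q=0.5, min_step=1e-3, max_step=0.5):
    """The decoder repeats the same step adaptation from the bits alone."""
    recon, estimate, step, prev_bit = [], 0.0, step0, None
    for bit in bits:
        step = next_step(step, bit, prev_bit, P, Q, min_step, max_step)
        estimate += bit * step
        recon.append(estimate)
        prev_bit = bit
    return recon

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

signal = [math.sin(2 * math.pi * n / 100) for n in range(200)]
adm_bits, adm_recon = adm_encode(signal)
dm_bits, dm_recon = adm_encode(signal, P=1.0, Q=1.0)   # fixed-step DM
adm_mse = mse(signal, adm_recon)
dm_mse = mse(signal, dm_recon)
```

With a fixed step of 0.01 the plain modulator cannot follow the sine's slope, while the adaptive step grows during runs of equal bits and shrinks again when the bits alternate.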

Page 21: Speech Coding Basics

Speech Compression Concepts: Spectrogram, STFT

[Figure: 3D surface spectrogram of a part from a music piece.]
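A spectrogram is the magnitude of a short-time Fourier transform (STFT): the signal is cut into overlapping frames, each frame is windowed, and a Fourier transform is taken per frame. The sketch below is a minimal pure-Python version using a direct DFT. The frame length, hop size, Hann window, 8 kHz rate and two-tone test signal are all assumptions chosen for illustration.

```python
import cmath
import math

def stft_magnitudes(signal, frame_len=64, hop=32):
    """Return one magnitude spectrum (bins 0 .. frame_len // 2) per frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]                     # Hann window
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):                  # direct DFT, bin k
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames

fs = 8000  # assumed sampling rate in Hz
# Test signal: a 1000 Hz tone followed by a 2000 Hz tone.
signal = ([math.sin(2 * math.pi * 1000 * n / fs) for n in range(512)] +
          [math.sin(2 * math.pi * 2000 * n / fs) for n in range(512)])
spec = stft_magnitudes(signal)
peak_bins = [max(range(len(frame)), key=frame.__getitem__) for frame in spec]
peak_hz = [b * fs / 64 for b in peak_bins]     # bin spacing is fs / frame_len
```

Plotting `spec` with time on one axis and frequency on the other gives the spectrogram; in this test signal the peak moves from 1000 Hz in the early frames to 2000 Hz in the late ones.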

Page 22: Speech Coding Basics

Speech Compression Concepts: Spectrogram

[Figure: Spectrogram of a male voice saying ‘nineteenth century’.]

Page 23: Speech Coding Basics

Speech Compression Concepts: Spectrogram, Demonstration

– Bat Echolocation Call
– Flute by Jean Pierre Rampal
– Singing Voice
– Face!

Page 24: Speech Coding Basics

Speech Compression Concepts: Formant

Page 25: Speech Coding Basics

Linear Predictive Coding (LPC): Modeling

Page 26: Speech Coding Basics

Linear Predictive Coding (LPC): Modeling (Hiss or Buzz)

Speech = Formants + Residue

Buzzer, Filter

Chunks: 30 to 50 frames per second.

Predictor for each frame:

$$\hat{x}[n] = \sum_{i=1}^{P} a_i \, x[n-i]$$
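One common way to find the predictor coefficients a_1 .. a_P for a frame is the autocorrelation method solved with the Levinson-Durbin recursion. The slides do not say which method is intended, so this is a sketch of that standard choice, not the author's procedure. The synthetic test frame (a second-order autoregressive process with coefficients 1.3 and -0.6) and the frame length are assumptions made so the result can be checked.

```python
import random

def autocorrelation(frame, max_lag):
    """r[lag] = sum over t of frame[t] * frame[t - lag]."""
    n = len(frame)
    return [sum(frame[t] * frame[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]

def lpc_coefficients(frame, order):
    """Return (a_1 .. a_P, residual energy) for x^[n] = sum_i a_i x[n - i]."""
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)               # a[0] is unused
    error = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[i] * r[m - i] for i in range(1, m))
        k = acc / error                   # reflection coefficient
        new_a = a[:]
        new_a[m] = k
        for i in range(1, m):
            new_a[i] = a[i] - k * a[m - i]
        a = new_a
        error *= (1.0 - k * k)            # remaining prediction error energy
    return a[1:], error

# Synthetic frame: x[n] = 1.3 x[n-1] - 0.6 x[n-2] + noise.
rng = random.Random(0)
frame = [0.0, 0.0]
for _ in range(4000):
    frame.append(1.3 * frame[-1] - 0.6 * frame[-2] + rng.gauss(0.0, 1.0))
coeffs, residual_energy = lpc_coefficients(frame, order=2)
```

The coefficients describe the filter (the formants); what the predictor cannot explain is the residue.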

Page 27: Speech Coding Basics

Linear Predictive Coding (LPC): Modeling (Hiss or Buzz)

Page 28: Speech Coding Basics

Code Excited Linear Prediction (CELP)

Problem of LPC
– Where there is both Hiss and Buzz

Solution
– Encode residue

Method
– Vector Quantization (Codebook)
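The codebook idea can be sketched as a gain-shape vector quantizer: for each block of residue, pick the codebook entry and gain that match it best, and transmit only the index and the gain. This is a deliberate simplification. Actual CELP coders search the codebook by passing each candidate through the synthesis filter and comparing against the speech, which is not shown here. The random codebook, its size and the vector length are assumptions.

```python
import random

def best_codebook_entry(target, codebook):
    """Return (index, gain) minimizing ||target - gain * codeword||^2."""
    best_index, best_gain, best_err = None, 0.0, float("inf")
    for index, word in enumerate(codebook):
        energy = sum(w * w for w in word)
        if energy == 0.0:
            continue                                   # skip an all-zero entry
        corr = sum(t * w for t, w in zip(target, word))
        gain = corr / energy                           # optimal gain for this entry
        err = sum((t - gain * w) ** 2 for t, w in zip(target, word))
        if err < best_err:
            best_index, best_gain, best_err = index, gain, err
    return best_index, best_gain

# Hypothetical codebook: 64 random vectors of 8 samples each.
rng = random.Random(1)
codebook = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(64)]

# A residue block that happens to be a scaled copy of entry 17.
residue = [0.5 * v for v in codebook[17]]
index, gain = best_codebook_entry(residue, codebook)
```

With 64 entries the index costs 6 bits per block of 8 samples, far less than coding each residue sample separately.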

Page 29: Speech Coding Basics

Comparison: Sample Speech

A lathe is a big tool. Grab every dish of sugar.

Page 30: Speech Coding Basics

Comparison: Demonstration

– Original
– ADPCM
– LPC
– CELP

Page 31: Speech Coding Basics

Speech Coding Basics: A Tutorial

Thank You

FIND OUT MORE AT...

1. http://ce.sharif.edu/~m_amiri/

2. http://www.aictct.ir/dml/
