EE 2 LECTURE ON LOW RA TE CODING
N.M 25.1
PE Spring,1999
25D
ORGAN / B.GOLD LECTURE 23
University of CaliforniaBerkeley
College of EngineeringDepartment of Electrical Engineering
and Computer Sciences
rofessors : N.Morgan / B.GoldE225D
Low Rate Coding
Lecture 25
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.2
Algorithms
3 fferential
1
Modelation [Standard?]
ystemslling,rm Coding)
d Channel Vocoders LPedictive Coding.
ulars, STCders, LPC, Cepstral.
zation Algorithms.
Standard 2400bps Systems.n to Reduce arameters
tization
-Synthesiscoder.
25D
ORGAN / B.GOLD LECTURE 23
Digital Vocoder Date Rates Applications & Performance
0,000 bps
5,000-10,000 bps
2,000-5,000 bps
Wide-band, High Fidelity
Medium Band, Good Quality
Reasonable Quality
More Vulnerable to Environment.
Adoptive Di
1,000-2,000 bps
500-1000 bps
100-500 bps
2,400 bps (standard)
0,000-20,000 bps
Speech Transmission
Speech Transmission.Some Noise Immunity
Speeker I.D.? Secrecy.Pitch Errors Restricted Bandwidth.
Secrecy.
Extremely Restricted
Underwater Transmission?
Pulse Code
Split Band S(Part ModePart Wavefo
Voice ExciteVELP & REAdaptive Pr
Telephone Speech
CELP, MultipChannel VocoSpeech Diziti
Frame-fill for Transformatio# of Bits for P
or Busy Channels.
Super-Restricted Channels.
Vector Quan
RecognitionPhonetic Vo
Very Restricted Channels.
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.3
y be assigned
ore bits.]
25D
ORGAN / B.GOLD LECTURE 23
Frame - fill
* Start with a 2400 bps algorithm [Channel Vocoder].
* How do you get 2400bps?
Assume 400bps for excitation.
Assume 20ms frame or 50 frames/sec.
20005∅
------------ 40 bits/channel for a channel vocoder.=
Assume 16 channels 4016------ 2.5 bits/channel, bits ma=
on a perceptual basis. [Usually low channels need m
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.4
.
be reduced.
t B A.=
t B C.=
onstruct
C.
1050
25D
ORGAN / B.GOLD LECTURE 23
Question
How to reduce bit rate below 2400bps?
Algorithm- Transmit A&C
- Reconstruct B at Receiver
Control bits tell Receiver
So 40bits gets reduced to 22. Pitch can also
Measurement are model
- If B A reconstruc,≈
- If B C reconstruc,≈
-If B A≈ B C, rec≈∉
A B C
Channel nn
B by averaging A ∉
40bits 50× 2000= 42bits 25× =
at the transmitter.
how to handle the incomming data.
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.5
e same structure
re a few different
st structure for
sult
works best.
25D
ORGAN / B.GOLD LECTURE 23
Frame - fill for LPC
Whereas the channel vocoder synthesizer is always th
(a filler bank with malulaters for each b.p.filter) There a
LPC structures. For frame-fill, the trick is to find the be
finding a criterion of similarity.
a-parameters direct formk - parameters lattice formarea rationlog area ration
Experimental Re
Lattice structure
DRT Results
Table IV?
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.6
able.
d transmitter.
ns.
by 2:1.
ble, we
25D
ORGAN / B.GOLD LECTURE 23
Pattern Matching on Vector Quantization
Consider the 2400bps channel vocoder.
Assume 40bits per frame.
* This corresponds to 240 possible patterns.
Assume that 106 patterns are perceptually distinguish
If all 106 patterns were stored at both the receiver an
Algorithm
- Compare incoming pattern with All 106 stared patter
- Send the 20 bit address of the best stared pattern.
- Reconstruct this best pattern at receiver.Thus, bit rate is decreasing
If only 1000 patterns were perceptually distinguisha
gain from 40 to 10 bits per frame. [4:1 Reduction]
C.P. Smith (1960’s) Buzo et al (1980’s ?)
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.7
rs.
lter.
igrates towards unit circle.
d.
o transmit.
icedrmantsRCOR
25D
ORGAN / B.GOLD LECTURE 23
Kany - Caulter 600bps Vocoder
Includes LPC analysis
clever LPC parameters to fewer formant paramete
Vector quantization. - a form of frame fill.
Viewgraphs
Fig. 32.1 - Complete block diagram of Kamy - Cou
Fig. 32.2 - loci of poles is
Fig. 32.3 - Spectrum of all pole system as poles m
Vector Quantization - 128 formant patterns were store
Figure overall rate,
possible parameters t(7bits)
(including everything) to get 600 bps.
K10 1→
VofoPA
Viewgraph of Lattice
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.8
Control bits are sent to
frames. The sent frames
nd decides whether these
e than just two adjacent
ion rate is Variable, and
r quantized.
25D
ORGAN / B.GOLD LECTURE 23
Frame - Fill vs. Merging
In frame - fill, alternate frames are not sent - instead,
direct the receiver to “optimally” recreate the missing
can still be vector quantized.
In merging, the analyzer compares adjacent frames a
frames can be merged into a single frame. Perhaps mor
frames can be merged. This means that the transmiss
Control bits are needed. Merged framed can also be vecto
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.9
Synthesized
Speech
25D
ORGAN / B.GOLD LECTURE 23
Channel Signal 1
Channel Signal 2
VARIABLEBANDPASSFILTER 1
Channel Signal k
VARIABLEBANDPASSFILTER 2
VARIABLEBANDPASSFILTER k
Formant 1
Formant 2
Formant k
Figure 29.8: Parallel formant synthesizer.
Excitation
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.10
+
25D
ORGAN / B.GOLD LECTURE 23
Pu lse
N o ise Para lle l S tructu re for
M ost C onsonants
Serial S tructure fo r Vow els and N asalsSource
Source+
+
× ×
× ×
A1
A2
A3
A4
F igure 29.12 : S tructure o f K la tt Synthesizer.
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.11
Fant. Form [20]
G
etwork
NG
Σ
orkF1F2 F5
N1
25D
ORGAN / B.GOLD LECTURE 23
Figure 29.2: OVE II Speech Synthesizer of Gunnar
Pulseenerator
Fricative and Plosive N
SourceFilter 1
oiseenerator
Vowel Netw
Nasal Network
FO
FH F3 F4
NO N3 N2 N4
KO K1 K2
SourceFilter 2
SourceFilter 3
AH
AO
AN
AC
+-
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.12
A
.
rs
ultiplexer
Transmission C and Encoder
Media
25D
ORGAN / B.GOLD LECTURE 23
nalog to
Figure 32.1 : Kang-Coulter 600 bps Voice Digitizer
Filter
Input
Linear Prediction Excitation
Excitation Paramete
Pitch Period
Amplitude
Voice/Unvoice
M
Vocal-Tract-Filter Conversion From Speech
0.1 to 3.6kHz
Digital onverter
Predictive Analysis Filter
AnalysisResidual
Linear Predictive Coefficients to
Formant Frequencies
Information
Parameters
(a) Transmitter
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.13
r
Digital to
Filter
Speech
inear
Out
0.1 to 3.6kHz
Analog Converter
edictive nthesis ilter
TrM
25D
ORGAN / B.GOLD LECTURE 23
Figure 32.1 : Kang-Coulter 600bps Voice Digitize
(b) Receiver
P u lse-Tra in and L
Excitation Parameters
Pitch Period
Amplitude
Voice/Unvoice
Demultiplexer
Vocal-Tract-Filter Conversion From
PrSyF
and Decoder
Formant Frequencies to Linear Predictive
Coefficients
Information
Parameters
R an dom -N o ise E x c ita tio n
S igna l G enerato rs
ansmissionedia
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.14
Unity.
= 0Hz
25D
ORGAN / B.GOLD LECTURE 23
Figure 32.2 : Loci of the Poles as Approachesk10
f = 2000HzUnit Circle
Origianl Pole Positions
1
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.15
���������������� ��������������������������
wards Unit.Π Z Zi–( )
L 1=
n
∑=
te to unit circle.
25D
ORGAN / B.GOLD LECTURE 23
Figure 32.3 : Spectra of All-Pole System as Poles Migrate ToZn α1– Zn 1– α2Z
n 2–– … αn–
’as dn 1 Zi s migra,→
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.16
Predictor Coefficients
ts
Formants
Ratios
I OutputC
F ig ed on L PC analys is.(cont.)
25D
ORGAN / B.GOLD LECTURE 23
+x n( ) y n( )
a1
z 1–
a2
a3
ap
z 1– z 1–z 1–
y n( ) pk---kkkkkkk=
...
... ap 1–
(a) Direct-Form Digital Filter with Variable “a” Coefficien
Area
(b) Acoustic Tube with Variable Area Functions
nput
≈
≈
– AAp 1–A1
u re 29 .5 : Tw o configurations fo r a ll po le syn thesizers bas
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.17
eters
F ased on LPC analysis.
age 0
In
B Front (Lips)
Output
z 1–
1
K0
K0
-
r Coefficients
oiced Sounds.
ts.
25D
ORGAN / B.GOLD LECTURE 23
(c) All-Pole lattice Network with Variable “k” Param
igure 29.5 : Two configurations for all pole synthesizers b
+
Stage P-1 Stage 1 St
put
ack (Glettis)
... +
z 1–
+
+ + +z 1– z 1–
-
+
-+
+ +
++...
...
Kp 1– K1
Kp 1– K1
- -
-
ParcoPredictor Coefficients
Synthesis Strategy for Voiced Sounds For unv
From (a)
Vector quantization of the PARCOR Coefficien
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.18
e Digitizer
00-Bit-Per-Second
40Hz
7 bits/frame
1 bit/frame
4 bits/frame
5 bits/double frame
1 bit/double frame
30 bits/double frame
Voice Digitizer
25D
ORGAN / B.GOLD LECTURE 23
Figure 32.4 : Parameter Coding for 600bps Voic
Parameter
Coding
Frame rate
Vocal-tract-filter parameters
Excitation parameters
Voice/unvoice decision
Amplitude
Pitch
Synthronization
Total number of bits
Typical 2400-Bit-Per-Second 6
44.444Hz
40 bits/frame
1 bit/frame
6 bits/frame
6 bits/frame
1 bit/frame
54 bits/frame
Linear Predictive Encoder
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.19
tored patterns.
he actual pattern.
25D
ORGAN / B.GOLD LECTURE 23
V.Q.Algonithm
* Storage of the patterns.
Compare incoming pattern with All previously s
2N patterns stored.
* Receiver also has stored availavle.
Send the address of the stored pattern nearest to t
15
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.20
ract Parameter
Text to Speech
25D
ORGAN / B.GOLD LECTURE 23
Kang-Coulter 600bps Vocoder
Started with 2400bps
Reduce # of parameter to 4 or 5
Vector quantization
Frame-fill.
10 Vocal TLPC
has no speaker I.D.
Speech Recognizer Speech to Text
75bps
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.21
Algorithms
3
2
1 ystemsling,rm Coding)
d Vocoders
C hommerphic
lars, STCders, LPC, Cepstral.zation Algorithms.
n other Systems.
tization
-Synthesiscoder.
25D
ORGAN / B.GOLD LECTURE 23
Low Rate VocoderDate Rates Applications & Performance
0,000 bps
5,000-10,000 bps
2,000-5,000 bps
High Fidelity
Medium quality
Reasonable Quality
More Vulnerable to Environment.
ADPCM
1,000-2,000 bps
500-1000 bps
100-500 bps
,400 bps (standard)
0,000-20,000 bps
(Telephony)
Good NoiseSome Noise Immunity
Speecher I.D.? Secrecy.Pitch Errors.
Secrecy.
Extremely Restrected
Underwater
DPCM
Split Band S(Part ModelPart Wavefo
Voice Excite
Channel LP[High Intelligibits]
CELP, MultipoChannel VocoSpeech Digiti
Frame-FillTransformatio
or Busy Channels.Vector Quan
RecognitionPhanetic Vo
Very Restricted Channels.
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.22
0bps Excitationocal Tract
, - Interpolate.
A, B A.=
C, B C.=
A CB̂
F1 F2F1 F2+2
-----------------
n
25D
ORGAN / B.GOLD LECTURE 23
1000-2000bps Frame-fillstart with a complete System 240
400bps 2000bps V
Analyzer
2 Control Bits
2400bps 1200bps
Synthesizer
If neither is true
Is B A ?≈
Or is B C ≈ If B is close to
If B is close to
Or is neither true.
A
B
C
0 20ms 40ms
not sent
n
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.23
el Vocoder]
2000
25D
ORGAN / B.GOLD LECTURE 23
500-10Mbps
C.P. Smith Pattern Matching
2400bps Vocoder
2000bps Spectrum
50Hz Rate 40 Bits Per Frame
240 Different Spectrum [Chann
220 1 million
Vector Quantization LPC
50 40× =
Reduction of # of Parameters
EE 2 LECTURE ON LOW RA TE CODING
N.M 25.24
25D
ORGAN / B.GOLD LECTURE 23
Frame-fill for LPC ?
Predictor coefficients
SensitivityΚ-Parameter [PARCOR]
Area Retios.
Log Area Ratios.
’a s ↔