Dept. for Speech, Music and Hearing Quarterly Progress and Status Report Glottal source and excitation analysis Fant, G. journal: STL-QPSR volume: 20 number: 1 year: 1979 pages: 085-107 http://www.speech.kth.se/qpsr


STL-QPSR 1/1979

III. SPEECH PRODUCTION

A. GLOTTAL SOURCE AND EXCITATION ANALYSIS

G. Fant

Abstract

The source-filter concept of the production of voiced sounds is extended to include a derivation of the exact wave shape from Laplace transforms operating on a parameterized model of the glottal source. The conventional spectral representation of source and sound is compared with a specification in terms of discrete excitations and analysis of the output as the sum of damped oscillations and sinusoidal elements representing the residue of the voice source. The latter elements constitute a low-frequency "glottal pulse formant" and provide for characteristic properties of the waveform of voiced sounds. With increasing voice effort the glottal pulse amplitude increases much less than the amplitude of formant oscillation, which is associated with a relative deemphasis of the glottal pulse formant. Above F1 the greater part of the model generates source spectrum slopes of constant -12 dB/oct.

The time-variable properties of glottal damping are analyzed in detail. The non-stationarity of the glottal impedance during a voice cycle will in practice "fill out" the sharp source zeroes that appear in conventional source spectrum analysis and adds to the dominance of the closure excitation. Time-domain calculations of waveshapes agree with experimental data. A parametric analysis of voice source variations in connected speech is exemplified.

Introduction

Strategic problems and techniques have their definite attraction throughout the history of research and a tendency to be considered solved and then rediscovered. Studies of the human voice source by means of inverse filtering are one such area. The first detailed report comes from John Miller at Bell Labs in 1959. A short note by Fant (1959) appeared at this time.

In the very first issue of STL-QPSR, No. 1/1960, Cederlund, Krokstad, and Kringlebotn illustrated the wave shape and spectrum of sung vowels after inverse filtering. I quote them: "Confirming earlier observations, e.g. Miller (1959), it was found that the apparent starting point of the formant oscillation within a voice cycle is typically close to the instant of closing of the vocal cords. During the open glottis interval the formant oscillation is often markedly reduced in amplitude. The coupling to the subglottal system thus causes appreciable damping. As could be expected owing to the flow dependent resistance this damping is larger at low voice efforts than at high voice efforts. The work is continuing with theoretical studies of the time domain characteristics of vowels from Laplace transforms assuming excitation functions of various shapes." These problems have attained a renewed interest and occupy a larger part of the present article.

Our most important available data from various types of voices, vocal effort and pitch stem from Holmes (1962; 19?6), Rothenberg (1973), Monsen and Engebretson (1977), Lindqvist (1970), and Sundberg and Gauffin (1978) using the Rothenberg inverse filtering mask. An experimental study of the relation of optical glottography to inverse filtering was reported by Fant and Sonesson (1962). Detailed studies of source spectra of vowels by means of frequency domain deconvolution were undertaken by Mártony (1964; 1965). His studies included some observations of dynamic variations in connected speech (196?).

The present study started with a pilot experiment that revealed several characteristics of interest, see Fig. III-A-1, and stimulated a deeper analysis of glottal excitation and damping.

A real-time high speed recording on a Mingograph (Oscillomink) was made of a few words spoken through a condenser microphone and subjected to processing in two channels, one for a fixed inverse filtering with F1 = 500 Hz, the other one for preprocessing the speech oscillogram with some bass cut. The condenser microphone amplifier system had a response preserved down to 5 Hz.

Two facts became apparent. One was a confirmation that F1 of the vowel [a] was especially heavily damped in the glottal open part of the voice cycle, i.e. a tendency of "truncation" not present to the same extent in the vowel [?]. The other was the apparent constancy of the peak value of the glottal flow pulses within a breath group, independent of the F1 amplitude and extending all the way into the final offset of voicing. In the region of main stress in the second syllable the peak amplitude of glottal flow actually appears somewhat reduced. A separate experiment, see Fig. III-A-2, verified the tendency of truncation of the [a:] vowel.

Out of this experiment grew further studies of the "truncation" effects including a perceptual analysis together with Liljencrants (1979) which is reported in a separate article in this QPSR. A source model was developed for transform calculations and simulations to provide a basis for a general parameterization of voice production. Finally an experimental study was undertaken to confront this model with a speech material.

Fig. III-A-2. (1) Waveforms of vowels [a:] and [?]. (2) a) Vowel [a:] with breathy voice; b) vowel [a:] with somewhat strained voice. (3) The vowel [a:] phonated with a swelling tone, subject JS.

1. How to define source and filter

The concept of speech as the product of a source and a filter function has certain theoretical limitations. Disregarding the acoustic to mechanical reaction on the vocal cord vibrations we are still faced with the problem of how to separate source and filter functions. The time-variable and non-linear glottal impedance is responsible for shaping the source waveform and also enters as an element of the vocal tract filter function. Accordingly, one way to define the source is by the hypothetical volume velocity flow passing through the glottis when working into a short circuit instead of into the vocal tract impedance. As specified by Thevenin's theorem this source current is fed into the vocal tract in parallel with the supraglottal impedance. This was the approach of Fant (1960). An alternative approach is that of Guerin et al. (1976). They calculated from the basic differential equations (as in Flanagan's model) an expression for the true volume velocity through the glottis. Their vocal tract transfer function is accordingly defined as the ratio of volume velocity output at the lips to the actual flow through the glottis. Since this glottal flow contains oscillatory components from vocal tract resonances it is apparent that the source defined in this way will depend on the filter function. On the other hand, in this model the glottal impedance is by definition excluded from the filter function. This could be an advantage for defining a proper source for terminal analog synthesizers but the consequences are not very practical. A source correction network controlled by the formant pattern F1, F2, etc. must be inserted between the ideal interaction-free source and the synthesis filter. A theoretically more satisfactory approach would be to introduce pitch synchronous additional formant damping to represent glottal losses as required by the usual source-filter definition.

The temporal details of the excitation process are lost in Guerin's source model, since he deals with Fourier transforms extended over a complete voice fundamental period.

The excitation process becomes clearer in a Laplace transform view. The interruption of the flow at the instant of the vocal cords reaching closure sets up damped oscillations in the supraglottal as well as in the subglottal system of the trachea, bronchi, and lungs. These are submitted to constant damping during the closed phase. When the glottis opens there is a slight detuning of both the supraglottal and subglottal resonances which now may penetrate the entire system and mix in with oscillations evoked at the glottal opening and peak.

There is as yet no firm evidence of any appreciable influence of subglottal resonances in the supraglottal output of voiced sounds. Except for aspirated segments, the glottal impedance is large enough to ensure an effective decoupling and is small enough to cause appreciable damping. Theoretically there could be some interaction when F1 of a vowel coincides in frequency with the first resonance of the subglottal system at about 600 Hz or when the glottis is quite open. Subglottal formants are often seen in the aspiration phase of the release of an unvoiced stop, Fant et al. (1972). The calculations performed by Wakita and Fant (1978) support the approximation of neglecting the subglottal impedance which otherwise should have entered as an element in series with the glottis impedance.

Most of our present insight in the glottal impedance derives from articles by Flanagan, e.g. (1965). The following paragraph is a résumé of a presentation by Wakita and Fant (1978). At glottal openings larger than about 0.02 cm² the differential signal resistance is dominated by turbulent losses which are of the same order as the "kinetic" resistance

$$R_g = \frac{k \rho v_g}{A_g} \tag{1}$$

$A_g$ is the instantaneous glottal area, $v_g$ the particle velocity of the air flow, and $\rho$ the density of air. The constant $k$ is set to 0.875. This resistance is twice the aerodynamic flow resistance relating total pressure drop to total flow.

The particle velocity is proportional to the square root of the pressure drop $P_G$.

$$v_g = \left(\frac{2 P_G}{k \rho}\right)^{1/2} \tag{2}$$

Unless there is a pressure drop in a supraglottal constriction, $P_G$ equals the subglottal pressure $P_S$. The volume velocity $U_g = A_g v_g$ through the glottis is then proportional to glottal area and to the square root of the subglottal pressure $P_S$. $U_g$ is the quantity we derive from inverse filtering.

Given a simultaneous or almost simultaneous measure of subglottal pressure (approximated by mouth pressure in adjacent unvoiced stops), it is possible to separate the $A_g$ and $v_g$ components of the glottal flow. The inductance $L_g = \rho d_g / A_g$ should also be taken into account. The reactance $\omega L_g$ becomes equal to $R_g$ in a frequency range around 700-1200 Hz. A series to parallel transformation is needed before applying the glottis impedance as a shunt to the vocal tract.

In the region of the first formant the supraglottal impedance may be approximated by a parallel LRC circuit. The bandwidth, see Fant (1960), Wakita and Fant (1978), contributed by the glottal parallel resistance $R_g$ is then

$$B_g = \frac{1}{2\pi R_g C} \tag{3}$$

where $C$ is the capacitance. Providing the articulation simulates a Helmholtz resonator, $C = V/(\rho c^2)$, where $V$ is the total volume. Another model is a single tube of length $l$, cross-sectional area $A$. Its characteristic impedance is $Z = \rho c / A$. The glottal bandwidth component of any formant is then

$$B_g = \frac{Z c}{\pi l R_g} = \frac{\rho c^2}{\pi V R_g} \tag{4}$$

or twice that of the simple resonator with the same volume.

Wakita and Fant (1978) found the maximum instantaneous values of glottal bandwidth at the flow peak to be of the order of 600 Hz for the vowel [?] ($A_g$ = 0.16 cm², $P_S$ = 6 cm H2O) and 200 Hz for the vowel [e]. These are the orders of magnitude adopted for detailed calculations of formant waveforms in this article. From Eqs. (2), (3), and (4) we see that the instantaneous value of glottal bandwidth $B_g(t)$ is proportional to the glottal area and inversely proportional to the square root of glottal pressure drop $P_G$ and the resonator volume $V$.

If $P_S$ had been raised to 12 cm H2O and $A_g$ decreased to 0.11 cm² the peak flow would have been the same and the glottal bandwidth decreased by a factor 2. Likewise a reduction of $P_S$ to 3 cm H2O and an increase of the glottal area to 0.23 cm² would also have preserved the peak flow but increased the losses by a factor 2. Such variations probably occur with varying emphasis and stress within a sentence.
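The scaling behind these two cases can be checked numerically. The sketch below is a minimal illustration, assuming the kinetic resistance $R_g = k\rho v_g/A_g$ with $v_g = (2P_G/k\rho)^{1/2}$ and a glottal bandwidth proportional to $1/R_g$. The air density value, the pressure conversion and all function names are assumptions made for the example, not taken from the article.

```python
import math

K_LOSS = 0.875          # constant k quoted in the text
RHO = 1.14e-3           # air density in g/cm^3 (assumed value)
DYN_PER_CM_H2O = 980.0  # approximate pressure of 1 cm H2O in dyn/cm^2


def particle_velocity(p_cm_h2o):
    """Particle velocity v_g in cm/s for a pressure drop given in cm H2O."""
    return math.sqrt(2.0 * p_cm_h2o * DYN_PER_CM_H2O / (K_LOSS * RHO))


def peak_flow(area_cm2, p_cm_h2o):
    """Volume velocity U_g = A_g * v_g in cm^3/s."""
    return area_cm2 * particle_velocity(p_cm_h2o)


def relative_bandwidth(area_cm2, p_cm_h2o):
    """Quantity proportional to B_g, namely 1/R_g = A_g / (k * rho * v_g)."""
    return area_cm2 / (K_LOSS * RHO * particle_velocity(p_cm_h2o))


def compare(reference, case):
    """Return (flow ratio, bandwidth ratio) of `case` relative to `reference`."""
    flow = peak_flow(*case) / peak_flow(*reference)
    bandwidth = relative_bandwidth(*case) / relative_bandwidth(*reference)
    return flow, bandwidth


if __name__ == "__main__":
    reference = (0.16, 6.0)  # (A_g in cm^2, P_S in cm H2O)
    for case in [(0.11, 12.0), (0.23, 3.0)]:
        print(case, compare(reference, case))
```

With the rounded areas quoted in the text the flow ratio stays within a few percent of 1, while the bandwidth ratio comes out close to 1/2 and 2 respectively.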

How do we derive mean or effective values of glottal bandwidths? This problem was the object of a perceptual study of truncated signals by Fant and Liljencrants (1979). The results support a model of bandwidth invariance with the ratio of peak to mean value of the decay envelope. Effective values of the order of 20-160 Hz were derived. In practice the glottal bandwidth increases with voice fundamental frequency and with the relative duration of the open part of the glottal cycle. At high glottal losses the latter parameters are more important than the particular scale value of $B_g(t)$.

3. A glottal source model. Transform equations

The model of glottal flow adopted for the calculations, Fig. III-A-3, has a smoothly rising branch

$$U_g(t) = \tfrac{1}{2} U_0 (1 - \cos \omega_g t), \qquad 0 \le t \le T_2 \tag{5a}$$

up to the time $T_2 = \pi/\omega_g$ followed by a falling branch

$$U_g(t) = U_0 \{K \cos[\omega_g (t - T_2)] - K + 1\}, \qquad T_2 \le t \le T_3 \tag{5b}$$

which hits the zero line at $T_3 = T_2 + \frac{1}{\omega_g} \arccos\left(\frac{K-1}{K}\right)$. The parameters are the peak value $U_0$, the pulse rise frequency $F_g = \omega_g/2\pi = 1/(2T_2)$ defined by the duration of the rising branch, and $K$ which specifies the steepness of the falling branch. When $K = 0.5$ the falling branch is symmetrical to the rising branch and with $K = \infty$ the falling branch is a step function from the peak to closure.

The steepness of the falling branch at the instant of closure, defined by the ratio $U_0/T_d$, is a major determinant of excitation strength.
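The two-branch pulse is easy to evaluate directly. The following sketch is an illustration only: it assumes the rising branch $\tfrac{1}{2}U_0(1-\cos\omega_g t)$ and the falling branch $U_0\{K\cos[\omega_g(t-T_2)]-K+1\}$ described above, and the function names and default parameter values are hypothetical choices for the example.

```python
import math


def pulse_times(fg, k):
    """Return (T2, T3): end of the rising branch and instant of closure."""
    if k < 0.5:
        raise ValueError("K must be at least 0.5")
    wg = 2.0 * math.pi * fg
    t2 = math.pi / wg
    t3 = t2 + math.acos((k - 1.0) / k) / wg
    return t2, t3


def glottal_flow(t, u0=1.0, fg=125.0, k=1.0):
    """Glottal volume velocity at time t (seconds) for one pulse starting at t=0."""
    wg = 2.0 * math.pi * fg
    t2, t3 = pulse_times(fg, k)
    if t < 0.0 or t >= t3:
        return 0.0  # closed glottis
    if t <= t2:
        return 0.5 * u0 * (1.0 - math.cos(wg * t))  # rising branch
    return u0 * (k * math.cos(wg * (t - t2)) - k + 1.0)  # falling branch


def closure_slope(u0=1.0, fg=125.0, k=1.0):
    """Flow derivative just before closure: -U0 * wg * sqrt(2K - 1)."""
    return -u0 * 2.0 * math.pi * fg * math.sqrt(2.0 * k - 1.0)
```

The closed-form slope follows from differentiating the falling branch at $T_3$, where $\sin[\omega_g(T_3-T_2)] = \sqrt{2K-1}/K$; its magnitude is the ratio $U_0/T_d$ referred to in the text.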

The standard solution is a set of exponentials

$$h(t) = \sum_n A_n e^{s_n t}$$

where $A_n$ are the complex amplitude factors of the complex pole frequencies $s_n$, including poles at zero frequency.

The second alternative is based on the form $H(s) = C(s)/B(s)$. Combining Eqs. (9), (11), (12), and (16) and assuming a single formant transfer function,

$$H_1(s) = \frac{s \cdot \tfrac{1}{2} U_0\, \omega_g^2\, \omega_0^2}{s (s^2 + \omega_g^2)(s + \alpha_1 - j\omega_1)(s + \alpha_1 + j\omega_1)} \tag{19}$$

[Eqs. (20)-(24)]

and

$$h_1(t) = A_{11} e^{-\alpha_1 t} \sin(\omega_1 t + \varphi_1) + A_0 \sin(\omega_g t + \varphi_2)$$

[Eqs. (25), (26): the phase angles $\varphi_1$ and $\varphi_2$]

$A_{11}$ is the initial amplitude of the formant (F1) at point $T_1$ and $A_0$ the amplitude of the sinusoidal residue of the glottal pulse. In medium and low voice efforts it appears in the spectrum as a "baseband" formant at or somewhat above $F_g = \omega_g/2\pi$. The bandwidth of the F1 formant is

$$B_1 = \alpha_1/\pi \tag{27}$$

Since $H_2(s) = H_1(s)(1 - 2K)e^{-sT_2}$, see Eq. (9), we may immediately write

$$h_2(t) = (1 - 2K)\, h_1(t - T_2) \tag{28}$$

Similarly the main excitation 3b generates the time function

[Eqs. (29)-(31)]

where

$$A_{13b} = 2\sqrt{2K-1}\,\sqrt{1 + \alpha_1^2/\omega_1^2}\;\frac{\omega_1}{\omega_g}\, A_{11} \tag{32}$$

$$\varphi_3 = \operatorname{arctg}\frac{\alpha_1}{\omega_1} \tag{33}$$

and $\varphi_2$ as in Eq. (26). In the low-loss case $\varphi_3$ and $\alpha_1^2/\omega_1^2$ are negligible.

The total wave is thus written

$$h_1(t) + h_{12}(t - T_2) + h_{13a}(t - T_3) + h_{13b}(t - T_3) \tag{34}$$

The $h_0$-terms representing the voice source residue cancel outside the region of the glottal pulse, i.e. $t > T_3$.

In the low-loss case ($\alpha_1 \ll \omega_1$) the glottal source residue in the sound wave is

$$h_0(t) = A_0 \sin(\omega_g t + \varphi), \qquad 0 < t < T_2 \tag{35}$$

where $A_0 = \tfrac{1}{2} U_0\, \omega_g N_G$. In the second interval

$$h_0(t) = -A_0 \cdot 2K \sin[\omega_g (t - T_2) + \varphi], \qquad T_2 < t < T_3 \tag{36}$$

which conforms with a direct differentiation of the glottal source, Eq. (5).

The F1 oscillations at the glottal opening, peak, and closing may now be written.

[Eqs. (37), (38)]

$$h_{13a}(t) = -A_0 \frac{\omega_g}{\omega_1}(2K - 2)\, e^{-\alpha_1 (t - T_3)} \sin[\omega_1 (t - T_3) + \psi], \qquad t > T_3$$

$$h_{13b}(t) = -A_0 \cdot 2\sqrt{2K-1}\; e^{-\alpha_1 (t - T_3)} \cos[\omega_1 (t - T_3) + \psi], \qquad t > T_3 \tag{39}$$

One may combine $h_{13a}$ and $h_{13b}$ into a single damped cosine, $h_{13} = h_{13a} + h_{13b}$, with an added phase angle of the form $\operatorname{arctg}(d/b)$. (40)

For $K > 0.6$ and $\omega_1 > \omega_g$ the initial amplitude of F1 at the main point of excitation is simply

$$2 A_0 \sqrt{2K-1} = U_0\, \omega_g \sqrt{2K-1}\, N_G \approx U_0/T_d, \qquad \text{compare Eq. (7).} \tag{41}$$

An s-plane presentation of the source transform at $t = T_3$ combined with a conjugate pair of poles representing F1 is shown in Fig. III-A-4. At K-values close to 0.5, $h_{13}(t)$ of Eq. (40) is dominated by $h_{13a}(t) = -h_{11}(t - T_3)$. Otherwise $h_{13}(t)$ may be approximated by $h_{13b}(t)$. If we choose $\omega_1 = 2\omega_g$ and $K = 2$ the error is 4% only and more noticeably 18% if $\omega_1 = \omega_g$ and $K = 0.6$. $h_{13a}(t)$ vanishes for $K = 1$. These examples facilitate predictions of speech data.
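The size of the error made by keeping only the closure term can be estimated with a few lines of arithmetic. This is a sketch under stated assumptions: in the low-loss case the sine term (3a) is taken to have the relative amplitude $(\omega_g/\omega_1)(2K-2)$ and the cosine term (3b) the relative amplitude $2\sqrt{2K-1}$, both in units of $A_0$, as follows from splitting the closure discontinuity of the pulse in Eq. (5). The function names are hypothetical.

```python
import math


def e3_components(k, w1_over_wg):
    """Relative amplitudes of the sine (3a) and cosine (3b) terms, in units of A0.

    k is the steepness factor K (> 0.5), w1_over_wg the ratio omega_1/omega_g.
    """
    sine_part = (2.0 * k - 2.0) / w1_over_wg
    cosine_part = 2.0 * math.sqrt(2.0 * k - 1.0)
    return sine_part, cosine_part


def error_of_3b_only(k, w1_over_wg):
    """Relative amplitude error made by approximating h13 with h13b alone."""
    sine_part, cosine_part = e3_components(k, w1_over_wg)
    return math.hypot(sine_part, cosine_part) / cosine_part - 1.0


if __name__ == "__main__":
    print(error_of_3b_only(2.0, 2.0))
```

For $\omega_1 = 2\omega_g$ and $K = 2$ the two terms are 1 and $2\sqrt{3}$, so the combined amplitude exceeds the cosine term by about 4%, in line with the first figure quoted above. The sine term is exactly zero at $K = 1$.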

A "zero-order" approximation to inverse filtering is a simple integration of the speech wave. In the small loss case

[Eq. (42)]

The factor $N_G$ from Eq. (23) is close to 1 and $\varphi$ approaches 0 when $F_1 > F_g$. The falling branch of the restored glottal pulse is

$$U_0 \{K \cos[\omega_g (t - T_2) + \varphi] + 1 - K\} \tag{43}$$

which conforms with the glottal source pulse shape. For $t > T_3$, $h_0(t)$ vanishes as required. The F1 oscillations may be expressed as

[Eqs. (44)-(46)]

conforming to Eq. (40).

When $K > 1$ and $\omega_g < \omega_1$

[...]

Thus, in the integrated speech wave, since $\omega_g \sqrt{2K-1} = 1/T_d$, the ratio of the initial amplitude of the F1 ripple in the closed phase to the amplitude of the preceding glottal peak is simply $1/(\omega_1 T_d)$ and the F1 ripple starts as a minus sine function. This has been confirmed accurately from experimental data, see Figs. III-A-9 to III-A-18.

Fig. III-A-4. s-plane representation of the excitation at the instant of flow interruption and an F1 transfer function.

The significance of the factor $N_G$ representing the mutual reinforcement of the glottal pulse "formant" FG and F1 when they come close in frequency and of the starting phase $\varphi$ of the FG wave has not been studied closely.

The extension to a vocal tract transfer function with any number of formants is simple. The general transfer function, Eq. (11), at the complex frequency of the formant pole with this pole factor removed

[...]

replaces the same operation for the single pole pair. The usual relation between formant amplitudes and formant frequencies known from the frequency domain analysis, Fant (1960), enters. It should be observed that the initial amplitudes of the formants in the time domain are independent of their bandwidths.

A special case easy to handle is the neutral vowel, i.e. single tube resonator. Assume a length $l_e$ and an attenuation constant $\alpha'$. Considering the expression for poles

$$\frac{s_n l_e}{c} = \pm j(2n - 1)\frac{\pi}{2} - \alpha' l_e \tag{50}$$

with $F_1 = c/(4 l_e)$ and $B_1 = \alpha' c/\pi = \alpha_1/\pi$ we may rewrite Eq. (49)

$$H(s) = \frac{1}{\cosh\left[\dfrac{\pi B_1}{4 F_1} + \dfrac{s}{4 F_1}\right]} \tag{51}$$

With an impulse source the initial amplitudes of all formants are derived from $H(s) = 1/B(s)$:

$$\frac{1}{B'(s = s_n)} = \frac{4 F_1}{\sinh\left[j(2n-1)\dfrac{\pi}{2}\right]} = \frac{4 F_1}{j \sin\left[(2n-1)\dfrac{\pi}{2}\right]} \tag{52}$$

which indicates that all formants have starting amplitudes $8F_1$ and alternating signs

$$h(t) = 8F_1 e^{-\alpha_1 t} \sin \omega_1 t - 8F_1 e^{-\alpha_1 t} \sin 3\omega_1 t + 8F_1 e^{-\alpha_1 t} \sin 5\omega_1 t - \text{etc.} \tag{53}$$

This should be compared to the impulse response of a single conjugate pole function,

$$h(t) = \frac{\omega_0^2}{\omega_1} e^{-\alpha_1 t} \sin \omega_1 t \approx \omega_1 e^{-\alpha_1 t} \sin \omega_1 t \tag{54}$$

In speech the source and radiation enter a factor of $1/s$ in the transform and we are back to the minus cosine starting characteristic.

The extension from a single resonator filter to the complete neutral tract transfer function apparently adds an overall scale factor $8F_1/\omega_1 = 8/2\pi = 4/\pi$ to the inverse transform. This would upset the continuity between source residue and formant oscillations. However, the infinite sum of initial amplitudes $(1 - \tfrac{1}{3} + \tfrac{1}{5} - \tfrac{1}{7}$ etc.$)$ approaches $\pi/4$ which restores the continuity.
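The cancellation between the $4/\pi$ scale factor and the alternating series can be verified numerically. The sketch below is only an illustration of that arithmetic, assuming that the n-th formant of the neutral tract contributes an initial amplitude $\pm 1/(2n-1)$ relative to the first; the function name is hypothetical.

```python
import math


def summed_initial_amplitude(n_formants):
    """Scale factor 4/pi times the partial sum 1 - 1/3 + 1/5 - ... over n_formants terms."""
    partial_sum = sum((-1) ** n / (2 * n + 1) for n in range(n_formants))
    return (4.0 / math.pi) * partial_sum


if __name__ == "__main__":
    for count in (1, 4, 100, 100000):
        print(count, summed_initial_amplitude(count))
```

With a single formant the value is $4/\pi$, and it converges to 1 as more formants are included, although slowly, as expected for this alternating series.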

To derive the overall spectrum of the voice source is a straightforward operation of calculating the absolute value of Eq. (9) with $s = j\omega$. Introduce the normalized frequency $x = f/F_g$.

$$|H| = \frac{[(A + B + C)^2 + (D + E + F)^2]^{1/2}}{2\pi x (x^2 - 1)} \tag{56}$$

where

$A = \cos[x(\pi + \Psi_3)]$, $D = 2x\sqrt{2K-1}$

$B = (1 - 2K)\cos(x\Psi_3)$, $E = \sin[x(\pi + \Psi_3)]$

$C = 2K - 2$, $F = (1 - 2K)\sin(x\Psi_3)$

$\Psi_3 = \arccos\left(\frac{K-1}{K}\right)$
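The asymptotic slopes of the source spectrum can be checked by evaluating Eq. (56) at two frequencies an octave apart. The sketch below is a minimal illustration, not the computation used for the figures: it assumes the terms $A$, $B$, $D$, $E$ as printed, takes $C = 2K - 2$ and $F = (1-2K)\sin(x\Psi_3)$ as derived from the three excitation points of the pulse model, and uses hypothetical function names.

```python
import math


def source_spectrum(x, k):
    """Magnitude of Eq. (56) at normalized frequency x = f/Fg (x != 0 and x != 1)."""
    psi3 = math.acos((k - 1.0) / k)
    a = math.cos(x * (math.pi + psi3))
    b = (1.0 - 2.0 * k) * math.cos(x * psi3)
    c = 2.0 * k - 2.0
    d = 2.0 * x * math.sqrt(2.0 * k - 1.0)
    e = math.sin(x * (math.pi + psi3))
    f = (1.0 - 2.0 * k) * math.sin(x * psi3)
    denominator = 2.0 * math.pi * x * abs(x * x - 1.0)
    return math.hypot(a + b + c, d + e + f) / denominator


def slope_db(x_low, x_high, k):
    """Level change in dB going from x_low to x_high."""
    return 20.0 * math.log10(source_spectrum(x_high, k) / source_spectrum(x_low, k))


if __name__ == "__main__":
    print(slope_db(64.0, 128.0, 2.0))   # one octave, K = 2
    print(slope_db(64.5, 128.5, 0.5))   # close to one octave, between the zeroes, K = 0.5
```

At high frequencies the term $D$ dominates whenever $K > 0.5$, giving a fall of about 12 dB per octave. For $K = 0.5$ the term $D$ vanishes and the envelope falls by about 18 dB per octave, with zeroes at integer values of $x$.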

4. Spectrum and waveform calculations

A family of source spectrum curves calculated from Eq. (56) and corresponding spectra from a filtering with a neutral vowel are shown in Fig. III-A-5. The parameter K has been varied in eight steps from 0.5 to infinity. The glottal pulse frequency $F_g$ was fixed at 125 Hz.

The spectral energy concentration around or somewhat above $F_g$ = 125 Hz stays almost invariant in intensity level with the raise of the pulse steepness factor K whilst the spectrum level above $F_g$ is submitted to an almost parallel displacement upwards, retaining a -12 dB/oct slope for K-factors between 0.51 and 4. The extreme values K = 0.5 and K = infinity represent -18 dB/oct and -6 dB/oct slope, respectively. At K larger than 4, the minimum between the glottal pulse formant FG and F1 disappears. The sin x/x fine structure of spectral zeroes is apparent for K-values up to 0.55. The recurrence of these in the spectrum of the neutral vowel presumes stationarity of the filter function which in reality is upset by the asymmetry of glottal damping.

An s-plane representation of the E3 excitation combined with an F1 resonance was shown in Fig. III-A-4. An alternative way of viewing the spectral changes with the growth in the K-factor is with reference to the initial amplitudes of formant oscillations. This is brought out in Fig. III-A-6, the upper part pertaining to the glottal closure excitation E3, Eq. (40). The initial amplitude of FG has been arbitrarily selected to coincide with the residue in the speech wave of the interval of glottal flow up to its peak, which is half a period of an undamped sine function of $F_g$. The overall pattern of Fig. III-A-6 is the same as that of Fig. III-A-5. These spectral representations provide a more complete picture of spectrum slopes than that discussed by Sundberg and Gauffin (1978). Their analysis covers a low frequency range up to the 8th harmonic only, where the slope variations indeed are large in contrast to the part above 500 Hz where the spectrum slope is almost constant. An increase in $F_g$, everything else held constant, shortens the rising branch of the glottal pulse, which tends to increase the cutoff frequency above which the spectrum reaches the -12 dB/oct level. There is usually a covariation of K and $F_g$ and $F_0$ with increasing voice effort. The K-increase causes a more or less proportional shift up of all formant amplitudes while an $F_g$ increase causes a translation to the right of the source spectrum profile.

F , *. 1 t - r - .. What about the excitations a t thi glottal opening El and peak E2?

The lower part of Fig. IIIrA-6 is derived from Eqs. (35-40) with K=2

which represents a stressed syllable produced with moderate voice I

Page 23: Glottal source and excitation analysis

I '

1 1.5 2 2:s 3 3:s FREQUENCY IN HZNIOOO

Fig. 111-A-5. ( 1 ) Fourier transforms of the source for various K-parameter values and Fg=125 Hz . (2) Corresponding spectra of sounds with a neut formant pattern ~ ~ = ( 2 n - 1 ) 5 0 0 Hz and B = ~ O ( ~ / F ~ )

Page 24: Glottal source and excitation analysis

dB € 3 L

I I

- -

- AMPLITUDES

0 1 2 3 4 kHz (Fg=125 1 8 16 24 32 F / F ~

4 kHz (Fg=125 Hz)

I Fig. 111-A-6. Initial amplitude spectra of neutral vowel filtered

i sound with various K-parameters and E3 excitation. I Below, E l , E2 , and E3 excitation for the special case K=2.

Page 25: Glottal source and excitation analysis

i . 'L' . , , . . -, 1 , . * " A . , ' > , L ! 5;s t,;' .: . < ; I .: ,. I .

effort. The relative dominance of the E3 excitation is apparent, as well as the -18 dB/oct fall of E1 and E2. This is a characteristic feature of our model. In practice, many voices show a more sharply bounded glottal opening and peak, which would cause E1 and E2 to attain prominence.

A detailed calculation has been carried out of the entire time domain wave function of two productions with K=1, Fg=125 Hz, and Fo=100 Hz, see Figs. III-A-7 and III-A-8. The filter function contained a single formant representing F1=500 Hz. In one of the samples the bandwidth of the resonance during the glottal closure was set to Bo=25 Hz and the additional bandwidth in the glottal open phase was given the peak value Bmax=180 Hz. For the other sample Bo=37.5 Hz and Bmax=690 Hz. These two settings might represent two different vowels. These samples illustrate the time domain building blocks in the frequency range of F1 down to zero frequency.

The recipe for synthesis of the wave is apparently as follows. Select a glottal pulse form. Calculate its derivative (implied by the radiation transfer). The derivative ends with a negative peak amplitude proportional to Uo·ωg·(2K-1)^(1/2) = Uo/Td at the instant of closure. This is the starting point and starting amplitude of a minus cosine function. It proceeds with a constant damping factor exp(-πBo·t) till the onset of the next glottal pulse, when the bandwidth is increased by a glottal component that follows the glottal area function Ag(t),

$$B_g(t) = B_{max}\,\frac{A_g(t)}{A_{gmax}} \qquad (57)$$

to provide an attenuation exp[-πB(t)·t].
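The closure amplitude in the recipe above can be checked with a few lines of Python. This is a minimal sketch, not the author's procedure: the function names are hypothetical, and the relation Td = 1/(ωg·(2K-1)^(1/2)) is simply the rearranged form of the identity Uo·ωg·(2K-1)^(1/2) = Uo/Td stated in the text.

```python
import math


def closure_offset_time(k, fg_hz):
    """Offset time Td in seconds, from Uo*wg*sqrt(2K-1) = Uo/Td.

    k     : pulse steepness parameter K (must exceed 0.5)
    fg_hz : glottal pulse frequency Fg in Hz
    """
    wg = 2.0 * math.pi * fg_hz
    return 1.0 / (wg * math.sqrt(2.0 * k - 1.0))


def closure_slope(u0, k, fg_hz):
    """Magnitude of the flow derivative at closure, Uo*wg*sqrt(2K-1)."""
    wg = 2.0 * math.pi * fg_hz
    return u0 * wg * math.sqrt(2.0 * k - 1.0)


if __name__ == "__main__":
    td = closure_offset_time(k=1.0, fg_hz=125.0)
    print(f"Td = {td * 1e3:.3f} ms")
    print(f"closure slope for Uo=1: {closure_slope(1.0, 1.0, 125.0):.1f} per second")
```

For the standard source of the two examples (K=1, Fg=125 Hz) this gives Td of about 1.27 ms; a larger K or a larger Fg shortens Td and raises the excitation amplitude in proportion.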

As seen in Fig. III-A-7, with Bo=25 Hz and Bmax=180 Hz, the F1 oscillation decays 2 dB during the glottal closed part of the cycle and 21 dB during the glottal open phase. In the other example, with Bo=37.5 Hz and Bmax=690 Hz, Fig. III-A-8, the decay is 5 dB during glottal closure and 65 dB during the glottal open phase. These examples show that even at moderate degrees of glottal damping the tail end of the F1 oscillation is rapidly attenuated during the glottal open phase. The superposition from earlier excitations is therefore insignificant.
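The decay figures quoted above can be approximated numerically. The sketch below is an illustration under stated assumptions rather than the original calculation: the glottal area is assumed proportional to the model flow pulse, the pulse is taken as a half-cosine rise followed by a K-scaled cosine fall (my reading of the parameterized source model, which is not restated in this section), and the attenuation is accumulated as exp(-π∫B dt). All function names are hypothetical.

```python
import math

DB_PER_NEPER = 20.0 * math.log10(math.e)


def open_phase_duration(fg, k):
    """Duration of the open phase: rising branch plus falling branch (k >= 0.5)."""
    wg = 2.0 * math.pi * fg
    rise = 1.0 / (2.0 * fg)
    fall = math.acos((k - 1.0) / k) / wg  # time for the falling branch to reach zero
    return rise + fall


def pulse(t, fg, k):
    """Assumed glottal pulse shape, normalized to unit peak, t measured from opening."""
    wg = 2.0 * math.pi * fg
    rise = 1.0 / (2.0 * fg)
    if t < 0.0 or t >= open_phase_duration(fg, k):
        return 0.0
    if t <= rise:
        return 0.5 * (1.0 - math.cos(wg * t))
    return k * math.cos(wg * (t - rise)) - k + 1.0


def decay_db(b0, bmax, fg, k, f0, steps=20000):
    """Return (closed-phase decay, open-phase decay) of the F1 envelope in dB."""
    t_open = open_phase_duration(fg, k)
    t_closed = 1.0 / f0 - t_open
    dt = t_open / steps
    # Midpoint-rule integral of the normalized area function over the open phase.
    area = sum(pulse((i + 0.5) * dt, fg, k) for i in range(steps)) * dt
    closed_db = DB_PER_NEPER * math.pi * b0 * t_closed
    open_db = DB_PER_NEPER * math.pi * (b0 * t_open + bmax * area)
    return closed_db, open_db


if __name__ == "__main__":
    for b0, bmax in ((25.0, 180.0), (37.5, 690.0)):
        closed_db, open_db = decay_db(b0, bmax, fg=125.0, k=1.0, f0=100.0)
        print(f"Bo={b0} Hz, Bmax={bmax} Hz: closed {closed_db:.1f} dB, open {open_db:.1f} dB")
```

Under these assumptions the first parameter set gives roughly 3 dB during closure and 20 dB during the open phase, in the neighbourhood of the 2 dB and 21 dB read from Fig. III-A-7; the second set likewise lands close to the quoted 5 dB and 65 dB. Exact agreement is not expected, since the area function and phase durations used for the figures are not specified here.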

In the example with K=1 and F1=4Fg, the initial amplitudes of the oscillation evoked at glottal opening and the oscillation evoked at the peak are both a factor 8 below that of the main excitation E3. The total wave functions in these figures are typical of real speech, as will be shown in a later section.

Fant and Liljencrants (1979) have given an algorithm for calculating the effective bandwidth of truncated envelopes. The effective bandwidth is the value selected for a conventional constant decay reference that matches the vowel. The criterion that the ratio of peak to rectified mean be preserved allows a calculation of Be=45 Hz for the smaller damping, Fig. III-A-7, and 70 Hz for the greater damping, Fig. III-A-8. The glottal components are Bge=45-25=20 Hz and Bge=70-38=32 Hz, respectively. The relatively small size of these glottal bandwidth components contrasts with the excessive values of maximum instantaneous bandwidth. The ratio between glottal open and total voice period duration is a more critical parameter than the particular scale value of glottal damping. When this ratio approaches 1 the glottal losses increase drastically.
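One possible reading of the peak-to-rectified-mean criterion is sketched below. It is not the algorithm of Fant and Liljencrants (1979), which is not reproduced in this text. The assumptions are: the envelope of the F1 oscillation stands in for the rectified oscillation (the constant factor cancels in the comparison), the envelope starts at unit amplitude at glottal closure, the glottal bandwidth follows an assumed half-cosine-rise and K-scaled-cosine-fall pulse, and the effective bandwidth is found by bisection. All names are hypothetical.

```python
import math


def open_phase_duration(fg, k):
    """Open-phase duration of the assumed pulse model (k >= 0.5)."""
    wg = 2.0 * math.pi * fg
    return 1.0 / (2.0 * fg) + math.acos((k - 1.0) / k) / wg


def pulse(t, fg, k):
    """Assumed normalized glottal pulse (stand-in for the glottal area function)."""
    wg = 2.0 * math.pi * fg
    rise = 1.0 / (2.0 * fg)
    if t < 0.0 or t >= open_phase_duration(fg, k):
        return 0.0
    if t <= rise:
        return 0.5 * (1.0 - math.cos(wg * t))
    return k * math.cos(wg * (t - rise)) - k + 1.0


def mean_to_peak(b0, bmax, fg, k, f0, steps=20000):
    """Mean of the envelope over one period, with the peak (at closure) equal to 1."""
    period = 1.0 / f0
    t_closed = period - open_phase_duration(fg, k)
    dt = period / steps
    exponent, total = 0.0, 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        b = b0 if t < t_closed else b0 + bmax * pulse(t - t_closed, fg, k)
        exponent += 0.5 * math.pi * b * dt  # advance to the interval midpoint
        total += math.exp(-exponent) * dt
        exponent += 0.5 * math.pi * b * dt  # advance to the interval end
    return total / period


def constant_mean_to_peak(b, f0):
    """Same ratio for a constant-bandwidth envelope exp(-pi*b*t) over one period."""
    x = math.pi * b / f0
    return (1.0 - math.exp(-x)) / x


def effective_bandwidth(b0, bmax, fg, k, f0):
    """Constant bandwidth whose envelope has the same mean-to-peak ratio."""
    target = mean_to_peak(b0, bmax, fg, k, f0)
    lo, hi = 1e-6, 5000.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if constant_mean_to_peak(mid, f0) > target:
            lo = mid  # too little damping, raise the bandwidth
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    be = effective_bandwidth(25.0, 180.0, fg=125.0, k=1.0, f0=100.0)
    print(f"effective bandwidth: {be:.1f} Hz, glottal component: {be - 25.0:.1f} Hz")
```

With the moderate-damping parameters the sketch returns an effective bandwidth a few tens of Hz above Bo and far below Bo+Bmax, which illustrates the point of the paragraph: a brief open phase with a large instantaneous bandwidth adds only a modest glottal component. The exact values depend on the assumed envelope and need not coincide with the 45 Hz and 70 Hz quoted.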

5. Supporting data and discussion

An oscillographic recording of the single word sentence "adjö" [ajø:] is shown in Fig. III-A-9. The upper trace is the speech wave, the next line is the zero-order approximation to inverse filtering by means of integration alone, and the lower trace is the speech wave low-pass filtered with a simple integrator at 1000 Hz cutoff. A condenser microphone and an FM tape recorder were used for the recording, followed by play-back at 32 times lower speed on a Mingograph run at 5 cm/sec, i.e. effectively 160 cm/sec. The F1 initial amplitude A1 and frequency F1 were measured from the lower trace, whilst the middle, integrated trace was used for deriving the glottal pulse parameters Uo, K, Fo, Fg, and Td. Fig. III-A-10 shows the temporal variations within the word. The glottal pulse amplitude stays almost constant, whilst the main change in A1 is correlated to the pulse steepness parameter K and to Fg. The minimum A1 and K in the


Fig. III-A-7. Waveshape of a sound produced with the standard source K=1 and one-formant vocal tract transfer representing F1=500 Hz. Decomposition into elementary constituents. Moderate degree of glottal damping.


Fig. III-A-8. The same as in Fig. III-A-7 with high glottal damping.


voiced fricative [j] is probably related to the loss of transglottal pressure. The correlation between the offset time Td = (1/ωg)·[2K-1]^(-1/2) = Sc of Sundberg and Gauffin (1978) and the ratio Uo/A1 is excellent, as seen in Fig. III-A-11.

This is a test of the predictability of the first formant amplitude in the sound pressure wave at a distance of r = a centimeters from the speaker, Eqs. (5, 11, 12, 23, 35, 40), with

$$A_1 = \frac{U_o}{T_d}\,T(f), \qquad T(f) = K_T(f)\,\cdots\,\left[1 - F_1^2/F_2^2\right]^{-1}\left[1 - F_g^2/F_1^2\right]^{-1} \qquad (58)$$

which takes into account the F1 location relative to Fg and F2. Except for small A1 values at the termination of voicing, this relation holds within 1 decibel, which is the accuracy of the measurements. It is interesting to note that this close fit requires that the KT(f) correction for radiated power transfer in excess of ω² is included. This needs to be validated by control experiments.

The general resemblance between the calculated waveshape, Fig. III-A-8, and the oscillographic trace of the vowel at period no. 26 in Fig. III-A-9 is apparent.

The glottal pulse frequency Fg, which is defined as the inverse of twice the duration of the rising branch, is on the average about 50% greater than Fo, i.e. the rising branch occupies about 1/3 of the duration of a whole fundamental period. In the stressed vowel [ø:] in adjö, Fg reaches the maximum value 1.9 Fo = 178 Hz at the instant of maximal formant amplitude A1. The location of maximum Fo is reached about 70 ms earlier. The significance of this timing difference remains to be tested. During the last 100 ms of voicing in the final vowel the glottal pulse amplitude Uo retains a relatively high value even after the formant amplitude A1 has decayed more than 15 dB. Fg remains higher than Fo up till the very last pulses, where the vocal folds fail to reach contact.
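The definition of Fg and its relation to Fo reduce to two one-line formulas, sketched here for convenience. The function names are hypothetical; the only inputs are the quantities defined in the paragraph above.

```python
def glottal_frequency_from_rise(rise_duration_s):
    """Fg in Hz: the inverse of twice the duration of the rising branch."""
    return 1.0 / (2.0 * rise_duration_s)


def rise_fraction_of_period(fg_hz, f0_hz):
    """Share of the fundamental period occupied by the rising branch."""
    rise = 1.0 / (2.0 * fg_hz)
    return rise * f0_hz


if __name__ == "__main__":
    # Average case from the text: Fg about 50% greater than Fo.
    print(rise_fraction_of_period(fg_hz=150.0, f0_hz=100.0))
    # Stressed-vowel maximum from the text: Fg = 1.9 Fo = 178 Hz.
    print(rise_fraction_of_period(fg_hz=178.0, f0_hz=178.0 / 1.9))
```

With Fg = 1.5 Fo the rising branch takes one third of the period, as stated; at the stressed-vowel maximum Fg = 1.9 Fo it shrinks to roughly a quarter of the period.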

The general appearance of Fig. III-A-9 (subject JS) is similar to that of Fig. III-A-1 (subject GF), including a tendency of a slight minimum in Uo in the region of maximum A1. The tendency of constant Uo at varying A1 is known from earlier studies of stationary vowels produced with various degrees of vocal effort, e.g. Lindqvist (1970).

Fig. III-A-11. Predictability of A1 from Uo and Td and an estimated VT transfer function is within 1 dB over the major part of the utterance, Fig. III-A-10. (Plotted: prediction error in A1, in dB, against time from 0 to 0.4 sec.)

Fig. III-A-12. Same as Fig. III-A-9 but for a somewhat lower vocal effort.

Fig. III-A-13. Volume velocity output of the Rothenberg mask. The word [ja:] spoken by JS at medium voice effort. Measurements of source parameters are illustrated.

Fig. III-A-14. Same as Fig. III-A-13. A lower vocal effort.

Fig. III-A-15. Volume velocity recording as in Fig. III-A-13. The word [ajø:], speaker JS, normal voice effort.

Fig. III-A-17. Same as Fig. III-A-15 with high voice effort.

Fig. III-A-18. Volume velocity recording as in Fig. III-A-13 of the word "inversfiltermask".

of the F1 ripple is predictable from Eq. (46) and proves to be proportional to … . This is of course also valid for the approximate inverse filtering by integration of the sound pressure wave alone, Figs. III-A-13 to III-A-18. This technique is simpler and more reliable than the Sondhi (1975) tube method and requires merely access to a condenser microphone, a wide band oscillographic recording system, and the simplest of all possible filters, an integrator, e.g. an RC low-pass filter tuned to about 5 Hz.
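A digital counterpart of the RC integrator mentioned above can be sketched as a one-pole low-pass filter. This is an illustrative stand-in for the analogue circuit, not a description of the equipment used; the sampling rate, the backward-Euler discretization, and the function name are assumptions.

```python
import math


def rc_lowpass(samples, fs, fc=5.0):
    """One-pole RC low-pass (leaky integrator), backward-Euler discretization.

    samples : sequence of sound pressure samples
    fs      : sampling rate in Hz (assumed; the original set-up was analogue)
    fc      : cutoff frequency in Hz, about 5 Hz in the text
    """
    rc = 1.0 / (2.0 * math.pi * fc)
    dt = 1.0 / fs
    alpha = dt / (rc + dt)
    y = 0.0
    out = []
    for x in samples:
        y += alpha * (x - y)  # y[n] = y[n-1] + alpha * (x[n] - y[n-1])
        out.append(y)
    return out


if __name__ == "__main__":
    fs = 16000
    tone = [math.sin(2.0 * math.pi * 500.0 * n / fs) for n in range(fs // 2)]
    filtered = rc_lowpass(tone, fs)
    print(max(abs(v) for v in filtered[-64:]))
```

Well above the cutoff the gain falls as fc/f, so a 500 Hz component is passed at about one hundredth of its amplitude. This 1/f weighting across the speech band is what makes the filter act as an integrator of the sound pressure wave.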

This study was in part supported by funds from the Bank of Sweden Tercentenary Foundation. Johan Sundberg and Jan Gauffin have contributed to experiments and fruitful discussions.


References

CEDERLUND, C., KROKSTAD, A., & KRINGLEBOTN, M. (1960): "Voice source studies", STL-QPSR 1/1960, pp. 1-2.

FANT, G. (1959): "The acoustics of speech", Proc. of the 3rd Int. Congr. on Acoustics, Stuttgart (ed. L. Cremer), Elsevier Publ. Co., Amsterdam 1961, Vol. I, pp. 188-201.

FANT, G. (1960): Acoustic Theory of Speech Production, Mouton, 's-Gravenhage (2nd edition 1970).

FANT, G. (1979): "Temporal fine structure of formant damping and excitation", paper presented to the ASA meeting, Boston, June.

FANT, G. & LILJENCRANTS, J. (1979): "Perception of vowels with truncated intraperiod decay envelopes", STL-QPSR 1/1979, pp. 79-84.

FANT, G. & SONESSON, B. (1962): "Indirect studies of glottal cycles by synchronous inverse filtering and photo-electrical glottography", STL-QPSR 4/1962, pp. 1-2.

FANT, G., ISHIZAKA, K., LINDQVIST, J., & SUNDBERG, J. (1972): "Subglottal formants", STL-QPSR 1/1972, pp. 1-12.

FLANAGAN, J.L. (1965): Speech Analysis Synthesis and Perception, Springer Verlag, Heidelberg.

FLANAGAN, J.L., ISHIZAKA, K., & SHIPLEY, K. (1975): "Synthesis of speech from a dynamic model of the vocal cords and vocal tract", Bell System Techn. J. 54, pp. 485-506.

GUERIN, B., MRAYATI, M., & CARRE, R. (1976): "A voice source taking account of coupling with the supraglottal cavities", Report from Lab. de la Communication Parlée, E.N.S.E.R.G., Grenoble.

HOLMES, J.N. (1962): "An investigation of the volume velocity waveform at the larynx during speech by means of an inverse filter", Proc. SCS-62, KTH, Stockholm.

HOLMES, J.N. (1976): "Formant excitation before and after glottal closure", IEEE Conf. on Acoustics, Speech and Signal Processing, Philadelphia, Pa., April.

LINDQVIST, J. (1965): "Studies of the voice source by means of inverse filtering", STL-QPSR 2/1965, pp. 8-13.

LINDQVIST, J. (1970): "The voice source studied by means of inverse filtering", STL-QPSR 1/1970, pp. 3-9.

MARTONY, J. (1961): "Studies of the voice source", STL-QPSR 4/1961, p. 9.

MARTONY, J. (1964): "On the vowel source spectrum", STL-QPSR 1/1964, pp. 3-4.

MARTONY, J. (1965): "Studies of the voice source", STL-QPSR 1/1965, pp. 4-9.

MILLER, R.L. (1959): "Nature of the vocal cord wave", J. Acoust. Soc. Am. 31, pp. 667-677.

MONSEN, R.B. & ENGEBRETSON, A.M. (1977): "Study of variations in the male and female glottal wave", J. Acoust. Soc. Am. 62, pp. 981-993.


ROSENBERG, A.E. (1971): "Effect of glottal pulse on the quality of natural vowels", J. Acoust. Soc. Am. 49, pp. 583-588.

ROTHENBERG, M. (1973): "A new inverse-filtering technique for deriving the glottal air flow waveform during voicing", J. Acoust. Soc. Am. 53, pp. 1632-1645.

ROTHENBERG, M., CARLSON, R., GRANSTRÖM, B., & LINDQVIST-GAUFFIN, J. (1974): "A three-parameter voice source for speech synthesis", Speech Communication, Vol. 2 (ed. G. Fant), pp. 235-243 (Proc. SCS-74, Stockholm), Almqvist & Wiksell Int., Stockholm 1975.

SONDHI, M.M. (1975): "Measurement of the glottal waveform", J. Acoust. Soc. Am. 57, pp. 228-232.

SUNDBERG, J. & GAUFFIN, J. (1978): "Waveforms and spectrum of the glottal voice source", STL-QPSR 2-3/1978, pp. 35-50.

WAKITA, H. & FANT, G. (1978): "Towards a better vocal tract model", STL-QPSR 1/1978, pp. 9-29.