23
Some studies on acoustic features of human speech in relation to Hindi speech sounds DwiJEflW Dutta Majttmdeb, A soke K tjmar Dutta and Njhah RanJAN Ganouli Electronics and Communication Sciences Laboratory Research & Training School, Imlian Statistical Institute, Calcutia-^5 {Received 16 June 1972) (Plates— 3 - 9 ) The present j)aper aims t o present a brief discourse on acoustic features related to liuman speech from the view point of analysis. After giving a brief review of modern methods currently in use in speech communi- cation research, acoustic characteristics and features of human speech sound on the basis of spcctrographic analysis of a limited number of Hindi Speech Sound are presemted. The acoustic phonetic and the ae,oustic prosodic parameters of human speech are briefly explained and the formant frequencies, and duration of Hindi vowels, concentration of acoustic energy for plosives and some affricates along with other related j)aramcters of (umsonants are presented in tabular form and discussed. Indian J. Phya. 47. 698-013 ( 1073) 1. I ntboduotion 1.1, One of the most common mode of communication between human beings is speech. The intolhgonce transferred through speech is largely contained in the sound waves and therefore the extraction of this intelligence should theoreti- cally bo possible from these and the associated phenomena. But because of the highly complex character of the speech waves, inadequacy of the hnowledgo of how the intelligence is imbedded in the acoustic and other parameters and the statistical variations associated with biological processes involved, this appa- rently simple problem has engaged the attention of researchers for almost a cen- tury. The attempt to analyse speech sounds dates back to Helmholtz (1877) and the present state of knowledge in this field may be ascribed to the work of scientists like Stumf (1926), Paget (1930), Barezinski & Thienhaus (1935), Chiba & Kajiyama (1941) Smith (1947), Tarnoezy (1948), Halle et a? (1952), Fant (1952), Cooper et al (1952), Peterson (1951), Jaokobson & Halle (1956) and scores of others. Without going into the historical perspective of the subject, a brief review of modern methods in speech communication research may be helpful before we discuss acoustic-phonetic features of human speech sound. One of the purposes 598

Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

Some studies on acoustic features of human speech in relation to Hindi speech sounds

DwiJEflW Dutta Majttmdeb, Asoke K tjmar Dutta and Njhah Ran JAN Ganouli

Electronics and Communication Sciences Laboratory Research & Training School, Imlian Statistical Institute, Calcutia-^5

{Received 16 June 1972)

(Plates— 3 - 9 )

The present j)aper aims t o present a brief discourse on acoustic features • related to liuman speech from the view point of analysis. After giving

a brief review of modern methods currently in use in speech communi­cation research, acoustic characteristics and features of human speech sound on the basis of spcctrographic analysis of a limited number of Hindi Speech Sound are presemted. The acoustic phonetic and the ae,oustic prosodic parameters of human speech are briefly explained and the formant frequencies, and duration of Hindi vowels, concentration of acoustic energy for plosives and some affricates along with other related j)aramcters of (umsonants are presented in tabular form and discussed.

Indian J . Phya. 47 . 698-013 ( 1073)

1. Intboduotion

1.1, One of the most common mode of communication between human beings is speech. The intolhgonce transferred through speech is largely contained in the sound waves and therefore the extraction of this intelligence should theoreti­cally bo possible from these and the associated phenomena. But because of the highly complex character of the speech waves, inadequacy of the hnowledgo of how the intelligence is imbedded in the acoustic and other parameters and the statistical variations associated with biological processes involved, this appa­rently simple problem has engaged the attention of researchers for almost a cen­tury. The attempt to analyse speech sounds dates back to Helmholtz (1877) and the present state of knowledge in this field may be ascribed to the work of scientists like Stumf (1926), Paget (1930), Barezinski & Thienhaus (1935), Chiba & Kajiyama (1941) Smith (1947), Tarnoezy (1948), Halle et a? (1952), Fant (1952), Cooper et al (1952), Peterson (1951), Jaokobson & Halle (1956) and scores of others.

Without going into the historical perspective of the subject, a brief review of modern methods in speech communication research may be helpful before we discuss acoustic-phonetic features of human speech sound. One of the purposes

598

Page 2: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

Some studies on acoustic features etc. 599

of scientifio analysis of a language is to discover those features which are the physical manifestations of the functionally distinct phonological entities within tlie language under investigation. Different languages may ho using different

'features in making phonemic distinctions. In recent years investigations into the relationship between the physical aspects of speech and the functional aspects of speech have been greatly aided by the development of liiglily sophisticated research tools, such as tlie sound spectrograph, the pitch extractors and speech synthesizers for analysis of articulatory dynamics at the coustic level, and development of articulatory models (Dunn 1950, Stevens, Kasowski & Fant 1935, 1960) and cineradiographic techniques (Moll 1960, Doelerk 1965, Subtelny 1967) for investigating the articulatory dynamics and the generative procossos at the physiological level.

Though the studies at the physiological level are beyond the scope of the present paper, it should bo emphasized that investigation on articulatory models of speech mechanism and vocal mechanism using ehetrical analog of the vocal tract, where the tract is idealized by a sufficiently laigc number of cylindrical tube sections in cascade, though not yet comprehensive enough has given much insight into tho problems of speech production.

1.2 CineradiographyLateral (jinoradiography is a technique by which a sequence of latiiral X-ray

photographs of tho articulatory mechanisms suc!h as lip positions, labial movement, dynamics of tongue positions and activity of tlie velum can be taken 500 frames per second, though for a very insufficient amount of time (a few minutes only), providing some data on the neuromuscular system which seem to bo responsible for speech production responding to a seiiuonce of instmotions.

At tho acoustic level the analytical methods employed before the lost Avorld ar certainly contributed to the sioro of information on tho acoustic cues of speech

but those analyses in some cases lod to erroneous conelusions The liberating impact on speech rosoarch come with the dovolopiiient at tho Boll Laboratonos ill 1945 of the Sound Spexirogrwph and the subsequent development at several Laboratories of Speech Synthesizers that uses various means to transform spectro- graphic patterns to produce intolligible speech,

1 3. Spectrographic Analysis of Speech Sou7idSpectrographic instruments (Kay Sonagraph model 7029-A) usually permits

ihree types of display. First typo of display gives an overall, three dimensional [licUiro of the signal being analyzed; frequency, amplitude and time (Figures 4a, b). The second type of display permits tho individual intensity of ea^h fre- quoncy component to be displayed at any preselected pomt m time, which is usually releL d to as a section (Figure 4a). The third analysis that can bo performed shows the average amplitude of all frequencies present relative to tune.

Page 3: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

600 D u tta M aju m der, D u tta and G anguli

With tJiis pattern, the entire input signal can be examined for flatness, or any amplitude study relative to time (Figure 4b). The sound spectrograms are in effect the automation of Fourier analysis of speech spectra in physical dimensions. It immediately made evident acoustic factors of speech that had not been suspected, and helped to verify or invalidate various aspects of the theories that analytical methods had been gradually yielding,1.4. Speech Synthesizing Technique

The development of Speech Synthesizer in the early 1950’s at the Haskins Laboratory opened a new way for a programmatic method of exploration. Ele­ments of synthoti(5 spectrograms were, successively suppressed, and the patterns thus amputated were run through the synthesizer. By listening to the result, it was i)ossible to determine, step by step, which acoustic elements formed the acoustic cues for recognition. The work soon revealed the importance of first three formants m vowel perception. In the next phase a new synthesizer called Digital Spectrum Manipulator has been developed also at the Haskins Laboratory with which it is possible to make microsurgical modifications of speech spectro­grams for studying stress, intonation and relative importance of individual cues for soimds wdicn multiple cues exist in the natural speech

It should bo pointed out that these instruments wore used for acoustic level of study, and were largely motivated by the problems of speech recognition which is also the motivation of the present authors (Dutta Majumder & Dutta 1969, 1968). It can be remarked that these studies have provided weighty evi­dence that it is not feasible to build machines for speech recognition based on the acoustic level alone, and so now methods, new mstruments, new experiments and now directions would be required for the purpose.

1.5 ArLiculatory Analog TechniqueHecoiitly, the development of an Articulatory Analog of the human vocal

tract by Stevens at the MIT provides a new philosophical and experimental out­look to the problem The equipment is a synthesizer incorporating the acoustic information, along with the generative features of language, and may be thought of as foiming a bridge between the research at the acoustic level and linguistic level.

However, there is another research tool which in itself is a subject of tre­mendous importance, the computer, which promises to open ne'w objectives of speech research. Its power as tools i)f analysis, synthesis or simulation, as digitisers and sorters, will load to profound changes in experimental phonetics and application of statistical methods in speech research

The present paper, which is part of an investigation being carried out in this Laboratory, motivated by problems associated with speech pattern recognition

Page 4: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

and coding for computer use, aims at making spectrographic analysis of Indian languages, and their statistiool study with the help of computers,

2.0 G e n EBAI. TEBMS used in AOOtrsXIC-rHONETIOS ANO THEIBMEASUEEMENTS

It is wellknown that sound energy is propagated in tlie form of longitudinal pressure wares which can he described in terms of amplitude, frequency, phase and a mathematical relation between them genoraUy known as the ware-equation. The simplest type of sound wave is a simple harmonic wave (figure 1) usually known as pure tone, e g. the tone produced by a vibrating tuning fork. A complex periodic sound wave can be resolved into a sot of pure tones with appropriate frequencies and intensities (figure 2). The method oi such analysis is known as Fourier analysis, though the speech wave raioly meets wiUi tlm mathematical renirictions of Fourier arialjfsis.

Some studies on acoustic features etc. 601

Figure 1 Puro To

Figui'6 2. Saw-tooth wave form, the Funtlamoiitol and two Higher HarmonicB

2 1 Sound IntensityOBoillographio and speotrographiv measuroments of the speech wave involve

the spooifioation of amplitude or intensity measures. The term amplitude refers to the instantaneous or time average value of sound pressui’e, volume velocity, or particle velocity at a particular point in the sound field, or to the oorrespond-

voltages or currents as delivered by a microphone.4

Page 5: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

602 D u tta M ajum der, D u tta and Ganguli

Sound intensity is the energy per unit time transmitted through a unit area Tn a piano or spherical free progressive sound wave the intensity in the direction of propagation is

I — P^jpe erg seo“ /c7W 0)Where P is the rms sound pressures in dynes/cm®, p is tlie density of the medium

ill gm/cm®, and c is the velocity of propagation in cm/sec. Tlie product pc is known as the specific (ico'ustical resistance which is 41.4 dynes soc/cra® at 20"C’ and 40 0 at 35°C (the latter value is appropriate for the wave propagation within the vocal cavities).

Sound pressure or intimsily data are generally specified on a logaritlunic scale with the unit deoi Bel (dB) relative to a fixed reference The standardised sound pressure referonoo is — 0 0002 dynes/cm® Avhich corresponds to a reference intensity

_ 10-1® watt/cm®

The sound level is dolined as the mtensity ratio

L ^ logjo / (Bel) = 1 0 / - doci Bel (dB)■'ll 0

which is identical with sound pressure level ■

( 2)

L 20 ]ogi„ A - dB,J- n (3)

provided P is related to /„ by eqn. 1.

2.2 Frequency, Voice Fundamental artd Forrrmnts1’h(i number of complete oscillations executed by every particle of the modmiii

through which the sound wave passes in one second is called the frequency of sound.

In a general periodic continuous wave the frequency of the periodic wave is called the fundamental frequency, and the simiile harmonic waves of higher multiple froquencies are generally known as harmonics (figures 2).

The complex sound wave originating in the larynx passes through the vocal tract and its cavities before coming out in the open air. The sound wave which ultimately radiates into air has only those frequencies which are accentuated by the vocal tract and the cavities. Therefore in the plot of frequency vs energy, known as freipiency spectrum or energy spectrum, there are peaks corresponding to tlie resonant frequencies, that appear in baud like structure. These bands are known as formants (Jfigur© 4a).

Page 6: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

Some studies on acoustic features etc. f^03

Tlie basic property of a vocal chord sound source is its periodicity expressed |)V tlie duration Tq o complete voieo period or by its invorse value, the voice fundamental frequency given by F q~ often comos acoross the fol-JoAving notations while making spoctrogi-apliic studios of speech waves, which arc also being used in the present paper

Fji — frequency of formant number n in c/.s,

= banwidth of formant nuber n in o/.s,

— level of formant number w in dB,

^ formant number n,

_ frequency of the voice fundamental with or without dimension

3.0 A cottstical chabaotertsticj of spftbch

I’lie acoustical characteristic oi‘ human speech is of a much more complex nature, than the simnd produced normally through mechanical means, suoli as musical instruments, in fact, liuman spee.ch wave are never either fully periodic or lully noise like random plieiiornoiia. Wo can classify all speoch waves into six basic categories namely, (1) quiescent, (2) burst, (3) quasi-random, (4) quasi- pcnodic, (5) double periodic and (G) irregular periodic (figure 3).

3 1 A quiescent speech wave is such a speech wave for which the instanta­neous amplitude of the wave gciioratod by vocal mechanism is approximately zero Tliis type ol‘ speecli wave is produeed dining quiet rosjiiration or during certain closure of the vocal tract (blguro 3a).

3.2. A speech hurst has the form of impulse and is produced by the release of a closure in the vocal tract (Figure 3b).

3 3 A quasi-random speech wave is produeed by the vocal rnochamsm such that successive amplitudes at a regular interval of time are approximately un- conelatcd and random (Figure 3c).

S.4 4 g«a»i-j)enodic speech wave is prodaoed by a vocal mechanism whichis vccurronl and in which the wave forms for sucoossivo periods aro approximatelythe same (Figure 3d).

3 5 A dovJbU periodic sound wav.- is produced by a vocal moohanism which is recurrent and in which the wave forms of alternate periods aro more similar tliaii those of successive periods (Figure 3e).

3 6. An irregular periodic speech wave is produced by a vocal meohmusm ivh.ch ia recurrent and in which the wave forms of sneoessive periods vary in an iirogular manner (Figure 3f).

Page 7: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

604 D u tta M ajum der, D u tta and Ganguli

(.a) C io iE S C E N T . (-6) Burst

(C> a u A % .-R a n d o m 0 - u A » r - P e r io d ic

(C) Do u b l e - P e r \o d ic O J I r r e g u l a r - P eriodic

4.0

Figiiro .3. Basic speech waves

Acoustic pauameteus of human speech

The acoustical speech parameters are divided into two classes : (1) acousticphonetic parameters and (2) acoustic prosodic pammefers.Peterson & Shoup (1966) enumerated following six acoustic phonetic parameters, wlxich we shall briefly explain along with their speotrogi aphic illustrations from Hindi Speech Sounds

4.1. A gap is an interval during which the vocal mechanism does not produce sound, which is proceeded and/or followed by an interval during which the vocal mechanism does produce a sound, and during which a driving pressure is applied to a closure of vocal tract (Figure 4b, plate 4).

Page 8: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

Some studies on acoustic features etc. 606

4.2. The frequency of a vowel or constant formant is the frequency of the decay­ing sinusoidal response of a resonance in the vocal tract (figure 4a, plate 3)

The natural range of variation of the voice fundamental frequency and of ' formant frequencies for non-nasal voiced sounds uttered by average male subjeots

according to Fant (1959) are ;Fo 6 0 - 240 o/s

1 6 0 - 850 c/sFa 500-2500 c/sFg 1500-3500 c/sF4 2500-4600 o/s

Pom ales h a v e o n e o c ia b o h ig h er fu n d a m en ta l p itch b u t 17% h ig h er fo rm a n t IroquenoiciS, ch ild re n (u p to 10 y rs ago) h a v e o n th e average 25%higher th a n males and /o ~ 300 c /s

It may bo accepted that the first throe formant frequencies are sufficient for the phonetic identity of a vowel. The individual formani; freciuencios show considerable overlapping for all the vowels. Even the first two formant fro- (pieiicies taken together do not resolve the o\eilapping. The third formant frequency has to be considered to achieve proper resolution. Peterson (1951) suggested constant ratios for the formant froquonoies whereas Pant (1959) sug­gested substitution of second formant froquoncy by an weighted average of second and third formant frequencies. However, we have presented from our spectro- graphic studies the formant frequencies of Hindi vowels in tabular form (table 1), which does not show much of a difference from that of Pant, and have used the method of difference for phonetic identification of vowels which show complete separation (figure 8).

4 3. The amplitude of vowel or consonant formant is the maximum amplitude of the decaying sinusoidal response of a resonance in the vocal tract (figure 4a, plate 3).

4.4. A voice-har is a low frequency resonant response of the vocal tract that is excited by successive impulses of air emitted from true vocal folds while the voice tract is closed by one or more supra-glottal articulators (figure 6a, plate 6 iSc b, plate 7).

4.5. The frequency of a consonant antiresonance is the frequency at which a minimum occurs in the magnitude of the envelope of the energy spectrum, if the minimum is in a region where the magnitude of the envelope of the spectrum is substantially reduced relative to the magnitude that the envelope would have if the antiresonance were not present (figure 5a, b, c, plates 6, 7 & 8).

The frequency of antiresonanco is an important cue for distinguisliing nasal consonants.

Page 9: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

606 D u tta M ajum der, D u tta and G anguli

4.6. A hroad-haTid continwivs specrum is a band of energy which is continuous over a relatively wide range of froquoncios in the speech spectrum which has higher total energy than any other band of frequencies that occur simultaneously in the spectrum and within which the peak energy divided by the energy of any other frequency within the band docs not exceed 30clB (figure 6a, b, o, plates 8 & 9 ), The fricative consonants are characterised by the presence of broad band continuous spectrum (because of the associated turbulanoe noise).

4.7. In addition to these another acoustic phonetic parameter, the formant bandwidth is very useful and is defined to bo the range of frequency of a formant measured upto 3dB down from the peak of the envelope.

5.0 T h e aooitsttc p ro so d ic pa r a m e t er s

The acoustic prosodic parameters are 1) acoustic phonetic duration, 2) average fundamental froquen(‘-y and 3) average speech power as explained below.

5 1. A raustic Phonetic Duratum ; This is the time inttwval of the duration of a phono plus the duration ol' those phonetic transitions which are associated with a phone (table 4). Usually this parameter is associated with studios of stress and intonation. However it wdlJ be showm in a later section (6.0) that this pararUeter inay also profitably be used for the distinction between aspirated and unaspiratod consonants.

5 2. Average fundamental frequency is the frequency of the first harmonic of a pei'iodic speech wave which is produced by the true vocal folds and whief; is averaged by a weighting function over an interval of time which we have shown earlier (figure 4a, plate 3)

5.3. Average speech power is the product of pressure and volume velocity in a speech wave produced by the vocal mechanism averaged by a weighting function over an interval of time (figure 4b.).

Plosives can be distinguished by the sharp impulsive nature of the speech power (figure 4b, plate 4) The latera,ls are characterised by rapid changes in speech power

6. HrSOUSSION o n e x p e r i m e n t a l RBStTLTS ON AOOHSTIO PARAMETER STUDIES

The formant frequencies and the phonetic duration of eight Hindi vowels are presented in table 1. The vowels exhibit their characteristic formant fre­quencies. The low value of the standard deviations, particularly for Hindi ar (a) and sq (a) indicates acoustic stability of these phonemes.

It was observed that the spread for 1st formant is from 295 o/s (for Hindi f , i) h, to 707 o/s (for Hindi «q) a and that for 2nd formant is from 675 c/s (Hindi U, to 2400 c/s (for Hindi i). This compares well with the results given by Pant

Page 10: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

\[a.iitmimr. Dutta GANaiTLT Inrhan J. Pliys Vol 47, No 10, ]973Plate 3.

f !

4 ’

■ >

i ’i^iirf 4a Niirmw band s]jiHjti'ugiaj)li oJ ^ (•l‘d ('T^) I'l '* I'lnl sboAvs cni'igv \'s, li'i'cij. I’lot Lnwi‘J pail sliuivs cnoigy dusl in Limo dunum f’u J'uiuiuniriital Pi(‘q - -b’oiimuils

Page 11: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

DifTTA Mazum dar. D litta & Ganclti-I Tndiiin J. Phys Vol 47, No 10, i! ; ‘jPlato 4.

h f,

I

Kimji'o 'll) Wido band hpoctinjiiuqjli n( word ( )Uj)] rl pail shows mlnisdy A s. timo j)lol. Jjowoi pm I shows ounrgy DisN in IiiikmIo

- Fmmaid fiotpionoiHS t3 - CiajJ

Page 12: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

1 MAJUMnAR,, D u t t a & G a n o it m J y ^ i

l*la.le 5

\ yt:* - 1

% & ■ J'* T ^ '“

i ^ : - " ■■

p i t .

9 ' ^

Ljy

Fi^uro li ‘ (Inniiuii' (IihijImv nf tViii won] ( Meti< ]

Page 13: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

1>|’TTA Majtimdam, D ii'jta & DANfitJi.r Indian J. Phvs. Vol. 47, i(,Plate 0

li'if.HiiL' fill. Wide build sjM r(,rt)grai)li <)i iIkmmiuI ( |nJT )U — Vmi'ii Hui Ku Hi'.hoiuiiu'i'

Page 14: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

p, i Majumdar, Dutta & GANaiiLi Iiiflian ,r Phys. Vol +7, No. 10, 1973Plato 7

i[3' fV

(1.13 l.c

cq

Page 15: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

D utta Majumbah, DrTTA & Gajsuult Indian J Phys. Vol. 47, No.Plate 8

10, irr,

• . * !■'

'X ,4'''-.,'

IJ

Page 16: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

I„ \ Majitmdab, D iitta & G anouli Imlmu ,T Pl,y,s V„1 47, No. 10 8973Plat e 9

~1

S ^ fi

s

i »i t V * tt. ^

' V ’'' tSs

Page 17: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

o>

Some studies on acoustic features etc.g 1

1

< 1

: ! l 1t-i<2 .§

do

1 i .3 ^ a fl

egT3§

Ehc2

egXg

1Xg

Jo

XI tCCmO

«s § cS c2d

iac3a19

*2 S

1 1

U3.2do

ICcS

Mdo

M1 I

kTXdtf P

1 § .M -a

agcn'gCmO

TflmwX3OCmO

"SMOCmo

aCD1CmO

'oOJg

I a

® ,20)g

CD n e<5CO

dinCO eg

llCOlO

d

1

ok; O «2

!<?

a ZG 1 1 I 1

Eh go?3

t'(MpM'Ct00eg

CD tHegCO

00o IDegegegC5eg IDIDeg

S R 1 1 1 1o GQ 1 1 1 1 1 1 I 1liT § »J0 o>o o c oCDCO toCO >oCO ClCO 1 1 IDCDCO 1

P o 00 oiCS 1"o

CO ID 1 1 I 1

1tMegeg

oegeg

oCDOCOoCOegCO

olO ID1-eg■DCDei 1

Q CO CD17- GO oo 1 1 1o

«2 r, 1 1 1 1aegjj

egopM

oegegOCOei

oo-tegoc-o

IDr~CDc>Degeg

oo00«o CO eg oP CO CO I—1 eg 1 1 1GO 1 1 1

u1

o•OU317-o1—

oCO

ID05eiCDCOCO

OIDCOID05

COODeoCO -# o oP eg 1 I 1 1GO 1 1 1

§ r~or—1 o »o oeg IDeg g eg 00oCmOOPI

OQrOo

o1—Io ID Uj CO eg CO

%>

a ^ M ^ d g D ^ ® 1

6 0 7

Page 18: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

6 0 8 D u tta M aju m der, D u tta and G anguli

(vide section 4.2) for vowels. The phonetic identity of vowels seems to depend not on the absolute values of the formant frequencies but on the relative overall formant Structure of the speaker. Third formant frequency seems to serve as an indicator to the formant structure of an individual speaker. The Hindi vowels show considerable overlapping in the i ’g plane. Even for a single speaker "U (gi) overlaps with 0 («fl) and i (t) with e (v) (fiigure 7). This overlapping does not exist in plane (where = F^—F^ and F.^ = F^—F^ (figure8). So a clear separation of different vowels and clustering of the same vowels may bo obtained giving proper weight to F . In coartioulation with consonants the vowel formants show marked shift during transitory parts. The rate and direction of the formant movements are important cues for identification of con­sonants.

uisruiBuiiON Oh HINDI vomiMiferMAHi

"rfn--- ,W

Fi^uro 7. Distribution of Hindi Vowels in F i — F^ Plane

We shall now discuss some interesting features of the concentration of energy for consonants. Table 2, presents the concentration of acoustic energy for plo­sives and some affricates. Though the number of observations are not large

Page 19: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

enough to be oonolusive but they indicate certain features of considerable signi­ficance firoin the aooustic phonetic point of view. Sometimes the energy concen­trations ate so weak that they are untraceable in the spectrographs.

Some, studies on acoustic features etc. dOd

DisTHiBi/noN or mu \towns m ptAne

Figure H. Distribution of Hindi Vowels in F ^ —F ^ Plano

TJi.0 first concentration of energy for aveolars and labials is below 1000 c/s., whereas tliat of velars and dentals, are above 1000 c/s. Affricates exhibit the first con­centration at very high frequencies 2500 o/s and above It is often very difficult to separate plosive or fricative part from the aspirated part of on aspirated conso­nant directly from a study of a normal spectrogram. Of course, the duration of plosion con be determined from the frequency vs time plot. However the onset of the vowel transition is usully clearly marked and represents the end of the preceeding consonant. It will therefore, be easy to determine the total duration of the aspirated consonants. The detection of occlusive portion of a consonant, where it exists presents no dififioulty,

5

Page 20: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

6 1 0 D u tta M a ju m d e r , D u tta a n d G an g u li

-fc!c

H

I 1

I I

I I

ts>

Iff S'

2 5 |r~

0 oS -8 Si-H (M

§ -2 I2 s °IN lOTil Til0 ^ 0 lo oO eo m2 oCOIN OoCO IX) IN

Iff g

1 5 I»« 5<=> oI ° sI s ®3 ** S

S 8

o O 5 CO QOIN (N

° 5 §

2 B 2

o o p g 3 S

hy ^! O o

I 8.lO0 BCO

® -gCO ^

I1 8eo

I 8

I

CO

I<LiP

tr e

I—I IN

Its p o -H Its

O ° O CO lO

Iff O,g o §

II

H II II

cS* t?

Page 21: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

Some studies on acoustic features etc. 611

The table 3 represents the conoontration of acoustic energy for three groups of consonants namely nasals, laterals and fricatives. The nasal and lateral conso­nants show concentration of acoustic energy at lower as well as higher ranges, whereas the fricatives show concentration of acoustic energy at higher ranges only-

Tables 4 and 5 represent time duration for occlusion, plosion, aspiration and total average time, corresponding to the consonants of table 2 and 3 It will appear from table 4 that the total duration of consonant minus the occlusion period could be effectively used as a distinctive feature for distinguisliing aspirated from unaspii*ated counterparts, the aspirated consonants exhibiting much larger value of duration.

Table 3. Concentration of acoustic energies for Hindi consonants (nasal, lateral & fricatives)

no. of obs.

no. of obs. Ca Ca Ci

5T 150 160 1150 2200

Nasal in) to - — (m) to to to —

500 450 1400 2400

150 1200 2000 200 1400 2300 3400

Lal/oral (r) 6 to to ro 0) 7 to to to to

600 1700 2300 460 1700 3000 3800

1600 3600 5000 1700 3700

Fractive (s) 7 to to to (s') 3 to to

1800 4500 7000 3000 6000

? 150 1150 3500

(h) 5 to to to - - _ _ — — —

700 ISOO 6000

^ finerev O, = Sooond concentration of ocoustic energyf?i = Pirat concentration of acoustic eneigy. j ■ j.- r‘ n i. „ CIa = Fourth concentration of acoustic onergy

Gj -- Third concentration of acoustic onergy O*

Page 22: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

Table 4. Period of occlusion, plosion and aspiration for Hindi consonants

012 Dutta Majumder, Dutta and Ganguli

consonant no. of consonant no. ofplosive obs. <0 tp plosive obs. 0 taap

8 98 24 2 126 130(k) (kh)»T6 GO 23(g)4 67 38 W

2 76 130(0) (ch)5T 7 133 30 7 103 93(i) (jh)3 94 2 2

7 3 153 81(t) . (tb)3*1 27 IS(d)cT4 99 18(t)3 63 18 «T 3 60 73(d) (dh)

g 4 76 2 2gi 3 78 135(P) (ph)g 5 1 1 2 18 IT

1 99 63(b) (bh)to = OccluBion period tp = Plosion period t'anp = Aspiration period

Page 23: Indian J. Phya.arxiv.iacs.res.in:8080/jspui/bitstream/10821/3041/1/Some studies on... · Njhah Ran JAN Ganouli Electronics and Communication Sciences Laboratory Research & Training

Some slmdiea on acoustic features etc.

Table 5. Time duration of Hindi consonants

613

no. of obs.

avoragototal

duration

( n )

3T(m)

(1)

(r)

(«)

(s')

(h)

A c k n o w led g e m e n ts

The aiithors express their grateful thanks to Prof. Hiorjo Kostic, Director Institute of Experimental Phonetics, Hoograd, Yugoslavia for the helpful discus­sion, and Prof. C. R. Rao, E.R S., Director, Research and Training School, TSI and Late Prof. P. C. Mahalanobis, F.R.S. for their kind interest in the work. The authors also thank Shri d Das, Shri B. Mukhorjoo and Sm. Krishna Dutta Roy for their help in the work. The authors express their gratitude to the autho­rities of All India Radio particularly the Hindi Nows aimounoers of AIR, Now Delhi, for providing tho Indian Statistical Institute with their recorded voices for a number of Hindi Speech Sound.

R efer en ces

Barczinaki L. & Thienhaus E. 1936 Klangspectrc7i urul Lausiarke deutscher Spra^he, Arch,Neerl. de Phon. Exp. II, 47.

Cooper F. S., Delattre P. C., Libberman A. M , Borstand .T M. & Geratman L. J. 1962 24, 097.Chiba T. AKajiyama, 1941 The Vowelr-ite Nature and Structure, Tokyo.Bulta Hajumder D. & Dutta A. K. 1969, J IT E , 15, 4.Dutta Majumder D. & Dutta A. L. 1968, /r»d*an J. Phya. 42. 7.Dutta Majumder D. & Dutta A. K. 1968 Automa^^one E Strumentazione, Milan.Doclerk J. L. 1905 Proc. Fifth. Internatwnal Oongreaa %n Acouatica, Liege, A26.Diiiin H. K. 1950 J. Aeauat. Soc Amer. 22, 740. , o i.Pant G. 1959 Aeouatie Ancdyaia and Syntheata of Speech wUh AppUcaUon to

Pmal Rept. AFOB^64:-ZOO. 1903 Research Laboratory of Electronics. MIL, Cambridge,