Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

PHONETIC AND PHONOLOGICAL ASPECTS

OF ARABIC EMPHATICS AND GUTTURALS

by

Musaed S. Bin-Muqbil

A dissertation submitted in partial fulfillment of

the requirements for the degree of

Doctor of Philosophy

(Linguistics)

at the

UNIVERSITY OF WISCONSIN-MADISON

2006


UMI Number: 3222872

Copyright 2006 by Bin-Muqbil, Musaed S.

All rights reserved.

UMI Microform 3222872

Copyright 2006 by ProQuest Information and Learning Company.

All rights reserved. This microform edition is protected against

unauthorized copying under Title 17, United States Code.

ProQuest Information and Learning Company 300 North Zeeb Road

P.O. Box 1346 Ann Arbor, MI 48106-1346


© Copyright by Musaed S. Bin-Muqbil 2006 All Rights Reserved


A dissertation entitled

Phonetic and Phonological Aspects of Arabic Emphatics and Gutturals

submitted to the Graduate School of the University of Wisconsin-Madison

in partial fulfillment of the requirements for the degree of Doctor of Philosophy

by

Musaed S. Bin-Muqbil

Date of Final Oral Examination: April 5, 2006

Month & Year Degree to be awarded: December May 2006 August

Approval Signatures of Dissertation Committee

Signature, Dean of Graduate School


To my family.


ABSTRACT

Existing formal representations of Arabic emphatic and guttural sounds are ill-motivated articulatorily and suffer from descriptive and analytic inadequacies. This dissertation aims to clarify our understanding of the articulatory attributes of these sounds as reflected in their acoustic characteristics. The present experimental finding that the secondary articulation of emphatics is distinct from the primary articulation of gutturals requires a grounded representational distinction.

Three acoustic experiments, using Modern Standard Arabic speech samples from five male subjects, tested the acoustic characteristics of emphatics and gutturals. The first experiment, comparing spectral qualities of consonants, found no reliable differences between spectral shapes of emphatics and non-emphatics. Acoustic attributes of uvular continuants argue for a fricative, not approximant, articulation. The second experiment examined the coarticulatory impact of consonants on formant frequencies of adjacent vowels. Results indicate pharyngeals are more strongly associated with high F1 transitions than emphatics and uvulars, which are associated with low F2 transitions. The F2 effect was stronger in emphatics than in uvulars. Emphatics and uvulars are thus understood to be articulated with a retracted tongue dorsum while pharyngeals are articulated with a retracted tongue root. Dorsal retractions in emphatics and uvulars are argued to be qualitatively different. The third experiment investigates vowel-to-vowel coarticulation across intervening consonants. Results show emphatics blocking or weakening coarticulation. Coarticulatory effects of the three uvulars depend on their degree of constriction: a stronger constriction corresponds to a stronger resistance to vowel-to-vowel coarticulation. Remaining sounds allow vowel-to-vowel coarticulation. These results are attributed to articulatory differentiation: emphatics employ the styloglossus and hyoglossus for their dorsal articulation; uvulars primarily use the palatoglossus and secondarily the styloglossus.

Taken together, experimental results lead to important implications for phonetic grounding of Arabic emphatics and gutturals: emphatics and uvulars share a secondary dorsal component; uvulars, pharyngeals, and laryngeals share a primary radical component. The pharynx, then, is best viewed within phonology as a single active articulator grouping guttural subclasses into one natural class. Formal representations based on these views are more capable of handling the patterning of these sounds in Arabic phonology. Implications for phonological analyses of Tigre and St'at'imcets (Lillooet Salish) are discussed.


ACKNOWLEDGEMENTS

Anyone who has undertaken a doctoral dissertation would, more likely than not, remember all sorts of challenges, anxieties, and sleepless nights. They would also remember faces, names, and exchanges that soothed those pains away. I should know. There were occasions when certain difficulties I faced bordered on being insurmountable. Luckily, I was surrounded by people who were eager to lend a capable helping hand. At the forefront is my academic advisor, Dr. Thomas Purnell. It is quite difficult to extend enough gratitude to a man who, for several years, patiently guided my steps through this winding road till I reached my goal. I have worked with Dr. Purnell for years and never once do I recall him being less than gracious and supportive. Above all, he instilled in me a great deal of confidence without which I doubt this work would have ever seen the light. Thanks, Dr. Purnell.

In all honesty, I have been blessed with a committee of highly regarded professors who combine mastery of their respective fields of study with welcoming attitudes. I am greatly indebted to Dr. Raymond Kent for his direction and insights in the field of experimental phonetics. I can never forget how cheerful and respectful that man is. I have never left a meeting with him without feeling much more informed than before. I'm very proud to say that I have learned from one of the few masters in the field. I thank Dr. Paul Milenkovic for all the help I received from him with regard to the experimental methods used in this dissertation. He dedicated a great deal of his valuable time to the development of a capable computer algorithm to calculate Multi-Band Spectra specifically for my dissertation. I am very grateful and I am sure many future researchers will be, too. I also send my gratitude to Dr. Joseph Salmons for all the generous help I have received from him. As he would typically do, Dr. Salmons provided me with enriching feedback that elevated the quality of my work. I thank him for it. I would also like to thank Dr. Rand Valentine for all the advice and support I have received from him. His cheerful attitude and consummate workmanship are examples that I hope I would be able to follow. No matter what academic degree is conferred upon me now or in the future, I will always consider myself a student of those gentlemen.

I am grateful to King Saud University for their generosity in granting me the opportunity to pursue my higher studies. I am particularly indebted to the faculty of the English Language Department for placing their faith in me and providing me with the chance to realize my dreams.

There is no possible way that I could show my gratitude to my family. I am sure my late father would have been very proud of me. I ask the Almighty Allah to bestow His mercy on him. My dear mother had to endure my years-long absence in silence. Her suffering and her prayers dwarf any thanks I can direct to her. I ask Allah to enable me to honor her the way she should be honored. My gratitude to my brothers and sisters knows no bounds. I also thank my dear relatives and my wonderful in-laws. Their prayers and well wishes will never be forgotten. In their absence, my wonderful son Faaris has been the source of my cheers and happiness. His laughter and playfulness never failed to energize me whenever I felt down. Many times he would wipe the worries of the outside world off my mind with a simple 'hala baba!' ('Welcome, papa!') as I walk into the house. My ever-flowing love and gratitude go to my dear wife Abeer who has been with me through thick and thin. She endured more than five years away from her family just to share it all with me. Words can never do her justice. Thank you, Abeer. You are truly a blessing.

My first, last, and continuous thanks go to the Almighty Allah who blessed me with everything that I have and everything that I am. I pray to Him to enable me to use whatever I have learned for the good of mankind.


TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES

CHAPTER 1 Introduction
  1.1 Aims
  1.2 Rationale
    1.2.1 Experimental phonetics and phonological representations
    1.2.2 Acoustic-articulatory relations
  1.3 Modern Standard Arabic (MSA)
  1.4 Overview of the dissertation

CHAPTER 2 Background and Literature Review
  2.1 Basic Vocal Tract Anatomy
    2.1.1 The Tongue
    2.1.2 The Pharynx
    2.1.3 The Soft Palate
    2.1.4 The Larynx
  2.2 Phonetic Properties of Arabic Emphatics and Gutturals
    2.2.1 Emphatics
    2.2.2 Uvulars
    2.2.3 Pharyngeals
    2.2.4 Laryngeals
  2.3 Gutturals as a Natural Class
    2.3.1 Morpheme Structure Constraints
    2.3.2 Guttural Lowering
  2.4 Representations of Emphatics and Gutturals
    2.4.1 McCarthy (1994)
    2.4.2 Rose (1996)
    2.4.3 Zawaydeh (1999)
  2.5 Representational Problems

CHAPTER 3 Experiment One: The Spectral Shapes of Consonants
  3.1 Overview
  3.2 Methods
    3.2.1 Subjects
    3.2.2 Stimuli
    3.2.3 Procedures
    3.2.4 Acoustic Analysis
      3.2.4.1 Spectral Moments
      3.2.4.2 Multi-Band Spectra (MBS)
    3.2.5 Reliability
  3.3 Results
    3.3.1 Spectral Moments
      3.3.1.1 Voiceless Continuants - Pooled Data
      3.3.1.2 Voiceless Continuants - Individual Subjects
      3.3.1.3 Voiceless Continuants - Specific Vowel Contexts
      3.3.1.4 Voiceless Continuants - Discriminant Analysis
      3.3.1.5 Voiced Continuants - Pooled Data
      3.3.1.6 Voiced Continuants - Individual Subjects
      3.3.1.7 Voiced Continuants - Specific Vowel Contexts
      3.3.1.8 Voiced Continuants - Discriminant Analysis
      3.3.1.9 Voiceless Stops - Pooled Data
      3.3.1.10 Voiceless Stops - Individual Subjects
      3.3.1.11 Voiceless Stops - Specific Vowel Contexts
      3.3.1.12 Voiceless Stops - Discriminant Analysis
      3.3.1.13 Voiced Stops - Pooled Data
      3.3.1.14 Voiced Stops - Individual Subjects
      3.3.1.15 Voiced Stops - Specific Vowel Contexts
      3.3.1.16 Voiced Stops - Discriminant Analysis
    3.3.2 Multi-Band Spectra
      3.3.2.1 Voiceless Continuants
      3.3.2.2 Voiceless Continuants - Discriminant Analysis
      3.3.2.3 Voiceless Stops
      3.3.2.4 Voiceless Stops - Discriminant Analysis
  3.4 Discussion and Conclusions
  3.5 Summary

CHAPTER 4 Experiment Two: Anticipatory and Carryover Consonant-Vowel Coarticulation
  4.1 Overview
  4.2 Methods
    4.2.1 Subjects
    4.2.2 Stimuli
    4.2.3 Procedures
    4.2.4 Acoustic Analysis
    4.2.5 Reliability
  4.3 Results
    4.3.1 Anticipatory (VC) Coarticulation
      4.3.1.1 Anticipatory Coarticulation in F1
      4.3.1.2 Anticipatory Coarticulation in F2
      4.3.1.3 Anticipatory Coarticulation - Discriminant Analysis
    4.3.2 Carryover (CV) Coarticulation
      4.3.2.1 Carryover Coarticulation in F1
      4.3.2.2 Carryover Coarticulation in F2
      4.3.2.3 Carryover Coarticulation - Discriminant Analysis
  4.4 Discussion and Conclusions
  4.5 Summary

CHAPTER 5 Experiment Three: Vowel-to-Vowel Coarticulation
  5.1 Overview
  5.2 Methods
    5.2.1 Subjects
    5.2.2 Stimuli
    5.2.3 Procedures
    5.2.4 Acoustic Analysis
    5.2.5 Reliability
  5.3 Results
    5.3.1 Anticipatory Vowel-to-Vowel Coarticulation
    5.3.2 Carryover Vowel-to-Vowel Coarticulation
  5.4 Discussion and Conclusions
  5.5 Summary

CHAPTER 6 Implications and Alternatives
  6.1 Emphatic and Guttural Articulations
  6.2 Alternative Basis for the Guttural Natural Class
  6.3 Formal Representations
    6.3.1 Arabic Morpheme Structure Constraints Revisited
    6.3.2 Guttural Lowering Revisited
  6.4 A Note on Ethio-Semitic and Interior Salish
  6.5 Summary

CHAPTER 7 Conclusion and Future Directions

REFERENCES

APPENDIX A

APPENDIX B

APPENDIX C


LIST OF FIGURES

Figure 1.1. Points of minimum velocity (nodes) and maximum velocity (antinodes) for the first two formant frequencies of vowels

Figure 1.2. Illustration of how the articulation of the three Arabic vowels [i, u, a] is related to their acoustic shapes in the light of the source-filter theory

Figure 2.1. The extrinsic muscles of the tongue along with some other vocal tract organs

Figure 2.2. The pharyngeal constrictors and related structures

Figure 2.3. Muscles of the soft palate along with related structures

Figure 2.4. Structure of the larynx

Figure 2.5. A schematic illustration of the vocal tract configuration during the articulation of an Arabic emphatic coronal and its non-emphatic counterpart

Figure 2.6. Schematic illustrations of the vocal tract configurations during the articulation of Arabic uvulars

Figure 2.7. A schematic illustration of the vocal tract configuration during the articulation of an Arabic pharyngeal consonant

Figure 3.1. A multi-band spectrum (stepped line) and an FFT spectrum for the Arabic voiceless fricative [s] in the sequence [asa] both generated from a 40-ms full Hamming window placed at the middle of the frication noise

Figure 3.2. Locations of the sampling windows at which the spectral moments for fricatives (above) and stops (below) were calculated

Figure 3.3. Spectral moments values for voiceless continuants at the five sampling window locations

Figure 3.4. Box plots of the distributions of the spectral moments scores for the four voiceless continuants [s, sˤ, χ, ħ]

Figure 3.5. Box plots showing the distributions of the four voiceless continuants spectral moments scores for each of the five individual subjects

Figure 3.6. Spectral moments values for voiced continuants at the five sampling window locations

Figure 3.7. Box plots of the distributions of the spectral moments scores for the four voiced continuants [ð, ðˤ, ʁ, ʕ]

Figure 3.8. Box plots showing the distributions of the four voiced continuants spectral moments scores for each of the five individual subjects

Figure 3.9. Spectral moments values for voiceless stops at the two sampling window locations

Figure 3.10. Box plots of the distributions of the spectral moments scores for the four voiceless stops [t, tˤ, k, q]

Figure 3.11. Box plots showing the distributions of the four voiceless stops spectral moments scores for each of the five individual subjects

Figure 3.12. Spectral moments values for voiced stops at the two sampling window locations

Figure 3.13. Box plots of the distributions of the spectral moments scores for the two voiced stops [d, dˤ]

Figure 3.14. Box plots showing the distributions of the two voiced stops spectral moments scores for each of the five individual subjects

Figure 3.15. Four histograms replicating the multi-band spectra of the four voiceless continuants

Figure 3.16. Four histograms replicating the multi-band spectra of the four voiceless stops

Figure 4.1. Cursor locations at vowel steady states in the CV (b) and VC (c) contexts as well as at the vowel transition edges in the two contexts (a and d, respectively)

Figure 4.2. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] preceding the four Arabic plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ]

Figure 4.3. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] preceding the velar [k], the three uvulars [q, χ, ʁ], the two pharyngeals [ħ, ʕ] and the two laryngeals [h, ʔ]

Figure 4.4. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] following the three Arabic plain coronals [t, d, s] and their emphatic counterparts [tˤ, dˤ, sˤ]

Figure 4.5. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] preceding the velar [k], the three uvulars [q, χ, ʁ], the two pharyngeals [ħ, ʕ] and the two laryngeals [h, ʔ]

Figure 4.6. Mean F2 transitions next to the non-emphatic coronals and their emphatic counterparts

Figure 4.7. Stylized second formant tracks of the three Arabic vowels [i, a, u] preceding the four Arabic plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ]

Figure 4.8. Stylized second formant tracks of the three Arabic vowels [i, a, u] preceding the Arabic velar [k] as well as the seven gutturals [q, χ, ʁ, ħ, ʕ, h, ʔ]

Figure 4.9. Stylized second formant tracks of the three Arabic vowels [i, a, u] following the three Arabic plain coronals [t, d, s] and their emphatic counterparts [tˤ, dˤ, sˤ]

Figure 4.10. Stylized second formant tracks of the three Arabic vowels [i, a, u] following the Arabic velar [k] as well as the seven gutturals [q, χ, ʁ, ħ, ʕ, h, ʔ]

Figure 5.1. Anticipatory V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across the four plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ]

Figure 5.2. Anticipatory V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across the velar [k], the three uvulars [χ, ʁ, q], the two pharyngeals [ħ, ʕ], and the two laryngeals [h, ʔ]

Figure 5.3. Carryover V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across the four plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ]

Figure 5.4. Carryover V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across [k], the three uvulars [χ, ʁ, q], the two pharyngeals [ħ, ʕ], and the two laryngeals [h, ʔ]

Figure 5.5. Sizes of anticipatory and carryover vowel-to-vowel coarticulatory effects across the sixteen Arabic consonants under investigation

Figure 6.1. X-ray tracings of palatalized and velarized laterals and liquids in Russian (Bolla 1981, plates 76-79)


LIST OF TABLES

Table 2.1 A Summary of the phonetic attributes of Arabic emphatic, uvular, pharyngeal, and laryngeal sounds .................................................................. , ....... 53

Table 2.2. Frequencies of consonant cboccurrences in Arabic roots (from McCarthy 1994:204) ............................................................ , ................................. 57

Table 3.1. Mean values of spectral moments for voiceless continuants averaged across speakers, window locations, and vowel contexts ........................................ 95

Table 3.2. Mean values of spectral moments for voiceless continuants averaged from windows 2 and 3 and across speakers and vowel contexts ........................... 98

Table 3.3. Results of the discriminant analysis for the voiceless continuants based on the four spectral moments' values combined together as predictors .............. 1 04

Table 3.4. Mean values of spectral moments for voiced continuants averaged across speakers, window locations, and vowel contexts ...................................... 105

Table 3.5. Mean values of spectral moments for voiced continuants averaged from windows 3 and 4 and across speakers and vowel contexts .................................. 107

Table 3.6. Results of the discriminant analysis for the voiced continuants based on the four spectral moments' values combined together as predictors ................... 112

Table 3.7. Mean values of spectral moments for voiceless stops averaged across speakers, window locations, and vowel contexts ................................................. 113

Table 3.8. Mean values of spectral moments for voiceless stops calculated at window 1 and averaged across speakers and vowel contexts .............................. 116

Table 3.9. Results of the discriminant analysis for the voiceless stops based on the four spectral moments' values combined together as predictors ......................... 122

Table 3.10. Mean values of spectral moments for voiced stops averaged across speakers, window locations, and vowel contexts ................................................. 123

Table 3.11. Mean values of spectral moments for voiced stops calculated at window 1 and averaged across speakers and vowel contexts .............................. 125

Table 3.12. Results of the discriminant analysis for the voiced stops based on the four spectral moments' values combined together as predictors ......................... 128


Table 3.13. Mean relative intensity values at the 11 frequency bands for the four voiceless continuants averaged from the two sampling windows across

XV

speakers and vowel contexts ................................................................................ 129

Table 3.14. Mean normalized relative intensity values at the 11 frequency bands for the four voiceless continuants averaged from the two sampling windows across speakers and vowel contexts ..................................................... 129

Table 3.15. Results of the discriminant analysis for the voiceless continuants based on the normalized intensity values at each of the 11 frequency bands, averaged from the two sampling window locations, combined together as predictors ........................................................................................... 131

Table 3.16. Mean relative intensity values at the 11 frequency bands for the four voiceless stops averaged across speakers and vowel contexts ............................. 132

Table 3.17. Mean normalized relative intensity values at the 11 frequency bands for the four voiceless stops averaged across speakers and vowel contexts ......... 133

Table 3.18. Results of the discriminant analysis for the voiceless stops based on the normalized intensity values at each of the 11 frequency bands combined together as predictors .......................................................................... 135

Table 3.19. Results of the discriminant analyses for the plain/emphatic consonant pairs based on the spectral moments values as predictors ................................... 137

Table 3.20. Results of the discriminant analyses for the plain/emphatic voiceless consonant pairs based on the normalized relative intensity values of the multi-band spectra as predictors .......................................................................... 137

Table 4.1. Average formant frequency values for the vowel [i] obtained at mid-vowel and transition edge locations in both VC and CV contexts containing all 16 consonants ................................................................................ 154

Table 4.2. Average formant frequency values for the vowel [a] obtained at mid-vowel and transition edge locations in both VC and CV contexts containing all 16 consonants ................................................................................ 155

Table 4.3. Average formant frequency values for the vowel [u] obtained at mid-vowel and transition edge locations in both VC and CV contexts containing all 16 consonants ................................................................................ 156

Table 4.4. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 16 consonants in terms of F1vowel values of the vowel [i] in the context [iC] ................................................ 160

Table 4.5. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 16 consonants in terms of F1offset values of the vowel [i] in the context [iC] ................................................. 160

Table 4.6. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 16 consonants in terms of F1vowel values of the vowel [a] in the context [aC] ............................................... 162

Table 4.7. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 16 consonants in terms of F1offset values of the vowel [a] in the context [aC] ............................................... 162

Table 4.8. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 16 consonants in terms of F1vowel values of the vowel [u] in the context [uC] ............................................... 163

Table 4.9. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 16 consonants in terms of F1offset values of the vowel [u] in the context [uC] ............................................... 163

Table 4.10. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 16 consonants in terms of F2vowel values of the vowel [i] in the context [iC] ................................................ 165

Table 4.11. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 16 consonants in terms of F2offset values of the vowel [i] in the context [iC] ................................................. 165

Table 4.12. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 16 consonants in terms of F2vowel values of the vowel [a] in the context [aC] ............................................... 166

Table 4.13. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 16 consonants in terms of F2onset values of the vowel [a] in the context [aC] ............................................... 166

Table 4.14. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 16 consonants in terms of F2vowel values of the vowel [u] in the context [uC] ............................................... 168

Table 4.15. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 16 consonants in terms of F2offset values of the vowel [u] in the context [uC] ............................................... 168


Table 4.16. Discriminant analysis results for the four classes of Arabic sounds, emphatics, plain coronals, pharyngeals, and uvulars based on the values of F1 transitions in VC contexts ............................................................................... 170

Table 4.17. Discriminant analysis results for the four classes of Arabic sounds, emphatics, plain coronals, pharyngeals, and uvulars based on the values of F2 transitions in VC contexts ............................................................................... 171

Table 4.18. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 14 consonants in terms of F1onset values of the vowel [i] in the context [Ci] ................................................ 177

Table 4.19. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 14 consonants in terms of F1vowel values of the vowel [i] in the context [Ci] ................................................ 177

Table 4.20. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 14 consonants in terms of F1onset values of the vowel [a] in the context [Ca] ............................................... 178

Table 4.21. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 14 consonants in terms of F1vowel values of the vowel [a] in the context [Ca] ............................................... 178

Table 4.22. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 14 consonants in terms of F1onset values of the vowel [u] in the context [Cu] ............................................... 180

Table 4.23. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 14 consonants in terms of F1vowel values of the vowel [u] in the context [Cu] ............................................... 180

Table 4.24. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 14 consonants in terms of F2onset values of the vowel [i] in the context [Ci] ................................................. 182

Table 4.25. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 14 consonants in terms of F2vowel values of the vowel [i] in the context [Ci] ................................................ 182

Table 4.26. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 14 consonants in terms of F2onset values of the vowel [a] in the context [Ca] ................................................ 183


Table 4.27. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 14 consonants in terms of F2vowel values of the vowel [a] in the context [Ca] ............................................... 183

Table 4.28. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 14 consonants in terms of F2onset values of the vowel [u] in the context [Cu] ............................................... 185

Table 4.29. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 14 consonants in terms of F2vowel values of the vowel [u] in the context [Cu] ............................................... 185

Table 4.30. Discriminant analysis results for the four classes of Arabic sounds, emphatics, plain coronals, pharyngeals, and uvulars based on the values of F1 transitions in CV contexts ............................................................................... 187

Table 4.31. Discriminant analysis results for the four classes of Arabic sounds, emphatics, plain coronals, pharyngeals, and uvulars based on the values of F2 transitions in CV contexts ............................................................................... 187

Table 5.1. Second formant frequency means (and standard deviations) in Hz of the V1 transitions preceding the four MSA emphatics and their non-emphatic counterparts .......................................................................................... 213

Table 5.2. Second formant frequency means (and standard deviations) in Hz of the V1 transitions preceding the MSA gutturals and the velar stop [k] ................ 213

Table 5.3. Second formant frequency means (and standard deviations) in Hz of the V2 transitions following the MSA emphatics and their non-emphatic counterparts .......................................................................................................... 218

Table 5.4. Second formant frequency means (and standard deviations) in Hz of the V2 transitions following the MSA gutturals and the velar stop [k] ............... 218



CHAPTER 1

Introduction

1.1 Aims

Much of the phonetic and phonological research on Arabic discusses the sound

classes of emphatics and gutturals. Arabic emphatics ([tˤ, dˤ, ðˤ, sˤ]) are a set of complex

phonemes that are produced with a primary coronal articulation and a secondary articula-

tion involving the retraction of the tongue body into the oropharynx. This secondary ar-

ticulation is what distinguishes the four emphatics from their non-emphatic counterparts

([t, d, ð, s]). Arabic gutturals are a class of consonants produced primarily in the

larynx/pharynx region. Arabic has seven guttural consonants. The two laryngeals, [h] and

[ʔ], are produced at the larynx with a fully open or fully constricted glottis, respectively.

The two pharyngeals, [ħ, ʕ], are produced by a retraction of the tongue root, the anterior

wall of the pharynx, and the epiglottis towards the posterior wall of the pharynx. The

uvulars, [χ, ʁ, q], are produced with a retracted and raised tongue body accompanied, in

the case of [ʁ], by a lowered soft palate forming a constriction in the uppermost oropharynx,

or, in the cases of [χ] and [q], by a raised and flattened soft palate. While gutturals

are clearly produced at different points of articulation, significant phonological evidence

has been presented which suggests that these three subsets are members of a single

phonological natural class in terms of place of articulation.



The term 'emphatics' is one of several terms that have been used to refer to the set

of complex coronals in Arabic. According to Lehn (1963), these sounds have also been

termed pharyngealized, velarized, uvularized, retracted, strongly articulated, and heavy.¹

While some of these terms (including 'emphatics') are rather impressionistic, other terms

reflect disagreements between linguists with regard to the articulatory nature of the

secondary articulation involved in the production of these sounds. The most prominent

proposal is that these sounds are pharyngealized. This is basically a place of articulation term

that reflects the fact that the pharynx is generally narrowed during the articulation of

these sounds. It is possible that the prevalence of this designation emanates from the at-

tractive notion of equating the secondary articulation of emphatics to the primary articu-

lation of pharyngeals since both sets of sounds exist in the same language and in both the

pharynx is constricted. Hence, as detailed below, several phonologists propose formal

representations that involve some sort of a pharyngeal component that is present as a

primary place/articulator feature in gutturals and as a secondary feature in emphatics. As

a result, these proposals face some formidable phonetic and phonological challenges

ranging from phonetic-phonological disparity to theoretical descriptive and analytical in-

adequacies.

The main goal of this study is to highlight the inadequacies of the existing formal

proposals for representing Arabic emphatics and gutturals and to propose alternative rep-

resentations that overcome those weaknesses. Although the central pursuits here revolve

around the nature of emphatic articulation, the linguistic nature of these sounds cannot be

fully understood without including gutturals in the same investigation. While the present

1 Lehn also lists another somewhat curious term, u-resonance.



study concludes that the secondary articulation of emphatics and the primary articula-

tion(s) of gutturals have to be fundamentally different, there are some compelling reasons

to cover gutturals extensively in the same study. First, in order to refute the notion that

emphatics and gutturals employ a similar articulator, we have to compare and contrast the

two classes based on similar parameters. Second, the guttural subclass of uvulars holds

some interesting phonetic affinities to emphatics. We have already mentioned that some

phonologists term emphatics as 'uvularized' in reference to a uvular articulation accom-

panying the main coronal articulation in emphatics. Third, the articulatory similarities

and differences between emphatics and gutturals may not be the same in all languages.

Brief, but important, considerations of other languages are stated in various locations of

this dissertation that highlight this particular issue. Fourth, the existing grouping of the three

subclasses of gutturals (uvulars, pharyngeals, and laryngeals) into one natural class is in

need of clarification, specifically with regard to the unifying foundation of this sound

class.

The present study proposes articulator-based alternatives to the formal representa-

tions of emphatics and gutturals on the basis of acoustic data. The idea is to relate the sa-

lient acoustic correlates of these sounds to their possible articulatory traits. It is possible,

then, to accept or refute the different claims regarding those traits on the basis of acous-

tic/articulatory compatibility. The acoustic data reported here suggest that the secondary

articulation in Arabic emphatics is fundamentally different from the articulations of all

three guttural subclasses.² The secondary articulation in emphatics is argued to be

executed through the retraction of the tongue dorsum with no active involvement of any

pharyngeal component. By comparison, all three guttural subclasses are produced with active

2 To foreshadow, however, the data suggest that there are some common traits between the secondary articulation in Arabic emphatics and the articulation of uvulars.

participation of the pharynx which is understood to be a linguistic reference to the area

extending from the anterior faucial pillars to the larynx, inclusive. Accordingly, emphat-

ics are represented with a primary coronal articulation and secondary dorsal articulation.

Uvulars are represented with a primary radical articulation and a secondary dorsal articu-

lation. Pharyngeals and laryngeals are represented as purely radical sounds. These repre-

sentations are shown to be more adequate at describing and explaining the most promi-

nent phonological phenomena associated with these sounds. Furthermore, it is argued that

the pharyngeal region can be considered an active articulator, not merely a place of ar-

ticulation, for the class of guttural sounds. Unlike oral articulators, however, the pharyn-

geal articulator is defined at an abstract neuromotoric level.

The following section of this chapter explains the basic rationale behind this

study. Section 1.3 acquaints the reader with Modern Standard Arabic (MSA) from which

the experimental data is collected. Section 1.4 overviews the dissertation.

1.2 Rationale

1.2.1 Experimental phonetics and phonological representations

The position taken in this dissertation is that experimental phonetic methods play

an important part in motivating, verifying, and refuting formal phonological representa-

tions. This is in spite of the murky pool of arguments and counterarguments that charac-

terize the phonetics-phonology relationship. This relationship ranges in the literature

from a near-total separation to a full integration of the two fields. The recognition of the



two fields started rather vaguely with Ferdinand de Saussure's (in Course in General

Linguistics; 1915; reprinted in translation in 1966) distinction between langue (a higher

cognitive system of idiosyncratically related signifiers, or signs, and signifieds, or

idealized concepts) and parole (the physical instantiation of speech). But it was Trubetzkoy

(1969) who drew a principled distinction between phonetics and phonology.³ In his view,

phonetics is "the study of sound pertaining to the act of speech, which is concerned with

concrete physical phenomena." This field "would have to use the methods of the natural

sciences". Meanwhile, phonology is "the study of sound pertaining to the system of

language". This field "would use only the methods of linguistics, or the humanities, or the

social sciences" (pp. 3-4). However, in spite of these proposed methodological

delimitations, Trubetzkoy utilizes phonetic terminology based on articulatory and acoustic speech

properties to describe the distinctive oppositions among speech sounds stating that "no

other discipline except phonetics can teach us about individual sound properties".

While Trubetzkoy believed that the minimal components of sound structure are

phonemes, his close colleague and fellow Prague School member Roman Jakobson main-

tained that distinctive features, the building units of phonemes, are the minimal compo-

nents. Jakobson, Fant, and Halle (1952) and Jakobson and Halle (1956), represent the

earliest extensive experimental accounts aimed at characterizing distinctive features by

reference to their acoustic, auditory, as well as articulatory correlates. In this early system

distinctive features are encoded into cover terms that imply a number of phonetic

dimensions: articulatory, acoustic, and perceptual. However, the adequacy of this system is

3 According to Trubetzkoy, however, "Baudouin de Courtenay ... was the first to arrive at the idea that there should be two distinct types of descriptive sound study, depending on whether concrete sounds were to be investigated as physical phenomena or as phonic signals used by a speech community for purposes of communication." (pp. 4-5)



challenged by languages whose phonemic inventories require separation of those dimen-

sions in order to express phonological contrasts. An example directly related to the topic

of this dissertation is provided by McCawley (1967; cited in Anderson 1985). The Jakob-

sonian feature [+flat] refers to sounds that involve a labial or back narrowing of the vocal

tract causing an acoustic lowering of the higher frequency components. In Arabic, the

feature [+flat] describes the back rounded vowel [u] as well as the emphatic consonants

since they involve a secondary constriction in the back of the vocal tract. As things stand

so far, there is no formal problem since, in Arabic, rounding is contrastive only in vowels

while 'pharyngealization' is contrastive only in consonants. But Arabic vowels become

pharyngealized when adjacent to pharyngealized consonants. How can this be expressed

as an acquisition of [+flat] by the vowel [u] which is already specified for that feature?

These challenges notwithstanding, however, it has been generally acknowledged that

phonological units are phonetically grounded since the publication of Jakobson's works.

In the early works of generative phonology, as explained in Chomsky and Halle's

(1968) Sound Pattern of English (SPE), the underlying forms of morphemes are made up

of strings of abstract, but not arbitrary, "phonetic features". These features are universal

since they "represent the phonetic capabilities of man" (p. 299). Further developments in

the theory saw the articulatory basis of features, as well as the phonetic role in phonol-

ogy, receiving considerable attention. Many of the subsequent generative frameworks

used feature systems based on Halle's (1983) proposal that phonological features repre-

sent neural instructions to the articulators. This "Articulatory Model" marks a departure

from the view that links features to passive cavities and places of articulation in the vocal

tract (or to inconsistent descriptions like the location of the highest point in the tongue).



Instead, the articulatory correlates of features are described as the actions of the active

movable articulators. Another influential development in the consideration of the nature

of features is the introduction of Feature Geometry (Clements 1985). Phonological

features came to be recognized as autosegmental entities, rather than matrix entries, that

are hierarchically grouped, reflecting the independent action of certain sets of features in

phonological processes. In some of the later geometrical models (e.g. Keyser and Stevens

1994, Halle 1995, Halle et al. 2000) the anatomical architecture of the human vocal tract

plays a central role in the construction of feature trees. A more recent example

of such models is given in (1). Features are grouped under common articulator nodes that

denote the speech organs that physically execute these features. These nodes are further

grouped under articulator group nodes that reflect the anatomical or neural affinities

among the articulators. Finally, the articulator group nodes, along with the articulator-free

stricture features, are grouped under the root node.

The development of formal phonological representations within the general

framework of generative phonology brought along with it an increasingly tight

integration of phonology and phonetics. An important consequence of this progression is that

phonological representations, which are highly grounded in phonetics, lend themselves to

experimental methods. In recent years, ambitious attempts to subject phonological hy-

potheses to empirical validation (experimental phonology) have been gaining momentum.

The LabPhon forum (1990-present) is a notable example of such aspirations to shift

the field of phonological research into the domain of the mature sciences.


(1) Halle et al.'s (2000) feature tree.

Root: [consonantal], [sonorant]
    Articulator-free features: [suction], [continuant], [strident], [lateral]
    Place
        Lips: [round], [labial]
        Tongue Blade: [anterior], [distributed], [coronal]
        Tongue Body: [high], [low], [back], [dorsal]
    Soft Palate: [nasal], [rhinal]
    Guttural
        Tongue Root: [ATR], [RTR], [radical]
        Larynx: [spread gl], [constricted gl], [stiff vf], [slack vf], [glottal]

1.2.2 Acoustic-articulatory relations

This dissertation is built on the belief that the acoustic attributes of speech sounds

are a reflection of their articulatory qualities. It has been shown in various seminal works

that the different configurations assumed by the vocal tract correspond to systematic

acoustic patterns.⁴ The present dissertation depends on this relationship to further our

understanding of the articulation of the speech sounds in question on the basis of their

various acoustic correlates. In this section we go through the basic tenets of acoustic-

articulatory relation in vowel production. The choice of vowel production as an example

4 As such, this dissertation follows the notion of articulatory and acoustic stability of Stevens (1989, 1999).



follows from the need to limit this discussion to a manageable length. This choice is also

based on the fact that acoustic-articulatory models of vowel production are less complex

than those of obstruents. Vowel articulations involve a single wide open resonator (the

vocal tract) and an energy source at its end (the vibrating vocal folds). Obstruents, on the

other hand, involve more complex resonators due to the higher degree of constriction in

their articulations. Furthermore, the energy source during obstruent articulations resides

within the resonator. Nevertheless, many of the basic aspects of the following discussion

apply to the cases of obstruents as well.

Classic works on vowels (Fant 1960, Stevens and House 1961) have modeled the

human vocal tract (during the articulation of a simple mid-central vowel like English [ə])

as a uniform pipe open at one end (corresponding to the lips) and closed at the other end

(corresponding to the glottis). The different resonating frequencies for this type of reso-

nator are calculated by the formula in (2).

(2) $F_n = \dfrac{(2n - 1)\,c}{4l}$

Where Fn is the nth resonating frequency, c is the velocity of sound, and l is the length of

the tube. What this formula means is that when the pipe resonator is excited by the acous-

tic energy generated by the vibration of the vocal folds it resonates at frequencies corre-

sponding to the odd multiples of the quarter-wavelength of a sine wave. This is because

those multiples coincide with maximum air volume velocity and minimum air volume

pressure at the open end of the tube, and to the opposites at the closed end (Chiba and

Kajiyama 1958; originally published in 1941). These resonating frequencies are known as

'formants'. Based on this formula, a pipe 17.7 cm in length, which approximates the

vocal tract length of an average male speaker (Stevens 1998), would produce formants at the

frequencies 500 Hz, 1500 Hz, 2500 Hz, etc.
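The arithmetic behind these values can be checked with a short script. This is only an illustrative sketch of the standard quarter-wavelength relation Fn = (2n - 1)c / 4l for a uniform tube closed at one end; the function name `tube_formants` is hypothetical, and the speed of sound of 35,400 cm/s is an assumption (it is the value implied by a 500 Hz first resonance in a 17.7 cm tube).

```python
def tube_formants(length_cm, n_formants=3, speed_of_sound_cm_s=35400.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    speed_of_sound_cm_s is an assumed value, not one stated in the text.
    """
    return [
        (2 * n - 1) * speed_of_sound_cm_s / (4.0 * length_cm)
        for n in range(1, n_formants + 1)
    ]


if __name__ == "__main__":
    # A 17.7 cm tube, the length cited in the text for an average male speaker.
    for n, freq in enumerate(tube_formants(17.7), start=1):
        print(f"F{n} = {freq:.0f} Hz")
```

Under these assumptions the first three resonances come out at roughly 500, 1500, and 2500 Hz, and increasing `length_cm` lowers all of them, since the length sits in the denominator.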

Those frequency values are based on the assumption that the vocal tract has a

more or less uniform diameter along its length so as to produce the English mid-central

vowel [ə]. Variations to the diameter of the tube at different locations

(corresponding to the various narrowings in the vocal tract when producing other vowels) have

been shown to correspond to systematic and rather predictable variations in the values of

the formant frequencies (Stevens and House 1955). The perturbation theory of Chiba and

Kajiyama (1958) relates the patterns of changes for a given formant to the constrictions

made at the points of maximum volume velocity (points of minimum volume pressure;

antinodes) or at points of minimum volume velocity (points of maximum volume pres-

sure; nodes) of that formant. If the articulation of a given vowel results in a constriction

at or near the antinode of a certain formant, the formant is lowered. Conversely, if the

constriction is at or near the formant node, the formant is raised. Widening, rather than

constricting, the vocal tract at those points has an opposite effect. Lip rounding has one or

both of two possible articulatory products. It can result in a constriction at the antinodes

of all formants (since, as explained earlier, the odd multiples of the quarter-wavelength of

the sine wave always have their last antinode at the lips) or in a lengthening of the vocal

tract. Constriction at the lips lowers all frequencies since it is a constriction at the anti-

nodes. Vocal tract lengthening also lowers all frequencies since the tube length l in the

denominator of the formula in (2) increases. The approximate locations of the nodes and antinodes for the

formant frequencies F1 and F2 are schematized in Figure 1.1. So, the overall geometric

shape of the vocal tract above the glottis filters the sound energy from the vibrating vocal

folds to give the distinctive acoustic shapes of vowels.

Figure 1.1. Points of minimum velocity (nodes) and maximum velocity (antinodes) for the first two formant frequencies of vowels. An indicates the antinode of formant n, while Nn indicates the node of that formant.
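The node/antinode reasoning above can be made concrete with a small sketch. It assumes the idealized uniform tube closed at the glottis and open at the lips, in which the volume velocity of formant n varies as sin(kx) with k = (2n - 1)π / 2l and x measured from the glottis. The function names and the first-order sign rule (a small constriction shifts the formant in the direction of cos(2kx)) are illustrative assumptions that restate the perturbation principle, not a procedure taken from this dissertation.

```python
import math


def velocity_nodes_and_antinodes(n, length_cm):
    """Return (nodes, antinodes) of volume velocity for formant n.

    Positions are in cm from the closed (glottal) end of a uniform tube
    that is open at the other (lip) end.
    """
    odd = 2 * n - 1
    nodes = [length_cm * (2 * k) / odd for k in range(n)]
    antinodes = [length_cm * (2 * k - 1) / odd for k in range(1, n + 1)]
    return nodes, antinodes


def constriction_effect(n, position_cm, length_cm):
    """Direction in which a small local constriction shifts formant n.

    First-order rule (an assumption of this sketch): the shift follows the
    sign of cos^2(kx) - sin^2(kx) = cos(2kx), i.e. raised near a velocity
    node and lowered near a velocity antinode.
    """
    k = (2 * n - 1) * math.pi / (2 * length_cm)
    sensitivity = math.cos(2 * k * position_cm)
    if sensitivity > 0:
        return "raised"
    if sensitivity < 0:
        return "lowered"
    return "unchanged"


if __name__ == "__main__":
    tube = 17.7  # cm, the tube length used in the text
    for n in (1, 2):
        print(n, velocity_nodes_and_antinodes(n, tube))
    print(constriction_effect(1, tube, tube))  # constriction at the lips, F1
    print(constriction_effect(2, 1.0, tube))   # constriction low in the pharynx, F2
```

For a widening rather than a constriction, the reported direction would simply be reversed, in line with the text.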

As a related example, let us consider the articulation of the three Arabic vowels [i,

u, a]. The high front vowel [i] is articulated with the tongue body raised and fronted to-

wards the alveolar region. As a result of this forward thrust of the tongue mass, the lower

pharynx is usually widened during [i]. The alveolar constriction takes place close to the

node of F2 which yields a high value for that formant. Meanwhile, widening the pharynx

takes place near the node of F1 which is why this frequency is usually low for [i]. For [u],

the tongue body is raised and backed towards the velar region. Also, the lips are rounded

(constricted) and, occasionally, protruded. Both the velar and the labial constrictions take

place near the two antinodes of F2. This is why this formant is usually very low for [u].



Furthermore, the labial constriction takes place at the antinode of F1, which, like F2, is

usually very low for [u]. If the lips are also protruded, this would elongate the vocal tract

and lower both formants as well. Arabic [a] is a mid low vowel that is accompanied by a

mildly narrowed lower pharynx. The oral tract is somewhat wider when taking the

English mid-central vowel [ə] as a reference. The mild constriction in the lower pharynx

takes place close to the nodes of both F1 and F2, which should yield higher values for

both F1 and F2. However, the pharyngeal narrowing effect on F2 seems to be

counterbalanced by the oral widening at the other node of F2. This is why, compared to English [ə],

F1 for Arabic [a] is higher while F2 is about the same.

Figure 1.2 illustrates the relation between the articulatory configurations of the

three Arabic vowels and their typical power spectra. The glottal line spectrum refers to

the frequency components (harmonics) of the energy source which drop in amplitude at a

rate of 12 dB per octave (Pickett 1999). The radiation characteristics refer to the ten-

dency of the higher frequency components to gain in amplitude at a rate of 6 dB per oc-

tave as a result of the radiation of the sound signal out of the lips and into the open air. So

the net source spectrum drops in amplitude at a rate of 6 dB per octave. The transfer

function is basically the filtering effect of the specific shape of the vocal tract. The output

spectra reflect the filtering effects of the vocal tract during the production of each vowel.
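The slope arithmetic in this paragraph can be illustrated with a few lines of code. This is a minimal sketch under the idealization that both slopes are constant per octave; the constant and function names are hypothetical, and the 100 Hz reference frequency in the usage lines is an arbitrary choice for illustration.

```python
import math

# Idealized slopes cited in the text, in dB per octave.
GLOTTAL_SLOPE_DB_PER_OCTAVE = -12.0
RADIATION_SLOPE_DB_PER_OCTAVE = 6.0
NET_SOURCE_SLOPE_DB_PER_OCTAVE = (
    GLOTTAL_SLOPE_DB_PER_OCTAVE + RADIATION_SLOPE_DB_PER_OCTAVE
)


def relative_level_db(freq_hz, ref_hz, slope_db_per_octave):
    """Level at freq_hz relative to ref_hz for a constant per-octave slope."""
    octaves = math.log2(freq_hz / ref_hz)
    return slope_db_per_octave * octaves


if __name__ == "__main__":
    print(NET_SOURCE_SLOPE_DB_PER_OCTAVE)
    # Three octaves above an assumed 100 Hz reference.
    print(relative_level_db(800.0, 100.0, NET_SOURCE_SLOPE_DB_PER_OCTAVE))
```

With these idealized slopes, a component three octaves above the reference sits about 18 dB below it before the vocal tract transfer function is applied.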


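[Figure 1.2 panels: glottal line spectrum and radiation characteristics, followed, for each of the vowels [i], [u], and [a], by the vocal tract shape, the vocal tract transfer function, and the output vowel spectrum. Vertical axis: amplitude in dB (10 to 80); horizontal axis: frequency (0 to 3 kHz).]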

Figure 1.2. Illustration of how the articulation of the three Arabic vowels [i, u, a] is related to their acoustic shapes in the light of the source-filter theory.



1.3 Modern Standard Arabic (MSA)

The experimental portions of this dissertation rely exclusively on data from Mod-

ern Standard Arabic (henceforth MSA). So, it is helpful to be acquainted with this variety

of Arabic and review its phonemic inventory.

Arabic is the main language in the Arab countries which occupy most of the Mid-

dle East and North Africa. Close to 200 million people in that region speak one variety of

Arabic or another as their first language. Furthermore, Classical Arabic (henceforth CA)

is used as a liturgical language by more than 1 billion Muslims around the world. Mus-

lims believe that Islam's holy book, the Holy Qur'an, which is worded in a form of CA

highly admired by Arabs (Kaye 1990), is the direct words of Allah (God). CA is often

referred to as fusˤħaa (clearest). As time passed, different Arabic-speaking peoples naturally developed numerous regional vernaculars that are mostly spoken but rarely written. MSA emerged as a direct descendant of CA that fills the need for a standardized form of

Arabic that can also be expressed in writing. Many Arab intellectuals hail MSA as a more

'proper' form of Arabic than the regional vernaculars which they view as signs of the

corruption that befell the revered CA. MSA is currently the language of the media, the

public education systems, practically all written and technical forms of Arabic, as well as

intellectual circles. MSA can also be thought of as a pan-Arab lingua franca used whenever dialectal differences veer into unintelligibility. According to Holes (1994), the spread of education and mass-media exposure has a "leveling influence" which brings the divergent Arabic dialects gradually closer to MSA.



As noted earlier, MSA is a descendant of CA and retains the basic syntactic, mor-

phological, and phonological systems. But MSA brings added 'standardization'. Bateson

(1967) lists the following main differences between MSA and CA:

1. MSA is a simplified form of CA. This simplification is mostly realized as the

placement of limitations on the choices of syntactic structures and vocabulary

items used. MSA only uses a subset of the possible syntactic structures avail-

able in CA as well as a substantially reduced lexicon.

2. Included in the MSA lexicon are newly derived, coined, and borrowed vo-

cabulary items that are intended to address the need for technical and other

modern-use terminology.

3. There are idiomatic, stylistic, and even syntactic innovations introduced into

MSA mainly due to the influence of European languages. Such influences are

brought about mostly by direct translations of European texts into Arabic.

The MSA phonemes are listed in (3) and (4). These phonemes are essentially di-

rectly inherited from CA. Overall, there are 28 consonant and three vowel phonemes.

Like other Semitic languages, Arabic is known for its root-and-pattern morphological system which differs from concatenative systems in that the morphemes are, more

or less, interwoven rather than linearly ordered. Most Arabic stems are based on roots of

two or three consonants between which vowels are inserted. Generally speaking, the con-

sonantal root carries the semantic meaning of the word while the vocalism and the vowel-

consonant ordering reflect the word's inflection and its part of speech. The example

words in (5) are all based on the tri-consonantal root ktb 'write'. Inflectional prefixes and

suffixes can also be attached to the stems. Compare the examples in (6).


(3) Arabic consonant phonemes.

Places of articulation (columns, front to back): Bilabial, Labiodental, Dental, Alveolar, Palato-alveolar, Palatal, Velar, Uvular, Pharyngeal, Glottal.

Stop:         b, t, d, tˤ, dˤ, k, q, ʔ
Fricative:    f, θ, ð, ðˤ, s, z, sˤ, ʃ, χ, ʁ, ħ, ʕ, h
Affricate:    dʒ
Nasal:        m, n
Trill:        r
Approximant:  w, j
Lateral:      l

(4) Arabic vowel phonemes.

i       u
    a

(5) a. katab 'wrote'
    b. kutib 'was written'
    c. kaatib 'writer'
    d. kitaab 'book'

(6) a. katab-a 'wrote' 3rd m. sg.
       katab-at 'wrote' 3rd f. sg.
    b. ja-ktub 'write' 3rd m. sg.
       na-ktub 'write' 1st m./f. pl.

Following the theoretical proposals of Goldsmith (1976), McCarthy (1979) han-

dles the theoretical challenges this morphological system poses to traditional linear theo-

ries by proposing the separation of the consonantal root, the vocalism, and the CV skele-

ton of the word into separate autosegmental tiers. The consonants and vowels are mapped



into the CV slots of the skeleton by means of association lines as shown in (7). As such,

the consonants that appear separated by vowels in the surface structure of the word are

underlyingly adjacent.

(7) Consonantal Tier    k   t   b
                        |   |   |
    CV-Template         C V C V C
                           \ /
    Vocalic Melody          a
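The tier-to-skeleton mapping can be sketched in a few lines of code. This is a toy simplification, not McCarthy's full formalism: it assumes left-to-right, one-to-one association of root consonants to C slots and of melody vowels to V slots, with the last vowel spreading to any remaining V slots. The function name `associate` and the string encoding of the tiers are hypothetical choices made for illustration.

```python
def associate(root, template, melody):
    """Map a consonantal root and a vocalic melody onto a CV skeleton.

    Toy assumptions: consonants fill C slots left to right; vowels fill
    V slots left to right; the final vowel spreads to leftover V slots.
    """
    if not melody:
        raise ValueError("melody must contain at least one vowel")
    consonants = list(root)
    vowels = list(melody)
    c_index = 0
    v_index = 0
    segments = []
    for slot in template:
        if slot == "C":
            if c_index >= len(consonants):
                raise ValueError("template has more C slots than root consonants")
            segments.append(consonants[c_index])
            c_index += 1
        elif slot == "V":
            # Spread the last melody vowel once the melody is exhausted.
            segments.append(vowels[min(v_index, len(vowels) - 1)])
            v_index += 1
        else:
            raise ValueError(f"unknown skeleton slot: {slot!r}")
    return "".join(segments)


if __name__ == "__main__":
    print(associate("ktb", "CVCVC", "a"))   # single vowel spreads to both V slots
    print(associate("ktb", "CVCVC", "ui"))  # two-vowel melody, one vowel per V slot
```

Under these assumptions the root ktb yields katab with the melody a and kutib with the melody u-i, matching (5a) and (5b). Forms with long vowels such as kaatib need association conventions beyond this sketch.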

1.4 Overview of the dissertation

Besides the current chapter, this dissertation is comprised of six chapters. Chapter

2 (Background and Literature Review) lays out the phonetic and phonological back-

ground and reviews the literature pertaining to Arabic emphatic and guttural sounds. The

chapter discusses the most prominent formal representations of Arabic emphatics and

gutturals and highlights the descriptive and explanatory inadequacies facing them. The

chapter also goes over some of the relevant vocal tract anatomical details.

Chapters 3, 4, and 5 are the core chapters of the dissertation. As noted earlier, this

dissertation investigates the acoustic correlates of Arabic emphatics and gutturals and relates those correlates to the articulatory traits of those sounds. It is essential for the success of this approach to locate salient and reliable acoustic correlates to emphatic and guttural articulations. Each one of the three core chapters focuses on one possible source for

acoustic correlates to articulation. Three sources are focused on here since they have been

widely studied and have been shown to be rich in acoustic cues for articulation: the spec-



tral shapes of the consonants themselves, formant transitions in the vowels adjacent to the

consonants in question, and consonants' effects on vowel-to-vowel coarticulation.

Chapter 3 (Experiment One) focuses on the spectral shapes of Arabic emphatics

and gutturals along with other related consonants. The goal of this chapter is to address

two gaps in the acoustic literature on Arabic emphatics and gutturals. The first gap concerns the consonantal spectral correlates to emphaticness. As explained in the next chapter, the majority of the previous attempts to distinguish emphatic consonants from non-emphatic ones based on their consonantal spectral shapes have been either sketchy or

subjective or both. A comprehensive acoustic comparison between emphatics and their

non-emphatic counterparts is presented in Chapter 3 using more recent objective methods

of characterizing consonant spectra. The chapter concludes that no highly reliable acous-

tic correlates to emphaticness can be located in the spectral shapes of the consonants

themselves. This excludes the canonical spectra of consonants as the potential acoustic

source to pursue when addressing the main goals of this dissertation. The second gap ad-

dressed in this chapter concerns the consonantal status of Arabic uvular continuants.

These sounds are sometimes described as approximants, a classification that has crucial

theoretical repercussions as explained in the next chapter. The chapter concludes that

Arabic uvular continuants possess strong fricative spectral qualities. This finding demands

major reconsiderations of the phonological claims that are based on the treatment of all

Arabic gutturals as approximants.

Chapter 4 (Experiment Two) examines the coarticulatory impact of the sounds in

question on the formant frequencies of adjacent vowels. While this issue has been treated

thoroughly in the literature, this experiment aims to provide more objective evaluations



of the precise coarticulatory correlates to emphatic and guttural articulations. The subtle

similarities and differences among emphatics, uvulars, and pharyngeals are highlighted

and interpreted as solid indications of the characteristic articulatory properties of those

sounds. The main and only reliable correlate to emphaticness is shown to

be a substantially low and stable F2 locus in the adjacent vowel. Uvulars are also associ-

ated with low F2 transitions. However, unlike emphatics, uvulars are not associated with

identifiable F2 loci in adjacent vowels. The magnitude of F2 drop in uvulars depends on

the identity of the vowel. Pharyngeals are associated with consistently high F1 transitions. While emphatics and uvulars are also generally associated with high F1 transitions,

this association is not as strong nor as stable as in the case of pharyngeals. These findings

are interpreted as indications that only pharyngeals achieve their pharyngeal constriction

through active tongue root retraction in Arabic. Emphatics and uvulars, on the other hand,

involve mainly tongue dorsum retractions. Any tongue root retraction in these sounds is a

by-product of the general retraction of the tongue mass. This challenges the phonological

views that represent Arabic emphatics and uvulars as [+RTR] sounds. Furthermore, the

more stable association between low F2 transitions and emphatics, as opposed to uvulars,

is interpreted as an indication that the dorsal retractions in emphatics and uvulars are not

fully similar. This particular issue is addressed further in Chapter 5.
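One common way to quantify how stable an F2 locus is uses a locus equation: F2 at vowel onset is regressed on F2 at the vowel midpoint across vowel contexts. The sketch below illustrates that general technique only; it is not presented as the procedure used in Chapter 4. All frequency values are hypothetical and are generated from a straight line purely so that the arithmetic is easy to follow, and the function names are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def f2_locus(slope, intercept):
    """Frequency where the fitted line crosses F2_onset = F2_midpoint."""
    if slope == 1.0:
        raise ValueError("slope of 1 means no fixed locus can be estimated")
    return intercept / (1.0 - slope)


if __name__ == "__main__":
    # HYPOTHETICAL midpoint F2 values (Hz) for three vowel contexts.
    f2_mid_hz = [2200.0, 1000.0, 1400.0]
    # HYPOTHETICAL onset values built from an assumed line (slope 0.1,
    # intercept 900 Hz) to mimic a consonant with a stable, low locus.
    f2_onset_hz = [0.1 * f + 900.0 for f in f2_mid_hz]

    slope, intercept = fit_line(f2_mid_hz, f2_onset_hz)
    print(f"slope = {slope:.2f}, intercept = {intercept:.0f} Hz")
    print(f"estimated F2 locus = {f2_locus(slope, intercept):.0f} Hz")
```

A slope near zero means that onset F2 barely changes with the vowel, which is what a stable locus amounts to. A slope near one means that onset F2 simply tracks the vowel, so no identifiable locus exists.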

Chapter 5 (Experiment Three) examines the vowel-to-vowel coarticulatory effects

across the sounds in question. It is widely acknowledged that the tongue dorsum is the

main articulator in vowels. Since the sound classes investigated in this dissertation in-

volve different degrees and types of tongue participation in their production, the influ-

ence of those sounds on vowel-to-vowel coarticulation provides an experimental oppor-



tunity to compare and contrast their articulations. This is particularly true in the cases of

emphatics and uvulars, both of which involve active participation of the tongue dorsum.

The results show that plain orals, pharyngeals, and laryngeals permit substantial degrees

of vowel-to-vowel coarticulation. Emphatics, on the other hand, strongly resist such effects. Uvulars' impact on vowel-to-vowel coarticulation depends on their degrees of constriction. The uvular stop [q] shows emphatic-like resistance to vowel-to-vowel coarticulation. The voiceless fricative [χ] allows significant vowel-to-vowel coarticulation. The voiced fricative [ʁ] is highly transparent to vowel-to-vowel coarticulation. These results are interpreted in the light of the possible musculature involved in the production of emphatics and uvulars. Dorsal retraction in emphatics is most likely produced through contraction of two lingual muscles, the styloglossus and the hyoglossus, both of which are implicated in the production of vowels. Dorsal retraction in uvulars is attributed to the styloglossus only. The magnitude of this participation depends on the degree of constriction involved in the sound. Tongue raising in uvulars is attributed to the contraction of the palatoglossus, which is not implicated in the production of vowels.

Chapter 6 (Implications and Alternatives) casts the acoustically-motivated articu-

latory claims in the three experiments into formal phonological representations of Arabic

emphatics and gutturals. Emphatics are considered to be secondarily dorsal sounds. Uvu-

lars are considered to be secondarily dorsal and primarily radical. Pharyngeals and laryn-

geals are considered to be radical sounds. The representations based on these considera-

tions are shown to be more capable of handling the challenges that face previous

proposals. Chapter 6 also proposes an abstract neuro-motoric foundation for the grouping

of the three guttural subclasses into one natural class. The chapter then concludes with a


brief look at emphatics and gutturals in Tigre and St'át'imcets, which differ from Arabic in

that their guttural natural classes include emphatics but exclude laryngeals. It is suggested

that these phonological differences are explainable on the basis of phonetic differences.

Chapter 7 (Conclusion and Recommendations) concludes the dissertation and

suggests possible topics for future research.


CHAPTER 2

Background and Literature Review

This chapter lays out the phonetic and phonological background for the disserta-

tion. Since there are numerous anatomical references in this chapter and throughout the

dissertation, the first section in this chapter goes over the most relevant speech organs in

some detail. Section 2.2 reviews the phonetic literature on Arabic emphatics, uvulars,

pharyngeals, and laryngeals. The section concentrates on the most prominent articulatory

and acoustic reports on these sounds. Occasional reviews of perceptual works are also

provided. Section 2.3 goes over the phonological evidence supporting the grouping of the

three Arabic guttural subclasses into a single natural class in terms of place of articula-

tion. The section concentrates on two types of phonological evidence (morpheme struc-

ture constraints and vowel lowering in guttural contexts) as reported in Arabic as well as

other related and unrelated languages. Section 2.4 reviews the most prominent formal

phonological representations of Arabic emphatic and guttural sounds. The section focuses

on the representations in McCarthy (1994), Rose (1996), and Zawaydeh (1999). These

three examples cover the general formal representational trends as far as Arabic emphat-

ics and gutturals are concerned. Section 2.5 highlights the descriptive and explanatory

inadequacies facing those proposals and links them primarily to the lack of a clear under-

standing of the articulatory traits of these sounds.



2.1 Basic Vocal Tract Anatomy

At several locations in this dissertation, extensive references to articulatory organs

and their musculature are made. In this section, we take a look at the active articulatory

organs and describe the main muscles that underlie their actions. Only the articulatory

organs that are directly implicated in the articulation of the sounds of interest are dis-

cussed here. This is why, for example, the lips are ignored in the following review. The

details presented in this section are based on the descriptions and illustrations of Zemlin

(1968), Perkins and Kent (1986), Lieberman and Blumstein (1988), Palmer (1993), and

Seikel et al. (1997).

2.1.1 The Tongue

This flexible mass of muscle fiber is arguably the most notable articulatory organ.

From a linguistic perspective, the tongue is divided into four parts, mostly on the basis of

their relation to the fixed structure of the vocal tract: the tip, the blade, the dorsum, and

the root. The rear and radical portions of the tongue are fixed to the velum, the pharynx,

the epiglottis, and the hyoid bone. The lingual movements are executed by two sets of

muscles: the intrinsic muscles of the tongue, which originate from inside the tongue it-

self, and the extrinsic muscles of the tongue, which arise from neighboring structures and

terminate at various points in the tongue.

The intrinsic muscles of the tongue are the superior longitudinal muscle, the infe-

rior longitudinal muscle, the transverse muscle, and the vertical muscle. The superior

longitudinal muscle is a sheet of muscle tissue that extends throughout the length of the


tongue just below its upper surface. When contracted, it shortens the tongue or lifts the tip

and neighboring sides upwards giving the tongue a concave shape. Contracting one side

of this muscle alone causes the tongue to turn to that side. The inferior longitudinal mus-

cle is a paired muscle that arises from the root of the tongue and extends all the way to its

tip. It courses along the lower surface of the tongue following two side paths separated

along the middle by the genioglossus muscle (discussed below). Contracting this muscle

shortens the tongue or lowers its tip. Like the superior longitudinal muscle, contraction of

one side of this muscle causes the tongue to turn to that side. The transverse muscle is a

paired muscle whose fibers radiate from the median fiber wall of the tongue and stretch

laterally to terminate at the side edges of the tongue. Contraction of this muscle narrows

the tongue and lengthens it. The vertical muscle is also a paired muscle whose fibers ex-

tend vertically from just below the upper surface of the tongue flowing downward to-

wards the base of the tongue. Along the way, these fibers intertwine with those of the

transverse muscle. This muscle flattens the tongue when contracted.

The extrinsic muscles of the tongue are the genioglossus muscle, the hyoglossus

muscle, the styloglossus muscle, and the palatoglossus muscle. These muscles are sche-

matized in Figure 2.1. The genioglossus is the largest of the tongue muscles. Its fibers

arise from the inside surface of the mandible and fan upward and backward to insert into

the tongue from its tip all the way to its root. It occupies a medial location along the

width of the tongue. Contracting the anterior portion of the genioglossus draws the

tongue back while contraction of the posterior portion slides the tongue forward. When

both portions are contracted, the tongue assumes a concave shape along the middle. The

hyoglossus is a paired muscle that arises from the hyoid bone and inserts into the lower sides of the tongue. When contracted, the hyoglossus lowers and retracts the tongue mass. The styloglossus is a paired muscle that emerges from the styloid process and inserts into the lower sides of the tongue. Contraction of this muscle draws the tongue back and upwards. The palatoglossus is a paired muscle considered to be a part of the anterior faucial pillars. This muscle is classified anatomically as one of the extrinsic muscles of the tongue as well as one of the muscles of the soft palate (see §2.1.3 below). It originates from the anterior portion of the soft palate and inserts into the sides of the back of the tongue. Contracting this muscle either lowers the soft palate or, if the soft palate is fixed, raises the back of the tongue.

[Figure 2.1: image not reproduced]

Figure 2.1. The extrinsic muscles of the tongue along with some other vocal tract organs.

1. Maxilla
2. Genioglossus Muscle
3. Mandible
4. Geniohyoid Muscle
5. Palatoglossus Muscle
6. Styloglossus Muscle
7. Hyoglossus Muscle
8. Hyoid Bone

[Figure 2.2: image not reproduced]

Figure 2.2. The pharyngeal constrictors and related structures.

1. Stylohyoid ligament
2. Pterygomandibular ligament
3. Larynx
4. Trachea
5. Superior Constrictor Muscle
6. Middle Constrictor Muscle
7. Inferior Constrictor Muscle
8. Esophagus

2.1.2 The Pharynx

The pharynx is, roughly speaking, a tube-like structure that extends from the pos-

terior region of the nasal cavity to the larynx. The upper region of the pharynx above the

velum is called the nasopharynx. The region extending from the velum down to the hyoid

bone is known as the oropharynx. The region from the hyoid bone down is called

the laryngopharynx. The most notable of the pharyngeal muscles are its three constrictor

muscles shown in Figure 2.2. The superior constrictor muscle originates from the ptery-

gomandibular ligament and courses backwards to insert into the midline tendinous raphe.

The middle constrictor muscle originates from the hyoid bone and the stylohyoid liga-

ment and courses backward to insert into the midline tendinous raphe. The inferior con-

strictor muscle is a rather large sheet of muscle fibers. It starts from the cricoid cartilage

and the thyroid lamina and fans back and around to insert into the midline tendinous ra-



phe. Contracting any of the three pharyngeal constrictors narrows the diameter of the

pharynx at its particular location.

2.1.3 The Soft Palate

The soft palate, or velum, is a flexible flap of muscle fibers and other tissues that

forms the posterior part of the roof of the mouth. It is attached anteriorly to the rear edge

of the hard palate, by means of the palatal aponeurosis, and laterally to the superior con-

strictors of the pharynx. In speech, the soft palate plays a major role in the production of

nasal sounds. When fully lowered it allows the pulmonic airstream to pass through the

nasal cavity producing the acoustic and auditory effect of nasalization.

As shown in Figure 2.3 the soft palate has two main elevator muscles: the levator

veli palatini muscle and the uvular muscle; and two main depressor muscles: the pala-

toglossus muscle and the palatopharyngeus muscle. The levator veli palatini is a paired

elevator muscle. It arises from the temporal bone and the Eustachian tube and descends to

insert into the aponeurosis of the velum. When contracted, this muscle lifts the velum up

and back. The other elevator muscle, the uvular muscle, originates from the posterior

palatal bones and from the palatine aponeurosis and extends backwards until it inserts into

the uvula. When contracted, this muscle shortens and raises the velum. The palatoglossus

has already been described as one of the extrinsic muscles of the tongue. As explained

above, contraction of this muscle lowers the soft palate or, if the soft palate is fixed, it

raises the back of the tongue. The palatopharyngeus is another velar depressor. Its fibers

arise from the soft palate and stretch laterally and downwards and attach to the thyroid

cartilage as well as the pharyngeal walls. It is part of the posterior faucial pillars. Contraction of this muscle brings the faucial pillars closer and lowers the velum in a sphincteric move that narrows the diameter of the pharynx. It can also raise the larynx.

[Figure 2.3: image not reproduced]

Figure 2.3. Muscles of the soft palate along with related structures.

1. Palatal Bone
2. Uvular Muscle
3. Palatoglossus Muscle
4. Tongue
5. Levator Veli Palatini Muscle
6. Palatopharyngeus Muscle

[Figure 2.4: image not reproduced]

Figure 2.4. Structure of the larynx.

1. Hyoid Bone
2. Hyothyroid Membrane
3. Thyroid Cartilage
4. Cricothyroid Muscle (Pars Recta)
5. Cricothyroid Muscle (Pars Oblique)
6. Arytenoid Cartilage
7. Cricoid Cartilage
8. Trachea

2.1.4 The Larynx

The larynx (Figure 2.4) is a complex structure made up of muscle tissues and car-

tilages. The main cartilage structures that make up the larynx are the cricoid cartilage, the

thyroid cartilage, the arytenoid cartilages, and the epiglottis. The cricoid rests atop the

upper end of the trachea. It is shaped like a ring with its posterior part reaching higher

(i.e., is thicker) than the anterior part. The thyroid is the largest laryngeal structure. It is

situated above the cricoid and the two are joined by means of a pair of facets, one on each

side. The arytenoid cartilages are small pyramid-shaped structures that rest on facets lo-

cated on the two sides of the upper posterior surface of the cricoid cartilage. Attached to

the front and sides of these cartilages are the vocal folds which are multi-layered tissues

whose other edge is attached to the inner surface of the front of the thyroid. The epiglottis

is a shoehorn-shaped cartilage that is attached to the lower inner surface of the front of

the thyroid. It rises upwards extending above the level of the hyoid bone. It is attached to

the arytenoid cartilages by means of the aryepiglottic folds. While not structurally a part

of the larynx, the hyoid bone is closely related to laryngeal and oral structures. It is an

arch-shaped bone sitting above the larynx and is linked to the thyroid by means of the

lateral hyothyroid ligaments which extend from the posterior tips of hyoid downwards to

the horns of the thyroid. Also, the two structures are linked by means of the hyothyroid

membrane which drapes down from the hyoid and attaches to the upper rim of the thy-

roid.



The larynx has the following intrinsic muscles: the lateral cricoarytenoid muscle,

the transverse arytenoid muscle, the oblique arytenoid muscles, and the cricothyroid

muscle. The lateral cricoarytenoid muscle is a vocal fold adductor muscle. It extends

from the upper rim of the cricoid cartilage to the muscular process of the arytenoid cartilages. Contraction of this muscle rotates the arytenoids bringing the vocal folds closer to

each other. The transverse arytenoid muscle stretches from the side and back of one

arytenoid to the other. When contracted, it brings the arytenoids, and subsequently the

vocal folds, closer to each other. The oblique arytenoid muscles are paired muscles. Each

one arises from the bottom of the posterior part of one of the arytenoids and inserts into

the top of the other. From there it continues upward to form the aryepiglottic muscles

which insert into the sides of the epiglottis. Contraction of the oblique arytenoid muscles

has a rather similar effect to that of the transverse arytenoid muscles. Contraction of the

aryepiglottic muscles, with the aid of the oblique arytenoid muscles, pulls the epiglottis

down to cover the larynx. The cricothyroid muscle arises from the sides and front of the

cricoid cartilage then divides into two parts: the pars recta and the pars oblique. The for-

mer extends upwards and the latter extends up and backwards till they both attach to the

lower edge of the thyroid. Contraction of the pars recta tilts the front of the thyroid down

increasing the distance between the thyroid and the arytenoids causing a tension in the

vocal folds. Contraction of the pars oblique slides the thyroid forward also increasing the

distance between the thyroid and the arytenoids and tensing the vocal folds.



2.2 Phonetic Properties of Arabic Emphatics and Gutturals

2.2.1 Emphatics

The famed Arab grammarian Sibawayh⁵ (d. circa 796 A.D.) notes that the four emphatics [tˤ, dˤ, ðˤ, sˤ] are articulatorily similar to their non-emphatic counterparts [t, d, ð, s], the exception being that in the emphatics, "your tongue would cover (the area extending) from their main place of articulation to portion of the palate opposite the tongue (which) you raise towards the palate" (Kitab Sibawayh, vol. 2, p. 406). For this reason, Arab grammarians termed these four sounds mutˤbaqah ('covered'). Ibn Sina (d. 1037

A.D. - known in western historical and philosophical circles as Avicenna) adds that

emphatics are articulated with a depressed tongue surface behind the main articulation.

This point, as discussed below, is verified by modern research techniques.

Modern studies show that, beside their primary coronal articulation, all Arabic

emphatics have a secondary articulation involving the back of the tongue. Descriptions of

the latter involvement differ from one study to the other. It is generally accepted that the

secondary emphatic articulation involves mainly a retraction of the tongue body. The

schematic in Figure 2.5 illustrates this articulatory configuration. Among the earliest x-ray examinations of emphatics is the one done by Al-Ani (1970). His x-ray tracings clearly show that the tongue body is pulled backwards into the upper oropharynx during the articulation of [tˤ]. Based on this articulatory evidence, the author favors pharyngealization over velarization as the proper description for the secondary emphatic articulation.

5 For some reason, Sibawayh's name is misspelled as 'Sibawayhi' in the majority of the modern works published in western countries. The source of the added i at the end of the name is not known to me. The spelling I use here is a transliteration of the Arabic spelling,


Figure 2.5. A schematic illustration of the vocal tract configuration during the articulation of an Arabic emphatic coronal and its non-emphatic counterpart. This schematic is based on descriptions and illustrations in Al-Ani (1970), Ali and Daniloff (1972), and Ghazeli (1977).

The cinefluorographic investigation by Ali & Daniloff (1972) using Iraqi speakers

arrived at similar findings. The difference reported by the authors between emphatics and

non-emphatics is that the former class of sounds involves a retraction of the pharyngeal

tongue dorsum causing a narrowing in the upper pharynx. The authors found that the posterior wall of the pharynx and the velum were not significantly implicated in the articulatory difference. The only significant involvement of the velum occurs during the production of [kˤ] ([q]), which they consider an emphatic version of [k], during which

the velum is moved toward the tongue. Additionally, the authors are careful to point to an

active participation by the palatine tongue dorsum in the articulatory difference between

emphatics and non-emphatics. The palatine dorsum is depressed during emphatics, causing a widening of the oral cavity, an adjustment also shown in Al-Ani's tracings.


A more extensive articulatory investigation is offered by Ghazeli (1977), who points out that the accompanying depression of the palatine dorsum is either the cause or the

result of the rearward movement of the tongue back. The author reports that the retraction

of the tongue back into the upper pharynx takes place "at the level of the second cervical

vertebra" (p. 72). The precise location of the constriction, however, does not seem to be

an area of agreement among articulatory studies. Based on x-rays of a speaker of Bagh-

dad Arabic, Giannini & Pettorino (1982) report that the extremum of the pharyngeal con-

striction takes place closer to the level of the third and fourth vertebrae.

The x-ray-based investigations of Ali & Daniloff (1972) and Ghazeli (1977) point

to a retraction in the upper pharynx during emphatics achieved by a retraction of the

tongue body towards the posterior pharyngeal wall while little or no adjustments take

place in the lower pharynx (Al-Ani's (1970) x-ray tracings do not show the lower phar-

ynx). Ghazeli notes that there is an accompanying backward movement of the epiglottis

but no significant adjustments in the laryngopharynx. This suggests that the epiglottal

constriction is a byproduct of the general retraction of the tongue. However, Laufer &

Baer (1988) argue that the epiglottal constriction is actually what defines the secondary

emphatic articulation on the basis of a fiberscopic study of nine subjects speaking Arabic, Hebrew, or both languages. Their images show noticeable backing of the epiglottis in

emphatics as opposed to non-emphatics. The authors conclude that the secondary articu-

lation in emphatics and the primary articulation in pharyngeals are qualitatively similar: a

constriction in the lower pharynx achieved by a backward movement of the epiglottis to

form a constriction with the pharyngeal walls. According to the study, however, the pha-

ryngeal constriction is less extreme and less constant in emphatics than in pharyngeals.


The fiberscopic study by Zawaydeh (1999) also concludes that there is an articulatory similarity between emphatics, uvulars, and pharyngeals, all of which involve pharyngeal

narrowing. Zawaydeh's study, however, does not discuss the precise locations of the con-

strictions.

Fiberscopic images are a very valuable method for investigating the lateral and

annular movements in the pharynx which cannot be captured by lateral x-ray images.

However, compared to x-rays, fiberscopes are at a disadvantage when we consider the

area of coverage and the coordination between articulators. Rhinal fiberscopes are typi-

cally inserted into the subject's nostril, extended backwards through the nasal cavity, and

then dangled downwards into the upper oropharynx below the level of the uvula. They

provide a top-to-bottom look at the mid and lower pharynx. Thus, fiberscope images can-

not capture the whole tongue dorsum nor can they reliably judge the vertical placement of

the larynx. Lateral x-ray images are wide enough to cover the whole vocal tract and show how the movements of all articulators are timed and coordinated. The x-ray tracings

in the studies discussed above show a rearward movement by the epiglottis accompany-

ing the general backing of the tongue dorsum. Furthermore, Giannini and Pettorino

(1982) state that the aryepiglottic muscle which, when contracted, depresses the epiglottis

backwards is not involved in this articulation. Accordingly, Laufer & Baer's (1988) con-

clusions regarding the common active articulator for emphatics and pharyngeals have to

be questioned. A rearward movement of the epiglottis can be the result of a general re-

traction of the tongue back or the tongue root to which the epiglottis is attached. It was

mentioned earlier that Ghazeli (1977) and Giannini & Pettorino (1982) noted such dis-



placement in x-ray images, but it was not the most significant articulatory difference be-

tween emphatics and non-emphatics.

So far, most attempts to distinguish emphatic consonants from their non-emphatic

counterparts on the basis of their acoustic shapes have depended on visual inspection of

spectrograms and have generally met with little success. Using synthesized sound tokens, Obrecht

(1961) found that [s] and [sˤ] cannot be perceptually distinguished from each other based

on the lower cutoff edge of their fricative noise portions. The lower frequency cutoffs for [s]

and [sˤ] reported by Al-Ani (1970) are at about 3000 Hz and 2750 Hz, respectively.

Ghazeli (1977), however, found that both [s] and [sˤ] have energy concentrations that

start at 3000 Hz. Giannini and Pettorino (1982) report that the spectrograms of both [s]

and [sˤ] exhibit similar "irregular striations of equal intensity above 3000-4000 cps".

Card (1983) also found it impossible to link the difference between the two sibilants to

the lower edge of spectral frequency. Norlin (1987) could not find differences between

the Egyptian Arabic emphatic/plain fricative pairs [s, sˤ] and [z, zˤ] using mingograms

and spectrograms. However, using critical band spectra he concluded that the spectral

center of gravity in emphatics was generally lower than in non-emphatics. Although he

also found that emphatics, on average, had higher energy dispersion and lower mean in-

tensity than non-emphatics, this was not true for all of his subjects.
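
As a rough illustration of the center-of-gravity measure mentioned above, the following minimal Python sketch computes a power-weighted mean frequency from a plain DFT spectrum. It does not reproduce Norlin's critical-band analysis. The function names, the sampling rate, the frame length, and the two synthetic test tones are all assumptions made for illustration only, not measured data.

```python
import cmath
import math

def power_spectrum(samples):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for short frames)."""
    n = len(samples)
    powers = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        powers.append(abs(coeff) ** 2)
    return powers

def center_of_gravity(samples, sample_rate):
    """Power-weighted mean frequency of the spectrum, in Hz."""
    powers = power_spectrum(samples)
    n = len(samples)
    freqs = [k * sample_rate / n for k in range(len(powers))]
    return sum(f * p for f, p in zip(freqs, powers)) / sum(powers)

# Synthetic stand-ins (assumed values): two pure tones in a 50 ms frame.
sample_rate = 8000
frame_length = 400
high_tone = [math.sin(2 * math.pi * 3000 * i / sample_rate) for i in range(frame_length)]
low_tone = [math.sin(2 * math.pi * 2000 * i / sample_rate) for i in range(frame_length)]

cog_high = center_of_gravity(high_tone, sample_rate)
cog_low = center_of_gravity(low_tone, sample_rate)
print(cog_low < cog_high)
```

A signal whose energy sits lower in the spectrum yields a lower center of gravity, which is the direction of difference reported for emphatics relative to non-emphatics.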

For the most part, Al-Ani (1970) reports no difference in duration between em-

phatics and non-emphatics. Meanwhile, Giannini and Pettorino (1982) found that the dif-

ferences in duration between emphatics and non-emphatics demonstrate some variation:

before [a], emphatics are longer while before [i] non-emphatics are longer.



The voiced pair [ð, ðˤ], on the other hand, shows detectable acoustic differences

in their spectrograms due to the absence of intense noise that would otherwise mask those

differences. Ghazeli (1977) found that both fricatives have visible formant-like structures.

For [ð], F2 is at 1600, 1600, and 1400 Hz after [i], [a], and [u], respectively. For [ðˤ], the values

are 1100, 1000, and 800 Hz.
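
The size of the F2 difference between the plain and the emphatic voiced fricative can be read directly off the values Ghazeli (1977) reports. The short Python sketch below only does that subtraction; the variable names are illustrative, and the numbers are the ones cited above.

```python
# F2 values (Hz) as reported by Ghazeli (1977), ordered by vowel context.
vowels = ["i", "a", "u"]
f2_plain = [1600, 1600, 1400]      # plain voiced fricative
f2_emphatic = [1100, 1000, 800]    # emphatic counterpart

# Difference per vowel context: plain minus emphatic.
f2_drop = {v: p - e for v, p, e in zip(vowels, f2_plain, f2_emphatic)}
for vowel, drop in f2_drop.items():
    print(f"[{vowel}] context: F2 lower by {drop} Hz in the emphatic")
```

The difference is 500 Hz in the [i] context and 600 Hz in the [a] and [u] contexts.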

Regarding emphatic/non-emphatic stops, Al-Ani (1970) reports that the energy

concentration in the stop burst of [tˤ] is lower than in the burst of [t]. No such difference

was reported by Giannini and Pettorino (1982) nor by Ghazeli (1977) who found that the

most visible concentration of energy in the bursts of both stops is at 4000 Hz. However,

Ghazeli found that the VOT is longer for [t] than for [tˤ] (30 msec vs. 10 or 15 msec).

Giannini and Pettorino also report that [t], as opposed to [tˤ], is sometimes followed

by aspiration. This comes in spite of Fre Woldu's (1981) finding that there is no difference

in peak intraoral pressure (expressed in mm H2O) between emphatic and non-emphatic

consonants.

The coarticulatory effect of emphatics on neighboring vowels, or emphasis spread

(ES), is a well known acoustic attribute of these sounds. The most reported effects are a

lowered F2 and a raised F1 (either at the transition only or throughout the vowel). The

rise in F1 is not reported in all studies. Al-Ani (1970) reported large F2 onset drops in

vowels following emphatic consonants as opposed to non-emphatic ones. The vowel [i]

exhibited rising transitions from emphatics while [u] had falling transitions. Meanwhile,

no major differences were noticed between the frequency values at the transition and

steady state of [a]. The absence of coarticulatory effects on [a] is somewhat surprising

since most of the other studies indicate that [a] is the vowel most susceptible to ES.



Ghazeli (1977) found that the drop in F2 extends throughout [a], while in [i] only the onset

of F2 is low, followed by a rising transition, and there is no transition in [u]. He

also found F1 is raised in all vowels. Younes (1982) found similar patterns in Northern

Palestinian. Meanwhile, F3 in vowels does not seem to reflect any coarticulatory influ-

ence by adjacent emphatics. Giannini & Pettorino (1982) found no change in F3 locus

next to emphatics while El-Dalee (1984) reports that the changes in F3 were inconsistent.

The spread of emphasis was found to be a strong perceptual cue for the presence

of a secondary articulation in the consonant. Ali and Daniloff (1974) prepared truncated

Baghdad Arabic minimal pairs of natural words. In each word, the sound that was spliced

away was an emphatic or its non-emphatic counterpart in either a word-initial or word-

final position with a vowel adjacent to it. When they presented the tokens in carrier

phrases to their ten subjects, the authors found that, in a statistically significant number of

cases, speakers could tell whether a word contains an emphatic or a non-emphatic. The

authors did not attribute the perceptibility of emphasis to any single vowel formant.

However, Obrecht (1961), who used synthetic tokens, found that the low locus of F2 next

to emphatics was successful in cuing the perception of emphaticness. He found the per-

ceptually-effective F2 locus next to emphatics, or the "zone of velarization" (he claims

that the secondary articulation in emphatics is velarization), to be between 1000 and 1400

Hz.

Some segments have been shown to be opaque to ES. This means that these segments

resist the articulatory and acoustic effects of ES and block those effects

from reaching beyond them to other segments. The most reported opaque sound is

the high front vowel [i] (e.g., Ghazeli 1977, Card 1983, Heath 1987, Younes 1993, and



Davis 1995). The semivowel [j] and the voiceless fricative [ʃ] have also been frequently

reported to be opaque to ES (e.g., Card 1983, Heath 1987, Younes 1993, Davis 1995,

Shahin 1997a, b). In Arabic dialects that possess the voiced fricative [ʒ], this sound is

also cited as an opaque segment (e.g., Heath 1987). Additionally, Shahin (1997a, b)

found that, in Abu Shusha Palestinian Arabic, the two affricates [tʃ, dʒ] are opaque to ES.

A common articulatory trait between those opaque segments is that they involve raising

and fronting of the tongue dorsum. This maneuver is antagonistic to the tongue dorsum

retraction involved in the secondary articulation in emphatics which is what gets spread

to neighboring segments. Equally important, but generally ignored, is the fact that the ar-

ticulation of those opaque segments negates the lowering of the palatine dorsum surface

which is witnessed during emphatic articulation.

Emphasis spread generally travels both leftward and rightward relative to the em-

phatic consonant. Most reports in the literature indicate that leftward ES is more sizable

and more constant than rightward ES. Ghazeli (1977) found that R-to-L ES is less re-

stricted than L-to-R ES. The former can be weakened but not blocked by [i] or [j], while

the latter is strongly weakened or blocked by [i]. Younes (1993), however, finds that,

while the same is true for Palestinian Arabic, Cairene Arabic exhibits the opposite trend:

ES is less restricted in the L-to-R direction than in the R-to-L direction. Zawaydeh (1999)

provides acoustic evidence suggesting that, in Ammani-Jordanian Arabic, L-to-R ES is

gradient while R-to-L is categorical. This means that, in the L-to-R direction, the acoustic

effect of ES wears off as we move further away from the emphatic consonant. The acoustic

effect of R-to-L ES, on the other hand, remains strong and relatively constant. El-Dalee's

results agree with Zawaydeh's, but he points out that while L-to-R ES is gradient, it



still lies within "the acoustic range which is assumed to induce the perception of

(emphaticness)" (p. 141).

Word boundary delimits ES as reported by Ghazeli (1977) and Younes (1993).

There is, however, disagreement on the effect of morpheme boundaries on ES. In both

Palestinian Arabic and Cairo Arabic, Younes (1993) found that a morpheme boundary optionally

blocks R-to-L ES but has no significant effect on L-to-R ES. Zawaydeh (1997), on

the other hand, reports that emphasis spreads obligatorily into prefixes and optionally into

suffixes. If the word ends in an emphatic, emphasis spread into suffixes becomes obliga-

tory as well. It seems that distance also plays a role in the degree of ES. Younes (1993)

found ES to be stronger on closer segments than on further ones. But this was true only in

Palestinian Arabic. In the other dialect he studied, Cairene Arabic, distance has no bearing

on ES. Zawaydeh (1997) also found that, in Ammani-Jordanian Arabic, the further

away the trigger of ES, the weaker its effect becomes. In general, the studies cited earlier

which state that L-to-R ES is gradient also provide support for the effect of distance from

the emphatic trigger.

Lehn (1968) and Broselow (1979), both of whom worked on Egyptian varieties of

Arabic, argue that the domain for ES is the syllable. According to Broselow, if a consonant

in a syllable is assigned [+RTR] (i.e., emphatic; she clearly agrees with the view

that emphatics involve an actively retracted tongue root), the node dominating the syllable

is assigned [+RTR]. Thus, all segments dominated by that node will be assigned the

same feature. However, neither Lehn nor Broselow provides acoustic verification for their

claims.
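
The syllable-domain account described above can be sketched as a simple procedure: if any segment in a syllable is emphatic, the syllable node is marked [+RTR] and every segment it dominates inherits that value. The Python sketch below is only a toy rendering of that idea. The function name, the data layout, the placeholder symbol "T" for an emphatic stop, and the two-syllable example form are all hypothetical and are not taken from Lehn or Broselow.

```python
def spread_within_syllables(syllables, emphatics):
    """Mark all segments of a syllable [+RTR] if the syllable contains an emphatic.

    syllables: list of syllables, each a list of segment strings.
    emphatics: set of segment strings treated as underlyingly [+RTR].
    Returns a parallel structure of (segment, is_rtr) pairs.
    """
    result = []
    for syllable in syllables:
        # The syllable node receives [+RTR] if any of its segments is emphatic.
        node_is_rtr = any(seg in emphatics for seg in syllable)
        # Every segment dominated by the node inherits the node's value.
        result.append([(seg, node_is_rtr) for seg in syllable])
    return result

# Hypothetical two-syllable form; "T" is a placeholder for an emphatic stop.
word = [["T", "a"], ["l", "i", "b"]]
marked = spread_within_syllables(word, emphatics={"T"})
for syllable in marked:
    print(syllable)
```

Under this sketch, emphasis never crosses a syllable boundary, which is exactly the prediction that distinguishes the syllable-domain view from the longer-distance spreading reported in the acoustic studies cited earlier.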



In sum, Arabic emphatics, [tˤ, dˤ, ðˤ, sˤ], are a set of coronal obstruents that involve

a secondary articulation in the form of a retracted tongue dorsum resulting in a narrowing

in the upper portion of the pharynx. This retraction is accompanied by a small retraction

of the lower part of the anterior wall of the pharynx and the epiglottis. Emphatics

are generally associated with a lowered F2 and raised F1 in adjacent vowels in comparison

with their non-emphatic counterparts. This acoustic effect, known as 'emphasis

spread' (ES), reaches far in both directions and is usually blocked or weakened by high

front sounds like [i], [j], and [ʃ]. The F2 drop has been shown to be a reliable cue for the

perception of emphatics.

2.2.2 Uvulars

Early Arab grammarians noticed that the two uvulars [ʁ, χ] hold an articulatory

affinity to the other guttural sounds in terms of place of articulation. They observed

that the articulation of those two uvulars was pharyngeal rather than oral, but,

unlike other gutturals, uvulars are articulated at a point in the pharynx very close to the

mouth. Sibawayh described these two uvulars as "the sounds whose point of articulation

is (at the part of the throat) bordering the mouth" (Kitab Sibawayh, vol. 2, p. 405). The

possibility of an oral participation in the articulation of uvulars was suggested by Ibn

Jinni (d. 1002 A.D.) who noted that uvulars are pronounced at the upper end of the throat,

along with the edge of the mouth. This is an important observation supported later by

modern phonetic and phonological research. As for the uvular stop [q], it was considered

by Arab grammarians as an oral stop whose point of articulation, according to Sibawayh,

is "at the portion of the tongue furthest back and the part of the palate just above it". An-



other important observation by Arab grammarians that received modern support is the

articulatory (and auditory) similarities between uvulars and emphatics. Sibawayh, and

later Ibn Jinni and Ibn Al-Jazari (d. 1429 A.D.), grouped these two sets of sounds into the

class of mustaʕliyah sounds. This term is derived from the Arabic word ʔistiʕlaaʔ, which

is described by Sibawayh as the elevation of the tongue towards the palate.

Modern studies offer somewhat similar articulatory accounts of these sounds. The

schematics in Figure 2.6 show the articulatory configurations of the three Arabic uvulars

[χ, ʁ, q]. Catford (1977) describes the articulation of uvulars, including the Arabic set of

[χ, ʁ, q], as moving the rear-most portion of the tongue surface towards the posterior soft

palate and the uvula. For this reason, he terms their articulation dorso-uvular. This

general description is supported by the published x-ray investigations of these sounds,

although, at least in the case of Arabic, the articulation of [ʁ, χ, q] is more complicated

than Catford's brief account. Based on successive x-ray frames of a single Lebanese

speaker, Delattre (1971) describes a rather dynamic articulation for the three uvulars [ʁ],

[χ], and [q]. In his account, the tongue slides horizontally backwards then moves upwards

to create a constriction in the upper pharynx. For this reason, he provided two illustrative

tracings for the two uvulars [ʁ] and [χ], one for each movement (only one tracing

of [q] is provided). This curved path followed by the back of the tongue is common

among the three sounds. The articulation of [ʁ], as shown in Delattre's tracings, also involves

a downward curling of the uvula towards the raised back of the tongue causing a

slight trill which he notes to be "hardly noticeable on spectrogram, and even does not always

take place" (p. 135). The articulation of [χ] is generally similar except that it is narrower

than that of [ʁ] and does not involve similar participation from the uvula, which,

according to the author's descriptions and x-ray tracings, is held flat over the back of the

tongue. When comparing the uvula position during the articulation of [χ] to those accompanying

the other fricatives reported by Delattre, it appears clearly that the flattened

shape of the uvula is somewhat unique. The author explains that this configuration is intended

"to prolong the stricture and contribute to the production of friction turbulence"

(p. 137). It is quite visible from comparing the tracings for [ʁ] and [χ] that the narrower

constriction of the latter is achieved by a higher and more backed position of the tongue.

This position seems to be exaggerated further during the articulation of [q] to achieve a

full occlusion.

Figure 2.6. Schematic illustrations of the vocal tract configurations during the articulation of Arabic uvulars. These schematics are based on descriptions and illustrations in Delattre (1971) and Ghazeli (1977).

In general, the area and manner of uvular constrictions reported in the x-ray inves-

tigation of Ghazeli (1977), who served as his own subject during the x-ray part of his

study, were similar to those of Delattre (1971). However, the tongue back positions during

the articulations of [ʁ] and [χ] as reported in the two studies are different. Ghazeli

noted that the tongue dorsum is retracted more during [ʁ] while the place of constriction

during [χ] falls between those for [q] and [k]. Furthermore, Ghazeli's descriptions



and x-rays indicate that the anterior wall of the pharynx as well as the epiglottis is pulled

backwards towards the posterior wall of the pharynx. Delattre reported no such adjustments.

Ghazeli also reports slight raising of the larynx during [χ] and [q], but not [ʁ].

While not expressed by Ghazeli, his tracings show that the tongue is backed the most during

[q]. Accordingly, the pharyngeal volume above the epiglottis is smaller during [q]

than during [ʁ] or [χ]. A possible reason for this is that the occlusive nature of [q] demands

a full articulatory seal at the vertical as well as the horizontal surfaces of the uvula,

causing more raising and backing of the tongue.

Acoustically, the voiced uvular [ʁ] features somewhat vowel-like formants

throughout its duration accompanied by some weak noise pointing to a mildly fricative

manner of articulation (Al-Ani 1970). The formant-like spectral structures are subject to

coarticulatory conditioning by neighboring vowels. Al-Ani actually refers to these formant-like

structures as a "continuation of F1, F2, and F3" of neighboring vowels. Ghazeli

(1977) reports that F1 of [ʁ] ranges from 500 to 600 Hz next to the low vowel [a] while

F2 ranges from 1200 to 1300 Hz. Next to [i] and [u], F1 is lower while F2 is raised next to

[i] and lowered next to [u] (though no precise numbers were given). Both Al-Ani and

Ghazeli describe the spectrograms of the voiceless uvular [χ] as aperiodic noise. The

lower limit of spectrographic energy reported by Ghazeli ranges from 600 to 1500 Hz,

depending on the subject. Al-Ani, on the other hand, explains that the lower limit of the

spectral energy depends on the vowel context: around 1500 Hz, 1000 Hz, and 800 Hz

next to [i], [a], and [u], respectively.

Acoustic investigations also show that, like emphatics, uvulars spread emphasis

into neighboring vowels. There are differences in the reports of the size and domain of



ES from uvulars. Al-Ani (1970) reports that, next to [ʁ] and [χ], the F2 onset value in [i] is

lowered to 1800-1900 Hz while the F2 onset in [u] is raised to 1350 Hz. As for the F2 onset in

[a], there was a stronger coarticulatory effect from [ʁ] (1250-1300 Hz) than from [χ]

(1350-1500 Hz). The coarticulatory effect exhibited by [q] is stronger still: F2 onset values

were 1600 Hz, 1150-1200 Hz, and 900 Hz next to [i], [a], and [u], respectively. Unfortunately,

Al-Ani did not report any F1 onset values, which one would expect to be

somewhat raised next to uvulars. Giannini & Pettorino (1982) found that the F1 locus next to

[ʁ] and [χ] is at 500 Hz, while that of F2 is at 1500 Hz. F1 and F2 loci next to [q] are at

500 Hz and 1400 Hz, respectively. Interpreting these values in the light of nodes and

antinodes of F1 and F2, Giannini & Pettorino conclude that [ʁ] and [χ] are articulated at

the same place as [q] and that all three sounds are uvular.

While Heath (1987), who studied Moroccan Arabic, states that the coarticulatory

effect of uvulars is as strong as that of emphatics, the general agreement is that ES from

uvulars is somewhat milder and does not reach as far. Ghazeli (1977) notes that ES from

uvulars does not affect adjacent high vowels, adjacent consonants, or non-adjacent seg-

ments. Kuriyagawa (1984), who studied only the uvular stop [q] along with emphatics in

Standard Arabic spoken by an Egyptian subject, also found that, while the coarticulatory

effect of that uvular on vowels is qualitatively similar to that of emphatics, it does not

reach into the following syllable. Similarly, El-Dalee (1984) found that, in Egyptian Ara-

bic, ES from uvulars affects the adjacent vowel only.

To sum up, Arabic has two uvular continuants, the voiceless [χ] and the voiced

[ʁ], and one uvular stop [q]. These sounds are produced with a general raising and retraction

of the tongue dorsum towards the soft palate. This maneuver comprises two

movements: the dorsum is first pulled back, and then it is raised towards the uvular region.

In [χ] and [q], the uvula is flattened and held up. In [ʁ], the uvula is curled downwards

towards the tongue. Acoustically, [ʁ] shows mild frication with formant-like structures

while [χ] shows aperiodic noise. All three uvulars spread emphasis onto

neighboring vowels. However, emphasis spread from uvulars is generally not as sizeable

nor as far-reaching as emphasis spread from emphatics.

2.2.3 Pharyngeals

According to Sibawayh's and Ibn Jinni's descriptions, the two pharyngeals [ʕ]

and [ħ] are articulated "at the middle of the throat". While rather vague, this description

does capture the fact that the main point of articulation for pharyngeals lies in between

those of laryngeals and uvulars. This has been largely confirmed in modern articulatory

studies. The schematic in Figure 2.7 illustrates the general articulatory configuration of

Arabic pharyngeal consonants. Unfortunately, the x-ray tracings provided by Al-Ani

(1970) for the two pharyngeals [ħ, ʕ] do not cover the mid and low regions of the pharynx.

Interestingly, Al-Ani claims that the most common allophone of [ʕ] is a voiceless

stop while in intervocalic positions it is realized as either a stop or a glide. While it is true

that [ʕ] surfaces as a stop in certain dialects of Arabic (Al-Ani bases his conclusion on

acoustic data obtained from Iraqi subjects), most of the published phonetic literature

clearly indicates that it is always realized as a voiced fricative or approximant. Such wide

variation in the degree of constriction for [ʕ] was also exhibited by Hebrew subjects

(Laufer and Condax 1979, 1981).


Figure 2.7. A schematic illustration of the vocal tract configuration during the articulation of an Arabic pharyngeal consonant. This schematic is based on descriptions and illustrations in Delattre (1971) and Ghazeli (1977).


The tracings in Delattre (1971) and Ghazeli (1977) cover the whole vocal tract.

Both studies report that Arabic pharyngeals are articulated mainly by retracting the

tongue root towards the posterior pharynx wall with the narrowest constriction taking

place at the level of the epiglottis. Delattre notes that the constriction for [ħ] is lower than

that for [ʕ]. He also notes, as does Ghazeli, that the constriction is narrower for [ħ]

than for [ʕ] since, as a voiceless fricative, [ħ] requires a narrow constriction to produce

adequate turbulence in the air stream. Other important articulatory movements reported

by Ghazeli are raising the larynx and a forward movement of the lowest part in the poste-

rior wall of the pharynx. While x-ray tracings reveal a backward displacement in the

tongue root and the epiglottis, the anatomical makeup of the musculature involved sug-

gests a rather two-dimensional annular gesture that cannot be reflected in lateral x-rays.



Catford (1977) describes the articulation of [ħ] and [ʕ] as "largely a sphincteric semi-closure

of the oro-pharynx" (p. 163).

Ladefoged and Maddieson (1996) maintain that Semitic [ħ] and [ʕ] are "neither

pharyngeals nor fricatives" (p. 168) arguing instead that these sounds are epiglottal approximants.

Most available accounts of these sounds (see acoustic descriptions below) do

support the view that Arabic [ħ] and [ʕ] are approximants. In Butcher and Ahmad's

(1987) words, [ħ] and [ʕ] are "formed in a region of the vocal tract where true fricatives

are very difficult to produce" (p. 156). However, Ladefoged and Maddieson's claim that

these sounds are epiglottals and not pharyngeals is not without problems. In the same

book, Ladefoged and Maddieson describe the connection between the epiglottis and the

tongue root as follows:

"The relation between the root of the tongue and the epiglottis is similar to

that between the tip and the blade of the tongue. They can be moved sepa-

rately, but because of their proximity only one or the other can be the

principal articulator in any given sound." (p. 11)

Hence, Ladefoged and Maddieson's claim that [ħ] and [ʕ] are epiglottals rather than

pharyngeals means that the epiglottis is the organ which moves backwards to make the

constriction. This claim is expressed by Laufer and Condax (1981) based on fiberscopic

data. Laufer and Condax assert that the tongue does not participate in the articulation of

these sounds. These positions are at odds with the cited x-ray accounts that clearly show

that the root of the tongue is pulled backwards causing an unavoidable retraction of the

epiglottis along with it. If the epiglottis were moved independently (through the action of

the aryepiglottic muscles), we would not expect a consistently concomitant retraction of



the root of the tongue. An x-ray investigation by Boff Dkhissi (1983) (cited in Ladefoged

and Maddieson (1996)) concludes instead that the movements of the tongue root and the

epiglottis are not independent of each other and that the constriction is made by the two

organs jointly. The stronger argument seems to be that [ħ] and [ʕ] are made, primarily, by

a retracted tongue root. This retraction pulls the lower surface or the posterior wall of the

pharynx as well as the epiglottis backwards along with the tongue root.

Meanwhile, the tongue body assumes a mid position inside the mouth in an [a]-like

fashion. These positions are clearly seen in Delattre's and Ghazeli's x-ray tracings.

In reference to this oral configuration, Ghazeli actually describes a second narrowing taking

place in the oral tract during pharyngeals approximately 6 cm behind the lips.

Acoustically, Ghazeli (1977) explains that [ʕ] has vowel-like formant structures.

F1 and F2 of [ʕ] fall between 650-900 Hz and 1300-1700 Hz, respectively, depending on

the vowel context. The spectrogram of voiceless [ħ], on the other hand, has aperiodic

noise together with formant structures. The value ranges for F1 and F2 are 550-1100 Hz

and 1100-1800 Hz, respectively. The noticeably high F1 values, according to Ghazeli,

are credited to the very low place of constriction in pharyngeals as well as the relatively

wide oral cavity. It is worth noting that, while Al-Ani describes [ʕ] as a voiceless stop,

his acoustic account of [ħ] is generally similar to that of Ghazeli. It seems that, for the

particular Iraqi dialect studied by Al-Ani, the contrast between pharyngeals is not in voic-

ing, but rather in degree of constriction.

With regard to the coarticulatory impact of pharyngeals on neighboring vowels,

Ghazeli (1977) reports only small effects: raising of F1 throughout the vowel and long

transitions in F2. Al-Ani (1970), on the other hand, notes that F1 in vowels neighboring



[ʕ] is much higher than their usual values (400 Hz or higher, up from the prototypical

275-300 Hz in [i]; no reports on F1 in [a] or [u]). Al-Ani also reports that F2 of [i] starts

at 1500 Hz or lower while F2 in [u] rises to 950 Hz and in [a] it drops to 1250-1350 Hz.

The F2 drop in [a] extends throughout the vowel, not just at the onset. The F2 starting

values following [ħ] were not as low: 1750-1900 Hz, 900 Hz, and 1300-1450 Hz for [i],

[u], and [a], respectively. It looks from these values and patterns that the Iraqi Arabic version

of [ʕ] spreads emphaticness in a manner similar to that exhibited by uvulars. Recall,

however, that Al-Ani describes [ʕ] as a stop. It is possible that, in order to attain an occlusion

in the pharyngeal area, the whole tongue mass needs to be retracted along with

the tongue root to facilitate the full contact with the posterior wall of the pharynx. In this

way, the articulation of Iraqi Arabic [ʕ] resembles to a certain degree the retracted articulations

involved in emphatics and uvulars. Unfortunately, it is not possible to verify this

claim since, as mentioned earlier, Al-Ani's x-rays of [ʕ] and [ħ] do not cover the middle

and lower pharynx.

The most observable effect of pharyngeals on neighboring vowels is a rise in F1.

An extensive acoustic account of this effect is presented in Butcher and Ahmad (1987).

The authors found that both pharyngeals are accompanied by a raised F1 at the steady

state of the neighboring vowel. Still, there is also a rising F1 transition from the vowel to the

pharyngeal consonant. Alwan (1989) studied this effect as a perceptual cue for pharyngeals

using synthesized speech samples. When F1 values were high, the guttural sound

was perceived as [ʕ], while lower values cued the perception of the uvular [ʁ]. El-Halees

(1985) arrives at similar conclusions for the same uvular/pharyngeal pair as well as their

voiceless counterparts [χ, ħ]. He concludes that F1 is a strong perceptual cue for distinguishing

sounds made in the posterior portion of the vocal tract: the sounds produced

further back correspond to higher F1 transitions. While other researchers stress the importance

of F2 onset for the perception of place of articulation in consonants, Alwan

(1989) notes that it is F1 that plays a significant role in distinguishing uvular from pharyngeal.

She explains that this is because in other consonants, F1 starts at a somewhat similar

low hub while after uvulars and pharyngeals it starts at a higher point, creating a distinction

between orals and gutturals. Among the gutturals, F1 further helps in distinguishing

between pharyngeals and uvulars: F1 is usually higher next to pharyngeals than next to

uvulars.

In short, Arabic has two pharyngeal sounds: the voiceless [ħ] and the voiced [ʕ].

Both sounds involve a low pharyngeal constriction due to the retraction of the tongue root

and the epiglottis. Articulation of pharyngeals is also reported to involve raising the lar-

ynx and advancing the lower part of the posterior wall of the pharynx. Meanwhile, the

tongue body is held in a medial position in the oral cavity. While there are different re-

ports regarding the degree of constriction of Arabic pharyngeals, they are more convinc-

ingly described as approximants. Acoustically, both sounds show vowel-like formant

structures throughout their articulation. The voiceless [ħ] also has aperiodic noise. Arabic

pharyngeals are generally associated with high F1 in neighboring vowels.

2.2.4 Laryngeals

Early Arab linguists noted that the two laryngeals are articulated lower and further

back than any other speech sounds. Sibawayh describes the two laryngeals [h] and [ʔ]

(along with the vowel [a]) as "The sounds whose point of articulation is the furthest



(down the throat)". Ibn Sina offers a more detailed articulatory description. In his account,

the glottal stop is formed by a laryngeal obstruction of the pulmonic air pressure

"which is then, expelled, being forced out both by (the activity of) the muscles (which

cause the larynx to be) opened and by the air pressure" (Semaan 1963:35). He notes that

the articulatory mechanism for [h] is quite similar, except that "the obstruction is not

complete, but is modified by the edges of the exit (in the larynx)" causing a perturbation

in the exiting air stream.

The only modern instrumental account of the articulation of laryngeals in Arabic

is Zawaydeh (1999). She reports in her fiberscopic study that the pharyngeal area during

the articulation of the two Arabic laryngeals is as wide as it is during the articulation of

plain oral sounds. By comparison, the pharynx is significantly narrower during the articu-

lation of emphatics, uvulars, and pharyngeals. Generally speaking, it is unlikely that Ara-

bic laryngeals differ articulatorily from the most cross-linguistically common forms of

[h] and [ʔ]. Catford (1977) terms these two sounds glottals since they form a subset of

more possible laryngeals (the remaining members of which are made primarily by the

action of the ventricular bands). Using the term laryngeals to refer to Arabic [h] and [ʔ]

is acceptable since they do not contrast with any other laryngeals in the language. Curi-

ously, the laryngeal fricative [h] is described by Al-Ani (1970) as an oral voiceless frica-

tive. It is possible that the presence of oral configurations coarticulated from neighboring

vowels led him to believe that these configurations were required for the articulation of

[h]. By definition, however, these configurations would vary depending on vowel con-

texts and cannot be ascribed to unique articulatory demands of [h].



Al-Ani (1970) describes [h] acoustically as noise whose starting frequency de-

pends on the adjacent vowel: 2000-2700 Hz next to [i], 1500-2000 Hz next to [a], and

1200 Hz next to [u]. The spectral shape of the glottal stop [?] varies: when single, [?] appears as a series of glottal pulses that look somewhat like formants and are more widely

spaced than the glottal pulses one sees in vowels. When geminated, it appears as a long

silent gap. While Al-Ani reports little or no coarticulatory impact of laryngeals on neighboring vowels, Zawaydeh (1999), in an acoustic investigation of Ammani-Jordanian Arabic, reports that a following [a] has significantly higher F1 values after gutturals (including laryngeals) and emphatics than after other sounds. Since the high F1 of the low vowel might reflect not a raising effect from the guttural but rather the absence of a lowering effect (next to oral obstruents, F1 usually starts from a very low hub), Zawaydeh conducted a follow-up test of the coarticulatory effect on the high vowel [i], which has a low F1 to begin with. Her results indicate that, following emphatics, uvulars, pharyngeals, and laryngeals, F1 in [i] was significantly higher than when following plain orals. This result is somewhat surprising

since laryngeals usually have no vocal tract constrictions of their own above the glottis.

As noted earlier, in the articulatory part of her study, Zawaydeh maintains that there is no

narrowing in the pharynx during laryngeal articulations in Ammani-Jordanian Arabic. It

is possible that some degree of larynx raising is involved in the production of Ammani-Jordanian Arabic laryngeals or that Zawaydeh's subjects produced these sounds with much wider mouth and lip openings than usual.


Table 2.1. A summary of the phonetic attributes of Arabic emphatic, uvular, pharyngeal, and laryngeal sounds.

Articulation

Emphatics
• Main coronal articulation.
• Tongue dorsum retracted into upper pharynx.
• Lowered palatine dorsum.
• Mildly retracted tongue root and epiglottis.

Uvulars
• Tongue dorsum raised and retracted into upper pharynx.
• [q] and [χ]: Uvula flattened and raised.
• [ʁ]: Uvula curled downwards.
• Mildly retracted tongue root and epiglottis.

Pharyngeals
• Tongue root and epiglottis retracted into lower pharynx.
• Raised larynx and forward movement of posterior wall of pharynx also reported.
• Tongue body held in mid oral location.

Laryngeals
• [h] is articulated with an open glottis.
• [ʔ] is articulated with a constricted glottis.
• No pharyngeal or oral constriction is involved.

Acoustic Shape

Emphatics
• No consistent reports of spectral differences between emphatic and non-emphatic consonants.
• Emphatic stops are followed by shorter VOT's than non-emphatic ones.
• This area is in need of more extensive investigation.

Uvulars
• [ʁ] has formant-like structures.
• [χ] appears as aperiodic noise.
• [q] is a stop.

Pharyngeals
• While sometimes referred to as fricatives, Arabic pharyngeals are more convincingly described as approximants.
• [ʕ] has strong formant-like structures.
• [ħ] has aperiodic noise alongside the formant-like structures.

Laryngeals
• [h] appears as aperiodic noise.
• [ʔ] appears as glottal pulses when single and as a silent gap when geminated.

Coarticulatory Effect

Emphatics
• Spread emphasis (ES) onto other sounds (rise in F1 + fall in F2 in vowels).
• ES travels in both directions and is a far-reaching effect.
• ES is often blocked by [i], [j], and [ʃ]. Other blockers are also reported.
• Domain of ES varies depending on dialect.

Uvulars
• Spread emphasis (ES) onto other adjacent sounds.
• ES neither as strong nor as far-reaching as ES from emphatics.

Pharyngeals
• Associated with high F1 in adjacent vowels.

Laryngeals
• Not associated with any particular coarticulatory effects on other sounds.


In sum, Arabic has two laryngeals: [h] and [?]. The only articulatory study on

these sounds indicated that they do not involve any pharyngeal narrowing. Like most of

their counterparts in other languages, Arabic laryngeals do not involve any supraglottal vocal

tract configurations of their own. Acoustically, [h] appears as noise whose starting fre-

quency depends largely on the vowel context. [?] appears as widely-spaced glottal pulses

when single and as a silent gap when geminated. Most studies report no coarticulatory

effects of Arabic laryngeals on neighboring vowels. However, Zawaydeh (1999) notes

that laryngeals, like pharyngeals, are associated with high F1 in neighboring vowels.

A summary of the phonetic qualities of Arabic emphatics, uvulars, pharyngeals,

and laryngeals is provided in Table 2.1.

2.3 Gutturals as a Natural Class

This section reviews some of the phonological evidence presented in support of

the treatment of guttural sounds as a single natural class in terms of place of articulation.

Most of the pieces of evidence discussed in the literature come from Semitic languages.

However, Cushitic and Interior Salish evidence for this classification has also been presented. Note that membership in this class differs across languages. In Arabic, for example, uvulars, pharyngeals, and laryngeals form the guttural

class. On the other hand, the Interior Salish guttural class includes uvulars, pharyngeals,

and retracted alveolars (emphatics; see §2.3.1 below), but not laryngeals.

The present discussion focuses on two general types of evidence that are consid-

ered to be the most compelling as well as the most reported cross-linguistically. The first



type of evidence is the guttural-conditioned morpheme structure constraints. The selected

examples of this type come from Arabic, Qafar, and Moses-Columbian. The second type

of evidence is the guttural-conditioned vowel lowering processes. Selected examples of

this type come from Arabic, Hebrew, Maltese, and Tigrinya.

2.3.1 Morpheme Structure Constraints

In the canonical roots of Arabic, as well as other Semitic languages, adjacent

identical consonants are strictly prohibited. This explains the complete absence of roots

like * kkb or * t 77. This constraint is explained in McCarthy (1986) as a function of the

Obligatory Contour Principle (OCP), originally proposed by Leben (1973) to account for

the prohibition of adjacent identical tones in lexical representations. The function of this

principle was extended further to apply to Place features, and not just full segments

(Mester 1986; McCarthy 1988, 1991, 1994; Yip 1989). Thus, the cooccurrences of homorganic consonants in Arabic roots are avoided because they would project their identical

Place feature on the same tier, making them adjacent on that particular tier. So, a root like

*fbk, which has two adjacent labials, is either rare or totally absent from the Arabic lexicon. This avoidance also applies, though less forcefully, to non-adjacent consonants. Note that the two restrictions differ in strength: sequences of identical segments are totally prohibited, whereas the avoidance of homorganic segments, though fairly strict, is not absolute. For the prohibition to

6 The position taken in this dissertation is that the OCP is a well-supported grammatical principle. Some linguists question the status of the OCP in phonology, arguing that it is violable and exception-ridden (see Odden 1986, 1988; Blevins 2004). However, the phonological evidence in support of the status of the OCP comes from numerous languages and affects different phonological units (tones, features, segments). Such evidence is quite compelling and hard to ignore. Furthermore, many cases of apparent OCP violations have been accounted for quite systematically (see McCarthy 1994 and references therein, Padgett 1995; see also §6.3.1 of this dissertation.)



be invoked by the place feature [coronal], this particular feature has to be taken in con-

junction with the major class features [sonorant] and [continuant], though not without

exceptions. In general, the cooccurrence of any two members of one of the sound classes

in (8) is avoided.

(8) Arabic sound classes subject to OCP-based MSCs (from McCarthy 1994:204).

a. Labials              b f m
b. Coronal sonorants    l r n
c. Coronal stops        t d tˤ dˤ
d. Coronal fricatives   θ ð s z sˤ zˤ ʃ
e. Velars               g k q
f. Gutturals            χ ʁ ħ ʕ h ʔ
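The classes in (8) can be read as a lookup table against which a root is checked. The following is a minimal sketch under simplifying assumptions: it treats the cooccurrence restriction as categorical, whereas the avoidance is statistical for homorganic (non-identical) consonants, and it ignores the additional avoidance between uvulars and velars. The names `PLACE_CLASSES`, `place_class`, and `homorganic_pairs` are illustrative and do not come from McCarthy (1994).

```python
# Sound classes from (8). The dictionary layout and the function names are
# illustrative assumptions, not part of McCarthy's (1994) proposal.
PLACE_CLASSES = {
    "labial": {"b", "f", "m"},
    "coronal sonorant": {"l", "r", "n"},
    "coronal stop": {"t", "d", "tˤ", "dˤ"},
    "coronal fricative": {"θ", "ð", "s", "z", "sˤ", "zˤ", "ʃ"},
    "velar": {"g", "k", "q"},
    "guttural": {"χ", "ʁ", "ħ", "ʕ", "h", "ʔ"},
}


def place_class(consonant):
    """Return the name of the class in (8) that contains the consonant, or None."""
    for name, members in PLACE_CLASSES.items():
        if consonant in members:
            return name
    return None


def homorganic_pairs(root):
    """List adjacent consonant pairs of a root that belong to the same class.

    A root is a list of consonant symbols. Identical adjacent consonants are
    flagged as well, since they trivially share a class.
    """
    pairs = []
    for first, second in zip(root, root[1:]):
        shared = place_class(first)
        if shared is not None and shared == place_class(second):
            pairs.append((first, second, shared))
    return pairs


print(homorganic_pairs(["k", "t", "b"]))  # no adjacent pair shares a class
print(homorganic_pairs(["f", "b", "k"]))  # the labial pair f-b is flagged
```

Glides are left unclassified here because (8) does not list them; a fuller model would also need to weight non-adjacent pairs less heavily than adjacent ones.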

McCarthy (1994) provides statistical support for the Place-related cooccurrence restriction. Table 2.2, which is a reproduction of McCarthy's Figure 12.1 (1994:204), lists the frequencies of adjacent consonant combinations in Arabic roots.⁷ Consonants listed in the row headers are linearly ordered before the ones listed in the column headers in the consonantal root. The statistical assessment of the frequencies was obtained through χ² tests with 1 df (for an explanation of the test parameters, see Padgett 1995). Combinations of identical consonants were excluded from these tests since there is an absolute ban on them. What is important is that the frequencies of cooccurrences of uvulars, pharyngeals, and laryngeals with each other are significantly lower than expected. The fact that the

members of the three subclasses of gutturals do not freely cooccur is taken as an indica-

7 As one can see in Table 2.2, there are phonetic symbols that are not present in the inventory of MSA shown in (3) in the previous chapter. In his original figure, McCarthy (personal communication) uses the symbol [Z] to represent the emphatic interdental [ðˤ]. Also, McCarthy lists the velar stop [g] among the Arabic consonants. This symbol represents the Arabic jim ([dʒ]), which patterns as (and originates from) [g]. See Clark and Yallop (1995:372) for further explanation.



tion that these subclasses should be combined to form the larger natural class of gutturals,

much like the bilabial [b], the labiodental [f], and the bilabial nasal [m] form the natural

class of labials.
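To make the logic of a χ² test with 1 df concrete, the sketch below computes the Pearson statistic for a 2×2 contingency table and compares it with the standard critical values for the two significance levels used in Table 2.2. The 2×2 layout (first consonant in class A or not, by second consonant in class B or not) is only one plausible way of setting up such a test and is an assumption here; the counts in the usage lines are hypothetical and are not taken from McCarthy's data.

```python
# Critical values of the chi-square distribution with 1 degree of freedom.
CRITICAL_1DF = {0.05: 3.841, 0.005: 7.879}


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table

        [[a, b],
         [c, d]]
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def expected_top_left(a, b, c, d):
    """Count expected in the top-left cell if rows and columns were independent."""
    n = a + b + c + d
    return (a + b) * (a + c) / n


# Hypothetical counts: 2 roots combine the two classes of interest.
observed = (2, 98, 198, 702)
statistic = chi_square_2x2(*observed)
print(statistic, expected_top_left(*observed))
print(statistic > CRITICAL_1DF[0.005])
```

A combination counts as avoided only when the observed frequency falls below the expected one; the statistic by itself does not indicate the direction of the deviation.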

Table 2.2. Frequencies of consonant cooccurrences in Arabic roots (from McCarthy 1994:204).

tˤ dˤ  θ ð  ʃ  g k  q  χ ʁ  ʕ ħ  ʔ h  l r  n  w j
b f m  43 43 31 79 44 180 40 91
t d  10 32 21 69 20 51
tˤ dˤ  9 25 11 59 14 38
θ ð  4 5 9 44 3 24
s z  19 40 24 75 21 65
sˤ zˤ  4 16 7 38 5 24
ʃ  10 33 8 37
g k  75 24 90 29 47
q  51 11 15 6 45 12 31
χ ʁ  70 18 31 13 23 13 >o 63 13 42
ʕ ħ  91 42 29 17 35 27 2 83 28 60
ʔ h  67 32 10 10 29 4 8 25 6 2 65 16 54
l r  149 51 36 15 58 20 20 66 48 29 74 42 0 91
n  55 23 19 7 26 12 14 31 26 16 28 21 2 X 51
w j  83 44 31 14 44 14 18 34 33 20 41 29 89 26 ·zjm

p < 0.05    p < 0.005

Combinations of uvulars and velars are also avoided. As discussed later, all major

views concerning the representation of these sounds indicate that uvulars are complex

sounds with pharyngeal and dorsal components (see Elorrieta 1991 for extensive discussion supporting this view). It is, then, the dorsal place in the representations of both uvulars and velars that motivates the avoidance of their combinations. The uvular stop [q]

rarely cooccurs with other uvulars or velars. Again, [q] is argued to have pharyngeal and

dorsal components. However, as Table 2.2 shows, this sound cooccurs freely with lower

gutturals. Note also that emphatics cooccur freely with all gutturals even though they are

argued to contain a pharyngeal component representing their secondary articulation.

Meanwhile, all emphatic and velar combinations are noticeably infrequent. These issues

will be discussed later at different points of this dissertation.

The restrictions on consonant cooccurrences are found in non-Semitic languages

as well. One such language is Qafar, an East Cushitic language discussed in Hayward

and Hayward (1989). However, the root restrictions in Qafar work somewhat differently.

Roots may contain either identical or non-homorganic consonants. The classes of homorganic consonants in Qafar are listed in (9). Again, the classification of the two pharyngeals [ʕ, ħ] with the laryngeal [h] is taken as an indication that they constitute a natural

class of homorganic consonants.

(9) Qafar homorganic consonants (from Hayward and Hayward 1989:183).

a. Labials              b f
b. Coronal sonorants    l r
c. Coronal stops        t d ɖ
d. Velars               g k
e. Gutturals            ʕ ħ h

Another non-Semitic language that exhibits similar restrictions on root consonants

is Moses-Columbian, an Interior Salish language. According to Bessell and Czaykowska-



Higgins (1992), a morpheme structure constraint prevents pharyngeals, uvulars, and re-

tracted alveolars from occurring in the second consonant location if the same root has a

pharyngeal in its first consonant location. The avoided combinations are listed in (10).

There are some interesting points to note here. First, unlike the Semitic root cooccurrence restriction, which affects all sound classes, the Moses-Columbian constraint pertains only

to gutturals. Second, while Moses-Columbian has laryngeal consonants, they are not af-

fected by the constraint. In fact, the whole purpose of the Bessell and Czaykowska-

Higgins article was to show that laryngeals are placeless and are not part of the guttural

class in Interior Salish languages (see Rose 1996, however, for counter arguments). The

third point of interest is that, according to Bessell and Czaykowska-Higgins (1992), the

retracted alveolars of Interior Salish "resemble the emphatic consonants of Arabic in both phonological and phonetic properties" (p. 37). As a matter of fact, in citing the same Bessell and Czaykowska-Higgins article, both Rose (1996) and Zawaydeh (1999) refer to the Interior Salish retracted alveolars as 'emphatics'. It is the case, then, that unlike in Arabic,

the Moses-Columbian MSC addresses gutturals (excluding laryngeals) and emphatics as

one natural class in terms of place of articulation.

(10) Consonant combinations that are disallowed in Moses-Columbian (from Bessell and Czaykowska-Higgins 1992:42).

a. *Pharyngeal (V) Pharyngeal
b. *Pharyngeal (V) Uvular
c. *Pharyngeal (V) Retracted alveolar



2.3.2 Guttural Lowering

In several languages, a strong link between guttural consonants and low vowels

has been noticed. Numerous cross-linguistic phonological processes involve vowels being

lowered or epenthetic vowels surfacing as low vowels in guttural contexts. A prominent

example reported by McCarthy (1991, 1994) involves the type of the second vowel in

Arabic imperfect verbs. This vowel almost always surfaces as [a] if a guttural precedes or

follows it (411 of 436 instances). Examples are given in (11).

(11) Perfect/imperfect vowel alternations in Arabic verbs (from McCarthy 1991:69; 1994:207).

Plain roots                          Guttural roots
Perf.    Imperf.                     Perf.   Imperf.
katab    jaktub   'write'            faʕal   jafʕal   'do'
dˤarab   jadˤrib  'beat'             radaʕ   jardaʕ   'prevent'
ʃarib    jaʃrab   'drink'
balud    jablud   'be stupid'

McCarthy also provides a Hebrew example of guttural-conditioned vowel lowering. In Hebrew, CVCVC words with stress on the penult vowel are considered to be underlyingly of the canonical form /CVCC/. A phonological rule assigns stress, after which an epenthesis rule inserts the second vowel. As the examples in (12) show, the epenthetic vowel surfaces as [a] when the consonant immediately preceding it is a guttural.


(12) Hebrew epenthetic vowel lowering (from McCarthy 1994:210).

Plain medial consonant               Guttural medial consonant
/malk/  [melek]   'king/my king'     /baʕl/  [baʕal]  'master'
/sipr/  [se:per]  'book'             /kaħʃ/  [kaħaʃ]  'lying'
/qudʃ/  [qo:deʃ]  'holiness'         /lahb/  [lahab]  'flame'
                                     /tuʔr/  [tuʔar]  'form/his form'

Brame (1972), cited in Hayward and Hayward (1989), notices a similar phenome-

non in Maltese. In this language, as can be seen in (13), the vowel in the 1st sg. imperfect

prefix is [i] if the stem starts with a non-guttural consonant and [a] if the stem has a gut-

tural in the initial position.

(13) Maltese 1st sg. imperfect verbs (from Hayward and Hayward 1989:185, citing Brame 1972).

Plain roots                  Guttural roots
ni+kteb  'I write'           na+ʔbez  'I jump'
ni+nzel  'I descend'         na+ʔleb  'I overturn'
ni+dneb  'I sin'             na+ħdem  'I work'
ni+freʃ  'I spread'          na+ħleb  'I milk'
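The Maltese pattern in (13) amounts to a simple conditional on the first stem segment, which can be sketched as follows. The set of gutturals is restricted to the two segments that occur stem-initially in (13), on the assumption that the guttural stems there begin with [ʔ] and [ħ]; the function name and the string representation of stems are illustrative choices, not part of Brame's or Hayward and Hayward's analyses.

```python
# Gutturals found stem-initially in (13); other Maltese segments are not modeled.
MALTESE_GUTTURALS = {"ʔ", "ħ"}


def first_sg_imperfect(stem):
    """Attach the 1st sg. imperfect prefix, lowering its vowel before a guttural."""
    vowel = "a" if stem[0] in MALTESE_GUTTURALS else "i"
    return "n" + vowel + "+" + stem


for stem in ["kteb", "nzel", "ʔbez", "ħdem"]:
    print(first_sg_imperfect(stem))
```

The sketch states the alternation as a choice between two listed vowels; an analysis in terms of feature spreading would instead derive [a] from the guttural itself.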

Hayward and Hayward (1989) also discuss a case in Tigrinya in which vowels are

lowered when adjacent to a guttural. In Tigrinya the low vowel [a] appears in syllables

that contain a guttural while the mid central vowel Ui] appears in non-guttural syllables.

Examples are shown in (14).


(14) Tigrinya vowel lowering (from Hayward and Hayward 1989:179).

Plain syllables
sabar-a    'he broke (something)'
fanaw-a    'it decayed'
mazaz-a    'he drew (a sword from its sheath)'
sabar-ka   'you have broken (something)'
k'arab-ka  'you have approached'

Guttural syllables
ʔaxal-a    'it was enough'
hadar-a    'he spent the night'
ʕarag-a    'he ascended'
balaʕ-ka   'you have eaten'
sarah-ka   'you have worked'

2.4 Representations of Emphatics and Gutturals

As cited in the previous chapter, Jakobson et al. (1952) give emphatics the feature

[+flat] distinguishing them phonologically from non-emphatics. This feature applies pri-

marily to rounded vowels since they involve narrowing and protruding the lips. This la-

bial setting has the acoustic effect of flattening, or a general lowering of some or all for-

mants. The authors explain that a similar acoustic effect can also be achieved by

constricting the pharynx. Here is their account:

"Instead of the front orifice of the mouth cavity, the pharyngeal tract, in its

turn, may be contracted with a similar effect of flattening. This independ-

ent pharyngeal contraction, called pharyngealization, affects the acute

consonants and attenuates their acuteness .... The fact that peoples who

have no pharyngealized consonants in their mother tongue, as for instance,



the Bantus and the Uzbeks, substitute labialized articulations for the corre-

sponding pharyngealized consonants of Arabic words, illustrates the per-

ceptual similarity of pharyngealization and lip-rounding. These processes

do not occur within one language. Hence they are to be treated as two

variants of a single opposition - flat vs. plain." (p. 31)

It should be noted here, though, that there is an important acoustic difference be-

tween pharyngeal constriction and lip rounding. While both cause a lowering of F2, pha-

ryngeal constriction has the effect of raising F1 while lip rounding lowers all formants.

This can be considered as a challenge to the above stated view which equates the two

gestures on acoustic/auditory grounds.

In SPE, emphatics are distinguished from non-emphatics in that the former are

[+low, +back]. These tongue body specifications follow from the fact that the tongue

body is actively involved in the production of the secondary articulation of emphatics.

The SPE feature matrices for the consonants in question, among others, are given in (15).

(15) SPE feature specifications of (plain and emphatic) alveolars, velars, uvulars, pharyngeals, and laryngeals (according to Chomsky and Halle 1968:307).

                        Anterior  Coronal  High  Low  Back
Plain alveolars            +         +      -     -    -
Emphatic alveolars         +         +      -     +    +
Velars                     -         -      +     -    +
Uvular gutturals           -         -      -     -    +
Pharyngeal gutturals       -         -      -     +    +
Laryngeal gutturals        -         -      -     +    -



Note that the SPE model assumes that emphatics are truly pharyngealized since

their tongue body specifications are identical to those of pharyngeal sounds. McCarthy

(1994) notes that, on the basis of these feature specifications, gutturals are distinguished

from the rest of the sounds in that they are [-anterior, -high] while the specifications for

[low, back] distinguish the three guttural classes from each other. McCarthy also points

to some problematic aspects regarding the feature specifications for gutturals. He states

that the feature [-high] for uvulars is inconsistent with the actual articulation of these

sounds, which involves a raised tongue.⁸ Moreover, he argues, convincingly, that the SPE

featural specifications for pharyngeals and laryngeals inaccurately involve the tongue

body features [low] and [back]. Neither subclass involves the tongue body as its articula-

tor. It should be noted here that McCarthy's challenge of the feature specification for

pharyngeals cannot be easily extended to include emphatics - the so-called 'pharyngeal-

ized' sounds. Emphatics clearly involve a retraction of the tongue body that cannot be

found in pharyngeals. The feature [+back], then, is a potentially acceptable consideration

as far as emphatics are concerned.
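The way such feature matrices pick out natural classes can be illustrated with a small lookup. The sketch below encodes the SPE specifications as summarized in this section (the values follow the description in the text and my reading of the SPE table, so they should be checked against Chomsky and Halle 1968:307 before being relied upon); the names `SPE_FEATURES` and `natural_class` are illustrative.

```python
# SPE-style specifications for the classes discussed above. Values are an
# assumption based on the description in this section, not a verified copy.
SPE_FEATURES = {
    "plain alveolars":    {"anterior": "+", "coronal": "+", "high": "-", "low": "-", "back": "-"},
    "emphatic alveolars": {"anterior": "+", "coronal": "+", "high": "-", "low": "+", "back": "+"},
    "velars":             {"anterior": "-", "coronal": "-", "high": "+", "low": "-", "back": "+"},
    "uvulars":            {"anterior": "-", "coronal": "-", "high": "-", "low": "-", "back": "+"},
    "pharyngeals":        {"anterior": "-", "coronal": "-", "high": "-", "low": "+", "back": "+"},
    "laryngeals":         {"anterior": "-", "coronal": "-", "high": "-", "low": "+", "back": "-"},
}


def natural_class(**spec):
    """Return the sound classes whose features match every value in spec."""
    return [
        name
        for name, features in SPE_FEATURES.items()
        if all(features[feature] == value for feature, value in spec.items())
    ]


# [-anterior, -high] picks out the three guttural subclasses.
print(natural_class(anterior="-", high="-"))
# [+low, +back] groups emphatics with pharyngeals.
print(natural_class(low="+", back="+"))
```

The second query shows why the SPE model treats emphatics as truly pharyngealized: their tongue body specifications coincide with those of the pharyngeals.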

Among the more recent feature-geometry-based representational proposals, only

three that are either influential or embody somewhat novel representational suggestions

are discussed here. These include, of course, McCarthy's (1994) influential work. This

article has stirred significant debate among linguists. It also served as the launch pad for

almost all of the ensuing alternative accounts (including the present dissertation). Rose's

(1996) work relies on an extensive review of cross-linguistic data and attempts to reconcile the seemingly variable grouping of laryngeals in the natural classes in several languages. Zawaydeh's (1999) is a more recent work that relies on articulatory and acoustic

8 See, however, Chapters 5 and 6 of this dissertation for more on the articulation of uvulars. It is argued that the raising maneuver in uvulars is actually due to a radical articulation, not a lingual one.

data to investigate the articulatory properties of emphatics and gutturals and introduces a

new distinctive feature called [Retracted Tongue Back] to characterize the lingual in-

volvement in emphatics and uvulars. There are two other prominent proposals that are not

discussed here: Herzallah (1990) and Davis (1995). Herzallah adopts V-Place-based rep-

resentations. Such representations suffer from independent theory-internal problems (see

Halle et al. 2000 for criticism). Davis's proposals contain representational elements found

also in Rose's and Zawaydeh's, both of which are discussed below. For these reasons, I

find it unnecessary to go into Herzallah's and Davis's particular proposals in detail.

2.4.1 McCarthy (1994)

McCarthy (1991, 1994) argues that it is not possible to represent Arabic gutturals

in a fashion that is consistent with basic concepts of the articulator theory. This follows

from the fact that the three Arabic guttural subclasses are articulated at three distinct

points within the pharyngeal region. He therefore identifies those three subclasses with

the feature [pharyngeal] denoting their common place of articulation. To handle the

asymmetry created by this proposal (arising from the fact that other sounds are identified

with the [labial], [coronal], and [dorsal] active articulators), McCarthy embraces a differ-

ent understanding of distinctive features. He follows Perkell's (1980) characterization of

distinctive features as "orosensory patterns corresponding to sound producing states"

(Perkell 1980:338, cited in McCarthy 1991:84 and 1994:199). In McCarthy's elaboration

of this proposal, he defines distinctive features "as particular patterns of feedback from

the vocal tract (which have consistent acoustic consequences)" (1994:199; parentheses in

the original). McCarthy cites some histological and neurosensory works by Grossman

(1964), Ringel (1970), and Penfield and Rasmussen (1950) which argue that the pharynx

lacks the neural density and the tactile sharpness that can be found in the oral regions.

Accordingly, the pharynx, though a comparatively large area, can be considered equal,

from a sensory point of view, to the [labial], [coronal], or [dorsal] areas of the oral tract.

McCarthy proposes [pharyngeal] as the feature which identifies the class of gutturals

whose articulation is in the broad region that spans "the area from the larynx to the oropharynx inclusive" (1994:198-199). The "consistent acoustic consequences" of this type of articulation amount to a high F1, which McCarthy argues is found in all gutturals. The

feature [pharyngeal] is also used to refer to the secondary articulation in Arabic emphatic

coronals since they too possess pharyngeal constrictions. The general feature tree given

by McCarthy which reflects his arguments is given in (16) while the individual represen-

tations of the subclasses of gutturals as well as emphatics are given in (17).

(16) McCarthy's feature geometry (1994:223).

•
    • Laryngeal node
        [voice] [constr]
    • Place node
        • Oral
            [lab] [cor] [dors]
        [pharyngeal]


(17) McCarthy's proposed representations for gutturals and emphatics (1994:221)

a. Pharyngeals and Laryngeals  ħ ʕ h ʔ
   •Place
       [pharyngeal]

b. Uvulars  χ ʁ
   •Place
       [pharyngeal] [dorsal]

c. Emphatics  tˤ dˤ ðˤ sˤ
   •Place
       [coronal] [pharyngeal] ([dorsal])

d. Uvular stop  q
   •Place
       [pharyngeal] [dorsal]

Given these representations, adjacent guttural consonants would be avoided in

Arabic roots since they would project their [pharyngeal] features on the same tier causing

a violation of the OCP. As for vowel lowering, McCarthy reasons that it involves the

spreading of the feature [pharyngeal] from the guttural to the target vowel.

Emphatics include the feature [pharyngeal] in their representation since, like the

primary articulation of gutturals, the secondary articulation of emphatics is a constriction in the pharynx. As both emphatics and uvulars achieve their pharyngeal constrictions through tongue dorsum retraction, both classes possess the articulator feature [dorsal] as well. Notice that in emphatics the feature [dorsal] is in parentheses. What this means is that

this feature is redundant for this class of sounds and is not part of their underlying speci-

fication. The perceived articulatory similarity between emphatics and uvulars leads

McCarthy to suggest that emphatic sounds should be described as 'uvularized' rather than



'pharyngealized'. With regard to the uvular stop [q], McCarthy considers this sound to be the

emphatic version of [k]. This claim means that [q] is the only non-coronal emphatic.

2.4.2 Rose (1996)

For Rose, the feature [RTR] is present in all sounds that involve a constriction in

the pharynx. This includes emphatics, uvulars, and pharyngeals. Laryngeals are excluded

from the set of [RTR] sounds. The main focus of Rose's (1996) article is the representational status of the laryngeals. Rose presents an extensive set of data in support of her

claim that the status of laryngeals is decided by the phonemic inventory of the language

in question. In languages like Arabic where laryngeals contrast with other gutturals, la-

ryngeals are phonologically classified as Pharyngeal sounds. If the language does not

have other gutturals, laryngeals are considered placeless. Rose bases her claim on Avery and Rice's (1989) 'Modified Contrastive Specification' of speech segments. According to

this claim, a class node is underlyingly specified for a certain segment only if this seg-

ment contrasts with other segments in features that depend on the relevant node.
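Rose's inventory-driven claim can be restated as a simple decision procedure. The sketch below is only a schematic rendering of the idea as summarized here: it assumes that uvular and pharyngeal continuants are the "other gutturals" whose presence matters, and the inventories in the usage lines are hypothetical. The names are illustrative and do not appear in Rose (1996).

```python
LARYNGEALS = {"h", "ʔ"}
# Assumption: these are the segments that count as "other gutturals".
OTHER_GUTTURALS = {"χ", "ʁ", "ħ", "ʕ"}


def laryngeal_specification(inventory):
    """Return how laryngeals are specified under the contrast-based claim.

    'Pharyngeal' if the inventory also contains other gutturals,
    'placeless' if it does not, and None if there are no laryngeals at all.
    """
    inventory = set(inventory)
    if not inventory & LARYNGEALS:
        return None
    if inventory & OTHER_GUTTURALS:
        return "Pharyngeal"
    return "placeless"


# Hypothetical inventories, for illustration only.
print(laryngeal_specification({"b", "t", "k", "χ", "ħ", "ʕ", "h", "ʔ"}))
print(laryngeal_specification({"p", "t", "k", "s", "h", "ʔ"}))
```

Stated this way, the claim predicts Pharyngeal laryngeals for any language with pharyngeals or uvulars, which is why the Interior Salish facts discussed next are a test case for it.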

It has been argued by Bessell (1992) and Bessell and Czaykowska-Higgins (1992)

that laryngeals in Interior Salish languages should be viewed as placeless and not as a subset of the guttural class. One piece of evidence is the absence of laryngeals from the sets of sounds that trigger the MSC on guttural consonant cooccurrences in Moses-Columbian as explained in §2.3.1. Another piece of evidence is that, unlike uvulars,

pharyngeals, and emphatics, Interior Salish laryngeals do not trigger vowel retraction.

Furthermore, the realization of the epenthetic vowel in Moses-Columbian varies based on

the identity of the adjacent consonant. Of importance is that the vowel appears as [<;!-] next



to uvulars and as [if] next to pharyngeals.⁹ Next to laryngeals the vowel surfaces as [a]

which, although still low, has a higher quality than the other two realizations. Bessell and

Czaykowska-Higgins argue that [a] is the "uncoarticulated value of the default vowel"

which is what one would expect to surface next to a placeless consonant.

To account for these issues, Rose proposes that cases in which laryngeals act dif-

ferently from other gutturals are in fact due to the lack of a tongue root retraction [RTR]

in laryngeals. So Rose explains vowel retraction as spreading [RTR] from the source con-

sonant onto the vowel. As for the different surface quality of the epenthetic vowel, Rose

actually uses it as further evidence that laryngeals are specified as Pharyngeal sounds in

Interior Salish. She points to Bessell and Czaykowska-Higgins' (1992) data, which show

that, regardless of the precise surface quality of the epenthetic vowel, it always surfaces

as low next to uvulars, pharyngeals, and laryngeals. This is akin to the cases of vowel

lowering next to guttural sounds in other languages. Rose, however, does not present any

compelling explanation for why laryngeals are not included in the sounds that trigger the

morpheme structure constraints in Moses-Columbian. Rose's representations of laryn-

geals, pharyngeals, uvulars, and emphatics are given in (18).

9 Bessell and Czaykowska-Higgins (1992) use the symbols [<,1] and [:f] to represent lower versions of [a] in Moses-Columbian. [:f] is apparently the lowest version of [a] while [<,1] falls in between the two.



(18) Rose's proposed representations for gutturals and emphatics (1996:80)

a. Laryngeals h ? Place or ROOT

Pharyngeal

d. Emphatics t1 d1 61 s1

Place

Oral Pharyngeal

b. Pharyngeals h ) Place

I Pharyngeal

I [RTR]

Coronal Dorsal [RTR]

c. Uvulars X B R

Place

I

0 Dorsal

e. Uvular stop q Place

[RTR]

Oral Pharyngeal

I I Dorsal [RTR]

Like McCarthy (1994), Rose draws a distinction between the uvular continuants [χ, ʁ] and the uvular stop [q]. According to Rose, in the continuants, the pharyngeal

component is primary, while in [q] it is secondary, making the uvular stop essentially an

emphatic version of [k].

2.4.3 Zawaydeh (1999)

The class of Arabic gutturals in Zawaydeh's view includes uvulars, pharyngeals,

laryngeals, and emphatics - essentially all sounds that include a constriction in the back

of the vocal tract. Based on her interpretation of her fiberscopic evidence that emphatics,

uvulars, and pharyngeals involve a constriction in the pharynx, she asserts that the pharynx is an active articulator in these sounds. She is careful, however, to exclude the laryngeals from this claim since, as shown by her data, there is no pharyngeal constriction involved in their articulation. Instead, she bases the membership of laryngeals in the

guttural class on acoustic grounds. Her acoustic experiment shows that emphatics, uvu-

lars, pharyngeals, and laryngeals are associated with high F1 in adjacent vowels. Her

acoustic findings with regard to laryngeals have already been questioned in §2.2.4. Such

findings merit further scrutiny since they go against the predictions of the source-filter

theory. Zawaydeh's claim regarding the basis on which gutturals are grouped is therefore

questionable. Zawaydeh maintains that emphatics and uvulars both involve a movement

by the back of the tongue towards the uvular region, a claim first made by McCarthy (1994). For this reason, Zawaydeh introduces a new feature [Retracted

Tongue Back] which she uses only in the representations of emphatics and uvulars.

Following representational proposals by Vaux (1993) and Davis (1995), Zawaydeh's feature trees involve splitting the Place node into two branches: a lower vocal tract node (LVT) and an upper vocal tract node (UVT). The former dominates pharyngeal and

laryngeal articulators and features while the latter dominates oral features and articula-

tors. Additionally, Zawaydeh follows Davis (1995) in labeling main places of articulation

as (1 place) and labeling secondary places of articulation as (2 place). The resulting rep-

resentations for emphatics, uvulars, pharyngeals, and laryngeals as provided by Zaway-

deh are listed in (19).


(19) Zawaydeh's proposed representations for emphatics, uvulars, pharyngeals, and laryngeals (1999:82)

a. Emphatics

   ROOT
     1 place
       UVT
         [Coronal]
     2 place
       LVT
         Pharyngeal Constriction
           Retracted Tongue Back

b. Uvulars

   ROOT
     1 place
       UVT
         [Dorsal]
       LVT
         Pharyngeal Constriction
           Retracted Tongue Back

c. Pharyngeals

   ROOT
     1 place
       LVT
         Pharyngeal Constriction
           [Retracted Tongue Root]

d. Laryngeals

   ROOT
     1 place
       LVT
         [Laryngeal]



2.5 Representational Problems

The different representational proposals for emphatics and gutturals discussed

above share some common traits. The most important is that they all include, in one form

or another, a pharyngeal component to represent the primary pharyngeal articulations in

uvulars, pharyngeals, and laryngeals as well as the secondary articulations in emphatics.

The precise implementation of this common view depends mainly on each researcher's

interpretation of the articulatory characteristics of emphatics. Each proposal directly re-

flects the assumption that emphatics are either pharyngealized or uvularized. All of those

proposals, however, ignore some crucial differences with regard to the particular vocal

apparatus that implements the aforementioned constrictions. While the representations of

laryngeal articulations are relatively straightforward, those of uvulars, pharyngeals, and

emphatics are somewhat vague or inaccurate. Additionally, these proposals fail to explain

some important phonological processes and patterns. The end results are theoretical rep-

resentations that are ill-motivated both phonetically and phonologically. This section

highlights these issues.

When considering the previously reviewed experimental literature on the articula-

tory maneuvers involved in the production of pharyngeals, one can clearly see that these

sounds are implemented by a retraction of the tongue root in the lower pharynx. Accord-

ing to Ladefoged and Maddieson (1996), "[t]he root of the tongue and the epiglottis can

be moved independently of the body of the tongue" (p. 11). Those studies also show that

the tongue body is not retracted during the articulation of pharyngeals. The active

articulator for these sounds is the tongue root, which is pulled back to create the lower

pharyngeal constriction. In uvulars, no independent retraction of the tongue root is noted.

Instead, the retracted structures include the tongue root, the tongue dorsum, and the

epiglottis: essentially all the movable structures behind the tongue blade that are parts of or

are directly attached to the tongue. So the apparent tongue root retraction in uvulars is

possibly a by-product of the retraction of the tongue body as a whole. When comparing the

general shape of the pharynx during the production of uvulars and pharyngeals, it appears

that the pharynx is constricted more and at a lower point during pharyngeals than during

uvulars. So, while the pharynx itself, most likely through the action of the pharyngeal

constrictors, moves independently to produce the constrictions in [ħ] and [ʕ], it is the

movement of the tongue body that constricts the pharynx during [χ] and [ʁ].

Like uvulars, the secondary articulation in emphatics also involves a tongue re-

traction that narrows the upper pharynx and retracts the tongue root and the epiglottis

along with it. However, the articulations of these two sets of sounds have four fundamen-

tal differences that are mostly ignored by the previous representational proposals. First,

the tongue body is retracted further during emphatics than during uvulars. Second, de-

spite some articulatory variability, the back of the tongue is generally moved vertically

towards the uvular area during uvulars but is horizontally slid backwards during emphat-

ics. Third, the upper surface of the tongue dorsum is depressed during emphatics but not

during uvulars. Fourth, the soft palate is actively involved in the production of all Arabic

uvular sounds. This participation is quite visible during [ʁ], but is rather subtle during [χ]

and [q]. Recall from the previous review of the articulatory studies that the uvula is

curled downwards and touches the tongue during [ʁ]. It was also noted that the uvula is

raised and held firmly flat during [χ] and [q]. Both positions differ from that in

other (non-nasal) sounds, in which the uvula is raised but somewhat relaxed and not

flat.

Norlin (1987) points to the difference between emphatics and pharyngeals noting

that "[a]s far as the pharynx is concerned it does not play an active part as an articulator"

in the production of emphatics. Instead, "[i]t is the tongue which by a backing movement

causes the constriction" (p. 72). McCarthy (1994) himself acknowledges that the pharyn-

geal constriction in emphatics and uvulars is executed by the [dorsal] articulator. How-

ever, he formally represents this constriction in emphatics underlyingly by referring to its

place rather than its active articulator. In his representation of uvulars, both place and ac-

tive articulator of the pharyngeal constriction are present. These different proposals are at

odds with his claim that the secondary articulation in emphatics is an equivalent to the

articulation of uvulars. Zawaydeh (1999) also notices the same difference between em-

phatics and pharyngeals and adds that uvulars, like emphatics, also involve a pharyngeal

constriction as a function of tongue backing. It is for this reason that she presents the

novel feature [Retracted Tongue Back] to describe uvulars and emphatics. Aside from the

fact that this feature is motivated only by this difference, Zawaydeh subscribes to the idea

that emphatics are uvularized which, as noted earlier, is articulatorily problematic. Addi-

tionally, the role of the tongue dorsum in her representation of uvulars is quite vague. It is

implicated twice: as implementer of the [dorsal] feature and as implementer of the [Re-

tracted Tongue Back] feature. To complicate things further, the two implementations are

nested under two different vocal tract nodes.

Phonologically, as explained earlier, one of the most important pieces of evidence

that Arabic gutturals constitute a natural class is their clearly low frequency of

cooccurrence in Arabic consonantal roots. Cooccurrence of two consonants belonging to that set

of sounds in the same root would violate the place-OCP-based constraint against the

cooccurrence of homorganic segments in the same root. For each one of the proposals in

§2.4 to account for the restricted cooccurrence of gutturals in Arabic roots, the place-

OCP has to address a terminal feature or class node that is common among all guttural

subclasses (the feature [pharyngeal] in McCarthy's 1994 proposal, the class node Pha-

ryngeal in Rose's 1996 proposal, and the class node LVT in Zawaydeh's 1999 proposal).

In all three proposals, those common features or nodes are also present in emphatics. All

existing representational proposals would predict that roots containing emphatics and gut-

turals should be significantly infrequent. However, looking at Table 2.2, it is clear that

emphatics and gutturals cooccur rather freely. To solve this problem, McCarthy proposes

the conjunction of the major class feature [approximant] to the feature [pharyngeal] when

defining the guttural class. This would limit the applicability domain of the OCP to [pha-

ryngeal] sounds that also share the feature [+approximant]. Since emphatics are not ap-

proximants, they would be excluded from this domain. However, this approach presup-

poses that all gutturals are approximants. While this might be true for low gutturals, the

two uvulars [χ] and [ʁ] are widely considered to be fricatives. While Ladefoged and

Maddieson (1996) suggest that Arabic pharyngeals are indeed approximants, they make no

such claim with regard to uvulars. This particular point is addressed experimentally in

Chapter 3 (Experiment One) of this dissertation. Alternatively, Pierrehumbert (1993)

proposes a requirement that the place OCP should exclude secondary articulations. However,

she makes no formal suggestions on how that might be implemented. Zawaydeh's (1999)

representations do reflect the difference between primary and secondary articulations.



Nevertheless, the problem persists. As can be seen in Table 2.2, emphatics and velars

rarely cooccur. Since emphatics are primarily coronal, we have to assume that it is the

secondary articulation in emphatics that causes a violation of the OCP-based restriction.

Indeed, in Rose's proposals, emphatics underlyingly include a [dorsal] articulation. We

can assume that it is this component that triggers the OCP violation if there is a [k] or [g]

in the same root with an emphatic. It seems that simply excluding secondary articulations

from the applicability domain of the place OCP is not a viable solution.
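The logic of the [approximant] conjunction discussed above can be made concrete with a small sketch. It is an illustration only, not an implementation from McCarthy (1994): the `Segment` class, the `ocp_applies` function, the segment labels, and the feature values assigned below are all assumptions introduced here. The sketch encodes the place-OCP domain as "both segments bear [pharyngeal] and are [+approximant]" and shows why the status of the uvular continuants matters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    label: str         # hypothetical ASCII label, not an IPA transcription
    pharyngeal: bool   # bears [pharyngeal], as a primary or secondary articulation
    approximant: bool  # assumed value of the major class feature [approximant]


def ocp_applies(a: Segment, b: Segment) -> bool:
    """True if both segments fall inside the restricted OCP domain:
    [pharyngeal] conjoined with [+approximant]."""
    return all(seg.pharyngeal and seg.approximant for seg in (a, b))


# Assumed feature values, for illustration only.
pharyngeal_c = Segment("pharyngeal", pharyngeal=True, approximant=True)
laryngeal_c = Segment("laryngeal", pharyngeal=True, approximant=True)
emphatic_c = Segment("emphatic", pharyngeal=True, approximant=False)
uvular_as_approximant = Segment("uvular", pharyngeal=True, approximant=True)
uvular_as_fricative = Segment("uvular", pharyngeal=True, approximant=False)

# Two gutturals in one root are penalised; a guttural plus an emphatic is not.
print(ocp_applies(pharyngeal_c, laryngeal_c))
print(ocp_applies(pharyngeal_c, emphatic_c))
# If uvular continuants are fricatives, they drop out of the domain as well.
print(ocp_applies(pharyngeal_c, uvular_as_fricative))
```

Under these assumed values, the conjunction excludes emphatics as intended, but it excludes the uvular continuants too if they are classified as fricatives, which is the difficulty raised in the text.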

Another major phonological challenge that faces existing representations of em-

phatics and gutturals concerns the vowel lowering issue. Recall that some of the most

compelling evidence for the existence of a guttural natural class is the lowering of vowels

to [a] in the vicinity of guttural sounds. For the most part, emphatics do not trigger a

similar effect. To account for this dissimilarity, McCarthy again appeals to the feature

[approximant]. However, no formal explanation is provided. One might also appeal to

different effects of spreading primary nodes/features from spreading secondary ones.

However, one vowel lowering process in Eastern Arabic dialects (Herzallah 1990; also

cited in McCarthy 1994) involves gutturals, emphatics, and contextually emphaticised

[r]. Elaboration on this particular evidence is presented in Chapter 6 of this dissertation.

What is important here is that primary and secondary articulations can have the same

phonological impact.

The previous discussion highlights the phonetic and phonological inadequacies of

the present proposals for the theoretical representations of emphatics and gutturals in

Arabic. These representations reflect clear phonetic-phonological mismatches and they


fail to account for differences in the phonological behavior of the sound classes in

question.



CHAPTER 3

Experiment One:

The Spectral Properties of Arabic Consonants

3.1 Overview

This experiment investigates the canonical spectral shapes of MSA emphatics,

gutturals, and related sounds. The goal of this chapter is to address two gaps in the acous-

tic literature on Arabic emphatics and gutturals. First, recall from the previous chapter

that previous attempts to distinguish emphatic consonants from non-emphatic ones based

on their spectral shapes have relied mostly on visual observation of spectrograms. More

extensive objective research in this field is necessary to achieve a fuller understanding of

the acoustic correlates to emphaticness. Second, while there have been strong phoneti-

cally-based claims that Arabic pharyngeals are approximants rather than fricatives, no

similar claims are presented regarding the consonantal status of uvular continuants. Sig-

nificant theoretical proposals in one of the most notable works on the phonology of Ara-

bic gutturals (McCarthy 1994) are predicated on the claim that all Arabic gutturals are

approximants (see §2.5). It is necessary to verify this claim on phonetic grounds.

Consequently, there are two hypotheses being tested in this experiment. The first

hypothesis is that there are no differences between canonical spectral shapes of emphatic

consonants and their non-emphatic counterparts. This hypothesis is based mainly on the

prediction that the main articulations in Arabic emphatics would filter away most of the



acoustic impact from their secondary articulations. The second hypothesis is that the

shapes of the power spectra of Arabic uvulars suggest that these sounds are fricatives

rather than approximants. This hypothesis follows from the fact that the vocal tract con-

striction during the articulation of Arabic uvular continuants is noticeably narrower than

what one would expect in an approximant articulation.

Fairly consistent associations between the spectral shapes of consonants and their

articulatory properties have been reported in many studies (Hughes and Halle, 1956;

Halle et al., 1957; Strevens, 1960; Heinz and Stevens, 1961; Stevens and Blumstein,

1978; Blumstein and Stevens, 1979; Evers et al., 1998). Those studies, among many oth-

ers, show that the shapes of the power spectra of obstruents are decided by the vocal tract

configuration involved in their articulation. The location and degree of the articulatory

constriction for a certain sound determines its acoustic output. However, expressing those

associations objectively and economically, especially in the case of fricatives, remains a

methodological challenge as pointed out by Kent and Read (2002). The authors point to

the intra- and inter-observer variability linked to the most commonly used methods of

power spectra characterization. A table provided by the authors (Table 5-4, p. 168) dis-

plays the substantial ranges of variability in the results obtained by several examiners

who used metrics like relative intensity, effective spectrum length, and spectral peak lo-

cation to classify fricative spectra. Among those metrics, spectral peak location has been

relatively widely used. In what is possibly the most exhaustive use of this metric in a sin-

gle study, Nartey (1982) investigates the fricative consonants in several languages.

Nartey shows that prominent peaks in the power spectra of fricatives generally correlate

with their places of articulation. Nevertheless, an inspection of the many tables provided



in the study reveals that the peak locations (both high and low ones) depend heavily on

the language, the speaker, and the phonetic context. Another widely used metric is spec-

tral tilt. Stevens and Blumstein (1978) note that syllable-initial bilabial stops coincide

with a diffuse falling spectrum, alveolar stops coincide with a diffuse rising spectrum,

and velar stops have a compact spectrum with a mid-frequency peak. This method

remains, however, qualitative rather than quantitative, which limits its objectivity. This is

especially true when characterizing and comparing the power spectra of obstruents whose

places of articulations are not widely distributed in the vocal tract.
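Of the metrics discussed above, spectral peak location is the simplest to state computationally. The following is a minimal sketch, assuming the power spectrum is already available as two parallel lists; the function name `spectral_peak`, its parameters, and the toy spectrum are hypothetical and are not taken from any of the cited studies. It does nothing to address the speaker and context dependence noted in the text.

```python
def spectral_peak(freqs_hz, power, lo_hz=0.0, hi_hz=float("inf")):
    """Return (frequency, power) of the strongest component in [lo_hz, hi_hz]."""
    candidates = [(p, f) for f, p in zip(freqs_hz, power) if lo_hz <= f <= hi_hz]
    if not candidates:
        raise ValueError("no spectral components in the requested range")
    best_power, best_freq = max(candidates)
    return best_freq, best_power


# Toy spectrum sampled every 1 kHz from 0 to 10 kHz (made-up values).
freqs = [k * 1000.0 for k in range(11)]
power = [1.0, 2.0, 3.0, 5.0, 9.0, 4.0, 2.0, 1.0, 1.0, 0.5, 0.5]

peak_hz, peak_power = spectral_peak(freqs, power)
high_peak_hz, _ = spectral_peak(freqs, power, lo_hz=5000.0)
```

Restricting the search range, as in the second call, is one way to report separate low and high peaks.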

More recently, an analysis method that employs the statistical concepts of spectral

moments has received growing attention. In this alternative, which was introduced by

Forrest et al. (1988), the distribution of spectral energy in the main noise portion of the

obstruent (frication noise for fricatives and burst plus the ensuing frication for stops) is

treated as a probability distribution for which the first four moments (mean,

variance, skewness, and kurtosis) are computed. Those metrics inherently express the spectral

tilt of the obstruent as well as its peakedness and its center of gravity. We could roughly

consider this method to be, among other descriptions, a quantitative upgrade of the quali-

tative metric of spectral tilt. The spectral moments generated for a given speech sound are

typically computed from the FFT power spectrum of that sound. Theoretically speaking,

spectral moments can also be based on Linear Predictive Coding (LPC) power spectra.

However, a major technical drawback associated with LPC analysis renders it

unsuitable as the foundation for the spectral moments analysis. According to Kent and Read

(2002), LPC models are generally founded on the assumption that speech articulations

produce only poles (resonances; formants) and not zeros (anti-resonances; antiformants).



This assumption does not hold for speech sounds where the vocal tract is highly con-

stricted, such as sibilants, or bifurcated, such as nasals. The FFT model, on the other

hand, captures both poles and zeros yielded by the different transfer functions of the

vocal tract and is therefore considered to be a more appropriate source for the calculation of

spectral moments.
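The computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of Forrest et al. (1988) or of any particular software package: the function name `spectral_moments` is hypothetical, the power spectrum is assumed to be given as parallel lists of frequencies and power values, skewness and kurtosis are normalized by the standard deviation, and 3 is subtracted from kurtosis so that a Gaussian-shaped spectrum scores zero (a convention that varies between implementations).

```python
import math


def spectral_moments(freqs_hz, power):
    """Return (mean, variance, skewness, kurtosis) of a power spectrum,
    treating the normalized power values as a probability distribution
    over frequency."""
    total = sum(power)
    if total <= 0.0:
        raise ValueError("power spectrum must contain some energy")
    weights = [p / total for p in power]
    mean = sum(f * w for f, w in zip(freqs_hz, weights))

    def central(k):
        # k-th central moment of the frequency distribution
        return sum(((f - mean) ** k) * w for f, w in zip(freqs_hz, weights))

    variance = central(2)
    if variance == 0.0:
        raise ValueError("all energy lies at a single frequency")
    sd = math.sqrt(variance)
    skewness = central(3) / sd ** 3
    kurtosis = central(4) / sd ** 4 - 3.0  # excess kurtosis (assumed convention)
    return mean, variance, skewness, kurtosis


# Symmetric toy spectrum: energy centred on 2 kHz.
mean, variance, skewness, kurtosis = spectral_moments(
    [1000.0, 2000.0, 3000.0], [1.0, 2.0, 1.0]
)
```

For the symmetric toy spectrum the mean falls at the centre frequency and the skewness is zero; a spectrum with more energy at low frequencies would give positive skewness.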

Although spectral moments are still underused, several studies have shown that they can

characterize obstruent place of articulation. Forrest et al. (1988) show that

power spectra obtained for the release portion of English [t] have consistently higher

mean and lower skewness than those of [p] and [k] (see their Figure 2, p. 119). Mean-

while, power spectra for [k] had consistently higher kurtosis values than the other two

stops. Nittrouer (1995) and Kardach et al. (2002) achieved similar results when

comparing the two English stops [t] and [k]. Forrest et al.'s (1988) discriminant function analysis

based on spectral moments yielded high classification rates for voiceless English stops.

The three voiceless stops [p, t, k] had correct classification rates that ranged from 85.3%

to 100% based on linear scale spectral moments derived for the first 20 ms from the stop

burst. By contrast, the classification of voiceless fricatives did not fare as well; even with

the inclusion of moments generated at the fricative-vowel transition (which, the study

finds, generally increases classification rates), classification of nonsibilants was as low as

61%. However, when considering sibilants alone, classification rates based on linear

scale spectral moments generated from the first 20 ms of the fricative noise were quite

high (70% to 100%). Tomiak (1990) obtained comparatively higher classification rates

for the English fricatives [f, θ, s, ʃ, h]. Her discriminant function analysis results based

on spectral moments showed high rates of correct identification that ranged between 75%



and 100%. However, when using spectral moments from the tokens produced by two of

her subjects for cross-validation, the overall identification rates for [f] and [θ] dropped to

67% and 44%, respectively.

In their study of English voiced and voiceless fricatives, Jongman et al. (2000)

report that each of the four spectral moments generated at fricative onset, middle, offset,

and the fricative-vowel transition was able to distinguish at least three out of four

fricative places of articulation at each location. Subsequent discriminant function analysis

involving spectral moments (among other acoustic metrics) yielded an overall correct

identification rate of 77%. Here, again, nonsibilants were noticeably less accurately

classified (64% to 68%) than sibilants (85% to 91%). It should be noted, though, that not

enough details were provided regarding the weight of the contribution of spectral

moments as a single group of predictors to the results of Jongman et al.'s discriminant

function analysis.

Most available studies of spectral moments rely on English data. However, an in-

vestigation of Polish fricatives by Jassem (1995) shows that cross-linguistic agreement in

the ranges of spectral moments' values, at least for [s], can be achieved. Among the five

voiceless Polish fricatives [s, f, J, x], [s] had the highest spectral mean and the lowest

skewness. The velar fricative [x] in Jassem's study is the only fricative produced at a point in

the rear portion of the vocal tract (aside from [h] in Tomiak's (1990) study) for which

there is a spectral moments-based description. Since the present chapter covers Arabic

uvulars and pharyngeals, which are expected to reflect spectral properties close to those

of Polish [x], Jassem's study is a significant acoustic yardstick. Polish [x] had profoundly

higher skewness and kurtosis than the other four fricatives. These values are expected



since velar sounds generally have compact power spectra with well defined peaks and

comparatively little energy at the high frequency range.

Tomiak (1990) concludes that her results "portray the spectral moments metric as

a potential solution to the invariance problem in speech perception" (p. 187). Kent and

Read (2002) also suggest that spectral moments should be considered by anyone who

wants to study the spectral attributes of obstruents. The present experiment studies a set

of sounds with various local and global characteristic spectral attributes (compact, dif-

fuse, rising, falling, dispersed, dense). The bulk of acoustic analysis done in this chapter

involves descriptions and comparisons of MSA consonants using the spectral moments

method since, as pointed out by Jongman et al. (2000), it characterizes both local (center

of gravity) and global (tilt and peakedness) qualities of power spectra.¹⁰ Heeding Kent

and Read's further suggestion that other methods of analysis may be used along with

spectral moments, a new quantitative spectral characterization method, called

multi-band spectral (MBS) analysis, is introduced here for the first time.¹¹ In this

method, the average RMS (root-mean-square) level for every 1000 Hz frequency band of

the power spectrum is calculated by means of Fast Fourier Transform (FFT) and

expressed in decibels as a single number. The graphical result is a stepped power spectrum

that averages out the many spectral minima and maxima found in a typical FFT spectrum,

revealing the bulk shape of the spectrum. The resulting loss of spectral detail is not

believed to have a significant impact on the unique acoustic identities of obstruents.

10 One could also add energy dispersion to the set of global spectral characteristics captured by spectral moments.

11 I am very grateful to Professors Raymond Kent and Paul Milenkovic for their efforts in the development of this analysis method. The former is duly credited for the concept of generating energy spectra averaged over bands of 1 kHz at the frication noise of the fricative and at the burst of the stop. The latter developed a computer implementation of the concept and included it in his very capable TF32 computer program.

Figure 3.1. A multi-band spectrum (stepped line) and an FFT spectrum for the Arabic voiceless fricative [s] in the sequence [asa], both generated from a 40-ms full Hamming window placed at the middle of the frication noise. (Horizontal axis: frequency in kHz, 0 to 11; vertical axis: -10 to -100.)

Heinz and Stevens (1961) found that listeners were able to distinguish synthetic voiceless

fricatives made by applying a single energy pole and a single zero to white noise,

yielding rather oversimplified energy spectra. Figure 3.1 shows the shape of a multi-band

spectrum generated using a 40-ms Hamming window applied over the middle of frication

noise in [s] in the context [asa]. In this example, the MBS has 11 bands because the sound

signal was digitized at a 22 kHz sampling rate, which yields a spectrum extending to 11 kHz. A spectral range of 11 kHz is

regarded as sufficient for capturing the characteristic spectral shapes of obstruents. In a

study of Australian English fricatives, Tabain (1998) concludes that spectral information

above 10 kHz is not consistent across speakers. This acoustic information appears to be

dependent on the speaker rather than on the fricative type. Additionally, studies on the

spectral properties of stop releases (Liberman et al. 1952, Stevens and Blumstein 1978,



Blumstein and Stevens 1979) suggest that the spectral "templates" that characterize the

different stop places are confined to the lower 5 kHz of the power spectrum.
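The band-averaging step of the multi-band spectral analysis described above can be sketched as follows. This is not the TF32 implementation: the function name `multiband_levels_db` and its parameters are hypothetical, the power spectrum is assumed to be given as parallel lists, and the decibel reference is taken to be 1.0 (so levels are relative, not calibrated). The sketch relies on the identity that the RMS level of a band in decibels, 20·log10 of the RMS magnitude, equals 10·log10 of the band's mean power.

```python
import math


def multiband_levels_db(freqs_hz, power, band_hz=1000.0, max_hz=11000.0):
    """Average a power spectrum over consecutive bands of band_hz and
    return one level in dB (re 1.0) per band, from 0 Hz up to max_hz."""
    n_bands = int(max_hz // band_hz)
    sums = [0.0] * n_bands
    counts = [0] * n_bands
    for f, p in zip(freqs_hz, power):
        band = int(f // band_hz)
        if 0 <= band < n_bands:  # components at or above max_hz are ignored
            sums[band] += p
            counts[band] += 1
    levels = []
    for s, c in zip(sums, counts):
        if c == 0 or s <= 0.0:
            levels.append(float("-inf"))  # empty or silent band
        else:
            levels.append(10.0 * math.log10(s / c))  # mean power in dB
    return levels


# A 22 kHz sampling rate gives a spectrum up to 11 kHz, hence 11 bands.
RATE_HZ = 22000
freqs = [k * 25.0 for k in range(441)]   # 0 Hz to 11 kHz in 25 Hz steps
flat_power = [1.0] * len(freqs)          # toy flat spectrum
levels = multiband_levels_db(freqs, flat_power, max_hz=RATE_HZ / 2)
```

Plotting `levels` as a step function over the band edges would give the stepped outline of the kind shown in Figure 3.1.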

3.2 Methods

3.2.1 Subjects

Five male subjects participated in this experiment. They were in their late twen-

ties and early thirties. All of the subjects were native speakers of central Saudi Arabic

dialects that are closely related and have identical phonological inventories. I have noted

that all five subjects spoke the Riyadh dialect even though they were not all natives of

Riyadh. This is not surprising given the regional status of the Riyadh dialect. Much like

the Cairo dialect in Egypt and the Damascus dialect in Syria (Holes 1994), the Riyadh

dialect is the predominant and most widespread dialect in central Saudi Arabia. All of the

subjects have lived for various periods in Riyadh and a sizable number of their acquaint-

ances are speakers of the Riyadh dialect.

The Riyadh dialect retains all of the MSA consonants save for [dˤ], which is always

replaced by [ðˤ], and [q], which is used occasionally but is frequently replaced by [g].

The vocalic inventory of the Riyadh dialect is substantially different from that of MSA. The high

vowels [i] and [u] are retained only when geminated. When single, these two vowels are

typically reduced to [ə]. The low vowel [a] is retained in both single and geminated

forms. Furthermore, Riyadh dialect has two mid vowels that appear only in geminated

forms: [££] and [oo]. The reason for the lack of any single versions of these vowels is



that they are used exclusively in lieu of the MSA vowel-glide combinations [aj] and

[aw], respectively.

The five subjects were all graduate students at the University of Wisconsin-

Madison. Their high level of education ensures that they have the desired proficiency in

MSA since, as pointed out in §1.3, MSA is the primary form of Arabic used in the

educational system and intellectual circles. All five subjects showed a very high level of

fluency in MSA and expressed no complaints with regard to the speech materials they were

presented with. The subjects were quite capable of producing all MSA speech sounds

comfortably, including those that were absent from their native dialect. None of the sub-

jects displayed any speech- or hearing-related abnormalities.

3.2.2 Stimuli

The set of stimuli consists of real MSA words that contain the sequence VCV in

which the consonants belonged either to the class of emphatics ([tˤ], [dˤ], [ðˤ], [sˤ]), their

nonemphatic counterparts ([t], [d], [ð], [s]), gutturals ([q], [χ], [ʁ], [ħ], [ʕ]), or the velar

stop [k]. The vowels in those sequences were [i], [a], and [u]. Every VCV combination

was represented yielding a paradigm of 126 test words (3 x 14 x 3). A list of the test

words is provided in Appendix A.
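The size of the paradigm follows directly from crossing three vowels with fourteen consonants and three vowels again. A small sketch of that enumeration is given below; the ASCII consonant labels are hypothetical placeholders chosen for this illustration, not the transcriptions used in Appendix A.

```python
from itertools import product

VOWELS = ["i", "a", "u"]

# Hypothetical ASCII placeholders for the 14 test consonants.
CONSONANT_CLASSES = {
    "emphatic": ["T", "D", "DH", "S"],
    "nonemphatic": ["t", "d", "dh", "s"],
    "guttural": ["q", "X", "G", "H", "9"],
    "velar": ["k"],
}
CONSONANTS = [c for members in CONSONANT_CLASSES.values() for c in members]

# Every V1-C-V2 combination: 3 x 14 x 3 frames.
frames = list(product(VOWELS, CONSONANTS, VOWELS))
print(len(CONSONANTS), len(frames))
```

Each tuple in `frames` corresponds to one VCV frame for which a real word had to be found.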

The test paradigm is less than ideal since it is not possible to find a minimal (or

even near-minimal) set of 126 real words. This apparent drawback is outweighed by the

use of natural words rather than nonsense utterances. While the use of nonsense words

makes it possible to compile a carefully designed paradigm in which all sound sequences

of interest are represented in identical phonetic surroundings, such items remain, by

definition, alien to the speakers. Certain degrees of unnaturalness and over-articulation have

to be involved when pronouncing them. In their cinefluorographic articulatory comparison

between Arabic emphatics and their non-emphatic counterparts in nonsense words as

well as natural words, Ali and Daniloff (1972) found that the tongue shape during

emphatics varies widely across subjects in nonsense words versus natural words. Furthermore,

one of Ali and Daniloff's figures (Figure 5, p. 91) shows that the articulatory displace-

ments of the different tongue parts in emphatics in nonsense words ranged from about

half to about twice those in natural words relative to non-emphatics. The authors

conclude that "studies utilizing contrived nonsense utterances or sustained utterances, do not

elicit the same articulatory responses which occur during production of natural speech"

(p. 92).

In order to minimize the drawbacks of using a paradigm of natural words, the se-

lection of the test words followed some general guidelines. Nasals in test words were

avoided as much as possible since they spread nasalization to neighboring vowels,

creating zero resonances (anti-formants) that can cause misreads in automatic formant

tracking.¹² Liquids were also avoided when possible as they can be contextually emphaticised.

Morpheme boundaries were avoided as well since they might interact with emphasis

spread. Additionally, preference was given to the use of single, as opposed to geminated,

vowels in the VCV sequences. Every effort was made to ensure that these guidelines are

upheld. To this end, three dictionaries were thoroughly consulted during the compilation

of the test paradigm: Ar-Razi's (fl. 1261 A.D.) Mukhtar Al-Sihah, Wehr's (1979) A

Dictionary of Modern Written Arabic, and Baalabaki's (1995) Al-Mawrid Arabic-English

12 The importance of this point stems from the fact that the stimuli for this experiment are also used as a subset of the stimuli for Experiment Three. See Chapter 5.



Dictionary. However, there are cases that go against some of the stated preferences. This is

to be expected from a large experimental paradigm of natural words.

The words were presented in the carrier phrase "ʔalkalimtu hiya __" ("The

word is __"). Since it is permissible in Standard Arabic speech to drop tense or case

inflectional suffixes in sentence-final words, the placement of the words in the carrier

phrase enables the subjects to pronounce the words without those suffixes which might

vary from subject to subject and cause inconsistent interference with the consonants and

vowels in question. Since such suffixes are mostly written as diacritics and are, in many

cases, optionally displayed in the written form, they were left out from the printed stimuli

to avoid any confusion.

3.2.3 Procedures

All the recordings were made in the Phonetics Lab at the University of Wiscon-

sin-Madison. Each subject was seated comfortably in a sound-attenuating booth and

asked to speak into a TOA J1 high-fidelity microphone, keeping a constant distance of

approximately 15 cm between his mouth and the microphone. The experimental phrases

were ordered randomly. Each experimental phrase was displayed in front of the subject as

an individual Microsoft PowerPoint (Microsoft Corp. 1987) slide. This necessitated the

presence of a computer in the recording booth. However, the computer that was used was

very quiet and, as a further precaution, was tucked below the table on which the micro-

phone, along with the computer system's monitor and keyboard, were placed. In general,

there was no detectable noise interference from the computer. Before the recording

started, the subjects were instructed to read each phrase at a normal conversational rate

and effort and then to proceed to the next one by pressing a key on a quiet computer keyboard.

Each phrase was repeated three times by each subject, yielding a total of 15 instances of

every test phrase (3 repetitions x 5 subjects). The speech tokens were recorded on HHB

digital audio tapes using a TASCAM DA-30 digital audio recorder.

Using an Apple Macintosh desktop computer and the audio editing software Peak

LE (BIAS, Inc. 1996), the full recording session of each subject was digitized into a long

WAVE file at a 22 kHz sampling rate and 16 bit quantization. These master speech files

were transferred to recordable CD media for proper archiving. The individual test words

were then cut from the master files, normalized for amplitude, and saved as individual

WAVE files using the audio editing software GoldWave (GoldWave, Inc. 2002) on an

IBM-compatible desktop computer.

3.2.4 Acoustic Analysis

3.2.4.1 Spectral Moments

The speech analysis software Praat (Boersma and Weenink 1992) was used to

calculate the spectral moments of consonant spectra. Through the use of a scripting lan-

guage, Praat allows automation of the analysis procedures provided that the acoustic

landmarks being investigated are identified as time points or intervals. The acoustic

landmarks of importance for this experiment were the segment boundaries. To identify

these boundaries the waveform and spectrogram of each speech token were displayed in

the editor screen of Praat and the boundaries of the consonants being investigated were

identified and marked as time points. In deciding on the segment boundaries, waveforms

and wide-band spectrograms were visually consulted. Boundaries that were relatively



more difficult to distinguish were verified auditorily. The collection of time points

corresponding to the identified boundaries for the sound token was saved as a TextGrid file.¹³

To calculate the spectral moments at the desired locations of the consonant, two

Praat script files were written; one for continuants and the other for stops. The scripts re-

ferred to the TextGrid files and then executed the following steps:

1. The analysis window locations (see further explanation below) were set relative to the landmarks previously identified in the TextGrid file. Those portions of the sound file were then extracted as single windowed selections (40 ms full Hamming).

2. Pre-emphasis was applied independently to each extracted selection. This was done since, according to Praat's co-author, Paul Boersma (personal communication), Praat calculates spectral moments directly without applying pre-emphasis by default.

3. An FFT spectrum was generated for the selection.

4. The four spectral moments were computed from the FFT spectrum and their

values were recorded in an independent results text file.

The results text file was then converted to the file formats of the spreadsheet

software Microsoft Excel (Microsoft Corp. 1985) and the statistical analysis software

SPSS (SPSS, Inc. 1989) for statistical analysis.
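The four steps listed above can be sketched in Python. This is not the author's Praat script: it is a stand-in under several assumptions. The pre-emphasis coefficient of 0.97 is assumed, a naive DFT replaces the FFT so that the example needs only the standard library, the power spectrum is treated as the weighting distribution, and kurtosis is computed as excess kurtosis (minus 3). All function names are hypothetical.

```python
import cmath
import math


def hamming(n_points):
    """Full (symmetric) Hamming window of n_points samples."""
    if n_points == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def pre_emphasize(samples, coeff=0.97):
    """First-order pre-emphasis: y[n] = x[n] - coeff * x[n-1] (coeff is assumed)."""
    return [samples[0]] + [samples[n] - coeff * samples[n - 1]
                           for n in range(1, len(samples))]


def power_spectrum(samples, sample_rate):
    """Naive DFT (stand-in for an FFT); returns (frequencies in Hz, power per bin)."""
    n = len(samples)
    freqs, powers = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        freqs.append(k * sample_rate / n)
        powers.append(abs(acc) ** 2)
    return freqs, powers


def spectral_moments(freqs, powers):
    """Mean (Hz), standard deviation (Hz), skewness, and excess kurtosis."""
    total = sum(powers)
    weights = [p / total for p in powers]
    mean = sum(f * w for f, w in zip(freqs, weights))

    def central(order):
        return sum((f - mean) ** order * w for f, w in zip(freqs, weights))

    sd = math.sqrt(central(2))
    return mean, sd, central(3) / sd ** 3, central(4) / sd ** 4 - 3


def moments_for_selection(samples, sample_rate):
    """Steps 1-4 for one extracted selection: window, pre-emphasize, DFT, moments."""
    windowed = [x * w for x, w in zip(samples, hamming(len(samples)))]
    emphasized = pre_emphasize(windowed)
    return spectral_moments(*power_spectrum(emphasized, sample_rate))


# Synthetic 1 kHz tone (25 ms at 8 kHz), used only to exercise the functions.
tone = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(200)]
mean_hz, sd_hz, skewness, kurtosis = moments_for_selection(tone, 8000)
```

For a pure tone the spectral mean should sit close to the tone frequency, which makes a convenient sanity check before running the procedure on speech tokens.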

In calculating the spectral moments for continuants, the procedures chosen by Jongman et al. (2000) were followed. This involves computing the mean, standard deviation, skewness, and kurtosis of FFT spectra generated using 40-ms full Hamming windows. Jongman et al. calculate the moments over four regions (fricative onset, middle of fricative, fricative offset, and centered over the fricative-offset/following vowel boundary). In the present study, a region centered over the fricative-onset/preceding vowel boundary was added, bringing the number of test windows to five.

13 A TextGrid file is basically a text file listing in numerical format the selected time points expressed in milliseconds.

Figure 3.2. Locations of the sampling windows at which the spectral moments for fricatives (above) and stops (below) were calculated.
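The five fricative windows can be written down as time intervals relative to the two segment boundaries. The sketch below is an assumption-laden illustration: the numbering (1 = preceding-vowel boundary, 2 = onset, 3 = middle, 4 = offset, 5 = following-vowel boundary) is inferred from how windows 2, 3, and 4 are described elsewhere in the chapter, and the exact alignment of each window is assumed. The function name `fricative_windows` is hypothetical.

```python
def fricative_windows(onset, offset, length=0.040):
    """Return {window number: (start, end)} in seconds for five 40-ms windows.

    onset and offset are the fricative boundaries in seconds.
    """
    half = length / 2
    mid = (onset + offset) / 2
    return {
        1: (onset - half, onset + half),    # centered on preceding-vowel boundary
        2: (onset, onset + length),         # fricative onset
        3: (mid - half, mid + half),        # middle of fricative
        4: (offset - length, offset),       # fricative offset
        5: (offset - half, offset + half),  # centered on following-vowel boundary
    }


# Hypothetical fricative running from 100 ms to 220 ms.
windows = fricative_windows(0.100, 0.220)
```

For fricatives shorter than 80 ms, windows 2, 3, and 4 overlap under this layout, so neighboring windows are not independent samples of the frication noise.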

For stops, a 20-ms half Hamming window that starts at the beginning of the stop

release noise as well as a 40-ms full Hamming window centered over the onset of voicing

of the following vowel were used. The selection of a half Hamming window over the re-

lease is intended to make sure that the stop burst corresponds to the widest portion of the

window. This also avoids the inclusion of any possible pre-burst noises in the calculation.

And since release noises are in many cases rather short, the window length is set at 20

ms. Figure 3.2 illustrates the placement of the sampling windows over the frication noise

in fricatives and release noise in stops.
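To make the half Hamming choice concrete, the sketch below builds the decaying half of a Hamming window, so that the first sample (the start of the burst) receives full weight. It assumes the standard Hamming coefficients (0.54 and 0.46) and a 22,050 Hz sampling rate for the 20-ms example; the function name `half_hamming` is hypothetical.

```python
import math


def half_hamming(n_points):
    """Right half of a Hamming window: starts at 1.0 and tapers to 0.08."""
    if n_points == 1:
        return [1.0]
    # Equivalent to samples centre..end of a full Hamming window
    # of length 2 * n_points - 1.
    return [0.54 + 0.46 * math.cos(math.pi * n / (n_points - 1))
            for n in range(n_points)]


# 20 ms at an assumed 22,050 Hz sampling rate is 441 samples.
burst_window = half_hamming(441)
```

A full Hamming window placed at the same starting point would instead weight the burst onset at only 0.08, which is the situation the half window is meant to avoid.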



3.2.4.2 Multi-Band Spectra (MBS)

The speech analysis computer program TF32 (Milenkovic 2000) was used to cal-

culate the multi-band spectra for voiceless consonants only. For continuants, multi-band

spectra were calculated using 40-ms full Hamming windows at two locations (see the Results

for elaboration): frication onset and middle of frication. These two window locations

correspond to windows 2 and 3 used to calculate spectral moments (see Figure 3.2).

As for stops, 20-ms full Hamming windows were placed starting 5 ms ahead of the stop

burst. This particular arrangement was followed because the software used to generate

the MBS, TF32, does not offer a half Hamming window option which would have been

ideal for stop releases. And due to the short duration of many of the stop releases covered

in this study, 20-ms rather than 40-ms window length is a suitable choice.

3.2.5 Reliability

To assess the intra-judge reliability of the obtained spectral moments analysis results,

a total of 108 sound files containing continuant sounds and 81 sound files containing

stop sounds (approximately 10% of the total files in both cases) were randomly selected

by random number generating software and re-analyzed following the same

procedures explained in §3.2.4.1. For continuants, the correlations between the spectral

moments values (averaged from the five window locations) of the two groups of tokens

(the original and the retested tokens) were at a full 1.00 for all four moments. Agreements

within 50 Hz for spectral mean and standard deviation, 0.05 for skewness, and 0.1 for

kurtosis were between 80.6% and 94.4%. For stops, the correlations between the two

groups of tokens for all moments at both window locations were also at exactly 1.00.



Agreements within 50 Hz for spectral mean and standard deviation, 0.05 for skewness,

and 0.1 for kurtosis were between 96.3% and 100%. The measurements were judged reli-

able.
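The two reliability measures reported here, a correlation between original and retest values and the percentage of pairs agreeing within a tolerance, can be sketched as follows. The function names are hypothetical, the data values are invented for illustration and are not taken from the study, and "within 50 Hz" is assumed to be inclusive.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equally long lists of measurements."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def agreement_rate(a, b, tolerance):
    """Percentage of original/retest pairs that differ by at most tolerance."""
    hits = sum(1 for x, y in zip(a, b) if abs(x - y) <= tolerance)
    return 100 * hits / len(a)


# Invented spectral mean values (Hz) for four tokens, original vs. retest.
original = [6400.0, 6120.0, 4850.0, 3400.0]
retest = [6410.0, 6100.0, 4920.0, 3395.0]
r = pearson_r(original, retest)
within_50_hz = agreement_rate(original, retest, 50.0)
```

The two measures answer different questions: a correlation near 1.00 shows that the retest preserves the ordering and spacing of tokens, while the agreement rate shows how often the absolute values themselves were reproduced.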

To estimate the intra-judge reliability of the multi-band spectra measurements,

a total of 54 sound files containing continuant sounds and 54 sound files containing

stop sounds (10% of the total files in both cases) were selected randomly by the

random number generating software and re-analyzed following the same procedures ex-

plained in §3.2.4.2. For continuants, the correlations between the original and the retested

groups of tokens in terms of the relative intensity values at each of the 11 spectral bands

were all above 0.99. Agreements within 1 dB at each band were between 89.8% and

99.1%. For stops, the correlations between the two groups of tokens for each of the 11

spectral bands were all above 0.99. Agreements within 1 dB at each band were between

97.2% and 100%. The measurements were judged reliable.

3.3 Results

3.3.1 Spectral Moments

3.3.1.1 Voiceless Continuants - Pooled Data

A set of four analysis of variance (ANOVA) tests was conducted for the four voiceless continuants, [s], [sˤ], [χ], and [ħ], across the five subjects and the nine vowel contexts. In each test one of the four spectral moments (averaged from the five window locations) served as the dependent variable. The averaged spectral moments values are shown in Table 3.1.


Table 3.1. Mean values of spectral moments for voiceless continuants averaged across speakers, window locations, and vowel contexts.

Consonant               Mean (Hz)     Standard deviation (Hz)   Skewness      Kurtosis
[s]                     6,374         2,167                     0.297         0.77
[sˤ]                    6,101         2,215                     0.337         0.49
[χ]                     4,839         2,536                     0.832         1.61
[ħ]                     3,416         2,198                     1.918         5.37
F value (df = 3,536)    250.368***    20.488***                 143.418***    82.103***

*** p < .001

A main effect of consonant types on spectral mean is obtained [F(3,536) = 250.368, p < 0.001; R² = 0.581]. The results, along with subsequent Scheffé post hoc comparisons, show that the spectral mean values for the two voiceless alveolars [s] and [sˤ] are significantly higher than those of the rest of the voiceless continuants (p < 0.001). The spectral mean of [s] is 273 Hz higher than that of [sˤ], though this difference is not significant (p > 0.16). Among the voiceless continuants, the lowest spectral mean value belongs to the voiceless pharyngeal [ħ], distinguishing it from the other three continuants (p < 0.001). The voiceless uvular [χ] is also distinguished from the other voiceless continuants by its mid-range spectral mean value (p < 0.001). So, while the averaged spectral mean value is incapable of distinguishing the two alveolars from each other, it succeeds in distinguishing the three points of articulation (alveolar vs. uvular vs. pharyngeal). There is also a main effect of consonant types on spectral standard deviation [F(3,536) = 20.488, p < 0.001; R² = 0.098]. However, only [χ] is distinguished from all other voiceless continuants by its relatively high standard deviation (p < 0.001). The remaining three voiceless continuants, [s], [sˤ], and [ħ], are not significantly different from each other (p > 0.85). There is a main effect of consonant types on the spectral skewness of voiceless continuants [F(3,536) = 143.418, p < 0.001; R² = 0.442]. The two lowest skewness values distinguish the two alveolars [s] and [sˤ] from the rest of the continuants (p < 0.001), but not from each other (p > 0.97). The mid-range skewness value of [χ] and the high value of [ħ] succeed in distinguishing these two sounds (p < 0.001 for both). A main effect of consonant types on the spectral kurtosis of voiceless continuants is obtained [F(3,536) = 82.103, p < 0.001; R² = 0.311]. The pharyngeal [ħ] has a notably high kurtosis value while [χ] has a mid-range value and [sˤ] has the lowest value. Kurtosis distinguishes these three sounds from each other (p < 0.05). Meanwhile, the spectral kurtosis value for [s] falls in between those for [sˤ] and [χ] and fails to distinguish it from either of those two sounds.
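The degrees of freedom and effect sizes reported above can be cross-checked from the F values alone. With 4 consonants x 5 subjects x 9 vowel contexts x 3 repetitions there are 540 tokens, giving 3 and 536 degrees of freedom. The text does not say how R² was computed; the sketch below shows that the reported figures are numerically consistent with adjusted R², which is an inference rather than something the source states. The function names are hypothetical.

```python
def r_squared_from_f(f_value, df_between, df_within):
    """R^2 = SS_between / SS_total, recovered from a one-way ANOVA F value."""
    ss_ratio = f_value * df_between
    return ss_ratio / (ss_ratio + df_within)


def adjusted_r_squared(r2, df_between, df_within):
    """Adjusted R^2 = 1 - (1 - R^2) * (N - 1) / (N - k)."""
    n_minus_1 = df_between + df_within
    return 1 - (1 - r2) * n_minus_1 / df_within


tokens = 4 * 5 * 9 * 3          # consonants x subjects x vowel contexts x repetitions
df_between, df_within = 4 - 1, tokens - 4

# Spectral mean and spectral standard deviation F values reported in the text.
adj_mean = adjusted_r_squared(
    r_squared_from_f(250.368, df_between, df_within), df_between, df_within)
adj_sd = adjusted_r_squared(
    r_squared_from_f(20.488, df_between, df_within), df_between, df_within)
```

Under this reading the spectral mean accounts for roughly 58% of the variance among the four voiceless continuants, while standard deviation accounts for only about 10%, even though both effects are significant at p < 0.001.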

Figure 3.3 shows the spectral moments values at the five sampling windows for

the four voiceless continuants. It is clear from inspecting the spectral mean, spectral

skewness, and spectral kurtosis graphs that, for voiceless continuants, the locations where

the continuants are, from an acoustic point of view, most visibly distinguished from

neighboring vowels are at windows 2 and 3. It is also at these two window locations that

the widest acoustic dispersion across the various voiceless continuants is realized. This is

a clear indication that the canonical spectral shapes are achieved at the first half of the

voiceless continuant. For this reason, we will concern ourselves mostly with the spectral

moments values averaged from window locations 2 and 3 from this point forward in this

work. Four ANOVA tests similar to the ones discussed above were conducted for the

same sounds, only this time the values of the spectral moments averaged from windows 2

and 3 were used as the dependent variables. The spectral moments values are shown in

Table 3.2. Box plots of those values are shown in Figure 3.4.


Figure 3.3. Spectral moments values for voiceless continuants at the five sampling window locations. The values are averaged across subjects and vowel contexts.

[Four panels (spectral mean, spectral standard deviation, spectral skewness, spectral kurtosis), each plotted against Window Location 1-5. Legend: [s], [sˤ], [χ], [ħ].]


Table 3.2. Mean values of spectral moments for voiceless continuants averaged from windows 2 and 3 and across speakers and vowel contexts.

Consonant               Mean (Hz)     Standard deviation (Hz)   Skewness      Kurtosis
[s]                     7,540         2,002                     -0.162        0.16
[sˤ]                    7,327         2,058                     -0.008        -0.07
[χ]                     5,909         2,635                     0.377         0.25
[ħ]                     3,398         2,003                     2.132         6.48
F value (df = 3,536)    385.540***    59.934***                 154.587***    100.345***

*** p < .001

Figure 3.4. Box plots of the distributions of the spectral moments scores for the four voiceless continuants [s, sˤ, χ, ħ]. The scores are averaged across subjects and vowel contexts.

Main effects of consonant type are obtained for spectral mean [F(3,536) = 385.540, p < 0.001; R² = 0.682], spectral standard deviation [F(3,536) = 59.934, p < 0.001; R² = 0.247], spectral skewness [F(3,536) = 154.587, p < 0.001; R² = 0.461], and spectral kurtosis [F(3,536) = 100.345, p < 0.001; R² = 0.356]. The two sibilants are significantly higher in spectral mean than the two nonsibilants (p < 0.001) but are not significantly different from each other (p > 0.4). The uvular [χ] has a spectral mean that is significantly higher than that of [ħ] (p < 0.001). Standard deviation is not capable of distinguishing the two sibilants and [ħ] from each other (p > 0.8). However, the uvular [χ] has a significantly higher standard deviation than the other sounds (p < 0.001). The two sibilants are, again, indistinguishable based on their skewness values (p > 0.6). All other pair-wise comparisons show statistically significant differences (p < 0.05). Kurtosis cannot distinguish the two sibilants from each other or from [χ] (p > 0.9). The pharyngeal [ħ], on the other hand, has a markedly high kurtosis value that distinguishes it from the other sounds in the group (p < 0.001).

3.3.1.2 Voiceless Continuants - Individual Subjects

To investigate the variability in the rankings of the four voiceless continuants in

terms of their spectral moments values across the individual subjects, each spectral mo-

ment (averaged from windows 2 and 3) was used as the dependent variable in an

ANOVA test conducted for the four voiceless continuants across vowel contexts for each

single subject. This yielded a total of 20 ANOVAs (4 tests x 5 subjects). Figure 3.5 con-

tains 20 box plots showing the distributions of each of the four spectral moments per sub-

ject.

The results show that the two sibilants [s] and [sˤ] are not statistically different from each other in the majority of pair-wise comparisons regardless of the spectral moment being investigated. However, there are two exceptions that merit elaboration. Subjects 2 and 5 show that [s] has a significantly higher spectral mean than that of [sˤ]. Further scrutiny showed that this is attributable to window 2, which covers the beginning of the continuant. For both subjects, spectral mean at window 2 is significantly higher for [s] than for [sˤ]. It is safe to assume that the difference displayed at window 2 by both subjects is due to stronger vowel-consonant coarticulatory influence. By comparison, at window 3, which covers the middle of the continuant, both subjects show no statistical difference between the spectral mean values of the two sibilants. This window location is at the portion of the continuant furthest away from any vocalic influence (in VCV contexts).

Figure 3.5. Box plots showing the distributions of the four voiceless continuants' spectral moments scores for each of the five individual subjects.

[Panel columns: Mean (kHz), St. deviation (kHz), Skewness, Kurtosis; one row of panels per subject.]

The pharyngeal [ħ] can be strongly distinguished from other voiceless continuants by its spectral mean, skewness, and kurtosis values. For all five subjects, this sound has the lowest spectral mean and the highest skewness and kurtosis. The uvular [χ] can, in most cases, be distinguished from the other sounds by its mid-range spectral mean value. This sound also has either the highest or one of the highest standard deviation values. Only subject 1 shows [χ] to have a statistically higher kurtosis than [s, sˤ], while all other subjects show no differences. As for spectral skewness, the difference between [χ] on the one hand and [s, sˤ] on the other depends largely on the subject.

3.3.1.3 Voiceless Continuants - Specific Vowel Contexts

To test the vowel context-based variability in the rankings of the four voiceless continuants, a set of ANOVA tests was conducted that directly addresses each individual vowel context across subjects. A total of 36 ANOVAs were conducted (4 moments x 9 vowel contexts).

The first thing to note here is that, in all nine vowel environments, the two sibilants [s, sˤ] are not statistically different from each other in any of their spectral moments values. Like the pair-wise comparisons results of the pooled data, the two sibilants, generally speaking, enjoy the highest spectral means and [ħ] has the lowest. There are important systematic differences, however, between some vowel contexts on the one hand and the pooled data on the other as far as the spectral mean ranking of the uvular [χ] is concerned. Whenever the vowel [u] occupies either or both vowel slots in the test word, the spectral mean of [χ] becomes no different (p > 0.1) or only marginally different (p < 0.1) from the spectral mean values of the sibilants. It is possible that the tongue position during the articulation of the back vowel [u], which takes place near the point of articulation of [χ], amplifies the tongue retraction during [χ], bringing the back of the tongue dorsum closer to the soft palate and further narrowing the oral tract at that specific point. This, in turn, intensifies the air stream turbulence during the articulation of [χ], giving this sound substantial energy components at the high frequencies.

The previous explanation regarding the vowel-uvular coarticulation appears more plausible when we look at the spectral skewness results. While [ħ] always has the highest skewness, in most vowel environments [χ] is not statistically different from the sibilants. However, in the environments iCi and aCi, [χ] has a significantly higher skewness than [s]. Since the vowel [i] involves an advanced tongue position, we would expect it to cause a coarticulatory effect on [χ] that is the opposite of that of [u]. A higher skewness value means that the power spectrum curve is skewed upwards at the lower frequencies. Since [χ] is moved away from its back point of articulation, creating a wider constriction, this yields less intense frication, dimming the energy components at the higher frequencies. This explanation is not without its flaws, however. Notice that in the environment i-a the skewness of [χ] is not statistically higher than those of the sibilants. As for the environments iCu and uCi, we could argue that the tongue-retraction coarticulatory effect of [u] overrides the effect of [i]. How such contradicting effects are resolved is a complicated matter that lies beyond the focus of this work.

In terms of spectral standard deviation, the two sibilants [s, sˤ] and the pharyngeal [ħ] are never statistically different from each other, which is the same result as that of the pooled data. As for [χ], in most cases this sound has the highest standard deviation. In the presence of the vowel [u], this sound ranges in pair-wise comparisons from being no different from other sounds to being significantly higher. Meanwhile, pair-wise comparisons of the spectral kurtosis results are always similar to those of the pooled data: [ħ] is significantly the highest while the remaining sounds are no different from each other.

3.3.1.4 Voiceless Continuants - Discriminant Analysis

Discriminant analysis is used to classify the different voiceless continuants being investigated based on the known values of their spectral moments averaged from sampling windows 2 and 3. The results displayed in Table 3.3 show that the overall correct classification rate is over 71%. While this general rate is somewhat high, the individual rates of classification of the four continuants paint two contrasting pictures of the two gutturals on the one hand and the two sibilants on the other. The respective correct classification rates for the uvular [χ] and the pharyngeal [ħ] are more than 85% and 91%. The two sibilants, on the other hand, are correctly classified in no more than 57.8% of their actual instances while being misclassified as each other in at least 37% of the cases. This is not unexpected given the above-discussed statistical proximities between the two sibilants in their spectral moment values. So, excluding differences based on secondary places of articulation, the discriminant analysis of voiceless continuants based on the values of their spectral moments is quite capable of profiling their primary places of articulation.

Table 3.3. Results of the discriminant analysis for the voiceless continuants based on the four spectral moments' values combined together as predictors. The moments' values are averaged from sampling windows 2 and 3 and across speakers and vowel contexts. The numbers represent the totals and percentages of correctly classified sounds.

                       Predicted Group Membership
Original   Consonant   [ħ]      [s]      [sˤ]     [χ]      Total
Count      [ħ]         123      0        0        12       135
           [s]         0        68       58       9        135
           [sˤ]        0        50       78       7        135
           [χ]         13       2        5        115      135
%          [ħ]         91.1     0.0      0.0      8.9      100
           [s]         0.0      50.4     43.0     6.7      100
           [sˤ]        0.0      37.0     57.8     5.2      100
           [χ]         9.6      1.5      3.7      85.2     100

71.1% of original grouped cases correctly classified.
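The classification rates quoted in this section follow directly from the counts in Table 3.3. A minimal sketch of that arithmetic is given below; the descriptive labels and the function name `classification_rates` are hypothetical, and the counts are copied from the table (rows are actual consonants, columns are predicted consonants, in the order pharyngeal, plain sibilant, emphatic sibilant, uvular).

```python
LABELS = ["pharyngeal", "plain sibilant", "emphatic sibilant", "uvular"]

# Counts from Table 3.3: one row per actual consonant, one column per prediction.
COUNTS = [
    [123, 0, 0, 12],
    [0, 68, 58, 9],
    [0, 50, 78, 7],
    [13, 2, 5, 115],
]


def classification_rates(counts):
    """Return (per-class correct rates in %, overall correct rate in %)."""
    per_class = [100 * row[i] / sum(row) for i, row in enumerate(counts)]
    correct = sum(counts[i][i] for i in range(len(counts)))
    overall = 100 * correct / sum(sum(row) for row in counts)
    return per_class, overall


per_class, overall = classification_rates(COUNTS)
rates_by_label = {label: round(rate, 1) for label, rate in zip(LABELS, per_class)}
```

Nearly all of the errors sit in the two sibilant rows: 58 + 50 = 108 of the 156 misclassified tokens are confusions between the plain and the emphatic sibilant.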

The standardized canonical discriminant function coefficients were analyzed to

estimate the contribution of each spectral moment value to the aforementioned classifica-

tion rates. Spectral mean values stand out as the most prominent predictors. On the con-

trary, spectral standard deviation does not contribute significantly to the classification of

the four voiceless continuants.

3.3.1.5 Voiced Continuants - Pooled Data

A set of four ANOVA tests was conducted for the four voiced continuants, [ð], [ðˤ], [ʁ], and [ʕ], across the five subjects and the nine vowel contexts. In each test one of the four spectral moments (averaged from the five window locations) served as the dependent variable. The averaged spectral moments values are shown in Table 3.4.

Table 3.4. Mean values of spectral moments for voiced continuants averaged across speakers, window locations, and vowel contexts.

Consonant               Mean (Hz)     Standard deviation (Hz)   Skewness      Kurtosis
[ð]                     4,290         2,578                     0.852         2.20
[ðˤ]                    3,640         2,614                     0.941         2.30
[ʁ]                     3,702         2,432                     1.126         3.97
[ʕ]                     2,038         1,206                     3.505         30.92
F value (df = 3,536)    84.784***     146.594***                177.321***    142.564***

*** p < .001

A main effect of consonant types on spectral mean is obtained [F(3,536) = 84.784, p < 0.001; R² = 0.318]. The highest spectral mean value, belonging to the interdental [ð] (p < 0.01), and the lowest value, belonging to the pharyngeal [ʕ] (p < 0.001), distinguish them from the remaining voiced continuants. The emphatic interdental [ðˤ] and the voiced uvular [ʁ] are not distinct from each other (p > 0.98). A main effect of consonant types on standard deviation is also observed for voiced continuants [F(3,536) = 146.594, p < 0.001; R² = 0.448]. The voiced pharyngeal, [ʕ], has the lowest spectral standard deviation, which distinguishes it from the rest of the voiced continuants (p < 0.001). Standard deviation fails to distinguish the other three continuants from each other (p > 0.14). A main effect of consonant types on spectral skewness is also observed for voiced continuants [F(3,536) = 177.321, p < 0.001; R² = 0.495]. Again, only [ʕ] is distinguished from the rest of the continuants by its high spectral skewness (p < 0.001). The remaining three voiced continuants are not distinguishable from each other (p > 0.24). A main effect of consonant type on spectral kurtosis is obtained as well [F(3,536) = 142.564, p < 0.001; R² = 0.441]. The exceptionally high 30.9 kurtosis value for [ʕ] distinguishes this sound from the rest (p < 0.001). Kurtosis fails to make any other significant distinctions among voiced continuants.

Figure 3.6. Spectral moments values for voiced continuants at the five sampling window locations. The values are averaged across subjects and vowel contexts.

[Four panels (spectral mean, spectral standard deviation, spectral skewness, spectral kurtosis), each plotted against Window Location 1-5. Legend: [ð], [ðˤ], [ʁ], [ʕ].]

Figure 3.6 shows the spectral moments values at the five sampling windows for

the four voiced continuants. It is clear from inspecting the graphs, particularly those of

spectral mean and spectral skewness, that voiced continuants distinguish themselves

acoustically from neighboring vowels and from each other at sampling windows 3 and 4,

which cover the middle and the end of the continuant, respectively. So, the canonical

spectral shapes of voiced continuants are achieved at the second half of the sound. Hence,

in the ensuing discussion, we will concern ourselves with the spectral moments values

averaged from window locations 3 and 4. Another set of four ANOVA tests was conducted

for the same sounds across subjects and vowel contexts. In each one of these tests,

one of the spectral moments averaged from windows 3 and 4 was the dependent variable.

The mean values of these spectral moments are shown in Table 3.5, and their distribu-

tions are shown in the box plots in Figure 3.7.

Table 3.5. Mean values of spectral moments for voiced continuants averaged from windows 3 and 4 and across speakers and vowel contexts.

Consonant               Mean (Hz)     Standard deviation (Hz)   Skewness      Kurtosis
[ð]                     5,270         2,818                     0.364         0.80
[ðˤ]                    4,612         2,900                     0.501         1.20
[ʁ]                     4,131         2,592                     0.982         2.79
[ʕ]                     2,001         1,221                     3.601         31.99
F value (df = 3,536)    99.789***     153.151***                176.579***    136.494***

*** p < .001


Figure 3.7. Box plots of the distributions of the spectral moments scores for the four voiced continuants [ð, ðˤ, ʁ, ʕ]. The scores are averaged across subjects and vowel contexts.

Main effects of consonant type are obtained for spectral mean [F(3,536) = 99.789, p < 0.001; R² = 0.355], spectral standard deviation [F(3,536) = 153.151, p < 0.001; R² = 0.459], spectral skewness [F(3,536) = 176.579, p < 0.001; R² = 0.494], and spectral kurtosis [F(3,536) = 136.494, p < 0.001; R² = 0.430]. The interdental [ð] is distinguished by the highest spectral mean value (p < 0.05), while the pharyngeal [ʕ] is distinguished by the lowest value (p < 0.001). The emphatic interdental [ðˤ] and the uvular [ʁ] fall in between and are not distinct from each other. In terms of spectral standard deviation, the two interdentals [ð, ðˤ] have the highest values. The pharyngeal [ʕ] is distinguished from all sounds by its low standard deviation (p < 0.001). The uvular [ʁ] has a mid-range value that distinguishes it from other sounds except [ð], whose standard deviation is only marginally higher (p < 0.1) than that of [ʁ]. The two interdentals have the lowest skewness and are not statistically different in that metric. Meanwhile, the high skewness of [ʕ] and the mid-range skewness of [ʁ] distinguish these two sounds in all pair-wise comparisons (p < 0.05). The pharyngeal [ʕ] has a very high kurtosis, which distinguishes it from all other sounds (p < 0.001). Kurtosis values of the other three sounds are not statistically different.

3.3.1.6 Voiced Continuants - Individual Subjects

Variations in the voiced continuants' spectral moments rankings between the five

subjects were examined by means of 20 ANOVA tests (4 moments x 5 subjects) conducted

for the four voiced continuants across vowel contexts. In each test the dependent

variable is one of the four spectral moments averaged from windows 3 and 4. Figure 3.8

includes 20 box plots showing the score distributions of each of the four spectral mo-

ments for each subject.

Only subject 3 shows [ð] having a statistically higher spectral mean than [ðˤ]. All other subjects show no statistical difference between spectral means of the two sounds. This is quite interesting since, based on the pooled results in the previous section, spectral mean is the only metric that reflects any statistical difference between the two sounds. All subjects show [ʕ] as having the lowest spectral mean, although, for subject 4, [ʕ] is not statistically lower than [ʁ]. The ranking of the spectral mean value of [ʁ] shows no stable pattern and depends on the subject. All five subjects show no difference between [ð] and [ðˤ] in terms of spectral standard deviation. Also, with the exception of subject 5, the subjects show no statistical difference between [ð, ðˤ] and [ʁ]. The pharyngeal [ʕ], on the other hand, is statistically shown to have the lowest standard deviation by all subjects. The two interdentals are also not statistically distinct from each other in terms of their spectral skewness as shown by all subjects. The five subjects also show that [ʕ] always has a significantly higher skewness than all other sounds. As for [ʁ], as was the case for spectral standard deviation, its skewness ranking is subject-dependent. All five subjects show that the spectral kurtosis of [ʕ] is significantly higher than that of any of the other three sounds. The kurtosis values of [ð], [ðˤ], and [ʁ], on the other hand, do not show any significant statistical differences for any of the subjects.

Figure 3.8. Box plots showing the distributions of the four voiced continuants' spectral moments scores for each of the five individual subjects.

3.3.1.7 Voiced Continuants - Specific Vowel Contexts

The vowel context-conditioned variations in the rankings of the spectral moments were examined by means of a set of 36 ANOVA tests conducted for the four voiced continuants across subjects. Each individual test compares the values of one of the four spectral moments in one of the nine vowel contexts (4 moments x 9 vowel contexts).

In almost all vowel contexts, the two interdentals [ð, ðˤ] are not statistically different from each other in terms of their spectral mean. The spectral mean of the uvular [ʁ] is not statistically different from at least one of the two interdentals in all contexts. The pharyngeal [ʕ], on the other hand, is statistically distinguished from the other three sounds by having the lowest mean in almost all vowel contexts. The pharyngeal [ʕ] is also significantly distinguished from the other sounds by having the lowest spectral standard deviation, the highest spectral skewness, and the highest spectral kurtosis in all vowel contexts. The other three sounds are mostly not significantly distinct from each other in the values of those three metrics.



3.3.1.8 Voiced Continuants - Discriminant Analysis

To assess the power of spectral moments in classifying the four voiced continuants, a discriminant analysis test was conducted. The predictors used in the test were the four spectral moments values averaged from sampling windows 3 and 4. Table 3.6 shows that the overall correct classification rate for voiced continuants is quite poor (51.7%). Only [ʕ] enjoys a high classification rate (82.2%), while [ð], [ðˤ], and [ʁ] are frequently confused with each other. Among the latter three continuants, cases of misclassification as one another are near, or above, the 25% chance level (as there are four categories in the classification function). None of those three sounds is accurately classified in more than 48.9% of its actual occurrences. Meanwhile, misclassifications of [ʕ] as one of the three sounds [ð, ðˤ, ʁ], and vice versa, are all below the 25% chance level.

Table 3.6. Results of the discriminant analysis for the voiced continuants based on the four spectral moments' values combined together as predictors. The moments' values are averaged from sampling windows 3 and 4 and across speakers and vowel contexts. The numbers represent the totals and percentages of correctly classified sounds.

                      Predicted Group Membership
          Consonant   [ʕ]     [ð]     [ðˤ]    [ʁ]     Total
Original
Count     [ʕ]         111     1       1       22      135
          [ð]         2       66      32      35      135
          [ðˤ]        5       42      49      39      135
          [ʁ]         15      31      36      53      135
%         [ʕ]         82.2    0.7     0.7     16.3    100
          [ð]         1.5     48.9    23.7    25.9    100
          [ðˤ]        3.7     31.1    36.3    28.9    100
          [ʁ]         11.1    23.0    26.7    39.3    100
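The percentages and the overall rate in Table 3.6 follow directly from the counts, which the short sketch below recomputes. It is a reading aid rather than part of the original analysis: the names `COUNTS`, `row_percentages`, and `overall_rate` are illustrative, and the two counts of 1 in the first row are inferred from the 0.7% cells (1/135), since the extracted table does not show them legibly.

```python
# Counts from Table 3.6 (rows: actual consonant, columns: predicted consonant).
# Assumption: the two 1s in the first row are inferred from the 0.7% cells.
LABELS = ["ʕ", "ð", "ðˤ", "ʁ"]
COUNTS = [
    [111, 1, 1, 22],
    [2, 66, 32, 35],
    [5, 42, 49, 39],
    [15, 31, 36, 53],
]


def row_percentages(counts):
    """Convert each row of counts to percentages of that row's total."""
    return [[round(100 * n / sum(row), 1) for n in row] for row in counts]


def overall_rate(counts):
    """Percentage of tokens on the diagonal, i.e. correctly classified."""
    correct = sum(counts[i][i] for i in range(len(counts)))
    total = sum(sum(row) for row in counts)
    return 100 * correct / total


# With four equally sized categories, chance performance is 25%.
chance_level = 100 / len(LABELS)

for label, row in zip(LABELS, row_percentages(COUNTS)):
    print(label, row)
print(round(overall_rate(COUNTS), 1), chance_level)
```

The diagonal sums to 279 of 540 tokens, which reproduces the 51.7% overall rate quoted in the text.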



Based on the standardized canonical discriminant function coefficients, none of the four moments contributes significantly to the classification function. This is not surprising given that the overall classification percentage is quite low to start with.

3.3.1.9 Voiceless Stops - Pooled Data

Another set of four ANOVA tests was conducted for the four voiceless stops [t], [tˤ], [k], and [q] across the five subjects and the nine vowel contexts. In each test one of the four spectral moments (averaged from the two window locations) served as the dependent variable. The averaged spectral moments values are shown in Table 3.7.

Table 3.7. Mean values of spectral moments for voiceless stops averaged across speakers, window locations, and vowel contexts.

Consonant              Mean (Hz)    Standard deviation (Hz)   Skewness    Kurtosis
[t]                    4,986        2,433                     0.937       2.28
[tˤ]                   3,870        2,116                     1.159       3.60
[k]                    4,244        2,515                     1.282       3.93
[q]                    3,821        2,300                     0.982       3.36
F value (df = 3,536)   23.670***    8.583***                  4.102**     3.629*

* p < .05; ** p < .01; *** p < .001

A main effect of consonant type on spectral mean is observed for voiceless stops [F(3,536) = 23.670, p < 0.001; R² = 0.112]. Only [t] is significantly distinguished from all other stops by its high spectral mean (p < .001). A smaller distinction exists between [q] and [k], in which the spectral mean of the former is marginally lower than that of the latter (p < 0.1). Spectral mean is unable to distinguish [tˤ] from either [k] or [q] (p > 0.12). There is a main effect of consonant type on the spectral standard deviation of voiceless stops [F(3,536) = 8.583, p < 0.001; R² = 0.040]. The emphatic stop [tˤ] is distinguished from [t] and [k] by the lowest standard deviation value (p < 0.01). The latter two are not significantly different from each other (p > 0.8). The mid-range spectral standard deviation value of [q] is not significantly different from that of either [t] or [tˤ] (p > 0.18) and is only marginally lower than that of [k] (p < 0.1). A main effect of consonant type on the spectral skewness of voiceless stops [F(3,536) = 4.102, p < 0.01; R² = 0.017] is obtained. The only significant distinction made by skewness is between [t] and [k], in which the former has a significantly lower spectral skewness than the latter (p < 0.05). There is a less significant distinction whereby the spectral skewness of [q] is marginally lower than that of [k] (p < 0.01). No other distinctions are made by the spectral skewness of stops. A main effect of consonant type on spectral kurtosis is obtained for voiceless stops [F(3,536) = 3.629, p < 0.05; R² = 0.014]. Kurtosis makes only one distinction, between [t] and [k] (p < 0.05). The remaining pair-wise comparisons show no significant differences.
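The effect sizes quoted alongside each F value can be related to F and its degrees of freedom. The sketch below is an inference rather than something the text states: it assumes the reported R² figures are adjusted R² values, and the function names are illustrative. Under that assumption the four reported values are reproduced to three decimals.

```python
def r_squared_from_f(f_value, df_between, df_within):
    """Proportion of variance explained implied by a one-way ANOVA F value."""
    return (f_value * df_between) / (f_value * df_between + df_within)


def adjusted_r_squared(r_squared, df_between, df_within):
    """Adjust R squared for the number of groups (N - 1 = df_between + df_within)."""
    n_minus_1 = df_between + df_within
    return 1 - (1 - r_squared) * n_minus_1 / df_within


# F values from the pooled voiceless-stop tests, with the R squared value
# quoted next to each one in the text.
REPORTED = {
    "mean": (23.670, 0.112),
    "standard deviation": (8.583, 0.040),
    "skewness": (4.102, 0.017),
    "kurtosis": (3.629, 0.014),
}

recomputed = {}
for moment, (f_value, quoted) in REPORTED.items():
    plain = r_squared_from_f(f_value, 3, 536)
    recomputed[moment] = adjusted_r_squared(plain, 3, 536)
    print(moment, round(plain, 3), round(recomputed[moment], 3), quoted)
```

The unadjusted values come out slightly higher (for example, about 0.117 for spectral mean), which is why the adjusted form is the more plausible reading of the reported figures.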

Figure 3.9 shows the four spectral moments values at both sampling window locations for the four voiceless stops. When comparing the results of the spectral mean generated at windows 1 and 2 in the case of voiceless stops, it is evident that window 1 is the location where all stops distinguish themselves from the following vowel. It is clear that the spectral means of all four stops are higher at the stop release than at the stop-vowel junction. Unlike the case of continuants, however, the remaining spectral moments do not reflect a similar trend. Nevertheless, because window 1 covers only the stop release, and because of the aforementioned fact regarding spectral mean, our attention in the following discussion will concentrate on this window location alone as the location of the canonical spectral shape of the stop. The spectral moments averaged from window 1 are listed in Table 3.8. Figure 3.10 shows the box plots that illustrate the distribution of the spectral moments scores for the four voiceless stops as calculated from that window location. For each spectral moment, an ANOVA test was conducted for the four voiceless stops across subjects and vowel contexts with that moment (calculated at window 1) as the dependent variable.

Figure 3.9. Spectral moments values for voiceless stops ([t], [tˤ], [k], [q]) at the two sampling window locations. The values are averaged across subjects and vowel contexts.

Table 3.8. Mean values of spectral moments for voiceless stops calculated at window 1 and averaged across speakers and vowel contexts.

Consonant              Mean (Hz)    Standard deviation (Hz)   Skewness     Kurtosis
[t]                    5,872        2,324                     0.825        1.68
[tˤ]                   4,544        1,956                     1.403        4.84
[k]                    4,784        2,569                     1.050        2.38
[q]                    4,603        2,499                     0.649        1.52
F value (df = 3,536)   24.504***    19.627***                 10.212***    12.809***

*** p < .001

The tests show main effects of consonant type on spectral mean [F(3,536) = 24.504, p < 0.001; R² = 0.116], spectral standard deviation [F(3,536) = 19.627, p < 0.001; R² = 0.094], spectral skewness [F(3,536) = 10.212, p < 0.001; R² = 0.049], and spectral kurtosis [F(3,536) = 12.809, p < 0.001; R² = 0.062]. Spectral mean is only capable of distinguishing [t] from the rest of the stops. This plain stop has the highest spectral mean (p < 0.001). The remaining stops are not significantly different from each other. Standard deviation can only distinguish the emphatic [tˤ] from all other stops (p < 0.01). The remaining stops are not statistically different aside from the fact that [t] has a marginally lower standard deviation than [k] (p < 0.1). The spectral skewness of [tˤ] is significantly higher than those of [t] and [q] (p < 0.01). The latter two are not distinct from each other. The skewness of [k] has a mid-range value that is not statistically different from those of [t] and [tˤ] and only marginally higher than that of [q] (p < 0.1). The spectral kurtosis of [tˤ] is significantly higher than the kurtosis of any other stop (p < 0.01). The other three stops are not statistically different.

Figure 3.10. Box plots of the distributions of the spectral moments scores for the four voiceless stops [t, tˤ, k, q]. The scores are averaged across subjects and vowel contexts.

3.3.1.10 Voiceless Stops - Individual Subjects

Variations in the rankings of the spectral moments between the individual subjects were examined by means of 20 ANOVA tests (4 moments x 5 subjects) conducted across vowel contexts. In each test the dependent variable is one of the four spectral moments (calculated at window 1). Figure 3.11 includes 20 box plots showing the score distributions of each of the four spectral moments for each subject.

Figure 3.11. Box plots showing the distributions of the spectral moments scores of the four voiceless stops for each of the five individual subjects.

Aside from subject 3, all subjects show that [t] has a significantly higher spectral mean than its emphatic counterpart [tˤ]. Subject 3 shows that the two are not statistically different. It should be noted that the voice onset time (VOT) produced by subject 3 following [t] (71 ms, on average) was markedly longer than that of the other subjects (from 30 to 49 ms, on average). The VOT following [tˤ] for subject 3 (19 ms, on average), meanwhile, was in line with those of the other subjects (from 10 to 20 ms, on average). It is possible that the extra duration between the stop release and the start of the vowel for subject 3 has the effect of distancing and weakening the coarticulatory effects of the following vowel. These effects should be able to enhance the difference between the acoustic signatures of the plain [t] and the emphatic [tˤ]. This way, the acoustic differences between these two stops seem to be neutralized for subject 3. There are no stable patterns for the ranking of spectral standard deviation scores of the four stops across the five subjects. Standard deviation rankings of voiceless stops seem to be highly subject-dependent. The same is somewhat true regarding the rankings of spectral skewness. However, the skewness of the emphatic stop [tˤ] is either statistically the highest or among the highest. Kurtosis results are quite similar to those of skewness: [tˤ] has either the highest or one of the highest values, while the rankings of the other stops are subject-dependent.

For the most part, there is a great deal of inter-subject variability in the rankings of the voiceless stops' spectral moments generated from the release portion. It is possible that the variability in the duration of the release among the four stops results in substantial variability in the frequency resolution captured by the analysis window. We should keep in mind, though, that spectral moments are not the only method used in this experiment. These four stops are tested again in §3.3.2.3 using the multi-band spectral analysis.

3.3.1.11 Voiceless Stops - Specific Vowel Contexts

Variations in spectral moments rankings based on vowel contexts were also inspected by means of a set of 36 ANOVA tests (4 moments x 9 vowel contexts) conducted across subjects. Each individual test compares the values of one of the four spectral moments for the four voiceless stops in one of the nine vowel contexts.

The identity of the following vowel causes some consistent variation in the values and rankings of the four spectral moments. This is to be expected given the fact that the stop release is occasionally a rather short acoustic landmark that resides close to the coarticulatory influence of the ensuing vowel. The results show that the values and rankings of the stop release spectral moments obtained before the two vowels [i, a] are different from those obtained before [u]. When the following vowel is either [i] or [a], the spectral mean of [t] is mostly significantly higher than those of the other stops, which are not statistically different from each other. When the following vowel is [u], none of the stops are statistically different from each other. The spectral standard deviations of the four voiceless stops are mostly not significantly different from each other before the vowels [i, a]. Before [u], the spectral standard deviations of the two back stops [k] and [q] are mostly significantly higher than those of the two alveolars [t] and [tˤ]. The velar stop [k] generally has the highest spectral skewness before the vowels [i, a], but not always significantly. The remaining stops are mostly not different from each other in those environments. Before [u], the two alveolar stops almost always have significantly higher spectral skewness than the two back stops. Before [i, a], either the spectral kurtosis of [k] is statistically the highest or all stops are no different. Before [u], on the other hand, [tˤ] always has the highest kurtosis while the remaining stops are not different from each other.

It is clear, then, that neither spectral standard deviation nor spectral skewness reflects any consistent acoustic differences between [t] and [tˤ]. Spectral mean does reflect a distinction between these two stops, but only before [i] and [a]. The only spectral moment that always reflects a distinction is kurtosis. The emphatic [tˤ] has a consistently higher kurtosis than its plain counterpart [t]. When taking into account the rather consistent vowel effects along with the high inter-subject variability in the values and rankings of the spectral moments of the four voiceless stops, it is possible that a subject x vowel (i/a vs. u) x consonant analysis would produce better results. However, this pursuit lies outside the main purpose of this study and is better left for future consideration.

3.3.1.12 Voiceless Stops - Discriminant Analysis

Voiceless stops fare poorly as a group in discriminant analysis functions. As Table 3.9 shows, only [t] is correctly classified in a substantial number of cases (more than 74%), while the overall classification percentage is low (50.9%). Spectral mean and spectral skewness emerge as the only major contributors in the classification. The importance of spectral mean is most likely a result of its rather stable relation with [t]. This stop is consistently associated with high spectral mean. This is the likely reason for its comparatively high individual classification rate. The importance of spectral skewness, meanwhile, is most likely due to its slightly notable ability to distinguish alveolar stops from back stops.


Table 3.9. Results of the discriminant analysis for the voiceless stops based on the four spectral moments' values combined together as predictors. The moments' values are calculated from sampling window 1 and across speakers and vowel contexts. The numbers represent the totals and percentages of correctly classified sounds.

                      Predicted Group Membership
          Consonant   [k]     [q]     [t]     [tˤ]    Total
Original
Count     [k]         37      35      21      42      135
          [q]         37      75      5       18      135
          [t]         10      0       101     24      135
          [tˤ]        19      27      27      62      135
%         [k]         27.4    25.9    15.6    31.1    100
          [q]         27.4    55.6    3.7     13.3    100
          [t]         7.4     0.0     74.8    17.8    100
          [tˤ]        14.1    20.0    20.0    45.9    100


3.3.1.13 Voiced Stops - Pooled Data

As there are only two voiced stops in this study, [d] and [dˤ], ANOVA is not a viable choice since post hoc comparisons require a minimum of three levels of the independent variable. Therefore, four t-tests are used to compare the means of the values of each of the four spectral moments for both stops across subjects and vowel contexts. The values are listed in Table 3.10. There are significant differences between the two stops in spectral mean [t = 6.464 (df = 268), p < 0.001], skewness [t = 5.137 (df = 268), p < 0.001], and kurtosis [t = 3.168 (df = 268), p < 0.01]. The spectral standard deviation values of the two voiced stops are not significantly different [t = 0.253 (df = 268), p > 0.8]. Figure 3.12 shows the spectral moments values at the two window locations for the two voiced stops.
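For two groups, the comparison reduces to a two-sample t statistic. The sketch below assumes the equal-variance (pooled) form, which is an inference from the integer df of 268 = 135 + 135 - 2; the function name `pooled_t_test` and all sample values are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_test(sample_a, sample_b):
    """Return (t, df) for an independent two-sample t-test with pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Hypothetical toy samples, small enough to check by hand.
t_value, df = pooled_t_test([4.0, 5.0, 6.0], [1.0, 2.0, 3.0])
print(round(t_value, 4), df)

# Two placeholder samples of 135 tokens each give the df reported in the text.
placeholder_a = [float(i % 7) for i in range(135)]
placeholder_b = [float(i % 5) for i in range(135)]
_, study_df = pooled_t_test(placeholder_a, placeholder_b)
print(study_df)
```

In the toy case both groups have variance 1, so the standard error is sqrt(2/3) and t is 3 / sqrt(2/3), roughly 3.674 with 4 degrees of freedom.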


Table 3.10. Mean values of spectral moments for voiced stops averaged across speakers, window locations, and vowel contexts.

Consonant       Mean (Hz)    Standard deviation (Hz)   Skewness    Kurtosis
[d]             4,187        2,084                     1.346       4.93
[dˤ]            3,477        2,064                     0.834       2.98
t (df = 268)    6.464***     0.253                     5.137***    3.168**

** p < .01; *** p < .001

The averaged spectral moments scores calculated from window 1 alone are listed in Table 3.11. The box plots in Figure 3.13 represent the distribution of the spectral moments scores for the two voiced stops obtained from sampling window 1. As was the case for the voiceless stops, this window is treated as the location of the canonical spectral shape of the voiced stops. This is supported by the fact that, for both stops, the spectral mean value at this window (Figure 3.12) is higher than the one obtained at the stop-vowel junction.

Each spectral moment was used as the dependent variable in a t-test conducted for the two stops across subjects and vowel contexts. The spectral moments values are calculated from window 1 alone. The tests show significant differences between the two stops in the values of spectral mean [t = 4.850 (df = 268), p < 0.001] and spectral skewness [t = 4.688 (df = 268), p < 0.001]. The two stops are not statistically different in terms of their spectral standard deviation [t = 0.668 (df = 268), p > 0.5] or their spectral kurtosis [t = 1.352 (df = 268), p = 0.177].


Figure 3.12. Spectral moments values for voiced stops ([d], [dˤ]) at the two sampling window locations. The values are averaged across subjects and vowel contexts.


Table 3.11. Mean values of spectral moments for voiced stops calculated at window 1 and averaged across speakers and vowel contexts.

Consonant       Mean (Hz)    Standard deviation (Hz)   Skewness    Kurtosis
[d]             4,448        2,068                     1.407       4.84
[dˤ]            3,859        2,012                     0.845       3.69
t (df = 268)    4.850***     0.668                     4.688***    1.353

*** p < .001

Figure 3.13. Box plots of the distributions of the spectral moments scores for the two voiced stops [d, dˤ]. The scores are averaged across subjects and vowel contexts.

3.3.1.14 Voiced Stops - Individual Subjects

A total of 20 t-tests were conducted on the spectral moment scores of the two voiced stops for each subject individually to test whether all of the above stated rankings are stable across subjects (4 moments x 5 subjects). Figure 3.14 presents box plots of those scores for each subject. All five subjects show that the spectral mean of [d] is significantly higher than that of [dˤ] and that the spectral standard deviations of both stops are not statistically different. The relative spectral skewness values of the two stops are subject-dependent: subjects 1, 3, and 5 show that [d] has a significantly higher skewness, while the other two subjects show no statistical skewness difference between the two stops. Spectral kurtosis does not distinguish the two stops from each other.

Figure 3.14. Box plots showing the distributions of the spectral moments scores of the two voiced stops for each of the five individual subjects.

3.3.1.15 Voiced Stops - Specific Vowel Contexts

A set of 36 t-tests was also conducted to assess the variations in spectral moments rankings based on vowel contexts (4 moments x 9 vowel contexts). The results show that in the vast majority of vowel contexts, there are no or just marginal statistical differences between [d] and [dˤ] in terms of spectral mean, standard deviation, and kurtosis. The only exceptions are in the aCu environment, where [d] has a significantly higher spectral mean, and in the uCa environment, where the spectral kurtosis of [d] is significantly higher as well. As for spectral skewness, in the vowel contexts which include the vowel [u] at either or both vowel positions, with the exception of the aCu context, [d] has a higher skewness than [dˤ]. In all remaining contexts, the two stops are not statistically distinct from each other.

3.3.1.16 Voiced Stops - Discriminant Analysis

As shown in Table 3.12, the discriminant analysis shows that the two voiced stops were rather well classified overall (more than 78% of the cases). Cases of misclassification do not exceed 28.1%. Analyzing the standardized canonical discriminant function coefficients reveals that spectral mean and spectral skewness play the roles of the most prominent predictors. This is to be expected given the statistical rankings of spectral mean in subject-based data analysis (§3.3.1.14) and spectral skewness in vowel-based data analysis (§3.3.1.15).

Table 3.12. Results of the discriminant analysis for the voiced stops based on the four spectral moments' values combined together as predictors. The moments' values are calculated from sampling window 1 and across speakers and vowel contexts. The numbers represent the totals and percentages of correctly classified sounds.

                      Predicted Group Membership
          Consonant   [d]     [dˤ]    Total
Original
Count     [d]         116     19      135
          [dˤ]        38      97      135
%         [d]         85.9    14.1    100
          [dˤ]        28.1    71.9    100


3.3.2 Multi-Band Spectra

3.3.2.1 Voiceless Continuants

Since the canonical shapes of voiceless continuants were judged to be at the beginning and middle portions of the frication noise, multi-band spectra were generated using 40-ms full Hamming windows at locations that correspond exactly to windows 2 and 3 of the previous method. That is, the first window covers the first 40 ms of the continuant while the second window covers the middle 40 ms. For each continuant, the intensity values at each frequency band were averaged across windows, subjects, and vowel contexts. The averaged raw intensity values and the averaged normalized intensity values for the eleven frequency bands in the four voiceless continuants covered by this method are listed in Tables 3.13 and 3.14, respectively. The raw intensity value of a given frequency band is the relative intensity at that band as yielded directly by the multi-band spectrum, while the normalized value is the raw value minus the average of the raw intensity values of all four continuants at the same frequency band. Figure 3.15 shows four histograms that reproduce the overall shapes of the multi-band spectra of the four voiceless continuants.

Table 3.13. Mean relative intensity values at the 11 frequency bands for the four voiceless continuants averaged from the two sampling windows across speakers and vowel contexts.

        Frequency Band
        1       2       3       4       5       6       7       8       9       10      11
[s]     -64.5   -53.7   -44.5   -34.4   -27.2   -26.2   -31.4   -28.7   -29.3   -24.2   -24.8
[sˤ]    -63.5   -54.2   -44.8   -33.6   -27.2   -26.5   -32.3   -29.3   -30.1   -25.4   -25.9
[χ]     -53.4   -45.9   -46.0   -37.5   -40.6   -42.7   -49.3   -47.0   -48.2   -40.8   -40.7
[ħ]     -51.8   -42.5   -38.4   -42.2   -45.6   -50.1   -59.3   -54.7   -58.2   -51.8   -49.8

Table 3.14. Mean normalized relative intensity values at the 11 frequency bands for the four voiceless continuants averaged from the two sampling windows across speakers and vowel contexts. The normalized value at each band is the result of subtracting the averaged raw value of the band across all continuants from the raw value of the band for a given continuant.

        Frequency Band
        1       2       3       4       5       6       7       8       9       10      11
[s]     -6.2    -4.6    -1.1    2.5     8.0     10.3    11.7    11.2    12.2    11.4    10.5
[sˤ]    -5.2    -5.1    -1.4    3.3     8.0     9.9     10.9    10.7    11.4    10.2    9.4
[χ]     4.9     3.2     -2.6    -0.6    -5.4    -6.3    -6.2    -7.1    -6.7    -5.2    -5.4
[ħ]     6.5     6.7     5.0     -5.3    -10.4   -13.8   -16.2   -14.8   -16.8   -16.2   -14.5
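The normalization that links Tables 3.13 and 3.14 can be checked with a few lines of arithmetic. The sketch below applies the stated rule (raw value minus the mean raw value of all four continuants at the same band) to the band 1 column of Table 3.13. The function name `normalize_band` is illustrative, and because the published values are already rounded averages, small rounding differences are possible at other bands.

```python
# Raw relative intensity values at band 1, as listed in Table 3.13.
RAW_BAND_1 = {"s": -64.5, "sˤ": -63.5, "χ": -53.4, "ħ": -51.8}


def normalize_band(raw_by_consonant):
    """Subtract the across-consonant mean at one band from each raw value."""
    band_mean = sum(raw_by_consonant.values()) / len(raw_by_consonant)
    return {c: value - band_mean for c, value in raw_by_consonant.items()}


normalized = normalize_band(RAW_BAND_1)
rounded = {c: round(value, 1) for c, value in normalized.items()}
print(rounded)

# By construction the normalized values at a band sum to zero, so they
# express each consonant's intensity relative to the group at that band.
print(abs(sum(normalized.values())) < 1e-9)
```

The rounded results agree with the band 1 column of Table 3.14 (-6.2, -5.2, 4.9, and 6.5).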

The normalized intensity value for each band was used as the dependent variable in a one-way ANOVA conducted for the four continuants across subjects and vowel contexts. This resulted in 11 individual ANOVA tests. A main effect of continuant type on the normalized intensity value at all 11 bands is observed [F(3,536) ranges from 43.877 to 610.271, p < 0.001]. Nine of the subsequent eleven Scheffé post hoc pair-wise comparisons show that individual normalized intensity values succeeded in distinguishing continuant primary places of articulation (p < 0.01). The two alveolars [s] and [sˤ] are not significantly different from each other in any of the eleven bands (p ranges from .437 to 1.000). The high similarities between the two sibilants are quite visible in Table 3.13 and Figure 3.15. The normalized intensity values for this pair of sounds are lower in the low-frequency bands and higher in the high-frequency bands than those of [χ] and [ħ]. The voiceless pharyngeal [ħ] follows a pattern opposite to that of the alveolars. Namely, it has the highest normalized intensity values at the low-frequency bands and the lowest values at the high-frequency bands. At band 1, the intensity value for [ħ] is not significantly different from that of [χ] (p > .1). Meanwhile, the normalized intensity at band 3 fails to distinguish the uvular [χ] from the two sibilants [s] and [sˤ] (p = 0.237 and 0.453, respectively).

Figure 3.15. Four histograms replicating the multi-band spectra of the four voiceless continuants [s], [sˤ], [χ], and [ħ]. The values for the data bars are the relative intensity values of the eleven frequency bands in actual multi-band spectra (averaged from the two Hamming windows across subjects and vowel contexts). The error bars represent one standard deviation.

3.3.2.2 Voiceless Continuants - Discriminant Analysis

To weigh the ability of the gross spectral shapes of voiceless continuants, in the form of multiple frequency bands, to classify these sounds, a discriminant analysis was conducted. The normalized intensity values at each of the 11 frequency bands, averaged from the two sampling window locations, were used as predictors.

Table 3.15. Results of the discriminant analysis for the voiceless continuants based on the normalized intensity values at each of the 11 frequency bands, averaged from the two sampling window locations, combined together as predictors. The numbers represent the totals and percentages of correctly classified sounds. The data is averaged across speakers and vowel contexts.

                      Predicted Group Membership
          Consonant   [ħ]     [s]     [sˤ]    [χ]     Total
Original
Count     [ħ]         134     0       0       1       135
          [s]         0       83      52      0       135
          [sˤ]        0       57      78      0       135
          [χ]         0       0       0       135     135
%         [ħ]         99.3    0.0     0.0     0.7     100
          [s]         0.0     61.5    38.5    0.0     100
          [sˤ]        0.0     42.2    57.8    0.0     100
          [χ]         0.0     0.0     0.0     100.0   100


Table 3.15 shows an overall correct classification rate of 79.6%. The two gutturals [χ] and [ħ] are almost always correctly classified (100% and 99.3% correct classification rates, respectively). The two sibilants, on the other hand, are highly misclassified as each other. The plain [s] is correctly classified in only 61.5% of the cases, while its emphatic counterpart [sˤ] is classified correctly in only 57.8% of the cases. All cases of misclassification of these two sounds (38.5% and 42.2%, respectively) were as each other. So while multi-band spectra were quite successful in classifying main places of articulation, they fail to classify continuants based on the presence or absence of a secondary place of articulation. Analyzing the standardized canonical discriminant function coefficients indicates that band 1 (0 to 1 kHz), band 3 (2 to 3 kHz), and band 7 (6 to 7 kHz) weigh in almost equally as the most prominent contributors. It seems that these three bands show the least amount of variability across the sample speech tokens.
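The contrast between primary and secondary place of articulation can be made concrete by rescoring the Table 3.15 counts with the two sibilants collapsed into one alveolar category. This rescoring is an illustration added here, not an analysis reported in the study. The names `PLACE` and `classification_rate` are hypothetical, and the single [ħ] token predicted as [χ] is inferred from the 0.7% cell.

```python
LABELS = ["ħ", "s", "sˤ", "χ"]
# Counts from Table 3.15 (rows: actual consonant, columns: predicted consonant).
# Assumption: the 1 in the first row is inferred from the 0.7% cell (1/135).
COUNTS = [
    [134, 0, 0, 1],
    [0, 83, 52, 0],
    [0, 57, 78, 0],
    [0, 0, 0, 135],
]
# Primary place of articulation for each consonant.
PLACE = {"ħ": "pharyngeal", "s": "alveolar", "sˤ": "alveolar", "χ": "uvular"}


def classification_rate(counts, labels, category=lambda label: label):
    """Percentage of tokens whose predicted category matches the actual one."""
    correct = total = 0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            total += n
            if category(labels[i]) == category(labels[j]):
                correct += n
    return 100 * correct / total


segment_rate = classification_rate(COUNTS, LABELS)
place_rate = classification_rate(COUNTS, LABELS, category=PLACE.get)
print(round(segment_rate, 1), round(place_rate, 1))
```

Scored by individual consonant the rate is 79.6%, as reported. Scored by primary place it rises to about 99.8%, since every sibilant error stays within the alveolar pair.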

3.3.2.3 Voiceless Stops

Tables 3.16 and 3.17 show the averaged intensity values and the averaged normalized intensity values for the eleven frequency bands in the four voiceless stops [t], [tˤ], [k], and [q]. Figure 3.16 shows four histograms that reflect the overall shapes of the multi-band spectra of the four voiceless stops.

Table 3.16. Mean relative intensity values at the 11 frequency bands for the four voiceless stops averaged across speakers and vowel contexts.

        Frequency Band
        1       2       3       4       5       6       7       8       9       10      11
[t]     -54.5   -48.1   -40.3   -36.2   -34.4   -36.7   -43.5   -41.6   -42.8   -36.0   -34.9
[tˤ]    -50.3   -47.0   -42.6   -39.2   -38.9   -44.0   -52.9   -51.9   -52.8   -47.3   -46.2
[k]     -49.9   -42.8   -39.0   -38.2   -37.7   -45.3   -49.6   -46.7   -47.7   -40.7   -39.0
[q]     -47.1   -45.6   -45.3   -38.7   -44.4   -46.4   -53.8   -52.5   -53.8   -46.8   -46.5


Table 3.17. Mean normalized relative intensity values at the 11 frequency bands for the four voiceless stops averaged across speakers and vowel contexts. The normalized value at each band is the result of subtracting the averaged raw value of the band across all stops from the raw value of the band for a given stop.

        Frequency Band
        1       2       3       4       5       6       7       8       9       10      11
[t]     -4.1    -2.2    1.5     1.9     4.5     6.4     6.5     6.5     6.5     6.7     6.8
[tˤ]    0.1     -1.1    -0.8    -1.2    0.0     -0.9    -2.9    -3.7    -3.5    -4.6    -4.6
[k]     0.6     3.1     2.8     -0.1    1.1     -2.2    0.3     1.5     1.6     2.0     2.6
[q]     3.4     0.2     -3.5    -0.7    -5.6    -3.3    -3.9    -4.3    -4.5    -4.1    -4.9

The normalized intensity value for each band was used as the dependent variable in a one-way ANOVA conducted for the four voiceless stops across subjects and vowel contexts, for a total of 11 individual ANOVAs. A main effect of stop type on the normalized intensity value at all 11 bands is observed [F(3,536) ranges from 4.346 to 76.013, p < 0.01]. The alveolar stop [t] is distinguished by the lowest intensity value at band 1 (p < .001) and the highest values at bands 5 through 11 (p < .01). In contrast, [q] is distinguished by the highest band 1 value (p ≤ .01). For the most part, bands 4 through 11 show a consistent pattern where [t] is distinguished by the highest values and [k] is distinguished by mid-range values, while [tˤ] and [q] have the lowest values, which are not significantly different except at band 5, where the value for [q] is significantly lower than that of [tˤ]. When comparing the plain/emphatic pair [t, tˤ], aside from bands 2 and 3, the two sounds are different at all other bands (p < 0.05).

Figure 3.16. Four histograms replicating the multi-band spectra of the four voiceless stops [t], [tˤ], [k], and [q]. The values for the data bars are the relative intensity values of the eleven frequency bands in actual multi-band spectra (averaged from the burst Hamming window across subjects and vowel contexts). The error bars represent one standard deviation.

3.3.2.4 Voiceless Stops - Discriminant Analysis

As seen in Table 3.18, an overall correct classification rate of 68.3% is obtained for voiceless stops. Each of the four stops is correctly classified in at least 63.7% of the cases. The uvular stop [q] is misclassified as either [k] or [tˤ]. Meanwhile, [t] is misclassified mainly as [tˤ] while [k] and [tˤ] exhibit no specific misclassification patterns. When analyzing the standardized canonical discriminant function coefficients, bands 3, 9, and 10 are the only insignificant contributors to the classification tasks. Among the eight bands that show significant contributions, no subset stands out as a more prominently contributing group.


Table 3.18. Results of the discriminant analysis for the voiceless stops based on the normalized intensity values at each of the 11 frequency bands combined together as predictors. The numbers represent the totals and percentages of correctly classified sounds. The data is averaged across speakers and vowel contexts.

                         Predicted Group Membership
           Consonant     [k]     [q]     [t]    [tˤ]    Total
Original   Count  [k]     86      14      14      21      135
                  [q]     17      93       0      25      135
                  [t]      7       3     104      21      135
                  [tˤ]    11      23      15      86      135
           %      [k]   63.7    10.4    10.4    15.6      100
                  [q]   12.6    68.9     0.0    18.5      100
                  [t]    5.2     2.2    77.0    15.6      100
                  [tˤ]   8.1    17.0    11.1    63.7      100
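The relation between the counts and the percentages in Table 3.18, and the overall correct classification rate of 68.3%, can be checked with a short sketch. The counts are copied from the table; the function names are illustrative only and do not reflect the statistical package used for the original analysis.

```python
# Rows: original consonant; columns: predicted [k], [q], [t], [tˤ] (Table 3.18).
LABELS = ["k", "q", "t", "tˤ"]
COUNTS = {
    "k":  [86, 14, 14, 21],
    "q":  [17, 93, 0, 25],
    "t":  [7, 3, 104, 21],
    "tˤ": [11, 23, 15, 86],
}


def row_percentages(counts):
    """Convert each row of counts to percentages of that row's total."""
    return {
        label: [round(100 * c / sum(row), 1) for c in row]
        for label, row in counts.items()
    }


def overall_correct_rate(counts, labels):
    """Percentage of all tokens that fall on the diagonal of the matrix."""
    correct = sum(counts[label][i] for i, label in enumerate(labels))
    total = sum(sum(row) for row in counts.values())
    return round(100 * correct / total, 1)


PERCENT = row_percentages(COUNTS)
OVERALL = overall_correct_rate(COUNTS, LABELS)

if __name__ == "__main__":
    for label in LABELS:
        print(label, PERCENT[label])
    print("overall correct:", OVERALL)
```

Reading the off-diagonal cells row by row recovers the misclassification patterns described in the text, for example that [t] is never the predicted category for [q].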


3.3 Discussion and Conclusions

The results of this experiment indicate that there are no reliable and salient acous-

tic differences between the canonical spectral shapes of Arabic emphatic and non-

emphatic fricatives. As for stops, the results show strong spectral differences between

emphatics and non-emphatics. These results, however, are not conclusive due to unbal-

anced coarticulatory interference from adjacent vowels into the acoustic analysis as

pointed out below. The results further show that the spectral shapes of Arabic uvular con-

tinuants show strong fricative-like qualities.

The results strongly indicate that the spectral shapes of the two sibilants [s, sˤ] are very similar. This similarity is reflected in both the spectral moments and the multi-band spectra analyses and is very stable across subjects and vowel contexts. As for the two interdentals [ð, ðˤ], while the pooled data show that these two sounds have statistically distinct spectral mean values, four of the five individual subjects show no difference between the spectral means of the two sounds. As for the remaining metrics, the two interdentals are generally not distinct from each other. This, however, follows from the fact

that voiced non-sibilant fricatives, as a whole, do not reflect substantial spectral differ-

ences. In general, it seems that the primary articulation in emphatic fricatives masks any

potential impact of the secondary articulation on the acoustic identity of the sound signal.

Tables 3.19 and 3.20 show the classification rates of emphatic and non-emphatic consonant pairs based on spectral moments and multi-band spectra, respectively. The overall accurate classification rate of the pair [s, sˤ] using spectral moments as predictors is around chance (54.1%). This basically means that [s] is almost totally indistinguishable from [sˤ] based on their spectral moments values. Multi-band spectra fare better. Here, an overall correct classification rate of 68.5% is achieved. It is clear that this analysis method benefits from the larger number of predictors (intensity values in 11 bands) when compared to spectral moments (4 moments). Nevertheless, the fact that almost one third of the actual incidents of each member of the [s, sˤ] pair is misclassified as the other, despite the use of 11 predictors in a test that involves only two sounds, is an indication that the frication portions of [s] and [sˤ] are acoustically similar. Recall also that the differences between the intensity values of these two sounds at all 11 bands are statistically insignificant. The classification results for the pair [ð, ðˤ] show that, much like the classification of the two sibilants using multi-band spectra, about one third of the actual incidents of each interdental are misclassified as the other. We conclude, therefore, that the canonical spectral shapes of Arabic plain/emphatic continuant pairs do not include reliable acoustic correlates to the phonetic difference between them.


Table 3.19. Results of the discriminant analyses for the plain/emphatic consonant pairs based on the spectral moments values as predictors. The numbers represent the percentages of correctly classified sounds. The data was averaged across speakers and vowel contexts.

Original %     Predicted Group Membership
Cons.            [s]     [sˤ]
[s]             48.9     51.1
[sˤ]            40.7     59.3
                X̄ = 54.1

Cons.            [ð]     [ðˤ]
[ð]             62.2     37.8
[ðˤ]            31.1     68.9
                X̄ = 65.6

Cons.            [t]     [tˤ]
[t]             81.5     18.5
[tˤ]            27.4     72.6
                X̄ = 77.0

Cons.            [d]     [dˤ]
[d]             85.9     14.1
[dˤ]            28.1     71.9
                X̄ = 78.9

Table 3.20. Results of the discriminant analyses for the plain/emphatic voiceless consonant pairs based on the normalized relative intensity values of the multi-band spectra as predictors. The numbers represent the percentages of correctly classified sounds. The data was averaged across speakers and vowel contexts.

Original %     Predicted Group Membership
Cons.            [s]     [sˤ]
[s]             68.9     31.1
[sˤ]            31.9     68.1
                X̄ = 68.5

Cons.            [t]     [tˤ]
[t]             86.7     13.3
[tˤ]            13.3     86.7
                X̄ = 86.7
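The overall rates (X̄) in Tables 3.19 and 3.20 can be related to the per-consonant rates with a small sketch. It assumes that both members of each pair contribute the same number of tokens, in which case the overall rate is simply the mean of the two correct-classification percentages; it also takes 50% as the chance level for a two-way classification. The dictionary keys and function names are illustrative.

```python
# (method, pair) -> (% correct for plain member, % correct for emphatic member),
# copied from the diagonals of Tables 3.19 and 3.20.
PAIR_RATES = {
    ("spectral moments", "s/sˤ"): (48.9, 59.3),
    ("spectral moments", "ð/ðˤ"): (62.2, 68.9),
    ("spectral moments", "t/tˤ"): (81.5, 72.6),
    ("spectral moments", "d/dˤ"): (85.9, 71.9),
    ("multi-band spectra", "s/sˤ"): (68.9, 68.1),
    ("multi-band spectra", "t/tˤ"): (86.7, 86.7),
}

CHANCE = 50.0  # two equally likely categories (an assumption of this sketch)


def overall_rate(plain_correct, emphatic_correct):
    """Mean of the two per-consonant rates; valid when group sizes are equal."""
    return (plain_correct + emphatic_correct) / 2


OVERALL = {key: overall_rate(*rates) for key, rates in PAIR_RATES.items()}
ABOVE_CHANCE = {key: rate - CHANCE for key, rate in OVERALL.items()}

if __name__ == "__main__":
    for key in PAIR_RATES:
        print(key, round(OVERALL[key], 2), round(ABOVE_CHANCE[key], 2))
```

On this reckoning, the sibilant pair classified by spectral moments sits only about four percentage points above chance, while the stop pairs sit well above it under both methods.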

By comparison, members of the plain/emphatic stop pair [t, tˤ] are well distinguished by both methods, as are [d, dˤ], for which only spectral moments were used. The relatively high classification rates of plain/emphatic stop pairs should be viewed with caution, however. One possible reason for these high rates is that stop releases for emphatic stops are followed by shorter VOTs (see §2.2.1), bringing them closer to coarticulatory influence from following vowels. In fact, there were incidents where the start of a vowel following an emphatic was immediately attached to the stop burst. In such cases, a substantial portion of the ensuing vowel was covered by the analysis window. Furthermore, variability in VOT between members of the stop pairs results in variability in the frequency resolution available in the analysis windows. A long VOT can allow more after-burst frication to be admitted into the analysis window. Meanwhile, a short VOT makes only a short stretch of frication, or none at all, available to the window. This inconsistent influence makes any judgment based on the canonical spectral information of stops questionable. The high classification rates of emphatic/nonemphatic stop pairs are, therefore, inconclusive.

Broadly speaking, the results reflect the general spectral shapes and articulatory

qualities of the consonants being investigated. Alveolar sibilant sounds typically have

diffuse rising spectra with intense high frequency energy and very little or no energy at

the low frequencies. The high frequency concentrations of energy in [s] shift the energy

center of gravity in its spectrum higher giving this sound the highest spectral mean. The

spectrum is also tilted higher at the high frequencies and lower at the lower frequencies

yielding negative skewness. A diffuse spectrum, when treated as a normal distribution density curve, is expected to have relatively thicker tails than a compact spectrum. This explains the low kurtosis value for [s]. Spectral moments results reported here for the sibilant fricative [s] are in line with those reported in other accounts of spectral moments in obstruents such as Jongman et al. (2000), Jassem (1995), and Tomiak (1990). The agreement in sibilant spectral mean and standard deviation is especially notable.
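The reasoning above, in which a spectrum is treated as a distribution over frequency, can be illustrated with a minimal sketch of the four spectral moments. The two spectra used here are synthetic shapes invented purely for illustration; they are not measurements from this study. The sketch also assumes that kurtosis is reported as excess kurtosis (with 3 subtracted), which may differ from the convention used in the original analysis.

```python
import math


def spectral_moments(freqs_hz, power):
    """Return (mean, sd, skewness, excess kurtosis) of a power spectrum
    treated as a probability distribution over frequency."""
    total = sum(power)
    p = [w / total for w in power]
    mean = sum(f * w for f, w in zip(freqs_hz, p))
    var = sum((f - mean) ** 2 * w for f, w in zip(freqs_hz, p))
    sd = math.sqrt(var)
    skew = sum((f - mean) ** 3 * w for f, w in zip(freqs_hz, p)) / sd ** 3
    kurt = sum((f - mean) ** 4 * w for f, w in zip(freqs_hz, p)) / sd ** 4 - 3
    return mean, sd, skew, kurt


# Eleven hypothetical band centers, 500 Hz to 10500 Hz.
FREQS = [500 + 1000 * i for i in range(11)]

# Synthetic "rising" spectrum: energy grows steadily toward high frequencies.
RISING = [i + 1 for i in range(11)]

# Synthetic "compact" spectrum: energy concentrated in the lowest bands.
COMPACT_LOW = [100, 30, 5, 1, 1, 1, 1, 1, 1, 1, 1]

RISING_MOMENTS = spectral_moments(FREQS, RISING)
COMPACT_MOMENTS = spectral_moments(FREQS, COMPACT_LOW)

if __name__ == "__main__":
    print("rising :", [round(x, 2) for x in RISING_MOMENTS])
    print("compact:", [round(x, 2) for x in COMPACT_MOMENTS])
```

For the rising shape the sketch yields a high mean, negative skewness, and low kurtosis, which is the pattern the text attributes to [s]; the compact low-frequency shape gives the opposite pattern on all three measures.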

Jongman et al. (2000) and Tomiak (1990) report high classification rates of [s] in

their discriminant analysis results. In contrast, Forrest et al. (1988) report relatively low

classification rates for fricatives in general, especially [s]. Notice that the low individual

classification rates reported here for [s] and [sˤ] reflect mostly the constant misclassification of these two sounds as each other. Overall, however, classification of these two sibilants as a single class of sounds is quite high. The two sibilants are rarely misclassified as nonsibilants (6.7% for [s] and 5.2% for [sˤ]) when using spectral moments. Misclassification of sibilants as non-sibilants drops to 0% when using multi-band spectra.

The lower spectral mean of the glottal [h] reported in Tomiak (1990) and, to a

lesser extent, the voiceless velar fricative [x] in Jassem (1995), when compared to

less oral fricatives is a pattern that holds for Arabic gutturals as the present study shows.

Among the gutturals, the pharyngeal [ʕ] has no airflow turbulence during its articulation, which is typical of a voiced approximant (Catford 1977). Compared to vowels, voiced approximants have much weaker energy at the higher frequencies due to the narrower vocal tract opening (Bickley and Stevens 1987, Stevens 1998). This causes almost all of the acoustic energy in [ʕ] to be concentrated in the lower 3 kHz frequency range. As a result, treating the power spectrum of this sound as a normal distribution yields a very sharply peaked curve over the low frequencies with a very narrow or no tail extending into the high frequency range. This is reflected by the low spectral mean and standard deviation as well as the high skewness and kurtosis. The latter is markedly higher in comparison with the other voiced continuants. Similarly, in the voiceless pharyngeal [ħ], the bulk of spectral energy is located at lower frequencies than in fricatives. As a result, [ħ] exhibits the lowest spectral mean and the highest skewness among the four voiceless continuants. The noticeably high kurtosis is due to the smaller presence of energy as we move higher in the frequency scale. The approximant qualities displayed by [ħ], however, are less drastic than those displayed by [ʕ]. The Arabic pharyngeal pair [ħ, ʕ] clearly reflects the typical acoustic properties of approximants as defined by Catford (1977). Namely, only the voiceless member of the pair has some turbulence in the airflow during its articulation while no turbulence is present during the articulation of the voiced member. This stands in opposition to fricative voiced/voiceless pairs where turbulence is

always present. The present study provides acoustic support for the claims of Laufer and

Condax (1979, 1981), Catford (1977), and Ladefoged and Maddieson (1996) that Arabic

pharyngeals are approximants rather than fricatives.

The uvular [χ] is articulated near the velar region where sounds typically have compact, well-formed spectra with mid-frequency peaks. Therefore, this sound exhibits a mid-frequency spectral mean and slightly high skewness and kurtosis. Likewise, the voiced uvular [ʁ] generally has a comparatively more compact spectrum with a mid to low frequency peak. This explains the relatively high skewness and kurtosis values associated with this sound when compared to the interdentals [ð, ðˤ]. The two uvular continuants exhibit spectral qualities that are clearly more similar to those of fricatives than to those of approximants. While this is not very evident in the case of [χ] (since voiceless approximants also involve airstream turbulence), the spectral qualities of [ʁ] clearly indicate that uvular continuants are fricatives. Notice that [ʁ] is not even close to [ʕ] in the values of any of the spectral moments metrics. The substantially higher standard deviation and much lower kurtosis in uvular [ʁ] indicate that the sound energy in the spectrum of this sound is widely dispersed. This points to the presence of airflow turbulence as a result of a narrow fricative constriction. This finding does not agree with the view that Arabic uvulars are approximants. 14 There are significant theoretical ramifications to this finding. Recall from Chapter 2 that McCarthy (1994) proposes that the identification of

14 One might argue that there is a fricative/approximant variation in the surface realization of Arabic uvular continuants and that these sounds are still underlyingly approximants surfacing as fricatives in the tokens tested. This is highly doubtful, though, given that in the test tokens the consonants are placed in VCV environments. This is the most likely environment for the supposed approximant variants to surface due to articulatory undershoot. There seems to be a strong tendency for Arabic uvular continuants to surface as fricatives, and they should be classified as such.



the guttural class for the purposes of the OCP-based restrictions on Arabic morpheme

structure is achieved by taking the feature [pharyngeal] in conjunction with the feature

[+approximant]. Since, in McCarthy's view, all gutturals are approximants and emphatics

are not, members of the two classes are allowed to cooccur. The present finding nullifies

the claim that all gutturals are approximants and demands a new reasoning for the free

cooccurrence of gutturals with emphatics. This topic is addressed in detail in Chapter 6. 15

The relationships between continuant types and their spectral qualities are not as sharply defined for voiced continuants as for voiceless continuants. While voiceless continuants as a group are generally very well classified by both the spectral moments and the multi-band spectra methods, the voiced uvular [ʁ] and the two interdentals [ð, ðˤ] were frequently misclassified as one another. In general, the three fricatives [ð, ðˤ, ʁ] show smaller differences in terms of their spectral moments values when compared to voiceless fricatives. Non-sibilant spectra are known for lack of strong characteristic cues (Harris 1958, La Riviere et al. 1975). Voiced fricatives typically share low frequency voicing bands and formant-like structures reflecting energy modulation by the vibrating vocal folds. Thus, the presence of voicing in these sounds seems to produce a leveling effect in the distribution of acoustic energy. The voiced pharyngeal [ʕ], meanwhile, had a high correct classification rate of 82.2%. It seems that, while spectral moments are not very capable of distinguishing voiced fricative spectra, they are quite capable of distinguishing voiced fricatives from voiced approximants.

15 I should note here that McCarthy (1994) bases his classification of all Arabic gutturals, including uvular continuants, on Clements' (1990) modification of Catford's (1977) definition of approximants cited above. Clements requires all non-approximants to involve oral stricture. This stipulation is made by Clements solely to exclude nasals. As explained in Chapter 7, recent research on pharyngeal and laryngeal articulations reveals that there are several stricture possibilities in the pharynx ranging from stop to approximant.



The spectral moments scores for stops were either not different from each other or highly dependent on the subject and the following vowel. It is not surprising, then, that, contrary to the high classification rates of voiceless stops reported by Forrest et al. (1988), the present study shows poor classification rates. It is possible that the types of stops investigated in the two studies play a major part in this discrepancy. Forrest et al.'s focus was on the three stops [p, t, k] whose places of articulation are, more or less, evenly and substantially spaced in the vocal tract. By contrast, the present study focuses on the four stops [t, tˤ, k, q], two of which ([t, tˤ]) have the same primary place of articulation while the other two ([k, q]) are produced at points of articulation that are very close to each other. In general, though, the plain alveolar stop [t] was well distinguished from the rest of the stops. In the discriminant analysis, this stop was correctly classified in 74.8% of its actual incidents as the only highly classified voiceless stop. The velar [k] and the uvular [q] are produced at points of articulation that are very close to each other. Meanwhile, the emphatic [tˤ] seems to be affected by the coarticulatory effect of the following vowel. Vowels following emphatics usually have a low F2. The coarticulatory effect of this drop in F2 on the preceding [tˤ] would most likely be in the form of a drop in the center of gravity of the rising spectrum typical of an alveolar stop. These articulatory configurations of the three stops [k, q, tˤ] conspire to bring their acoustic qualities close to each other.

When comparing the two spectral analysis methods used in this experiment,

multi-band spectra perform better in classifying obstruent places of articulation. The

overall correct classification rate of voiceless continuants when using multi-band spectra

was slightly higher than when using spectral moments (79.6% vs. 71.1%). Multi-band spectra achieved an almost 100% correct classification rate for all sounds if we regard the two sibilants as a single class, since the two non-sibilants were almost never misclassified. With spectral moments, on the other hand, the two non-sibilants were misclassified in 8.9% to 14.8% of their actual incidents. Multi-band spectra fared better than spectral moments in the overall classification of voiceless stops (68.3% vs. 50.9%). Both methods

were able to accurately classify [t] better than other stops. While the classification rates

for voiceless stops are generally low, the difference between the two methods is substan-

tial. Unlike spectral moments, which are essentially predicated on the translation of the

gross energy distribution into a smoothed out probability density curve, multi-band spec-

tra capture not only the gross distribution of energy in the power spectra, but also the

relative prominence of energy concentrations in the different stretches of spectral fre-

quency. In this sense, multi-band spectra preserve what is known about traditional power

spectra while expressing it as a small number of variables. Additionally, multi-band spec-

tra allow for the calculation of a prototypical power spectrum for a given voiceless ob-

struent by averaging the multi-band spectra from a number of sound samples from differ-

ent speakers. While still in need of further investigation involving more speakers, speech

sounds, and languages, multi-band spectra look like a promising new analysis method for

objective, quantitative characterization of voiceless obstruent power spectra.

3.4. Summary

This chapter investigates the canonical spectral qualities of MSA consonants. The

results show that Arabic emphatic/non-emphatic continuant pairs are only slightly and



inconsistently distinguished from each other based on their canonical spectra. As for em-

phatic/non-emphatic stop pairs, their canonical spectra do distinguish them substantially.

However, the fact that there are also dynamic differences between emphatic stops and

non-emphatic ones (in the form of shorter VOTs following the former compared to the

latter) undermines those distinctions. Overall, canonical spectral cues are not considered

a reliable source for acoustic distinctions between emphatics and non-emphatics. The present experiment, therefore, offers modern and objective support for the earlier studies cited in Chapter 2, which generally found few spectral shape distinctions between emphatic and non-emphatic consonant pairs.

This chapter also provides experimental support for the claim that Arabic pharyn-

geals are approximants, not fricatives. However, the findings strongly suggest that the

two Arabic uvular continuants should be classified as fricatives. This particular finding

provides further challenges to the phonological views that crucially classify Arabic uvular continuants as approximants.

Having investigated the acoustic qualities of the consonants themselves, we move

in the next chapter to another possible source for acoustic distinction between emphatics

and non-emphatics: the coarticulatory effect of the consonants in question on neighboring

vowels.



CHAPTER 4

Experiment Two:

Anticipatory and Carryover Consonant-Vowel Coarticulation

4.1 Overview

Experiment One did not yield reliably salient acoustic cues that could distinguish emphatic sounds from their non-emphatic counterparts. This excludes the canonical spectral qualities of consonants as a potential source for acoustic data that can be used to achieve the main goals of this dissertation. 16 The present experiment investigates the anticipatory and carryover coarticulatory effects of MSA emphatics, non-emphatics, and gutturals on adjacent vowels. The main goal is to characterize these effects and then compare and contrast them. We need to find out if these effects represent consistent and reliably salient cues for the secondary articulation in emphatics. We also need to know if the coarticulatory effects of emphatics on adjacent vowels resemble the effects of gutturals. In light of the previous acoustic accounts of Arabic emphatics cited in Chapter 2, we predict the coarticulatory effects of emphatics on adjacent vowels to be very different from those of non-emphatics. We also predict emphatics and gutturals to show different coarticulatory effects on neighboring vowels. Hence we formulate two test hypotheses. The first hypothesis being tested is that the acoustic coarticulatory effects of emphatics on adjacent vowels are different from the effects of non-emphatics. The second hypothesis being tested is that emphatics and gutturals have different acoustic coarticulatory effects on adjacent vowels.

16 Of course, the finding in Experiment One that uvular continuants are fricatives, not approximants, is quite crucial to our main goals. See Chapter 6.

The claim that cues for the articulation of consonants can be reflected in their

coarticulatory effects on adjacent vowels follows directly from the central concepts of the

acoustic theory (Fant 1960). The shape of the vocal tract when producing a specific

sound filters the noise energy and gives a specific acoustic signature to that sound. When

moving from one sound to another, the vocal organs cannot snap instantly from one con-

figuration to another. They shift instead from the specific configuration they assume dur-

ing the production of one sound to the configuration necessary for the production of a

following sound. The temporal morphing between the two configurations is preserved as

acoustic transitions from the first sound to the next one. These transitions are quite visi-

bly reflected in the formant frequencies of vowels.

Early perceptual studies on vowel formant transition patterns in synthetic CV syl-

lables have stressed the importance of those transitions as cues to consonant articulation.

Harris (1958) found that, while listeners could separate the sibilant pair [s, ʃ] from the non-sibilant pair [f, θ] based on frication noise, differentiating between the members of the second pair depended on the transition in the adjacent vowel. So if [θ] was followed by a vowel spliced from the syllable [fV], it was perceived as [f], and vice versa. Meanwhile, the two sibilants [s, ʃ] were almost always accurately classified regardless of the ensuing vowel. The highly salient noise portions of the sibilants can differentiate them from other fricatives as well as from each other. Similar results were achieved for the voiced fricatives [v, ð, z, ʒ].



Liberman et al. (1954) found that listeners were able to distinguish between the three voiced stops [b, d, g] and, separately, between the three voiceless stops [p, t, k] in synthetic CV syllables based on the size of formant transitions as well as the type of following vowel. Somewhat similar observations were obtained for the three nasals [m, n, ŋ] in synthetic VC syllables, yielding relative consistency in the perception of consonantal place of articulation based on similar transition sizes and directions regardless of the consonant type. These findings also indicate that the patterns of place-cueing formant transitions are similar, in terms of pattern of movement and general size, in both CV and VC contexts.

Delattre et al. (1955) propose a hypothetical fixed starting point for the second

formant frequency, or locus, that is fixed for every consonant (with certain exceptions)

across vowels but varies from one consonant to another. An F2 transition, therefore, is

basically the path that F2 must travel from that starting point to the steady state of the

vowel. Delattre et al. found that the loci for the stops [b, d] were approximately at 720 Hz and 1800 Hz, respectively. Further research by Stevens and House (1956) found that possible F2 loci for velar stops range from 600 Hz to 2500 Hz. Citing articulatory investigation of English stops by Dembowski (1998), Kent and Read (2002) attribute this variation to the inconsistent location of the point of constriction for velar stops. Although the works cited here investigate stops, Delattre et al. suggest that the same results should be obtained for other consonants. It should be noted, though, that the value of transition loci as invariant cues for consonant place of articulation has been questioned. An extensive investigation by Kewley-Port (1982) of different vowel transition-based metrics, including formant frequency loci obtained from natural speech, indicates that F2 loci, aside from those obtained for alveolars, are not very dependable cues for place of articulation across different vowel contexts. She points out, however, that her data "support the well-known claim that the direction and extent of the F2 formant transitions is an important place cue for most but not all of the vowels examined" (p. 386).
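The locus idea described above, in which an F2 transition is the path from a consonant-specific starting point to the vowel's steady state, can be made concrete with a small sketch. The two locus values are those cited above from Delattre et al. (1955); the steady-state F2 values for the three vowels are hypothetical round numbers chosen only for illustration, and the function name is likewise an assumption of this sketch.

```python
# F2 loci in Hz for [b] and [d], as cited from Delattre et al. (1955).
LOCI_HZ = {"b": 720, "d": 1800}

# Hypothetical steady-state F2 targets in Hz (illustrative, not measured).
VOWEL_F2_HZ = {"i": 2300, "a": 1300, "u": 900}


def f2_transition(locus_hz, steady_hz):
    """Return the direction and signed extent (Hz) of an idealized CV
    transition running from the consonant locus to the vowel steady state."""
    extent = steady_hz - locus_hz
    if extent > 0:
        direction = "rising"
    elif extent < 0:
        direction = "falling"
    else:
        direction = "level"
    return direction, extent


TRANSITIONS = {
    (c, v): f2_transition(LOCI_HZ[c], VOWEL_F2_HZ[v])
    for c in LOCI_HZ
    for v in VOWEL_F2_HZ
}

if __name__ == "__main__":
    for (c, v), (direction, extent) in TRANSITIONS.items():
        print(f"[{c}{v}]: {direction}, {extent:+d} Hz")
```

Under these assumed vowel targets, the same consonant produces transitions of different direction and size depending on the vowel, which is why the locus, rather than the transition itself, was proposed as the consonant-specific constant.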

The correlation between consonant place of articulation and formant transitions

discussed above can be extended to include secondary articulations as well. In fact, the

effects of secondary articulations on the transitions and steady states of neighboring vow-

els are sometimes more prominent than those of primary articulations. The works cited in

Chapter 2 indicate that F2 transitions next to emphatic sounds are always lower than

those next to non-emphatics. In a review of other types of secondary articulations by

Ladefoged and Maddieson (1996), acoustic cues for those articulations were visible on

the spectrograms of neighboring vowels. Contrastive labialization in Pohnpeian generally

causes a drop in F2 transition (and in some contexts F1 transition as well) of an adjacent

vowel. Velarization in Marshallese consonants is also accompanied by a sizable drop in

F2 transition of an adjacent vowel compared to plain consonants. Russian palatalized

consonants are usually associated with a high adjacent F2 transition. The acoustic cues

for secondary articulations are not always equally realized in the anticipatory (VC) and

carryover (CV) directions. In the languages they discussed, Ladefoged and Maddieson

note that while the effects of velarization on vowel transitions are almost equal in both

directions, labialization and palatalization are observed more in the carryover than in the

anticipatory contexts.

In all of the previous examples, the coarticulatory effects of consonantal secon-

dary articulations on adjacent vowels override the effects of primary articulations. It is



possibly a consequence of the tongue assuming the configuration for the secondary ar-

ticulation prior to the release of the consonant into the vowel in CV contexts (or starting

to take shape for the secondary articulation during the preceding vowel prior to the con-

sonantal closure in VC contexts). The articulators, therefore, do not start simply from the

primary articulation configuration and continue into the vowel, but rather start from a

complex articulatory configuration involving two places at once.

4.2 Methods

4.2.1 Subjects

The same five subjects who participated in Experiment One also participated in

this experiment. The experimental phrases for this experiment were intermixed with those

for Experiment One, and the subjects were not aware of the existence of separate experi-

ments.

4.2.2 Stimuli

The set of stimuli for this experiment consisted of real MSA words containing the four emphatic coronals [tˤ, dˤ, ðˤ, sˤ], their non-emphatic counterparts [t, d, ð, s], the seven gutturals [q, χ, ʁ, ħ, ʕ, h, ʔ], and the velar stop [k]. The test words present these sounds along with the three vowels [i, a, u] in #CV and VC# contexts. In compiling the test paradigm, the same general guidelines for selecting the test words in Experiment One (explained in §3.2.2) were followed. The only exception to this was the use of geminated vowels rather than single vowels. The reason for this choice is that geminate vowels generally have more evident steady state portions than do single vowels. In Arabic, single vowels are quite short and, in many cases, appear fully transitional, making the identification of a steady state highly subjective. Since this experiment addresses the coarticulatory effect of consonants on both the transition and the steady state portions of adjacent vowels, the frequent lack of well-defined steady state portions in single vowels makes them less convenient than geminate vowels. The two interdentals [ð, ðˤ] were excluded from the #CV portion of this experiment since, for most CV combinations, no real words containing these sounds initially followed by geminate vowels were found. The resulting test paradigm consisted of 90 words ((14 x 3) + (3 x 16)). The set of stimuli is listed in Appendix B.
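The size of the test paradigm follows from crossing the consonants with the three vowels in each context. The sketch below enumerates the cells under the description given above; the IPA symbols are a reconstruction of the consonant inventory, and the variable names are illustrative. The actual words filling each cell are those listed in Appendix B and are not reproduced here.

```python
EMPHATICS = ["tˤ", "dˤ", "ðˤ", "sˤ"]
PLAIN = ["t", "d", "ð", "s"]
GUTTURALS = ["q", "χ", "ʁ", "ħ", "ʕ", "h", "ʔ"]
VELAR = ["k"]
VOWELS = ["i", "a", "u"]

# The two interdentals were excluded from the #CV portion of the experiment.
EXCLUDED_FROM_CV = {"ð", "ðˤ"}

CONSONANTS = EMPHATICS + PLAIN + GUTTURALS + VELAR

CV_CELLS = [
    ("#CV", c, v) for c in CONSONANTS if c not in EXCLUDED_FROM_CV for v in VOWELS
]
VC_CELLS = [("VC#", c, v) for c in CONSONANTS for v in VOWELS]
PARADIGM = CV_CELLS + VC_CELLS

if __name__ == "__main__":
    print(len(CONSONANTS), "consonants")
    print(len(CV_CELLS), "#CV cells +", len(VC_CELLS), "VC# cells =", len(PARADIGM))
```

The enumeration gives 14 x 3 = 42 cells in the #CV context and 16 x 3 = 48 cells in the VC# context, matching the reported total of 90 words.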

The VC# words were presented in the carrier phrase "?alkalimtu hiya _"

("The word is _"). As was the case in Experiment One, the subjects were instructed to

drop the optional tense and case inflectional suffixes. This was done to avoid the influ-

ence of the suffix sounds (across the consonant) on the vowel. As for the #CV words, the

words were presented in the modified carrier phrase "_ hiya 'lkalimah" ("_ is the

word."). This modification is intended to isolate the CV pair at the left edge of the phrase

and avoid any coarticulatory influence from another sound across the consonant. Given

that the test word is topicalized in the test phrase (as if to introduce the word to someone),

it was possible to ask the subjects to drop the case and tense markings in this position as

well. Again, this was done to avoid any interference from other sounds that belong to the

suffixes.



4.2.3 Procedures

It was noted earlier that the material for this experiment was presented to the sub-

jects intermixed with the material for Experiment One. Hence, the procedures for this ex-

periment were the same as those for Experiment One.


The sound analysis software Praat (Boersma and Weenink 1992) was used to

automatically generate formant tracks using the Burg algorithm. The tracks were calcu-

lated using a succession of 25-ms Gaussian analysis windows. This window length was

found to be ideal, given that shorter windows result in misreads by the formant tracker,

while longer windows risk averaging away short formant transitions. The temporal dis-

tance between the centers of successive analysis windows was 5 ms. The experiment fo-

cuses on the first two formants only. As noted in §2.2.1, El-Dalee (1984) found that

changes in F3 are inconsistent while Giannini and Pettorino (1982) report that F3 loci

next to emphatics are not different from those next to non-emphatics. Formant values

were measured at the end of vowel transitions in VC# sequences and at the beginning of

vowel transitions in #CV sequences as well as at the steady state portion of the vowel in

both environments. These acoustic landmarks were identified based on visual inspection

of the waveform and the spectrogram of the sound token. Auditory verification was

added in cases where the vowel-consonant boundary was difficult to pinpoint. Figure 4.1

illustrates the locations of the acoustic landmarks of interest to this experiment. After

their identification, the landmarks were recorded as time points and annotated to a Praat

TextGrid file. A specially written Praat script referred to these files and automatically re-



corded the values of F1 and F2. For vowel transitions, formant readings were made 2.5

ms inside the vowel from the time point recorded in the TextGrid file rather than exactly at the vowel-consonant boundary. This was done to avoid false formant values resulting from incorrect tracking of the formants or incorrect averaging of vowel-edge-based and consonant-edge-based formant reading points. The script then stored the formant readings in a text file, which was converted to formats suitable for the spreadsheet software Microsoft Excel (Microsoft Corp. 1985) and the statistical analysis software SPSS (SPSS, Inc. 1989).
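The 2.5-ms inset can be illustrated with a short sketch. This is not the author's Praat script. The function names are hypothetical, and picking the nearest analysis frame is an assumption, since the original script may have interpolated between frames instead.

```python
INSET_S = 0.0025  # 2.5 ms inside the vowel, as stated in the text


def sampling_time(landmark_s, edge):
    """Time at which a transition reading is taken.

    edge is 'onset' for #CV (the vowel begins at the landmark) or
    'offset' for VC# (the vowel ends at the landmark).
    """
    if edge == "onset":
        return landmark_s + INSET_S
    if edge == "offset":
        return landmark_s - INSET_S
    raise ValueError("edge must be 'onset' or 'offset'")


def nearest_frame(track, t):
    """Return the (time_s, f1_hz, f2_hz) frame closest in time to t.

    Assumption: nearest-frame lookup stands in for whatever lookup the
    original script performed on the formant track.
    """
    return min(track, key=lambda frame: abs(frame[0] - t))


# Hypothetical track with frames 5 ms apart, as in the analysis settings.
track = [(0.291, 300.0, 2000.0), (0.296, 320.0, 1900.0), (0.301, 340.0, 1800.0)]
t = sampling_time(0.300, "offset")
print(nearest_frame(track, t))
```

Since the inset equals half of the 5-ms frame spacing, a reading taken this way always lies within one frame step of the annotated boundary.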


Figure 4.1. Cursor locations at vowel steady states in the CV (b) and VC (c) contexts as well as at the vowel transition edges in the two contexts (a and d, respectively).

4.2.5 Reliability

To estimate the intra-judge reliability of formant measurements, 63 CV tokens and 72 VC tokens (10% of the total files in both cases) were randomly selected using random-number-generating software and re-analyzed following the same procedures explained in §4.2.4 above. For F1 values at the transition and the steady state portions, the correlations between the original and the retested tokens were above 0.98 for both the CV and the VC environments. Agreements within 50 Hz were between 94.4% and 100%. For F2 values at the transition and the steady state portions, the correlations between the original and the retested tokens were above 0.99 for both phonetic environments. Agreements within 50 Hz were between 81.8% and 100%. The measurements were judged reliable.
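The two reliability measures reported above, a correlation and a percentage of agreements within 50 Hz, can be computed as in the following sketch. The data are invented for illustration, the function names are hypothetical, and treating a difference of exactly 50 Hz as an agreement is an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between original and retest measurements."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def percent_within(x, y, tol_hz=50.0):
    """Percentage of token pairs whose two readings differ by at most tol_hz."""
    hits = sum(1 for a, b in zip(x, y) if abs(a - b) <= tol_hz)
    return 100.0 * hits / len(x)


# Hypothetical F1 readings (Hz) for four tokens, measured twice.
original = [300.0, 400.0, 500.0, 600.0]
retest = [310.0, 390.0, 505.0, 700.0]
print(pearson_r(original, retest), percent_within(original, retest))
```

The two measures are complementary: a high correlation can coexist with a constant offset between sessions, which the agreement rate would expose.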

4.3 Results

Tables 4.1, 4.2, and 4.3 list the averaged formant frequency values for the three vowels at all measurement locations in the VC and CV contexts. For each of the three vowels [i, u, a] in the VC sequence, a set of four one-way ANOVAs was conducted for all 16 consonants across all subjects. The dependent variable for the first ANOVA was the frequency value at the steady state portion of F1 (F1vowel), for the second it was the frequency value at the steady state portion of F2 (F2vowel), for the third it was the frequency value at the offset of F1 (F1offset), and for the fourth it was the frequency value at the offset of F2 (F2offset). A similar set of ANOVAs was conducted for the three vowels in the CV sequences for only 14 consonants for the reasons noted in §4.2.2. The dependent variables for these ANOVA tests were the frequency values at the onsets of F1 (F1onset) and F2 (F2onset) as well as at the steady state portions of F1 and F2. Scheffé post hoc pair-wise comparisons were also conducted to compare the mean formant frequency values next to all consonants.
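For readers who want to see what each one-way ANOVA computes, the sketch below derives the F statistic and its degrees of freedom from grouped data. It is a generic textbook implementation, not the SPSS procedure used in the study, and the data are invented. With 16 consonant groups of 15 tokens each (a group size inferred from the reported degrees of freedom, not stated in this section), the degrees of freedom come out as (15, 224).

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical toy data: three groups of three values.
print(one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```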


Table 4.1. Average formant frequency values (Hz) for the vowel [i] obtained at mid-vowel and transition edge locations in both VC and CV contexts containing all 16 consonants. The two interdentals [ð] and [ðˤ] were not included in the CV contexts.

        VC                                     CV
        F1vowel  F1offset  F2vowel  F2offset   F1vowel  F1onset  F2vowel  F2onset
[t]     318      296       2294     2099       379      307      2386     2285
[tˤ]    386      437       2354     1306       360      436      2310     1408
[d]     308      279       2341     2095       331      290      2293     2088
[dˤ]    339      393       2251     1270       362      403      2200     1243
[ð]     331      325       2264     1807       -        -        -        -
[ðˤ]    354      354       2224     1179       -        -        -        -
[s]     334      326       2286     2102       321      292      2327     2105
[sˤ]    386      355       2364     1578       365      387      2294     1481
[k]     320      284       2267     2319       332      287      2309     2309
[q]     351      472       2261     1538       368      423      2293     1922
[χ]     363      344       2282     1878       369      351      2263     2193
[ʁ]     347      427       2227     1478       350      390      2342     1736
[ħ]     340      441       2261     2120       345      373      2330     2216
[ʕ]     322      547       2257     1834       346      453      2310     2013
[h]     330      317       2342     2358       324      302      2334     2328
[ʔ]     309      324       2335     2260       320      320      2306     2293


Table 4.2. Average formant frequency values (Hz) for the vowel [a] obtained at mid-vowel and transition edge locations in both VC and CV contexts containing all 16 consonants. The two interdentals [ð] and [ðˤ] were not included in the CV contexts.

        VC                                     CV
        F1vowel  F1offset  F2vowel  F2offset   F1vowel  F1onset  F2vowel  F2onset
[t]     730      495       1594     1730       730      510      1594     1761
[tˤ]    722      544       1205     1084       757      615      1179     1074
[d]     712      419       1626     1724       688      423      1679     1800
[dˤ]    725      492       1216     1045       704      506      1094     1029
[ð]     694      440       1586     1566       -        -        -        -
[ðˤ]    725      474       1187     1032       -        -        -        -
[s]     697      533       1631     1719       727      441      1610     1704
[sˤ]    744      680       1139     1185       751      577      1186     1052
[k]     715      566       1572     1860       737      499      1639     1876
[q]     747      639       1442     1196       759      406      1201     775
[χ]     737      750       1511     1446       761      672      1288     1238
[ʁ]     727      510       1497     1310       736      534      1253     1212
[ħ]     745      808       1601     1691       766      782      1577     1621
[ʕ]     733      778       1633     1590       769      803      1594     1606
[h]     712      766       1599     1632       748      696      1559     1641
[ʔ]     706      750       1619     1556       708      767      1608     1619


Table 4.3. Average formant frequency values (Hz) for the vowel [u] obtained at mid-vowel and transition edge locations in both VC and CV contexts containing all 16 consonants. The two interdentals [ð] and [ðˤ] were not included in the CV contexts.

        VC                                     CV
        F1vowel  F1offset  F2vowel  F2offset   F1vowel  F1onset  F2vowel  F2onset
[t]     376      348       823      1564       378      325      960      1310
[tˤ]    413      404       885      963        426      431      812      921
[d]     360      307       935      1642       370      329      920      1713
[dˤ]    413      385       821      863        405      406      848      979
[ð]     365      320       997      1541       -        -        -        -
[ðˤ]    414      355       805      894        -        -        -        -
[s]     425      315       756      1430       403      338      962      1464
[sˤ]    383      389       765      1083       421      388      859      967
[k]     390      341       745      808        402      346      868      937
[q]     386      423       914      747        404      412      823      769
[χ]     394      404       898      857        402      390      806      814
[ʁ]     423      384       759      828        411      414      815      738
[ħ]     405      448       783      1398       396      427      928      1248
[ʕ]     398      504       841      1441       395      460      945      1295
[h]     382      376       750      891        385      340      814      911
[ʔ]     377      398       899      875        382      382      795      793


[Figure 4.2 panels: [Vt], [Vtˤ], [Vd], [Vdˤ], [Vs], [Vsˤ], [Vð], [Vðˤ]; vertical axis: formant frequency (Hz), 0 to 2500; horizontal axis: mid and offset measurement points for [i], [a], and [u]; legend: F1, F2.]

Figure 4.2. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] preceding the four Arabic plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ]. Formant values are averages across speakers. The error bars represent ± one standard deviation.


[Figure 4.3 panels: [Vk], [Vq], [Vχ], [Vʁ], [Vħ], [Vʕ], [Vh], [Vʔ]; vertical axis: formant frequency (Hz), 0 to 2500; horizontal axis: mid and offset measurement points for [i], [a], and [u]; legend: F1, F2.]

Figure 4.3. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] preceding the velar [k], the three uvulars [q, χ, ʁ], the two pharyngeals [ħ, ʕ] and the two laryngeals [h, ʔ]. Formant values are averages across speakers. The error bars represent ± one standard deviation.



4.3.1 Anticipatory (VC) Coarticulation

Figures 4.2 and 4.3 represent simplified formant transition tracks of the vowels [i,

u, a] when preceding the 16 consonants. These tracks are basically interpolation lines

connecting the mean formant frequency values at the middle of the steady state portion

and at the offset of the vowel. Error bars (± one standard deviation) were added at the

data points.

4.3.1.1 Anticipatory Coarticulation in F1

The statistical analysis of variance shows that, for the vowel [i], main effects of consonant type are obtained on both F1vowel [F(15,224) = 16.812, p < 0.001, R² = 0.498] and F1offset [F(15,224) = 34.429, p < 0.001, R² = 0.677]. The subsequent Scheffé pair-wise post hoc comparisons are summarized in Tables 4.4 and 4.5. In terms of F1vowel values, the comparisons yield a small number of significant differences between consonants. F1vowel is generally higher (albeit only slightly in most cases) next to emphatics and uvulars compared to the rest of the sounds. F1vowel values before [tˤ] and [sˤ] are significantly higher than those before their non-emphatic counterparts (p < 0.01), while [dˤ] and [ðˤ] are not significantly different from their non-emphatic counterparts in terms of preceding F1vowel (p > 0.4). F1offset is generally higher next to emphatics and gutturals as opposed to plain oral consonants. Most pair-wise comparisons of F1offset values do not reflect significant differences. There seems to be an association between high F1offset values and the four gutturals [q, ʁ, ħ, ʕ], especially the last of these. Emphatics are also occasionally correlated with high F1offset values. However, this correlation is not consistent across all emphatics. The two laryngeals are not associated with high F1offset values.
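The effect sizes reported alongside each F statistic can be cross-checked from the F value and its degrees of freedom. The sketch below is an inference from the printed numbers rather than something the text states: the reported values are reproduced if they are read as adjusted R² for a one-way design. The function name is hypothetical.

```python
def adjusted_r2_from_f(f_stat, df_between, df_within):
    """Adjusted R-squared implied by a one-way ANOVA F statistic.

    Unadjusted R2 = SS_between / SS_total = F*df1 / (F*df1 + df2);
    the adjustment rescales the unexplained share by (N - 1) / (N - k),
    where N - 1 = df1 + df2 and N - k = df2.
    """
    r2 = f_stat * df_between / (f_stat * df_between + df_within)
    return 1.0 - (1.0 - r2) * (df_between + df_within) / df_within


# F statistics reported for [i] in the VC context.
print(round(adjusted_r2_from_f(16.812, 15, 224), 3))  # 0.498
print(round(adjusted_r2_from_f(34.429, 15, 224), 3))  # 0.677
```

Under this reading, the unadjusted share of variance explained is slightly larger than the printed figure in every case.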


Table 4.4. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1vowel values of the vowel [i] in the context [iC].

[o] [o\'] [t] [t\'] [d] [s] [k] [q] [xJ [K] [h] [1] [h] [7]

[oJ

[t] [t\'] *** [d]

[s] ***

[k] [q] [X] [K] [h] [1] [h] [7]

**

*

*** *** ** ***

*** ***

*

** *** *** ***

-------- -------

*** * *** ***

* *** *

* *** *** *** * ***

Table 4.5. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1offset values of the vowel [i] in the context [iC].

[o] [o\'] [t] [t\'] [d] [s] [k] [q] [X] [K] [h] [1] [h] [7]

[oJ [o\'J [t]

** *** [d] *** [d\'] * *** [s] ** [ s\']

[k] *** ** [q] *** *** *** *** *** *** *** [X] * *** [K] * *** *** * *** [h] *** *** *** *** *** * [1] *** *** *** ** *** *** *** *** *** *** *** ** [h] *** *** ** *** *** [7] ** *** ** *** *** * p < 0.05 ** p < 0.01 *** p < 0.001 - no significant difference



For [a], no main effect of consonant type on F1vowel is obtained [F(15,224) = 1.067, p > 0.3, R² = 0.004]. Accordingly, as shown in Table 4.6, pair-wise comparisons indicate that the values of F1vowel next to all 16 consonants are not significantly different from each other (p > 0.9). By contrast, a main effect on F1offset is obtained [F(15,224) = 34.138, p < 0.001, R² = 0.675]. The subsequent Scheffé pair-wise post hoc comparisons are summarized in Table 4.7. There is a strong association between high F1offset values of [a] and the low gutturals (pharyngeals and laryngeals), the uvular [χ], and the emphatic [sˤ]. F1offset values next to those sounds are significantly higher than those next to most of the other sounds (p < 0.01). The rest of the consonants are not significantly different from each other in terms of preceding F1offset values.

For [u], main effects of consonant type are obtained on both F1vowel [F(15,224) = 7.047, p < 0.001, R² = 0.275] and F1offset [F(15,224) = 15.268, p < 0.001, R² = 0.472]. However, the subsequent Scheffé post hoc comparisons (summarized in Tables 4.8 and 4.9) reveal only a few significant pair-wise differences among the F1vowel and F1offset values next to the 16 consonants. Aside from the alveolar pair [s, sˤ], emphatics are generally accompanied by slightly higher F1vowel values than their non-emphatic counterparts. The differences are not significant for the pairs [t, tˤ] and [ð, ðˤ] (p > 0.15) and only marginally significant for the pair [d, dˤ] (p < 0.1). As for F1offset, the value before the voiceless pharyngeal [ħ] is consistently higher compared to those before plain oral consonants, while the value before [ʕ] is significantly higher than those before most of the other consonants (p < 0.05). Among the three uvulars, only [q] is distinguished from some of the plain orals by a higher F1offset value. Meanwhile, the plain orals [t], [d], [ð], [s], and [k] are accompanied by the lowest F1offset values.
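A significant main effect accompanied by only a few significant pairs is a common outcome with the Scheffé procedure, which is conservative when many groups are compared. A minimal sketch of the pair-wise Scheffé statistic follows. The within-group mean square and the critical value are not reported in this section, so both are hypothetical inputs here, as are the function names.

```python
def scheffe_f(mean_i, mean_j, n_i, n_j, ms_within, k):
    """Scheffé statistic for one pair-wise contrast among k groups.

    The squared mean difference is scaled by its error variance and then
    divided by (k - 1), which is what makes the test conservative.
    """
    return (mean_i - mean_j) ** 2 / (ms_within * (1.0 / n_i + 1.0 / n_j) * (k - 1))


def is_significant(f_scheffe, f_critical):
    """Compare against the critical F(k - 1, N - k), supplied by the caller."""
    return f_scheffe > f_critical


# Hypothetical toy values: two group means, three tokens each, three groups.
stat = scheffe_f(6.0, 2.0, 3, 3, ms_within=1.0, k=3)
print(stat)
```

Because the divisor grows with the number of groups, a mean difference that would pass a single two-group test can fail here when 16 consonants are compared at once.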


Table 4.6. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1vowel values of the vowel [a] in the context [aC].

[o] [<'F] [t] [d] [dl'] [s] [k] [q] [X] [K] [ti.] ['!] [h] [?] [o]

[t]

[d] [dl'] [s]

[k] [q] [X] [B'] [ti.] ['I] [h] [?]

Table 4.7. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1offset values of the vowel [a] in the context [aC].

[o] [t] [tl'] [d] [dl'] [s] [ [k] [q] [X] [B'] [ti.] [)] [h] [?]

[ol

[t] [tl'] [d]

[s] *** *** ** *** **

[k] [q] ** * *** [X] *** *** *** *** *** *** *** ** [B'] * *** [ti.] *** *** *** *** *** *** *** *** * *** ['I] *** *** *** *** *** *** *** *** *** [h] *** *** *** *** *** *** *** ** *** [?] *** *** *** *** *** *** *** ** *** * p < 0.05 ** p < 0.01 *** p < 0.001 - no significant difference


Table 4.8. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1vowel values of the vowel [u] in the context [uC].

[o]

[t]

[d]

[s]

[k] [q] [X]

[h] [1] [h] [7]

[o] [t] [t\'] [d] [d\'] [s] [k] [q] [X] [B"] [n] [1] [h] [7]

*

* **

* **

Table 4.9. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1offset values of the vowel [u] in the context [uC].

[o] [t] [d] [s] [k] [q] [X] [B'] [n] [1] [h] [7]

[o]

[t]

[d] *

[s]

[k] [q] * *** ** [X] * [15] [nl *** * *** *** ** [1] *** *** *** * *** *** *** ** *** * *** [h] *** [7] ** * p < 0.05 ** p < 0.01 *** p < 0.001 -no significant difference



4.3.1.2 Anticipatory Coarticulation in F2

Main effects of consonant type on both F2vowel [F(15,224) = 2.338, p < 0.01, R² = 0.077] and F2offset [F(15,224) = 162.479, p < 0.001, R² = 0.910] are obtained for [i]. However, F2vowel values next to all 16 consonants are not significantly different from each other, as shown by the Scheffé post hoc comparisons in Table 4.10. The distinction between emphatics and non-emphatics is clearly reflected by F2offset values, as shown in Table 4.11. The four emphatics as well as the two uvulars [q] and [ʁ] cause substantial drops in the F2 transitions of [i], resulting in significantly lower F2offset values than the rest of the consonants (p < 0.05). The mid-range F2offset values preceding the two gutturals [χ] and [ʕ] as well as the interdental [ð] distinguish them from most of the remaining consonants, while the values before [ħ] and [ʔ] are in the same range as those before plain orals. The highest F2offset value is the one preceding [h], distinguishing this sound from most of the other sounds.

For [a], significant main effects of consonant type are obtained on both F2vowel [F(15,224) = 144.518, p < 0.001, R² = 0.900] and F2offset [F(15,224) = 146.468, p < 0.001, R² = 0.901]. The subsequent Scheffé post hoc comparisons are summarized in Tables 4.12 and 4.13. The values of F2vowel are significantly lower before the four emphatics [tˤ], [dˤ], [ðˤ], and [sˤ] than before any other consonant (p < 0.001). The two uvulars [χ] and [ʁ] also cause significant lowering of F2vowel that is generally milder than the lowering caused by emphatics. In several cases, these two uvulars are distinguished from non-emphatics and low gutturals. The uvular stop [q] is accompanied by a lower mid-range F2vowel that distinguishes it from all other sounds except the other uvulars. Plain orals, pharyngeals, and laryngeals are mostly not distinctly different from each other in terms of the


Table 4.10. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2vowel values of the vowel [i] in the context [iC].

[o] [01'] [t] [t\'] [d] [s] [sl'] [k] [q] [X] [B"] [h] [CJ] [h] [?]

[oJ

[t l

[d]

[s]

[k] [q] [X]

[h] [CJ] [h] [?]

Table 4.11. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2offset values of the vowel [i] in the context [iC].

[oJ [01'] [t l [t\'] [d] [s] [ [k] [q] [X] [B' l [h] [CJ] [h] [?] [oJ [()I'] *** [t l *** *** [t\'] *** *** [d] *** *** ***

*** *** *** [s] *** *** *** *** [s\'] * *** *** *** *** *** *** [k] *** *** *** * *** *** [q] ** *** *** * *** ** *** *** [X] *** * *** *** * *** *** *** [B'] *** *** *** *** *** *** *** [h] *** *** *** *** *** *** * *** ['l] *** ** *** ** *** ** ** *** *** *** *** [h] *** *** ** *** ** *** ** *** *** *** *** * *** [?] *** *** *** *** *** *** *** *** *** * p < 0.05 ** p < 0.01 *** p < 0.001 - no significant difference


Table 4.12. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2vowel values of the vowel [a] in the context [aC].

[oJ [o\'J [t] [dJ [di'J [sJ [kl [qJ [xl [H'] [hJ ['i'J [hJ [?J

[oJ [oi'J *** [t l *** [tl'] *** *** [d] *** ***

-- -------------------------

[dl'] *** *** *** [s] *** *** *** [sl'] *** *** *** *** [k] *** *** *** *** [q] *** *** *** *** *** *** *** *** ** [X] *** *** * *** * *** [H'] *** *** ** *** *** *** [h] *** *** *** *** *** ['I] *** *** *** *** *** ** *** [h] *** *** *** *** *** [?] *** *** *** *** *** **

Table 4.13. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2offset values of the vowel [a] in the context [aC].

[o] WJ [t] [d] [s] [si·J [k] [q] [Xl [H"] [h] ['i'] [h] [?]

[oJ [oi'J *** [t] ***

*** *** [d] *** ***

*** *** *** [s] *** *** ***

*** *** *** *** [k] *** *** *** *** *** [q] *** * *** *** *** *** [X] *** *** *** *** *** *** *** *** *** [K] *** *** *** *** *** *** *** *** [h] *** *** *** *** * *** *** *** [l] *** *** *** *** *** *** *** [h] *** *** *** *** *** *** ** *** [?] *** * *** * *** *** *** *** *** * p < 0.05 **p<0.01 *** p < 0.001 - no significant difference



F2vowel value of preceding [a]. F2offset values accompanying the four emphatics as well as the uvular [q] are significantly lower than those accompanying almost all of the remaining consonants (p < 0.001). The values of F2offset before [χ] and [ʁ] are not as low, but are still significantly lower when compared to several other consonants. These mid-range F2offset values distinguish the uvular fricatives from the majority of the remaining sounds. The plain orals [t, d, ð, s, k] and the lower gutturals [ħ, ʕ, h, ʔ] are accompanied by the highest F2offset values. These two classes are mostly not distinct from each other.

Significant main effects of consonant type on both F2vowel [F(15,224) = 15.134, p < 0.001, R² = 0.470] and F2offset [F(15,224) = 146.124, p < 0.001, R² = 0.901] are also obtained for [u]. The post hoc comparisons, summarized in Tables 4.14 and 4.15, show that the values of F2vowel establish only a few significant differences between the consonants. Before the emphatic [ðˤ], F2vowel is significantly lower compared to that before [ð] (p < 0.001), but this is mainly because [ð] is preceded by the highest F2vowel among all 16 consonants. All other emphatic/non-emphatic pairs are not significantly different from each other, even though the stop [d] is also preceded by a substantially high F2vowel. The picture is different for F2offset values. Here, the values before all emphatics are significantly lower than those before the non-emphatic counterparts (p < 0.001). Emphatics, uvulars, laryngeals, and the velar [k] are accompanied by lower F2offset values than the remaining sounds. Among that group, the three uvulars and [k] are associated with the lowest F2offset values. Plain coronals and pharyngeals are accompanied by the highest F2offset values and are mostly not distinct from each other.


Table 4.14. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2vowel values of the vowel [u] in the context [uC].

[oJ [cFJ [tJ [dJ [sJ [kJ [qJ [Xl [K] [nJ ['i'J [hJ [?J [oJ

[t] ***

[d] ***

[s] *** *** *** **

[k] *** *** [q] ** * ** [X] * [K] *** *** * [nJ *** * ['i'] * [h] *** *** ** * [?] * * *

Table 4.15. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2offset values of the vowel [u] in the context [uC].

[oJ Wl [tJ [dJ [di'J [s] [kJ [qJ [Xl [H] [nJ ['i'J [hJ [?J [oJ [61'] *** [t l *** [tl'] *** *** [d] *** ***

*** *** *** [s] *** *** * ***

*** *** *** ** *** [k] *** *** *** *** *** [q] *** *** ** *** *** *** [X] *** *** *** *** ** [K] *** *** *** *** *** [nJ *** *** *** *** *** *** *** *** *** ['I] *** *** * *** *** *** *** *** *** [h] *** *** *** *** * *** *** [7] *** *** *** *** * *** *** * p < 0.05 ** p < 0.01 *** p < 0.001 -no significant difference



In sum, F1vowel values make very few emphatic/non-emphatic distinctions. These values do not distinguish gutturals as a class from non-gutturals. High F1offset values are consistently associated with pharyngeals. High F1offset values are also occasionally associated with emphatics and uvulars. This association, however, is not as consistent as the association with pharyngeals. Laryngeals are associated with high F1offset values in the vowel [a] alone. F2vowel distinguishes emphatics from their non-emphatic counterparts in the vowel [a] alone. F2offset, meanwhile, is very capable of distinguishing emphatics from non-emphatics. Uvulars are also associated with low F2offset values. These values are either low or in the mid-range in [i] and [a]. In [u], F2offset values preceding uvulars are lower than those preceding all other sounds, including emphatics. Pharyngeals are preceded by F2offset values that are slightly lower than, or in the same range as, those preceding plain coronals. F2offset values preceding laryngeals depend on the vowel, since these sounds have no particular coarticulatory effects on adjacent vowels.

4.3.1.3 Anticipatory Coarticulation - Discriminant Analysis

For the most part, the previous two sections indicate that the values of F1 and F2 at the vowel steady state portions reflect only minor distinctions among consonants. By contrast, F1 and F2 transitions reflect some systematic distinctions. To assess the capabilities of F1offset and F2offset as acoustic coarticulatory metrics to categorize the consonant classes being investigated, a number of discriminant analysis tests were performed in which these values were used as predictors. Rather than looking at the 16 consonants individually, the categorization tasks were done for classes of sounds. The four emphatics were grouped together and so were the four non-emphatic coronals, the three uvulars, and the two pharyngeals. The velar [k] and the two laryngeals were excluded from these tests since their F1offset and F2offset values are highly dependent on the vowel type.
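With a single predictor, equal prior probabilities, and a common within-class variance, linear discriminant classification reduces to assigning a token to the class whose mean is nearest. The sketch below illustrates that simplified rule with the F2offset means for [a] from Table 4.2. It is not the SPSS analysis reported here: the equal-prior assumption, the use of per-consonant means in place of token data, and the function name are all assumptions made for illustration.

```python
# Per-consonant mean F2offset values (Hz) for [a] in the VC context (Table 4.2).
F2_OFFSET_A = {
    "emphatic": [1084, 1045, 1032, 1185],
    "plain coronal": [1730, 1724, 1566, 1719],
    "uvular": [1196, 1446, 1310],
    "pharyngeal": [1691, 1590],
}

# Class mean taken as the mean of the consonant means (equal token counts assumed).
CLASS_MEANS = {name: sum(v) / len(v) for name, v in F2_OFFSET_A.items()}


def classify(f2_offset_hz, class_means=CLASS_MEANS):
    """Assign a token to the class with the nearest mean (1-D, equal priors)."""
    return min(class_means, key=lambda name: abs(class_means[name] - f2_offset_hz))


for value in (1100, 1300, 1620, 1700):
    print(value, classify(value))
```

The plain coronal and pharyngeal means lie within about 45 Hz of each other on this measure, whereas the emphatic mean is roughly 600 Hz below the plain coronal mean.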

Table 4.16. Discriminant analysis results (% of tokens) for the four classes of Arabic sounds, emphatics, plain coronals, pharyngeals, and uvulars based on the values of F1 transitions in VC contexts.

Vowel  Original  Emph.  Phar.  Plain  Uvu.   Total
[i]    Emph.     41.7   11.7   21.7   25.0   100.0
       Phar.     13.3   73.3   3.3    10.0   100.0
       Plain     21.7   0.0    78.3   0.0    100.0
       Uvu.      28.9   28.9   15.6   26.7   100.0
       Overall: 54.4%
[a]    Emph.     25.0   10.0   48.3   16.7   100.0
       Phar.     0.0    83.3   0.0    16.7   100.0
       Plain     13.3   0.0    70.0   16.7   100.0
       Uvu.      24.4   26.7   15.6   33.3   100.0
       Overall: 49.7%
[u]    Emph.     26.7   10.0   25.0   38.3   100.0
       Phar.     6.7    66.7   6.7    20.0   100.0
       Plain     25.0   0.0    71.7   3.3    100.0
       Uvu.      11.1   26.7   20.0   42.2   100.0
       Overall: 50.3%

Results of the first three discriminant analysis tests are summarized in Table 4.16. In each test, F1offset of one of the three individual vowels serves as the predictor. The overall correct classification rates predicted by F1offset are not higher than 54.4%. The results show that only pharyngeals and plain coronals are relatively well categorized in the three vowel environments. F1offset values before these two classes of sounds are at the two ends of the scale: pharyngeals are preceded by the highest values while plain coronals are preceded by the lowest. The high classification rate of plain coronals is especially notable given that the analysis of variance tests reported earlier do not particularly distinguish these sounds based on the F1offset values preceding them. Emphatics and uvulars are frequently misclassified as each other. This is due to the mid-range F1offset values of these sounds. It appears that the mid-range values preceding emphatics are not distinct enough from those preceding their plain counterparts. For this reason, emphatics are also frequently misclassified as plain coronals. Meanwhile, emphatics are not frequently misclassified as pharyngeals. Based on F1offset values, uvulars are misclassified as pharyngeals above the 25% chance level in all three vowels. The most likely cause for this is the rather consistently high F1offset values preceding the uvular stop [q].
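The overall rates in Table 4.16 are not simple averages of the four per-class rates, because the classes differ in size. The sketch below recovers them as size-weighted means. The class sizes are an assumption: 15 tokens per consonant and vowel is inferred from the token counts and degrees of freedom reported earlier, not stated in the table, and the names are hypothetical.

```python
# Assumed tokens per class: 15 tokens x number of consonants in the class.
CLASS_SIZES = {"Emph.": 60, "Phar.": 30, "Plain": 60, "Uvu.": 45}

# Correct classification rates (%) from the diagonal of Table 4.16.
CORRECT_RATES_F1 = {
    "i": {"Emph.": 41.7, "Phar.": 73.3, "Plain": 78.3, "Uvu.": 26.7},
    "a": {"Emph.": 25.0, "Phar.": 83.3, "Plain": 70.0, "Uvu.": 33.3},
    "u": {"Emph.": 26.7, "Phar.": 66.7, "Plain": 71.7, "Uvu.": 42.2},
}


def overall_rate(rates, sizes=CLASS_SIZES):
    """Size-weighted mean of the per-class correct classification rates (%)."""
    total = sum(sizes.values())
    return sum(rates[name] * sizes[name] for name in sizes) / total


for vowel, rates in CORRECT_RATES_F1.items():
    print(vowel, round(overall_rate(rates), 1))
```

Under these assumed class sizes the weighted means round to 54.4%, 49.7%, and 50.3%, matching the overall rates given in the table.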

Table 4.17. Discriminant analysis results (% of tokens) for the four classes of Arabic sounds, emphatics, plain coronals, pharyngeals, and uvulars based on the values of F2 transitions in VC contexts.

Vowel  Original  Emph.  Phar.  Plain  Uvu.   Total
[i]    Emph.     75.0   3.3    0.0    21.7   100.0
       Phar.     0.0    33.3   46.7   20.0   100.0
       Plain     0.0    26.7   61.7   11.7   100.0
       Uvu.      26.7   8.9    11.1   53.3   100.0
       Overall: 59.5%
[a]    Emph.     83.3   0.0    0.0    16.7   100.0
       Phar.     0.0    50.0   43.3   6.7    100.0
       Plain     0.0    43.3   55.0   1.7    100.0
       Uvu.      20.0   11.1   0.0    68.9   100.0
       Overall: 66.2%
[u]    Emph.     60.0   6.7    0.0    33.3   100.0
       Phar.     0.0    63.3   36.7   0.0    100.0
       Plain     0.0    33.3   66.7   0.0    100.0
       Uvu.      24.4   0.0    0.0    75.6   100.0
       Overall: 66.2%

Table 4.17 shows the classification results of the three discriminant analysis tests in which F2offset values of the three individual vowels serve as the predictors. The overall classification rates, which range between 59.5% and 66.2%, are noticeably higher than those obtained by F1offset values. After [i] and [a], the classification rates of emphatics (75% and 83.3%, respectively) are substantially higher than the classification rates of other sounds. This is a predictable outcome given that emphatics are consistently preceded by very low F2offset values in comparison with the other sounds. After [u], emphatics are accurately classified in 60% of cases. Notably, in all three vowel contexts, there were no cases of misclassification of emphatics as plain coronals or vice versa. Uvulars are also relatively well classified after [a] and, more evidently, after [u]. Recall from the previous section that F2offset values of [u] were lower before uvulars than even those before emphatics. It seems that this association is strong enough to profile uvulars as a class next to [u]. Overall, emphatics and uvulars are frequently misclassified as each other. This follows from the fact that F2offset values before these two classes are lower than those before plain orals and pharyngeals. The latter two classes are also frequently misclassified as each other.

The analysis of variance and discriminant function analysis tests show that F1offset cannot classify emphatics as a class of sounds distinct from their non-emphatic counterparts (plain coronals). F2offset, meanwhile, is shown to be capable of accurately classifying the two classes. To verify these findings, another discriminant analysis test was conducted. In this test, which involved only the classes of emphatics and plain coronals, F1offset and F2offset in all three vowels were combined and used together as predictors. The test shows a very high overall accurate classification rate of 93.1%. Inspecting the standardized canonical discriminant function coefficients reveals that this high rate is largely contributed by F2offset. Notice that when F2offset values were used alone earlier, there were no cases of misclassification between emphatics and plain coronals. Adding F1offset values as a predictor actually lowers the classification rate slightly.

The results show that F2offset is, by far, the most solid acoustic cue for the secondary articulation in emphatics in the VC context. Emphatics are consistently and reliably associated with low F2offset values in preceding vowels. F1offset does not have any significant role in distinguishing emphatic sounds from plain coronals. Overall, pharyngeals are preceded by high F1offset values. For the most part, uvulars are associated with mid-range values for both F1offset and F2offset in preceding vowels. There are two exceptions to this trend. First, the uvular stop [q] is associated with F1offset values that are as high as those preceding pharyngeals. Second, after [u], all uvulars are associated with F2offset values that are even lower than those preceding emphatics.

4.3.2 Carryover (CV) Coarticulation in F1

Figures 4.4 and 4.5 represent simplified formant transition tracks of the vowels [i,

u, a] when following the 14 consonants covered in this portion of the study.


[Figure 4.4 panels: [tV], [dV], [sV] and their emphatic counterparts; vertical axis: formant frequency (Hz), 0 to 2500; horizontal axis: onset and mid measurement points for [i], [a], and [u]; legend: F1, F2.]

Figure 4.4. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] following the three Arabic plain coronals [t, d, s] and their emphatic counterparts [tˤ, dˤ, sˤ]. Formant values are averages across speakers. The error bars represent ± one standard deviation.

[Figure 4.5 panels: [kV], [qV], [χV], [ʁV], [ħV], [ʕV], [hV], [ʔV]; vertical axis: formant frequency (Hz), 0 to 2500; horizontal axis: onset and mid measurement points for [i], [a], and [u]; legend: F1, F2.]

Figure 4.5. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] following the velar [k], the three uvulars [q, χ, ʁ], the two pharyngeals [ħ, ʕ] and the two laryngeals [h, ʔ]. Formant values are averages across speakers. The error bars represent ± one standard deviation.



4.3.2.1 Carryover Coarticulation in F1

A main effect of consonant type on F1onset [F(13,196) = 25.734, p < 0.001, R² = 0.606] is obtained for the vowel [i]. The subsequent Scheffé post hoc comparisons (summarized in Table 4.18) indicate that, in most cases, F1onset values of the vowel [i] are able to distinguish the three emphatics and the gutturals [q], [ʁ], [ħ], and [ʕ] from the remaining sounds. The values of F1onset following all three emphatics are significantly higher in comparison with their non-emphatic counterparts (p < 0.01). The highest F1onset value is the one following the pharyngeal [ʕ], distinguishing it from all other sounds, including its voiceless counterpart [ħ]. The uvular [χ] has a mid-range F1onset value that distinguishes it from only [tˤ] and [ʕ]. While the analysis of variance indicates that there is also a main effect of consonant type on F1vowel [F(13,196) = 5.897, p < 0.001, R² = 0.233] of the vowel [i], the post hoc comparisons (Table 4.19) reveal that all F1vowel values are very close to each other. The only significant pair-wise differences are that [t] was higher than both [s] and [ʔ] (p < 0.05).
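As a rough consistency check on the statistics reported above, the following Python sketch recovers the proportion of explained variance from a reported F ratio and its degrees of freedom, then applies the standard adjustment for the number of predictors. The function names are illustrative only and are not part of the original analysis. That the reported R² values are adjusted values is an inference from this arithmetic, not something the text states.

```python
def r_squared_from_f(f_value, df_effect, df_error):
    """Proportion of variance explained implied by a one-way ANOVA F ratio."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def adjusted_r_squared(r2, df_effect, df_error):
    """Standard adjusted R-squared; total df = df_effect + df_error."""
    df_total = df_effect + df_error
    return 1.0 - (1.0 - r2) * df_total / df_error


# F1onset of [i]: F(13,196) = 25.734, reported R2 = 0.606
r2 = r_squared_from_f(25.734, 13, 196)
adj = adjusted_r_squared(r2, 13, 196)
print(round(r2, 3), round(adj, 3))
```

Under these assumptions, the adjusted value for F(13,196) = 25.734 comes out very close to the reported 0.606, and the same holds for the F1vowel result (F = 5.897, reported 0.233).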

For the vowel [a], main effects of consonant type on both F1onset [F(13,196) = 71.665, p < 0.001, R² = 0.815] and F1vowel [F(13,196) = 2.669, p < 0.001, R² = 0.094] are obtained. The Scheffé post hoc comparisons (Table 4.20) show that F1onset values in [a] are capable of distinguishing the low gutturals (pharyngeals and laryngeals) from the remaining sounds in most cases. F1onset values following the guttural sounds, except for [q] and [ʁ], are significantly higher compared to most of the consonants under investigation. The highest values occur after [ħ], [ʕ], and [ʔ]. The three emphatics are generally accompanied by F1onset values that are higher than those accompanying their plain counterparts.

While [sˤ] is followed by an F1onset value that is significantly higher than after [s], F1onset after [tˤ] is only marginally higher than after [t] (p < 0.1) and the difference is not significant for the pair [d, dˤ]. At the steady state portion of [a], no significant differences are detected among the 14 consonants in terms of F1vowel values (Table 4.21).

Table 4.18. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1onset values of the vowel [i] in the context [Ci].

[t] [tˤ] [d] [dˤ] [s] [sˤ] [k] [q] [χ] [ʁ] [ħ] [ʕ] [h] [ʔ]

[tˤ] ***
[d] *** ** ***
[s] *** ***
[sˤ] * ** **
[k] *** *** ***
[q] *** *** *** ***
[χ] *
[ʁ] * ** *** ***
[ħ] * * *
[ʕ] *** *** *** *** *** *
[h] *** *** ** *** ** ***
[ʔ] *** * *** ***

Table 4.19. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1vowel values of the vowel [i] in the context [Ci].

[t] [tˤ] [d] [dˤ] [s] [sˤ] [k] [q] [χ] [ʁ] [ħ] [ʕ] [h] [ʔ]

[tˤ]
[d]
[s] *
[k]
[q]
[χ]
[ʁ]
[ħ]
[ʕ]
[h]
[ʔ] *

* p < 0.05   ** p < 0.01   *** p < 0.001   - no significant difference

Table 4.20. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1onset values of the vowel [a] in the context [Ca].

[t] [tˤ] [d] [dˤ] [s] [sˤ] [k] [q] [χ] [ʁ] [ħ] [ʕ] [h] [ʔ]

[t]
[tˤ]
[d] ***
[dˤ] *
[s] ***
[sˤ] *** ***
[k] *
[q] *** ***
[χ] *** *** *** *** *** ***
[ʁ] * ** ***
[ħ] *** *** *** *** *** *** *** *** * ***
[ʕ] *** *** *** *** *** *** *** *** ** ***
[h] *** *** *** *** * *** *** ***
[ʔ] *** *** *** *** *** *** *** *** ***

Table 4.21. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1vowel values of the vowel [a] in the context [Ca].

[t] [tˤ] [d] [dˤ] [s] [sˤ] [k] [q] [χ] [ʁ] [ħ] [ʕ] [h] [ʔ]

[t]
[tˤ]
[d]
[dˤ]
[s]
[sˤ]
[k]
[q]
[χ]
[ʁ]
[ħ]
[ʕ]
[h]
[ʔ]

* p < 0.05   ** p < 0.01   *** p < 0.001   - no significant difference

There are main effects of consonant type on both F1onset [F(13,196) = 16.203, p < 0.001, R² = 0.486] and F1vowel [F(13,196) = 5.304, p < 0.001, R² = 0.211] for the vowel [u]. The subsequent post hoc comparisons (Table 4.22) reveal relatively few significant pair-wise differences achieved by F1onset. The comparisons indicate that F1onset is able to distinguish a subset consisting of the emphatic [tˤ] and the gutturals [q], [ʁ], [ħ], and [ʕ]. Those sounds are followed by higher F1onset values than some of the remaining sounds. F1onset values after the two emphatics [tˤ] and [dˤ] are significantly higher when compared to the values after [t] and [d] (p < 0.05), while [sˤ] is followed by an F1onset value that is not significantly higher than the one following [s]. The highest F1onset value follows [ʕ], while the plain orals along with [h] are followed by the lowest values. As for F1vowel values, most pair-wise post hoc comparisons reflect no significant differences (Table 4.23).



Table 4.22. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1onset values of the vowel [u] in the context [Cu].

[t] [tˤ] [d] [dˤ] [s] [sˤ] [k] [q] [χ] [ʁ] [ħ] [ʕ] [h] [ʔ]

[t]
[tˤ] ***
[d] ***
[dˤ] * *
[s] ***
[k] **
[q] ** ** *
[χ]
[ʁ] ** ** *
[ħ] *** *** ** **
[ʕ] *** *** *** ***
[h] *** * ** ***
[ʔ] *

Table 4.23. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1vowel values of the vowel [u] in the context [Cu].

[t] [tˤ] [d] [dˤ] [s] [sˤ] [k] [q] [χ] [ʁ] [ħ] [ʕ] [h] [ʔ]

[t]
[tˤ] *
[d] **
[dˤ]
[s]
[sˤ] *
[k]
[q]
[χ]
[ʁ]
[ħ]
[ʕ]
[h]
[ʔ]

* p < 0.05   ** p < 0.01   *** p < 0.001   - no significant difference



4.3.2.2 Carryover Coarticulation in F2

For the vowel [i], main effects of consonant type on both F2onset [F(13,196) = 81.183, p < 0.001, R² = 0.833] and F2vowel [F(13,196) = 2.616, p < 0.01, R² = 0.091] are obtained. The Scheffé post hoc comparisons (summarized in Table 4.24) reveal that F2onset is capable of distinguishing the three emphatics. F2onset values following the three emphatics are significantly lower than the ones accompanying the majority of the remaining sounds. The uvular [ʁ] is followed by a mid-range F2onset value that distinguishes it from all other sounds except [sˤ] and [q]. The same is true, but to a lesser extent, for [q], which is accompanied by a mean F2onset value that falls between those accompanying plain coronals and the one accompanying [ʁ]. For the most part, the remaining consonants are not significantly different from each other. Almost none of the F2vowel values are significantly distinct from each other (Table 4.25).

There are main effects of consonant type on both F2onset [F(13,196) = 236.833, p < 0.001, R² = 0.936] and F2vowel [F(13,196) = 91.173, p < 0.001, R² = 0.849] for the vowel [a]. The differences between emphatics and the remaining sounds are reflected by the Scheffé post hoc comparisons of F2onset values, summed up in Table 4.26. The three emphatics are followed by significantly lower F2onset values than those that follow the other sounds (p < 0.001). The two uvulars [χ, ʁ] are distinguished from almost all other sounds by the mid-range F2onset values that follow them. The uvular stop [q] is followed by the lowest F2onset value among all the sounds investigated here. On the other end of the scale, the five plain orals [t], [d], [ð], [s], and [k] are followed by the highest F2onset values. As is the case at the onset of the vowel, the three emphatics and the three uvulars are associated with F2vowel values (Table 4.27) that are significantly lower than the remaining



Table 4.24. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2onset values of the vowel [i] in the context [Ci].

[t] [tˤ] [d] [dˤ] [s] [sˤ] [k] [q] [χ] [ʁ] [ħ] [ʕ] [h] [ʔ]

[t]
[tˤ] ***
[d] ***
[dˤ] *** ***
[s] *** ***
[sˤ] *** *** ***
[k] *** *** ***
[q] *** *** *** *** ***
[χ] *** *** ***
[ʁ] *** ** *** *** *** *** ***
[ħ] *** *** *** * ***
[ʕ] *** *** *** * *
[h] *** *** *** *** *** **
[ʔ] *** *** *** *** *** *

Table 4.25. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2vowel values of the vowel [i] in the context [Ci].

[t] [tˤ] [d] [dˤ] [s] [sˤ] [k] [q] [χ] [ʁ] [ħ] [ʕ] [h] [ʔ]

[t]
[d]
* [s]
[k]
[q]
[χ]
[ʁ]
[ħ]
[ʕ]
[h]
[ʔ]

* p < 0.05   ** p < 0.01   *** p < 0.001   - no significant difference



Table 4.26. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2onset values of the vowel [a] in the context [Ca].

[t] [tˤ] [d] [dˤ] [s] [sˤ] [k] [q] [χ] [ʁ] [ħ] [ʕ] [h] [ʔ]

[t]
[tˤ] ***
[d] ***
[dˤ] *** ***
[s] *** ***
[sˤ] *** *** ***
[k] *** *** * ***
[q] *** *** *** *** *** *** ***
[χ] *** * *** *** *** ** *** ***
[ʁ] *** *** ** *** * *** ***
[ħ] *** ** *** *** *** *** *** ***
[ʕ] *** *** *** *** *** *** *** ***
[h] *** * *** *** *** *** *** ***
[ʔ] *** ** *** *** *** *** *** ***

Table 4.27. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2vowel values of the vowel [a] in the context [Ca].

[t] [tˤ] [d] [dˤ] [s] [sˤ] [k] [q] [χ] [ʁ] [ħ] [ʕ] [h] [ʔ]

[t]
[tˤ] ***
[d] ***
[dˤ] *** ***
[s] *** ***
[sˤ] *** *** ***
[k] *** *** ***
[q] *** *** *** ***
[χ] *** *** *** *** ***
[ʁ] *** *** * *** ***
[ħ] *** *** *** *** *** ***
[ʕ] *** *** *** *** *** ***
[h] *** *** *** *** *** ***
[ʔ] *** *** *** *** *** ***

* p < 0.05   ** p < 0.01   *** p < 0.001   - no significant difference


sounds (p < 0.001). These two classes are mostly not distinct from each other. Plain orals

and lower gutturals are also not distinct from each other.

For [u], main effects of consonant type on both F2onset [F(13,196) = 130.082, p < 0.001, R² = 0.889] and F2vowel [F(13,196) = 13.357, p < 0.001, R² = 0.435] are obtained. Here, too, F2onset is able to distinguish emphatics from their plain counterparts, as reflected by the post hoc comparisons (Table 4.28). The values of F2onset following the three emphatics, the three uvulars, the two laryngeals, and the velar [k] are significantly lower than those following the three plain coronals and the two pharyngeals (p < 0.001). The lowest F2onset values are those that follow the two uvulars [q, ʁ], which are, in some cases, even significantly lower than the values following emphatics. Few pair-wise distinctions are achieved by F2vowel values (Table 4.29). The only emphatic/non-emphatic distinction is that [tˤ] was followed by a significantly lower F2vowel value than [t] (p < 0.01), while the values after the remaining emphatics are only slightly lower than those after their non-emphatic counterparts. Also, the values following the three uvulars and the two laryngeals are lower than those following the two pharyngeals, but mostly not significantly so.

To sum up, F1onset values in vowels following emphatics are inconsistently higher than those that follow plain coronals. Similarly, uvulars, when compared to plain coronals, are followed by inconsistently higher F1onset values. The only class of sounds that is consistently accompanied by significantly high F1onset values is the pharyngeals. Emphatics are consistently associated with very low F2onset values. The same is true for uvulars. However, before [i] and [a], F2onset values following uvulars are mostly not as low as those following emphatics. Before [u], uvulars are followed by the lowest F2onset values among all 14



Table 4.28. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2onset values of the vowel [u] in the context [Cu].

[t] [tˤ] [d] [dˤ] [s] [sˤ] [k] [q] [χ] [ʁ] [ħ] [ʕ] [h] [ʔ]

[t]
[tˤ] ***
[d] *** ***
[dˤ] *** ***
[s] *** *** ***
[sˤ] *** *** ***
[k] *** *** ***
[q] *** *** ** *** **
[χ] *** *** ***
[ʁ] *** * *** *** *** *** *
[ħ] *** *** *** *** *** *** *** ***
[ʕ] *** *** *** *** *** *** *** ***
[h] *** *** *** *** ***
[ʔ] *** *** * *** *** ***

Table 4.29. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2vowel values of the vowel [u] in the context [Cu].

[t] [tˤ] [d] [dˤ] [s] [sˤ] [k] [q] [χ] [ʁ] [ħ] [ʕ] [h] [ʔ]

[t]
[tˤ] **
[d]
[s] ***
[k]
[q] ** **
[χ] *** ***
[ʁ] ** **
[ħ] * *
[ʕ] ** * ** **
[h] ** ** **
[ʔ] *** * *** ** ***

* p < 0.05   ** p < 0.01   *** p < 0.001   - no significant difference



sounds. In terms of F2onset values, pharyngeals and plain coronals are quite similar to each

other. Laryngeals do not have any specific coarticulatory effects on neighboring vowels.

4.3.2.3 Carryover Coarticulation - Discriminant Analysis

To weigh the capabilities of F1onset and F2onset in CV sequences to categorize the consonant classes under investigation, a number of discriminant analysis tests were performed in which these values were used as predictors. As with the previous discriminant analysis test in §4.3.1.3, the categorization tasks were done for four classes of sounds: emphatics, plain coronals, uvulars, and pharyngeals. The velar [k] and the two laryngeals were excluded from these tests since the F1onset and F2onset values following them are highly dependent on the vowel type.
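To make the idea of classifying sound classes from a single formant predictor concrete, the following Python sketch implements a textbook one-predictor linear discriminant with a pooled within-class variance. It is only an illustration: the text does not say which software or discriminant variant was used, the function names are invented here, and the training values are hypothetical numbers, not measurements from this study.

```python
import math
from statistics import mean


def fit_lda_1d(samples):
    """samples: dict mapping a class label to a list of predictor values (Hz)."""
    n_total = sum(len(values) for values in samples.values())
    means = {label: mean(values) for label, values in samples.items()}
    # Pooled within-class variance (equal-variance assumption of linear DA).
    ss_within = sum(
        (x - means[label]) ** 2 for label, values in samples.items() for x in values
    )
    pooled_var = ss_within / (n_total - len(samples))
    priors = {label: len(values) / n_total for label, values in samples.items()}
    return means, pooled_var, priors


def classify(x, means, pooled_var, priors):
    """Return the class whose linear discriminant score is highest for x."""
    def score(label):
        m = means[label]
        return x * m / pooled_var - m ** 2 / (2 * pooled_var) + math.log(priors[label])
    return max(means, key=score)


# Hypothetical F2onset values (Hz), invented for illustration only.
training = {
    "emphatic": [1050, 1100, 1150, 1120],
    "plain": [1650, 1700, 1750, 1720],
}
model = fit_lda_1d(training)
print(classify(1200, *model))
```

With equal class sizes, the decision boundary in this sketch falls midway between the two class means, so a token's predicted class depends only on which mean its predictor value is closer to.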

Results of the first three discriminant analysis tests are summarized in Table 4.30. In each test, F1onset of one of the three individual vowels serves as the only predictor. The overall correct classification rates predicted by F1onset are not higher than 56.4%. Plain coronals are the only set of sounds that are relatively well classified in all three vowel contexts. This is because these sounds are generally followed by low F1onset values. The high F1onset values that follow pharyngeals cause them to be accurately classified before [a] and, to a lesser extent, before [u]. Before [i], pharyngeals are confused with uvulars in 30% of their actual occurrences. Emphatics and uvulars are never well classified as sound classes. Their mid-range values do not make them stand out among the sound classes investigated. In general, there are numerous instances where emphatics and uvulars are confused with pharyngeals, with plain coronals, or with each other.


Table 4.30. Discriminant analysis results for the four classes of Arabic sounds (emphatics, plain coronals, pharyngeals, and uvulars) based on the values of F1 transitions in CV contexts.

Vowel  Original  Emph.  Phar.  Plain.  Uvu.   Total
[i]    Emph.       6.7   46.7    15.6  31.1   100.0
       Phar.      16.7   46.7     6.7  30.0   100.0
       Plain.      0.0    0.0    95.6   4.4   100.0
       Uvu.       11.1   24.4    15.6  48.9   100.0
       X = 49.7%
[a]    Emph.      51.1    6.7    20.0  22.2   100.0
       Phar.       6.7   93.3     0.0   0.0   100.0
       Plain.      6.7    0.0    73.3  20.0   100.0
       Uvu.       28.9   15.6    35.6  20.0   100.0
       X = 56.4%
[u]    Emph.       4.4   40.0    13.3  42.2   100.0
       Phar.       6.7   70.0     6.7  16.7   100.0
       Plain.      0.0    0.0    77.8  22.2   100.0
       Uvu.       24.4   22.2     6.7  46.7   100.0
       X = 47.9%
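The overall rates in Table 4.30 can be recovered from the per-class rates once class sizes are taken into account. The Python sketch below does this under one loudly stated assumption: 15 tokens per consonant per vowel, inferred from the ANOVA degrees of freedom F(13,196) (14 consonants, 210 observations), which gives 45 tokens for each three-consonant class and 30 for the two pharyngeals. The variable and function names are illustrative and not part of the original analysis.

```python
# Per-class correct-classification percentages (the diagonal of Table 4.30).
diagonal_pct = {
    "[i]": {"Emph.": 6.7, "Phar.": 46.7, "Plain.": 95.6, "Uvu.": 48.9},
    "[a]": {"Emph.": 51.1, "Phar.": 93.3, "Plain.": 73.3, "Uvu.": 20.0},
    "[u]": {"Emph.": 4.4, "Phar.": 70.0, "Plain.": 77.8, "Uvu.": 46.7},
}

# ASSUMPTION: 15 tokens per consonant per vowel (inferred, not stated in the text).
class_sizes = {"Emph.": 45, "Phar.": 30, "Plain.": 45, "Uvu.": 45}


def overall_rate(pct_by_class, sizes):
    """Convert rounded percentages back to token counts, then pool them."""
    correct = sum(round(pct_by_class[c] * sizes[c] / 100) for c in sizes)
    return 100 * correct / sum(sizes.values())


rates = {v: round(overall_rate(p, class_sizes), 1) for v, p in diagonal_pct.items()}
print(rates)
```

Under the stated assumption, the pooled rates agree with the overall values given in the table, which suggests that the overall figure is a token-weighted rate rather than a simple mean of the four class rates.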

Table 4.31. Discriminant analysis results for the four classes of Arabic sounds (emphatics, plain coronals, pharyngeals, and uvulars) based on the values of F2 transitions in CV contexts.

Vowel  Original  Emph.  Phar.  Plain.  Uvu.   Total
[i]    Emph.      91.1    0.0     0.0   8.9   100.0
       Phar.       0.0   16.7    50.0  33.3   100.0
       Plain.      0.0   35.6    46.7  17.8   100.0
       Uvu.       22.2   20.0    24.4  33.3   100.0
       X = 49.7%
[a]    Emph.      60.0    0.0     0.0  40.0   100.0
       Phar.       0.0   83.3    16.7   0.0   100.0
       Plain.      0.0   15.6    84.4   0.0   100.0
       Uvu.       33.3   11.1     0.0  55.6   100.0
       X = 69.7%
[u]    Emph.      86.7    2.2     0.0  11.1   100.0
       Phar.      20.0   56.7    23.3   0.0   100.0
       Plain.      0.0   26.7    73.3   0.0   100.0
       Uvu.       13.3    0.0     0.0  86.7   100.0
       X = 77.6%



Table 4.31 shows the classification results of the three discriminant analysis tests where F2onset of the three individual vowels is used as the predictor. The overall classification rates range between 49.7% and 77.6%. Before [i], emphatics are very accurately classified due to the very low F2onset values that follow them. Meanwhile, the other three classes are often confused with each other. The range of F2onset values following uvulars reaches lower than those following plain coronals and pharyngeals, which causes some cases of uvulars to be misclassified as emphatics. Before [a], both pharyngeals and plain coronals are well classified, the former class by mid-range values and the latter class by high values. Emphatics and uvulars are followed by rather equally low F2onset value ranges, causing them to be misclassified as each other quite frequently. Before [u], uvulars are followed by the lowest F2onset values and are highly accurately classified. Low F2onset values also follow emphatics and enable them to be well classified. The high F2onset values following plain coronals enable those sounds to be correctly classified in over 73% of their actual occurrences. Only pharyngeals are poorly classified due to the F2onset values that follow them, which range between the high values following plain coronals and the low values following emphatics. For this reason, pharyngeals are frequently confused with members of those two classes.

There are few occasions where emphatics are confused with their plain counterparts when using F1onset as the predictor, and no occasions of confusion when using F2onset. As we did in §4.3.1.3, the capabilities of these two acoustic metrics to classify emphatics and non-emphatics were examined using another discriminant analysis test. In this test, F1onset and F2onset in all three vowels were combined and used together as predictors. A high overall accurate classification rate of 89.3% was achieved. Based on the standardized canonical discriminant function coefficients, this high rate is contributed mainly by F2onset.

The previous ANOVA and discriminant analysis tests reveal that, as was the case in the VC context, F2onset is a very solid acoustic cue for the secondary articulation in emphatics in the CV context. Emphatics are consistently followed by low F2onset values in vowels. F1onset does not have as strong a role in distinguishing emphatics from non-emphatics. However, high F1onset values correlate rather highly with pharyngeals. F2onset values following pharyngeals and plain coronals are generally not substantially different from each other. Following uvulars, F1onset values are at the mid-range. F2onset values, meanwhile, vary depending on the vowel. In [i], they are close to those following plain coronals. In [a], they are as low as those following emphatics. In [u], they are even lower than those following emphatics.


The results of the present experiment show that the coarticulatory acoustic effects of MSA emphatics on neighboring vowels distinguish these sounds very reliably from their non-emphatic counterparts. The main coarticulatory correlate of emphatics is a sizeable drop in the F2 transition in adjacent vowels. While emphatics are also generally associated with higher F1 transitions than those accompanying non-emphatics, this association is not nearly as salient or as consistent as the association between emphatics and F2 drops. Like emphatics, uvulars are also associated with lower F2 and higher F1 transitions in adjacent vowels. However, the magnitudes of these changes are not consistent. The size of the F2 drop next to uvulars depends on the vowel type. The only sounds associated fairly consistently with high F1 transitions are pharyngeals. Laryngeals, meanwhile, are not associated with any specific transitions in adjacent vowels.

Pharyngeals are categorized rather accurately based on the high F1 values that accompany them. These sounds are produced with a narrow constriction at the lowest part of the pharynx. Recall from Figure 1.1 that a constriction near that area corresponds to the node of F1, explaining the high value of that formant. Next to the vowel [a], laryngeals show strong association with high F1 transitions as well. However, next to [i] and [u], no such effects are detectable. This is not unexpected given that laryngeals typically do not include any supraglottal configuration of their own. The high F1 in [a] next to laryngeals is that of the vowel itself, which has the highest F1 among the three Arabic vowels. This finding differs from that of Zawaydeh (1999), who reports that laryngeals, like pharyngeals, are also associated with high F1 in neighboring vowels. As a matter of fact, Zawaydeh reports that all Arabic gutturals are associated with high F1 values in the transitions and steady states of adjacent vowels. She uses this association as a phonetic basis for the grouping of Arabic gutturals into a single natural class. It was pointed out in §2.2.4 that Zawaydeh's finding in regard to laryngeals is quite puzzling. Associating laryngeals with high F1 in adjacent vowels goes against the tenets of the acoustic theory. These sounds do not have any constriction above the glottis, so what causes a rise in F1? It was hypothesized in §2.2.4 that Zawaydeh's subjects were possibly raising their larynges during [h] and [ʔ] or producing those sounds with wider mouth openings than normal. A raised larynx shortens the vocal tract and raises essentially all formant frequencies. Wider openings at the lips correspond to widening at the antinodes of all formants, raising all of them. I repeat here that Zawaydeh originally reported her results for the vowel [a] alone, initially using five subjects. In order to reject the notion that the high F1 next to laryngeals is that of the vowel itself, she tested F1 in [i] using two of her subjects. The results reported in the present work are obtained from five subjects for all consonant-vowel combinations. I therefore question Zawaydeh's findings and reject the proposal that high F1 values are a viable argument for the grouping of Arabic gutturals.
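The claim that a shorter vocal tract raises all formants can be illustrated with the idealized quarter-wavelength model of a uniform tube that is closed at the glottis and open at the lips. The Python sketch below is that textbook idealization only, not a model used in this study; the speed of sound and the two tract lengths are assumed, hypothetical values.

```python
SPEED_OF_SOUND_CM_S = 35000.0  # assumed approximate value for warm, moist air


def tube_formant(n, length_cm, c=SPEED_OF_SOUND_CM_S):
    """n-th resonance (Hz) of a uniform tube closed at one end, open at the other."""
    return (2 * n - 1) * c / (4.0 * length_cm)


# Hypothetical tract lengths: a neutral larynx vs. a raised larynx.
for length in (17.5, 16.5):
    print(length, [round(tube_formant(n, length)) for n in (1, 2, 3)])
```

In this idealization every resonance is inversely proportional to tube length, so shortening the tube by any amount raises F1, F2, and F3 together, which is the pattern the raised-larynx hypothesis relies on.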

The most consistent association noted in this experiment is the one between emphatics and the F2 transition of neighboring vowels. Emphatics are always accompanied by significantly lower F2 transitions in adjacent vowels in comparison with their plain counterparts. This cue extends to the steady state portion of [a] only. The resistance of [i] to the spread of emphasis is well documented in the literature (see §2.2.1). It appears that the antagonism between the articulatory demands on the tongue dorsum during the production of [i] (fronting) and the secondary articulation in emphatics (retraction) blocks any further extension of emphasis into [i]. As for [u], this vowel already involves a retraction of the tongue dorsum, which is reflected by its characteristically low F2, causing emphasis to spread vacuously to the steady state. However, results of the statistical analyses of variance as well as the discriminant analyses reported in this study indicate that emphasis spread to the transitional portion of the vowel is sizable and consistent enough to be considered a very reliable acoustic indicator for the presence of an emphatic consonant. It seems that the spread of emphasis to [u] is expressed acoustically in that it overrides the raised consonant-vowel transition associated with non-emphatic coronals. F1 transitions next to emphatics are, generally speaking, higher than those next to plain coronals. However, the ANOVA and discriminant analysis tests show that these differences are not always significant. Contrary to the assertion of Giannini and Pettorino (1982), who relied on speech samples from a single subject and did not use any form of statistical analysis, F1 is not a viable acoustic cue of the presence of an emphatic consonant. The F1 transitions neighboring emphatics are not as consistently or as sizably high as those neighboring pharyngeals. It looks like emphatics, unlike pharyngeals, do not involve active tongue root retraction, which would result in a substantial and consistent raising of F1. The mildly and inconsistently high F1 transitions neighboring emphatics point to subtle retraction of the tongue root as a byproduct of the overall retraction of the tongue dorsum.¹⁷

The previously stated findings in regard to the association between emphatics and both F1 and F2, along with their articulatory interpretations, strongly support El-Dalee's (1984) similar views. The present experiment also extends parts of those claims to include uvulars. Like emphatics, uvulars are generally associated with low F2 transitions on neighboring vowels in both directions. However, the size of the F2 drop next to uvulars depends largely on the vowel. In [i] and [a], the values of F2 transitions next to uvulars are either comparable to the values next to emphatics or range between the values next to plain coronals and the values next to emphatics. In [u], F2 transitions next to uvulars are lower than those next to any other sound class, including emphatics. Recall from Experiment One that we claim that uvulars, when adjacent to the vowel [u], are retracted further back towards the uvular region. When adjacent to [u], the spectral mean of the uvular [χ] is in the same range as the spectral mean of the sibilants [s, sˤ]. This was attributed to greater constriction in uvular fricatives as a result of further backing of the tongue. The very low F2 transitions in [u] next to uvulars support this claim. Further backing of the tongue dorsum in the upper pharynx results in more constriction near the first antinode of F2. This results in a more pronounced drop in this formant. F1 transitions next to uvulars are in the same range as those next to emphatics. These transitions are not as high or as consistent as the transitions next to pharyngeals. It seems that uvulars, like emphatics and unlike pharyngeals, do not involve independent tongue root retraction as part of their articulation.

17 It might be argued that emphatics underlyingly involve active tongue root retraction that is lost in a phonetic reorganization of the articulatory qualities of these sounds (Kingston and Diehl 1994). This is quite doubtful, though, since such a modification goes against the role of phonetic reorganization, which is to minimize articulatory effort and/or maximize acoustic contrast. An added tongue root retraction would significantly raise F1 and enhance the acoustic distinction between emphatics and their non-emphatic counterparts. It can also be said that tongue root retraction would facilitate the retraction of the whole tongue mass including the dorsum. See §6.3.2 and §6.4 below for cases where the two retractions co-exist.

Notice in Figures 4.2 and 4.4 that F2 in [i] and [a] next to emphatics has falling transitions as we move from the vowel's steady state towards the consonant in VC sequences (and rising transitions as we move away from the consonant towards the vowel's steady state in CV sequences). In [u], however, the patterns are reversed. Here, F2 is slightly higher at the transition than at the steady state. However, F2 transitions in [u] next to emphatics remain the lowest with the exception of those next to uvulars. It seems that emphatics do not merely cause a drop in F2 of an adjacent vowel, but rather cause that formant to start at a somewhat fixed point regardless of the vowel type. That fixed point is lower than the prototypical F2 values of [i] and [a] but slightly higher than the prototypical F2 value of [u]. This phenomenon is quite similar to the locus concept discussed in §4.1. In this sense, the emphatic coronals [tˤ, dˤ, ðˤ, sˤ] share a second formant locus different from the one shared by their non-emphatic coronal counterparts [t, d, ð, s]. Obrecht (1961) investigated the perceptual relevance of emphatic loci in Lebanese Arabic and found that the perceptual "zone of velarization" lies between 1000 Hz and 1400 Hz. The average F2 locus for emphatics reported by Obrecht is around 1200 Hz compared to a locus around 1800 Hz for non-emphatics. Using speech samples from a single Iraqi subject, Giannini and Pettorino (1982) report an emphatic F2 locus of 1000 Hz compared to 2000 Hz for non-emphatics.

To find out if a stable F2 hub (locus) exists for all emphatics, the F2 transition edges in VC sequences were compared by means of an ANOVA test. A similar test was done for the CV sequence. In both cases, the tests were conducted for the emphatic sounds across vowel contexts with F2 transition serving as the dependent variable. Two similar tests were performed for the non-emphatic coronals. Figure 4.6 displays the means and standard deviations of F2 transitions averaged for all vowels in the two contexts VC and CV, separately, next to emphatics and non-emphatics. ANOVA results for the emphatics in the VC context show a main effect of consonant type [F(3,176) = 14.997; p < 0.001]. Scheffé post hoc tests, however, indicate that [tˤ], [dˤ], and [ðˤ] are not significantly distinct from each other in terms of the mean F2 transition value of the preceding vowels. The mean F2 value preceding [sˤ] is significantly higher than those preceding the other emphatics (p < 0.01). In the CV context, which does not include [ðˤ], no main effect of consonant type was detected [F(2,132) = 1.469; p = 0.234]. This was reflected by the subsequent post hoc comparisons, which revealed no pair-wise differences. The test results for the non-emphatics in the VC context show a main effect of consonant type [F(3,176) = 5.174; p < 0.01]. Scheffé post hoc tests indicate that the only pair-wise differences involve F2 before [ð], which is significantly lower than the values before [t] (p < 0.05) and [d] (p < 0.01). No main effect of consonant type was detected [F(2,132) = 1.542; p = 0.218] in the CV context, which does not include [ð]. Consequently, no pair-wise differences were indicated by the subsequent post hoc comparisons. The tests indicate that both emphatics and non-emphatics show little variability in F2 transition values. This is an indication that there are two separate F2 loci for Arabic coronals: one for emphatics (around 1100 Hz) and another for non-emphatics (around 1700 Hz).

Figure 4.6. Mean F2 transitions next to the non-emphatic coronals and their emphatic counterparts. The formant transition values are averaged across vowels and speakers. The error bars represent ± one standard deviation. (Panels: non-emphatics [Vt], [Vd], [Vs], [Vð], [tV], [dV], [sV], and the corresponding emphatics; vertical axis: frequency in Hz, 500 to 2500.)

Figures 4.7 through 4.10 represent plots of the loci estimates for consonants that are believed to possess them. It is notable here that no loci were plotted for [k], a sound that is believed to possess more than one. The reason is simply that the high velar locus associated with non-back vowels cannot be estimated simply by averaging F2 transitions of adjacent vowels. This locus is higher than F2 in both [i] and [a] as indicated by the directions of their formant tracks. As for the low locus, Arabic has only one back vowel,


Figure 4.7. Stylized second formant tracks of the three Arabic vowels [i, a, u] preceding the four Arabic plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ]. The figure also illustrates the location of the second formant locus at the left edge of each consonant. Each locus is calculated by averaging the values of F2 transition offsets of the three vowels.


Figure 4.8. Stylized second formant tracks of the three Arabic vowels [i, a, u] preceding the Arabic velar [k] as well as the seven gutturals [q, χ, ʁ, ħ, ʕ, h, ʔ].


Figure 4.9. Stylized second formant tracks of the three Arabic vowels [i, a, u] following the three Arabic plain coronals [t, d, s] and their emphatic counterparts [tˤ, dˤ, sˤ]. The figure also illustrates the location of the second formant locus at the left edge of each consonant. Each locus is calculated by averaging the values of F2 transition offsets of the three vowels.


Figure 4.10. Stylized second formant tracks of the three Arabic vowels [i, a, u] following the Arabic velar [k] as well as the seven gutturals [q, χ, ʁ, ħ, ʕ, h, ʔ].



[u], which also makes averaging of values inapplicable. Perceptual experiments like those

conducted by Delattre et al. (1955) and Stevens and House (1956) are probably required

to estimate these loci. Another notable issue is that, unlike emphatics, uvulars, which also

spread emphasis, do not appear to have any single F2 locus. As noted earlier, the edge of

F2 transitions next to uvulars, while generally low, depends on the vowel. It seems that

uvulars, like velars, adapt their specific point of articulation to the articulation of the ad-

jacent vowel. This is quite different from the highly stable F2 loci associated with em-

phatics. The association between F2 drop and emphatics is more rigid than the associa-

tion between F2 drop and uvulars. This indicates that the dorsal retraction implicated in

the articulation of emphatics is more articulatorily stable and less prone to coarticulatory

influence from adjacent vowels than the dorsal retraction in uvulars. Finally, it is quite

interesting that the two pharyngeals [ħ, ʕ] appear to share a common locus, and they are the only gutturals to do so. To pursue this possibility, two t-tests were conducted to compare the means of F2 transitions in the two pharyngeals, one for the VC context and the other for the CV context. For the VC context, the t-test shows a significant difference between the means of F2 transitions next to the two pharyngeals [t = 2.205 (df = 88); p < 0.05], while for the CV context, no significant difference exists between the two pharyngeals [t = 0.718 (df = 88); p = 0.475]. It appears, then, that the proposed 'pharyngeal locus' (located around 1600 Hz) is realized more visibly on the right edge of the pharyngeal consonants. This locus is possibly a consequence of the [a]-like position (in the front-back continuum) assumed by the tongue body during pharyngeals. This is supported

by the flat F2 transitions between pharyngeals and the vowel [a]. X-ray tracings in Delat-

tre (1971) and Ghazeli (1977) also support this claim. Apparently, this position is an ar-



ticulatory target that does not adapt very much to neighboring vowels. If it did, these

sounds would be as articulatorily transparent as [h] resulting in no locus.

Among the three articulatory descriptions of the secondary articulation in Arabic

emphatics discussed in the Introduction, pure pharyngealization is the one least favored

by the acoustic evidence reported here. The acoustic correlate of emphaticness is not con-

sistent with the articulatory proposal that emphatics and uvulars involve a constriction in

the pharynx by retracting the tongue root (Davis 1995, Rose 1996). Tongue root retrac-

tion is associated with high F1 values while emphatics and uvulars are not significantly

associated with high F1 in adjacent vowels. The correlation between low pharyngeal con-

striction and high F1 was predicted by Halle and Stevens (1969) and later proven true in

experimental investigations of several languages. We have already seen such an effect from Arabic pharyngeals. Similarly, in their reviews of Tsakhur, Udi, and !Xóõ, three languages belonging to different families which possess pharyngealized vowels, Ladefoged and Maddieson (1996) reported a correlation between pharyngealization and raising of F1, but no specific correlation with F2. Also, in languages that have vowel sets involving retraction of the tongue root as well as vowel sets involving the reverse form of this articulation, i.e. advancing of the tongue root (ATR), the acoustic correlates of these added articulations are strongly reflected by F1. In a study of one such language, Kwawu, a

dialect of Akan, Hess (1992) found that the most consistent acoustic difference between

[ +ATR] and [ -ATR] vowel pairs is that members of the former class had lower F1 values

than members of the latter class. Similar acoustic effects were also reported by Fulop et

al. (1998) for Degema, an Edoid language, which also contrasts [+ATR] and [-ATR]

vowels. The same acoustic effect was also shown to be acquired by plain vowels when



neighboring primarily pharyngeal consonants. Ladefoged and Maddieson display spectrograms of pharyngeal-containing words from Burkikhan, an Agul dialect, that clearly show that F1 at the transition of the vowel following the pharyngeal is quite high (around

1000 Hz in one case) before gradually falling as the vowel production progresses.

It was mentioned in the Introduction that Arabic emphatics have also been de-

scribed as velarized (Obrecht 1961, Catford 1977) and uvularized (McCarthy 1994,

Zawaydeh 1999). Velarization is compatible with the acoustic effects that characterize

emphatics. Contrastively velarized consonants in languages such as Marshallese (Choi 1995, Ladefoged and Maddieson 1996) and Russian (Bolla 1981)¹⁸ as well as non-contrastively velarized consonants such as the 'dark' [l] in English (Sproat and Fujimura 1993) cause significant lowering of F2 in neighboring vowels. Ladefoged and Maddieson (1996) also display spectrograms and spectral slices of a Marshallese velarized [mˠ] and its plain counterpart [m] showing a sizable drop in the second spectral peak of the velarized nasal as opposed to the plain one. Uvularization as a secondary articulation implies one of two possibilities. The first is that a uvularized sound has a secondary articulation that is an exact copy of the articulation of [χ] or [ʁ], including the soft palate participation. The second, and more likely one, is that the sound involves a secondary articulation

in the form of retracting and raising the tongue dorsum towards the uvular region. Either

way, uvularization is a superset of velarization since primarily uvular sounds are consid-

ered complex pharyngeal-velar sounds. Uvularization, then, is expected to display the

same acoustic effects as velarization, which is exactly what Catford (1977) states.

18 Bolla actually refers to non-palatalized Russian sounds as pharyngealized. However, those sounds are widely known currently as velarized. See Chapter 6 of this dissertation for more on this topic.



Among the three articulatory descriptions of the emphatic secondary articulation

in Arabic, pharyngealization must be excluded. This characterization fails both descrip-

tively and explanatorily in the fields of phonological representation and phonetic articula-

tory-acoustic correspondence. Among the two remaining candidates, velarization and

uvularization, the latter has never been reported in any language other than Arabic. Fur-

thermore, according to the Handbook of the International Phonetic Association, there is

no IPA symbol for uvularization. As stated earlier, uvularization can be understood as the

combination of two concomitant secondary articulations: velarization and pharyngealiza-

tion. Other reported double secondary articulations include labiovelarization and

labiopalatalization. Ladefoged and Maddieson (1996) explain that the former is actually

what takes place in the majority of labialized sounds. This fact is quite understandable

when we consider plain labialization to be the addition of a [w]- or [u]-like articulation to

a primary articulation. The sounds [w] and [u] always involve a velar raising of the back

of the tongue. As for labiopalatalization, Ladefoged and Maddieson note that it is in fact

an allophonic variation of the labiovelarization occurring in the context of front vowels.

In these two types of articulation, velarization or palatalization coexists with labializa-

tion, the most widely encountered secondary articulation cross-linguistically. It is coun-

terintuitive to expect two relatively rare types of secondary articulations, velarization and

pharyngealization, to coexist as a double secondary articulation in the same sound. When

we add this point to the phonological arguments against pharyngealization (which is a

component of the uvularization) presented in Chapter 2, the case for uvularization in

Arabic emphatics weakens. This point is explored in more detail in the next two chapters.



4.5 Summary

The acoustic results in this experiment show that emphatics differ greatly from

their non-emphatic counterparts in terms of their coarticulatory impact on adjacent vow-

els. The most pronounced difference is that emphatics cause large F2 drops in the transi-

tion portions of adjacent vowels while non-emphatics do not. Generally, uvulars show

similar effects on adjacent vowels as do emphatics. These results are interpreted as an

acoustic reflection of the dorsal retraction in both emphatics and uvulars. However, the

size and stability of these effects are different between these two sound classes. While F2

drops next to emphatics are associated with a highly stable low F2 locus, they vary in size

and locus next to uvulars depending on the vowel. The lack of stability in this acoustic

correlate next to uvulars is interpreted as a less rigid association between uvulars, as op-

posed to emphatics, and dorsal retraction. While dorsal retraction in emphatics is highly

stable regardless of the adjacent vowel, it is more adaptable to the vowel environment in

uvulars. This indicates that the articulatory implementation of dorsal retraction in em-

phatics and uvulars might be different.

Pharyngeals are reliably associated with high F1 transitions in adjacent vowels. The rises in F1 in vowels adjacent to emphatics and uvulars, which are frequently reported in previous acoustic works, are not as sizeable nor as consistent as the F1 rise next

to pharyngeals. Meanwhile, laryngeals are not associated with any particular transitions

in adjacent vowels. These findings are interpreted as indications that the only Arabic

sounds that involve active tongue root retraction are pharyngeals. The milder and unstable F1 rises next to emphatics and uvulars are interpreted as indications of small and inconsistent tongue root retractions that follow as by-products of the general retractions of the tongue dorsum. These findings also do not support the claim that high F1 in neighboring vowels is a potential acoustic grouping factor for the class of Arabic gutturals.

Some of these results and claims are investigated further in the following chapter.

The most important point to elaborate on is how the implementation of dorsal retraction differs between emphatics and uvulars. Such differences should be reflected in the ways those sounds impact vowel-to-vowel coarticulation.



CHAPTER 5

Experiment Three:

Vowel-to-Vowel Coarticulation

5.1 Overview

The previous two experiments examine the more direct acoustic correlates of the

articulatory properties of consonants. A less direct consonantal articulatory-acoustic cor-

relation that has been discussed in the literature is located in the patterns of influence of

an intervocalic consonant on the coarticulatory interaction between vowels flanking that

consonant in VCV sequences. In particular, consonants that place articulatory demands

on the tongue dorsum tend to minimize trans-consonantal vowel-to-vowel coarticulation

compared to consonants that do not. Such influences are often measurable both articula-

torily and acoustically. This interaction provides an experimental opportunity to investi-

gate and compare the articulatory traits of Arabic emphatics and gutturals. In particular,

testing this phenomenon could potentially enhance our understanding of the similarities

and differences between the dorsal retractions in both emphatics and uvulars. The present

experiment looks at how these two sets of sounds, as well as the other sound classes in-

vestigated in this dissertation, affect vowel-to-vowel coarticulation in order to determine

the presence and size of the coarticulatory effects they allow between flanking vowels in

VCV contexts. There are two hypotheses being tested here. The first hypothesis is that,

unlike their non-emphatic counterparts, emphatics should resist vowel-to-vowel coarticu-



lation. This hypothesis is based on the presence of dorsal retraction in Arabic emphatic

coronals. The second hypothesis is that there are acoustic differences between emphatics

and uvulars in terms of their effects on vowel-to-vowel coarticulation. This hypothesis

follows from the more constant and less flexible dorsal retraction in emphatics as op-

posed to uvulars and the absence of any dorsal retraction in other Arabic gutturals.

Vowel-to-vowel coarticulation across an intervening consonant is a subject that

has aroused some experimental and theoretical interest in recent years. The phenomenon

was first reported by Öhman (1966). In an acoustic study of vowel-stop-vowel sequences in Swedish, Öhman noticed that the shapes of formant transitions in the first vowel depend not only on the following consonant, but also on the type of the second vowel. The

transition patterns suggest that the articulatory configuration of the first vowel proceeds

to the configuration of the second vowel in a smooth diphthongal movement as if the

consonant did not exist. Öhman found a similar behavior in English, but not in Russian,

vowel-stop-vowel sequences.

A more extensive investigation of Russian VCV sequences by Purcell (1979) confirms Öhman's findings and adds that Russian palatalized and velarized consonants also inhibit vowel-to-vowel coarticulation in the reverse direction. However, in a comparison between vowel-to-vowel coarticulation in English, Russian, Bulgarian, and Polish, Choi and Keating (1991) found that palatalized consonants in all three Slavic languages permit

vowel-to-vowel coarticulation. It should be noted, nevertheless, that in one of Choi and

Keating's figures (Figure 7, p. 83), English permits more sizable amounts of coarticula-

tion in both directions than do any of the other three languages.



To explain these findings, Öhman proposed that vowels and consonants employ different articulatory control mechanisms. He explains that a VCV sequence is not simply a linear succession of three sounds. Rather, such a sequence is primarily a consonant superimposed on a vowel layer in which one articulatory setting flows into the next. The lack of V2 effects on V1 in Russian, Öhman explains, is due to the fact that, unlike Swedish and English stops, Russian stops involve either palatal or velar secondary articulations. These secondary articulations are vocalic by nature and impose motor demands on the articulatory control mechanism of vowels, thus interrupting the flow from V1 to V2.

Keating (1985), on the other hand, adopts an explanatory model based on autosegmental phonological features. She proposes that the phonetic implementation of a speech sound has access to the phonological specification of that segment. If a segment is phonologically underspecified for a certain feature, this underspecification persists into phonetics. Coarticulation is then treated as spreading of features. In English VbV sequences, for example, the lingual features of V1 spread to V2 since English [b] is not specified for any vocalic lingual features to block the spreading. This spreading is implemented in the final shape of the sound as an interpolation between the articulatory targets of V1 and V2. Sounds with a secondary place of articulation such as Russian palatalized

consonants, on the other hand, use the vowel feature tier to project their secondary articu-

lation features. These sounds, then, do not permit vowel-to-vowel coarticulation since

they block feature spreading from one vowel to the other. This model predicts that any

consonant with a secondary articulation should block vowel-to-vowel coarticulation.

Instead of a two-leveled view of coarticulation in which discrete, timeless phonological units are reinterpreted as articulatory maneuvers with extrinsic timing, Fowler



(1980) and Fowler and Saltzman (1993) propose that the phonological constituents be

treated as dynamic phonetic gestures with predefined intrinsic timing structure. In this

alternate view, the intrinsic timing structures of the gestures involved in the coarticulating

segments overlap leading the segments to be coproduced. The size of coproduction be-

tween gestures employing the same articulator is less than if they employ independent

articulators. It also depends on how much the current gesture 'resists' the effects of the

coarticulating gesture. This model, then, does not exclude the possibility that conflicting

gestures could coarticulate.

The concept of 'coarticulation resistance' was first introduced by Bladon and Al-Bamerni (1976). It was proposed to account for the different degrees of spatial coarticulatory effects allowed by speech segments. Bladon and Al-Bamerni propose that the phonological specification for segments, as well as certain boundary types, be assigned a numerically valued feature for coarticulation resistance. To capture cross-linguistic differences in coarticulation resistance, they suggest that such a feature can be language- or even dialect-specific. Recasens (1985) argues that resistance to coarticulation should

be based on the constraints forced on the articulators by the speech segments rather than

on the linguistic status of those segments. Recasens et al. (1993) found that Catalan and

Italian alveopalatals and palatals resist vowel-induced coarticulation more so than alveo-

lars. They attribute this finding to the fact that the tongue dorsum is more involved in the

production of alveopalatals and palatals than in the production of alveolars. The same explanation is proposed by Recasens et al. (1996) to account for their finding that Catalan velarized [ɫ] resists vowel-dependent coarticulation more so than German non-velarized [l]. Recasens suggests that the same explanations should apply for vowel-to-vowel coarticulation resistance by intervening consonants.

5.2 Methods

5.2.1 Subjects

The same five subjects who participated in the previous two experiments also

took part in this experiment.

5.2.2 Stimuli

The set of stimuli that was used for Experiment One was also used in the present

experiment. However, the present experiment adds the two laryngeals [h] and [ʔ], yielding a paradigm of 144 test words (3 × 16 × 3). Refer to Appendix C for a list of the words

used in this experiment.
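
The size of the paradigm follows from crossing the three vowels in V1 position, the sixteen consonants, and the three vowels in V2 position. The sketch below enumerates these abstract V1CV2 frames only; it does not reproduce the actual test words of Appendix C, and the names `VOWELS`, `CONSONANTS`, and `vcv_frames` are hypothetical.

```python
from itertools import product

VOWELS = ["i", "a", "u"]
# The sixteen consonants investigated: plain coronals, emphatics,
# the velar stop, uvulars, pharyngeals, and laryngeals.
CONSONANTS = [
    "t", "d", "s", "ð",
    "tˤ", "dˤ", "sˤ", "ðˤ",
    "k", "q", "χ", "ʁ",
    "ħ", "ʕ", "h", "ʔ",
]


def vcv_frames(vowels=VOWELS, consonants=CONSONANTS):
    """Return every (V1, C, V2) combination of the given inventories."""
    return list(product(vowels, consonants, vowels))


frames = vcv_frames()
print(len(frames))
```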

5.2.3 Procedures

The stimuli for all three experiments were intermixed and presented to the sub-

jects together in the same recording session. Therefore, the experimental procedures fol-

lowed here are identical to those followed in the previous two experiments.


As was the case in Experiment Two, the sound analysis software Praat (Boersma

and Weenink 1992) was used to automatically generate formant tracks using the Burg



algorithm. The tracks were calculated using a succession of 25-ms Gaussian analysis

windows. The temporal distance between the centers of each two analysis windows was 5

ms. For the V1C- portion of the V1CV2 sequence, F2 was measured at the end of the vowel transition, and for the -CV2 portion of the sequence, F2 was measured at the beginning of

the vowel transition. These acoustic landmarks were identified based on visual inspection

of the waveform and the spectrogram of the sound token. Auditory verification was

added in cases where the vowel-consonant boundary was difficult to pinpoint. The cursor

was placed at the identified landmark which was subsequently recorded as a time point.

The time points for all sound files were annotated to a single Praat TextGrid file. A spe-

cially written Praat script referred to the TextGrid file and automatically recorded the val-

ues of F2 transitions. Formant readings were made 2.5 ms inside the vowels from the

time point recorded in the TextGrid file rather than exactly at the vowel-consonant

boundary. As explained in Experiment Two, this modification was done to avoid false

formant values as a result of incorrect tracking of the formants at the consonant-vowel

junction or incorrect averaging of vowel-edge-based and consonant-edge-based formant

reading points. The script then stored the formant readings as a text file. This file was

then converted to proper formats of the spreadsheet software Microsoft Excel (Microsoft

Corp. 1985) and the statistical analysis software SPSS (SPSS, Inc. 1989).
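
The placement of the reading point 2.5 ms inside the vowel can be sketched as follows. This is an illustration in Python rather than the Praat script used in the study, which is not reproduced in the text. The function names `reading_time` and `f2_at`, the use of linear interpolation between neighboring analysis frames, and the toy track values are all assumptions.

```python
from bisect import bisect_left

INSET_S = 0.0025  # 2.5 ms inside the vowel, as described in the text


def reading_time(landmark_s, vowel_side):
    """Shift the annotated boundary into the vowel.

    vowel_side is "before" when the vowel precedes the boundary (V1C-)
    and "after" when the vowel follows it (-CV2).
    """
    if vowel_side == "before":
        return landmark_s - INSET_S
    if vowel_side == "after":
        return landmark_s + INSET_S
    raise ValueError("vowel_side must be 'before' or 'after'")


def f2_at(times, f2_values, t):
    """Linearly interpolate an F2 track at time t (clamped at the ends)."""
    if t <= times[0]:
        return f2_values[0]
    if t >= times[-1]:
        return f2_values[-1]
    j = bisect_left(times, t)  # first frame at or after t
    t0, t1 = times[j - 1], times[j]
    weight = (t - t0) / (t1 - t0)
    return f2_values[j - 1] + weight * (f2_values[j] - f2_values[j - 1])


# Hypothetical track with frames every 5 ms (times in seconds, F2 in Hz).
track_times = [0.100, 0.105, 0.110, 0.115]
track_f2 = [1500.0, 1400.0, 1300.0, 1250.0]
t_read = reading_time(0.110, "before")
print(f2_at(track_times, track_f2, t_read))
```

Reading slightly inside the vowel keeps the value away from the frames that straddle the vowel-consonant junction, where the tracker is most likely to mislabel formants.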

5.2.5 Reliability

To estimate the intra-judge reliability of formant measurements, 216 sound files (10% of the total files) were randomly selected by random-number-generating software and re-analyzed following the same procedures explained in section 5.2.4 above. The correlations between the original and the retested F2 values at the transition portions were above 0.99 for both V1 and V2. Agreements within 50 Hz were 90.3% for V1 and 80.1% for V2. The measurements were judged reliable.
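
The two reliability figures, a correlation between original and retested values and a percentage of agreements within 50 Hz, can be computed along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the correlation is taken to be Pearson's r, the 50 Hz tolerance is treated as inclusive, and the sample values are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def agreement_within(x, y, tolerance_hz=50.0):
    """Percentage of paired measurements differing by at most tolerance_hz."""
    hits = sum(1 for a, b in zip(x, y) if abs(a - b) <= tolerance_hz)
    return 100.0 * hits / len(x)


# Hypothetical original and retested F2 readings (Hz).
original = [1500, 1200, 2000, 900]
retested = [1510, 1260, 1990, 905]
print(pearson_r(original, retested))
print(agreement_within(original, retested))
```

The two measures answer different questions: a high correlation shows that the retest preserves the ordering and spacing of the values, while the agreement percentage shows how often the absolute discrepancy stays small.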

5.3 Results

To determine whether a certain consonant allows vowel-to-vowel coarticulation,

for each of the three target vowels [i, a, u], the mean F2 transition values when the source

trans-consonantal vowel is [i] are compared against the mean F2 transition values when

the source trans-consonantal vowel is [u]. These two source vowels are expected to have

opposite effects on the F2 transition of the target vowel since [i] has the highest F2 and

[u] has the lowest. The F2 transition values were compared using t-tests. Six t-tests were

conducted for every consonant, three in each direction (three anticipatory target vowels

and three carryover target vowels), for a total of 96 t-tests (3 tests x 2 directions x 16

consonants).
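
Each of these comparisons is a two-sample t-test, and the statistic can be recovered approximately from the means and standard deviations given in the tables. The sketch below assumes a pooled-variance test with 15 tokens per source vowel, which is inferred from the reported df = 28 and is not stated explicitly in the text; the function name `pooled_t` is hypothetical. The example uses the Table 5.1 values for target [u] before [dˤ]: 946 Hz (SD 60) when V2 is [i] and 892 Hz (SD 44) when V2 is [u].

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, df


N_PER_CELL = 15  # assumption: inferred from df = 28

# Target [u] before [dˤ], source vowel [i] versus [u] (Table 5.1).
t_value, df = pooled_t(946, 60, N_PER_CELL, 892, 44, N_PER_CELL)
print(round(t_value, 3), df)

# Number of comparisons: 3 target vowels x 2 directions x 16 consonants.
n_tests = 3 * 2 * 16
print(n_tests)
```

Under these assumptions the statistic comes out at about 2.81, close to the value reported for this comparison; small discrepancies elsewhere would be expected from rounding in the tabulated means and standard deviations.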

5.3.1 Anticipatory Vowel-to-Vowel Coarticulation

The mean F2 values of the transitions in the target vowels in the anticipatory di-

rection are listed in Tables 5.1 and 5.2. Figures 5.1 and 5.2 show the means and standard

deviations of F2 transitions in the three Arabic vowels [i, a, u] as they occur before the

sixteen consonants being investigated. The source trans-consonantal vowel in each case is

either [i] or [u].


Table 5.1. Second formant frequency means (and standard deviations) in Hz of the V1 transitions preceding the four MSA emphatics and their non-emphatic counterparts.

Target  Source  Intervocalic Consonant
Vowel   Vowel
(V1)    (V2)    [t]     [tˤ]    [d]     [dˤ]    [s]     [sˤ]    [ð]     [ðˤ]

[i]     [i]     2135    1384    2189    1290    2112    1503    2052    1224
                (59)    (149)   (106)   (92)    (43)    (156)   (74)    (93)
        [u]     1980    1344    1987    1248    1920    1532    1747    1177
                (30)    (142)   (58)    (59)    (71)    (156)   (121)   (85)

[a]     [i]     1781    1076    1794    1050    1785    1079    1753    1045
                (132)   (101)   (65)    (48)    (83)    (102)   (81)    (69)
        [u]     1706    1043    1696    998     1645    1114    1561    1018
                (71)    (92)    (52)    (90)    (91)    (53)    (51)    (71)

[u]     [i]     1718    1070    1692    946     1677    978     1621    891
                (157)   (68)    (96)    (60)    (120)   (118)   (95)    (43)
        [u]     1536    972     1518    892     1409    1003    1356    930
                (58)    (56)    (145)   (44)    (117)   (72)    (69)    (71)

Table 5.2. Second formant frequency means (and standard deviations) in Hz of the V1 transitions preceding the MSA gutturals and the velar stop [k].

Target  Source  Intervocalic Consonant
Vowel   Vowel
(V1)    (V2)    [k]     [q]     [χ]     [ʁ]     [ħ]     [ʕ]     [h]     [ʔ]

[i]     [i]     2412    1585    1888    1740    2127    1989    2362    2312
                (89)    (46)    (93)    (138)   (112)   (189)   (94)    (111)
        [u]     2108    1261    1394    1066    1905    1544    1492    1701
                (158)   (64)    (87)    (76)    (91)    (90)    (140)   (252)

[a]     [i]     1990    1096    1487    1516    1826    1739    1937    1906
                (146)   (52)    (57)    (97)    (102)   (130)   (117)   (83)
        [u]     1423    1107    1262    1027    1661    1412    1213    1291
                (76)    (75)    (119)   (78)    (90)    (122)   (197)   (109)

[u]     [i]     1125    830     897     908     1468    1465    1312    1345
                (117)   (49)    (46)    (98)    (203)   (152)   (142)   (126)
        [u]     949     834     938     651     1484    1232    999     960
                (89)    (62)    (69)    (57)    (103)   (97)    (80)    (53)

The four non-emphatic coronals [t, d, ð, s] allow anticipatory vowel-to-vowel

coarticulation in all three vowels. In 11 of the 12 comparisons involving these conso-

nants, the value of F2 transition in V1 was significantly higher when V2 is [i] than when it


Figure 5.1. Anticipatory V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across the four plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ]. The effects are expressed as the mean value of F2 transition of the first vowel when the trans-consonantal vowel is either [i] or [u]. Formant values are averages across speakers. The error bars represent ± one standard deviation.


Figure 5.2. Anticipatory V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across the velar [k], the three uvulars [χ, ʁ, q], the two pharyngeals [ħ, ʕ], and the two laryngeals [h, ʔ]. The effects are expressed as the mean value of F2 transition of the first vowel when the trans-consonantal vowel is either [i] or [u]. Formant values are averages across speakers. The error bars represent ± one standard deviation.



is [u] [t ranges from 3.880 to 9.025, (df = 28); p < 0.01]. The only exception is that, before [t], the F2 transition of [a] is only marginally higher when V2 is [i] than when it is [u] [t = 1.962, (df = 28); p < 0.1]. The velar [k] also allows anticipatory vowel-to-vowel coarticulation in all three vowels [t ranges from 4.619 to 13.368, (df = 28); p < 0.001]. The four emphatics [tˤ, dˤ, ðˤ, sˤ], on the other hand, tend to block vowel-to-vowel anticipatory coarticulation. Of the 12 t-tests conducted for VCV sequences involving emphatics, nine showed no significant differences between the mean F2 transition value in V1 based on the identity of V2 [t ranges from -1.806 to 1.485, (df = 28); p ≥ 0.149]. The emphatic stops [dˤ] and [tˤ] allow F2 transition in [u] to coarticulate with V2 [t = 2.810 and 4.288, respectively, (df = 28); p < 0.01]. The mean value of F2 transition of [a] before [dˤ] is marginally higher when V2 is [i] than when it is [u] [t = 1.950, (df = 28); p < 0.1].

Among the three uvulars, [ʁ] allows F2 transitions in all three vowels to coarticulate with V2 [t ranges from 8.754 to 16.559, (df = 28); p < 0.001]. The voiceless fricative [χ] allows V2 to coarticulate with F2 transition in preceding [i] and [a] [t = 15.003 and 6.596, respectively, (df = 28); p < 0.001] but not in preceding [u] [t = -1.874, (df = 28); p = 0.071]. The uvular stop [q] allows coarticulation from V2 into F2 transition of [i] [t = 15.977, (df = 28); p = 0.001] but blocks it from affecting [a] [t = -0.470, (df = 28); p = 0.642] and [u] [t = -0.170, (df = 28); p = 0.866]. Five of the six t-tests involving the two pharyngeals [ħ] and [ʕ] show that these two sounds allow anticipatory coarticulation from V2 into F2 transition of V1 [t ranges from 4.711 to 8.231, (df = 28); p < 0.001]. The one exception concerns the vowel [u] before [ħ], where no coarticulation is detected [t = -0.267, (df = 28); p = 0.791]. Both laryngeals, [h] and [ʔ], allow large degrees of anticipatory vowel-to-vowel coarticulation in VCV sequences [t ranges from 7.454 to 19.974, (df = 28); p < 0.001].

5.3.2 Carryover Vowel-to-Vowel Coarticulation

The mean F2 values of the transitions in the target vowels in the carryover direc-

tion are listed in Tables 5.3 and 5.4. Figures 5.3 and 5.4 show the means and standard de-

viations of F2 transitions of the three Arabic vowels [i, a, u] occurring after the sixteen

consonants. In each case, the trans-consonantal vowel is either [i] or [u].

All four plain coronals [t, d, ð, s] allow coarticulation from V1 into F2 transition of V2 in all VCV sequences [t ranges from 2.049 to 13.781, (df = 28); p ≤ 0.05]. The same is true for the velar stop [k] [t ranges from 2.679 to 7.158, (df = 28); p < 0.05]. In eight of the 12 t-tests conducted for VCV sequences in which the consonant was one of the four emphatic coronals [tˤ, dˤ, ðˤ, sˤ], coarticulation from V1 was permitted into F2 transition of V2. Both [tˤ] and [ðˤ] allow carryover coarticulation in all VCV sequences [t ranges from 2.493 to 3.914, (df = 28); p < 0.05]. The voiced emphatic stop [dˤ] permits significant carryover coarticulatory effects when V2 is either [a] or [u] [t = 3.720 and 2.098, respectively, (df = 28); p < 0.05] but only marginally significant effects when V2 is [i] [t = 1.890, (df = 28); p < 0.1]. The emphatic alveolar fricative [sˤ] exhibits the most resistance to carryover coarticulation. This sound blocks coarticulation from affecting [i] or [a] [t = 1.435 and 1.306, respectively, (df = 28); p ≥ 0.162] and allows only marginally significant effects on [u] [t = 2.017, (df = 28); p < 0.1].


Table 5.3. Second formant frequency means (and standard deviations) in Hz of the V2 transitions following the MSA emphatics and their non-emphatic counterparts.

Target      Source      Intervocalic consonant
vowel (V2)  vowel (V1)  [t]         [tˤ]        [d]         [dˤ]        [s]         [sˤ]        [ð]         [ðˤ]
[i]         [i]         2251 (83)   1532 (160)  2183 (64)   1324 (131)  2164 (79)   1586 (300)  2116 (70)   1246 (93)
[i]         [u]         1968 (83)   1315 (142)  1894 (89)   1238 (119)  1843 (52)   1461 (155)  1841 (33)   1123 (96)
[a]         [i]         1844 (44)   1062 (66)   1888 (65)   1042 (76)   1746 (67)   1079 (80)   1783 (74)   1019 (88)
[a]         [u]         1694 (76)   985 (62)    1727 (62)   947 (64)    1624 (76)   1034 (105)  1628 (59)   944 (57)
[u]         [i]         1449 (147)  970 (62)    1720 (70)   975 (82)    1510 (109)  996 (81)    1560 (82)   951 (82)
[u]         [u]         1356 (96)   911 (68)    1544 (96)   912 (84)    1382 (48)   930 (97)    1374 (125)  881 (63)

Table 5.4. Second formant frequency means (and standard deviations) in Hz of the V2 transitions following the MSA gutturals and the velar stop [k].

Target      Source      Intervocalic consonant
vowel (V2)  vowel (V1)  [k]         [q]         [χ]         [ʁ]         [ħ]         [ʕ]         [h]         [ʔ]
[i]         [i]         2321 (116)  1787 (182)  2273 (114)  1784 (88)   2219 (103)  2075 (164)  2391 (98)   2328 (112)
[i]         [u]         2062 (79)   1546 (127)  1711 (146)  1278 (165)  1897 (120)  1702 (108)  1867 (165)  1682 (248)
[a]         [i]         1992 (96)   1107 (74)   1249 (97)   1300 (81)   1676 (105)  1688 (75)   1861 (76)   1848 (98)
[a]         [u]         1783 (135)  1110 (71)   1182 (112)  1031 (101)  1468 (114)  1404 (99)   1375 (108)  1124 (127)
[u]         [i]         1021 (112)  877 (47)    931 (62)    880 (58)    1353 (120)  1311 (68)   1032 (94)   1162 (215)
[u]         [u]         905 (124)   849 (36)    768 (81)    679 (54)    1277 (90)   1208 (122)  947 (32)    847 (88)
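The t statistics reported in this section can be approximated from the means and standard deviations in Tables 5.3 and 5.4. The following is a minimal sketch, assuming an independent-samples t-test with pooled variance and 15 observations per cell; the cell size is an assumption inferred only from the reported df = 28, and the function name `pooled_t` is illustrative rather than taken from the study.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled variance, computed from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Table 5.3, emphatic [t], target vowel [i]:
# V1 = [i] gives 1532 (160) Hz; V1 = [u] gives 1315 (142) Hz.
N_PER_CELL = 15  # assumption: not stated in this section, inferred from df = 28

t_stat, df = pooled_t(1532, 160, N_PER_CELL, 1315, 142, N_PER_CELL)
print(f"t = {t_stat:.2f}, df = {df}")
```

Under these assumptions the statistic for this cell comes out at roughly 3.9, close to the upper end of the range reported above for [tˤ] and [ðˤ]. Small discrepancies are to be expected because the tabulated means and standard deviations are rounded.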

The voiced uvular [ʁ] permits substantial carryover coarticulation in all VCV sequences [t ranges from 8.026 to 10.486, (df = 28); p < 0.001]. The voiceless fricative [χ] allows V1 to coarticulate with F2 transition in following [i] and [u] [t = 11.719 and 6.203, respectively, (df = 28); p < 0.001] but permits only marginally significant effects on F2 transition in following [a] [t = 1.757, (df = 28); p < 0.1]. The uvular stop [q] allows significant carryover coarticulatory effects into [i] [t = 4.194, (df = 28); p < 0.001], marginal effects into [u] [t = 1.798, (df = 28); p < 0.1], but no significant effects into [a] [t = -0.108, (df = 28); p = 0.915]. Five of the six VCV sequences involving the pharyngeals [ħ] and [ʕ] reflect significant carryover vowel-to-vowel coarticulation [t ranges from 2.869 to 8.888, (df = 28); p < 0.01]. The only exception is in the sequence uħV where only marginally significant coarticulatory effect takes place [t = 1.959, (df = 28); p < 0.1]. All six VCV sequences in which the consonant is one of the two laryngeals [h] or [ʔ] reflect significant carryover coarticulation from V1 [t ranges from 3.281 to 17.439, (df = 28); p < 0.01].

[Figure 5.3: one panel per consonant, each showing VCi, VCa, and VCu sequences; horizontal axis: First Vowel (V1), [i] or [u]; vertical axis: 500 to 2500 Hz.]

Figure 5.3. Carryover V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across the four plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ]. The effects are expressed as the mean value of F2 transition of the second vowel when the trans-consonantal vowel is either [i] or [u]. Formant values are averages across speakers. The error bars represent ± one standard deviation.

[Figure 5.4: one panel per consonant, each showing VCi, VCa, and VCu sequences; horizontal axis: First Vowel (V1), [i] or [u]; vertical axis: 500 to 2500 Hz.]

Figure 5.4. Carryover V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across [k], the three uvulars [χ, ʁ, q], the two pharyngeals [ħ, ʕ], and the two laryngeals [h, ʔ]. The effects are expressed as the mean value of F2 transition of the second vowel when the trans-consonantal vowel is either [i] or [u]. Formant values are averages across speakers. The error bars represent ± one standard deviation.

Figure 5.5 shows the averaged sizes of anticipatory and carryover coarticulation allowed by each of the 16 sounds under investigation. Each data bar represents the subtraction of the averaged F2 value in all vowels when the trans-consonantal vowel is [u] from the value when the trans-consonantal vowel is [i]. So, for example, the size of overall anticipatory effect allowed by [t] is calculated by subtracting the average of F2 transition values in all three vowels in the V1 position when V2 is [u] from the average F2 transition in all three vowels in the V1 position when V2 is [i]. The larger the number, the more sizable the coarticulatory effect permitted by the intervening consonant. It is clear from the figure that laryngeals allow the strongest coarticulatory effects. Strong effects are also allowed by the two pharyngeals and the five plain oral consonants. The three uvulars show large variability in the size of permitted coarticulatory effects. The coarticulatory effects allowed by [q] are relatively small compared to [ʁ], which allows very sizable effects. The effects permitted by [χ] are in between. The four emphatics clearly resist coarticulatory effects more than any other class of sounds. This is especially true in the anticipatory direction, where the effect permitted by [sˤ] actually has a negative value.

[Figure 5.5: bar chart with paired bars (anticipatory, carryover) for each of the sixteen consonants; vertical axis from -100 to 700 Hz.]

Figure 5.5. Sizes of anticipatory and carryover vowel-to-vowel coarticulatory effects across the sixteen Arabic consonants under investigation. The sizes of the effects are measured by subtracting the value of F2 transition of the target vowel when the trans-consonantal vowel is [u] from the value when the trans-consonantal vowel is [i]. F2 transition values are averaged across speakers.

Excluding emphatics, the degree of anticipatory coarticulation permitted by seven of the remaining 12 consonants is larger than that of carryover coarticulation. The reverse is true for the other five consonants. Across all segment types, the sizes of carryover coarticulatory effects are less variable and less extreme than those of anticipatory effects. F2 differences in the carryover direction range from 78 to 562 Hz, while in the anticipatory direction the range is between -30 and 636 Hz. It is interesting to note that while the size of carryover effects allowed by the four non-emphatic coronals is mostly constant, the two fricatives [s] and [ð] allow more anticipatory effects than the two stops [t] and [d]. Meanwhile, all four emphatics allow more carryover than anticipatory coarticulatory effects.
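The effect-size measure behind Figure 5.5 can be illustrated with the carryover means in Tables 5.3 and 5.4. The sketch below averages the F2 transition of the three target vowels after [i] and after [u] and subtracts the second average from the first. The ASCII consonant labels and the function name `effect_size` are illustrative stand-ins, and only four of the sixteen consonants are included.

```python
# Carryover F2 means (Hz) copied from Tables 5.3 and 5.4.
# Each entry maps a target vowel (V2) to (mean when V1 = [i], mean when V1 = [u]).
CARRYOVER_F2 = {
    "t": {"i": (2251, 1968), "a": (1844, 1694), "u": (1449, 1356)},
    "s_emphatic": {"i": (1586, 1461), "a": (1079, 1034), "u": (996, 930)},
    "q": {"i": (1787, 1546), "a": (1107, 1110), "u": (877, 849)},
    "glottal_stop": {"i": (2328, 1682), "a": (1848, 1124), "u": (1162, 847)},
}


def effect_size(cells):
    """Mean F2 across target vowels after [i] minus the mean after [u], in Hz."""
    after_i = sum(pair[0] for pair in cells.values()) / len(cells)
    after_u = sum(pair[1] for pair in cells.values()) / len(cells)
    return after_i - after_u


for label, cells in CARRYOVER_F2.items():
    print(f"{label}: {effect_size(cells):.1f} Hz")
```

With the rounded table values, the emphatic [sˤ] and the glottal stop come out near the two ends of the carryover range reported above (78 to 562 Hz). Exact agreement is not guaranteed, since the original figures may have been computed from unrounded per-speaker data.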


The results in this experiment indicate that MSA emphatic consonants strongly resist vowel-to-vowel coarticulation in both the anticipatory and the carryover directions while their non-emphatic counterparts do not. Like non-emphatic coronals, pharyngeals, laryngeals, and the velar [k] allow substantial vowel-to-vowel coarticulation. The impact of uvulars on vowel-to-vowel coarticulation, meanwhile, depends on their degree of constriction. Larger degrees of constriction in uvulars correspond to higher vowel-to-vowel coarticulation resistance.

The results show that the presence and size of vowel-to-vowel coarticulation permitted by a consonant depend largely on how much tongue participation is implicated in the production of the consonant. The two laryngeals [h] and [ʔ] feature little or no tongue involvement in their production. It is therefore not unexpected that these two sounds allow the most sizable vowel-to-vowel coarticulation in all sequences in both directions among the consonants investigated. During the production of laryngeals, the tongue dorsum is free to move smoothly from the position it assumes in V1 to that in V2. The two pharyngeals [ħ, ʕ] and the four plain coronals [t, d, ð, s] are produced with active tongue participation. However, this participation is limited to the tongue root in the case of the pharyngeals and the tongue tip/blade in the case of the plain coronals. The tongue dorsum, meanwhile, is not actively involved in the production of these sounds. Since the tongue dorsum is the main articulator of vowels, it is understandable that pharyngeals and plain coronals allow vowel-to-vowel coarticulation due to their minimal interference with the dorsal maneuvers. The size of vowel-to-vowel coarticulatory effects permitted by the two pharyngeals [ħ, ʕ] is somewhat lower than expected, since their main articulation takes place in the lower pharynx away from the tongue dorsum. However, it was noted in Experiment Two that the tongue dorsum is apparently more active during the production of these two sounds than one would think. In Experiment Two it was reported that there was a 'pharyngeal locus' that represents an acoustic hub from which F2 of a following vowel in a CV sequence starts. The claim presented in Experiment Two that the tongue body assumes an [a]-like articulatory target located in the middle of the front/back axis receives additional support here. If the tongue body were totally passive during the articulation of pharyngeals, we would notice more sizable vowel-to-vowel coarticulatory effects, approaching in size those noticed for laryngeals. It seems that the [a]-like target of the tongue body slightly limits the movement of the tongue dorsum. However, since this target is located around the middle of the front/back dimension, it is almost always located close to the middle of the interpolation line that extends from the articulatory target of V1 to that of V2. The overall result of this situation is that pharyngeals almost always allow significant vowel-to-vowel coarticulatory effects, but those effects are not as sizable as those allowed by laryngeals, in which the tongue body is completely free.

The velar stop [k] is produced with active tongue dorsum participation, yet it allows considerable vowel-to-vowel coarticulation in both directions; more so than plain coronals, as a matter of fact. However, it is widely acknowledged that plain velar stops greatly adapt the position of their point of constriction to the vocalic context. This way, in a VkV environment the point of velar constriction during the production of [k] is decided through interpolation between the two dorsal settings for the flanking vowels. Therefore, the adaptable placement of the tongue dorsum during [k] imposes few restrictions on the tongue movement.

The four emphatic coronals [tˤ, dˤ, ðˤ, sˤ] exhibit the most resistance to vowel-to-vowel coarticulation. These sounds involve active tongue backing that constrains the tongue dorsum greatly and prevents smooth transitioning from V1 to V2. It is important to note, however, that this resistance is not obtained for all VCV sequences containing emphatic consonants. While blocking of anticipatory vowel-to-vowel coarticulation is found in 10 of 12 VCˤV sequences, only four of 12 sequences exhibit vowel-to-vowel coarticulation blocking in the carryover direction. Moreover, while other groups of sounds exhibit generally larger coarticulatory effects in the anticipatory than in the carryover direction, emphatics exhibit the opposite pattern. This is probably due to enhancement by mechanico-inertial effects which, by nature, flow in the same direction as carryover coarticulation does.

The vowel-to-vowel coarticulatory effects recorded for uvulars vary greatly among the members of this guttural subset. The voiced fricative [ʁ] permits large coarticulatory effects in both directions in all VʁV sequences. The voiceless fricative [χ] permits vowel-to-vowel coarticulation in four of the six VχV sequences covering both directions. Size-wise, the overall amount of vowel-to-vowel coarticulatory effects allowed by [χ] is quite substantial and compares well to the pharyngeal approximants. The uvular stop [q], on the other hand, shows strong resistance to vowel-to-vowel coarticulation. Among the six VqV sequences covering both directions, only two sequences show significant vowel-to-vowel coarticulatory effects. It seems that vowel-to-vowel coarticulation resistance depends not only on the involvement of the tongue dorsum in the articulation of the intervening consonant but also on the degree of constriction executed by the dorsum. The degree of constriction involved in the production of fricatives is less than that involved in the production of stops. Within fricatives, voiceless sounds typically involve more constriction than voiced sounds in order to produce audible turbulence in the airflow. Apparently, uvular resistance to vowel-to-vowel coarticulation follows the same ranking (stop > voiceless fricative > voiced fricative). This proposal is logical if we take into account Recasens' (1985) argument that coarticulation resistance is based on the physical constraints placed on the articulators by a segment's articulation. The amount of physical involvement of the articulators required by the production of a stop is more than the amount required by the production of a homorganic fricative. The higher physical requirements yield higher articulatory constraints on the articulators. The result is that [q] exhibits high resistance to vowel-to-vowel coarticulatory effects, [χ] mostly allows such effects, while [ʁ] is highly transparent to them.

The different patterns of influence on vowel-to-vowel coarticulation between uvulars and emphatics indicate that the types of articulatory constraints each class imposes on the tongue dorsum are different. Even though the secondary articulation in emphatics is approximant by definition, it constrains the tongue more so than do the articulations of the two uvulars [ʁ, χ]. Ghazeli (1977) and Delattre (1971) report that the two uvular fricatives involve some adjustment in the position of the velum in that it is spread flat over the back of the tongue in the case of [χ] and is curled downward towards the back of the tongue in the case of [ʁ]. Furthermore, the general direction of the tongue dorsum movement during uvulars as reported by Ghazeli and others involves considerable raising along with the retraction. These are indications that the articulatory muscle that is most active physically during the production of uvulars is the palatoglossus. As explained in Chapter 2, this muscle originates from the soft palate and inserts into the back of the tongue and works to either lower the soft palate or raise the back of the tongue. Inspecting Figure 2.1, which shows the extrinsic muscles of the tongue, clearly reveals that the palatoglossus is located around the area towards which the tongue moves during the articulation of uvulars. The styloglossus also seems to be variably active during the production of uvulars. During the articulation of [χ] and [ʁ], the styloglossus is probably responsible for cradling the tongue backwards to achieve the noticeable tongue retraction. Additionally, the velar depressor muscles execute the downward movement of the velum towards the inside of the oral cavity during the articulation of [ʁ]. For [χ] and [q], the flattened and seemingly tensed shape of the velum is most likely executed by the velar elevator and tensor muscles.

Electromyographic (EMG) studies indicate that three extrinsic muscles of the tongue (the genioglossus, the styloglossus, and the hyoglossus) are the only muscles necessary for achieving the plain articulations of the different vowels (Maeda and Honda 1994, Honda et al. 1992). The action of these muscles is sufficient to cover all articulatory movements that span the vowel space. If the articulation of the uvulars [χ] and [ʁ] involves mainly the palatoglossus and the velar muscles, it becomes understandable why these sounds do not exhibit much resistance to vowel-to-vowel coarticulation. These sounds interfere only slightly with the actions of the muscle group responsible for executing vowel articulations. This leaves these muscles relatively free to transition smoothly from one vowel configuration to the other. The secondary articulation of emphatics, though somewhat variable, generally shows more retraction and less raising than the ones involved in [χ] or [ʁ]. Giannini and Pettorino (1984) suggest that the extrinsic muscles of the tongue execute the secondary articulation in emphatics. Indeed, the pattern and axis of tongue dorsum movements during emphatics are more in line with the actions of the styloglossus and the hyoglossus. The fact that the styloglossus can raise the tongue (in addition to retracting it) and that the hyoglossus can lower the tongue (also in addition to retracting it) might explain the variability in the elevation of the retracted tongue dorsum

during emphatics (compare, in particular, Ghazeli's 1977 x-ray tracings of and

Both these muscles are actively involved in the production of vowels. Since emphatics place their own articulatory demands on them, the two muscles are physically constrained during the consonant in the sequence, which greatly limits their freedom in moving from one vowel position to the other. Similarly, during the production of the uvular stop [q], participation of the styloglossus is apparently more active than during [χ] or [ʁ] since this sound involves more retraction than the two fricatives. Moreover, the occlusion during [q] is between the tongue dorsum and the soft palate in an arched stretch that spans the horizontal as well as the vertical axes. A combined raising and backing of the tongue is necessary for this occlusion to be effective. The more active involvement of the styloglossus during [q] is a logical explanation for why this sound is the only uvular that is highly resistant to vowel-to-vowel coarticulation.

The previous rationalization of the differences between the uvular fricatives on

the one hand and the emphatic consonants and the uvular stop on the other hand depends

on viewing main articulators from the angle of the individual muscles that control them.

With this view in mind we conclude that the secondary articulation in emphatics is sig-

nificantly different from the articulation of uvulars. During emphatics, the tongue dorsum

is controlled by the extrinsic muscles of the tongue. During uvulars, on the other hand,

the tongue dorsum is controlled primarily by the palatoglossus and secondarily by the

styloglossus. This view challenges existing proposals which suggest an articulatory

equivalence between the dorsal articulations in these two classes of sounds.

When inspecting the VCV coarticulatory aspects of emphatics and uvulars it appears that "coarticulatory aggression" is encountered between classes of sounds but not necessarily within the members of the same class. The term "coarticulatory aggression" was coined by Fowler and Saltzman (1993) to refer to the observation that sounds that resist coarticulatory influence from other sounds the most are the same sounds that tend to spread their own coarticulatory effects to other sounds. When comparing the findings of the present experiment with those of Experiment Two, we notice that emphatics and the uvular [q] spread the strongest coarticulatory effects to neighboring vowels and also exhibit the most resistance to vowel-to-vowel coarticulation. However, when comparing emphatic sounds to each other and uvulars to each other, this pattern is not necessarily observable. Among the four emphatics, [sˤ] is generally accompanied by the least sizable and most variable drop in F2 transition in neighboring vowels in CV and VC environments. This emphatic, however, exhibits the most resistance to vowel-to-vowel coarticulation. Moreover, among the three uvulars, the coarticulatory effect of [ʁ] on neighboring vowels is clearly more sizable than that of [χ]; yet the latter exhibits more vowel-to-vowel coarticulation resistance than the former. This leads me to suggest that coarticulatory aggression should be considered as a property of sound classes rather than individual sounds.

5.5 Summary

This experiment shows that Arabic non-emphatic coronals, pharyngeals, laryngeals, and the velar [k] allow significant amounts of anticipatory and carryover vowel-to-vowel coarticulatory effects. Arabic emphatics, on the other hand, show strong and relatively consistent resistance to such effects. The impacts of uvulars on vowel-to-vowel coarticulatory effects follow from the degrees of constriction involved in their articulations. The voiced uvular fricative [ʁ] involves a relatively mild degree of constriction. Consequently, this sound permits great degrees of vowel-to-vowel coarticulatory effects. The voiceless fricative [χ] involves a slightly higher degree of constriction and, as a result, allows similar, but not as substantial, effects. The uvular stop [q] involves the highest degree of constriction causing this sound to be mostly opaque to vowel-to-vowel coarticulation.

These results are interpreted as indications that the dorsal involvement in the articulations of emphatics and uvulars is not completely similar. When coupled with the articulatory studies cited in Chapter 2, the results suggest that the tongue dorsum during the articulation of Arabic emphatics is pulled back through the action of both the styloglossus and the hyoglossus muscles. Both these muscles are employed by vowel articulations. An intervening emphatic in VCˤV sequences would therefore place its own articulatory demands on part of the set of muscles executing the articulations of the flanking vowels. Hence, the articulatory movement from one vowel to the other would not be a free one. Uvulars, meanwhile, seem to involve only the styloglossus. The hyoglossus cannot be implicated in their articulation since the outcome of its contraction is antagonistic to the tongue raising necessary for their articulation. The degree of involvement of the styloglossus in uvulars depends on their degree of constriction. Since the stop [q] requires the most contraction by the styloglossus in order to achieve full occlusion, this uvular interferes the most with vowel-to-vowel coarticulation. The tongue raising implicated in the articulation of uvulars seems to be, at least partially, the responsibility of the palatoglossus muscle. It should also be kept in mind that all uvulars involve active participation by the soft palate that is absent in emphatics.



The articulatory details discussed in Experiments Two and Three should be re-

flected in the phonological representations of these sounds. The following chapter dis-

cusses this issue in detail and formalizes the phonological representations of these

sounds. These representations are then shown to be more adequate at addressing the chal-

lenges faced by the previous proposals discussed in §2.4. Also, an alternative reasoning

for the grouping of Arabic gutturals into a single natural class is presented.



CHAPTER 6

Implications and Alternatives

The goal of this chapter is to motivate phonological representations for Arabic

emphatics and gutturals in the light of the phonetic findings of the previous three chap-

ters. To start with, the articulatory inferences from the acoustic data obtained in the three

experiments are reviewed and elaborated on. Those articulatory views are compared and

contrasted with the articulatory presumptions behind the existing formal proposals for the

representations of Arabic emphatics and gutturals. Along the way, an alternative basis for

the grouping of the different subsets of Arabic gutturals into one phonological natural

class is proposed. Formal phonological representations for Arabic emphatics and guttur-

als are then presented. These proposals are tested against the phonological constraints and

rules discussed in Chapter 2; namely, OCP-based morpheme structure constraints on Ara-

bic roots and guttural-conditioned vowel lowering. It was stated in Chapter 2 that these

rules and constraints pose severe challenges to existing representational proposals. It is

shown that the alternative representations provided here are more capable of handling

those challenges. Finally, the classifications of emphatic and guttural inventories in Tigre,

an Ethio-Semitic language, and St'at'imcets, an Interior Salish language, are touched on.

These languages differ from Arabic in terms of the groupings of emphatics and gutturals

into natural classes in terms of place of articulation. Such differences have to be reflected

in the phonological representations of their emphatic and guttural sounds. It is argued that



there are phonetic differences among these languages that underlie the representational

differences.

6.1 Emphatic and Guttural Articulations

Recall from Chapter 2 that emphatics and gutturals cooccur in Arabic consonantal

roots. According to previous representational proposals, this fact stands in violation of the

place OCP since both classes possess similar place or articulator terminal features or

class nodes. Such similarities are expected to trigger place OCP violations and, as a con-

sequence, restrict the cooccurrence of emphatics with gutturals. It was also mentioned in

Chapter 2 that McCarthy (1994) proposes that the domain of the place OCP applicability

should be limited by the conjunction of the feature [approximant] to the feature [pharyn-

geal]. In essence, Arabic gutturals are identified, for OCP purposes, as [pharyngeal,

+approximant]. Under McCarthy's proposal, since emphatics are not approximants, they

are excluded from this domain and their cooccurrence with gutturals is not restricted

since it results in no violation of the place OCP. Crucial to the success of this proposal is

that all Arabic guttural subclasses be classified as approximants. According to Catford (1977), approximants are consonants in which the vocal tract narrowing, when compared

to fricatives, "is somewhat larger, and flow through it is turbulent only when voiceless,

otherwise it is non-turbulent" (p. 122). Clements (1990) refines the definition of ap-

proximants to include "any sound produced with an oral tract stricture open enough so

that airflow through it is turbulent only if it is voiceless" (p. 293). This definition quali-

fies all gutturals to be approximants (since they do not involve any oral constrictions) and



supports McCarthy's argument. However, recent investigations of pharyngeal/laryngeal

articulations (Esling 1996, 1999; Edmondson et al. 2005) show that the articulatory pos-

sibilities in the pharyngeal cavity include approximants, fricatives, stops, and even trills.

We cannot, therefore, presume that a non-oral articulation is by default a non-fricative

one. The power spectra of pharyngeals, as shown in Experiment One (Chapter 3), support

the claims that these sounds are approximants. Laryngeals do not involve any supraglottal

constriction of any type and thus trivially qualify as approximants. Uvular continuants, on

the other hand, possess fricative-like acoustic attributes pointing to the involvement of

substantial airflow turbulence in their production. Arabic uvular continuants are clearly

fricatives. As non-approximants (i.e., [-approximant]), uvulars are expected to cooccur

with other gutturals as per McCarthy's proposal. Moreover, uvulars should not cooccur

with emphatics. Neither of these expectations materializes, as seen in Table 2.2. The

articulatory finding about uvulars poses serious challenges to McCarthy's arguments

since it reveals incompatibilities between his phonological designation of uvular contin-

uants and their phonetic reality.
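The cooccurrence predictions discussed in this paragraph can be made concrete with a toy check. The sketch below is not McCarthy's formalization: it assumes, purely for illustration, that two root consonants are restricted from cooccurring when both bear [pharyngeal] and they agree in their value for [approximant], a simplification chosen because it reproduces the expectations stated above. The class `Segment`, the function `cooccurrence_restricted`, and the feature assignments are hypothetical names and values introduced for the example.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    label: str
    pharyngeal: bool   # bears the place feature [pharyngeal]
    approximant: bool  # True for [+approximant], False for [-approximant]


def cooccurrence_restricted(a: Segment, b: Segment) -> bool:
    """Toy place-OCP check (assumption): restricted only if both segments are
    [pharyngeal] and carry the same value for [approximant]."""
    return a.pharyngeal and b.pharyngeal and a.approximant == b.approximant


emphatic = Segment("emphatic", pharyngeal=True, approximant=False)
pharyngeal = Segment("pharyngeal", pharyngeal=True, approximant=True)
# Two competing designations of the uvular continuants:
uvular_as_approximant = Segment("uvular (phonological designation)", True, True)
uvular_as_fricative = Segment("uvular (phonetic finding)", True, False)

for uvular in (uvular_as_approximant, uvular_as_fricative):
    print(uvular.label)
    print("  restricted with pharyngeal:", cooccurrence_restricted(uvular, pharyngeal))
    print("  restricted with emphatic:  ", cooccurrence_restricted(uvular, emphatic))
```

Under the approximant designation the toy check restricts uvulars with other gutturals and leaves them free to combine with emphatics. Under the fricative designation both predictions flip, which is the mismatch with Table 2.2 that the paragraph describes.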

The consistent and sizable correlation between higher F1 transitions and Arabic

pharyngeals, as shown in Experiment Two (Chapter 4 ), strongly suggests that pharyn-

geals are produced with an actively retracted tongue root (i.e., [ +RTR]). A retraction in

the tongue root produces a constriction in the pharynx near the node of F1 (see Figure

1.1) as reflected by the high value of that formant frequency. However, although uvulars

are also associated with relatively high F1 transitions in neighboring vowels, the raising

in the case of uvulars is not as substantial, nor as consistent, as that of pharyngeals. A

logical explanation for this is that the tongue root in uvulars is not actively retracted.

The tongue root is backed only as a side effect of the retraction of the tongue dorsum,

since these structures are closely attached. With regard to laryngeals, the results

show no particular association with high F1 transitions. This is in line with the predictions

of acoustic-articulatory models since laryngeals do not involve any specific

supra-glottal narrowings of their own. Therefore, any claim pertaining to high F1 as an

acoustic correlate common among all gutturals (McCarthy 1994, Zawaydeh 1999) ex-

plaining their grouping into a natural class in phonology is challenged.

Experiment Two also shows that emphatics, like uvulars, are associated with

high F1 transitions in neighboring vowels that are not as substantial nor as consistent as

the transitions next to pharyngeals. It seems that the tongue root retraction in emphatics is

also a byproduct of the general retraction of the tongue dorsum and not an independent

gesture. The most robust and most consistent acoustic correlate to Arabic emphatics is a

drop in F2 at the transition of the neighboring vowels. The raising of F1 adjacent to em-

phatics is not as crucial to the difference between emphatics and non-emphatics. While

these findings have been reported in numerous works, the present study provides stronger

evidence, in the form of linear discriminant analysis, of the capabilities of F1 raising and

F2 lowering to classify sounds on the basis of the presence or absence of emphaticness.

The inconsistency and smaller magnitude of change in the values of F1 at the transitions

of vowels neighboring emphatics pose a serious challenge to the views that Arabic emphatics

are [+RTR] sounds (Davis 1995, Rose 1996). If the tongue root were actively retracted

during emphatics, we would expect them to consistently correspond to substantially

high F1 transitions on neighboring vowels. Furthermore, since the first node of F2

is located near the node of F1, we would also expect the [+RTR] specification in emphatics

to trigger higher F2 transitions on neighboring vowels. Since neither acoustic outcome

is what takes place in reality, the claim that the tongue root produces the secondary

articulation in emphatics has to be dismissed. McCarthy (1994), Norlin (1987), and

Zawaydeh (1999) suggest that it is the tongue dorsum that produces the pharyngeal con-

striction in emphatics. McCarthy and Zawaydeh make similar suggestions for uvulars.

Both of these claims are supported by the results in Experiment Two. A constriction produced

by a backed tongue dorsum takes place higher in the pharynx than where [+RTR]

articulations are expected. Such a constriction corresponds to the antinode of F2, which

would explain the low F2 in the transitions of vowels neighboring emphatics and uvulars.

This effect is stronger for emphatics since they involve a larger dorsal retraction than do

uvulars. I would also add here that the palatine dorsum lowering observed in emphatics

(but not uvulars) by Ali and Daniloff (1972) and Ghazeli (1977) is very important to the

acoustic correlate in question. This lowering widens the vocal tract near the second node

of F2. Since widening has the opposite effect of constriction, the result is an enhancement

of the acoustic product of the upper pharynx constriction.

The results of Experiment Two argue against any precise similarity in pharyngeal constriction

between pharyngeals on the one hand and emphatics and uvulars on the

other. The tongue root, which is controlled by the pharyngeal constrictors, is the most

logical source of the pharyngeal constriction in the former, while the latter two classes

achieve their pharyngeal constrictions through the backing of the tongue dorsum, which

is controlled by the extrinsic muscles of the tongue. But does this, then, support the view

that emphatics are uvularized? The notion of a 'uvularized' sound means one of two pos-

sibilities. The first is that the sound has a secondary articulation involving the retraction



and raising of the tongue dorsum towards the uvula. The second is that the sound has a

secondary articulation that is a full-fledged copy of primarily uvular sounds in the sense

that there is a raising and retraction of the tongue dorsum with a concomitant action (curl-

ing or flattening) of the soft palate. The articulatory studies cited in Chapter 2 do not sup-

port either one of those possibilities. During emphatic articulation, the tongue dorsum is

retracted almost horizontally towards the posterior wall of the pharynx while the soft pal-

ate shows no peculiar action. The results of Experiment Three help us understand the ar-

ticulatory differences between emphatics and uvulars (and gutturals in general). Arabic

emphatics show strong resistance to vowel-to-vowel coarticulation. This finding suggests

that these sounds employ the same set of articulatory muscles as vowels. The direction of

the tongue back movement suggests that the muscles in action are the hyoglossus and the

styloglossus. Both of these muscles pull the tongue mass back into the pharynx. As

far as the vertical placement of the tongue is concerned, these two muscles have effects that are in

direct opposition to each other. The styloglossus raises the back of the tongue while the

hyoglossus lowers it. This is the possible reason behind the variability in the height of the

tongue dorsum during emphatics. Compare the vertically balanced, or slightly lowered,

tongue dorsum during [f1] in Al-Ani's (1970) x-ray images as well as during both and

in Ghazeli's (1977) images to the lowered dorsum during in the study of Ali and

Daniloff's (1972) images as well as during in Ghazeli's images. The vector sum of

the simultaneous pull by the hyoglossus and the styloglossus is a general retraction on the

horizontal axis accompanied by slight variability in vertical positioning. All of the possi-

ble locations of the retracted tongue dorsum during emphatics occur near the first anti-

node of F2 which ensures a relatively stable acoustic effect.
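
The vector-sum reasoning in this paragraph can be illustrated with a small numerical sketch. The pull directions and magnitudes below are hypothetical values chosen only to show the arithmetic; they are not measurements taken from the x-ray studies cited above.

```python
# Illustrative sketch only: the pull vectors are hypothetical, not measured.
# Axes: x runs from back (-) to front (+); y runs from low (-) to high (+).

def vector_sum(*pulls):
    """Add 2-D pull vectors component by component."""
    return (sum(p[0] for p in pulls), sum(p[1] for p in pulls))

# Assumed directions: the styloglossus pulls the tongue back and up,
# the hyoglossus pulls it back and down.
styloglossus = (-1.0, 0.6)
hyoglossus = (-1.0, -0.7)

net = vector_sum(styloglossus, hyoglossus)

# The horizontal components add up, while the vertical components
# largely cancel, leaving a small residual whose sign depends on
# which muscle pulls harder.
print("net pull (x, y):", net)
```

With these assumed values the net pull is dominated by its horizontal component, which is the pattern the text attributes to emphatics: consistent retraction with slight variability in height.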



We mentioned in discussing the results of Experiment Two that Arabic emphatics

show similar acoustic coarticulatory effects on neighboring vowels as those reported for

velarized sounds (Ladefoged and Maddieson 1996). This particular similarity merits fur-

ther elaboration. Russian is one language that has a set of velarized sounds. A rich source

of articulatory and acoustic evidence on Russian is Bolla (1981).

Figure 6.1. X-ray tracings of palatalized and velarized laterals and liquids in Russian (Bolla 1981, plates 76-79).

Figure 6.1 shows Bolla's x-ray tracings of Russian velarized [lˠ] and [rˠ] as well as their palatalized

counterparts [lʲ] and [rʲ] for the sake of comparison. We can clearly see that Russian velarized

sounds display tongue dorsum retraction into the upper pharynx that is quite similar to the

retraction involved in Arabic emphatics. In fact, Bolla refers to these Russian variants as

'pharyngealized'. It seems that the articulatory configurations of Arabic emphatics are

not as rare or as unique as one would think when reviewing a large portion of the re-

search done on them. It seems also that none of the articulation-based terms used to refer

to Arabic emphatics mentioned in Chapter 1 are accurate.¹⁹ The term 'pharyngealized' is

better suited for sounds where the secondary articulation takes place lower in the pharynx

like Tsakhur vowels (Hess 1992). Such articulations are more likely executed by the pharyngeal

constrictor muscles, not the tongue muscles. I propose here calling Arabic emphatics

(as well as Russian velarized sounds, dark [l] in English, and similar sounds) 'oropharyngealized'

in reference to the general place where the retracted tongue dorsum

moves to. I, therefore, call for a new IPA diacritic to symbolize this articulation. The diacritic

[ˤ] used in the Handbook of the IPA (1999), as well as throughout this dissertation,

to symbolize secondary articulation in emphatics is more suited for true pharyngealized

sounds such as those in Tsakhur. The diacritic [ˠ] is derived from the corresponding symbol

[ɣ] used for voiced velar fricatives. This symbol would be misleading since the

tongue dorsum movement during emphatics does not involve any substantial raising to-

wards the velar region.

19 'Velarized' seems to be relatively more appropriate than the terms pharyngealized and uvularized since it groups emphatics with phonetically similar sounds encountered in other languages. However, all three terms are not accurate articulatorily. Norlin (1987) rejects the term velarization based on the incompatibility between the real acoustic effects of emphaticness on neighboring vowels and the effects yielded by vocal tract models of velarization. It should be noted, however, that Norlin interprets the term 'velarization' literally (raised tongue dorsum towards the velar region) as shown in his vocal tract model implementations.



Further evidence against the claim that the secondary emphatic articulation is a

retraction of the tongue root is provided by Experiment Three (Chapter 5). In this ex-

periment, emphatics exhibit strong resistance to vowel-to-vowel coarticulation. This is to

be expected since their articulation largely employs the tongue dorsum which is the main

articulator of vowels. Had the tongue root been the implicated articulator, we would ex-

pect more substantial vowel-to-vowel coarticulation across emphatics since the articula-

tion of Arabic vowels does not implicate the tongue root. Uvulars range from strongly

opaque to vowel-to-vowel coarticulation to highly transparent to it. The ranking of resistance

to vowel-to-vowel coarticulation among uvulars falls in a specific order: [q] > [χ] > [ʁ].

This is also the ranking of the degree of constriction involved in those three

sounds. For the tongue back to execute a stronger degree of constriction in the uvular

area, it has to be raised higher and brought back further. The articulatory studies cited

earlier show exactly that. Raising and backing is what contracting the styloglossus does

to the tongue dorsum. So, this muscle has to be implicated in the articulation of all three

uvulars. However, since uvulars as a group allow vowel-to-vowel coarticulation, their

articulation cannot be executed primarily by the styloglossus as suggested in McCarthy

(1994). If the styloglossus were the main contributor in uvular articulation, we would expect

much stronger resistance to vowel-to-vowel coarticulation since the styloglossus is a

very active muscle during vowel articulation. The degree of tongue body retraction in

uvulars is mostly less than in emphatics. On the other hand, the tongue dorsum is raised

considerably in uvulars as opposed to emphatics. There has to be a difference in the muscles

articulating the two classes of sounds. Raising the back of the tongue in the manner displayed by

uvulars is most likely due to the contraction of the palatoglossus. This is a very natural

assumption given the fact that this muscle is also affiliated with the soft palate, which is also

actively involved in uvular articulations. The exclusion of the hyoglossus from uvulars

makes further sense now. The hyoglossus pulls the tongue dorsum downwards away from

the uvular point of articulation in an action that is fully antagonistic to that of the pala-

toglossus (Seikel et al. 1997). So, emphatic articulation is most likely carried out by two

muscles, the styloglossus and the hyoglossus, that are also active in the production of

vowels while uvulars are produced mainly by the action of the palatoglossus along with

the velar depressors (for [ʁ]) and velar tensors (for [χ] and [q]). The styloglossus is argued

here to be also involved in uvulars to maintain a constriction narrow enough for the

production of obstruents. The magnitude of styloglossus contribution increases in line

with the degree of constriction required by the uvular sound. Since the styloglossus is

also active in vowel production, uvulars resist vowel-to-vowel coarticulation in a manner

that reflects the degree to which this muscle is involved in their production.

In sum, the pharyngeal constrictions involved in emphatics and gutturals are the

products of different muscular mechanisms. Laryngeals are produced by the constriction

or spreading of the vocal folds which are controlled by the intrinsic muscles of the larynx.

Pharyngeals are clearly produced with a retracted tongue root which is controlled by the

pharyngeal constrictors. Uvulars are produced by complex velar and lingual maneuvers.

In [ʁ], the velum is curled downwards and inside the mouth. In [χ] and [q], the velum is

pulled upwards and flattened. The tongue dorsum in all three uvulars is pulled up and

backwards. These maneuvers are executed by both lingual muscles (styloglossus and

palatoglossus) and velar muscles (tensors and depressors as well as the palatoglossus

again). The secondary articulation in emphatics is a retraction of the tongue dorsum as a



product of the contraction of both the hyoglossus and the styloglossus. The acoustic find-

ings also show that Arabic pharyngeals are approximants while uvular continuants are

fricatives. Following is a discussion of the phonological ramifications of these articula-

tory details.

6.2 Alternative Basis for the Guttural Natural Class

As explained in Chapter 2, McCarthy (1994) excludes the possibility that

Arabic gutturals share a common active articulator. He offers instead the place of articu-

lation feature [pharyngeal] to represent these sounds phonologically. McCarthy justifies

this feature, which covers a wide articulatory area, on the basis of Perkell's (1980) pro-

posals that distinctive features are "orosensory patterns" that provide feedback informa-

tion specific to each articulatory movement and that are linked to consistent acoustic

characteristics. McCarthy rationalizes that the neuro-sensorily-impoverished pharynx (in-

cluding the larynx) should be treated as a single place of articulation. After explaining his

main proposal, McCarthy adds:

"The orosensory target model is not the only possible approach to the

problem posed by the feature [pharyngeal]. One obvious alternative is that

the pharynx has a uniform characterization in motoric terms. Clearly this

is not true at the lowest level: the uvular constriction is presumably made

primarily by a gesture of the styloglossus, while the true pharyngeals ʕ

and ħ are formed by the pharyngeal constrictors and the glottals are made


by the intrinsic muscles of the larynx. But it is certainly possible that these

consonants form a motoric unity at some much higher level." (p. 201)


McCarthy's assumption that uvulars are primarily controlled by the styloglossus

causes him to miss the "higher level" of "motoric unity" among the various gutturals. The

rationalization for the differences between the articulations of uvulars and emphatics proposed

earlier depends on dissecting the main articulators into the components of their

musculature. So the articulatory maneuvers and constraints are viewed as properties of

the individual muscles rather than the complex speech organs that are controlled by several

muscles. Interestingly, this point of view has another important explanatory potential

as far as the unification of gutturals into a single natural class is concerned. When consid-

ering the innervation sources of the individual articulatory muscles, we notice that the

tongue muscles (including the styloglossus and the hyoglossus) receive their motor in-

nervation from the XII hypoglossal cranial nerve (Zemlin 1968, Perkins and Kent 1986,

Seikel et al. 1997). Meanwhile, motor innervation of the palatoglossus and most velar

muscles is supplied by the X vagus cranial nerve. An exception is the tensor veli palatini

which is innervated by the mandibular branch of the V trigeminal nerve. The X vagus

also supplies motor innervation for most of the pharyngeal muscles including the pharyn-

geal constrictor muscles that are the main active muscles during the articulation of

pharyngeals. One notable exception is the stylopharyngeus muscle which is innervated by

the IX glossopharyngeal cranial nerve. The intrinsic muscles of the larynx that control the

vocal folds and glottal spreading or constriction also receive their motor innervation from

the X vagus. So the main muscles in the pharyngeal region that are implicated in the ar-

ticulations of the three guttural subgroups share a main source of innervation: the X



vagus cranial nerve. This shared neuromotor conduit that is not involved in the produc-

tion of oral sounds should be seriously considered when asking why three sound sub-

classes that are produced at three different points of articulation are viewed as compo-

nents of a single class of sounds.
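
The innervation argument can be restated as a simple lookup. The sketch below encodes only the muscle-to-nerve pairings stated in this section; the dictionary and function names are hypothetical, and listing a single "main" muscle group per class is a simplification of the account given here (uvulars, for instance, are also argued to involve the styloglossus secondarily).

```python
# Motor innervation as stated in the text (limited to the muscles discussed).
INNERVATION = {
    "styloglossus": "XII hypoglossal",
    "hyoglossus": "XII hypoglossal",
    "palatoglossus": "X vagus",
    "tensor veli palatini": "V trigeminal",
    "pharyngeal constrictors": "X vagus",
    "stylopharyngeus": "IX glossopharyngeal",
    "intrinsic laryngeal muscles": "X vagus",
}

# Main muscles argued to drive each sound class (a simplification).
MAIN_MUSCLES = {
    "laryngeals": ["intrinsic laryngeal muscles"],
    "pharyngeals": ["pharyngeal constrictors"],
    "uvulars": ["palatoglossus"],
    "emphatics": ["styloglossus", "hyoglossus"],
}

def nerves(sound_class):
    """Return the set of cranial nerves supplying a class's main muscles."""
    return {INNERVATION[m] for m in MAIN_MUSCLES[sound_class]}

gutturals = ("laryngeals", "pharyngeals", "uvulars")
shared = set.intersection(*(nerves(c) for c in gutturals))

print("shared by gutturals:", shared)
print("emphatics:", nerves("emphatics"))
```

Under these assumptions the three guttural subclasses intersect on the X vagus, while the main muscles of the emphatic secondary articulation are supplied by the XII hypoglossal.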

This neuromotor-based grouping of gutturals, along with the previous explana-

tions of vowel-to-vowel coarticulatory patterns involving Arabic uvulars and emphatics,

is largely in accord with Joos' (1948) "Overlapping Innervation Wave Theory"

re-advocated recently by Lindblom and Sussman (2002) and Lindblom et al. (2002). Joos'

concept is based on the assumption that speech segments are the result of individual neu-

ral waves sent simultaneously but individually from the speech control center to the in-

volved articulators. These waves increase and diminish in time and overlap with the

waves of neighboring segments. Resolving different neural instructions carried to the ar-

ticulator by neighboring waves is simply a matter of combination or subtraction. If the

two waves place non-contradicting demands on the articulator, the vector sum of the two

movements is executed. If the two articulatory demands are in contradiction with each

other, the weaker wave is subtracted from the stronger wave and the result gets executed

by the articulator. The innervation waves are static and invariant at the highest level of

organization. Joos refers to these abstract forms as 'neuremes'-his equivalent to the

phoneme. Aside from the theoretical details of this model, its basic concepts are not very

different from some of the more recent models that have been proposed to explain coar-

ticulation; mostly those of Fowler and Recasens (see §5.1).
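
The resolution rule described above can be sketched in a few lines. This is a deliberately reduced, one-dimensional illustration, not Joos' own formalization: each wave is represented as a signed strength along a single hypothetical movement axis, and the function name and numeric values are assumptions made for the example.

```python
def resolve(wave_a, wave_b):
    """Combine two overlapping innervation demands on one articulator.

    Each demand is a signed strength along one movement axis
    (sign = direction, magnitude = strength). This is a simplified
    one-dimensional reading of the rule described in the text.
    """
    if wave_a * wave_b >= 0:
        # Non-contradicting demands: the movements are summed.
        return wave_a + wave_b
    # Contradicting demands: subtract the weaker from the stronger;
    # the result keeps the direction of the stronger wave.
    stronger, weaker = sorted((wave_a, wave_b), key=abs, reverse=True)
    magnitude = abs(stronger) - abs(weaker)
    return magnitude if stronger > 0 else -magnitude

# Two demands in the same direction reinforce each other.
print(resolve(0.5, 0.25))
# A strong demand opposed by a weak one is only attenuated.
print(resolve(1.0, -0.25))
```

In one dimension the two branches reduce to ordinary signed addition; they are written out separately here only to mirror the two cases distinguished in the description.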

The neural waves to the guttural articulators travel through separate pathways

from the neural waves to the lingual articulators. Hence, the neuromotor articulatory instructions

for a pure guttural sound would not clash with those for vowels. This is why, in

Experiment Three, vowel articulations move relatively smoothly from one vowel to the

other through an intervening laryngeal or pharyngeal. In the secondary articulation of

emphatics, the implicated muscles and neural pathways are the same ones implicated in

the articulations of vowels. This way, the neural instructions to the articulators clash, resulting

in an interruption of the articulatory transition from one vowel to the next through

an intervening emphatic. In uvulars, both neural pathways (lingual and guttural) are em-

ployed. However, the strength of the signal traveling through the lingual neural pathway

depends on the required degree of constriction. In [q], the degree of constriction is maxi-

mal. Hence, the lingual neural signal is relatively strong and is capable of interrupting the

exchange between the neural instructions of the flanking vowels. This translates into

strong resistance to vowel-to-vowel coarticulation. In [ʁ], the degree of constriction is

minimal, meaning the lingual neural wave is weak and is easily overridden by those of the

flanking vowels. This is why [ʁ] allows substantial vowel-to-vowel coarticulation. The

voiceless [χ] falls in between, both in lingual involvement and in transparency to

vowel-to-vowel coarticulation.

Clearly, then, the main muscles argued here to be involved in the articulation of

uvulars are closely related, from a neuromotoric angle, to those involved in the articula-

tion of both pharyngeals and laryngeals. We can, therefore, attribute those muscles to a

single active articulator.²⁰ This articulator extends from the anterior faucial pillars to the

larynx, inclusively.

20 Kenstowicz (1994) briefly makes a similar suggestion noting that "[a] staunch supporter of the articulator model might speculate that the Glottal and Tongue Root articulators run along the same pathway and branch more superficially: the relevant sense of 'articulator' would be defined at this more abstract level" (p. 457). Zawaydeh (1999) also generally refers to the pharynx as an active articulator. She provides no convincing reasoning, however.

This view can be considered as a complement to Perkell's and

McCarthy's identification of distinctive features as orosensory patterns. Like all guttural

articulations, all lingual articulations share a single source of motoric innervation (the XII

hypoglossal). However, unlike the pharynx, the tongue is divided into two distinct active

articulators: the tongue blade and the tongue dorsum. This disparity can be understood

when considering the lower degree of tactile sharpness available to the general pharyn-

geal articulator as opposed to that available to the lingual articulators. The more detailed

sensory feedback in the oral cavity maximizes the efficiency of the XII hypoglossal nerve

as a single conduit of motoric innervation to two active articulators.

The guttural components in the formal representations of Arabic gutturals should

be used in reference to this common articulator. So, based on the implicated muscles and

their motoric innervation sources, laryngeals, pharyngeals, and uvulars should be re-

garded as involving guttural components (corresponding to the intrinsic laryngeal mus-

cles, the pharyngeal constrictors, and the palatoglossus with the soft palate muscles along

with the X vagus nerve which innervates those muscles). Uvulars should also contain a

dorsal component (corresponding to the styloglossus muscle and the XII hypoglossal

nerve which innervates all of the tongue muscles). On the other hand, emphatics should

involve only a dorsal component (corresponding to the styloglossus and the hyoglossus as

well as the XII hypoglossal nerve). Since none of the muscles innervated by the X vagus

nerve are argued to be actively involved in the articulation of Arabic emphatics, no gut-

tural components should be present in their formal representations.



6.3 Formal Representations

Formal phonological representations of Arabic emphatics and gutturals are pre-

sented here in the light of the previous elaboration on the articulatory traits of those

sounds. The general representational framework adopted here is that of Halle et al.'s

(2000) Revised Articulator Theory (RAT) presented in ( 1) in the Introduction. There are

two main reasons for this choice. First, the anatomical organization of the vocal tract is

reflected accurately in this model. Most notably, the grouping of the Tongue Root with

the Larynx under the Guttural node reflects the close relationship between the muscles

controlling those organs as pointed out by the authors. This particular point strongly agrees

with the explanations and arguments in §6.2 above regarding the articulatory affinities

among guttural sounds. Second, the model introduces a formal way of reflecting the dif-

ferences between primary and secondary articulations that avoids the shortcomings of

competing proposals. The latter point will be explained shortly.

I propose here some changes to Halle et al.'s (2000) model. First, the term 'Place'

should be replaced with 'Oral'.²¹ Using the term 'Place' conveys the idea that no portion

of the vocal tract is considered an encompassing place of articulation but the oral tract.

Clearly the pharynx and the larynx are other vocal tract portions where speech articula-

tions are made. Moreover, the term 'Oral' highlights the dichotomy of the vocal tract into

two main articulatory zones, the oral tract and the guttural tract. Alternatively, McCarthy

(1994) suggests retaining the Place node and bifurcating it into an Oral node and the feature

[pharyngeal] (see (16)). According to McCarthy, this solution is motivated in part by

some phonological rules of vowel-to-vowel assimilation that are blocked by Oral consonants

but not guttural ones. Note, however, that Halle et al.'s model does not need to resort

to such a stipulation since there is an inherent separation between the Oral and the Guttural

nodes.

21 The similar term "Oral Cavity" is used by Clements (1987) to refer to the class node that dominates the node Place (basically the equivalent of Place in the Halle et al. 2000 model) and the stricture feature [±continuant].

Second, I propose changing the name of the articulator node 'Tongue Root'

to 'Pharynx'. While the term 'Tongue Root' is not formally problematic, it remains a

misnomer and a potential source of confusion. Tongue root-based articulations are clearly

not mainly executed by any of the tongue muscles. Rather such articulations are con-

trolled by the pharyngeal constrictors. Thus, using the term 'Pharynx' instead adds to the

clarity of the phonetic naturalness of the model. Lastly, I suggest adding the feature [ap-

proximant] to the class features [consonantal] and [sonorant]. According to Clements

(1990), approximants act as a group in some phonological processes. Both McCarthy

(1994) and Padgett (1995) argue that this feature interacts with OCP-based restrictions on

morpheme well-formedness. The modified feature tree is shown in (20).

(20) Modified version of Halle et al.'s (2000) feature tree.

Root: [consonantal], [sonorant], [approximant]
    Features linked directly to the root: [suction], [continuant], [strident], [lateral]
    Oral
        Lips: [round], [labial]
        Tongue Blade: [anterior], [distributed], [coronal]
        Tongue Body: [high], [low], [back], [dorsal]
    Soft Palate: [nasal], [rhinal]
    Guttural
        Pharynx: [ATR], [RTR], [radical]
        Larynx: [spread gl], [constricted gl], [stiff vf], [slack vf], [glottal]

The Halle et al. (2000) model acknowledges six articulators and represents them

by the nodes Lips, Tongue Blade, Tongue Body, Soft Palate, Tongue Root, and Larynx.

The first three articulator nodes are grouped under the higher level node Place while the

last two are grouped under Guttural. One feature under each articulator node is unary

while the rest are binary. The unary features are the six articulator features [labial], [coronal],

[dorsal], [rhinal], [radical], and [glottal]. The presence of any of those features in

the representation of a phoneme indicates that its respective articulator is the designated

articulator for that phoneme. If two of those features are present, the phoneme is considered

a complex sound with two primary articulations. The authors use the phoneme /k͡p/

as an example. For this sound, both [labial] and [dorsal] are specified under their respective

articulator nodes, indicating that the sound is labiodorsal. If one of the articulations in

a complex sound is primary and the other is secondary, only the articulator feature of the

primary articulation is present. The example used by the authors for this type of sounds is

the phoneme /kʷ/. In this case, only the feature [dorsal] is present indicating that this

sound is primarily dorsal. To indicate that the sound is a labialized dorsal, the feature

[+round] is specified under the Lips articulator without the presence of the feature [la-

bial].



In this sense, Arabic emphatics have the articulator feature [coronal] present un-

der the articulator node Tongue Blade since Arabic emphatics are primarily coronal. The

feature [+back] is specified under the Tongue Body articulator node signifying the pres-

ence of a secondary backing of the tongue dorsum in Arabic emphatics. It should be

noted that this representational proposal for Arabic emphatics is not new. In illustrating

the superiority of the RAT model over Herzallah's (1990) V-Place-based model in repre-

senting Arabic emphatics, Halle et al. (2000) offer a formal proposal containing the fea-

ture [+back] under the Tongue Body node. I adopt this representation here with the minor

replacement of Place with Oral in accordance with the general modification to the model

stated earlier. The modified representation is displayed in (21).

(21) Representation of Arabic emphatics (slightly modified from Halle et al. 2000:409).

        tˤ dˤ ðˤ sˤ
             |
           Oral
          /    \
      Blade    Body
        |        |
      [cor]   [+back] [-high] [-low]

The relevance of the features [-high] and [-low] is not of great importance for our spe-

cific purposes. It can be argued, however, that these two features are a phonological re-

flection of the articulatory vertical equilibrium of the simultaneous use of two muscles to

execute the dorsal retraction: the styloglossus, which is also a tongue back elevator, and

the hyoglossus, which is also a tongue back depressor.



In the same context where Halle et al. present their representation for emphatics,

they propose the representation shown in (22) for Arabic uvulars that treats them primarily

(and solely) as velars.

(22) Halle et al.'s representation of Arabic uvulars (2000:409)

      χ ʁ q
        |
      Place
        |
      Body
        |
      [dors] [+back] [-high] [-low]

In this representation, the main articulation of uvulars is identical to the secondary articu-

lation of emphatics. It appears that Halle et al. (2000) agree with the view that Arabic

emphatics are 'uvularized' coronals which is a view that this dissertation rejects. The

acoustic evidence presented here favors the well-established view that uvulars are complex

segments that have both a dorsal and a radical component (see, in particular, Elorrieta

1991). However, unlike the prominent claim that Arabic uvulars are doubly articulated

(McCarthy 1994, Davis 1995, Zawaydeh 1999), I argue here that Arabic uvulars,

including the stop [q], are primarily guttural and secondarily dorsal. Phonetic evidence for

this comes from the timing of tongue movements involved in the articulation of uvulars.

Ladefoged and Maddieson (1996) note that, in doubly articulated sounds, both articulations

are simultaneous. It was mentioned in Chapter 2 that Delattre (1971) notes in his x-ray

study that there is a two-stage movement by the tongue body during the articulation



of Arabic uvulars. The first is a horizontal sliding backwards and the second is a raising

of the retracted tongue body towards the soft palate. It is clear that the two articulations

involved in the production of Arabic uvulars are not simultaneous. Furthermore, accord-

ing to Ladefoged and Maddieson (1996), only stops and nasals may be doubly articu-

lated. For fricatives produced by two gestures, one of those gestures has to be considered

a secondary articulation. Secondary articulations usually start before and end after pri-

mary ones (emphasis spread comes to mind here). It is logical to assume that the backing

movement is the secondary articulation in uvulars while the raising is the primary articulation.

It has already been explained that tongue retraction is a result of contraction of the

styloglossus (keeping in mind that the hyoglossus cannot be contracted during uvulars). I

therefore consider Arabic uvulars to be primarily guttural and secondarily dorsal. The

designated articulator feature for their primary articulation is [radical] under the Pharynx

node. Hence, I give uvulars the phonological representation in (23).

(23) Alternative representation of Arabic uvulars

χ ʁ q
  Oral
    Body: [+back] [-high] [-low]
  Guttural
    Pharynx: [radical] [-RTR]

Notice that I designate uvulars as [-RTR]. This goes against several previous claims

(Davis 1995, Halle 1995, Rose 1996, Shahin 1997, Zawaydeh 1999). Previous proposals

that Arabic uvulars are [+RTR] rely, for the most part, on Ghazeli's (1977) x-ray images.



These images are interpreted as depictions of a tongue root retraction in uvulars. How-

ever, a look at the tongue root and epiglottis locations in Ghazeli's images of plain oral

consonants reveals that these locations in Ghazeli's subject (himself) start at rest points

that are further back than usual. It is only logical that a backing of the tongue dorsum, such as what we see in uvulars, results in further backing of the epiglottis and tongue root, which are attached to it. Given the already backed start locations, the added by-product retraction might seem subjectively large. On the other hand, if we consider Delattre's (1971) x-ray images of Arabic uvulars, we see no substantial backing of the tongue root and epiglottis. Additionally, the acoustic evidence reported in Experiment Two shows that F1 transitions in vowels neighboring uvulars, while generally high, are not as high as in vowels neighboring pharyngeals. The claim that uvulars are [+RTR] is unsubstantiated phonetically.

As for Arabic pharyngeals and laryngeals, I give them the representations in (24).

We have argued earlier that only pharyngeals involve an active retraction of the root of the tongue (through the action of the pharyngeal constrictors). Accordingly, the feature [+RTR] is present only in these sounds. The whole supralaryngeal tract is passive during laryngeals. Therefore, these sounds are considered [-RTR].

(24) Alternative representations of Arabic pharyngeals and laryngeals

ħ ʕ
  Guttural
    Pharynx: [radical] [+RTR]

h ʔ
  Guttural
    Pharynx: [radical] [-RTR]



Now that we have introduced the alternative proposals for the formal phonologi-

cal representations of Arabic emphatics and gutturals, let us review the capability of these

proposals to overcome the descriptive and analytical shortcomings of the previous pro-

posals. We will discuss how these alternatives handle the Arabic MSCs of root cooccur-

rence restrictions and guttural-conditioned vowel lowering.

6.3.1 Arabic Morpheme Structure Constraints Revisited

It was mentioned in §2.5 that existing representational proposals predict that Ara-

bic emphatics and gutturals should not cooccur in the same roots since the secondary ar-

ticulation in emphatics and the primary articulation in gutturals are similar at the level of

the terminal features ([pharyngeal] - McCarthy 1991, 1994), class nodes (Pharyngeal - Rose 1996), or Lower Vocal Tract node (LVT - Zawaydeh 1999), triggering an OCP

violation. In reality, Arabic emphatics and gutturals cooccur quite freely as seen in Table

2.2. The alternative representations proposed here overcome this problem trivially since

the secondary articulation in emphatics is fundamentally different from the primary ar-

ticulations in gutturals. However, there are two problems in this context that we still need

to address. The first problem concerns the free cooccurrence of emphatics and uvulars

while both share dorsal components. It was noted in §2.5 that combinations of emphatics

and velars or uvulars and velars are avoided in Arabic roots. We can easily say that since

velars are dorsal sounds and both emphatics and uvulars are secondarily dorsal, the pres-

ence of an emphatic or a uvular in an Arabic root that also has a velar stands in violation

of the OCP. But why do emphatics and uvulars cooccur rather freely when both have

secondary dorsal components? The second problem concerns the free cooccurrence of the



uvular stop [q] with low gutturals (pharyngeals and laryngeals). We would expect the cooccurrence of [q], as a guttural itself, with other gutturals to be significantly more restricted than what Table 2.2 shows. We propose here some modifications to the domain

and mechanism for the application of place OCP to address these two problems in order.

Let us start with the issue of emphatic-uvular unrestricted cooccurrence. We men-

tioned in §2.5 that Pierrehumbert (1993) requires the place OCP effect to exclude secon-

dary articulations. This requirement, in its present shape, is too loose and would wrongly

predict that velars should cooccur freely with emphatics as well as with uvulars. I would

like to modify this requirement by stating that place OCP effect excludes secondary ar-

ticulations only if both affected articulations are secondary. What this means is that a

primary articulation cannot cooccur with a similar articulation whether it is primary or

secondary. On the other hand, a secondary articulation can cooccur with a similar secondary articulation. To satisfy this requirement, it is necessary to modify the level at which the place OCP operates. In existing proposals, the place OCP applies to individual autosegmental place features (Mester 1986; McCarthy 1988, 1991, 1994; Yip 1989 - see

also Padgett 1995 for the role of stricture features in place OCP application). I propose

here that the place OCP functions at the level of the articulator class nodes in Arabic

roots. In the Halle et al. (2000) articulator-based model a terminal articulator feature is

basically a label indicating that its dominating articulator class node is the designated articulator executing the stricture features of the segment. So there is some equivalence

between the articulator features and their dominating articulator nodes. There is an addi-

tional benefit to this proposal. Place OCP affects only articulator features but not binary

features like [±back]. McCarthy (1986; cited by Padgett 1995) argues that articulator fea-



tures that are subject to place OCP effects must be privative. While this requirement is

inherited in the Halle et al. (2000) model used here, arguing for class nodes as the domain

over which place OCP operates discards the need for this requirement to begin with.

The presence of a terminal articulator feature in a sequence of sounds makes the

tier of that feature, along with the tier of its dominating articulator node, visible for the

place OCP. If a certain articulator feature is not present in any one of a sequence of pho-

nemes, the tier of its dominating node is invisible to the place OCP. So, if two segments

employ an articulator which is secondary in both cases, these segments are not avoided

since the relevant articulator nodes are not visible to the effect of the OCP. If, on the

other hand, the articulation in question is secondary in one sound but primary in the

other, the relevant tier on which the articulator node is projected would be visible and the

OCP would take effect. So, the place OCP can be formulated as seen in (25).

(25) Place OCP

* X                 X
  |                 |
  Artic. Node Y     Artic. Node Y
  |                 |
  [artic.]          ([artic.])

In (25), the placement of the terminal articulator feature of the second phoneme in parentheses indicates that the presence of the terminal articulator feature in one or both phonemes triggers an OCP violation at the node level, since it takes only one terminal articulator feature to expose the tier of its dominating node to OCP operation. All Arabic gutturals share the same articulator ([radical]), causing the node Pharynx to be visible to

place OCP effects. Thus, their cooccurrence in the same root is restricted since it violates

the place OCP as shown in (26; irrelevant nodes and features are omitted).

(26) * X            X
       |            |
       Pharynx      Pharynx
       |            |
       [radical]    [radical]

The cooccurrence of emphatics with the velar [k] (and similarly the cooccurrence of uvu-

lars with [k]) is restricted since the Tongue Body node is exposed to the effects of the

place OCP. This exposure is due to the existence of the dependent articulator feature

[dorsal] in [k]. An example is given in (27).

(27) * tˤ             k
       |              |
       Tongue Body    Tongue Body
                      |
                      [dorsal]



As for the cooccurrence of emphatics and uvulars, neither class has the feature [dorsal] in

its representation causing the tier of the articulator node Tongue Body to be invisible for

place OCP purposes as shown in (28). Therefore, emphatics and uvulars may cooccur

freely in Arabic roots.

(28) X

Tongue Body Tongue Body

Specifying the applicability domain of the place OCP over the tier of articulator

class nodes, as we do here, is, in some ways, procedurally similar to the treatment of root

cooccurrence restrictions in Padgett's (1995) 'articulator group' proposal. Padgett proposes that the elements addressed by the place OCP are the articulators. Upon identifying articulator similarities, the OCP mechanism then checks for similarities in "OCP-subsidiary features" which are stipulated in the language for a given articulator.

We turn now to the free cooccurrence of the uvular stop [q] with low gutturals.

We saw earlier that the cooccurrence of two coronals in Arabic roots is not avoided

unless both are sonorants, fricatives, or stops. McCarthy (1994, and references therein) uses the limiting statements in (29) to govern the domain within which the OCP may apply to restrict the root cooccurrence of Arabic coronals.



(29) Applicability domain of the OCP for Arabic coronals (from McCarthy 1994:206)

a. [coronal] / ___          b. [coronal] / ___
   [αcontinuant]               [αsonorant]

The first statement denotes that OCP-based restrictions apply to two coronals only if they

share the same feature specification for [continuant]. The second statement indicates that

OCP-based restrictions apply to two coronals only if they share the same feature specifi-

cation for [sonorant]. Going back to the issue at hand, notice that Arabic low gutturals are

approximants while uvulars (including [q]) are not. Meanwhile, all Arabic gutturals, excluding [q], are continuants. The feature specifications for [approximant] and [continuant] for Arabic gutturals are listed in (30).

(30) Stricture features specifications for Arabic gutturals

          [approx]   [cont]
q            -          -
χ ʁ          -          +
ħ ʕ          +          +
h ʔ          +          +

Notice that, in terms of articulatory stricture, the uvular stop [q], on the one hand, and the

low gutturals, on the other hand, are maximally different. Both feature specifications for

these two sides are opposite each other. Uvular continuants stand in the middle, sharing

one feature specification with each side. It is possible to use those differences and rela-

tions in stricture features to refine our understanding of place OCP-based restrictions on

the cooccurrence of gutturals. Interaction between stricture and place OCP application is

well documented (see Padgett 1995 and references therein). According to Padgett, a language stipulates which features interact with place OCP applicability on a given articulator. Whereas this stipulation is integrated in a process of 'checking' for OCP applicability

in Padgett's theory, I follow McCarthy in stating the stipulation as independent limiting

statements whose satisfaction is a prerequisite for the application of the place OCP.

Based on these two constriction features, I would like to present the statement in (31) to

limit the place OCP applicability domain for Arabic gutturals.

(31) Applicability domain of the OCP for Arabic gutturals

[radical] / ___
[αcont] or [αapprox]

According to this statement, two guttural segments must share the same specification for

[approximant] or [continuant] to trigger a place OCP violation. The uvular stop [q] does

not cooccur with either [χ] or [ʁ] since all three sounds are [-approximant]. All gutturals,

excluding [q], are [+continuant], explaining their rare cooccurrences. On the other hand,

[q] does not share any of the feature specifications for [approximant] and [continuant]

with pharyngeals and laryngeals. The cooccurrence of [q] with low gutturals, therefore,

lies outside the limitations of (31).
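The limiting statement in (31) can be checked mechanically against the stricture values in (30). The following Python sketch is only an illustration: the table STRICTURE and the function place_ocp_applies are hypothetical names, and True and False stand for the plus and minus values of each feature.

```python
# Stricture values as listed in (30); True stands for "+", False for "-".
STRICTURE = {
    "q": {"approx": False, "cont": False},
    "χ": {"approx": False, "cont": True},
    "ʁ": {"approx": False, "cont": True},
    "ħ": {"approx": True, "cont": True},
    "ʕ": {"approx": True, "cont": True},
    "h": {"approx": True, "cont": True},
    "ʔ": {"approx": True, "cont": True},
}


def place_ocp_applies(seg_a, seg_b):
    """Statement (31): two gutturals fall within the domain of the place OCP
    only if they agree in [continuant] or in [approximant]."""
    a, b = STRICTURE[seg_a], STRICTURE[seg_b]
    return a["cont"] == b["cont"] or a["approx"] == b["approx"]


if __name__ == "__main__":
    for pair in [("q", "χ"), ("χ", "ħ"), ("ħ", "ʕ"), ("q", "ħ"), ("q", "ʔ")]:
        status = "restricted" if place_ocp_applies(*pair) else "free"
        print(pair, status)
```

Only the pairs that combine [q] with a pharyngeal or a laryngeal come out as free, which matches the pattern the statement is meant to capture.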

In sum, the proposed representations for Arabic emphatics and gutturals, in con-

junction with the formulations in (25) and (31) offer more adequate explanations for the

patterns of root cooccurrences involving emphatics and gutturals than previous proposals.

We turn next to the issue of guttural lowering in vowels.



6.3.2 Guttural Lowering Revisited

In this section, we take a look at the issue of vowel lowering in the neighborhood

of gutturals but not emphatics. Section 2.3.2 reviewed some of the phonological evidence

presented by a number of phonologists illustrating that, in several languages, vowels are

lowered to [a] or epenthetic vowels surface as [a] when adjacent to gutturals. McCarthy

(1994) treats this as a spreading of the feature [pharyngeal] from the guttural to the

vowel. Emphatics, which in McCarthy's proposals also have the feature [pharyngeal], do

not trigger such lowering effects. McCarthy vaguely appeals to the link between ap-

proximants and low vowels to justify this disparity. In the alternative proposals we pre-

sent here, vowel lowering can be treated as the spreading of the feature [radical] from

gutturals to target vowels. The spreading rule is formulated in (32). Since emphatics un-

derlyingly lack the articulator node Pharynx, they do not cause vowel lowering.

(32) Guttural Vowel Lowering

[radical]

To explain the relationship between the low vowel [a] and the articulator [radical], I use

Calabrese's (1993, cited in Halle et al. 2000) 'equivalency relations'. According to Halle

et al., "[t]his idea is implemented formally by positing that Universal Grammar includes a

special set of rules whose adoption ... is favored over the adoption of other rules". Halle

et al. posit a rule which equates [dorsal, -back] in consonants with [coronal] in vowels.



Following this reasoning, we can posit a similar rule which states that [radical] in consonants is equivalent to [dorsal, +low] in vowels.

There remains an important issue to be tackled here. While emphatics generally

do not trigger vowel lowering, there is a process discussed in Herzallah (1990) and McCarthy (1994) whereby, in some varieties of Arabic, vowel lowering appears in the adjacency of both gutturals and emphatics. This process, known as ʔimāla 'raising and fronting', is reported in Northern Palestinian Arabic and Syrian Arabic, among other Eastern Arabic dialects. In these dialects, the feminine suffix surfaces as [-i] or [-e] unless the stem-final consonant is a guttural, an emphatic, or a contextually emphaticised [r]. Evidence from Northern Palestinian, as shown in Herzallah (1990:136-137), is presented in (33).

(33) ʔimāla in Northern Palestinian Arabic (Herzallah 1990:136-137)

Plain stems
hilm-i 'one dream'
sall-i 'a basket'
barz-i 'projection'
samak-i 'a fish'
marʒ-i 'a small plain'
farʃ-i 'a mattress'

Emphatic stems
batˤtˤ-a 'a duck'
buusˤ-a 'a bamboo stick'
buuzˤ-a 'ice cream'
marˤrˤ-a 'once'
fariidˤ-a 'an obligation'

Guttural stems
falq-a 'a piece, or a half'
salx-a 'one skinning'
mam-a 'loitering'
fallah-a 'a peasant'
zariiʕ-a 'plants, offspring'
walh-a 'love, or sudden awakening'

Herzallah analyses this process as the spreading of [pharyngeal] to the suffix

vowel. This process seemingly challenges my present argument that emphatics do not

have a pharyngeal component. Notice, however, that the fact that [r] causes vowel lowering only when contextually emphaticised indicates that the representational constituent whose spreading triggers ʔimāla does not have to be part of that sound's underlying representation.

22 Herzallah uses the symbol [a] for the emphaticised version of [a].

To account for this process, I propose here that Eastern Arabic dialects have a specific rule that assigns [radical] to any sound that has the feature set [+back, -high, -low].

This rule is ordered before the vowel lowering rule in (32). In some sense, this proposal is

the reverse of McCarthy's (1994) proposal that the feature [dorsal] is redundantly as-

signed to emphatics. McCarthy's proposal is not a natural one since not all pharyngeal

constrictions are executed by the tongue body. Recall that in pharyngeals the tongue dor-

sum is not usually retracted, while in laryngeals the whole tongue is virtually passive. It

makes little sense, then, to assume that a redundancy rule would assign [dorsal] to a [pha-

ryngeal] sound only if it is an emphatic. The reverse rule proposed here is more natural

since it can be linked to the fact that retracting the tongue dorsum necessarily results in an

oropharyngeal narrowing since the tongue mass would have no other place to move to.

Furthermore, the link between sounds that involve a retracted tongue dorsum and guttural

articulations is not rare as the following section shows.
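The ordering of the two rules proposed in this section can be illustrated with a short Python sketch. It is a deliberate simplification and not a formal statement of the rules: segments are reduced to flat feature sets, the feminine suffix is taken to be -i unless lowered, and the names assign_radical, lower_suffix, and derive_suffix are hypothetical.

```python
BACKED = frozenset({"+back", "-high", "-low"})


def assign_radical(features):
    """Redundancy rule: [+back, -high, -low] segments receive [radical]."""
    return features | {"radical"} if BACKED <= features else features


def lower_suffix(features, suffix):
    """Rule (32), simplified: a [radical] stem-final consonant lowers the
    suffix vowel to a."""
    return "a" if "radical" in features else suffix


def derive_suffix(stem_final, order=("assign", "lower")):
    """Apply the two rules in the given order to a stem-final consonant."""
    features, suffix = frozenset(stem_final), "i"
    for rule in order:
        if rule == "assign":
            features = assign_radical(features)
        elif rule == "lower":
            suffix = lower_suffix(features, suffix)
    return suffix


# Simplified feature sets (assumptions for illustration only).
EMPHATIC = {"coronal", "+back", "-high", "-low"}
PHARYNGEAL = {"radical", "+RTR"}
PLAIN = {"coronal"}

if __name__ == "__main__":
    print(derive_suffix(EMPHATIC))                            # assignment feeds lowering
    print(derive_suffix(EMPHATIC, order=("lower", "assign"))) # reversed order
    print(derive_suffix(PHARYNGEAL), derive_suffix(PLAIN))
```

With the proposed order, the assignment rule feeds lowering, so emphatics pattern with gutturals. If the order is reversed, only segments that are underlyingly [radical] lower the suffix, which corresponds to the general pattern in which emphatics do not trigger lowering.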

6.4 A Note on Ethio-Semitic and Interior Salish

Having discussed the alternative representations of Arabic emphatics and guttur-

als, we turn our attention briefly to Ethio-Semitic and Interior Salish. Phonological data

from both language families has already been reviewed in Chapter 2. The main issue we

discuss here is that both language families seem to treat their emphatics, in terms of place

of articulation, as members of the guttural natural class to the exclusion of laryngeals. Laryngeals in these languages are considered placeless. This is different from the case of Arabic, where emphatics are not usually grouped with gutturals as a natural class while laryngeals are (as reflected by the free cooccurrence of the members of both classes in consonantal roots as well as most cases of vowel lowering). Ethio-Semitic evidence for this

difference comes from Tigre (Lowenstamm and Prunet 1985, 1987; cited in Rose 1996)

where pharyngeals and emphatics (ejectives) trigger vowel lowering, but not laryngeals.

Interior Salish evidence is cited in §2.3.1 where, in Moses-Columbian, a pharyngeal

sound cannot be followed by another pharyngeal, a uvular, or an emphatic (retracted al-

veolar) in the same root. No such restriction on laryngeals exists. The differences in

sound groupings among the three languages (Arabic, Tigre, and Moses-Columbian) merit

some consideration. I point out here that there are phonetic differences among the three

languages as far as emphatics and gutturals are concerned. These differences call for a

reconsideration of the formal representations of emphatics and gutturals in the three lan-

guages.

We start with Tigre. The emphatic sounds in Tigre are known to be ejectives. This

means that the airstream pressure source in the articulation of Tigre emphatics is glottalic

rather than pulmonic. According to Catford (1977), in the glottalic initiation of airstream

pressure "the glottis is tightly closed, the larynx is jerked upwards by action of the extrin-

sic laryngeal muscles, which attach the larynx to the hyoid bone and other structures

above it". Catford also adds that, "[t]here may, in addition, be some secondary sphinc-

teric compression of the pharynx" (p. 68). This maneuver pushes the air volume above

the glottis outwards. This laryngeal action employs both the intrinsic and the extrinsic

muscles of the larynx. The intrinsic muscles execute the glottal closure while the extrinsic



muscles raise the whole larynx to push the air volume. The sphincteric narrowing of the

pharynx is caused by the pharyngeal constrictors. According to Seikel et al. (1997), the

laryngeal elevators include the extrinsic muscles of the larynx as well as the thyropharyngeus muscle which is part of the large inferior pharyngeal constrictor muscle. Both

the intrinsic muscles of the larynx and the pharyngeal constrictors are guttural muscles

according to the understanding we established here. It is safe to assume, then, that Tigre

emphatics involve a guttural articulation not present in Arabic emphatics. Proposed rep-

resentations for Tigre pharyngeals and emphatics are shown in (34). Following Rose

(1996), Tigre ejectives are represented as [+RTR]. This feature receives phonetic backing

from Catford's (1977) description cited earlier. The presence of a guttural component in

Tigre gutturals and emphatics should explain their treatment as a natural class in phonology.

(34) Representation of Tigre emphatics (ejectives) and pharyngeals

a. Ejectives
   Oral
     T. Blade: [coronal]
   Guttural
     Pharynx: [+RTR]

b. Pharyngeals
   Guttural
     Pharynx: [radical] [+RTR]

With regard to Interior Salish, the acoustic work by Bessell and Czaykowska-Higgins (1992) shows that there are strong phonetic similarities between emphatics (or

retracted alveolars, as they are often called), on the one hand, and the uvular and pharyn-



geal gutturals, on the other. They provide acoustic vowel space charts showing the effects

of those sounds on vowels. The charts clearly show that emphatics, uvulars, and pharyngeals in Salish correspond to substantially high F1 transitions and substantially low F2 transitions. In Arabic, by comparison, we saw in Chapter 4 (Experiment Two) that emphatics correspond to substantially low F2 transitions and mildly high F1 transitions. Arabic pharyngeals, meanwhile, correspond with substantially high F1 transitions and

slightly lowered F2 transitions. These results suggest that there is a difference in the ar-

ticulatory nature of emphatics and gutturals in Arabic and Salish. A recent ultrasonic

study of Sta'at'imcest (Lillooet Salish) by Namdaran (forthcoming) strongly backs up

this suggestion. The relatively large group of guttural sounds in Sta'at'imcest is generally

similar to that in other Interior Salish dialects. Namdaran reports that emphatics in

Sta'at'imcest are articulated with a retracted tongue root. Retraction of the tongue dorsum in these sounds seems to be dialect dependent. This is different from Arabic, where the tongue root is only retracted as a byproduct of the strong tongue dorsum retraction. Furthermore, Namdaran notes that the tongue root retraction in Sta'at'imcest [ʕ] is not severe (cf. Arabic pharyngeals). This sound also involves a substantially retracted tongue dorsum somewhat similar to the uvular [q]. In Sta'at'imcest pharyngeals, the pharyngeal constriction takes place in the upper pharynx. As a matter of fact, Namdaran describes the Sta'at'imcest sounds [ʕ, ʕʷ, ʕ', ʕ'ʷ] as pharyngealized uvulars. Arabic [ʕ] is articulated with a largely retracted tongue root while the tongue dorsum is held in a mid oral position (Ghazeli 1977, Delattre 1971). There are clearly fundamental differences between Arabic and Sta'at'imcest in the articulations of emphatics and gutturals. Both classes of

sounds in Sta'at'imcest involve active retraction of the tongue root and, depending on the



dialect, active retraction of the tongue dorsum. Representations of Sta'at'imcest emphat-

ics and pharyngeals are shown in (35). The representation of Sta'at'imcest emphatics is

essentially a version of the representation given by Namdaran (forthcoming) modified

slightly to fit the Halle et al. (2000) model.

(35) Representations of Sta'at'imcest retracted alveolars and pharyngealized uvulars

a. Retracted Alveolars
   Oral
     T. Blade: [coronal]
   Guttural
     Pharynx: [+RTR]

b. Pharyngealized Uvulars
   Oral
     T. Body: [+back]
   Guttural
     Pharynx: [radical] [+RTR]

The reasoning behind the grouping of Tigre and Sta'at'imcest gutturals into a sin-

gle natural class is less abstract than the one provided for Arabic in §6.2. The natural

class of gutturals in Tigre and Sta'at'imcest includes all sounds that are produced with a

retracted tongue root (i.e., [+RTR]). Laryngeals are excluded from these classes since

they do not involve any supraglottal constriction of their own.
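The contrast between the two ways of defining the guttural class can be sketched in a few lines of Python. The inventories below are small illustrative samples with simplified feature sets based on (23), (24), and (34); the symbol "t'" for a Tigre ejective, the empty set for placeless laryngeals, and the function natural_class are assumptions made for the example.

```python
# Illustrative sample inventories with simplified feature sets.
ARABIC = {
    "tˤ": {"coronal", "+back", "-high", "-low"},          # emphatic
    "q": {"+back", "-high", "-low", "radical", "-RTR"},   # uvular
    "ħ": {"radical", "+RTR"},                             # pharyngeal
    "h": {"radical", "-RTR"},                             # laryngeal
}

TIGRE = {
    "t'": {"coronal", "+RTR"},   # ejective (emphatic)
    "ħ": {"radical", "+RTR"},    # pharyngeal
    "h": set(),                  # laryngeal, treated as placeless
}


def natural_class(inventory, feature):
    """Return the segments of an inventory that carry the given feature."""
    return {seg for seg, feats in inventory.items() if feature in feats}


if __name__ == "__main__":
    print(sorted(natural_class(ARABIC, "radical")))  # Arabic gutturals
    print(sorted(natural_class(TIGRE, "+RTR")))      # Tigre gutturals
```

Defining the class by [radical] admits Arabic laryngeals and excludes emphatics, whereas defining it by [+RTR] admits Tigre ejectives and excludes laryngeals.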

6.5 Summary

This chapter puts together the results of the previous three acoustic experiments and arrives at a more detailed understanding of the articulatory traits of Arabic emphatic and guttural sounds. These details are used to motivate alternative formal representations of these sounds. Emphatics are argued to be primarily coronal and secondarily

dorsal sounds. Uvulars are argued to be primarily radical and secondarily dorsal. Pharyn-

geals and laryngeals are argued to be radical sounds. Only pharyngeals are argued to be

[+RTR] sounds in Arabic. The three guttural subclasses are argued to be produced by a

single active articulator, the pharynx, based on high-level motoric unity among the differ-

ent guttural muscles producing those subclasses. The formal representations presented

here are shown to be better at handling the theoretical challenges facing existing propos-

als. Emphatics are allowed to cooccur freely with gutturals since emphatics do not share a

pharyngeal component with gutturals. While both emphatics and uvulars have dorsal

components, it is argued that since these components are representations of secondary

articulations in both classes of sounds, they do not trigger any OCP violations. Hence, the

cooccurrence of emphatics and uvulars is not restricted. The free cooccurrence of the

uvular stop [q], which is considered here to be a guttural, with low gutturals is argued to

be a consequence of maximal distinction between this stop and the low gutturals in terms

of stricture features.

Vowel lowering in guttural contexts is argued to be the result of spreading the articulator feature [radical]. Emphatics do not generally cause vowel lowering since they lack the feature [radical]. However, to account for cases where both emphatics and gutturals cause vowel lowering in some Eastern Arabic dialects, it is argued here that a redundancy rule in such dialects assigns [radical] to any sound with the feature specification [+back, -high, -low]. In these dialects, the assigned feature [radical] in emphatics and the contextually emphaticised [r], along with the underlying feature [radical] in gutturals, is spread to the target vowel by a single vowel lowering rule.



Two other languages that have emphatic and guttural sounds are briefly dis-

cussed: Tigre and Sta'at'imcest. What is interesting about these two languages is that

they group emphatics with the gutturals and exclude laryngeals. The phonological identi-

ties of emphatics and gutturals in these two languages have to be different from those in

Arabic. It is argued here that there are phonetic differences that underlie these phonological differences. Tigre emphatics are ejectives, meaning that they involve a pharyngeal action of larynx raising. Furthermore, these sounds are argued elsewhere to be [+RTR], an articulatory property not uncommon in ejective sounds. Recent articulatory investigations of Sta'at'imcest show that its emphatics (or retracted alveolars) always involve a retracted tongue root and occasionally involve a retracted dorsum. The pharyngeals in this language involve a sizeable dorsal retraction along with the tongue root retraction, causing their narrowest constriction to take place in the upper pharynx. Emphatics in both languages are argued to include radical components representing their active secondary pharyngeal articulations. The natural class of gutturals in Tigre and Sta'at'imcest is therefore defined as the sounds that have the feature [+RTR]. This admits emphatics and pharyngeals (and uvulars in the case of Sta'at'imcest; Tigre has no uvulars) and excludes laryngeals.



CHAPTER 7

Conclusion and Future Directions

This dissertation is motivated primarily by the descriptive and analytical inade-

quacies facing existing formal representations of Arabic emphatic and guttural sounds. It

is believed that these inadequacies are a direct consequence of a lack of understanding of

the phonetic similarities and differences among those sounds. This dissertation aims to

further our understanding of the phonetic qualities of those sounds in order to arrive at

alternative formal representations that are more capable of handling the relevant phono-

logical phenomena such as root cooccurrence restrictions and vowel lowering. The dis-

sertation also seeks to propose a more elaborate reasoning for the grouping of the three

Arabic guttural sub-classes (uvulars, pharyngeals, and laryngeals), which are produced at

distinct points of articulation, into a single natural class. To achieve these goals, the dis-

sertation relies on a set of three acoustic experiments using speech samples from Modern

Standard Arabic. The dependence on acoustic data to further our knowledge about articu-

lation follows from the well-established theories and models of acoustic-articulatory rela-

tions.

The first acoustic experiment focuses on two gaps in the acoustic literature on

Arabic emphatics and gutturals. The first gap is the lack of any extensive and objective acoustic research on the spectral differences between Arabic emphatic consonants and

their non-emphatic counterparts. The spectral shapes of consonants are among the most

important correlates to articulation. The possibility that consonantal spectral correlates to



emphaticness could be detected is in need of extensive, modern research. The second gap

is the lack of any reliable phonetic characterization of the consonantal status of Arabic

uvular continuants. These sounds are mostly characterized as fricatives. However, in

some phonological views, these sounds are classified as approximants. A principled pho-

netic judgment is needed in this matter. This experiment addresses these two issues by

describing and comparing the spectral shapes of the consonants in question using two

spectral analysis methods: spectral moments and multi-band spectra. The latter is a novel

tool being introduced for the first time in this dissertation. The results indicate that no

highly reliable acoustic correlates to emphaticness can be located in the spectral shapes of

the consonants. The canonical spectra of consonants are, therefore, not considered reli-

able sources for emphatic/non-emphatic acoustic differences. More importantly, the re-

sults show that Arabic uvular continuants have the spectral qualities of fricatives. This

finding has major impacts on any phonological views that are predicated on classification

of all Arabic gutturals as approximants.

The second acoustic experiment focuses on the coarticulatory impact of emphatics, non-emphatics, and gutturals on the formant frequencies of adjacent vowels. The different coarticulatory impacts of these sounds on the F1 and F2 of adjacent vowels are interpreted as indications of the articulatory qualities of those sounds. The results show that

the most reliable coarticulatory correlate to emphaticness is a substantially low and stable

F2 locus in the vowel adjacent to the emphatic sound. Uvulars are also associated with

low F2 transitions. These transitions, however, do not point to identifiable F2 loci in adjacent vowels. The size of the F2 drop in uvulars depends on the identity of the vowel. Pharyngeals, meanwhile, are associated with consistently high F1 transitions.



While emphatics and uvulars are also generally accompanied by high F1 transitions,

these transitions are not as high nor as consistent as those accompanying pharyngeals.

These findings are interpreted as indications that the only sounds in Arabic where the

tongue root is actively retracted are pharyngeals. Emphatics and uvulars, meanwhile, are

produced with a retracted tongue dorsum. Any tongue root retraction in emphatics and

uvulars is considered a by-product of the dorsal retraction. These findings pose chal-

lenges to the phonological views that represent Arabic emphatics and uvulars as [+RTR]

sounds. Furthermore, the more stable association between low F2 transitions and emphat-

ics, as opposed to uvulars, is interpreted as an indication that the dorsal retractions in em-

phatics and uvulars are not quite similar.
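
One common way to quantify an F2 locus is a locus equation: F2 at the vowel onset is regressed on F2 at the vowel midpoint across several vowel contexts, and the locus is the frequency at which the fitted line crosses the line onset = midpoint. The sketch below illustrates that idea only. This summary does not state how loci were estimated in the experiment, and the function names and formant values are invented for the example.

```python
def fit_locus_equation(f2_mid_hz, f2_onset_hz):
    """Least-squares fit of onset = slope * mid + intercept."""
    n = len(f2_mid_hz)
    if n < 2 or n != len(f2_onset_hz):
        raise ValueError("need at least two paired measurements")
    mean_x = sum(f2_mid_hz) / n
    mean_y = sum(f2_onset_hz) / n
    sxx = sum((x - mean_x) ** 2 for x in f2_mid_hz)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(f2_mid_hz, f2_onset_hz))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def locus_frequency(slope, intercept):
    """Frequency where the fitted line meets onset == mid."""
    if abs(1.0 - slope) < 1e-9:
        # Onset simply tracks the vowel: no identifiable locus.
        raise ValueError("slope of 1 implies no identifiable locus")
    return intercept / (1.0 - slope)


# Hypothetical F2 values (Hz) for one consonant before three vowels.
f2_mid = [2200.0, 1400.0, 900.0]
f2_onset = [1600.0, 1200.0, 950.0]
slope, intercept = fit_locus_equation(f2_mid, f2_onset)
locus = locus_frequency(slope, intercept)
```

Under this formulation, a slope near 0 means the onset stays put whatever the vowel (a stable locus), whereas a slope near 1 means the onset follows the vowel and no single locus can be identified.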

The third acoustic experiment investigates the coarticulatory effects of two flanking vowels on each other across an intervening consonant, which was a plain oral, an emphatic, or a guttural. Such coarticulatory effects are usually influenced by the articulatory restriction placed on the tongue dorsum by the intervening consonant. The results show that plain orals and low gutturals allow significant amounts of anticipatory and carryover vowel-to-vowel coarticulatory effects. Arabic emphatics, on the other hand, strongly and consistently resist these effects. The impacts of uvular gutturals on vowel-to-vowel coarticulation depend on the degrees of constriction involved in their articulations. The voiced uvular fricative [ʁ], which involves mild constriction, permits large degrees of vowel-to-vowel coarticulatory effects. The voiceless fricative [χ], which involves a higher degree of constriction, allows coarticulatory effects that are not as substantial. The uvular stop [q], which involves the highest degree of constriction, strongly resists vowel-to-vowel coarticulation. These results are interpreted as indications that the involvement of the tongue dorsum in the articulations of emphatics and uvulars is not fully similar. The tongue dorsum during the articulation of Arabic emphatics seems to be pulled back through the action of both the styloglossus and the hyoglossus muscles. Both of these muscles are employed in vowel articulations, which explains why these consonants strongly resist vowel-to-vowel coarticulation. By comparison, the dorsal retraction in uvulars seems to involve only the styloglossus, since contraction of the hyoglossus is directly antagonistic to the tongue raising necessary for their articulation. The degree of involvement of the styloglossus in uvulars depends on their degree of constriction. Since the stop [q] requires the most contraction by the styloglossus, this uvular interferes the most with vowel-to-vowel coarticulation. The tongue raising during the articulation of uvulars seems to be, at least partially, the responsibility of the palatoglossus muscle. All three uvulars also involve active participation by the soft palate.

The articulatory perspectives gained from the three acoustic experiments are summarized in Chapter 6 and used to revise the phonological representations of Arabic emphatics and gutturals. Emphatics are argued to include a dorsal component in their representation, but no pharyngeal component. Uvulars are argued to include both dorsal and pharyngeal components. Pharyngeals and laryngeals should include pharyngeal components only. Formal representations reflecting these views are presented and shown to be more adequate at handling root cooccurrence restrictions and vowel lowering in Arabic than existing proposals. The chapter also introduces a new reasoning for the grouping of Arabic gutturals into a single natural class. The three guttural subclasses are argued to be produced by a single active articulator, the pharynx, based on high-level motoric unity among the different guttural muscles producing those subclasses. The chapter also briefly discusses two other languages that possess emphatic and guttural sounds, Tigre and St'at'imcets. These two languages group emphatics with the gutturals into a single natural class and exclude laryngeals. It is argued that there are phonetic differences that underlie these phonological differences. Unlike Arabic emphatics, Tigre and St'at'imcets emphatics seem to involve active pharyngeal articulations. The natural class of gutturals in Tigre and St'at'imcets is defined as the set of sounds that have the feature [+RTR]. This admits emphatics and excludes laryngeals.
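
The proposed representations for Arabic can be restated as sets of place components, with a natural class read off as every sound class that carries a given component. This is a deliberately simplified sketch: the dictionary `ARABIC_REPRESENTATIONS` and the function `natural_class` are illustrative names, only the components named in this summary are encoded, and the formal representations of Chapter 6 are richer than this.

```python
# Place components attributed to each Arabic sound class in this summary.
# Other components (e.g. the primary oral place of emphatics) are omitted.
ARABIC_REPRESENTATIONS = {
    "emphatic": {"dorsal"},
    "uvular": {"dorsal", "pharyngeal"},
    "pharyngeal": {"pharyngeal"},
    "laryngeal": {"pharyngeal"},
}


def natural_class(representations, component):
    """Return the set of sound classes whose representation has `component`."""
    return {
        name
        for name, components in representations.items()
        if component in components
    }


# Gutturals: every class with a pharyngeal component (emphatics excluded).
gutturals = natural_class(ARABIC_REPRESENTATIONS, "pharyngeal")

# Classes sharing dorsal retraction: emphatics and uvulars.
dorsals = natural_class(ARABIC_REPRESENTATIONS, "dorsal")
```

Read this way, the Arabic guttural class follows from a single shared component, while emphatics pattern with uvulars only through the dorsal component.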

Much of the mischaracterization of the articulatory properties of the sounds inves-

tigated in this dissertation stems from the physical proximity and dependency between

the precise articulators executing those sounds. In future research, these sounds can be

subjected to extensive and direct articulatory studies. A very telling method would be

electromyography (EMG). This method can measure and diagnose individual muscle ac-

tivities with precision. This method is used to study speech sound articulation. However,

there are not many published works in this field as far as pharyngeal articulations are

concerned. A possible reason is that EMG applications are quite invasive since they re-

quire the insertion of electrodes in the forms of small pins or plates into the musculature.

One could imagine how difficult this would be if the musculature being studied is in a

more internal organ like the pharynx or the larynx. Thus, finding and applying an alterna-

tive to EMG that is less invasive would add great depth to our understanding of the nu-

ances of speech production.

Studies of muscle fiber types in the different articulatory muscles could also be

very informative as far as the distinctions among articulators are concerned. Muscle fi-

bers are generally classified into slow-twitch Type I fibers and fast-twitch Type II fibers.



Dr. Raymond Kent brought to my attention the possibility of grouping and classifying

articulators based on the distribution of different types of fibers in their controlling mus-

cles (see Kent 2004). The oral-guttural dichotomy could potentially be based on the ratio

of slow-twitch fibers to fast-twitch fibers present in the musculatures of those cavities. So

far, no studies dedicated to that goal have been conducted.

Staying in the field of articulatory studies, one can see that most articulatory

works reviewed in this dissertation employ x-rays. This method, while still considered

very helpful for studying speech production, poses some health risks to subjects. Pro-

longed exposure to radiation is not a liability experimenters and participants are easily

willing to take. The risks are aggravated if one intends to study speech production in mo-

tion. One downside to this is that all x-ray studies cited here had very few subjects, or

even just one. At times it would be only the experimenters themselves who are willing to

take the risk. Computerized tomography (CT) scans also use x-rays and, therefore, inherit

the same health risks. Recently, use of ultrasound imaging and magnetic resonance imag-

ing (MRI) as methodological alternatives has generated increasing interest. These meth-

ods carry much less health risk than x-rays. The advent of 3-D ultrasound and motion

"open MRI" carries with it exciting new methodological potential for speech sciences.

Employing these new methods can enrich our understanding of the details of pharyngeal

articulations and speech production as a whole.

This dissertation introduces the multi-band spectral (MBS) method as a refine-

ment of the venerable FFT method for the purpose of characterizing obstruent power

spectra. This method shows real potential as an economical, quantitative alternative to FFT

spectra. Given that the application of this method in the present dissertation is limited to



eight sounds, five subjects, and one language, extending its use to include more sounds

from more subjects and from various languages is of the essence. This should test the

method's cross-subject, cross-examiner, and cross-linguistic reliability.

Finally, a cross-linguistic articulatory and phonological comparison of guttural

sounds is needed in order to provide more solid grounding for their phonological repre-

sentations. As we briefly see in Chapter 6 of this dissertation, some guttural sounds in

unrelated, or partially related, languages that are phonologically presumed to be equiva-

lents are substantially different articulatorily. This can result in theoretical descriptive and

analytical problems. Given that phonological representations of speech sounds are, in

general, phonetically grounded, starting out with a solid, thorough, and methodologically

advanced phonetic (articulatory and acoustic) understanding of the similarities and differ-

ences among the guttural sounds in different languages should be the first step before

characterizing them phonologically. I believe that this notion is true for all speech sounds

in general. In this manner, observable and testable facts can play an even larger role in

specifying abstract, higher-level phonological units.



REFERENCES

Al-Ani, S. (1970). Arabic phonology; an acoustical and physiological investigation. The Hague: Mouton.

Ali, L. & Daniloff, R. (1972). A contrastive cinefluorographic investigation of the articulation of emphatic-non emphatic cognate consonants. Studia Linguistica, 26, 81-105.

Ali, L. & Daniloff, R. (1974). The perception of coarticulated emphaticness. Phonetica, 29, 225-231.

Alwan, A. (1989). Perceptual cues for place of articulation for the voiced pharyngeal and uvular consonants. Journal of the Acoustical Society of America, 86, 549-556.

Anderson, S. (1985). Phonology in the twentieth century: theories of rules and theories of representations. Chicago: University of Chicago Press.

Ar-Razi, Muhammad Ibn Abi Bakr. (1976). Mukhtar Al-Sihah. (M. Khatir, Ed.). Cairo: al-Hay'ah al-Misriyah al-'Ammah lil-Kitab. (Original work undated).

Avery, P. & Rice, K. (1989). Segment structure and coronal underspecification. Phonology, 6, 179-200.

Baalbaki, R. (1995). Al-Mawrid: A modern Arabic-English dictionary. Beirut: Dar El-Ilm Lilmalayin.

Bateson, M. (1967). Arabic language handbook. Washington: Center for Applied Linguistics.

Bessell, N. (1992). Towards a phonetic and phonological typology of postvelar articulations, Ph.D. Dissertation, University of British Columbia.

Bessell, N. & Czaykowska-Higgins, E. (1992). Interior Salish evidence for placeless laryngeals. Proceedings of North Eastern Linguistics Society. 22, 35-49.

BIAS, Inc. (1996). Peak LE. Computer software.

Bickley, C. & Stevens, K. (1987). Effects of a vocal-tract constriction on the glottal source: data from voiced consonants. In T. Baer, C. Sasaki and K. Harris (eds.), Laryngeal function in phonation and respiration. Boston: College-Hill Press.

Bladon, R. & Al-Bamerni, A. (1976). Coarticulation resistance in English /l/. Journal of Phonetics, 4, 137-150.

Blevins, J. (2004). Evolutionary phonology: The emergence of sound patterns. Cambridge: Cambridge University Press.

Blumstein, S. & Stevens, K. (1979). Acoustic invariance in speech production: Evidence from measurements of the spectral characteristics of stop consonants. Journal of the Acoustical Society of America, 66, 1001-1017.

Boersma, P. & Weenink, D. (1992). PRAAT. Computer program.

Boff Dkhissi, M-C. (1983). Contribution à l'étude expérimentale des consonnes d'arrière de l'arabe classique (locuteurs marocains). Travaux de l'Institut de Phonétique de Strasbourg, 15, 1-363.

Bolla, K. (1981). A conspectus of Russian speech sounds. Köln: Böhlau.

Brame, M. (1972). On the abstractness of phonology: Maltese? In M. Brame (ed.), Contributions to Generative Phonology, 22-61. Austin: University of Texas Press.

Broselow, E. (1979). Cairene Arabic syllable structure. Linguistic Analysis, 5, 542-582.

Butcher, A. & Ahmad, K. (1987). Aerodynamic characteristics of pharyngeal consonants in Iraqi Arabic. Phonetica, 44, 156-172.

Card, E. (1983). A phonetic and phonological study of Arabic emphasis, Ph.D. Dissertation, Cornell University.

Catford, J. (1977). Fundamental problems in phonetics. Bloomington: Indiana University Press.

Chiba, T. & Kajiyama, M. (1958). The vowel, its nature and structure. Tokyo: Phonetic Society of Japan. (Originally published in 1941).

Choi, J. (1995). An acoustic-phonetic underspecification account of Marshallese vowel allophony. Journal of Phonetics, 23, 323-347.



Choi, J. & Keating, P. (1991). Vowel-to-vowel coarticulation in three Slavic languages. University of California Working Papers in Phonetics, 78, 78-86.

Chomsky, N. & Halle, M. (1968). The sound pattern of English: Studies in language. New York: Harper & Row.

Clark, J. & Yallop, C. (1995). An introduction to phonetics and phonology. Cambridge, Massachusetts: Blackwell.

Clements, G. N. (1985). The geometry of phonological features. Phonology Yearbook, 1985, 225-252.

Clements, G. N. (1987). Phonological feature representation and the description of intrusive stops. CLS 23: Parasession on Autosegmental and Metrical Phonology. Chicago. 29-50.

Clements, G. N. (1990). The role of the sonority cycle in core syllabification. In J. Kingston and M. Beckman (eds.), Papers in Laboratory Phonology I, 283-333. Cambridge: Cambridge University Press.

Davis, S. (1995). Emphasis spread in Arabic and grounded phonology. Linguistic Analysis, 26, 465-498.

Delattre, P. (1971). Pharyngeal features in the consonants of Arabic, German, Spanish, French, and American English. Phonetica, 23, 129-155.

Delattre, P., Liberman, A., & Cooper, F. (1955). Acoustic loci and transitional cues for consonants. Journal of the Acoustical Society of America, 27, 769-773.

Dembowski, J. (1998). Articulator point variability in the production of oral stop consonants, Ph.D. Dissertation, University of Wisconsin-Madison.

Edmondson, J., Esling, J., Harris, J., & Huang, T. (2005). A laryngoscopic study of glottal and epiglottal/pharyngeal stop and continuant articulations in Amis--an Austronesian language of Taiwan. Language and Linguistics, 6, 381-396.

El-Dalee, M. (1984). The feature of retraction in Arabic, Ph.D. Dissertation, Indiana University.

El-Halees, Y. (1985). The role of F1 in the place-of-articulation distinction in Arabic. Journal of Phonetics, 13, 287-298.

Elorrieta, J. (1991). The feature specification of uvulars. Proceedings of the West Coast Conference on Formal Linguistics. 10, 139-149.

Esling, J. (1996). Pharyngeal consonants and the aryepiglottic sphincter. Journal of the International Phonetic Association, 26, 65-88.

Esling, J. (1999). The IPA categories "pharyngeal" and "epiglottal": Laryngoscopic observations of pharyngeal articulations and larynx height. Language and Speech, 42, 349-372.

Evers, V., Reetz, H., & Lahiri, A. (1998). Crosslinguistic acoustic categorization of sibilants independent of phonological status. Journal of Phonetics, 26, 345-370.

Fant, G. (1960). Acoustic theory of speech production. The Hague: Mouton & Co.

Forrest, K., Weismer, G., Milenkovic, P., & Dougall, R. (1988). Statistical analysis of word-initial voiceless obstruents: Preliminary data. Journal of the Acoustical Society of America, 84, 115-123.

Fowler, C. (1980). Coarticulation and theories of extrinsic timing. Journal of Phonetics, 8, 113-133.

Fowler, C. & Saltzman, E. (1993). Coordination and coarticulation in speech production. Language and Speech, 36, 171-195.

Fre Woldu, K. (1981). Facts regarding Arabic emphatic consonant production. Reports from Uppsala University Department of Linguistics, 7.

Fulop, S., Kari, E., & Ladefoged, P. (1998). An acoustic study of the tongue root contrast in Degema vowels. Phonetica, 55, 80-98.

Garnes, S. (1975). An acoustic analysis of double articulations in Ibibio. Ohio State University Working Papers in Linguistics, 20, 44-55.

Ghazeli, S. (1977). Back consonants and backing coarticulation in Arabic, Ph.D. Dissertation, University of Texas at Austin.

Giannini, A. & Pettorino, M. (1982). The emphatic consonants in Arabic: Speech laboratory report IV. Naples: Istituto Universitario Orientale.

Goldsmith, J. (1976). Autosegmental phonology, Ph.D. Dissertation, Massachusetts Institute of Technology.


Grossman, R. (1964). Sensory innervation of the oral mucosae. Journal of the Southern California State Dental Association, 32, 128-133.

GoldWave, Inc. (2002). GoldWave. Computer software.

Halle, M. (1983). On distinctive features and their articulatory implementation. Natural Language & Linguistic Theory, 1, 91-105.

Halle, M. (1995). Feature geometry and feature spreading. Linguistic Inquiry, 26, 1-46.

Halle, M., Hughes, G., & Radley, J. (1957). Acoustic properties of stop consonants. Journal of the Acoustical Society of America, 29, 107-116.

Halle, M. & Stevens, K. (1969). On the feature [Advanced Tongue Root]. Quarterly Progress Report of the MIT Research Laboratory in Electronics, 94, 209-215.

Halle, M., Vaux, B., & Wolfe, A. (2000). On feature spreading and the representation of place of articulation. Linguistic Inquiry, 31, 387-444.

Harris, K. (1958). Cues for the discrimination of American English fricatives in spoken syllables. Language and Speech, 1, 1-7.

Hayward, K. & Hayward, R. (1989). 'Guttural': arguments for a new distinctive feature. Transactions of the Philological Society, 87, 179-193.

Heath, J. (1987). Ablaut and ambiguity: phonology of a Moroccan Arabic dialect. Albany: State University of New York Press.

Heinz, J. & Stevens, K. (1961). On the properties of voiceless fricative consonants. Journal of the Acoustical Society of America, 33, 589-596.

Herzallah, R. (1990). Aspects of Palestinian Arabic phonology: a nonlinear approach, Ph.D. Dissertation, Cornell University.

Hess, S. (1992). Assimilatory effects in a vowel harmony system: An acoustic analysis of advanced tongue root in Akan. Journal of Phonetics, 20, 475-492.

Holes, C. (1994). Arabic. In R. Asher (ed.), The Encyclopedia of Language and Linguistics, 191-194. Oxford: Pergamon Press.


Honda, K., Kusakawa, N., & Kakita, Y. (1992). An EMG analysis of sequential control cycles of articulatory activity during utterances. Journal of Phonetics, 20, 53-63.

Hughes, G. & Halle, M. (1956). Spectral properties of fricative consonants. Journal of the Acoustical Society of America, 28, 303-310.

Ibn al-Jazari, Muhammad. (1986). Al-Tamhid Fi 'Ilm Al-Tajwid. (Gh. Hamad, Ed.). Beirut: Maktabat al-Ma'arif. (Original work undated).

Ibn Jinni, Abu al-Fath 'Uthman. (1954). Sirr sina'at al-i'rab. (M. As-Saqqa, M. Az-Zafzaaf, I. Mustafa, & A. Amin, Eds.). Cairo: Mustafa Lubabi Al-Halabi & Sons.

IPA. (1999). Handbook of the International Phonetic Association: A guide to the use of the International Phonetic Alphabet. Cambridge: Cambridge University Press.

Jakobson, R., Fant, G., & Halle, M. (1952). Preliminaries to speech analysis: The distinctive features and their correlates. Cambridge, MA: Acoustics Laboratory, Massachusetts Institute of Technology.

Jakobson, R. & Halle, M. (1956). Fundamentals of language. The Hague: Mouton.

Jassem, W. (1995). The acoustic parameters of Polish voiceless fricatives: An analysis of variance. Phonetica, 52, 251-258.

Jongman, A., Wayland, R., & Wong, S. (2000). Acoustic properties of English fricatives. Journal of the Acoustical Society of America, 108, 1252-1263.

Joos, M. (1948). Acoustic Phonetics. Language: Journal of the Linguistic Society of America, Language Monographs 24, 1-136.

Kardach, J., Wincowski, R., Metz, D., Schiavetti, N., Whitehead, R., & Hillenbrand, J. (2002). Preservation of place and manner cues during simultaneous communication: a spectral moments perspective. Journal of Communication Disorders, 35, 533-542.



Kaye, A. (1990). Arabic. In Comrie, B. (Ed.), The World's Major Languages, 664-685. New York: Oxford University Press.

Keating, P. (1985). CV phonology, experimental phonetics, and coarticulation. UCLA Working Papers in Phonetics, 62, 1-13.

Kenstowicz, M. (1994). Phonology in generative grammar. Cambridge, Massachusetts: Blackwell.

Kent, R. (2004). The uniqueness of speech among motor systems. Clinical Linguistics & Phonetics, 18, 495-505.

Kent, R. & Read, C. (2002). Acoustic analysis of speech. Albany, NY: Singular.

Kewley-Port, D. (1982). Measurement of formant transitions in naturally produced stop consonant-vowel syllables. Journal of the Acoustical Society of America, 72, 379-389.

Keyser, S. & Stevens, K. (1994). Feature geometry and the vocal tract. Phonology, 11, 207-236.

Kingston, J. & Diehl, R. (1994). Phonetic knowledge. Language, 70, 419-454.

Kuriyagawa, F. (1984). The Features of /k/ and /q/ in Cairo Standard Arabic. Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, 18, 65-73.

Ladefoged, P. & Maddieson, I. (1996). The sounds of the world's languages. Oxford: Blackwell.

LaRiviere, C., Winitz, H., & Herriman, E. (1975). The distribution of perceptual cues in English prevocalic fricatives. Journal of Speech and Hearing Research, 18, 613-622.

Laufer, A. & Baer, T. (1988). The emphatic and pharyngeal sounds in Hebrew and in Arabic. Language and Speech, 31, 181-205.

Laufer, A. & Condax, I. (1979). The epiglottis as an articulator. Journal of the International Phonetic Association, 9, 50-56.

Laufer, A. & Condax, I. (1981). The function of the epiglottis in speech. Language and Speech, 24, 39-62.

Leben, W. (1973). Suprasegmental phonology, Ph.D. Dissertation, Massachusetts Institute of Technology.


Lehn, W. (1963). Emphasis in Cairo Arabic. Language: Journal of the Linguistic Society of America, 39, 29-39.

Liberman, A., Delattre, P., & Cooper, F. (1954). The role of consonant-vowel transitions in the perception of the stop and nasal consonants. Psychological Monographs, 68, 1-13.

Lieberman, P. & Blumstein, S. (1988). Speech physiology, speech perception, and acoustic phonetics. Cambridge: Cambridge University Press.

Lindblom, B. & Sussman, H. (2002). Principal components analysis of tongue shapes in symmetrical VCV utterances. Fonetik 2002 (TMH-QPSR). Stockholm. 44, 1-4.

Lindblom, B., Sussman, H., Modarresi, G., & Burlingame, B. (2002). The trough effect: Implications for speech motor programming. Phonetica, 59, 245-262.

Lowenstamm, J. & Prunet, J-F. (1985). Tigre vowel harmonies. Paper presented at the 16th Annual Conference on African Linguistics. Yale University.

Lowenstamm, J. & Prunet, J-F. (1987). Vertical harmonies in Tigre. Paper presented at the 18th Annual Conference on African Linguistics. UQAM.

Maeda, S. & Honda, K. (1994). From EMG to formant patterns of vowels: The implication of vowel spaces. Phonetica, 51, 17-29.

McCarthy, J. (1979). Formal problems in Semitic phonology and morphology, Ph.D. Dissertation, Massachusetts Institute of Technology.

McCarthy, J. (1986). OCP effects: Gemination and antigemination. Linguistic Inquiry, 17(2), 207-263.

McCarthy, J. (1988). Feature geometry and dependency: A review. Phonetica, 45, 84-108.

McCarthy, J. (1991). Semitic gutturals and distinctive feature theory. In B. Comrie and M. Eid (eds.), Perspectives on Arabic Linguistics, III, 63-91. Amsterdam: Benjamins.

McCarthy, J. (1994). The phonetics and phonology of Semitic pharyngeals. In P. Keating (ed.), Phonological Structure and Phonetic Form: Papers in Laboratory Phonology III. Cambridge: Cambridge University Press.

McCawley, J. (1967). Le rôle d'un système de traits phonologiques dans une théorie du langage. Langages, 8, 112-123.

Mester, R. (1986). Studies in tier structure, Ph.D. Dissertation, University of Massachusetts at Amherst.

Microsoft Corp. (1985). Excel. Computer software.

Microsoft Corp. (1987). PowerPoint. Computer software.

Milenkovic, P. (2000). TF32. Computer program.

Namdaran, N. (forthcoming). Retraction in St'at'imcets (Lillooet Salish): An ultrasonic investigation, Masters Thesis, University of British Columbia.

Nittrouer, S. (1995). Children learn separate aspects of speech production at different rates: Evidence from spectral moments. Journal of the Acoustical Society of America, 97, 520-530.

Norlin, K. (1987). A phonetic study of emphasis and vowels in Egyptian Arabic: Lund University, the Department of Linguistics Working Papers 30.

Obrecht, D. (1961). Effects of the second formant in the perception of velarization in Lebanese Arabic, Ph.D. Dissertation, University of Pennsylvania.

Odden, D. (1986). On the role of the Obligatory Contour Principle in phonological theory. Language, 62, 353-383.

Odden, D. (1988). Anti anti-gemination and the OCP. Linguistic Inquiry, 19, 451-475.

Öhman, S. (1966). Coarticulation in VCV utterances: Spectrographic measurements. Journal of the Acoustical Society of America, 39, 151-168.



Padgett, J. (1995). Stricture in feature geometry. Stanford: CSLI Publications.

Palmer, J. (1993). Anatomy for speech and hearing. Baltimore: Williams & Wilkins.

Penfield, W. & Rasmussen, T. (1950). The cerebral cortex of man. New York: Macmillan.

Perkell, J. (1971). Physiology of speech production: A preliminary study of two suggested revisions of the features specifying vowels. MIT Research Laboratory of Electronics Quarterly Progress Report, 102, 123-139.

Perkins, W. & Kent, R. (1986). Functional anatomy of speech, language and hearing: A primer. San Diego: College-Hill Press.

Pickett, J. (1999). The acoustics of speech communication: Fundamentals, speech perception theory, and technology. Needham Heights, MA: Allyn & Bacon.

Pierrehumbert, J. (1993). Dissimilarity in the Arabic verbal roots. Proceedings of North Eastern Linguistics Society. 23, 367-381.

Purcell, E. (1979). Formant frequency patterns in Russian VCV utterances. Journal of the Acoustical Society of America, 66, 1691-1702.

Recasens, D. (1985). Coarticulatory patterns and degrees of coarticulatory resistance in Catalan CV sequences. Language and Speech, 28, 97-114.

Recasens, D., Farnetani, E., Fontdevila, J., & Pallares, M. (1993). An electropalatographic study of alveolar and palatal consonants in Catalan and Italian. Language and Speech, 36, 213-234.

Recasens, D., Fontdevila, J., & Pallares, M. (1996). Linguopalatal coarticulation and alveolar-palatal correlations for velarized and non-velarized /l/. Journal of Phonetics, 24, 165-185.

Ringel, R. (1970). Oral region two-point discrimination in normal and myopathic subjects. Second Symposium on Oral Sensation and Perception. Springfield, Ill: Charles C. Thomas. 309-321.

Rose, S. (1996). Variable laryngeals and vowel lowering. Phonology, 13, 73-117.


Saussure, F. (1966). Course in general linguistics. (W. Baskin, Trans.). New York: McGraw-Hill. (Original work published 1915).

Seikel, J., King, D., & Drumright, D. (1997). Anatomy and physiology for speech and language. San Diego: Singular Publishing Group.


Semaan, K. (1963). Arabic phonetics; Ibn Sina's Risalah on the points of articulation of the speech-sounds. Arthur Jeffery memorial monographs, no. 2. Lahore: Sh. Muhammad Ashraf.

Shahin, K. (1997). Postvelar harmony: An examination of its bases and cross linguistic variation, Ph.D. Dissertation, University of British Columbia.

Shahin, Kimary N. (1996). Accessing pharyngeal place in Palestinian Arabic.

Sibawayh. (1898). Kitab Sibawayh. Baghdad: Al-Muthanna Library. (Original work undated).

Sproat, R. & Fujimura, O. (1993). Allophonic variation in English /l/ and its implications for phonetic implementation. Journal of Phonetics, 21, 291-311.

SPSS, Inc. (1989). SPSS. Computer program.

Stevens, K. (1989). On the quantal nature of speech. Journal of Phonetics, 17, 3-45.

Stevens, K. (1993). Models for the production and acoustics of stop consonants. Speech Communication, 13, 367-375.

Stevens, K. (1999). Articulatory-acoustic-auditory relationships. In W. Hardcastle and J. Laver (eds.), The handbook of phonetic sciences. Oxford: Blackwell.

Stevens, K. (1998). Acoustic Phonetics. Cambridge, Massachusetts: The MIT Press.

Stevens, K. & Blumstein, S. (1978). Invariant cues for place of articulation in stop consonants. Journal of the Acoustical Society of America, 64, 1358-1368.

Stevens, K. & House, A. (1955). Development of a quantitative description of vowel articulation. Journal of the Acoustical Society of America, 27, 484-493.

Stevens, K. & House, A. (1956). Studies of formant transitions using a vocal tract analog. Journal of the Acoustical Society of America, 28, 578-585.

Stevens, K. & House, A. (1961). An acoustical theory of vowel production and some of its implications. Journal of Speech and Hearing Research, 4, 303-320.

Strevens, P. (1960). Spectra of fricative noise in human speech. Language and Speech, 3, 32-49.

Tabain, M. (1998). Non-sibilant fricatives in English: Spectral information above 10kHz. Phonetica: International Journal of Speech Science, 55, 107-130.

Trubetskoi, N. (1969). Principles of phonology. Berkeley: University of California Press.

Vaux, B. (1993). Is ATR a laryngeal feature? Ms. Harvard University.

Wehr, H. & Cowan, J. (1979). A dictionary of modern written Arabic: (Arabic-English). Wiesbaden: Harrassowitz.

Yip, M. (1989). Feature geometry and cooccurrence restrictions. Phonology, 6, 349-374.

Younes, M. (1982). Problems in the segmental phonology of Palestinian Arabic, Ph.D. Dissertation, University of Texas at Austin.

Younes, M. (1993). Emphasis spread in two Arabic dialects. In M. Eid and C. Holes (eds.), Perspectives on Arabic Linguistics, V, 119-145. Amsterdam: Benjamins.

Zawaydeh, B. (1997). An acoustic analysis of uvularization spread in Ammani-Jordanian Arabic. Studies in the Linguistic Sciences, 27, 185-200.

Zawaydeh, B. (1999). The phonetics and phonology of gutturals in Arabic, Ph.D. Dissertation, Indiana University.

Zemlin, W. (1968). Speech and hearing science; anatomy and physiology. Englewood Cliffs, N.J.: Prentice-Hall.



APPENDIX A

Stimuli - Experiments One and Three

Carrier phrase:

"?alkalimtu hiya _"

"The word is _"

IPA          Gloss
(Arabic script column illegible in source)

kitaab       'book'
9iqah        'trust'
?ixaa?       'fraternity'
?iBaa9ah     'aid/rescue'
zihaaf       'picking up speed'
[illegible]  'mountain passes'
[illegible]  'frame'
Jidaad       'harsh ones'
[illegible]  'lighting'
kisaf        'pieces'
[illegible]  'quorum'
[illegible]  'bones'
sikak        'streets'
kutib        'was written'
kutub        'books'
kutal        'masses'


9uqif

suquf

buqa<i

?ux;io

dux;uul

dux;aan

lmmub

eusaa?

bubie

suhub

tuhaf

bu'li9

fu'lab

wu'luud

budi?

sudus

cJ.3udad

kusib

?usus

[Arabic script column illegible in source]

'became understood'

'ceilings'

'spots'

'was taken'

'entrance'

'smoke'

'fatigue'

'bleating'

'was researched'

'clouds'

'works of art'

'was sent'

'branches'

'promises (n.)'

'was crossed out'

'frames'

'fresh dates'

'was started'

'one sixth'

'new ones'

'was digested'

'heated grilling stones'

'curiosity'

'was won'

'bases'

'was phlebotomized'

'statues'


ku6ib

nu6ur

nukat

sukuut

fatak

batuul

waqib

eaquf

faqad

saxun

?axa6

?axiir

faHif

faHab

faHUuf

fahub

d3ahad

taflib

fafluul

'was lied'

' (warning) signs'

'shreds'

'shady'

'systems'

'shaded'

'was spilled'

'jokes'

'silence'

'exterminate'

'virgin'

'rude'

'understood'

'lost'

'became hot'

'took'

'last'

'in love (with)'

'riot'

'deeply in love (with)'

'became pale'

'denied'

'became tired'

'sent'

'does a lot'

'became slower'

'crossed out'



nadim

?adab

waduud

nad\:id:3

fad\:ul

nad\:ab

?as if

kasul

kasab

fas\:ad

ka6ib

ka6ab

ka6uub

Ja6\:if

na6\:uf

Ja6\:af

fakih

sakab

?akuul

mabiituh

mabiitii

fariiquh

fariiqii

jastasii:rruh

[Arabic script column illegible in source]

'regretted'

'literature'

'affectionate'

'ripened'

'became virtuous'

'diminished'

'felt sorry'

'became lazy'

'won'

'phlebotomized'

'able to see'

'lying'

'lied'

'frequent liar'

'hard'

'became clean'

'hardship'

'humorous'

'spilled'

'one who eats a lot'

'his spending of the night'

'my spending of the night'

'his team'

'my team'

'his cooking'

'my cooking'

'(he) finds (it) enjoyable'


tastasii:sii

juziihuh

tuziihii

takiiduh

takiidii

d;3aliisuh

d;3aliisii

diikuh

diikii

buu:sit

kuusah

faatik

baahi8



'(you- f. sg.) find enjoyable'

'(he) removes it'

'(you- f. sg.) remove it'

'he sells it'

'(you- f. sg.) sell it'

'(he) unveils'

'(you- f. sg.) unveil'

'deceive (him)'

'(you- f. sg.) deceive'

'his patient'

'my patient'

'his companion'

'my companion'

'his shirt'

'my shirt'

'his wine'

'my wine'

'harming'

'his keeper'

'my keeper'

'his rooster'

'my rooster'

'was taken by surprise'

'zucchini'

'one inch'

'exterminator'

'researcher'


faat\:i?

faas\:uuljaa

'beach'

'kidney beans'



APPENDIX B

Stimuli - Experiment Two

1. VC# context stimuli.

Carrier phrase:

"?alkalimtu hiya _"

"The word is _."

IPA Arabic Script Gloss

jabiit 'spend the night'

siiq 'was led'

jafiix 'grow old'

jaziiH 'go astray'

jasiih 'tour'

jabii\' 'sell'

jatiih 'get lost'

sii1 'was annoyed'

jumiit'i '(he) unveils'

biid 'deserts'

mariid'i 'patient'

kiis 'bag'

qamiis'i 'shirt'

'seek refuge for someone'

'keeper'



abiik 'your father (object of prep.)'

taabuut 'coffin'

suuq 'market'

jaduux 'faint'

jaruuH 'evade'

jabuuh 'disclose'

kuu'i 'elbow'

fuuh 'his mouth'

suu1 'evil'

jasuut'i 'whip'

suud 'blacks'

furuud'i 'obligations'

muus 'razor'

buus'i 'reed'

'seek refuge'

'well kept'

abuuk 'your father (subjective)'

baat 'spent the night'

saaq 'leg'

daax 'fainted'

zaaH 'went astray'

saah 'toured'

oaa'i 'spread'

taah 'got lost'

saa1 'became worse'

sijaat'i 'whips'

saad 'prevailed'



rijaadr 'gardens'

daas 'stepped on'

baasr 'bus'

'drizzle'

'well keeping'

abaak 'your father (object of verb)'

2. #CV context stimuli.

Carrier phrase:

"_ hiya 'alkalimah"

"_ is the word."

IPA Arabic Script Gloss

tiin 'figs'

tuut 'berries'

taab 'repented'

qiis 'was measured'

quut 'nourishment'

qaas 'measured'

x;iirah 'elite'

x;uuoah 'helmet'

x;aab 'failed'

B'iid 'delicate women'

B'UUl 'ogre'



Kaab 'failed to appear'

hiik 'was knit'

b.uut 'whale'

haak 'knitted'

£iid 'holiday'

£uud 'twig'

£aad 'returned'

hiib 'was feared'

huud 'Hud' (name of a prophet)

haab 'feared'

1iioaa? 'harming (n.)'

1uutii 'was given'

1aat 'coming'

'perfume'

'bricks'

'became pleasant'

diik 'rooster'

duud 'worms'

daa? 'disease'

'scarcity'

'narrowness'

'became lit'

sii? 'was annoyed'

suu? 'evil'

saa? 'became evil'

'reputation'

'wool'



s'iaad 'hunted'

kiis 'bag'

kuub 'cup'

kaad 'nearly (did something)'



APPENDIX C

Additional Stimuli - Experiment Three

The following words are added to the stimuli in Appendix A to form the set of stimuli for Experiment Three.

Carrier phrase:

"?alkalimtu hiya _"

"The word is _."

IPA Arabic Script Gloss

d.3ihaad 'struggle'

mi1aat 'hundreds'

Juhid 'was attended'

Juhub 'shooting stars'

suhaad 'insomnia'

su1il 'was asked'

Ju1uun 'matters'

su1aal 'question'

Jahid 'attended'

oahab 'gold'

mahuul 'frightful'

sa1im 'got tired or



da?ab 'persevered'

da?uub 'persevering (n.)'

fabiihuh 'his look-alike'

fabiihii 'my look-alike'

jusii?uh 'does (it) badly'

tusii?ii '(you- f. sg.) do badly'