Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
PHONETIC AND PHONOLOGICAL ASPECTS
OF ARABIC EMPHATICS AND GUTTURALS
by
Musaed S. Bin-Muqbil
A dissertation submitted in partial fulfillment of
the requirements for the degree of
Doctor of Philosophy
(Linguistics)
at the
UNIVERSITY OF WISCONSIN-MADISON
2006
UMI Number: 3222872
Copyright 2006 by Bin-Muqbil, Musaed S.
All rights reserved.
UMI Microform 3222872
Copyright 2006 by ProQuest Information and Learning Company.
All rights reserved. This microform edition is protected against
unauthorized copying under Title 17, United States Code.
ProQuest Information and Learning Company 300 North Zeeb Road
P.O. Box 1346 Ann Arbor, MI 48106-1346
© Copyright by Musaed S. Bin-Muqbil 2006 All Rights Reserved
A dissertation entitled
Phonetic and Phonological Aspects of Arabic Emphatics and Gutturals
submitted to the Graduate School of the University of Wisconsin-Madison
in partial fulfillment of the requirements for the degree of Doctor of Philosophy
by
Musaed S. Bin-Muqbil
Date of Final Oral Examination: April 5, 2006
Month & Year Degree to be awarded: December May 2006 August
**************************************************************************************************
Approval Signatures of Dissertation Committee
Signature, Dean of Graduate School
To my family.
ABSTRACT
Existing formal representations of Arabic emphatic and guttural sounds are ill-motivated articulatorily and suffer from descriptive and analytic inadequacies. This dissertation aims to clarify our understanding of the articulatory attributes of these sounds as reflected in their acoustic characteristics. The present experimental finding that the secondary articulation of emphatics is distinct from the primary articulation of gutturals requires a grounded representational distinction.
Three acoustic experiments, using Modern Standard Arabic speech samples from five male subjects, tested the acoustic characteristics of emphatics and gutturals. The first experiment, comparing spectral qualities of consonants, found no reliable differences between spectral shapes of emphatics and non-emphatics. Acoustic attributes of uvular continuants argue for a fricative, not approximant, articulation. The second experiment examined the coarticulatory impact of consonants on formant frequencies of adjacent vowels. Results indicate pharyngeals are more strongly associated with high F1 transitions than emphatics and uvulars, which are associated with low F2 transitions. The F2 effect was stronger in emphatics than in uvulars. Emphatics and uvulars are thus understood to be articulated with a retracted tongue dorsum while pharyngeals are articulated with a retracted tongue root. Dorsal retractions in emphatics and uvulars are argued to be qualitatively different. The third experiment investigated vowel-to-vowel coarticulation across intervening consonants. Results show emphatics blocking or weakening coarticulation. Coarticulatory effects of the three uvulars depend on their degree of constriction: a stronger constriction corresponds to a stronger resistance to vowel-to-vowel coarticulation. Remaining sounds allow vowel-to-vowel coarticulation. These results are attributed to articulatory differentiation: emphatics employ the styloglossus and hyoglossus for their dorsal articulation; uvulars primarily use the palatoglossus and secondarily the styloglossus.
Taken together, experimental results lead to important implications for phonetic grounding of Arabic emphatics and gutturals: emphatics and uvulars share a secondary dorsal component; uvulars, pharyngeals, and laryngeals share a primary radical component. The pharynx, then, is best viewed within phonology as a single active articulator grouping guttural subclasses into one natural class. Formal representations based on these views are more capable of handling the patterning of these sounds in Arabic phonology. Implications for phonological analyses of Tigre and St'at'imcets (Lillooet Salish) are discussed.
ACKNOWLEDGEMENTS
Anyone who has undertaken a doctoral dissertation would, more likely than not, remember all sorts of challenges, anxieties, and sleepless nights. They would also remember faces, names, and exchanges that soothed those pains away. I should know. There were occasions when certain difficulties I faced bordered on being insurmountable. Luckily, I was surrounded by people who were eager to lend a capable helping hand. At the forefront is my academic advisor, Dr. Thomas Purnell. It is quite difficult to extend enough gratitude to a man who, for several years, patiently guided my steps through this winding road till I reached my goal. I have worked with Dr. Purnell for years and never once do I recall him being less than gracious and supportive. Above all, he instilled in me a great deal of confidence without which I doubt this work would have ever seen the light. Thanks, Dr. Purnell.
In all honesty, I have been blessed with a committee of highly regarded professors who combine mastery of their respective fields of study with welcoming attitudes. I am greatly indebted to Dr. Raymond Kent for his direction and insights in the field of experimental phonetics. I can never forget how cheerful and respectful that man is. I have never left a meeting with him without feeling much more informed than before. I'm very proud to say that I have learned from one of the few masters in the field. I thank Dr. Paul Milenkovic for all the help I received from him with regard to the experimental methods used in this dissertation. He dedicated a great deal of his valuable time to the development of a capable computer algorithm to calculate Multi-Band Spectra specifically for my dissertation. I am very grateful and I am sure many future researchers will be, too. I also send my gratitude to Dr. Joseph Salmons for all the generous help I have received from him. As he would typically do, Dr. Salmons provided me with enriching feedback that elevated the quality of my work. I thank him for it. I would also like to thank Dr. Rand Valentine for all the advice and support I have received from him. His cheerful attitude and consummate workmanship are examples that I hope I will be able to follow. No matter what academic degree is conferred upon me now or in the future, I will always consider myself a student of those gentlemen.
I am grateful to King Saud University for their generosity in granting me the opportunity to pursue my higher studies. I am particularly indebted to the faculty of the English Language Department for placing their faith in me and providing me with the chance to realize my dreams.
There is no possible way that I could show my gratitude to my family. I am sure my late father would have been very proud of me. I ask the Almighty Allah to bestow His mercy on him. My dear mother had to endure my years-long absence in silence. Her suffering and her prayers dwarf any thanks I can direct to her. I ask Allah to enable me to honor her the way she should be honored. My gratitude to my brothers and sisters knows no bounds. I also thank my dear relatives and my wonderful in-laws. Their prayers and well wishes will never be forgotten. In their absence, my wonderful son Faaris has been the source of my cheer and happiness. His laughter and playfulness never failed to energize me whenever I felt down. Many times he would wipe the worries of the outside world off my mind with a simple 'hala baba!' ('Welcome, papa!') as I walked into the house. My ever-flowing love and gratitude go to my dear wife Abeer who has been with me through thick and thin. She endured more than five years away from her family just to share it all with me. Words can never do her justice. Thank you, Abeer. You are truly a blessing.
My first, last, and continuous thanks go to the Almighty Allah who blessed me with everything that I have and everything that I am. I pray to Him to enable me to use whatever I have learned for the good of mankind.
TABLE OF CONTENTS
ABSTRACT ..... ii
ACKNOWLEDGEMENTS ..... iv
TABLE OF CONTENTS ..... vii
LIST OF FIGURES ..... x
LIST OF TABLES ..... xiv
CHAPTER 1 Introduction ..... 1
    1.1 Aims ..... 1
    1.2 Rationale ..... 4
        1.2.1 Experimental phonetics and phonological representations ..... 4
        1.2.2 Acoustic-articulatory relations ..... 8
    1.3 Modern Standard Arabic (MSA) ..... 14
    1.4 Overview of the dissertation ..... 17
CHAPTER 2 Background and Literature Review ..... 22
    2.1 Basic Vocal Tract Anatomy ..... 23
        2.1.1 The Tongue ..... 23
        2.1.2 The Pharynx ..... 26
        2.1.3 The Soft Palate ..... 27
        2.1.4 The Larynx ..... 29
    2.2 Phonetic Properties of Arabic Emphatics and Gutturals ..... 31
        2.2.1 Emphatics ..... 31
        2.2.2 Uvulars ..... 40
        2.2.3 Pharyngeals ..... 45
        2.2.4 Laryngeals ..... 50
    2.3 Gutturals as a Natural Class ..... 54
        2.3.1 Morpheme Structure Constraints ..... 55
        2.3.2 Guttural Lowering ..... 60
    2.4 Representations of Emphatics and Gutturals ..... 62
        2.4.1 McCarthy (1994) ..... 65
        2.4.2 Rose (1996) ..... 68
        2.4.3 Zawaydeh (1999) ..... 70
    2.5 Representational Problems ..... 73
CHAPTER 3 Experiment One: The Spectral Shapes of Consonants ..... 79
    3.1 Overview ..... 79
    3.2 Methods ..... 86
        3.2.1 Subjects ..... 86
        3.2.2 Stimuli ..... 87
        3.2.3 Procedures ..... 89
        3.2.4 Acoustic Analysis ..... 90
            3.2.4.1 Spectral Moments ..... 90
            3.2.4.2 Multi-Band Spectra (MBS) ..... 93
        3.2.5 Reliability ..... 93
    3.3 Results ..... 94
        3.3.1 Spectral Moments ..... 94
            3.3.1.1 Voiceless Continuants - Pooled Data ..... 94
            3.3.1.2 Voiceless Continuants - Individual Subjects ..... 99
            3.3.1.3 Voiceless Continuants - Specific Vowel Contexts ..... 101
            3.3.1.4 Voiceless Continuants - Discriminant Analysis ..... 103
            3.3.1.5 Voiced Continuants - Pooled Data ..... 104
            3.3.1.6 Voiced Continuants - Individual Subjects ..... 109
            3.3.1.7 Voiced Continuants - Specific Vowel Contexts ..... 111
            3.3.1.8 Voiced Continuants - Discriminant Analysis ..... 112
            3.3.1.9 Voiceless Stops - Pooled Data ..... 113
            3.3.1.10 Voiceless Stops - Individual Subjects ..... 117
            3.3.1.11 Voiceless Stops - Specific Vowel Contexts ..... 120
            3.3.1.12 Voiceless Stops - Discriminant Analysis ..... 121
            3.3.1.13 Voiced Stops - Pooled Data ..... 122
            3.3.1.14 Voiced Stops - Individual Subjects ..... 125
            3.3.1.15 Voiced Stops - Specific Vowel Contexts ..... 127
            3.3.1.16 Voiced Stops - Discriminant Analysis ..... 127
        3.3.2 Multi-Band Spectra ..... 128
            3.3.2.1 Voiceless Continuants ..... 128
            3.3.2.2 Voiceless Continuants - Discriminant Analysis ..... 131
            3.3.2.3 Voiceless Stops ..... 132
            3.3.2.4 Voiceless Stops - Discriminant Analysis ..... 134
    3.4 Discussion and Conclusions ..... 135
    3.5 Summary ..... 143
CHAPTER 4 Experiment Two: Anticipatory and Carryover Consonant-Vowel Coarticulation ..... 145
    4.1 Overview ..... 145
    4.2 Methods ..... 149
        4.2.1 Subjects ..... 149
        4.2.2 Stimuli ..... 149
        4.2.3 Procedures ..... 151
        4.2.4 Acoustic Analysis ..... 151
        4.2.5 Reliability ..... 152
    4.3 Results ..... 153
        4.3.1 Anticipatory (VC) Coarticulation ..... 159
            4.3.1.1 Anticipatory Coarticulation in F1 ..... 159
            4.3.1.2 Anticipatory Coarticulation in F2 ..... 164
            4.3.1.3 Anticipatory Coarticulation - Discriminant Analysis ..... 169
        4.3.2 Carryover (CV) Coarticulation ..... 173
            4.3.2.1 Carryover Coarticulation in F1 ..... 176
            4.3.2.2 Carryover Coarticulation in F2 ..... 181
            4.3.2.3 Carryover Coarticulation - Discriminant Analysis ..... 186
    4.4 Discussion and Conclusions ..... 189
    4.5 Summary ..... 204
CHAPTER 5 Experiment Three: Vowel-to-Vowel Coarticulation ..... 206
    5.1 Overview ..... 206
    5.2 Methods ..... 210
        5.2.1 Subjects ..... 210
        5.2.2 Stimuli ..... 210
        5.2.3 Procedures ..... 210
        5.2.4 Acoustic Analysis ..... 210
        5.2.5 Reliability ..... 211
    5.3 Results ..... 212
        5.3.1 Anticipatory Vowel-to-Vowel Coarticulation ..... 212
        5.3.2 Carryover Vowel-to-Vowel Coarticulation ..... 217
    5.4 Discussion and Conclusions ..... 223
    5.5 Summary ..... 230
CHAPTER 6 Implications and Alternatives ..... 233
    6.1 Emphatic and Guttural Articulations ..... 234
    6.2 Alternative Basis for the Guttural Natural Class ..... 243
    6.3 Formal Representations ..... 248
        6.3.1 Arabic Morpheme Structure Constraints Revisited ..... 255
        6.3.2 Guttural Lowering Revisited ..... 262
    6.4 A Note on Ethio-Semitic and Interior Salish ..... 264
    6.5 Summary ..... 268
CHAPTER 7 Conclusion and Future Directions ..... 271
REFERENCES ..... 278
APPENDIX A ..... 290
APPENDIX B ..... 296
APPENDIX C ..... 301
LIST OF FIGURES
Figure 1.1. Points of minimum velocity (nodes) and maximum velocity (antinodes) for the first two formant frequencies of vowels ..... 11
Figure 1.2. Illustration of how the articulation of the three Arabic vowels [i, u, a] is related to their acoustic shapes in the light of the source-filter theory ..... 13
Figure 2.1. The extrinsic muscles of the tongue along with some other vocal tract organs ..... 25
Figure 2.2. The pharyngeal constrictors and related structures ..... 25
Figure 2.3. Muscles of the soft palate along with related structures ..... 28
Figure 2.4. Structure of the larynx ..... 28
Figure 2.5. A schematic illustration of the vocal tract configuration during the articulation of an Arabic emphatic coronal and its non-emphatic counterpart ..... 32
Figure 2.6. Schematic illustrations of the vocal tract configurations during the articulation of Arabic uvulars ..... 41
Figure 2.7. A schematic illustration of the vocal tract configuration during the articulation of an Arabic pharyngeal consonant ..... 46
Figure 3.1. A multi-band spectrum (stepped line) and an FFT spectrum for the Arabic voiceless fricative [s] in the sequence [asa], both generated from a 40-ms full Hamming window placed at the middle of the frication noise ..... 85
Figure 3.2. Locations of the sampling windows at which the spectral moments for fricatives (above) and stops (below) were calculated ..... 92
Figure 3.3. Spectral moments values for voiceless continuants at the five sampling window locations ..... 97
Figure 3.4. Box plots of the distributions of the spectral moments scores for the four voiceless continuants [s, sˤ, χ, ħ] ..... 98
Figure 3.5. Box plots showing the distributions of the four voiceless continuants spectral moments scores for each of the five individual subjects ..... 100
Figure 3.6. Spectral moments values for voiced continuants at the five sampling window locations ..... 106
Figure 3.7. Box plots of the distributions of the spectral moments scores for the four voiced continuants [ð, ðˤ, ʁ, ʕ] ..... 108
Figure 3.8. Box plots showing the distributions of the four voiced continuants spectral moments scores for each of the five individual subjects ..... 110
Figure 3.9. Spectral moments values for voiceless stops at the two sampling window locations ..... 115
Figure 3.10. Box plots of the distributions of the spectral moments scores for the four voiceless stops [t, tˤ, k, q] ..... 117
Figure 3.11. Box plots showing the distributions of the four voiceless stops spectral moments scores for each of the five individual subjects ..... 118
Figure 3.12. Spectral moments values for voiced stops at the two sampling window locations ..... 124
Figure 3.13. Box plots of the distributions of the spectral moments scores for the two voiced stops [d, dˤ] ..... 125
Figure 3.14. Box plots showing the distributions of the two voiced stops spectral moments scores for each of the five individual subjects ..... 126
Figure 3.15. Four histograms replicating the multi-band spectra of the four voiceless continuants ..... 130
Figure 3.16. Four histograms replicating the multi-band spectra of the four voiceless stops ..... 133
Figure 4.1. Cursor locations at vowel steady states in the CV (b) and VC (c) contexts as well as at the vowel transition edges in the two contexts (a and d, respectively) ..... 152
Figure 4.2. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] preceding the four Arabic plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ] ..... 157
Figure 4.3. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] preceding the velar [k], the three uvulars [q, χ, ʁ], the two pharyngeals [ħ, ʕ] and the two laryngeals [h, ʔ] ..... 158
Figure 4.4. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] following the three Arabic plain coronals [t, d, s] and their emphatic counterparts [tˤ, dˤ, sˤ] ..... 174
Figure 4.5. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] preceding the velar [k], the three uvulars [q, χ, ʁ], the two pharyngeals [ħ, ʕ] and the two laryngeals [h, ʔ] ..... 175
Figure 4.6. Mean F2 transitions next to the non-emphatic coronals and their emphatic counterparts ..... 194
Figure 4.7. Stylized second formant tracks of the three Arabic vowels [i, a, u] preceding the four Arabic plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ] ..... 196
Figure 4.8. Stylized second formant tracks of the three Arabic vowels [i, a, u] preceding the Arabic velar [k] as well as the seven gutturals [q, χ, ʁ, ħ, ʕ, h, ʔ] ..... 197
Figure 4.9. Stylized second formant tracks of the three Arabic vowels [i, a, u] following the three Arabic plain coronals [t, d, s] and their emphatic counterparts [tˤ, dˤ, sˤ] ..... 198
Figure 4.10. Stylized second formant tracks of the three Arabic vowels [i, a, u] following the Arabic velar [k] as well as the seven gutturals [q, χ, ʁ, ħ, ʕ, h, ʔ] ..... 199
Figure 5.1. Anticipatory V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across the four plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ] ..... 214
Figure 5.2. Anticipatory V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across the velar [k], the three uvulars [χ, ʁ, q], the two pharyngeals [ħ, ʕ], and the two laryngeals [h, ʔ] ..... 215
Figure 5.3. Carryover V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across the four plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ] ..... 219
Figure 5.4. Carryover V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across [k], the three uvulars [χ, ʁ, q], the two pharyngeals [ħ, ʕ], and the two laryngeals [h, ʔ] ..... 220
Figure 5.5. Sizes of anticipatory and carryover vowel-to-vowel coarticulatory effects across the sixteen Arabic consonants under investigation ..... 222
Figure 6.1. X-ray tracings of palatalized and velarized laterals and liquids in Russian (Bolla 1981, plates 76-79) ..... 239
LIST OF TABLES
Table 2.1. A summary of the phonetic attributes of Arabic emphatic, uvular, pharyngeal, and laryngeal sounds ..... 53
Table 2.2. Frequencies of consonant co-occurrences in Arabic roots (from McCarthy 1994:204) ..... 57
Table 3.1. Mean values of spectral moments for voiceless continuants averaged across speakers, window locations, and vowel contexts ..... 95
Table 3.2. Mean values of spectral moments for voiceless continuants averaged from windows 2 and 3 and across speakers and vowel contexts ..... 98
Table 3.3. Results of the discriminant analysis for the voiceless continuants based on the four spectral moments' values combined together as predictors ..... 104
Table 3.4. Mean values of spectral moments for voiced continuants averaged across speakers, window locations, and vowel contexts ..... 105
Table 3.5. Mean values of spectral moments for voiced continuants averaged from windows 3 and 4 and across speakers and vowel contexts ..... 107
Table 3.6. Results of the discriminant analysis for the voiced continuants based on the four spectral moments' values combined together as predictors ..... 112
Table 3.7. Mean values of spectral moments for voiceless stops averaged across speakers, window locations, and vowel contexts ..... 113
Table 3.8. Mean values of spectral moments for voiceless stops calculated at window 1 and averaged across speakers and vowel contexts ..... 116
Table 3.9. Results of the discriminant analysis for the voiceless stops based on the four spectral moments' values combined together as predictors ..... 122
Table 3.10. Mean values of spectral moments for voiced stops averaged across speakers, window locations, and vowel contexts ..... 123
Table 3.11. Mean values of spectral moments for voiced stops calculated at window 1 and averaged across speakers and vowel contexts ..... 125
Table 3.12. Results of the discriminant analysis for the voiced stops based on the four spectral moments' values combined together as predictors ..... 128
Table 3.13. Mean relative intensity values at the 11 frequency bands for the four voiceless continuants averaged from the two sampling windows across speakers and vowel contexts
Table 3.14. Mean normalized relative intensity values at the 11 frequency bands for the four voiceless continuants averaged from the two sampling windows across speakers and vowel contexts
Table 3.15. Results of the discriminant analysis for the voiceless continuants based on the normalized intensity values at each of the 11 frequency bands, averaged from the two sampling window locations, combined together as predictors
Table 3.16. Mean relative intensity values at the 11 frequency bands for the four voiceless stops averaged across speakers and vowel contexts
Table 3.17. Mean normalized relative intensity values at the 11 frequency bands for the four voiceless stops averaged across speakers and vowel contexts
Table 3.18. Results of the discriminant analysis for the voiceless stops based on the normalized intensity values at each of the 11 frequency bands combined together as predictors
Table 3.19. Results of the discriminant analyses for the plain/emphatic consonant pairs based on the spectral moments values as predictors
Table 3.20. Results of the discriminant analyses for the plain/emphatic voiceless consonant pairs based on the normalized relative intensity values of the multi-band spectra as predictors
Table 4.1. Average formant frequency values for the vowel [i] obtained at mid-vowel and transition edge locations in both VC and CV contexts containing all 16 consonants
Table 4.2. Average formant frequency values for the vowel [a] obtained at mid-vowel and transition edge locations in both VC and CV contexts containing all 16 consonants
Table 4.3. Average formant frequency values for the vowel [u] obtained at mid-vowel and transition edge locations in both VC and CV contexts containing all 16 consonants
Table 4.4. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1vowel values of the vowel [i] in the context [iC]
Table 4.5. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1offset values of the vowel [i] in the context [iC]
Table 4.6. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1vowel values of the vowel [a] in the context [aC]
Table 4.7. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1offset values of the vowel [a] in the context [aC]
Table 4.8. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1vowel values of the vowel [u] in the context [uC]
Table 4.9. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1offset values of the vowel [u] in the context [uC]
Table 4.10. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2vowel values of the vowel [i] in the context [iC]
Table 4.11. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2offset values of the vowel [i] in the context [iC]
Table 4.12. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2vowel values of the vowel [a] in the context [aC]
Table 4.13. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2onset values of the vowel [a] in the context [aC]
Table 4.14. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2vowel values of the vowel [u] in the context [uC]
Table 4.15. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2offset values of the vowel [u] in the context [uC]
Table 4.16. Discriminant analysis results for the four classes of Arabic sounds, emphatics, plain coronals, pharyngeals, and uvulars based on the values of F1 transitions in VC contexts
Table 4.17. Discriminant analysis results for the four classes of Arabic sounds, emphatics, plain coronals, pharyngeals, and uvulars based on the values of F2 transitions in VC contexts
Table 4.18. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1onset values of the vowel [i] in the context [Ci]
Table 4.19. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1vowel values of the vowel [i] in the context [Ci]
Table 4.20. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1onset values of the vowel [a] in the context [Ca]
Table 4.21. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1vowel values of the vowel [a] in the context [Ca]
Table 4.22. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1onset values of the vowel [u] in the context [Cu]
Table 4.23. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1vowel values of the vowel [u] in the context [Cu]
Table 4.24. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2onset values of the vowel [i] in the context [Ci]
Table 4.25. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2vowel values of the vowel [i] in the context [Ci]
Table 4.26. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2onset values of the vowel [a] in the context [Ca]
Table 4.27. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2vowel values of the vowel [a] in the context [Ca]
Table 4.28. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2onset values of the vowel [u] in the context [Cu]
Table 4.29. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2vowel values of the vowel [u] in the context [Cu]
Table 4.30. Discriminant analysis results for the four classes of Arabic sounds, emphatics, plain coronals, pharyngeals, and uvulars based on the values of F1 transitions in CV contexts
Table 4.31. Discriminant analysis results for the four classes of Arabic sounds, emphatics, plain coronals, pharyngeals, and uvulars based on the values of F2 transitions in CV contexts
Table 5.1. Second formant frequency means (and standard deviations) in Hz of the V1 transitions preceding the four MSA emphatics and their non-emphatic counterparts
Table 5.2. Second formant frequency means (and standard deviations) in Hz of the V1 transitions preceding the MSA gutturals and the velar stop [k]
Table 5.3. Second formant frequency means (and standard deviations) in Hz of the V2 transitions following the MSA emphatics and their non-emphatic counterparts
Table 5.4. Second formant frequency means (and standard deviations) in Hz of the V2 transitions following the MSA gutturals and the velar stop [k]
CHAPTER 1
Introduction
1.1 Aims
Much of the phonetic and phonological research on Arabic discusses the sound
classes of emphatics and gutturals. Arabic emphatics ([tˤ, dˤ, ðˤ, sˤ]) are a set of complex
phonemes that are produced with a primary coronal articulation and a secondary
articulation involving the retraction of the tongue body into the oropharynx. This secondary
articulation is what distinguishes the four emphatics from their non-emphatic counterparts
([t, d, ð, s]). Arabic gutturals are a class of consonants produced primarily in the
larynx/pharynx region. Arabic has seven guttural consonants. The two laryngeals, [h] and
[ʔ], are produced at the larynx with a fully open or fully constricted glottis, respectively.
The two pharyngeals, [ħ, ʕ], are produced by a retraction of the tongue root, the anterior
wall of the pharynx, and the epiglottis towards the posterior wall of the pharynx. The
uvulars, [χ, ʁ, q], are produced with a retracted and raised tongue body accompanied, in
the case of [ʁ], by a lowered soft palate forming a constriction in the uppermost
oropharynx, or, in the cases of [χ] and [q], by a raised and flattened soft palate. While gutturals
are clearly produced at different points of articulation, significant phonological evidence
has been presented which suggests that these three subsets are members of a single
phonological natural class in terms of place of articulation.
The term 'emphatics' is one of several terms that have been used to refer to the set
of complex coronals in Arabic. According to Lehn (1963), these sounds have also been
termed pharyngealized, velarized, uvularized, retracted, strongly articulated, and heavy.¹
While some of these terms (including 'emphatics') are rather impressionistic, other terms
reflect disagreements among linguists with regard to the articulatory nature of the
secondary articulation involved in the production of these sounds. The most prominent
proposal is that these sounds are pharyngealized. This is basically a place of articulation term
that reflects the fact that the pharynx is generally narrowed during the articulation of
these sounds. It is possible that the prevalence of this designation emanates from the at-
tractive notion of equating the secondary articulation of emphatics to the primary articu-
lation of pharyngeals since both sets of sounds exist in the same language and in both the
pharynx is constricted. Hence, as detailed below, several phonologists propose formal
representations that involve some sort of a pharyngeal component that is present as a
primary place/articulator feature in gutturals and as a secondary feature in emphatics. As
a result, these proposals face some formidable phonetic and phonological challenges
ranging from phonetic-phonological disparity to theoretical descriptive and analytical in-
adequacies.
¹ Lehn also lists another somewhat curious term, u-resonance.
The main goal of this study is to highlight the inadequacies of the existing formal
proposals for representing Arabic emphatics and gutturals and to propose alternative
representations that overcome those weaknesses. Although the central pursuits here revolve
around the nature of emphatic articulation, the linguistic nature of these sounds cannot be
fully understood without including gutturals in the same investigation. While the present
study concludes that the secondary articulation of emphatics and the primary
articulation(s) of gutturals have to be fundamentally different, there are some compelling reasons
to cover gutturals extensively in the same study. First, in order to refute the notion that
emphatics and gutturals employ a similar articulator, we have to compare and contrast the
two classes based on similar parameters. Second, the guttural subclass of uvulars holds
some interesting phonetic affinities to emphatics. We have already mentioned that some
phonologists term emphatics as 'uvularized' in reference to a uvular articulation accom-
panying the main coronal articulation in emphatics. Third, the articulatory similarities
and differences between emphatics and gutturals may not be the same in all languages.
Brief, but important, considerations of other languages are stated in various locations of
this dissertation that highlight this particular issue. Fourth, the existing grouping of the three
subclasses of gutturals (uvulars, pharyngeals, and laryngeals) into one natural class is in
need of clarification, specifically with regard to the unifying foundation of this sound
class.
The present study proposes articulator-based alternatives to the formal representa-
tions of emphatics and gutturals on the basis of acoustic data. The idea is to relate the sa-
lient acoustic correlates of these sounds to their possible articulatory traits. It is possible,
then, to accept or refute the different claims regarding those traits on the basis of
acoustic/articulatory compatibility. The acoustic data reported here suggest that the secondary
articulation in Arabic emphatics is fundamentally different from the articulations of all
three guttural subclasses.² The secondary articulation in emphatics is argued to be
executed through the retraction of the tongue dorsum with no active involvement of any
pharyngeal component. By comparison, all three guttural subclasses are produced with active
participation of the pharynx, which is understood to be a linguistic reference to the area
extending from the anterior faucial pillars to the larynx, inclusive. Accordingly,
emphatics are represented with a primary coronal articulation and a secondary dorsal articulation.
Uvulars are represented with a primary radical articulation and a secondary dorsal
articulation. Pharyngeals and laryngeals are represented as purely radical sounds. These
representations are shown to be more adequate at describing and explaining the most
prominent phonological phenomena associated with these sounds. Furthermore, it is argued that
the pharyngeal region can be considered an active articulator, not merely a place of
articulation, for the class of guttural sounds. Unlike oral articulators, however, the
pharyngeal articulator is defined at an abstract neuromotoric level.
² To foreshadow, however, the data suggest that there are some common traits between the secondary articulation in Arabic emphatics and the articulation of uvulars.
The following section of this chapter explains the basic rationale behind this
study. Section 1.3 acquaints the reader with Modern Standard Arabic (MSA), from which
the experimental data are collected. Section 1.4 gives an overview of the dissertation.
1.2 Rationale
1.2.1 Experimental phonetics and phonological representations
The position taken in this dissertation is that experimental phonetic methods play
an important part in motivating, verifying, and refuting formal phonological representa-
tions. This is in spite of the murky pool of arguments and counterarguments that charac-
terize the phonetics-phonology relationship. This relationship ranges in the literature
from a near-total separation to a full integration of the two fields. The recognition of the
two fields started rather vaguely with Ferdinand de Saussure's (in Course in General
Linguistics; 1915; reprinted in translation in 1966) distinction between langue (a higher
cognitive system of idiosyncratically related signifiers, i.e. signs, and signifieds, i.e.
idealized concepts) and parole (the physical instantiation of speech). But it was Trubetzkoy
(1969) who drew a principled distinction between phonetics and phonology.³ In his view,
phonetics is "the study of sound pertaining to the act of speech, which is concerned with
concrete physical phenomena." This field "would have to use the methods of the natural
sciences". Meanwhile, phonology is "the study of sound pertaining to the system of
language". This field "would use only the methods of linguistics, or the humanities, or the
social sciences" (pp. 3-4). However, in spite of these proposed methodological
delimitations, Trubetzkoy utilizes phonetic terminology based on articulatory and acoustic speech
properties to describe the distinctive oppositions among speech sounds stating that "no
other discipline except phonetics can teach us about individual sound properties".
³ According to Trubetzkoy, however, "Baudouin de Courtenay ... was the first to arrive at the idea that there should be two distinct types of descriptive sound study, depending on whether concrete sounds were to be investigated as physical phenomena or as phonic signals used by a speech community for purposes of communication." (pp. 4-5)
While Trubetzkoy believed that the minimal components of sound structure are
phonemes, his close colleague and fellow Prague School member Roman Jakobson
maintained that distinctive features, the building units of phonemes, are the minimal
components. Jakobson, Fant, and Halle (1952) and Jakobson and Halle (1956) represent the
earliest extensive experimental accounts aimed at characterizing distinctive features by
reference to their acoustic, auditory, as well as articulatory correlates. In this early system
distinctive features are encoded into cover terms that imply a number of phonetic
dimensions: articulatory, acoustic, and perceptual. However, the adequacy of this system is
challenged by languages whose phonemic inventories require separation of those
dimensions in order to express phonological contrasts. An example directly related to the topic
of this dissertation is provided by McCawley (1967; cited in Anderson 1985). The Jakob-
sonian feature [+flat] refers to sounds that involve a labial or back narrowing of the vocal
tract causing an acoustic lowering of the higher frequency components. In Arabic, the
feature [+flat] describes the back rounded vowel [u] as well as the emphatic consonants
since they involve a secondary constriction in the back of the vocal tract. As things stand
so far, there is no formal problem since, in Arabic, rounding is contrastive only in vowels
while 'pharyngealization' is contrastive only in consonants. But Arabic vowels become
pharyngealized when adjacent to pharyngealized consonants. How can this be expressed
as an acquisition of [+flat] by the vowel [u] which is already specified for that feature?
Despite these challenges, it has been generally acknowledged since the publication of
Jakobson's works that phonological units are phonetically grounded.
In the early works of generative phonology, as explained in Chomsky and Halle's
(1968) Sound Pattern of English (SPE), the underlying forms of morphemes are made up
of strings of abstract, but not arbitrary, "phonetic features". These features are universal
since they "represent the phonetic capabilities of man" (p. 299). Further developments in
the theory saw the articulatory basis of features, as well as the phonetic role in phonol-
ogy, receiving considerable attention. Many of the subsequent generative frameworks
used feature systems based on Halle's (1983) proposal that phonological features repre-
sent neural instructions to the articulators. This "Articulatory Model" marks a departure
from the view that links features to passive cavities and places of articulation in the vocal
tract (or to inconsistent descriptions like the location of the highest point in the tongue).
Instead, the articulatory correlates of features are described as the actions of the active
movable articulators. Another influential development in the consideration of the nature
of features is the introduction of Feature Geometry (Clements 1985). Phonological fea-
tures came to be recognized as autosegmental entities-rather than matrix entries-that
are hierarchically grouped, reflecting the independent action of certain sets of features in
phonological processes. In some of the later geometrical models (e.g. Keyser and Stevens
1994, Halle 1995, Halle et al. 2000) the anatomical architecture of the human vocal tract
plays a central role in the construction of feature trees. A more recent example
of such models is given in (1). Features are grouped under common articulator nodes that
denote the speech organs that physically execute these features. These nodes are further
grouped under articulator group nodes that reflect the anatomical or neural affinities
among the articulators. Finally, the articulator group nodes, along with the articulator-free
stricture features, are grouped under the root node.
The development of formal phonological representations within the general
framework of generative phonology brought along with it an increasingly tight integration
of phonology and phonetics. An important consequence of this progression is that
phonological representations, which are highly grounded in phonetics, lend themselves to
experimental methods. In recent years, ambitious attempts to subject phonological hy-
potheses to empirical validation (experimental phonology) have been gaining momentum.
The LabPhon forum (1990-present) is a noticeable example of such aspirations to shift
the field of phonological research into the domain of the mature sciences.
(1) Halle et al.'s (2000) feature tree.
Root: [consonantal], [sonorant]
    Articulator-free features: [suction], [continuant], [strident], [lateral]
    Place
        Lips: [round], [labial]
        Tongue Blade: [anterior], [distributed], [coronal]
        Tongue Body: [high], [low], [back], [dorsal]
    Soft Palate: [nasal], [rhinal]
    Guttural
        Tongue Root: [ATR], [RTR], [radical]
        Larynx: [spread gl], [constricted gl], [stiff vf], [slack vf], [glottal]
1.2.2 Acoustic-articulatory relations
This dissertation is built on the belief that the acoustic attributes of speech sounds
are a reflection of their articulatory qualities. It has been shown in various seminal works
that the different configurations assumed by the vocal tract correspond to systematic
acoustic patterns.⁴ The present dissertation depends on this relationship to further our
understanding of the articulation of the speech sounds in question on the basis of their
various acoustic correlates. In this section we go through the basic tenets of the
acoustic-articulatory relation in vowel production. The choice of vowel production as an example
follows from the need to limit this discussion to a manageable length. This choice is also
based on the fact that acoustic-articulatory models of vowel production are less complex
than those of obstruents. Vowel articulations involve a single wide open resonator (the
vocal tract) and an energy source at its end (the vibrating vocal folds). Obstruents, on the
other hand, involve more complex resonators due to the higher degree of constriction in
their articulations. Furthermore, the energy source during obstruent articulations resides
within the resonator. Nevertheless, many of the basic aspects of the following discussion
apply to the cases of obstruents as well.
⁴ As such, this dissertation follows the notion of articulatory and acoustic stability of Stevens (1989, 1999).
Classic works on vowels (Fant 1960, Stevens and House 1961) have modeled the
human vocal tract (during the articulation of a simple mid-central vowel like English [ə])
as a uniform pipe open at one end (corresponding to the lips) and closed at the other end
(corresponding to the glottis). The different resonating frequencies for this type of reso-
nator are calculated by the formula in (2).
(2) $F_n = \frac{(2n - 1)c}{4l}$
Where Fn is the nth resonating frequency, c is the velocity of sound, and l is the length of
the tube. What this formula means is that when the pipe resonator is excited by the acous-
tic energy generated by the vibration of the vocal folds it resonates at frequencies corre-
sponding to the odd multiples of the quarter-wavelength of a sine wave. This is because
those multiples coincide with maximum air volume velocity and minimum air volume
pressure at the open end of the tube, and to the opposites at the closed end (Chiba and
Kajiyama 1958; originally published in 1941). These resonating frequencies are known as
'formants'. Based on this formula, a pipe 17.7 cm in length, which approximates the
vocal tract length of an average male speaker (Stevens 1998), would produce formants at the
frequencies 500 Hz, 1500 Hz, 2500 Hz, etc.
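As a rough numerical check of the formula in (2), the following sketch computes the first few resonances of a uniform tube closed at one end. The function name and the value of 35,400 cm/s for the velocity of sound are assumptions made here for illustration (the latter was chosen because it reproduces the 500 Hz figure quoted for a 17.7 cm tube); neither is taken from the source.

```python
def quarter_wave_formants(length_cm, n_formants=3, c_cm_per_s=35400.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Implements Fn = (2n - 1) * c / (4 * l) for n = 1 .. n_formants.
    c_cm_per_s is an assumed velocity of sound, not a value stated in the text.
    """
    return [(2 * n - 1) * c_cm_per_s / (4.0 * length_cm)
            for n in range(1, n_formants + 1)]


# A 17.7 cm tube, the length used in the text for an average male vocal tract.
formants = quarter_wave_formants(17.7)
print([round(f) for f in formants])
```

Because the tube length appears in the denominator, a longer tube yields lower values for every resonance.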
Those frequency values are based on the assumption that the vocal tract has a
more or less uniform diameter along its length, so as to produce the English mid-central
vowel [ə]. Variations in the diameter of the tube at different locations (corresponding
to the various narrowings in the vocal tract when producing other vowels) have
been shown to correspond to systematic and rather predictable variations in the values of
the formant frequencies (Stevens and House 1955). The perturbation theory of Chiba and
Kajiyama (1958) relates the patterns of changes for a given formant to the constrictions
made at the points of maximum volume velocity (points of minimum volume pressure;
antinodes) or at points of minimum volume velocity (points of maximum volume pres-
sure; nodes) of that formant. If the articulation of a given vowel results in a constriction
at or near the antinode of a certain formant, the formant is lowered. Conversely, if the
constriction is at or near the formant node, the formant is raised. Widening, rather than
constricting, the vocal tract at those points has an opposite effect. Lip rounding has one or
both of two possible articulatory products. It can result in a constriction at the antinodes
of all formants (since, as explained earlier, the odd multiples of the quarter-wavelength of
the sine wave always have their last antinode at the lips) or in a lengthening of the vocal
tract. Constriction at the lips lowers all formant frequencies since it is a constriction at the
antinodes. Vocal tract lengthening also lowers all formant frequencies since the tube length l
in the denominator of the formula in (2) increases. The approximate locations of the nodes and antinodes for the
formant frequencies F1 and F2 are schematized in Figure 1.1. So, the overall geometric
shape of the vocal tract above the glottis filters the sound energy from the vibrating vocal
folds to give the distinctive acoustic shapes of vowels.
Figure 1.1. Points of minimum velocity (nodes) and maximum velocity (antinodes) for the first two formant frequencies of vowels. An indicates the antinode of formant n, while Nn indicates the node of that formant.
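The node and antinode locations, and the direction of change predicted by the perturbation rule, can be sketched for the idealized uniform tube. The function names, the relative position scale (0 at the glottis, 1 at the lips), and the "nearest landmark" decision rule are all assumptions introduced here for illustration. A real vocal tract is not a uniform tube, so the positions are approximate at best.

```python
from fractions import Fraction


def velocity_antinodes(n):
    """Relative positions (0 = glottis, 1 = lips) of the volume-velocity
    maxima for the n-th resonance of a uniform quarter-wavelength tube."""
    return [Fraction(2 * k - 1, 2 * n - 1) for k in range(1, n + 1)]


def velocity_nodes(n):
    """Relative positions of the volume-velocity minima (pressure maxima)."""
    return [Fraction(2 * k, 2 * n - 1) for k in range(0, n)]


def formant_shift(n, position, action="constrict"):
    """Qualitative direction of change of formant n when the tube is
    constricted or widened at `position`.

    Hypothetical rule of thumb: the nearest landmark decides. Constricting
    near an antinode lowers the formant, constricting near a node raises it,
    and widening has the opposite effect.
    """
    d_anti = min(abs(position - a) for a in velocity_antinodes(n))
    d_node = min(abs(position - p) for p in velocity_nodes(n))
    if d_anti == d_node:
        return "no net change"
    near_antinode = d_anti < d_node
    lowers = near_antinode if action == "constrict" else not near_antinode
    return "lowered" if lowers else "raised"


print("F1 antinodes:", velocity_antinodes(1), "nodes:", velocity_nodes(1))
print("F2 antinodes:", velocity_antinodes(2), "nodes:", velocity_nodes(2))
print("Lip constriction, F1:", formant_shift(1, 1))
print("Lip constriction, F2:", formant_shift(2, 1))
```

In this idealization the lips are an antinode for every formant, which matches the statement that constriction at the lips lowers all formant frequencies.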
As a related example, let us consider the articulation of the three Arabic vowels [i,
u, a]. The high front vowel [i] is articulated with the tongue body raised and fronted to-
wards the alveolar region. As a result of this forward thrust of the tongue mass, the lower
pharynx is usually widened during [i]. The alveolar constriction takes place close to the
node of F2 which yields a high value for that formant. Meanwhile, widening the pharynx
takes place near the node of F1, which is why this frequency is usually low for [i]. For [u],
the tongue body is raised and backed towards the velar region. Also, the lips are rounded
(constricted) and, occasionally, protruded. Both the velar and the labial constrictions take
place near the two antinodes of F2. This is why this formant is usually very low for [u].
Furthermore, the labial constriction takes place at the antinode of F1, which, like F2, is
usually very low for [u]. If the lips are also protruded, this would elongate the vocal tract
and lower both formants as well. Arabic [a] is a mid low vowel that is accompanied by a
mildly narrowed lower pharynx. The oral tract is somewhat wider than in the English
mid-central vowel [ə], taken here as a reference. The mild constriction in the lower pharynx
takes place close to the nodes of both F1 and F2, which should yield higher values for
both F1 and F2. However, the effect of the pharyngeal narrowing on F2 seems to be counterbalanced
by the oral widening at the other node of F2. This is why, compared to English [ə],
F1 for Arabic [a] is higher while F2 is about the same.
Figure 1.2 illustrates the relation between the articulatory configurations of the
three Arabic vowels and their typical power spectra. The glottal line spectrum refers to
the frequency components (harmonics) of the energy source which drop in amplitude at a
rate of 12 dB per octave (Pickett 1999). The radiation characteristics refer to the ten-
dency of the higher frequency components to gain in amplitude at a rate of 6 dB per oc-
tave as a result of the radiation of the sound signal out of the lips and into the open air. So
the net source spectrum drops in amplitude at a rate of 6 dB per octave. The transfer
function is basically the filtering effect of the specific shape of the vocal tract. The output
spectra reflect the filtering effects of the vocal tract during the production of each vowel.
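The arithmetic behind the net source slope can be sketched as follows. The two slopes are the ones quoted above; the function name and the 100 Hz reference frequency are assumptions made only for illustration, and levels are expressed relative to that reference.

```python
import math

# Slopes quoted in the text (dB per octave).
GLOTTAL_SLOPE_DB_PER_OCT = -12.0   # glottal source roll-off
RADIATION_SLOPE_DB_PER_OCT = 6.0   # gain from radiation at the lips

NET_SLOPE_DB_PER_OCT = GLOTTAL_SLOPE_DB_PER_OCT + RADIATION_SLOPE_DB_PER_OCT


def net_source_level_db(freq_hz, ref_hz=100.0):
    """Net source level (dB) at freq_hz relative to the level at ref_hz.

    ref_hz is a hypothetical reference frequency chosen for illustration.
    """
    octaves = math.log2(freq_hz / ref_hz)
    return NET_SLOPE_DB_PER_OCT * octaves


for f in (100.0, 200.0, 400.0, 800.0):
    print(f"{f:6.0f} Hz: {net_source_level_db(f):6.1f} dB re 100 Hz")
```

Each doubling of frequency lowers the net source level by 6 dB, which is the combined effect of the 12 dB per octave source roll-off and the 6 dB per octave radiation gain.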
Figure 1.2. Illustration of how the articulation of the three Arabic vowels [i, u, a] is related to their acoustic shapes in the light of the source-filter theory. [Figure not reproduced. Panels: glottal line spectrum (dB, 0 to 3 kHz) combined with radiation characteristics; then, for each of the vowels [i], [u], and [a]: vocal tract shape, vocal tract transfer function, and output vowel spectrum.]
1.3 Modern Standard Arabic (MSA)
The experimental portions of this dissertation rely exclusively on data from Mod-
ern Standard Arabic (henceforth MSA). So, it is helpful to be acquainted with this variety
of Arabic and review its phonemic inventory.
Arabic is the main language in the Arab countries which occupy most of the Mid-
dle East and North Africa. Close to 200 million people in that region speak one variety of
Arabic or another as their first language. Furthermore, Classical Arabic (henceforth CA)
is used as a liturgical language by more than 1 billion Muslims around the world. Mus-
lims believe that Islam's holy book, the Holy Qur'an, which is worded in a form of CA
highly admired by Arabs (Kaye 1990), is the direct words of Allah (God). CA is often
referred to as fusˤħaa (clearest). As time passed, different Arabic-speaking peoples naturally
developed numerous regional vernaculars that are mostly spoken, but rarely written.
MSA emerged as a direct descendant of CA that fills the need for a standardized form of
Arabic that can also be expressed in writing. Many Arab intellectuals hail MSA as a more
'proper' form of Arabic than the regional vernaculars which they view as signs of the
corruption that befell the revered CA. MSA is currently the language of the media, the
public education systems, practically all written and technical forms of Arabic, as well as
intellectual circles. MSA can also be thought of as a pan-Arab lingua franca used whenever
dialectal differences veer into unintelligibility. According to Holes (1994), the spread
of education and mass-media exposure has a "leveling influence" which brings
the divergent Arabic dialects gradually closer to MSA.
As noted earlier, MSA is a descendant of CA and retains the basic syntactic, mor-
phological, and phonological systems. But MSA brings added 'standardization'. Bateson
(1967) lists the following main differences between MSA and CA:
1. MSA is a simplified form of CA. This simplification is mostly realized as the
placement of limitations on the choices of syntactic structures and vocabulary
items used. MSA only uses a subset of the possible syntactic structures avail-
able in CA as well as a substantially reduced lexicon.
2. Included in the MSA lexicon are newly derived, coined, and borrowed vo-
cabulary items that are intended to address the need for technical and other
modern-use terminology.
3. There are idiomatic, stylistic, and even syntactic innovations introduced into
MSA mainly due to the influence of European languages. Such influences are
brought about mostly by direct translations of European texts into Arabic.
The MSA phonemes are listed in (3) and (4). These phonemes are essentially di-
rectly inherited from CA. Overall, there are 28 consonant and three vowel phonemes.
Like other Semitic languages, Arabic is known for its root-and-pattern morphological
system which differs from concatenative systems in that the morphemes are, more
or less, interwoven rather than linearly ordered. Most Arabic stems are based on roots of
two or three consonants between which vowels are inserted. Generally speaking, the con-
sonantal root carries the semantic meaning of the word while the vocalism and the vowel-
consonant ordering reflect the word's inflection and its part of speech. The example
words in (5) are all based on the tri-consonantal root ktb 'write'. Inflectional prefixes and
suffixes can also be attached to the stems. Compare the examples in (6).
(3) Arabic consonant phonemes.
Places of articulation (columns of the original table, front to back): Bilabial, Labiodental, Dental, Alveolar, Palato-alveolar, Palatal, Velar, Uvular, Pharyngeal, Glottal.
Stop:        b, t, d, tˤ, dˤ, k, q, ʔ
Fricative:   f, θ, ð, ðˤ, s, z, sˤ, ʃ, χ, ʁ, ħ, ʕ, h
Affricate:   dʒ
Nasal:       m, n
Trill:       r
Approximant: w, j
Lateral:     l
(4) Arabic vowel phonemes.
i       u
    a
(5) a. katab 'wrote'
    b. kutib 'was written'
    c. kaatib 'writer'
    d. kitaab 'book'
(6) a. katab-a 'wrote' 3rd m. sg.
       katab-at 'wrote' 3rd f. sg.
    b. ja-ktub 'write' 3rd m. sg.
       na-ktub 'write' 1st m./f. pl.
Following the theoretical proposals of Goldsmith (1976), McCarthy (1979) han-
dles the theoretical challenges this morphological system poses to traditional linear theo-
ries by proposing the separation of the consonantal root, the vocalism, and the CV skele-
ton of the word into separate autosegmental tiers. The consonants and vowels are mapped
into the CV slots of the skeleton by means of association lines as shown in (7). As such,
the consonants that appear separated by vowels in the surface structure of the word are
underlyingly adjacent.
(7) Consonantal Tier     k     t     b
                         |     |     |
    CV-Template          C  V  C  V  C
                             \   /
    Vocalic Melody             a
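The mapping of separate tiers onto a CV skeleton can be sketched in code. This is a deliberately simplified illustration and not McCarthy's formal association conventions: the function name is hypothetical, adjacent V slots are assumed to share one melody vowel (a long vowel), and the last melody vowel is assumed to spread to any remaining V slots. The roots, vocalisms, and resulting stems are those of the examples in (5).

```python
def associate(root, melody, template):
    """Map a consonantal root and a vocalic melody onto a CV template.

    Simplifying assumptions (for illustration only):
      * consonants fill C slots left to right, one per slot;
      * a run of adjacent V slots shares a single melody vowel (long vowel);
      * when the melody runs out, its last vowel spreads to later V slots.
    """
    consonants = list(root)
    vowels = list(melody)
    c_index = 0
    v_index = -1
    previous_slot = None
    segments = []
    for slot in template:
        if slot == "C":
            segments.append(consonants[c_index])
            c_index += 1
        elif slot == "V":
            if previous_slot != "V":
                # Start of a new vowel position: advance, or spread the last vowel.
                v_index = min(v_index + 1, len(vowels) - 1)
            segments.append(vowels[v_index])
        else:
            raise ValueError(f"unexpected template symbol: {slot!r}")
        previous_slot = slot
    return "".join(segments)


print(associate("ktb", "a", "CVCVC"))    # katab
print(associate("ktb", "ui", "CVCVC"))   # kutib
print(associate("ktb", "ai", "CVVCVC"))  # kaatib
print(associate("ktb", "ia", "CVCVVC"))  # kitaab
```

In each call the root ktb is stored as one unbroken string, which mirrors the point that the root consonants are adjacent on their own tier even though vowels separate them in the surface form.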
1.4 Overview of the dissertation
Besides the current chapter, this dissertation comprises six chapters. Chapter
2 (Background and Literature Review) lays out the phonetic and phonological back-
ground and reviews the literature pertaining to Arabic emphatic and guttural sounds. The
chapter discusses the most prominent formal representations of Arabic emphatics and
gutturals and highlights the descriptive and explanatory inadequacies facing them. The
chapter also goes over some of the relevant vocal tract anatomical details.
Chapters 3, 4, and 5 are the core chapters of the dissertation. As noted earlier, this
dissertation investigates the acoustic correlates of Arabic emphatics and gutturals and re-
lates those correlates to the articulatory traits of those sounds. It is essential for the success
of this approach to locate salient and reliable acoustic correlates to emphatic and gut-
tural articulations. Each one of the three core chapters focuses on one possible source for
acoustic correlates to articulation. Three sources are focused on here since they have been
widely studied and have been shown to be rich in acoustic cues for articulation: the spectral
shapes of the consonants themselves, formant transitions in the vowels adjacent to the
consonants in question, and consonants' effects on vowel-to-vowel coarticulation.
Chapter 3 (Experiment One) focuses on the spectral shapes of Arabic emphatics
and gutturals along with other related consonants. The goal of this chapter is to address
two gaps in the acoustic literature on Arabic emphatics and gutturals. The first gap concerns
the consonantal spectral correlates to emphaticness. As explained in the next chap-
ter, the majority of the previous attempts to distinguish emphatic consonants from non-
emphatic ones based on their consonantal spectral shapes have been either sketchy or
subjective or both. A comprehensive acoustic comparison between emphatics and their
non-emphatic counterparts is presented in Chapter 3 using more recent objective methods
of characterizing consonant spectra. The chapter concludes that no highly reliable acous-
tic correlates to emphaticness can be located in the spectral shapes of the consonants
themselves. This excludes the canonical spectra of consonants as the potential acoustic
source to pursue when addressing the main goals of this dissertation. The second gap ad-
dressed in this chapter concerns the consonantal status of Arabic uvular continuants.
These sounds are sometimes described as approximants, a classification that has crucial
theoretical repercussions as explained in the next chapter. The chapter concludes that
Arabic uvular continuants possess strong fricative spectral qualities. This finding demands
major reconsiderations of the phonological claims that are based on the treatment of all
Arabic gutturals as approximants.
Chapter 4 (Experiment Two) examines the coarticulatory impact of the sounds in
question on the formant frequencies of adjacent vowels. While this issue has been treated
thoroughly in the literature, this experiment aims to provide more objective evaluations
of the precise coarticulatory correlates to emphatic and guttural articulations. The subtle
similarities and differences among emphatics, uvulars, and pharyngeals are highlighted
and interpreted as solid indications of the characteristic articulatory properties of those
sounds. The main and only reliable correlate to emphaticness is shown to
be a substantially low and stable F2 locus in the adjacent vowel. Uvulars are also associ-
ated with low F2 transitions. However, unlike emphatics, uvulars are not associated with
identifiable F2 loci in adjacent vowels. The magnitude of F2 drop in uvulars depends on
the identity of the vowel. Pharyngeals are associated with consistently high F1 transitions.
While emphatics and uvulars are also generally associated with high F1 transitions,
this association is not as strong nor as stable as in the case of pharyngeals. These findings
are interpreted as indications that only pharyngeals achieve their pharyngeal constriction
through active tongue root retraction in Arabic. Emphatics and uvulars, on the other hand,
involve mainly tongue dorsum retractions. Any tongue root retraction in these sounds is a
by-product of the general retraction of the tongue mass. This challenges the phonological
views that represent Arabic emphatics and uvulars as [+RTR] sounds. Furthermore, the
more stable association between low F2 transitions and emphatics, as opposed to uvulars,
is interpreted as an indication that the dorsal retractions in emphatics and uvulars are not
fully similar. This particular issue is addressed further in Chapter 5.
Chapter 5 (Experiment Three) examines the vowel-to-vowel coarticulatory effects
across the sounds in question. It is widely acknowledged that the tongue dorsum is the
main articulator in vowels. Since the sound classes investigated in this dissertation in-
volve different degrees and types of tongue participation in their production, the influ-
ence of those sounds on vowel-to-vowel coarticulation provides an experimental opportunity
to compare and contrast their articulations. This is particularly true in the cases of
emphatics and uvulars, both of which involve active participation of the tongue dorsum.
The results show that plain orals, pharyngeals, and laryngeals permit substantial degrees
of vowel-to-vowel coarticulation. Emphatics, on the other hand, strongly resist such effects.
Uvulars' impact on vowel-to-vowel coarticulation depends on their degree of constriction.
The uvular stop [q] shows emphatic-like resistance to vowel-to-vowel coarticulation.
The voiceless fricative [χ] allows significant vowel-to-vowel coarticulation. The
voiced fricative [ʁ] is highly transparent to vowel-to-vowel coarticulation. These results
are interpreted in the light of the possible musculature involved in the production of emphatics
and uvulars. Dorsal retraction in emphatics is most likely produced through contraction
of two lingual muscles, the styloglossus and the hyoglossus, both of which are
implicated in the production of vowels. Dorsal retraction in uvulars is attributed to the
styloglossus only. The magnitude of this participation depends on the degree of constric-
tion involved in the sound. Tongue raising in uvulars is attributed to the contraction of the
palatoglossus which is not implicated in the production of vowels.
Chapter 6 (Implications and Alternatives) casts the acoustically-motivated articu-
latory claims in the three experiments into formal phonological representations of Arabic
emphatics and gutturals. Emphatics are considered to be secondarily dorsal sounds. Uvu-
lars are considered to be secondarily dorsal and primarily radical. Pharyngeals and laryn-
geals are considered to be radical sounds. The representations based on these considera-
tions are shown to be more capable of handling the challenges that face previous
proposals. Chapter 6 also proposes an abstract neuro-motoric foundation for the grouping
of the three guttural subclasses into one natural class. The chapter then concludes with a
brief look at emphatics and gutturals in Tigre and St'át'imcets, which differ from Arabic in
that their guttural natural classes include emphatics but exclude laryngeals. It is suggested
that these phonological differences are explainable on the basis of phonetic differences.
Chapter 7 (Conclusion and Recommendations) concludes the dissertation and
suggests possible topics for future research.
CHAPTER 2
Background and Literature Review
This chapter lays out the phonetic and phonological background for the disserta-
tion. Since there are numerous anatomical references in this chapter and throughout the
dissertation, the first section in this chapter goes over the most relevant speech organs in
some detail. Section 2.2 reviews the phonetic literature on Arabic emphatics, uvulars,
pharyngeals, and laryngeals. The section concentrates on the most prominent articulatory
and acoustic reports on these sounds. Occasional reviews of perceptual works are also
provided. Section 2.3 goes over the phonological evidence supporting the grouping of the
three Arabic guttural subclasses into a single natural class in terms of place of articula-
tion. The section concentrates on two types of phonological evidence (morpheme struc-
ture constraints and vowel lowering in guttural contexts) as reported in Arabic as well as
other related and unrelated languages. Section 2.4 reviews the most prominent formal
phonological representations of Arabic emphatic and guttural sounds. The section focuses
on the representations in McCarthy (1994), Rose (1996), and Zawaydeh (1999). These
three examples cover the general formal representational trends as far as Arabic emphat-
ics and gutturals are concerned. Section 2.5 highlights the descriptive and explanatory
inadequacies facing those proposals and links them primarily to the lack of a clear under-
standing of the articulatory traits of these sounds.
2.1 Basic Vocal Tract Anatomy
At several locations in this dissertation, extensive references to articulatory organs
and their musculature are made. In this section, we take a look at the active articulatory
organs and describe the main muscles that underlie their actions. Only the articulatory
organs that are directly implicated in the articulation of the sounds of interest are dis-
cussed here. This is why, for example, the lips are ignored in the following review. The
details presented in this section are based on the descriptions and illustrations of Zemlin
(1968), Perkins and Kent (1986), Lieberman and Blumstein (1988), Palmer (1993), and
Seikel et al. (1997).
2.1.1 The Tongue
This flexible mass of muscle fiber is arguably the most notable articulatory organ.
From a linguistic perspective, the tongue is divided into four parts, mostly on the basis of
their relation to the fixed structure of the vocal tract: the tip, the blade, the dorsum, and
the root. The rear and radical portions of the tongue are fixed to the velum, the pharynx,
the epiglottis, and the hyoid bone. The lingual movements are executed by two sets of
muscles: the intrinsic muscles of the tongue, which originate from inside the tongue it-
self, and the extrinsic muscles of the tongue, which arise from neighboring structures and
terminate at various points in the tongue.
The intrinsic muscles of the tongue are the superior longitudinal muscle, the infe-
rior longitudinal muscle, the transverse muscle, and the vertical muscle. The superior
longitudinal muscle is a sheet of muscle tissue that extends throughout the length of the
tongue just below its upper surface. When contracted, it shortens the tongue or lifts the tip
and neighboring sides upwards giving the tongue a concave shape. Contracting one side
of this muscle alone causes the tongue to turn to that side. The inferior longitudinal mus-
cle is a paired muscle that arises from the root of the tongue and extends all the way to its
tip. It courses along the lower surface of the tongue following two side paths separated
along the middle by the genioglossus muscle (discussed below). Contracting this muscle
shortens the tongue or lowers its tip. Like the superior longitudinal muscle, contraction of
one side of this muscle causes the tongue to turn to that side. The transverse muscle is a
paired muscle whose fibers radiate from the median fiber wall of the tongue and stretch
laterally to terminate at the side edges of the tongue. Contraction of this muscle narrows
the tongue and lengthens it. The vertical muscle is also a paired muscle whose fibers ex-
tend vertically from just below the upper surface of the tongue flowing downward to-
wards the base of the tongue. Along the way, these fibers intertwine with those of the
transverse muscle. This muscle flattens the tongue when contracted.
The extrinsic muscles of the tongue are the genioglossus muscle, the hyoglossus
muscle, the styloglossus muscle, and the palatoglossus muscle. These muscles are sche-
matized in Figure 2.1. The genioglossus is the largest of the tongue muscles. Its fibers
arise from the inside surface of the mandible and fan upward and backward to insert into
the tongue from its tip all the way to its root. It occupies a medial location along the
width of the tongue. Contracting the anterior portion of the genioglossus draws the
tongue back while contraction of the posterior portion slides the tongue forward. When
both portions are contracted, the tongue assumes a concave shape along the middle. The
hyoglossus is a paired muscle that arises from the hyoid bone and inserts into the lower
sides of the tongue. When contracted, the hyoglossus lowers and retracts the tongue mass.
The styloglossus is a paired muscle that emerges from the styloid process and inserts into
the lower sides of the tongue. Contraction of this muscle draws the tongue back and up-
wards. The palatoglossus is a paired muscle considered to be a part of the anterior faucial
pillars. This muscle is classified anatomically as one of the extrinsic muscles of the
tongue as well as one of the muscles of the soft palate (see §2.1.3 below). It originates
from the anterior portion of the soft palate and inserts into the sides of the back of the
tongue. Contracting this muscle either lowers the soft palate or, if the soft palate is fixed,
raises the back of the tongue.
Figure 2.1. The extrinsic muscles of the tongue along with some other vocal tract organs.
1. Maxilla 2. Genioglossus Muscle 3. Mandible 4. Geniohyoid Muscle 5. Palatoglossus Muscle 6. Styloglossus Muscle 7. Hyoglossus Muscle 8. Hyoid Bone
Figure 2.2. The pharyngeal constrictors and related structures.
1. Stylohyoid ligament 2. Pterygomandibular ligament 3. Larynx 4. Trachea 5. Superior Constrictor Muscle 6. Middle Constrictor Muscle 7. Inferior Constrictor Muscle 8. Esophagus
2.1.2 The Pharynx
The pharynx is, roughly speaking, a tube-like structure that extends from the pos-
terior region of the nasal cavity to the larynx. The upper region of the pharynx above the
velum is called the nasopharynx. The region extending from the velum down to the hyoid
bone is known as the oropharynx. The region from the hyoid bone down is called
the laryngopharynx. The most notable of the pharyngeal muscles are its three constrictor
muscles shown in Figure 2.2. The superior constrictor muscle originates from the ptery-
gomandibular ligament and courses backwards to insert into the midline tendinous raphe.
The middle constrictor muscle originates from the hyoid bone and the stylohyoid liga-
ment and courses backward to insert into the midline tendinous raphe. The inferior con-
strictor muscle is a rather large sheet of muscle fibers. It starts from the cricoid cartilage
and the thyroid lamina and fans back and around to insert into the midline tendinous raphe.
Contracting any of the three pharyngeal constrictors narrows the diameter of the
pharynx at its particular location.
2.1.3 The Soft Palate
The soft palate, or velum, is a flexible flap of muscle fibers and other tissues that
forms the posterior part of the roof of the mouth. It is attached anteriorly to the rear edge
of the hard palate, by means of the palatal aponeurosis, and laterally to the superior con-
strictors of the pharynx. In speech, the soft palate plays a major role in the production of
nasal sounds. When fully lowered it allows the pulmonic airstream to pass through the
nasal cavity producing the acoustic and auditory effect of nasalization.
As shown in Figure 2.3 the soft palate has two main elevator muscles: the levator
veli palatini muscle and the uvular muscle; and two main depressor muscles: the pala-
toglossus muscle and the palatopharyngeus muscle. The levator veli palatini is a paired
elevator muscle. It arises from the temporal bone and the Eustachian tube and descends to
insert into the aponeurosis of the velum. When contracted, this muscle lifts the velum up
and back. The other elevator muscle, the uvular muscle, originates from the posterior
palatal bones and from the palatine aponeurosis and extends backwards until it inserts into
the uvula. When contracted, this muscle shortens and raises the velum. The palatoglossus
has already been described as one of the extrinsic muscles of the tongue. As explained
above, contraction of this muscle lowers the soft palate or, if the soft palate is fixed, it
raises the back of the tongue. The palatopharyngeus is another velar depressor. Its fibers
arise from the soft palate and stretch laterally and downwards and attach to the thyroid
cartilage as well as the pharyngeal walls. It is part of the posterior faucial pillars. Contraction
of this muscle brings the faucial pillars closer and lowers the velum in a sphincteric
move that narrows the diameter of the pharynx. It can also raise the larynx.
Figure 2.3. Muscles of the soft palate along with related structures.
1. Palatal Bone 2. Uvular Muscle 3. Palatoglossus Muscle 4. Tongue 5. Levator Veli Palatini Muscle 6. Palatopharyngeus Muscle
Figure 2.4. Structure of the larynx.
1. Hyoid Bone 2. Hyothyroid Membrane 3. Thyroid Cartilage 4. Cricothyroid Muscle (Pars Recta) 5. Cricothyroid Muscle (Pars Oblique) 6. Arytenoid Cartilage 7. Cricoid Cartilage 8. Trachea
2.1.4 The Larynx
The larynx (Figure 2.4) is a complex structure made up of muscle tissues and car-
tilages. The main cartilage structures that make up the larynx are the cricoid cartilage, the
thyroid cartilage, the arytenoid cartilages, and the epiglottis. The cricoid rests atop the
upper end of the trachea. It is shaped like a ring with its posterior part reaching higher
(i.e., is thicker) than the anterior part. The thyroid is the largest laryngeal structure. It is
situated above the cricoid and the two are joined by means of a pair of facets, one on each
side. The arytenoid cartilages are small pyramid-shaped structures that rest on facets lo-
cated on the two sides of the upper posterior surface of the cricoid cartilage. Attached to
the front and sides of these cartilages are the vocal folds which are multi-layered tissues
whose other edge is attached to the inner surface of the front of the thyroid. The epiglottis
is a shoehorn-shaped cartilage that is attached to the lower inner surface of the front of
the thyroid. It rises upwards extending above the level of the hyoid bone. It is attached to
the arytenoid cartilages by means of the aryepiglottic folds. While not structurally a part
of the larynx, the hyoid bone is closely related to laryngeal and oral structures. It is an
arch-shaped bone sitting above the larynx and is linked to the thyroid by means of the
lateral hyothyroid ligaments which extend from the posterior tips of the hyoid downwards to
the horns of the thyroid. Also, the two structures are linked by means of the hyothyroid
membrane which drapes down from the hyoid and attaches to the upper rim of the thy-
roid.
The larynx has the following intrinsic muscles: the lateral cricoarytenoid muscle,
the transverse arytenoid muscle, the oblique arytenoid muscles, and the cricothyroid
muscle. The lateral cricoarytenoid muscle is a vocal fold adductor muscle. It extends
from the upper rim of the cricoid cartilage to the muscular process of the arytenoid cartilages.
Contraction of this muscle rotates the arytenoids bringing the vocal folds closer to
each other. The transverse arytenoid muscle stretches from the side and back of one
arytenoid to the other. When contracted, it brings the arytenoids, and subsequently the
vocal folds, closer to each other. The oblique arytenoid muscles are paired muscles. Each
one arises from the bottom of the posterior part of one of the arytenoids and inserts into
the top of the other. From there it continues upward to form the aryepiglottic muscles
which insert into the sides of the epiglottis. Contraction of the oblique arytenoid muscles
has a rather similar effect to that of the transverse arytenoid muscles. Contraction of the
aryepiglottic muscles, with the aid of the oblique arytenoid muscles, pulls the epiglottis
down to cover the larynx. The cricothyroid muscle arises from the sides and front of the
cricoid cartilage then divides into two parts: the pars recta and the pars oblique. The for-
mer extends upwards and the latter extends up and backwards till they both attach to the
lower edge of the thyroid. Contraction of the pars recta tilts the front of the thyroid down
increasing the distance between the thyroid and the arytenoids causing a tension in the
vocal folds. Contraction of the pars oblique slides the thyroid forward also increasing the
distance between the thyroid and the arytenoids and tensing the vocal folds.
2.2 Phonetic Properties of Arabic Emphatics and Gutturals
2.2.1 Emphatics
The famed Arab grammarian Sibawayh⁵ (d. circa 796 A.D.) notes that the four
emphatics [tˤ, dˤ, ðˤ, sˤ] are articulatorily similar to their non-emphatic counterparts [t, d,
ð, s]. The exception is that, in the emphatics, "your tongue would cover (the area extending)
from their main place of articulation to portion of the palate opposite the tongue
(which) you raise towards the palate" (Kitab Sibawayh, vol. 2, p. 406). For this reason,
Arab grammarians termed these four sounds mutˤbaqah ('covered'). Ibn Sina (d. 1037
A.D.; known in western historical and philosophical circles as Avicenna) adds that
emphatics are articulated with a depressed tongue surface behind the main articulation.
This point, as discussed below, is verified by modern research techniques.
Modern studies show that, beside their primary coronal articulation, all Arabic
emphatics have a secondary articulation involving the back of the tongue. Descriptions of
the latter involvement differ from one study to the other. It is generally accepted that the
secondary emphatic articulation involves mainly a retraction of the tongue body. The
schematic in Figure 2.5 illustrates this articulatory configuration. Among the earliest x-ray
examinations of emphatics is the one done by Al-Ani (1970). His x-ray tracings
clearly show that the tongue body is pulled backwards into the upper oropharynx during
the articulation of [tˤ]. Based on this articulatory evidence, the author favors pharyngealization
over velarization as the proper description for the secondary emphatic articulation.
5 For some reason, Sibawayh's name is misspelled as 'Sibawayhi' in the majority of the modern works published in western countries. The source of the added i at the end of the name is not known to me. The spelling I use here is a transliteration of the Arabic spelling,
Figure 2.5. A schematic illustration of the vocal tract configuration during the articulation of an Arabic emphatic coronal and its non-emphatic counterpart. This schematic is based on descriptions and illustrations in Al-Ani (1970), Ali and Daniloff (1972), and Ghazeli (1977).
The cinefluorographic investigation by Ali & Daniloff (1972) using Iraqi speakers
arrived at similar findings. The difference reported by the authors between emphatics and
non-emphatics is that the former class of sounds involves a retraction of the pharyngeal
tongue dorsum causing a narrowing in the upper pharynx. The authors found that the
posterior wall of the pharynx and the velum were not significantly implicated in the
articulatory difference. The only significant involvement of the velum occurs during the
production of [kˤ] ([q]) (which they consider an emphatic version of [k]), during which
the velum is moved toward the tongue. Additionally, the authors are careful to point to an
active participation by the palatine tongue dorsum in the articulatory difference between
emphatics and non-emphatics. The palatine dorsum is depressed during emphatics, causing
a widening of the oral cavity, an adjustment also shown in Al-Ani's tracings.
A more extensive articulatory investigation is offered by Ghazeli (1977), who
points out that the accompanying depression of the palatine dorsum is either the cause or the
result of the rearward movement of the tongue back. The author reports that the retraction
of the tongue back into the upper pharynx takes place "at the level of the second cervical
vertebra" (p. 72). The precise location of the constriction, however, does not seem to be
an area of agreement among articulatory studies. Based on x-rays of a speaker of Bagh-
dad Arabic, Giannini & Pettorino (1982) report that the extremum of the pharyngeal con-
striction takes place closer to the level of the third and fourth vertebrae.
The x-ray-based investigations of Ali & Daniloff (1972) and Ghazeli (1977) point
to a retraction in the upper pharynx during emphatics achieved by a retraction of the
tongue body towards the posterior pharyngeal wall while little or no adjustments take
place in the lower pharynx (Al-Ani's (1970) x-ray tracings do not show the lower phar-
ynx). Ghazeli notes that there is an accompanying backward movement of the epiglottis
but no significant adjustments in the laryngopharynx. This suggests that the epiglottal
constriction is a byproduct of the general retraction of the tongue. However, Laufer &
Baer (1988) argue that the epiglottal constriction is actually what defines the secondary
emphatic articulation on the basis of a fiberscopic study of nine subjects speaking Arabic,
Hebrew, or both languages. Their images show noticeable backing of the epiglottis in
emphatics as opposed to non-emphatics. The authors conclude that the secondary articu-
lation in emphatics and the primary articulation in pharyngeals are qualitatively similar: a
constriction in the lower pharynx achieved by a backward movement of the epiglottis to
form a constriction with the pharyngeal walls. According to the study, however, the pha-
ryngeal constriction is less extreme and less constant in emphatics than in pharyngeals.
The fiberscopic study by Zawaydeh (1999) also concludes that there is an articulatory
similarity between emphatics, uvulars, and pharyngeals, all of which involve pharyngeal
narrowing. Zawaydeh's study, however, does not discuss the precise locations of the con-
strictions.
Fiberscopic imaging is a very valuable method for investigating the lateral and
annular movements in the pharynx which cannot be captured by lateral x-ray images.
However, compared to x-rays, fiberscopes are at a disadvantage when we consider the
area of coverage and the coordination between articulators. Rhinal fiberscopes are typi-
cally inserted into the subject's nostril, extended backwards through the nasal cavity, and
then dangled downwards into the upper oropharynx below the level of the uvula. They
provide a top-to-bottom look at the mid and lower pharynx. Thus, fiberscope images can-
not capture the whole tongue dorsum nor can they reliably judge the vertical placement of
the larynx. Lateral x-ray images are wide enough to cover the whole vocal tract and
show how the movements of all articulators are timed and coordinated. The x-ray tracings
in the studies discussed above show a rearward movement by the epiglottis accompany-
ing the general backing of the tongue dorsum. Furthermore, Giannini and Pettorino
(1982) state that the aryepiglottic muscle which, when contracted, depresses the epiglottis
backwards is not involved in this articulation. Accordingly, Laufer & Baer's (1988) con-
clusions regarding the common active articulator for emphatics and pharyngeals have to
be questioned. A rearward movement of the epiglottis can be the result of a general re-
traction of the tongue back or the tongue root to which the epiglottis is attached. It was
mentioned earlier that Ghazeli (1977) and Giannini & Pettorino (1982) noted such dis-
placement in x-ray images, but it was not the most significant articulatory difference be-
tween emphatics and non-emphatics.
So far, most attempts to distinguish emphatic consonants from their non-emphatic
counterparts on the basis of their acoustic shapes have depended on visual inspection of
spectrograms and generally met little success. Using synthesized sound tokens, Obrecht
(1961) found that [s] and [sˤ] cannot be perceptually distinguished from each other based
on the lower cutoff edge of their fricative noise portions. The lower frequency cutoffs for [s]
and [sˤ] reported by Al-Ani (1970) are at about 3000 Hz and 2750 Hz, respectively.
Ghazeli (1977), however, found that both [s] and [sˤ] have energy concentrations that
start at 3000 Hz. Giannini and Pettorino (1982) report that the spectrograms of both [s]
and [sˤ] exhibit similar "irregular striations of equal intensity above 3000-4000 cps".
Card (1983) also found it impossible to link the difference between the two sibilants to
the lower edge of spectral frequency. Norlin (1987) could not find differences between
the Egyptian Arabic emphatic/plain fricative pairs [s, sˤ] and [z, zˤ] using mingograms
and spectrograms. However, using critical band spectra he concluded that the spectral
center of gravity in emphatics was generally lower than in non-emphatics. Although he
also found that emphatics, on average, had higher energy dispersion and lower mean in-
tensity than non-emphatics, this was not true for all of his subjects.
For the most part, Al-Ani (1970) reports no difference in duration between em-
phatics and non-emphatics. Meanwhile, Giannini and Pettorino (1982) found that the dif-
ferences in duration between emphatics and non-emphatics demonstrate some variation:
before [a], emphatics are longer while before [i] non-emphatics are longer.
The voiced pair [ð, ðˤ], on the other hand, show detectable acoustic differences
in their spectrograms due to the absence of intense noise that would otherwise mask those
differences. Ghazeli (1977) found that both fricatives have visible formant-like structures.
For [ð], F2 is at 1600, 1600, and 1400 Hz after [i], [a], and [u], respectively. For [ðˤ], the
values are 1100, 1000, and 800 Hz.
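As a rough illustration of the size of this contrast, the F2 values cited above from Ghazeli (1977) can be tabulated and differenced. The sketch below is only arithmetic on those reported figures; the variable and function names are illustrative assumptions and are not taken from any of the cited studies.

```python
# F2 values (Hz) after [i], [a], and [u], as cited above from Ghazeli (1977).
# The dictionary and function names are hypothetical, chosen for illustration.
F2_PLAIN = {"i": 1600, "a": 1600, "u": 1400}     # plain [ð]
F2_EMPHATIC = {"i": 1100, "a": 1000, "u": 800}   # emphatic [ðˤ]


def f2_drop(plain, emphatic):
    """Return the F2 lowering (plain minus emphatic) for each vowel context."""
    return {vowel: plain[vowel] - emphatic[vowel] for vowel in plain}


drops = f2_drop(F2_PLAIN, F2_EMPHATIC)
for vowel, drop in drops.items():
    print(f"after [{vowel}]: F2 is {drop} Hz lower for the emphatic")
```

On these figures, the emphatic fricative's F2 is 500 to 600 Hz lower than that of its plain counterpart in every vowel context.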
Regarding emphatic/non-emphatic stops, Al-Ani (1970) reports that the energy
concentration in the stop burst of [tˤ] is lower than in the burst of [t]. No such difference
was reported by Giannini and Pettorino (1982) nor by Ghazeli (1977), who found that the
most visible concentration of energy in the bursts of both stops is at 4000 Hz. However,
Ghazeli found that the VOT is longer for [t] than for [tˤ] (30 msec vs. 10 or 15 msec).
Giannini and Pettorino also report that [t], as opposed to [tˤ], is sometimes followed
by aspiration. This comes in spite of Fre Woldu's (1981) finding that there is no
difference in peak intraoral pressure (expressed in mm H2O) between emphatic and
non-emphatic consonants.
The coarticulatory effect of emphatics on neighboring vowels, or emphasis spread
(ES), is a well known acoustic attribute of these sounds. The most reported effects are a
lowered F2 and a raised F1 (either at the transition only or throughout the vowel). The
rise in F1 is not reported in all studies. Al-Ani (1970) reported large F2 onset drops in
vowels following emphatic consonants as opposed to non-emphatic ones. The vowel [i]
exhibited rising transitions from emphatics while [u] had falling transitions. Meanwhile,
no major differences were noticed between the frequency values at the transition and
steady state of [a]. The absence of coarticulatory effects on [a] is somewhat surprising
since most of the other studies indicate that [a] is the vowel most susceptible to ES.
Ghazeli (1977) found that the drop in F2 extends throughout [a], while in [i] only the
onset of F2 is low, followed by a rising transition, and there is no transition in [u]. He
also found F1 is raised in all vowels. Younes (1982) found similar patterns in Northern
Palestinian. Meanwhile, F3 in vowels does not seem to reflect any coarticulatory influ-
ence by adjacent emphatics. Giannini & Pettorino (1982) found no change in F3 locus
next to emphatics while El-Dalee (1984) reports that the changes in F3 were inconsistent.
The spread of emphasis was found to be a strong perceptual cue for the presence
of a secondary articulation in the consonant. Ali and Daniloff (1974) prepared truncated
Baghdad Arabic minimal pairs of natural words. In each word, the sound that was spliced
away was an emphatic or its non-emphatic counterpart in either a word-initial or word-
final position with a vowel adjacent to it. When they presented the tokens in carrier
phrases to their ten subjects, the authors found that, in a statistically significant number of
cases, speakers could tell whether a word contains an emphatic or a non-emphatic. The
authors did not attribute the perceptibility of emphasis to any single vowel formant.
However, Obrecht (1961), who used synthetic tokens, found that the low locus of F2 next
to emphatics was successful in cuing the perception of emphaticness. He found the per-
ceptually-effective F2 locus next to emphatics, or the "zone of velarization" (he claims
that the secondary articulation in emphatics is velarization), to be between 1000 and 1400
Hz.
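Obrecht's reported zone can be expressed as a simple range check. The following is a minimal sketch, assuming that the 1000 and 1400 Hz boundaries are inclusive (the text above does not say) and that a single F2 locus value in Hz is available for the token; the function and constant names are hypothetical.

```python
# Boundaries of the perceptually effective F2 locus next to emphatics,
# as cited above from Obrecht (1961). Treating both ends as inclusive
# is an assumption made for this sketch.
ZONE_LOW_HZ = 1000
ZONE_HIGH_HZ = 1400


def in_velarization_zone(f2_locus_hz):
    """Return True if an F2 locus falls inside the reported zone."""
    return ZONE_LOW_HZ <= f2_locus_hz <= ZONE_HIGH_HZ


# Hypothetical locus values, not measurements from any study.
for locus in (900, 1200, 1800):
    print(locus, in_velarization_zone(locus))
```

A check of this kind only restates the reported range; it does not model how listeners weigh F2 against other cues.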
Some segments have been shown to be opaque to ES. This means that such segments
resist the articulatory and acoustic effects of ES and block those effects
from reaching beyond them to other segments. The most reported opaque sound is
the high front vowel [i] (e.g., Ghazeli 1977, Card 1983, Heath 1987, Younes 1993, and
Davis 1995). The semivowel [j] and the voiceless fricative [ʃ] have also been frequently
reported to be opaque to ES (e.g., Card 1983, Heath 1987, Younes 1993, Davis 1995,
Shahin 1997a, b). In Arabic dialects that possess the voiced fricative [ʒ], this sound is
also cited as an opaque segment (e.g., Heath 1987). Additionally, Shahin (1997a, b)
found that, in Abu Shusha Palestinian Arabic, the two affricates [tʃ, dʒ] are opaque to ES.
A common articulatory trait between those opaque segments is that they involve raising
and fronting of the tongue dorsum. This maneuver is antagonistic to the tongue dorsum
retraction involved in the secondary articulation in emphatics which is what gets spread
to neighboring segments. Equally important, but generally ignored, is the fact that the ar-
ticulation of those opaque segments negates the lowering of the palatine dorsum surface
which is witnessed during emphatic articulation.
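The notion of opacity can be sketched as a toy procedure in which emphasis spreads outward from the trigger until it meets an opaque segment. This is a deliberate simplification and not a model proposed in the cited studies: it assumes blocking is categorical and works the same way in both directions, and it ignores gradient weakening and word or morpheme boundaries. The segment string and all names are hypothetical.

```python
# Segments reported above as opaque to emphasis spread (ES).
OPAQUE = {"i", "j", "ʃ", "ʒ"}


def spread_emphasis(segments, trigger_index, opaque=OPAQUE):
    """Mark which segments are reached by ES from the trigger.

    Spreading proceeds leftward and rightward from the trigger and stops
    at the first opaque segment, which is itself left unaffected.
    """
    reached = [False] * len(segments)
    reached[trigger_index] = True
    for step in (-1, 1):  # -1 = leftward, +1 = rightward
        pos = trigger_index + step
        while 0 <= pos < len(segments) and segments[pos] not in opaque:
            reached[pos] = True
            pos += step
    return reached


# Hypothetical segment string with an emphatic trigger at index 0.
word = ["tˤ", "a", "b", "i", "b"]
print(list(zip(word, spread_emphasis(word, 0))))
```

In this toy string, spreading reaches [a] and the first [b] but stops at [i], so the final [b] is unaffected.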
Emphasis spread generally travels both leftward and rightward relative to the em-
phatic consonant. Most reports in the literature indicate that leftward ES is more sizable
and more constant than rightward ES. Ghazeli (1977) found that R-to-L ES is less re-
stricted than L-to-R ES. The former can be weakened but not blocked by [i] or [j], while
the latter is strongly weakened or blocked by [i]. Younes (1993), however, finds that,
while the same is true for Palestinian Arabic, Cairene Arabic exhibits the opposite trend:
ES is less restricted in the L-to-R direction than in the R-to-L direction. Zawaydeh (1999)
provides acoustic evidence suggesting that, in Ammani-Jordanian Arabic, L-to-R ES is
gradient while R-to-L is categorical. This means that, in the L-to-R direction, the acoustic
effect of ES wears off as we move further away from the emphatic consonant. The
acoustic effect of R-to-L ES, on the other hand, remains strong and relatively constant.
El-Dalee's results agree with Zawaydeh's, but he points out that while L-to-R ES is gradient, it
still lies within "the acoustic range which is assumed to induce the perception of
(emphaticness)" (p. 141).
Word boundary delimits ES as reported by Ghazeli (1977) and Younes (1993).
There is, however, disagreement on the effect of morpheme boundary on ES. In both
Palestinian Arabic and Cairo Arabic, Younes (1993) found that morpheme boundary
optionally blocks R-to-L ES but has no significant effect on L-to-R ES. Zawaydeh (1997), on
the other hand, reports that emphasis spreads obligatorily into prefixes and optionally into
suffixes. If the word ends in an emphatic, emphasis spread into suffixes becomes obliga-
tory as well. It seems that distance also plays a role in the degree of ES. Younes (1993)
found ES to be stronger on closer segments than on further ones. But this was true only in
Palestinian Arabic. In the other dialect he studied, Cairene Arabic, distance has no
bearing on ES. Zawaydeh (1997) also found that, in Ammani-Jordanian Arabic, the further
away the trigger of ES, the weaker its effect becomes. In general, the studies cited earlier
which state that L-to-R ES is gradient also provide support to the effect of distance from
the emphatic trigger.
Lehn (1968) and Broselow (1979), both of whom worked on Egyptian varieties of
Arabic, argue that the domain for ES is the syllable. According to Broselow, if a
consonant in a syllable is assigned [+RTR] (i.e., emphatic; she clearly agrees with the view
that emphatics involve an actively retracted tongue root), the node dominating the
syllable is assigned [+RTR]. Thus, all segments dominated by that node will be assigned the
same feature. However, neither Lehn nor Broselow provides acoustic verification for their
claims.
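The syllable-domain claim described above can be sketched as follows. This is only an illustration of the stated logic (a syllable containing a [+RTR] consonant makes all of its segments [+RTR]), not Broselow's own formalization; the syllabified input, the emphatic inventory passed in, and all names are assumptions made for the example.

```python
# Emphatic consonants, treated here as the segments lexically assigned [+RTR].
EMPHATICS = {"tˤ", "dˤ", "ðˤ", "sˤ"}


def spread_within_syllables(syllables, emphatics=EMPHATICS):
    """Assign [+RTR] to every segment of any syllable containing an emphatic.

    `syllables` is a list of syllables, each a list of segment strings.
    Returns the same structure with each segment paired with a boolean
    that is True when the segment ends up [+RTR].
    """
    result = []
    for syllable in syllables:
        syllable_is_rtr = any(seg in emphatics for seg in syllable)
        result.append([(seg, syllable_is_rtr) for seg in syllable])
    return result


# Hypothetical two-syllable form: only the first syllable holds an emphatic.
print(spread_within_syllables([["sˤ", "a"], ["b", "i"]]))
```

Under this sketch the second syllable stays plain regardless of its segments, which is the sense in which the syllable, rather than the word, is the domain of spreading.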
In sum, Arabic emphatics, [tˤ, dˤ, ðˤ, sˤ], are a set of coronal obstruents that
involve a secondary articulation in the form of a retracted tongue dorsum resulting in a
narrowing in the upper portion of the pharynx. This retraction is accompanied by a small
retraction by the lower part of the anterior wall of the pharynx and the epiglottis. Emphatics
are generally associated with a lowered F2 and raised F1 in adjacent vowels in
comparison with their non-emphatic counterparts. This acoustic effect, known as 'emphasis
spread' (ES), reaches far in both directions and is usually blocked or weakened by high
front sounds like [i], [j], and [ʃ]. The F2 drop has been shown to be a reliable cue for the
perception of emphatics.
2.2.2 Uvulars
Early Arab grammarians noticed that the two uvulars [ʁ, χ] hold an articulatory
affinity to the other guttural sounds in terms of place of articulation. They observed
that the articulation of those two uvulars was a pharyngeal rather than oral one, but,
unlike other gutturals, uvulars are articulated at a point in the pharynx very close to the
mouth. Sibawayh described these two uvulars as "the sounds whose point of articulation
is (at the part of the throat) bordering the mouth" (Kitab Sibawayh, vol. 2, p. 405). The
possibility of an oral participation in the articulation of uvulars was suggested by Ibn
Jinni (d. 1002 A.D.) who noted that uvulars are pronounced at the upper end of the throat,
along with the edge of the mouth. This is an important observation supported later by
modern phonetic and phonological research. As for the uvular stop [q], it was considered
by Arab grammarians as an oral stop whose point of articulation, according to Sibawayh,
is "at the portion of the tongue furthest back and the part of the palate just above it". An-
other important observation by Arab grammarians that received modern support is the
articulatory (and auditory) similarities between uvulars and emphatics. Sibawayh, and
later Ibn Jinni and Ibn Al-Jazari (d. 1429 A.D.), grouped these two sets of sounds into the
class of mustaʕliyah sounds. This term is derived from the Arabic word ʔistiʕlaaʔ, which
is described by Sibawayh as the elevation of the tongue towards the palate.
Modern studies offer somewhat similar articulatory accounts of these sounds. The
schematics in Figure 2.6 show the articulatory configurations of the three Arabic uvulars
[χ, ʁ, q]. Catford (1977) describes the articulation of uvulars, including the Arabic set of
[χ, ʁ, q], as moving the rear-most portion of the tongue surface towards the posterior soft
palate and the uvula. For this reason, he terms their articulation dorso-uvular. This
general description is supported by the published x-ray investigations of these sounds,
although, at least in the case of Arabic, the articulation of [ʁ, χ, q] is more complicated
than Catford's brief account. Based on successive x-ray frames of a single Lebanese
speaker, Delattre (1971) describes a rather dynamic articulation for the three uvulars [ʁ],
[χ], and [q]. In his account, the tongue slides horizontally backwards then moves
upwards to create a constriction in the upper pharynx. For this reason, he provided two
illustrative tracings for the two uvulars [ʁ] and [χ], one for each movement (only one
tracing of [q] is provided). This curved path followed by the back of the tongue is common
among the three sounds. The articulation of [ʁ], as shown in Delattre's tracings, also
involves a downward curling of the uvula towards the raised back of the tongue causing a
slight trill which he notes to be "hardly noticeable on spectrogram, and even does not
always take place" (p. 135). The articulation of [χ] is generally similar except that it is
narrower than that of [ʁ] and does not involve similar participation from the uvula, which,
according to the author's descriptions and x-ray tracings, is held flat over the back of the
tongue. When comparing the uvula position during the articulation of [χ] to those
accompanying the other fricatives reported by Delattre, it appears clearly that the flattened
shape of the uvula is somewhat unique. The author explains that this configuration is
intended "to prolong the stricture and contribute to the production of friction turbulence"
(p. 137). It is quite visible from comparing the tracings for [ʁ] and [χ] that the narrower
constriction of the latter is achieved by a higher and more backed position of the tongue.
This position seems to be exaggerated further during the articulation of [q] to achieve a
full occlusion.
Figure 2.6. Schematic illustrations of the vocal tract configurations during the articulation of the Arabic uvulars. These schematics are based on descriptions and illustrations in Delattre (1971) and Ghazeli (1977).
In general, the area and manner of uvular constrictions reported in the x-ray inves-
tigation of Ghazeli (1977), who served as his own subject during the x-ray part of his
study, were similar to those of Delattre (1971). However, the tongue back positions
during the articulations of [ʁ] and [χ] as reported in the two studies are different. Ghazeli
noted that the tongue dorsum is retracted more during [ʁ] while the place of constriction
for [χ] falls between those for [q] and [k]. Furthermore, Ghazeli's descriptions
and x-rays indicate that the anterior wall of the pharynx as well as the epiglottis is pulled
backwards towards the posterior wall of the pharynx. Delattre reported no such
adjustments. Ghazeli also reports slight raising of the larynx during [χ] and [q], but not [ʁ].
While not expressed by Ghazeli, his tracings show that the tongue is backed the most
during [q]. Accordingly, the pharyngeal volume above the epiglottis is smaller during [q]
than during [ʁ] or [χ]. A possible reason for this is that the occlusive nature of [q]
demands a full articulatory seal at the vertical as well as the horizontal surfaces of the uvula,
causing more raising and backing of the tongue.
Acoustically, the voiced uvular [ʁ] features somewhat vowel-like formants
throughout its duration accompanied by some weak noise pointing to a mildly fricative
manner of articulation (Al-Ani 1970). The formant-like spectral structures are subject to
coarticulatory conditioning by neighboring vowels. Al-Ani actually refers to these
formant-like structures as a "continuation of F1, F2, and F3" of neighboring vowels. Ghazeli
(1977) reports that F1 of [ʁ] ranges from 500 to 600 Hz next to the low vowel [a] while
F2 ranges from 1200 to 1300 Hz. Next to [i] and [u], F1 is lower, while F2 is raised next to
[i] and lowered next to [u] (though no precise numbers were given). Both Al-Ani and
Ghazeli describe the spectrograms of the voiceless uvular [χ] as aperiodic noise. The
lower limit of spectrographic energy reported by Ghazeli ranges from 600 to 1500 Hz,
depending on the subject. Al-Ani, on the other hand, explains that the lower limit of the
spectral energy depends on the vowel context: around 1500 Hz, 1000 Hz, and 800 Hz
next to [i], [a], and [u], respectively.
Acoustic investigations also show that, like emphatics, uvulars spread emphasis
into neighboring vowels. There are differences in the reports of the size and domain of
ES from uvulars. Al-Ani (1970) reports that, next to [ʁ] and [χ], F2 onset value in [i] is
lowered to 1800-1900 Hz while F2 onset in [u] is raised to 1350 Hz. As for F2 onset in
[a], there was a stronger coarticulatory effect from [ʁ] (1250-1300 Hz) than from [χ]
(1350-1500 Hz). The coarticulatory effect exhibited by [q] is stronger still: F2 onset
values were 1600 Hz, 1150-1200 Hz, and 900 Hz next to [i], [a], and [u], respectively.
Unfortunately, Al-Ani did not report any F1 onset values, which one would expect to be
somewhat raised next to uvulars. Giannini & Pettorino (1982) found that F1 locus next to
[ʁ] and [χ] is at 500 Hz, while that of F2 is at 1500 Hz. F1 and F2 loci next to [q] are at
500 Hz and 1400 Hz, respectively. Interpreting these values in the light of nodes and
antinodes of F1 and F2, Giannini & Pettorino conclude that [ʁ] and [χ] are articulated at
the same place as [q] and that all three sounds are uvular.
While Heath (1987), who studied Moroccan Arabic, states that the coarticulatory
effect of uvulars is as strong as that of emphatics, the general agreement is that ES from
uvulars is somewhat milder and does not reach as far. Ghazeli (1977) notes that ES from
uvulars does not affect adjacent high vowels, adjacent consonants, or non-adjacent seg-
ments. Kuriyagawa (1984), who studied only the uvular stop [q] along with emphatics in
Standard Arabic spoken by an Egyptian subject, also found that, while the coarticulatory
effect of that uvular on vowels is qualitatively similar to that of emphatics, it does not
reach into the following syllable. Similarly, El-Dalee (1984) found that, in Egyptian Ara-
bic, ES from uvulars affects the adjacent vowel only.
To sum up, Arabic has two uvular continuants, the voiceless [χ] and the voiced
[ʁ], and one uvular stop [q]. These sounds are produced with a general raising and
retraction of the tongue dorsum towards the soft palate. This maneuver is comprised of two
movements: the dorsum is first pulled back, and then it is raised towards the uvular
region. In [χ] and [q], the uvula is flattened and held up. In [ʁ], the uvula is curled
downwards towards the tongue. Acoustically, [ʁ] shows mild frication with formant-like
structures while [χ] shows aperiodic noise. All three uvulars spread emphasis onto
neighboring vowels. However, emphasis spread from uvulars is generally not as sizeable
nor as far-reaching as emphasis spread from emphatics.
2.2.3 Pharyngeals
According to Sibawayh's and Ibn Jinni's descriptions, the two pharyngeals [ʕ]
and [ħ] are articulated "at the middle of the throat". While rather vague, this description
does capture the fact that the main point of articulation for pharyngeals lies in between
those of laryngeals and uvulars. This has been largely confirmed in modern articulatory
studies. The schematic in Figure 2.7 illustrates the general articulatory configuration of
Arabic pharyngeal consonants. Unfortunately, the x-ray tracings provided by Al-Ani
(1970) for the two pharyngeals [ħ, ʕ] do not cover the mid and low regions of the
pharynx. Interestingly, Al-Ani claims that the most common allophone of [ʕ] is a voiceless
stop while in intervocalic positions it is realized as either a stop or a glide. While it is true
that [ʕ] surfaces as a stop in certain dialects of Arabic (Al-Ani bases his conclusion on
acoustic data obtained from Iraqi subjects), most of the published phonetic literature
clearly indicates that it is always realized as a voiced fricative or approximant. Such wide
variation in the degree of constriction for [ʕ] was also exhibited by Hebrew subjects
(Laufer and Condax 1979, 1981).
Figure 2.7. A schematic illustration of the vocal tract configuration during the articulation of an Arabic pharyngeal consonant. This schematic is based on descriptions and illustrations in Delattre (1971) and Ghazeli (1977).
The tracings in Delattre (1971) and Ghazeli (1977) cover the whole vocal tract.
Both studies report that Arabic pharyngeals are articulated mainly by retracting the
tongue root towards the posterior pharynx wall with the narrowest constriction taking
place at the level of the epiglottis. Delattre notes that the constriction for [ħ] is lower than
that for [ʕ]. He also notes, as does Ghazeli, that the constriction is narrower for [ħ]
than for [ʕ] since, as a voiceless fricative, [ħ] requires a narrow constriction to produce
adequate turbulence in the air stream. Other important articulatory movements reported
by Ghazeli are raising the larynx and a forward movement of the lowest part in the
posterior wall of the pharynx. While x-ray tracings reveal a backward displacement in the
tongue root and the epiglottis, the anatomical makeup of the musculature involved
suggests a rather two-dimensional annular gesture that cannot be reflected in lateral x-rays.
Catford (1977) describes the articulation of [ħ] and [ʕ] as "largely a sphincteric
semi-closure of the oro-pharynx" (p. 163).
Ladefoged and Maddieson (1996) maintain that Semitic [ħ] and [ʕ] are "neither
pharyngeals nor fricatives" (p. 168), arguing instead that these sounds are epiglottal
approximants. Most available accounts of these sounds (see acoustic descriptions below) do
support the view that Arabic [ħ] and [ʕ] are approximants. In Butcher and Ahmad's
(1987) words, [ħ] and [ʕ] are "formed in a region of the vocal tract where true fricatives
are very difficult to produce" (p. 156). However, Ladefoged and Maddieson's claim that
these sounds are epiglottals and not pharyngeals is not without problems. In the same
book, Ladefoged and Maddieson describe the connection between the epiglottis and the
tongue root as follows:
"The relation between the root of the tongue and the epiglottis is similar to
that between the tip and the blade of the tongue. They can be moved sepa-
rately, but because of their proximity only one or the other can be the
principal articulator in any given sound." (p. 11)
Hence, Ladefoged and Maddieson's claim that [ħ] and [ʕ] are epiglottals rather than
pharyngeals means that the epiglottis is the organ which moves backwards to make the
constriction. This claim is expressed by Laufer and Condax (1981) based on fiberscopic
data. Laufer and Condax assert that the tongue does not participate in the articulation of
these sounds. These positions are at odds with the cited x-ray accounts that clearly show
that the root of the tongue is pulled backwards causing an unavoidable retraction of the
epiglottis along with it. If the epiglottis were moved independently (through the action of
the aryepiglottic muscles), we would not expect a consistently concomitant retraction of
the root of the tongue. An x-ray investigation by Boff Dkhissi (1983) (cited in Ladefoged
and Maddieson (1996)) concludes instead that the movements of the tongue root and the
epiglottis are not independent of each other and that the constriction is made by the two
organs jointly. The stronger argument seems to be that [ħ] and [ʕ] are made, primarily, by
a retracted tongue root. This retraction pulls the lower surface or the posterior wall of the
pharynx as well as the epiglottis backwards along with the tongue root.
Meanwhile, the tongue body assumes a mid position inside the mouth in an [a]-like
fashion. These positions are clearly seen in Delattre's and Ghazeli's x-ray tracings.
In reference to this oral configuration, Ghazeli actually describes a second narrowing
taking place in the oral tract during pharyngeals approximately 6 cm behind the lips.
Acoustically, Ghazeli (1977) explains that [ʕ] has vowel-like formant structures.
F1 and F2 of [ʕ] fall between 650-900 Hz and 1300-1700 Hz, respectively, depending on
the vowel context. The spectrogram of voiceless [ħ], on the other hand, has aperiodic
noise together with formant structures. The value ranges for F1 and F2 are 550-1100 Hz
and 1100-1800 Hz, respectively. The noticeably high F1 values, according to Ghazeli,
are credited to the very low place of constriction in pharyngeals as well as the relatively
wide oral cavity. It is worth noting that, while Al-Ani describes [ʕ] as a voiceless stop,
his acoustic account of [ħ] is generally similar to that of Ghazeli. It seems that, for the
particular Iraqi dialect studied by Al-Ani, the contrast between pharyngeals is not in
voicing, but rather in degree of constriction.
In regards to the coarticulatory impact of pharyngeals on neighboring vowels,
Ghazeli (1977) reports only small effects: raising of F1 throughout the vowel and long
transitions in F2. Al-Ani (1970), on the other hand, notes that F1 in vowels neighboring
[ʕ] is much higher than their usual values (400 Hz or higher, up from the prototypical
275-300 Hz in [i]; no reports on F1 in [a] or [u]). Al-Ani also reports that F2 of [i] starts
at 1500 Hz or lower while F2 in [u] rises to 950 Hz and in [a] it drops to 1250-1350 Hz.
The F2 drop in [a] extends throughout the vowel, not just at the onset. The F2 starting
values following [ħ] were not as low: 1750-1900 Hz, 900 Hz, and 1300-1450 Hz for [i],
[u], and [a], respectively. It looks from these values and patterns that the Iraqi Arabic
version of [ʕ] spreads emphaticness in a manner similar to that exhibited by uvulars. Recall,
however, that Al-Ani describes [ʕ] as a stop. It is possible that, in order to attain an
occlusion in the pharyngeal area, the whole tongue mass needs to be retracted along with
the tongue root to facilitate the full contact with the posterior wall of the pharynx. In this
way, the articulation of Iraqi Arabic [ʕ] resembles to a certain degree the retracted
articulations involved in emphatics and uvulars. Unfortunately, it is not possible to verify this
claim since, as mentioned earlier, Al-Ani's x-rays of [ʕ] and [ħ] do not cover the middle
and lower pharynx.
The most observable effect of pharyngeals on neighboring vowels is a rise in F1.
An extensive acoustic account of this effect is presented in Butcher and Ahmed (1987).
The authors found that both pharyngeals are accompanied by a raised F1 at the steady
state of the neighboring vowel. In addition, there is a rising F1 transition from the vowel to the
pharyngeal consonant. Alwan (1989) studied this effect as a perceptual cue for pharyngeals
using synthesized speech samples. When F1 values were high, the guttural sound
was perceived as [ʕ], while lower values cued the perception of the uvular [ʁ]. El-Halees
(1985) arrives at similar conclusions for the same uvular/pharyngeal pair as well as their
voiceless counterparts [χ, ħ]. He concludes that F1 is a strong perceptual cue for distinguishing
sounds made in the posterior portion of the vocal tract: the sounds produced
further back correspond to higher F1 transitions. While other researchers stress the importance
of F2 onset for the perception of place of articulation in consonants, Alwan
(1989) notes that it is F1 that plays a significant role in distinguishing uvulars from pharyngeals.
She explains that this is because in other consonants, F1 starts at a somewhat similar
low hub while after uvulars and pharyngeals it starts at a higher point, creating a distinction
between orals and gutturals. Among the gutturals, F1 further helps in distinguishing
between pharyngeals and uvulars: F1 is usually higher next to pharyngeals than next to
uvulars.
In short, Arabic has two pharyngeal sounds: the voiceless [ħ] and the voiced [ʕ].
Both sounds involve a low pharyngeal constriction due to the retraction of the tongue root
and the epiglottis. Articulation of pharyngeals is also reported to involve raising the larynx
and advancing the lower part of the posterior wall of the pharynx. Meanwhile, the
tongue body is held in a medial position in the oral cavity. While there are different reports
regarding the degree of constriction of Arabic pharyngeals, they are more convincingly
described as approximants. Acoustically, both sounds show vowel-like formant
structures throughout their articulation. The voiceless [ħ] also has aperiodic noise. Arabic
pharyngeals are generally associated with high F1 in neighboring vowels.
2.2.4 Laryngeals
Early Arab linguists noted that the two laryngeals are articulated lower and further
back than any other speech sounds. Sibawayh describes the two laryngeals [h] and [ʔ]
(along with the vowel [a]) as "The sounds whose point of articulation is the furthest
(down the throat)". Ibn Sina offers a more detailed articulatory description. In his account,
the glottal stop is formed by a laryngeal obstruction of the pulmonic air pressure
"which is then, expelled, being forced out both by (the activity of) the muscles (which
cause the larynx to be) opened and by the air pressure" (Semaan 1963:35). He notes that
the articulatory mechanism for [h] is quite similar, except that "the obstruction is not
complete, but is modified by the edges of the exit (in the larynx)" causing a perturbation
in the exiting air stream.
The only modern instrumental account of the articulation of laryngeals in Arabic
is Zawaydeh (1999). She reports in her fiberscopic study that the pharyngeal area during
the articulation of the two Arabic laryngeals is as wide as it is during the articulation of
plain oral sounds. By comparison, the pharynx is significantly narrower during the articulation
of emphatics, uvulars, and pharyngeals. Generally speaking, it is unlikely that Arabic
laryngeals differ articulatorily from the most cross-linguistically common forms of
[h] and [ʔ]. Catford (1977) terms these two sounds glottals since they form a subset of
the possible laryngeals (the remaining members of which are made primarily by the
action of the ventricular bands). Using the term laryngeals to refer to Arabic [h] and [ʔ]
is acceptable since they do not contrast with any other laryngeals in the language. Curiously,
the laryngeal fricative [h] is described by Al-Ani (1970) as an oral voiceless fricative.
It is possible that the presence of oral configurations coarticulated from neighboring
vowels led him to believe that these configurations were required for the articulation of
[h]. By definition, however, these configurations would vary depending on vowel contexts
and cannot be ascribed to unique articulatory demands of [h].
Al-Ani (1970) describes [h] acoustically as noise whose starting frequency depends
on the adjacent vowel: 2000-2700 Hz next to [i], 1500-2000 Hz next to [a], and
1200 Hz next to [u]. The spectral shape of the glottal stop [ʔ] varies: when single, [ʔ] appears
as a series of glottal pulses that look somewhat like formants and are more widely
spaced than the glottal pulses one sees in vowels. When geminated, it appears as a long
silent gap. While Al-Ani reports only slight or no coarticulatory impact of laryngeals on
neighboring vowels, Zawaydeh (1999), based on an acoustic investigation of Ammani-Jordanian
Arabic, reports that a following [a] has significantly higher F1 values after gutturals
(including laryngeals) and emphatics than after other
sounds. Since it was possible that the high F1 of the low vowel might not be due to a raising
effect from the guttural, but rather a non-lowering effect (since next to oral obstruents,
F1 usually starts from a very low hub), Zawaydeh conducted a follow-up test of the
coarticulatory effect on the high vowel [i], which has a low F1 to begin with. Her results
indicate that, following emphatics, uvulars, pharyngeals, and laryngeals, F1 in [i] was
significantly higher than when following plain orals. This result is somewhat surprising
since laryngeals usually have no vocal tract constrictions of their own above the glottis.
As noted earlier, in the articulatory part of her study, Zawaydeh maintains that there is no
narrowing in the pharynx during laryngeal articulations in Ammani-Jordanian Arabic. It
is possible that some degree of larynx raising is involved in the production of Ammani-Jordanian
Arabic laryngeals or that Zawaydeh's subjects produced these sounds with
much wider mouth and lip openings than usual.
Table 2.1. A summary of the phonetic attributes of Arabic emphatic, uvular, pharyngeal, and laryngeal sounds.

Articulation
  Emphatics
    • Main coronal articulation.
    • Tongue dorsum retracted into upper pharynx.
    • Lowered palatine dorsum.
    • Mildly retracted tongue root and epiglottis.
  Uvulars
    • Tongue dorsum raised and retracted into upper pharynx.
    • [q] and [χ]: Uvula flattened and raised.
    • [ʁ]: Uvula curled downwards.
    • Mildly retracted tongue root and epiglottis.
  Pharyngeals
    • Tongue root and epiglottis retracted into lower pharynx.
    • Raised larynx and forward movement of posterior wall of pharynx also reported.
    • Tongue body held in mid oral location.
  Laryngeals
    • [h] is articulated with an open glottis.
    • [ʔ] is articulated with a constricted glottis.
    • No pharyngeal or oral constriction is involved.

Acoustic Shape
  Emphatics
    • No consistent reports of spectral differences between emphatic and non-emphatic consonants.
    • Emphatic stops are followed by shorter VOTs than non-emphatic ones.
    • This area is in need of more extensive investigation.
  Uvulars
    • [ʁ] has formant-like structures.
    • [χ] appears as aperiodic noise.
    • [q] is a stop.
  Pharyngeals
    • While sometimes referred to as fricatives, Arabic pharyngeals are more convincingly described as approximants.
    • [ʕ] has strong formant-like structures.
    • [ħ] has aperiodic noise alongside the formant-like structures.
  Laryngeals
    • [h] appears as aperiodic noise.
    • [ʔ] appears as glottal pulses when single and as a silent gap when geminated.

Coarticulatory Effect
  Emphatics
    • Spread emphasis (ES) onto other sounds (rise in F1 + fall in F2 in vowels).
    • ES travels in both directions and is a far-reaching effect.
    • ES is often blocked by [i], [j], and [ʃ]. Other blockers are also reported.
    • Domain of ES varies depending on dialect.
  Uvulars
    • Spread emphasis (ES) onto adjacent sounds.
    • ES not as strong nor as far-reaching as ES from emphatics.
  Pharyngeals
    • Associated with high F1 in adjacent vowels.
  Laryngeals
    • Not associated with any particular coarticulatory effects on other sounds.
In sum, Arabic has two laryngeals: [h] and [ʔ]. The only articulatory study on
these sounds indicated that they do not involve any pharyngeal narrowing. Like most of
their counterparts in other languages, Arabic laryngeals do not involve any supraglottal vocal
tract configurations of their own. Acoustically, [h] appears as noise whose starting frequency
depends largely on the vowel context. [ʔ] appears as widely spaced glottal pulses
when single and as a silent gap when geminated. Most studies report no coarticulatory
effects of Arabic laryngeals on neighboring vowels. However, Zawaydeh (1999) notes
that laryngeals, like pharyngeals, are associated with high F1 in neighboring vowels.
A summary of the phonetic qualities of Arabic emphatics, uvulars, pharyngeals,
and laryngeals is provided in Table 2.1.
2.3 Gutturals as a Natural Class
This section reviews some of the phonological evidence presented in support of
the treatment of guttural sounds as a single natural class in terms of place of articulation.
Most of the evidence discussed in the literature comes from Semitic languages,
although Cushitic and Interior Salish evidence for this classification has also been presented.
It should be noted, however, that membership in this class differs across languages.
In Arabic, for example, uvulars, pharyngeals, and laryngeals form the guttural
class. On the other hand, the Interior Salish guttural class includes uvulars, pharyngeals,
and retracted alveolars (emphatics; see §2.3.1 below), but not laryngeals.
The present discussion focuses on two general types of evidence that are consid-
ered to be the most compelling as well as the most reported cross-linguistically. The first
type of evidence is the guttural-conditioned morpheme structure constraints. The selected
examples of this type come from Arabic, Qafar, and Moses-Columbian. The second type
of evidence is the guttural-conditioned vowel lowering processes. Selected examples of
this type come from Arabic, Hebrew, Maltese, and Tigrinya.
2.3.1 Morpheme Structure Constraints
In the canonical roots of Arabic, as well as other Semitic languages, adjacent
identical consonants are strictly prohibited. This explains the complete absence of roots
like * kkb or * t 77. This constraint is explained in McCarthy (1986) as a function of the
Obligatory Contour Principle (OCP), originally proposed by Leben (1973) to account for
the prohibition of adjacent identical tones in lexical representations.6 The function of this
principle was extended further to apply to Place features, and not just full segments
(Mester 1986; McCarthy 1988, 1991, 1994; Yip 1989). Thus, the cooccurrences of homorganic
consonants in Arabic roots are avoided because they would project their identical
Place feature on the same tier, making them adjacent on that particular tier. So, a root like
*fbk, which has two adjacent labials, is either rare or totally absent from the Arabic lexicon.
This avoidance applies also, but less forcefully, to non-adjacent consonants. It should
be noted that sequences of identical segments are totally prohibited, while the avoidance
of homorganic segments, though somewhat strict, is not absolute. For the prohibition to
be invoked by the place feature [coronal], this particular feature has to be taken in conjunction
with the major class features [sonorant] and [continuant], though not without
exceptions. In general, the cooccurrence of any two members of one of the sound classes
in (8) is avoided.
6 The position taken in this dissertation is that the OCP is a well-supported grammatical principle. Some linguists question the status of the OCP in phonology, arguing that it is violable and exception-ridden (see Odden 1986, 1988; Blevins 2004). However, the phonological evidence in support of the status of the OCP comes from numerous languages and affects different phonological units (tones, features, segments). Such evidence is quite compelling and hard to ignore. Furthermore, many cases of apparent OCP violations have been accounted for quite systematically (see McCarthy 1994 and references therein; Padgett 1995; see also §6.3.1 of this dissertation).
(8) Arabic sound classes subject to OCP-based MSCs (from McCarthy 1994:204).
a. Labials: b f m
b. Coronal sonorants: l r n
c. Coronal stops: t d tˤ dˤ
d. Coronal fricatives: θ ð s z sˤ zˤ ʃ
e. Velars: g k q
f. Gutturals: χ ʁ ħ ʕ h ʔ
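The avoidance pattern summarized in (8) can be illustrated with a short sketch. The Python below is only an illustration and is not part of McCarthy's analysis: the class table transcribes (8) in IPA, while the function names (`place_class`, `ocp_violations`) and the choice to treat a root as a list of symbols are assumptions made here. It flags adjacent root consonants that belong to the same class and reports identical consonants separately, since the ban on them is absolute while homorganic avoidance is only a strong tendency.

```python
# Sound classes from (8); the IPA symbols are an assumption of this sketch.
OCP_CLASSES = {
    "labial": {"b", "f", "m"},
    "coronal sonorant": {"l", "r", "n"},
    "coronal stop": {"t", "d", "tˤ", "dˤ"},
    "coronal fricative": {"θ", "ð", "s", "z", "sˤ", "zˤ", "ʃ"},
    "velar": {"g", "k", "q"},
    "guttural": {"χ", "ʁ", "ħ", "ʕ", "h", "ʔ"},
}


def place_class(consonant):
    """Return the class name from (8) for a consonant, or None if unlisted."""
    for name, members in OCP_CLASSES.items():
        if consonant in members:
            return name
    return None


def ocp_violations(root):
    """List (c1, c2, reason) for adjacent consonants that are identical or homorganic."""
    found = []
    for c1, c2 in zip(root, root[1:]):
        if c1 == c2:
            found.append((c1, c2, "identical"))
        else:
            cls = place_class(c1)
            if cls is not None and cls == place_class(c2):
                found.append((c1, c2, cls))
    return found
```

For the hypothetical root *fbk, `ocp_violations(["f", "b", "k"])` reports the labial pair, while a root such as k-t-b yields no violations. The sketch inspects adjacent consonants only; the weaker avoidance between non-adjacent consonants is not modeled.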
McCarthy (1994) provides statistical support for the Place-related cooccurrence
restriction. Table 2.2, which is a reproduction of McCarthy's Figure 12.1 (1994:204), lists
the frequencies of adjacent consonant combinations in Arabic roots.7 Consonants listed in
the row headers are linearly ordered before the ones listed in the column headers in the
consonantal root. The statistical assessment of the frequencies was obtained through χ²
tests on 1 df (for explanation of the test parameters see Padgett 1995). Combinations of
identical consonants were excluded from these tests since there is an absolute ban on
them. What is important is that the frequencies of cooccurrences of uvulars, pharyngeals,
and laryngeals with each other are significantly lower than expected. The fact that the
members of the three subclasses of gutturals do not freely cooccur is taken as an indication
that these subclasses should be combined to form the larger natural class of gutturals,
much like the bilabial [b], the labiodental [f], and the bilabial nasal [m] form the natural
class of labials.
7 As one can see in Table 2.2, there are phonetic symbols that are not present in the inventory of MSA shown in (3) in the previous chapter. In his original figure, McCarthy (personal communication) uses the symbol [Z] to represent the emphatic interdental [ðˤ]. Also, McCarthy lists the velar stop [g] among the Arabic consonants. This symbol represents the Arabic jim ([dʒ]), which patterns as (and originates from) [g]. See Clark and Yallop (1995:372) for further explanation.
Table 2.2. Frequencies of consonant cooccurrences in Arabic roots (from McCarthy 1994:204).
tˤdˤ θð ʃ gk q χʁ ʕħ ʔh lr n wj
bfm 43 43 31 79 44 180 40 91
td 10 32 21 69 20 51
tˤdˤ 9 25 11 59 14 38
θð 4 5 9 44 3 24
sz 19 40 24 75 21 65
sˤzˤ 4 16 7 38 5 24
ʃ 10 33 8 37
gk 75 24 90 29 47
q 51 11 15 6 45 12 31
χʁ 70 18 31 13 23 13 >o 63 13 42
ʕħ 91 42 29 17 35 27 2 83 28 60
ʔh 67 32 10 10 29 4 8 25 6 2 65 16 54
lr 149 51 36 15 58 20 20 66 48 29 74 42 0 91
n 55 23 19 7 26 12 14 31 26 16 28 21 2 X 51
wj 83 44 31 14 44 14 18 34 33 20 41 29 89 26 ·zjm
p < 0.05 p < 0.005
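For readers unfamiliar with the test, the following sketch shows one common way to compute a χ² statistic on 1 df, namely Pearson's statistic for a 2×2 table of counts. It is an illustration only: the exact procedure behind Table 2.2 is not spelled out here, so the 2×2 layout is an assumption, the function names are hypothetical, and the counts used in the usage note are invented rather than taken from the table. The two critical values correspond to the significance levels in the table legend.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Critical values of the chi-square distribution with 1 df.
CRITICAL_1DF = {0.05: 3.841, 0.005: 7.879}


def significant(stat, alpha=0.05):
    """True when the statistic exceeds the 1-df critical value for alpha."""
    return stat > CRITICAL_1DF[alpha]
```

With invented counts, `chi_square_2x2(2, 98, 198, 702)` gives 22.5, which exceeds both critical values. Here the first cell stands for roots in which both consonants come from the classes under test and the remaining cells for the other three combinations. A statistic this large together with an observed count below the expected one is the pattern described above for guttural pairs.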
Combinations of uvulars and velars are also avoided. As discussed later, all major
views concerning the representation of these sounds indicate that uvulars are complex
sounds with pharyngeal and dorsal components (see Elorrieta 1991 for extensive discussion
supporting this view). It is, then, the dorsal place in the representations of both uvulars
and velars that motivates the avoidance of their combinations. The uvular stop [q]
rarely cooccurs with other uvulars or velars. Again, [q] is argued to have pharyngeal and
dorsal components. However, as Table 2.2 shows, this sound cooccurs freely with lower
gutturals. Note also that emphatics cooccur freely with all gutturals even though they are
argued to contain a pharyngeal component representing their secondary articulation.
Meanwhile, all emphatic and velar combinations are noticeably infrequent. These issues
will be discussed later at different points of this dissertation.
The restrictions on consonant cooccurrences are found in non-Semitic languages
as well. One such language is Qafar, an East Cushitic language discussed in Hayward
and Hayward (1989). However, the root restrictions in Qafar work somewhat differently.
Roots may contain either identical or non-homorganic consonants. The classes of homorganic
consonants in Qafar are listed in (9). Again, the classification of the two pharyngeals
[ʕ, ħ] with the laryngeal [h] is taken as an indication that they constitute a natural
class of homorganic consonants.
(9) Qafar homorganic consonants (from Hayward and Hayward 1989:183).
a. Labials: b f
b. Coronal sonorants: l r
c. Coronal stops: t d ɖ
d. Velars: g k
e. Gutturals: ʕ ħ h
Another non-Semitic language that exhibits similar restrictions on root consonants
is Moses-Columbian, an Interior Salish language. According to Bessell and Czaykowska-Higgins
(1992), a morpheme structure constraint prevents pharyngeals, uvulars, and retracted
alveolars from occurring in the second consonant location if the same root has a
pharyngeal in its first consonant location. The avoided combinations are listed in (10).
There are some interesting points to note here. First, unlike the Semitic root cooccurrence
restriction, which affects all sound classes, the Moses-Columbian constraint pertains only
to gutturals. Second, while Moses-Columbian has laryngeal consonants, they are not affected
by the constraint. In fact, the whole purpose of the Bessell and Czaykowska-Higgins
article was to show that laryngeals are placeless and are not part of the guttural
class in Interior Salish languages (see Rose 1996, however, for counterarguments). The
third point of interest is that, according to Bessell and Czaykowska-Higgins (1992), the
retracted alveolars of Interior Salish "resemble the emphatic consonants of Arabic in both
phonological and phonetic properties" (p. 37). As a matter of fact, in citing the same Bessell
and Czaykowska-Higgins article, both Rose (1996) and Zawaydeh (1999) term the
Interior Salish retracted alveolars 'emphatics'. It is the case, then, that unlike in Arabic,
the Moses-Columbian MSC addresses gutturals (excluding laryngeals) and emphatics as
one natural class in terms of place of articulation.
(10) Consonant combinations that are disallowed in Moses-Columbian (from Bessell and Czaykowska-Higgins 1992:42).
a. *Pharyngeal (V) Pharyngeal
b. *Pharyngeal (V) Uvular
c. *Pharyngeal (V) Retracted alveolar
2.3.2 Guttural Lowering
In several languages, a strong link between guttural consonants and low vowels
has been noticed. Numerous cross-linguistic phonological processes involve vowels being
lowered or epenthetic vowels surfacing as low vowels in guttural contexts. A prominent
example reported by McCarthy (1991, 1994) involves the type of the second vowel in
Arabic imperfect verbs. This vowel almost always surfaces as [a] if a guttural precedes or
follows it (411 of 436 instances). Examples are given in (11).
(11) Perfect/imperfect vowel alternations in Arabic verbs (from McCarthy 1991:69; 1994:207).
Plain roots                          Guttural roots
Perf.    Imperf.                     Perf.    Imperf.
katab    jaktub    'write'           faʕal    jafʕal    'do'
dˤarab   jadˤrib   'beat'            radaʕ    jardaʕ    'prevent'
ʃarib    jaʃrab    'drink'
balud    jablud    'be stupid'
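The generalization behind (11) can be stated as a small rule. The sketch below is a hedged illustration rather than McCarthy's formalization: it assumes triconsonantal roots given as three symbols, takes the guttural set from (8), and returns None for plain roots, whose imperfect vowel is lexically determined (compare jaktub, jadˤrib, and jaʃrab). The function name is hypothetical, and the exceptions among the 436 verbs are not modeled.

```python
GUTTURALS = {"χ", "ʁ", "ħ", "ʕ", "h", "ʔ"}


def imperfect_stem_vowel(root):
    """Return 'a' when the second vowel of the imperfect is flanked by a guttural.

    The vowel stands between the second and third root consonants
    (ja-C1C2-V-C3), so only C2 and C3 are inspected.
    """
    _, c2, c3 = root
    if c2 in GUTTURALS or c3 in GUTTURALS:
        return "a"
    return None  # lexically specified in plain roots


# Share of guttural-context verbs with [a], from the counts cited above.
share_with_a = 411 / 436
```

The share works out to roughly 94%, which is why the pattern is described as holding almost always rather than without exception.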
McCarthy also provides a Hebrew example of guttural-conditioned vowel lower-
ing. In Hebrew, CVCVC words with stress on the penult vowel are considered to be un-
derlyingly of the canonical form /CVCC/. One phonological rule assigns stress followed
by an epenthesis rule that inserts the second vowel. As the examples in (12) show, the ep-
enthetic vowel surfaces as [a] when the consonant immediately preceding it is a guttural.
(12) Hebrew epenthetic vowel lowering (from McCarthy 1994:210).
Plain medial consonant                  Guttural medial consonant
/malk/   [melek]    'king/my king'      /baʕl/   [baʕal]   'master'
/sipr/   [seːper]   'book'              /kaħʃ/   [kaħaʃ]   'lying'
/qudʃ/   [qoːdeʃ]   'holiness'          /lahb/   [lahab]   'flame'
                                        /tuʔr/   [tuʔar]   'form/his form'
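The epenthesis step alone can be sketched as follows. This is a minimal illustration under stated assumptions: underlying forms are supplied as four segments of the shape CVCC, the guttural set is limited to the four gutturals seen in (12), and the function name is hypothetical. Stress assignment and the changes affecting the first vowel (as in /malk/ surfacing with [e]) are deliberately left out, so the output is an intermediate form, not the surface form.

```python
HEBREW_GUTTURALS = {"ʔ", "h", "ħ", "ʕ"}  # assumption: the gutturals relevant to (12)


def epenthesize(segments):
    """Insert the epenthetic vowel between the last two consonants of /CVCC/."""
    if len(segments) != 4:
        raise ValueError("expected four segments: C V C C")
    c1, v, c2, c3 = segments
    # [a] after a guttural medial consonant, [e] otherwise.
    vowel = "a" if c2 in HEBREW_GUTTURALS else "e"
    return [c1, v, c2, vowel, c3]
```

For /baʕl/ the sketch yields b-a-ʕ-a-l, matching [baʕal]; for /malk/ it yields m-a-l-e-k, which still needs the first-vowel change to reach [melek].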
Brame (1972), cited in Hayward and Hayward (1989), notices a similar phenome-
non in Maltese. In this language, as can be seen in (13), the vowel in the 1st sg. imperfect
prefix is [i] if the stem starts with a non-guttural consonant and [a] if the stem has a gut-
tural in the initial position.
(13) Maltese 1st sg. imperfect verbs (from Hayward and Hayward 1989:185, citing Brame 1972).
Plain roots                     Guttural roots
ˈni+kteb   'I write'            ˈna+ʔbez   'I jump'
ˈni+nzel   'I descend'          ˈna+ʔleb   'I overturn'
ˈni+dneb   'I sin'              ˈna+ħdem   'I work'
ˈni+freʃ   'I spread'           ˈna+ħleb   'I milk'
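The prefix alternation in (13) amounts to a one-line choice. The sketch below assumes that stems are given as strings whose first character is the stem-initial consonant and that the relevant gutturals are the glottal stop and the voiceless pharyngeal, transcribed here as ʔ and ħ; both the set and the function name are assumptions of this illustration, not part of Brame's or Hayward and Hayward's accounts.

```python
MALTESE_GUTTURALS = {"ʔ", "ħ"}  # assumption: the stem-initial gutturals in (13)


def first_sg_imperfect(stem):
    """Build the 1st sg. imperfect: [na] before a guttural-initial stem, [ni] otherwise."""
    vowel = "a" if stem[0] in MALTESE_GUTTURALS else "i"
    return "n" + vowel + "+" + stem
```

For example, `first_sg_imperfect("kteb")` gives `ni+kteb` and `first_sg_imperfect("ħdem")` gives `na+ħdem`. Stress is not represented.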
Hayward and Hayward (1989) also discuss a case in Tigrinya in which vowels are
lowered when adjacent to a guttural. In Tigrinya the low vowel [a] appears in syllables
that contain a guttural while the mid central vowel [ä] appears in non-guttural syllables.
Examples are shown in (14).
(14) Tigrinya vowel lowering (from Hayward and Hayward 1989:179).
Plain syllables
sabar-a 'he broke (something)'
fanaw-a 'it decayed'
mazaz-a 'he drew (a sword from its sheath)'
sabar-ka 'you have broken (something)'
k'arab-ka 'you have approached'
Guttural syllables
ʔaxal-a 'it was enough'
ħadar-a 'he spent the night'
ʕarag-a 'he ascended'
balaʕ-ka 'you have eaten'
saraħ-ka 'you have worked'
2.4 Representations of Emphatics and Gutturals
As cited in the previous chapter, Jakobson et al. (1952) give emphatics the feature
[+flat] distinguishing them phonologically from non-emphatics. This feature applies pri-
marily to rounded vowels since they involve narrowing and protruding the lips. This la-
bial setting has the acoustic effect of flattening, or a general lowering of some or all for-
mants. The authors explain that a similar acoustic effect can also be achieved by
constricting the pharynx. Here is their account:
"Instead of the front orifice of the mouth cavity, the pharyngeal tract, in its
turn, may be contracted with a similar effect of flattening. This independ-
ent pharyngeal contraction, called pharyngealization, affects the acute
consonants and attenuates their acuteness .... The fact that peoples who
have no pharyngealized consonants in their mother tongue, as for instance,
the Bantus and the Uzbeks, substitute labialized articulations for the corre-
sponding pharyngealized consonants of Arabic words, illustrates the per-
ceptual similarity of pharyngealization and lip-rounding. These processes
do not occur within one language. Hence they are to be treated as two
variants of a single opposition - flat vs. plain." (p. 31)
It should be noted here, though, that there is an important acoustic difference be-
tween pharyngeal constriction and lip rounding. While both cause a lowering of F2, pha-
ryngeal constriction has the effect of raising F1 while lip rounding lowers all formants.
This can be considered as a challenge to the above stated view which equates the two
gestures on acoustic/auditory grounds.
In SPE, emphatics are distinguished from non-emphatics in that the former are
[+low, +back]. These tongue body specifications follow from the fact that the tongue
body is actively involved in the production of the secondary articulation of emphatics.
The SPE feature matrices for the consonants in question, among others, are given in (15).
(15) SPE feature specifications of (plain and emphatic) alveolars, velars, uvulars, pharyngeals, and laryngeals (according to Chomsky and Halle 1968:307).
                         Anterior   Coronal   High   Low   Back
Plain alveolars             +          +        -      -     -
Emphatic alveolars          +          +        -      +     +
Velars                      -          -        +      -     +
Uvular gutturals            -          -        -      -     +
Pharyngeal gutturals        -          -        -      +     +
Laryngeal gutturals         -          -        -      +     -
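The way such a matrix picks out natural classes can be made concrete with a short sketch. The values below follow the SPE specifications as they are described in the surrounding text (emphatics as [+low, +back], gutturals as [-anterior, -high]); any cell not explicitly mentioned is assumed here to be minus. The dictionary layout and function names are hypothetical conveniences, not part of SPE.

```python
FEATURES = ("anterior", "coronal", "high", "low", "back")

# One string per class, one character per feature in FEATURES order.
SPE = {
    "plain alveolars":    "++---",
    "emphatic alveolars": "++-++",
    "velars":             "--+-+",
    "uvulars":            "----+",
    "pharyngeals":        "---++",
    "laryngeals":         "---+-",
}


def value(cls, feature):
    """Return '+' or '-' for a feature of a sound class."""
    return SPE[cls][FEATURES.index(feature)]


def natural_class(**spec):
    """Classes matching every feature value in spec, e.g. anterior='-', high='-'."""
    return [c for c in SPE if all(value(c, f) == v for f, v in spec.items())]
```

Under these assumptions, `natural_class(anterior="-", high="-")` returns the three guttural classes, while `natural_class(low="+", back="+")` groups emphatic alveolars with pharyngeals, which is the sense in which the SPE model treats emphatics as truly pharyngealized.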
Note that the SPE model assumes that emphatics are truly pharyngealized since
their tongue body specifications are identical to those of pharyngeal sounds. McCarthy
(1994) notes that, on the basis of these feature specifications, gutturals are distinguished
from the rest of the sounds in that they are [-anterior, -high] while the specifications for
[low, back] distinguish the three guttural classes from each other. McCarthy also points
to some problematic aspects regarding the feature specifications for gutturals. He states
that the feature [-high] for uvulars is inconsistent with the actual articulation of these
sounds, which involves a raised tongue.8 Moreover, he argues, convincingly, that the SPE
featural specifications for pharyngeals and laryngeals inaccurately involve the tongue
body features [low] and [back]. Neither subclass involves the tongue body as its articulator.
It should be noted here that McCarthy's challenge of the feature specification for
pharyngeals cannot be easily extended to include emphatics, the so-called 'pharyngealized'
sounds. Emphatics clearly involve a retraction of the tongue body that cannot be
found in pharyngeals. The feature [+back], then, is a potentially acceptable consideration
as far as emphatics are concerned.
8 See, however, Chapters 5 and 6 of this dissertation for more on the articulation of uvulars. It is argued that the raising maneuver in uvulars is actually due to a radical articulation, not a lingual one.
Among the more recent feature-geometry-based representational proposals, only
three that are either influential or embody somewhat novel representational suggestions
are discussed here. These include, of course, McCarthy's (1994) influential work. This
article has stirred significant debate among linguists. It also served as the launch pad for
almost all of the ensuing alternative accounts (including the present dissertation). Rose's
(1996) work relies on an extensive review of cross-linguistic data and attempts to reconcile
the seemingly variable grouping of laryngeals in the natural classes in several languages.
Zawaydeh's (1999) is a more recent work that relies on articulatory and acoustic
data to investigate the articulatory properties of emphatics and gutturals and introduces a
new distinctive feature called [Retracted Tongue Back] to characterize the lingual involvement
in emphatics and uvulars. There are two other prominent proposals that are not
discussed here: Herzallah (1990) and Davis (1995). Herzallah adopts V-Place-based representations.
Such representations suffer from independent theory-internal problems (see
Halle et al. 2000 for criticism). Davis's proposals contain representational elements found
also in Rose's and Zawaydeh's, both of which are discussed below. For these reasons, I
find it unnecessary to go into Herzallah's and Davis's particular proposals in detail.
2.4.1 McCarthy (1994)
McCarthy (1991, 1994) argues that it is not possible to represent Arabic gutturals
in a fashion that is consistent with basic concepts of the articulator theory. This follows
from the fact that the three Arabic guttural subclasses are articulated at three distinct
points within the pharyngeal region. He therefore identifies those three subclasses with
the feature [pharyngeal] denoting their common place of articulation. To handle the
asymmetry created by this proposal (arising from the fact that other sounds are identified
with the [labial], [coronal], and [dorsal] active articulators), McCarthy embraces a different
understanding of distinctive features. He follows Perkell's (1980) characterization of
distinctive features as "orosensory patterns corresponding to sound producing states"
(Perkell 1980:338, cited in McCarthy 1991:84 and 1994:199). In McCarthy's elaboration
of this proposal, he defines distinctive features "as particular patterns of feedback from
the vocal tract (which have consistent acoustic consequences)" (1994:199; parentheses in
the original). McCarthy cites some histological and neurosensory works by Grossman
(1964), Ringel (1970), and Penfield and Rasmussen (1950) which argue that the pharynx
lacks the neural density and the tactile sharpness that can be found in the oral regions.
Accordingly, the pharynx, though a comparatively large area, can be considered equal,
from a sensory point of view, to the [labial], [coronal], or [dorsal] areas of the oral tract.
McCarthy proposes [pharyngeal] as the feature which identifies the class of gutturals
whose articulation is in the broad region that spans "the area from the larynx to the oropharynx
inclusive" (1994:198-199). The "consistent acoustic consequences" of this type
of articulation amount to a high F1, which McCarthy argues to be witnessed in all gutturals. The
feature [pharyngeal] is also used to refer to the secondary articulation in Arabic emphatic
coronals since they too possess pharyngeal constrictions. The general feature tree given
by McCarthy which reflects his arguments is given in (16) while the individual representations
of the subclasses of gutturals as well as emphatics are given in (17).
(16) McCarthy's feature geometry (1994:223).
•
    • Laryngeal node
        [voice]
        [constr]
    • Place node
        • Oral
            [lab]
            [cor]
            [dors]
        [pharyngeal]
(17) McCarthy's proposed representations for gutturals and emphatics (1994:221).
a. Pharyngeals and laryngeals: ħ ʕ h ʔ
   •Place
       [pharyngeal]
b. Uvulars: χ ʁ
   •Place
       [pharyngeal] [dorsal]
c. Emphatics: tˤ dˤ ðˤ sˤ
   •Place
       [coronal] [pharyngeal] ([dorsal])
d. Uvular stop: q
   •Place
       [pharyngeal] [dorsal]
Given these representations, adjacent guttural consonants would be avoided in
Arabic roots since they would project their [pharyngeal] features on the same tier causing
a violation of the OCP. As for vowel lowering, McCarthy reasons that it involves the
spreading of the feature [pharyngeal] from the guttural to the target vowel.
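The OCP reasoning above can be illustrated with a minimal sketch. It assumes that each consonant is reduced to a set of place features along the lines of (17), and that the OCP is checked only between adjacent root consonants. The table `PLACE_FEATURES`, the function `ocp_violations`, and the example consonant sequences are hypothetical names and inputs chosen for illustration; the sequences are not claimed to be attested Arabic roots.

```python
# Illustrative place-feature sets loosely following the representations in (17).
# The table layout and all names here are assumptions of this sketch.
PLACE_FEATURES = {
    "ħ": {"pharyngeal"}, "ʕ": {"pharyngeal"},
    "h": {"pharyngeal"}, "ʔ": {"pharyngeal"},
    "χ": {"pharyngeal", "dorsal"}, "ʁ": {"pharyngeal", "dorsal"},
    "q": {"pharyngeal", "dorsal"},
    "k": {"dorsal"}, "t": {"coronal"}, "b": {"labial"},
}


def ocp_violations(root, feature):
    """Return the adjacent consonant pairs in `root` that both carry `feature`."""
    pairs = []
    for first, second in zip(root, root[1:]):
        if feature in PLACE_FEATURES[first] and feature in PLACE_FEATURES[second]:
            pairs.append((first, second))
    return pairs


# Two adjacent gutturals share [pharyngeal] on the same tier.
print(ocp_violations(("ħ", "ʕ", "b"), "pharyngeal"))
# A sequence without gutturals yields no [pharyngeal] violation.
print(ocp_violations(("k", "t", "b"), "pharyngeal"))
```

Under these assumptions, any sequence with two adjacent [pharyngeal] segments is flagged, which is the configuration the OCP is taken to disfavor.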
Emphatics include the feature [pharyngeal] in their representation since, like the
primary articulation of gutturals, the secondary articulation of emphatics is a constriction
in the pharynx. As both emphatics and uvulars achieve their pharyngeal constrictions
through tongue dorsum retraction, both classes possess the articulator feature [dorsal] as
well. Notice that in emphatics the feature [dorsal] is in parentheses. What this means is that
this feature is redundant for this class of sounds and is not part of their underlying
specification. The perceived articulatory similarity between emphatics and uvulars leads
McCarthy to suggest that emphatic sounds should be described as 'uvularized' rather than
'pharyngealized'. As for the uvular stop [q], McCarthy considers this sound to be the
emphatic version of [k]. This claim means that [q] is the only non-coronal emphatic.
2.4.2 Rose (1996)
For Rose, the feature [RTR] is present in all sounds that involve a constriction in
the pharynx. This includes emphatics, uvulars, and pharyngeals. Laryngeals are excluded
from the set of [RTR] sounds. The main focus of Rose's (1996) article is the
representational status of the laryngeals. Rose presents an extensive set of data in support of her
claim that the status of laryngeals is decided by the phonemic inventory of the language
in question. In languages like Arabic where laryngeals contrast with other gutturals,
laryngeals are phonologically classified as Pharyngeal sounds. If the language does not
have other gutturals, laryngeals are considered placeless. Rose bases her claim on Avery
and Rice's (1989) 'Modified Contrastive Specification' of speech segments. According to
this claim, a class node is underlyingly specified for a certain segment only if this
segment contrasts with other segments in features that depend on the relevant node.
It has been argued by Bessell (1992) and Bessell and Czaykowska-Higgins (1992)
that laryngeals in Interior Salish languages should be viewed as placeless and not as a
subset of the guttural class. One piece of evidence is the absence of laryngeals from the
set of sounds that trigger the MSC on guttural consonant cooccurrence in Moses-
Columbian, as explained in §2.3.1. Another piece of evidence is that, unlike uvulars,
pharyngeals, and emphatics, Interior Salish laryngeals do not trigger vowel retraction.
Furthermore, the realization of the epenthetic vowel in Moses-Columbian varies based on
the identity of the adjacent consonant. Of importance is that the vowel appears as [<;!-] next
to uvulars and as [if] next to pharyngeals.⁹ Next to laryngeals the vowel surfaces as [a]
which, although still low, has a higher quality than the other two realizations. Bessell and
Czaykowska-Higgins argue that [a] is the "uncoarticulated value of the default vowel"
which is what one would expect to surface next to a placeless consonant.
To account for these issues, Rose proposes that cases in which laryngeals act dif-
ferently from other gutturals are in fact due to the lack of a tongue root retraction [RTR]
in laryngeals. So Rose explains vowel retraction as spreading [RTR] from the source con-
sonant onto the vowel. As for the different surface quality of the epenthetic vowel, Rose
actually uses it as further evidence that laryngeals are specified as Pharyngeal sounds in
Interior Salish. She points to Bessell and Czaykowska-Higgins' (1992) data which shows
that, regardless of the precise surface quality of the epenthetic vowel, it always surfaces
as low next to uvulars, pharyngeals, and laryngeals. This is akin to the cases of vowel
lowering next to guttural sounds in other languages. Rose, however, does not present any
compelling explanation for why laryngeals are not included in the sounds that trigger the
morpheme structure constraints in Moses-Columbian. Rose's representations of laryn-
geals, pharyngeals, uvulars, and emphatics are given in (18).
9 Bessell and Czaykowska-Higgins (1992) use the symbols [<,1] and [:f] to represent lower versions of [a] in Moses-Columbian. [:f] is apparently the lowest version of [a] while [<,1] falls in between the two.
(18) Rose's proposed representations for gutturals and emphatics (1996:80)
a. Laryngeals: h ʔ
    Place (dominating Pharyngeal) or bare ROOT
b. Pharyngeals: ħ ʕ
    Place
        Pharyngeal
            [RTR]
c. Uvulars: χ ʁ ʀ
    Place
        Dorsal [RTR]
d. Emphatics: tˤ dˤ ðˤ sˤ
    Place
        Oral
            Coronal Dorsal
        Pharyngeal
            [RTR]
e. Uvular stop: q
    Place
        Oral
            Dorsal
        Pharyngeal
            [RTR]
Like McCarthy (1994), Rose draws a distinction between the uvular continuants
[χ, ʁ] and the uvular stop [q]. According to Rose, in the continuants, the pharyngeal
component is primary, while in [q] it is secondary, making the uvular stop essentially an
emphatic version of [k].
2.4.3 Zawaydeh (1999)
The class of Arabic gutturals in Zawaydeh's view includes uvulars, pharyngeals,
laryngeals, and emphatics - essentially all sounds that include a constriction in the back
of the vocal tract. Based on her interpretation of her fiberscopic evidence that emphatics,
uvulars, and pharyngeals involve a constriction in the pharynx, she asserts that the phar-
ynx is an active articulator in these sounds. However, she was careful to exclude the la-
ryngeals from this claim since, as shown by her data, there is no pharyngeal constriction
involved in their articulation. However, she bases the membership of laryngeals in the
guttural class on acoustic grounds. Her acoustic experiment shows that emphatics, uvu-
lars, pharyngeals, and laryngeals are associated with high F1 in adjacent vowels. Her
acoustic findings with regard to laryngeals have already been questioned in §2.2.4. Such
findings merit further scrutiny since they go against the predictions of the source-filter
theory. Zawaydeh's claim regarding the basis on which gutturals are grouped is therefore
questionable. Zawaydeh maintains that emphatics and uvulars both involve a movement
by the back of the tongue towards the uvular region, as initially proposed by
McCarthy (1994). For this reason, Zawaydeh introduces a new feature [Retracted
Tongue Back] which she uses only in the representations of emphatics and uvulars.
Following representational proposals by Vaux (1993) and Davis (1995), Zaway-
deh's feature trees involve splitting the place node into two branches: a lower vocal tract
node (LVT) and an upper vocal tract node (UVT). The former dominates pharyngeal and
laryngeal articulators and features while the latter dominates oral features and articula-
tors. Additionally, Zawaydeh follows Davis (1995) in labeling main places of articulation
as (1 place) and labeling secondary places of articulation as (2 place). The resulting rep-
resentations for emphatics, uvulars, pharyngeals, and laryngeals as provided by Zaway-
deh are listed in (19).
(19) Zawaydeh's proposed representations for emphatics, uvulars, pharyngeals, and laryngeals (1999:82)
a. Emphatics
    ROOT
        1 place
            UVT
                [Coronal]
        2 place
            LVT
                Pharyngeal Constriction
                    [Retracted Tongue Back]
b. Uvulars
    ROOT
        1 place
            UVT
                [Dorsal]
            LVT
                Pharyngeal Constriction
                    [Retracted Tongue Back]
c. Pharyngeals
    ROOT
        1 place
            LVT
                Pharyngeal Constriction
                    [Retracted Tongue Root]
d. Laryngeals
    ROOT
        1 place
            LVT
                [Laryngeal]
2.5 Representational Problems
The different representational proposals for emphatics and gutturals discussed
above share some common traits. The most important is that they all include, in one form
or another, a pharyngeal component to represent the primary pharyngeal articulations in
uvulars, pharyngeals, and laryngeals as well as the secondary articulations in emphatics.
The precise implementation of this common view depends mainly on each researcher's
interpretation of the articulatory characteristics of emphatics. Each proposal directly re-
flects the assumption that emphatics are either pharyngealized or uvularized. All of those
proposals, however, ignore some crucial differences with regard to the particular vocal ap-
paratus that implements the aforementioned constrictions. While the representations of
laryngeal articulations are relatively straightforward, those of uvulars, pharyngeals, and
emphatics are somewhat vague or inaccurate. Additionally, these proposals fail to explain
some important phonological processes and patterns. The end results are theoretical rep-
resentations that are ill-motivated both phonetically and phonologically. This section
highlights these issues.
When considering the previously reviewed experimental literature on the articula-
tory maneuvers involved in the production of pharyngeals, one can clearly see that these
sounds are implemented by a retraction of the tongue root in the lower pharynx. Accord-
ing to Ladefoged and Maddieson (1996), "[t]he root of the tongue and the epiglottis can
be moved independently of the body of the tongue" (p. 11). Those studies also show that
the tongue body is not retracted during the articulation of pharyngeals. The active articu-
lator for these sounds is the tongue root which is pulled back to create the lower pharyn-
geal constriction. In uvulars, no independent retraction of the tongue root is noted. In-
stead, the retracted structures include the tongue root, the tongue dorsum, and the
epiglottis, essentially all the movable structures behind the tongue blade that are parts of or are
directly attached to the tongue. So the apparent tongue root retraction in uvulars is possi-
bly a by-product of the retraction of the tongue body as a whole. When comparing the
general shape of the pharynx during the production of uvulars and pharyngeals, it appears
that the pharynx is constricted more and at a lower point during pharyngeals than during
uvulars. So, while the pharynx itself, most likely through the action of the pharyngeal
constrictors, moves independently to produce the constrictions in [ħ] and [ʕ], it is the
movement of the tongue body that constricts the pharynx during [χ] and [ʁ].
Like uvulars, the secondary articulation in emphatics also involves a tongue re-
traction that narrows the upper pharynx and retracts the tongue root and the epiglottis
along with it. However, the articulations of these two sets of sounds have four fundamen-
tal differences that are mostly ignored by the previous representational proposals. First,
the tongue body is retracted further during emphatics than during uvulars. Second, de-
spite some articulatory variability, the back of the tongue is generally moved vertically
towards the uvular area during uvulars but is horizontally slid backwards during emphat-
ics. Third, the upper surface of the tongue dorsum is depressed during emphatics but not
during uvulars. Fourth, the soft palate is actively involved in the production of all Arabic
uvular sounds. This participation is quite visible during [ʁ], but is rather subtle during [χ]
and [q]. Recall from the previous review of the articulatory studies that the uvula is
curled downwards and touches the tongue during [ʁ]. It was also noted that the uvula is
raised and held firmly flat during [χ] and [q]. Both are different positions when compared
to other (non-nasal) sounds in which the uvula is raised but somewhat relaxed and not
flat.
Norlin (1987) points to the difference between emphatics and pharyngeals noting
that "[a]s far as the pharynx is concerned it does not play an active part as an articulator"
in the production of emphatics. Instead, "[i]t is the tongue which by a backing movement
causes the constriction" (p. 72). McCarthy (1994) himself acknowledges that the pharyn-
geal constriction in emphatics and uvulars is executed by the [dorsal] articulator. How-
ever, he formally represents this constriction in emphatics underlyingly by referring to its
place rather than its active articulator. In his representation of uvulars, both place and ac-
tive articulator of the pharyngeal constriction are present. These differing representations are at
odds with his claim that the secondary articulation in emphatics is equivalent to the
articulation of uvulars. Zawaydeh (1999) also notices the same difference between em-
phatics and pharyngeals and adds that uvulars, like emphatics, also involve a pharyngeal
constriction as a function of tongue backing. It is for this reason that she presents the
novel feature [Retracted Tongue Back] to describe uvulars and emphatics. Aside from the
fact that this feature is motivated only by this difference, Zawaydeh subscribes to the idea
that emphatics are uvularized which, as noted earlier, is articulatorily problematic. Addi-
tionally, the role of the tongue dorsum in her representation of uvulars is quite vague. It is
implicated twice: as implementer of the [dorsal] feature and as implementer of the [Re-
tracted Tongue Back] feature. To complicate things further, the two implementations are
nested under two different vocal tract nodes.
Phonologically, as explained earlier, one of the most important pieces of evidence
that Arabic gutturals constitute a natural class is their clearly low frequency of cooccur-
rence in Arabic consonantal roots. Cooccurrence of two consonants belonging to that set
of sounds in the same root would violate the place-OCP-based constraint against the
cooccurrence of homorganic segments in the same root. For each one of the proposals in
§2.4 to account for the restricted cooccurrence of gutturals in Arabic roots, the place-
OCP has to address a terminal feature or class node that is common among all guttural
subclasses (the feature [pharyngeal] in McCarthy's 1994 proposal, the class node Pha-
ryngeal in Rose's 1996 proposal, and the class node LVT in Zawaydeh's 1999 proposal).
In all three proposals, those common features or nodes are also present in emphatics. All
existing representational proposals would predict that roots containing emphatics and gut-
turals should be significantly infrequent. However, looking at Table 2.2, it is clear that
emphatics and gutturals cooccur rather freely. To solve this problem, McCarthy proposes
the conjunction of the major class feature [approximant] to the feature [pharyngeal] when
defining the guttural class. This would limit the applicability domain of the OCP to [pha-
ryngeal] sounds that also share the feature [+approximant]. Since emphatics are not ap-
proximants, they would be excluded from this domain. However, this approach presup-
poses that all gutturals are approximants. While this might be true for low gutturals, the
two uvulars [χ] and [ʁ] are widely considered to be fricatives. While Ladefoged and Mad-
dieson (1996) suggest that Arabic pharyngeals are indeed approximants, they make no
such claim with regard to uvulars. This particular point is addressed experimentally in
Chapter 3 (Experiment One) of this dissertation. Alternatively, Pierrehumbert (1993) pre-
sents a requirement that the place OCP should exclude secondary articulations. However,
she makes no formal suggestions on how that might be implemented. Zawaydeh's (1999)
representations do reflect the difference between primary and secondary articulations.
Nevertheless, the problem persists. As can be seen in Table 2.2, emphatics and velars
rarely cooccur. Since emphatics are primarily coronal, we have to assume that it is the
secondary articulation in emphatics that causes a violation of the OCP-based restriction.
Indeed, in Rose's proposals, emphatics underlyingly include a [dorsal] articulation. We
can assume that it is this component that triggers the OCP violation if there is a [k] or [g]
in the same root with an emphatic. It seems that simply excluding secondary articulations
from the applicability domain of the place OCP is not a viable solution.
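The effect of conjoining [approximant] with [pharyngeal] can be sketched as follows. This is only an illustration of the logic discussed above, not an implementation of any of the cited proposals: the table `SEGMENTS`, the function `violates_guttural_ocp`, and the consonant sequences are hypothetical, and the [approximant] values simply encode the assumptions under debate rather than measured facts.

```python
# Hypothetical feature table; the "approximant" values encode the assumption
# that all gutturals are approximants while emphatics are not.
SEGMENTS = {
    "ħ": {"place": {"pharyngeal"}, "approximant": True},
    "ʕ": {"place": {"pharyngeal"}, "approximant": True},
    "χ": {"place": {"pharyngeal", "dorsal"}, "approximant": True},
    "sˤ": {"place": {"coronal", "pharyngeal"}, "approximant": False},
    "b": {"place": {"labial"}, "approximant": False},
}


def violates_guttural_ocp(root, table, require_approximant):
    """Return True if two adjacent consonants both fall inside the OCP domain."""
    def in_domain(segment):
        entry = table[segment]
        if "pharyngeal" not in entry["place"]:
            return False
        # With the restriction on, only [+approximant] segments count.
        return entry["approximant"] or not require_approximant

    return any(in_domain(a) and in_domain(b) for a, b in zip(root, root[1:]))


emphatic_sequence = ("sˤ", "ħ", "b")
# Plain [pharyngeal]: the emphatic-guttural sequence is wrongly ruled out.
print(violates_guttural_ocp(emphatic_sequence, SEGMENTS, require_approximant=False))
# [pharyngeal] and [+approximant]: the emphatic falls outside the domain.
print(violates_guttural_ocp(emphatic_sequence, SEGMENTS, require_approximant=True))

# If the uvular continuant is instead a fricative, it drops out of the domain too.
fricative_uvulars = {**SEGMENTS, "χ": {"place": {"pharyngeal", "dorsal"}, "approximant": False}}
print(violates_guttural_ocp(("χ", "ħ", "b"), fricative_uvulars, require_approximant=True))
```

The last call shows why the approach depends on the manner classification of the uvular continuants: once they are treated as fricatives, the restricted OCP no longer covers their cooccurrence with other gutturals.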
Another major phonological challenge that faces existing representations of em-
phatics and gutturals concerns the vowel lowering issue. Recall that some of the most
compelling evidence for the existence of a guttural natural class is the lowering of vowels
to [a] next to guttural sounds. For the most part, emphatics do not trigger a
similar effect. To account for this dissimilarity, McCarthy again appeals to the feature
[approximant]. However, no formal explanation is provided. One might also appeal to
a difference between the effects of spreading primary nodes or features and those of spreading secondary ones.
However, one vowel lowering process in Eastern Arabic dialects (Herzallah 1990; also
cited in McCarthy 1994), involves gutturals, emphatics, and contextually emphaticised
[r]. Elaboration on this particular evidence is presented in Chapter 6 of this dissertation.
What is important here is that primary and secondary articulations can have the same
phonological impact.
The previous discussion highlights the phonetic and phonological inadequacies of
the present proposals for the theoretical representations of emphatics and gutturals in
Arabic. These representations reflect clear phonetic-phonological mismatches and they
fail to account for differences in the phonological behavior of the sound classes in ques-
tion.
CHAPTER 3
Experiment One:
The Spectral Properties of Arabic Consonants
3.1 Overview
This experiment investigates the canonical spectral shapes of MSA emphatics,
gutturals, and related sounds. The goal of this chapter is to address two gaps in the acous-
tic literature on Arabic emphatics and gutturals. First, recall from the previous chapter
that previous attempts to distinguish emphatic consonants from non-emphatic ones based
on their spectral shapes have relied mostly on visual observation of spectrograms. More
extensive and objective research in this field is necessary to achieve a fuller understanding of
the acoustic correlates of emphaticness. Second, while there have been strong phoneti-
cally-based claims that Arabic pharyngeals are approximants rather than fricatives, no
similar claims are presented regarding the consonantal status of uvular continuants. Sig-
nificant theoretical proposals in one of the most notable works on the phonology of Ara-
bic gutturals (McCarthy 1994) are predicated on the claim that all Arabic gutturals are
approximants (see §2.5). It is necessary to verify this claim on phonetic grounds.
Consequently, there are two hypotheses being tested in this experiment. The first
hypothesis is that there are no differences between canonical spectral shapes of emphatic
consonants and their non-emphatic counterparts. This hypothesis is based mainly on the
prediction that the main articulations in Arabic emphatics would filter away most of the
acoustic impact from their secondary articulations. The second hypothesis is that the
shapes of the power spectra of Arabic uvulars suggest that these sounds are fricatives
rather than approximants. This hypothesis follows from the fact that the vocal tract con-
striction during the articulation of Arabic uvular continuants is noticeably narrower than
what one would expect in an approximant articulation.
Fairly consistent associations between the spectral shapes of consonants and their
articulatory properties have been reported in many studies (Hughes and Halle, 1956;
Halle et al., 1957; Strevens, 1960; Heinz and Stevens, 1961; Stevens and Blumstein,
1978; Blumstein and Stevens, 1979; Evers et al., 1998). Those studies, among many oth-
ers, show that the shapes of the power spectra of obstruents are decided by the vocal tract
configuration involved in their articulation. The location and degree of the articulatory
constriction for a certain sound determine its acoustic output. However, expressing those
associations objectively and economically, especially in the case of fricatives, remains a
methodological challenge as pointed out by Kent and Read (2002). The authors point to
the intra- and inter-observer variability linked to the most commonly used methods of
power spectra characterization. A table provided by the authors (Table 5-4, p. 168) dis-
plays the substantial ranges of variability in the results obtained by several examiners
who used metrics like relative intensity, effective spectrum length, and spectral peak lo-
cation to classify fricative spectra. Among those metrics, spectral peak location has been
relatively widely used. In what is possibly the most exhaustive use of this metric in a sin-
gle study, Nartey (1982) investigates the fricative consonants in several languages.
Nartey shows that prominent peaks in the power spectra of fricatives generally correlate
with their places of articulation. Nevertheless, an inspection of the many tables provided
in the study reveals that the peak locations (both high and low ones) depend heavily on
the language, the speaker, and the phonetic context. Another widely used metric is spec-
tral tilt. Stevens and Blumstein (1978) note that syllable-initial bilabial stops coincide
with a diffuse falling spectrum, alveolar stops coincide with a diffuse rising spectrum,
and velar stops have a compact spectrum with a mid-frequency peak. This method re-
mains, however, qualitative rather than quantitative, which limits its objectivity. This is espe-
cially true when characterizing and comparing the power spectra of obstruents whose
places of articulation are not widely distributed in the vocal tract.
More recently, an analysis method that employs the statistical concepts of spectral
moments has received growing attention. In this alternative, which was introduced by
Forrest et al. (1988), the distribution of spectral energy in the main noise portion of the
obstruent (frication noise for fricatives and burst plus the ensuing frication for stops) is
treated as a random probability distribution for which the first four moments (mean, vari-
ance, skewness, and kurtosis) are computed. Those metrics inherently express the spectral
tilt of the obstruent as well as its peakedness and its center of gravity. We could roughly
consider this method to be, among other descriptions, a quantitative upgrade of the quali-
tative metric of spectral tilt. The spectral moments generated for a given speech sound are
typically computed from the FFT power spectrum of that sound. Theoretically speaking,
spectral moments can also be based on Linear Predictive Coding (LPC) power spectra.
However, a major technical drawback associated with the LPC analysis renders it unsuit-
able as the foundation for the spectral moments analysis. According to Kent and Read
(2002), LPC models are generally founded on the assumption that speech articulations
produce only poles (resonances; formants) and not zeros (anti-resonances; antiformants).
This assumption does not hold for speech sounds where the vocal tract is highly con-
stricted, such as sibilants, or bifurcated, such as nasals. The FFT model, on the other
hand, captures both poles and zeros yielded by the different transfer functions of the vo-
cal tract and is therefore considered to be a more appropriate source for the calculation of
spectral moments.
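The computation of the four moments can be sketched as follows, assuming that a power spectrum is already available as paired lists of frequencies and power values (for instance from an FFT of the noise portion). The function name `spectral_moments` and the toy spectra are hypothetical. Whether 3 is subtracted from the kurtosis value differs between implementations; this sketch leaves the fourth standardized moment unadjusted.

```python
def spectral_moments(freqs_hz, power):
    """Return (mean, variance, skewness, kurtosis) of a power spectrum.

    The power values are normalized to sum to 1 so that the spectrum can be
    handled like a probability distribution over frequency.
    """
    total = sum(power)
    if total <= 0:
        raise ValueError("power spectrum must contain positive energy")
    probs = [p / total for p in power]

    mean = sum(f * p for f, p in zip(freqs_hz, probs))
    variance = sum((f - mean) ** 2 * p for f, p in zip(freqs_hz, probs))
    if variance == 0:
        raise ValueError("all energy lies at a single frequency")

    third = sum((f - mean) ** 3 * p for f, p in zip(freqs_hz, probs))
    fourth = sum((f - mean) ** 4 * p for f, p in zip(freqs_hz, probs))

    skewness = third / variance ** 1.5
    kurtosis = fourth / variance ** 2  # some implementations subtract 3 here
    return mean, variance, skewness, kurtosis


# Toy spectra (hypothetical values): one symmetric, one falling.
freqs = [1000.0, 2000.0, 3000.0]
print(spectral_moments(freqs, [1.0, 2.0, 1.0]))
print(spectral_moments(freqs, [4.0, 2.0, 1.0]))
```

In the symmetric toy spectrum the skewness is zero, whereas the falling spectrum, with its energy concentrated at low frequencies, yields a positive skewness. This is the sense in which the moments express spectral tilt quantitatively.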
Though the method is still underused, several studies have shown that spectral moments have
the ability to characterize obstruent place of articulation. Forrest et al. (1988) show that
power spectra obtained for the release portion of English [t] have consistently higher
mean and lower skewness than those of [p] and [k] (see their Figure 2, p. 119). Mean-
while, power spectra for [k] had consistently higher kurtosis values than the other two
stops. Nittrouer (1995) and Kardach et al. (2002) achieved similar results when compar-
ing the two English stops [t] and [k]. Forrest et al.'s (1988) discriminant function analysis
based on spectral moments yielded high classification rates for voiceless English stops.
The three voiceless stops [p, t, k] had correct classification rates that ranged from 85.3%
to 100% based on linear scale spectral moments derived for the first 20 ms from the stop
burst. By contrast, the classification of voiceless fricatives did not fare as well; even with
the inclusion of moments generated at the fricative-vowel transition (which, the study
finds, generally increases classification rates), classification of nonsibilants was as low as
61%. However, when considering sibilants alone, classification rates based on linear
scale spectral moments generated from the first 20 ms of the fricative noise were quite
high (70% to 100%). Tomiak (1990) obtained comparatively higher classification rates
for the English fricatives [f, θ, s, ʃ, h]. Her discriminant function analysis results based
on spectral moments showed high rates of correct identification that ranged between 75%
and 100%. However, when using spectral moments from the tokens produced by two of
her subjects for cross-validation, the overall identification rates for [f] and [θ] dropped to
67% and 44%, respectively.
In their study of English voiced and voiceless fricatives, Jongman et al. (2000) re-
port that each of the four spectral moments generated at fricative onset, middle, offset,
and the fricative-vowel transition was able to distinguish at least three out of four
fricative places of articulation at each location. Subsequent discriminant function analysis
involving spectral moments (among other acoustic metrics) yielded an overall correct
identification rate of 77%. Here, again, nonsibilants were noticeably less accurately clas-
sified (64% to 68%) than sibilants (85% to 91%). It should be noted, though, that not
enough details were provided regarding the weight of the contribution of spectral mo-
ments as a single group of predictors to the results of Jongman et al.'s discriminant func-
tion analysis.
Most available studies of spectral moments rely on English data. However, an in-
vestigation of Polish fricatives by Jassem (1995) shows that cross-linguistic agreement in
the ranges of spectral moments' values, at least for [s], can be achieved. Among the five
voiceless Polish fricatives [s, f, ʃ, x], [s] had the highest spectral mean and the lowest
skewness. The velar fricative [x] in Jassem's study is the only fricative produced at a point in
the rear portion of the vocal tract (aside from [h] in Tomiak's (1990) study) for which
there is a spectral moments-based description. Since the present chapter covers Arabic
uvulars and pharyngeals, which are expected to reflect spectral properties close to those
of Polish [x], Jassem's study is a significant acoustic yardstick. Polish [x] had profoundly
higher skewness and kurtosis than the other four fricatives. These values are expected
since velar sounds generally have compact power spectra with well defined peaks and
comparatively little energy at the high frequency range.
Tomiak (1990) concludes that her results "portray the spectral moments metric as
a potential solution to the invariance problem in speech perception" (p. 187). Kent and
Read (2002) also suggest that spectral moments should be considered by anyone who
wants to study the spectral attributes of obstruents. The present experiment studies a set
of sounds with various local and global characteristic spectral attributes (compact, dif-
fuse, rising, falling, dispersed, dense). The bulk of acoustic analysis done in this chapter
involves descriptions and comparisons of MSA consonants using the spectral moments
method since, as pointed out by Jongman et al. (2000), it characterizes both local (center
of gravity) and global (tilt and peakedness) qualities of power spectra.¹⁰ Heeding Kent
and Read's further suggestion that other methods of analysis may be used along with
spectral moments, a new quantitative spectral characterization method, called the multi-
band spectral (MBS) analysis, is introduced for the first time in this dissertation.¹¹ In this
method, the average RMS (root-mean-square) level for every 1000 Hz frequency band of
the power spectrum is calculated by means of Fast Fourier Transform (FFT) and ex-
pressed in decibels as a single number. The graphical result is a stepped power spectrum
that averages out the many spectral minima and maxima found in a typical FFT spectrum,
revealing the bulk shape of the spectrum. The resulting loss of spectral details is not be-
lieved to have a significant impact on the unique acoustic identities of obstruents. Heinz
10 One could also add energy dispersion to the set of global spectral characteristics captured by spectral moments.
11 I am very grateful to professors Raymond Kent and Paul Milenkovic for their efforts in the development of this analysis method. The former is duly credited for the concept of generating energy spectra averaged over bands of 1 kHz at the frication noise of the fricative and at the burst of the stop. The latter developed a computer implementation of the concept and included it in his very capable TF32 computer program.
Figure 3.1. A multi-band spectrum (stepped line) and an FFT spectrum for the Arabic voiceless fricative [s] in the sequence [asa], both generated from a 40-ms full Hamming window placed at the middle of the frication noise. Axes: Frequency (kHz), 0 to 11, against Amplitude (dB), -100 to -10.
and Stevens (1961) found that listeners were able to distinguish synthetic voiceless
fricatives made by applying a single energy pole and a single zero to white noise, yield-
ing rather oversimplified energy spectra. Figure 3.1 shows the shape of a multi-band
spectrum generated using a 40-ms Hamming window applied over the middle of frication
noise in [s] in the context [asa]. In this example, the MBS has 11 bands as a result of us-
ing a 22 kHz sampling rate to digitize the sound signal. A spectral length of eleven kHz is
regarded as sufficient for capturing the characteristic spectral shapes of obstruents. In a
study of Australian English fricatives, Tabain (1998) concludes that spectral information
above 10 kHz is not consistent across speakers. This acoustic information appears to be
dependent on the speaker rather than on the fricative type. Additionally, studies on the
spectral properties of stop releases (Liberman et al. 1952, Stevens and Blumstein 1978,
Blumstein and Stevens 1979) suggest that the spectral "templates" that characterize the
different stop places are confined to the lower 5 kHz of the power spectrum.
3.2 Methods
3.2.1 Subjects
Five male subjects participated in this experiment. They were in their late twen-
ties and early thirties. All of the subjects were native speakers of central Saudi Arabic
dialects that are closely related and have identical phonological inventories. I have noted
that all five subjects spoke the Riyadh dialect even though they were not all natives of
Riyadh. This is not surprising given the regional status of the Riyadh dialect. Much like
the Cairo dialect in Egypt and the Damascus dialect in Syria (Holes 1994), the Riyadh
dialect is the predominant and most widespread dialect in central Saudi Arabia. All of the
subjects have lived for various periods in Riyadh and a sizable number of their acquaint-
ances are speakers of the Riyadh dialect.
Riyadh dialect retains all of the MSA consonants save for [dˤ], which is always
replaced by [ðˤ], and [q], which is used occasionally but is frequently replaced by [g].
The vocalic inventory of Riyadh dialect is substantially different from MSA. The high
vowels [i] and [u] are retained only when geminated. When single, these two vowels are
typically reduced to [ə]. The low vowel [a] is retained in both single and geminated
forms. Furthermore, Riyadh dialect has two mid vowels that appear only in geminated
forms: [ɛɛ] and [oo]. The reason for the lack of any single versions of these vowels is
that they are used exclusively in lieu of the MSA vowel-glide combinations [aj] and
[aw], respectively.
The five subjects were all graduate students at the University of Wisconsin-
Madison. Their high level of education ensures that they have the desired proficiency in
MSA since, as pointed out in §1.3, MSA is the primary form of Arabic used in the educational
system and intellectual circles. All five subjects showed a very high level of fluency
in MSA and expressed no complaints with regard to the speech materials they were
presented with. The subjects were quite capable of producing all MSA speech sounds
comfortably, including those that were absent from their native dialect. None of the sub-
jects displayed any speech- or hearing-related abnormalities.
3.2.2 Stimuli
The set of stimuli consists of real MSA words that contain the sequence VCV in
which the consonants belonged either to the class of emphatics ([tˤ], [dˤ], [ðˤ], [sˤ]), their
nonemphatic counterparts ([t], [d], [ð], [s]), gutturals ([q], [X], [ʁ], [h], [ʕ]), or the velar
stop [k]. The vowels in those sequences were [i], [a], and [u]. Every VCV combination
was represented yielding a paradigm of 126 test words (3 x 14 x 3). A list of the test
words is provided in Appendix A.
The test paradigm is less than ideal since it is not possible to find a minimal (or
even near-minimal) set of 126 real words. This apparent drawback is outweighed by the
use of natural words rather than nonsense utterances. While the use of nonsense words
makes it possible to compile a carefully designed paradigm in which all sound sequences
of interest are represented in identical phonetic surroundings, such items remain, by definition,
alien to the speakers. Certain degrees of unnaturalness and over-articulation have
to be involved when pronouncing them. In their cinefluorographic articulatory comparison
between Arabic emphatics and their non-emphatic counterparts in nonsense words as
well as natural words, Ali and Daniloff (1972) found that the tongue shape during emphatics
varies widely across subjects in nonsense words versus natural words. Furthermore,
one of Ali and Daniloff's figures (Figure 5, p. 91) shows that the articulatory displace-
ments of the different tongue parts in emphatics in nonsense words ranged from about
half to about twice those in natural words relative to non-emphatics. The authors conclude
that "studies utilizing contrived nonsense utterances or sustained utterances, do not
elicit the same articulatory responses which occur during production of natural speech"
(p. 92).
In order to minimize the drawbacks of using a paradigm of natural words, the se-
lection of the test words followed some general guidelines. Nasals in test words were
avoided as much as possible since they spread nasalization to neighboring vowels, creating
zero resonances (anti-formants) that can cause misreads in automatic formant tracking.¹²
Liquids were also avoided when possible as they can be contextually emphaticised.
Morpheme boundaries were avoided as well since they might interact with emphasis
spread. Additionally, preference was given to the use of single, as opposed to geminated,
vowels in the VCV sequences. Every effort was made to ensure that these guidelines are
upheld. To this end, three dictionaries were thoroughly consulted during the compilation
of the test paradigm: Ar-Razi's (fl. 1261 A.D.) Mukhtar Al-Sihah, Wehr's (1979) A Dictionary
of Modern Written Arabic, and Baalabaki's (1995) Al-Mawrid Arabic-English
12 The importance of this point stems from the fact that the stimuli for this experiment are also used as a subset of the stimuli for Experiment Three. See Chapter 5.
Dictionary. However, there are cases that go against some of the stated preferences. This is
to be expected from a large experimental paradigm of natural words.
The words were presented in the carrier phrase "ʔalkalimtu hiya ___" ("The
word is ___"). Since it is permissible in Standard Arabic speech to drop tense or case
inflectional suffixes in sentence-final words, the placement of the words in the carrier
phrase enables the subjects to pronounce the words without those suffixes which might
vary from subject to subject and cause inconsistent interference with the consonants and
vowels in question. Since such suffixes are mostly written as diacritics and are, in many
cases, optionally displayed in the written form, they were left out from the printed stimuli
to avoid any confusion.
3.2.3 Procedures
All the recordings were made in the Phonetics Lab at the University of Wiscon-
sin-Madison. Each subject was seated comfortably in a sound-attenuating booth and
asked to speak into a TOA J1 high-fidelity microphone, keeping a constant distance of approximately
15 cm between his mouth and the microphone. The experimental phrases
were ordered randomly. Each experimental phrase was displayed in front of the subject as
an individual Microsoft PowerPoint (Microsoft Corp. 1987) slide. This necessitated the
presence of a computer in the recording booth. However, the computer that was used was
very quiet and, as a further precaution, was tucked below the table on which the micro-
phone, along with the computer system's monitor and keyboard, were placed. In general,
there was no detectable noise interference from the computer. Before the recording
started the subjects were instructed to read each phrase at a normal conversational rate
and effort, and then to proceed to the next one by pressing a key on a quiet computer keyboard.
Each phrase was repeated three times by each subject, yielding a total of 15 instances of
every test phrase (3 repetitions x 5 subjects). The speech tokens were recorded on HHB
digital audio tapes using a TASCAM DA-30 digital audio recorder.
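The size of the data set follows directly from this design. As a minimal sketch, the counts can be tallied as below; the consonant labels are ASCII placeholders introduced here purely for illustration and are not the author's transcription.

```python
from itertools import product

# Placeholder ASCII labels (hypothetical) standing in for the 14 test consonants.
consonants = ["t", "d", "dh", "s", "T", "D", "DH", "S", "q", "X", "G", "H", "9", "k"]
vowels = ["i", "a", "u"]
repetitions, subjects = 3, 5

# Every V1-C-V2 combination corresponds to one test word.
test_words = list(product(vowels, consonants, vowels))
n_words = len(test_words)                               # 3 x 14 x 3
tokens_per_word = repetitions * subjects                # instances of each test phrase
tokens_per_consonant = len(vowels) ** 2 * tokens_per_word
total_tokens = n_words * tokens_per_word

print(n_words, tokens_per_word, tokens_per_consonant, total_tokens)
```

With 135 tokens per consonant, a comparison of four consonants involves 540 tokens, which is consistent with the error degrees of freedom (536) reported in the ANOVA results of §3.3.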
Using an Apple Macintosh desktop computer and the audio editing software Peak
LE (BIAS, Inc. 1996), the full recording session of each subject was digitized into a long
WAVE file at a 22 kHz sampling rate and 16 bit quantization. These master speech files
were transferred to recordable CD media for proper archiving. The individual test words
were then cut from the master files, normalized for amplitude, and saved as individual
WAVE files using the audio editing software GoldWave (GoldWave, Inc. 2002) on an
IBM-compatible desktop computer.
3.2.4 Acoustic Analysis
3.2.4.1 Spectral Moments
The speech analysis software Praat (Boersma and Weenink 1992) was used to
calculate the spectral moments of consonant spectra. Through the use of a scripting lan-
guage, Praat allows automation of the analysis procedures provided that the acoustic
landmarks being investigated are identified as time points or intervals. The acoustic
landmarks of importance for this experiment were the segment boundaries. To identify
these boundaries the waveform and spectrogram of each speech token were displayed in
the editor screen of Praat and the boundaries of the consonants being investigated were
identified and marked as time points. In deciding on the segment boundaries, waveforms
and wide-band spectrograms were visually consulted. Boundaries that were relatively
more difficult to distinguish were verified auditorily. The collection of time points corresponding
to the identified boundaries for the sound token was saved as a TextGrid file.¹³
To calculate the spectral moments at the desired locations of the consonant, two
Praat script files were written: one for continuants and the other for stops. The scripts referred
to the TextGrid files and then executed the following steps:
1. The analysis window locations (see further explanation below) were set relative
to the landmarks previously identified in the TextGrid file. Those portions
of the sound file were then extracted as single windowed selections (40 ms
full Hamming).
2. Pre-emphasis was applied independently to each extracted selection. This was
done since, according to Praat's co-author, Paul Boersma (personal communication),
Praat calculates spectral moments directly without applying pre-emphasis
by default.
3. An FFT spectrum was generated for the selection.
4. The four spectral moments were computed from the FFT spectrum and their
values were recorded in an independent results text file.
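The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Praat scripts used in the study: the function names are hypothetical, the pre-emphasis coefficient (0.97) is an assumed value, the moments are weighted by the power spectrum, kurtosis is reported as excess kurtosis (minus 3), and a naive DFT stands in for an FFT routine so that the block needs no external library.

```python
import cmath
import math

def hamming(n):
    """Full symmetric Hamming window of n samples."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def power_spectrum(frame, fs):
    """Naive DFT power spectrum (stand-in for an FFT); returns (freqs, power)."""
    n = len(frame)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        freqs.append(k * fs / n)
        power.append(abs(acc) ** 2)
    return freqs, power

def spectral_moments(samples, fs, start, win_s=0.040, preemph=0.97):
    """Mean, SD, skewness and excess kurtosis of one windowed selection."""
    n = int(round(win_s * fs))
    frame = list(samples[start:start + n])            # step 1: extract the selection
    # Step 2: first-order pre-emphasis (coefficient is an assumption).
    frame = [frame[0]] + [frame[i] - preemph * frame[i - 1] for i in range(1, n)]
    frame = [x * w for x, w in zip(frame, hamming(n))]
    freqs, power = power_spectrum(frame, fs)          # step 3: spectrum
    total = sum(power)
    p = [v / total for v in power]                    # treat the spectrum as a distribution
    mean = sum(f * q for f, q in zip(freqs, p))       # step 4: the four moments
    sd = math.sqrt(sum((f - mean) ** 2 * q for f, q in zip(freqs, p)))
    skew = sum((f - mean) ** 3 * q for f, q in zip(freqs, p)) / sd ** 3
    kurt = sum((f - mean) ** 4 * q for f, q in zip(freqs, p)) / sd ** 4 - 3.0
    return mean, sd, skew, kurt
```

For a pure tone the spectral mean returned by this sketch falls at the tone frequency, which gives a quick sanity check before applying it to speech.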
The results text file was then converted to the file formats of the spreadsheet
software Microsoft Excel (Microsoft Corp. 1985) and the statistical analysis software
SPSS (SPSS, Inc. 1989) for statistical analysis.
In calculating the spectral moments for continuants, the procedures chosen by
Jongman et al. (2000) were followed. This involves computing the mean, standard deviation,
skewness, and kurtosis of FFT spectra generated using 40-ms full Hamming windows.
Jongman et al. calculate the moments over four regions (fricative onset, middle of
fricative, fricative offset, and centered over the fricative-offset/following vowel boundary).
In the present study, a region centered over the fricative-onset/preceding vowel
boundary was added, bringing the number of test windows to five.
Figure 3.2. Locations of the sampling windows at which the spectral moments for fricatives (above) and stops (below) were calculated.
13 A TextGrid file is basically a text file listing in numerical format the selected time points expressed in milliseconds.
For stops, a 20-ms half Hamming window that starts at the beginning of the stop
release noise as well as a 40-ms full Hamming window centered over the onset of voicing
of the following vowel were used. The selection of a half Hamming window over the re-
lease is intended to make sure that the stop burst corresponds to the widest portion of the
window. This also avoids the inclusion of any possible pre-burst noises in the calculation.
And since release noises are in many cases rather short, the window length is set at 20
ms. Figure 3.2 illustrates the placement of the sampling windows over the frication noise
in fricatives and release noise in stops.
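The window placement described above can be illustrated with a short sketch. The function names are hypothetical, and two details are assumptions rather than statements from the text: windows 2 and 4 are taken to span the first and last 40 ms of the frication, and the half Hamming window is taken to be the falling half of a Hamming window of twice its length, so that its largest weight coincides with the start of the burst.

```python
import math

def fricative_windows(onset, offset, win=0.040):
    """Return (start, end) times in seconds for windows 1-5 over a fricative."""
    mid = (onset + offset) / 2.0
    half = win / 2.0
    return [
        (onset - half, onset + half),    # 1: centered on preceding vowel/fricative boundary
        (onset, onset + win),            # 2: fricative onset (assumed: first 40 ms)
        (mid - half, mid + half),        # 3: middle of fricative
        (offset - win, offset),          # 4: fricative offset (assumed: last 40 ms)
        (offset - half, offset + half),  # 5: centered on fricative/following vowel boundary
    ]

def half_hamming(n):
    """Falling half of a 2n-point Hamming window: largest weight at sample 0."""
    full = [0.54 - 0.46 * math.cos(2 * math.pi * i / (2 * n - 1)) for i in range(2 * n)]
    return full[n:]

# Example: a fricative from 100 ms to 220 ms, and a 20-ms half window at 22,050 Hz.
windows = fricative_windows(0.100, 0.220)
burst_window = half_hamming(441)
```

Because the weights of the half window decrease monotonically from the first sample, any material before the release is excluded and the burst itself receives the greatest weight.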
3.2.4.2 Multi-Band Spectra (MBS)
The speech analysis computer program TF32 (Milenkovic 2000) was used to cal-
culate the multi-band spectra for voiceless consonants only. For continuants, multi-band
spectra were calculated using 40-ms full Hamming windows at two locations (see the Results
for elaboration): frication onset and middle of frication. These two window locations
correspond to windows 2 and 3 used to calculate spectral moments (see Figure 3.2).
As for stops, 20-ms full Hamming windows were placed starting 5 ms ahead of the stop
burst. This particular arrangement was followed because the software used to generate
the MBS, TF32, does not offer a half Hamming window option which would have been
ideal for stop releases. And due to the short duration of many of the stop releases covered
in this study, a 20-ms rather than a 40-ms window length is a suitable choice.
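The idea of a multi-band spectrum, energy averaged over bands of 1 kHz, can be sketched as follows. This is not TF32's implementation; the function name is hypothetical, and expressing each band in dB relative to the strongest band is an assumption made here for illustration.

```python
import math

def multiband_spectrum(freqs, power, band_hz=1000.0, n_bands=11):
    """Average spectral power within consecutive 1-kHz bands, in dB re the strongest band."""
    sums = [0.0] * n_bands
    counts = [0] * n_bands
    for f, p in zip(freqs, power):
        b = int(f // band_hz)            # index of the band containing this frequency bin
        if b < n_bands:
            sums[b] += p
            counts[b] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    ref = max(means)
    return [10 * math.log10(m / ref) if m > 0 else float("-inf") for m in means]

# Toy spectrum: flat power, except four times the power between 4 and 5 kHz.
freqs = [25.0 * i for i in range(441)]                      # 0 to 11,000 Hz
power = [4.0 if 4000 <= f < 5000 else 1.0 for f in freqs]
levels = multiband_spectrum(freqs, power)
```

In this toy case the 4-5 kHz band sits at 0 dB and every other band about 6 dB below it, which is the kind of stepped outline shown in Figure 3.1.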
3.2.5 Reliability
To assess the intra-judge reliability of the obtained spectral moments analysis re-
sults, a total of 108 sound files containing continuant sounds and 81 sound files contain-
ing stop sounds (approximately 10% of the total files in both cases) were randomly se-
lected by a random number generating software and re-analyzed following the same
procedures explained in §3.2.4.1. For continuants, the correlations between the spectral
moments values (averaged from the five window locations) of the two groups of tokens
(the original and the retested tokens) were at a full 1.00 for all four moments. Agreements
within 50 Hz for spectral mean and standard deviation, 0.05 for skewness, and 0.1 for
kurtosis were between 80.6% and 94.4%. For stops, the correlations between the two
groups of tokens for all moments at both window locations were also at exactly 1.00.
Agreements within 50 Hz for spectral mean and standard deviation, 0.05 for skewness,
and 0.1 for kurtosis were between 96.3% and 100%. The measurements were judged reli-
able.
In order to estimate the intra-judge reliability of the multi-band spectra measurements,
a total of 54 sound files containing continuant sounds and 54 sound files containing
stop sounds (10% of the total files in both cases) were selected randomly by the
random number generating software and re-analyzed following the same procedures ex-
plained in §3.2.4.2. For continuants, the correlations between the original and the retested
groups of tokens in terms of the relative intensity values at each of the 11 spectral bands
were all above 0.99. Agreements within 1 dB at each band were between 89.8% and
99.1%. For stops, the correlations between the two groups of tokens for each of the 11
spectral bands were all above 0.99. Agreements within 1 dB at each band were between
97.2% and 100%. The measurements were judged reliable.
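The two reliability measures used in this section, the correlation between original and retested values and the percentage of retested values that agree within a fixed tolerance, can be sketched as follows. The function names are hypothetical and the numbers are made-up values used only to exercise the code; they are not data from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of measurements."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def agreement_rate(x, y, tol):
    """Percentage of pairs whose absolute difference does not exceed tol."""
    hits = sum(1 for a, b in zip(x, y) if abs(a - b) <= tol)
    return 100.0 * hits / len(x)

# Made-up spectral mean values (Hz), for illustration only.
original = [6400.0, 6100.0, 4800.0, 3400.0]
retest = [6410.0, 6085.0, 4870.0, 3400.0]
r = pearson_r(original, retest)
rate = agreement_rate(original, retest, tol=50.0)   # 50 Hz tolerance, as for spectral mean
```

The two measures are complementary: a correlation can be very high even when a noticeable share of individual retests falls outside the tolerance, as in this toy example.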
3.3 Results
3.3.1 Spectral Moments
3.3.1.1 Voiceless Continuants - Pooled Data
A set of four analysis of variance (ANOVA) tests was conducted for the four
voiceless continuants, [s], [sˤ], [X], and [h], across the five subjects and the nine vowel
contexts. In each test one of the four spectral moments (averaged from the five window
locations) served as the dependent variable. The averaged spectral moments values are
shown in Table 3.1.
Table 3.1. Mean values of spectral moments for voiceless continuants averaged across speakers, window locations, and vowel contexts.
Consonant               Mean (Hz)     Standard deviation (Hz)   Skewness      Kurtosis
[s]                     6,374         2,167                     0.297         0.77
[sˤ]                    6,101         2,215                     0.337         0.49
[X]                     4,839         2,536                     0.832         1.61
[h]                     3,416         2,198                     1.918         5.37
F value (df = 3,536)    250.368***    20.488***                 143.418***    82.103***
*** p < .001
A main effect of consonant types on spectral mean is obtained [F(3,536) = 250.368, p < 0.001; R2 = 0.581]. The results, along with subsequent Scheffé post hoc
comparisons, show that the spectral mean values for the two voiceless alveolars [s] and
[sˤ] are significantly higher than the rest of the voiceless continuants (p < 0.001). The
spectral mean of [s] is 273 Hz higher than [sˤ], though this difference is not significant (p
> 0.16). Among the voiceless continuants, the lowest spectral mean value belongs to the
voiceless pharyngeal [h], distinguishing it from the other three continuants (p < 0.001).
The voiceless uvular [X] is also distinguished from the other voiceless continuants by its
mid-range spectral mean value (p < 0.001). So, while the averaged spectral mean value is
incapable of distinguishing the two alveolars from each other, it succeeds in distinguish-
ing the three points of articulation (alveolar vs. uvular vs. pharyngeal). There is also a
main effect of consonant types on spectral standard deviation [F(3,536) = 20.488, p <
0.001; R2 = 0.098]. However, only [X] is distinguished from all other voiceless contin-
uants by its relatively high standard deviation (p < 0.001). The remaining three voiceless
continuants, [s], [sˤ], and [h], are not significantly different from each other (p > 0.85).
There is a main effect of consonant types on the spectral skewness of voiceless continuants
[F(3,536) = 143.418, p < 0.001; R2 = 0.442]. The two lowest skewness values distinguish
the two alveolars [s] and [sˤ] from the rest of the continuants (p <
0.001), but not from each other (p > 0.97). The mid-range skewness value of [X] and the
high value of [h] succeed in distinguishing these two sounds (p < 0.001 for both). A main
effect of consonant types on the spectral kurtosis of voiceless continuants is obtained
[F(3,536) = 82.103, p < 0.001; R2 = 0.311]. The pharyngeal [h] has a notably high kurtosis
value while [X] has a mid-range value and [sˤ] has the lowest value. Kurtosis distinguishes
these three sounds from each other (p < 0.05). Meanwhile, the spectral kurtosis
value for [s] falls in between those for [sˤ] and [X] and fails to distinguish it from either
of those two sounds.
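The F ratios reported above come from a one-way analysis of variance with consonant type as the factor. A minimal sketch of that computation is given below; the function name is hypothetical, the toy data are invented, and the R2 returned here is the plain (unadjusted) proportion of variance explained, whereas the text does not state which variant of R2 it reports.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, R2) for a list of groups of scores."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    r_squared = ss_between / (ss_between + ss_within)
    return f_ratio, df_between, df_within, r_squared

# Toy data: three groups of three scores each (not from the study).
f_ratio, df_b, df_w, r2 = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

With four consonants and 540 tokens, the same formulas give the degrees of freedom (3, 536) that appear in every test of this section.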
Figure 3.3 shows the spectral moments values at the five sampling windows for
the four voiceless continuants. It is clear from inspecting the spectral mean, spectral
skewness, and spectral kurtosis graphs that, for voiceless continuants, the locations where
the continuants are, from an acoustic point of view, most visibly distinguished from
neighboring vowels are at windows 2 and 3. It is also at these two window locations that
the widest acoustic dispersion across the various voiceless continuants is realized. This is
a clear indication that the canonical spectral shapes are achieved at the first half of the
voiceless continuant. For this reason, we will concern ourselves mostly with the spectral
moments values averaged from window locations 2 and 3 from this point forward in this
work. Four ANOVA tests similar to the ones discussed above were conducted for the
same sounds, only this time the values of the spectral moments averaged from windows 2
and 3 were used as the dependent variables. The spectral moments values are shown in
Table 3.2. Box plots of those values are shown in Figure 3.4.
Figure 3.3. Spectral moments values for voiceless continuants at the five sampling window locations. The values are averaged across subjects and vowel contexts. [Four panels, one per spectral moment, each plotted against Window Location (1 to 5); legend: [s], [sˤ], [X], [h].]
Table 3.2. Mean values of spectral moments for voiceless continuants averaged from windows 2 and 3 and across speakers and vowel contexts.
Consonant               Mean (Hz)     Standard deviation (Hz)   Skewness      Kurtosis
[s]                     7,540         2,002                     -0.162        0.16
[sˤ]                    7,327         2,058                     -0.008        -0.07
[X]                     5,909         2,635                     0.377         0.25
[h]                     3,398         2,003                     2.132         6.48
F value (df = 3,536)    385.540***    59.934***                 154.587***    100.345***
*** p < .001
Figure 3.4. Box plots of the distributions of the spectral moments scores for the four voiceless continuants [s, sˤ, X, h]. The scores are averaged across subjects and vowel contexts. [Four panels, one per spectral moment, with the consonants [s], [sˤ], [X], [h] on the horizontal axis.]
Main effects of consonant type are obtained for spectral mean [F(3,536) = 385.540, p < 0.001; R2 = 0.682], spectral standard deviation [F(3,536) = 59.934, p <
0.001; R2 = 0.247], spectral skewness [F(3,536) = 154.587, p < 0.001; R2 = 0.461], and
spectral kurtosis [F(3,536) = 100.345, p < 0.001; R2 = 0.356]. The two sibilants are significantly
higher in spectral mean than the two nonsibilants (p < 0.001) but are not significantly
different from each other (p > 0.4). The uvular [X] has a spectral mean that is
significantly higher than that of [h] (p < 0.001). Standard deviation is not capable of dis-
tinguishing the two sibilants and [h] from each other (p > 0.8). However, the uvular [X]
has a significantly higher standard deviation than the other sounds (p < 0.001). The two
sibilants are, again, undistinguishable based on their skewness values (p > 0.6). All other
pair-wise comparisons show statistically significant differences (p < 0.05). Kurtosis can-
not distinguish the two sibilants from each other nor from [X] (p > 0.9). The pharyngeal
[h], on the other hand, has a markedly high kurtosis value that distinguishes it from the
other sounds in the group (p < 0.001).
3.3.1.2 Voiceless Continuants - Individual Subjects
To investigate the variability in the rankings of the four voiceless continuants in
terms of their spectral moments values across the individual subjects, each spectral mo-
ment (averaged from windows 2 and 3) was used as the dependent variable in an
ANOVA test conducted for the four voiceless continuants across vowel contexts for each
single subject. This yielded a total of 20 ANOVAs (4 tests x 5 subjects). Figure 3.5 con-
tains 20 box plots showing the distributions of each of the four spectral moments per sub-
ject.
The results show that the two sibilants [s] and [sˤ] are not statistically different
from each other in the majority of pair-wise comparisons regardless of the spectral moment
being investigated. However, there are two exceptions that merit elaboration. Subjects
2 and 5 show that [s] has a significantly higher spectral mean than that of [sˤ]. Further
scrutiny showed that this is attributed to window 2, which covers the beginning of the
continuant. For both subjects, spectral mean at window 2 is significantly higher for [s]
than for [sˤ]. It is safe to assume that the difference displayed at window 2 by both subjects
is due to stronger vowel-consonant coarticulatory influence. By comparison, at window
3, which covers the middle of the continuant, both subjects show no statistical difference
between the spectral mean values of the two sibilants. This window location is at
the portion of the continuant furthest away from any vocalic influence (in VCV contexts).
Figure 3.5. Box plots showing the distributions of the four voiceless continuants spectral moments scores for each of the five individual subjects. [Panels: Subjects 1 to 5 by Mean (kHz), St. deviation (kHz), Skewness, and Kurtosis.]
The pharyngeal [h] can be strongly distinguished from other voiceless continuants
by its spectral mean, skewness, and kurtosis values. For all five subjects, this sound has
the lowest spectral mean and the highest skewness and kurtosis. The uvular [X] can, in
most cases, be distinguished from the other sounds by its mid-range spectral mean value.
This sound also has either the highest or one of the highest standard deviation values.
Only subject 1 shows [X] to have a statistically higher kurtosis than [s, sˤ], while all other
subjects show no differences. As for spectral skewness, the difference between [X] on the
one hand and [s, sˤ] on the other depends largely on the subject.
3.3.1.3 Voiceless Continuants - Specific Vowel Contexts
To test the vowel context-based variability in the rankings of the four voiceless
continuants, a set of ANOVA tests was conducted that directly addresses each individual
vowel context across subjects. A total of 36 ANOVAs were conducted (4 moments x 9
vowel contexts).
The first thing to note here is that, in all nine vowel environments, the two sibilants
[s, sˤ] are not statistically different from each other in any of their spectral moments
values. Like the pair-wise comparisons results of the pooled data, the two sibilants, gen-
erally speaking, enjoy the highest spectral means and [h] has the lowest. There are impor-
tant systematic differences, however, between some vowel contexts on the one hand and
the pooled data on the other as far as the spectral mean ranking of the uvular [X] is concerned.
Whenever the vowel [u] occupies either or both vowel slots in the test word, the
spectral mean of [X] becomes no different (p > 0.1) or only marginally different (p < 0.1)
from the spectral mean values of the sibilants. It is possible that the tongue position during
the articulation of the back vowel [u], which takes place near the point of articulation of
[X], amplifies the tongue retraction during [X], bringing the back of the tongue dorsum
closer to the soft palate and further narrowing the oral tract at that specific point. This, in
turn, intensifies the air stream turbulence during the articulation of [X] giving this sound
substantial energy components at the high frequencies.
The previous explanation regarding the vowel-uvular coarticulation appears more
plausible when we look at the spectral skewness results. While [h] always has the highest
skewness, in most vowel environments [X] is not statistically different from the sibilants.
However, in the environments iCi and aCi, [X] has a significantly higher skewness than
[s]. Since the vowel [i] involves an advanced tongue position, we would expect it to
cause a coarticulatory effect on [X] that is the opposite of that of [u]. A higher skewness
value means that the energy in the power spectrum is concentrated at the lower frequencies.
Since [X] is moved away from its back point of articulation creating a wider constriction,
this yields less intense frication, dimming the energy components at the higher frequencies.
This explanation is not without its flaws, however. Notice that in the environment i-a
the skewness of [X] is not statistically higher than those of the sibilants. As for the environments
iCu and uCi, we could argue that the tongue-retraction coarticulatory effect of
[u] overrides the effect of [i]. How such contradicting effects are resolved is a compli-
cated matter that lies beyond the focus of this work.
In terms of spectral standard deviation, the two sibilants [s, sˤ] and the pharyngeal
[h] are never statistically different from each other, which is the same result as that of the
pooled data. As for [X], in most cases this sound has the highest standard deviation. In the
presence of the vowel [u], this sound ranges in pair-wise comparisons from being no dif-
ferent than other sounds to being significantly higher. Meanwhile, pair-wise comparisons
of the spectral kurtosis results are always similar to those of the pooled data: [h] is sig-
nificantly the highest while the remaining sounds are no different from each other.
3.3.1.4 Voiceless Continuants - Discriminant Analysis
Discriminant analysis is used to classify the different voiceless continuants being
investigated based on the known values of their spectral moments averaged from sam-
pling windows 2 and 3. The results displayed in Table 3.3 show that the overall correct
classification rate is over 71%. While this general rate is somewhat high, the individual
rates of classification of the four continuants paint two contrasting pictures of the two
gutturals on the one hand and the two sibilants on the other. The respective correct classi-
fication rates for the uvular [X] and the pharyngeal [h] are more than 85% and 91%. The
two sibilants, on the other hand, are correctly classified in no more than 57.8% of their
actual incidents while being misclassified as each other in at least 37% of the cases. This
is not unexpected given the above discussed statistical proximities between the two sibilants
in their spectral moment values. So, excluding differences based on secondary
places of articulation, the discriminant analysis of voiceless continuants based on the values
of their spectral moments is quite capable of profiling their primary places of articulation.
Table 3.3. Results of the discriminant analysis for the voiceless continuants based on the four spectral moments' values combined together as predictors. The moments' values are averaged from sampling windows 2 and 3 and across speakers and vowel contexts. The numbers represent the totals and percentages of correctly classified sounds.
                        Predicted Group Membership
Original    Consonant   [h]     [s]     [sˤ]    [X]     Total
Count       [h]         123     0       0       12      135
            [s]         0       68      58      9       135
            [sˤ]        0       50      78      7       135
            [X]         13      2       5       115     135
%           [h]         91.1    0.0     0.0     8.9     100
            [s]         0.0     50.4    43.0    6.7     100
            [sˤ]        0.0     37.0    57.8    5.2     100
            [X]         9.6     1.5     3.7     85.2    100
71.1% of original grouped cases correctly classified.
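The classification rates quoted in this section can be recomputed directly from the counts in Table 3.3. The sketch below does only that arithmetic; it does not reproduce the discriminant analysis itself, and the ASCII labels are stand-ins introduced here for the four consonants.

```python
# Rows: actual consonant; columns: predicted consonant, in the order of `labels`.
labels = ["h", "s", "s_emph", "X"]      # ASCII stand-ins for [h], [s], [sˤ], [X]
counts = {
    "h":      [123, 0, 0, 12],
    "s":      [0, 68, 58, 9],
    "s_emph": [0, 50, 78, 7],
    "X":      [13, 2, 5, 115],
}

# Per-consonant correct classification rate: diagonal cell over the row total.
per_class = {
    lab: 100.0 * counts[lab][i] / sum(counts[lab]) for i, lab in enumerate(labels)
}

# Overall rate: all diagonal cells over all tokens.
correct = sum(counts[lab][i] for i, lab in enumerate(labels))
total = sum(sum(row) for row in counts.values())
overall = 100.0 * correct / total

# Share of each sibilant misclassified as the other sibilant.
s_as_emph = 100.0 * counts["s"][2] / sum(counts["s"])
emph_as_s = 100.0 * counts["s_emph"][1] / sum(counts["s_emph"])

print({k: round(v, 1) for k, v in per_class.items()}, round(overall, 1))
```

The 384 correctly classified tokens out of 540 give the overall rate of 71.1%, and nearly all of the errors lie in the two cells where one sibilant is mistaken for the other.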
The standardized canonical discriminant function coefficients were analyzed to
estimate the contribution of each spectral moment value to the aforementioned classification
rates. Spectral mean values stand out as the most prominent predictors. In contrast,
spectral standard deviation does not contribute significantly to the classification of
the four voiceless continuants.
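The rates quoted from Table 3.3 follow directly from the confusion counts. As a minimal sketch (the function name, the plain-text labels, and the data layout are illustrative and not part of the original analysis), the per-consonant and overall correct classification rates can be recomputed as follows:

```python
# Confusion counts copied from Table 3.3.
# Rows are the original consonants, columns the predicted ones, in the same order.
# The labels are plain-text stand-ins chosen for this sketch.
LABELS = ["pharyngeal", "plain_s", "emphatic_s", "uvular"]
COUNTS = [
    [123, 0, 0, 12],
    [0, 68, 58, 9],
    [0, 50, 78, 7],
    [13, 2, 5, 115],
]


def classification_rates(counts):
    """Return (per-class correct rates in %, overall correct rate in %)."""
    per_class = [100.0 * row[i] / sum(row) for i, row in enumerate(counts)]
    correct = sum(row[i] for i, row in enumerate(counts))
    total = sum(sum(row) for row in counts)
    return per_class, 100.0 * correct / total


per_class, overall = classification_rates(COUNTS)
for label, rate in zip(LABELS, per_class):
    print(f"{label}: {rate:.1f}%")
print(f"overall: {overall:.1f}%")
```

Because every consonant has the same number of tokens (135), the overall rate is simply the mean of the four per-class rates.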
3.3.1.5 Voiced Continuants - Pooled Data
A set of four ANOVA tests was conducted for the four voiced continuants, [ð], [ðˤ],
[ʁ], and [ʕ], across the five subjects and the nine vowel contexts. In each test one of
the four spectral moments (averaged from the five window locations) served as the
dependent variable. The averaged spectral moments values are shown in Table 3.4.
Table 3.4. Mean values of spectral moments for voiced continuants averaged across speakers, window locations, and vowel contexts.
Consonant          Mean (Hz)     Standard deviation (Hz)   Skewness      Kurtosis
[ð]                4,290         2,578                     0.852         2.20
[ðˤ]               3,640         2,614                     0.941         2.30
[ʁ]                3,702         2,432                     1.126         3.97
[ʕ]                2,038         1,206                     3.505         30.92
F value            84.784***     146.594***                177.321***    142.564***
(df = 3,536)
*** p < .001
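For readers who want to see what the four quantities in the table measure, the following is a minimal sketch of a spectral moments computation. It assumes the common definition in which the power spectrum is normalized and treated as a probability distribution over frequency, and it reports excess kurtosis (subtracting 3). The function name and the toy spectrum are hypothetical, and the exact normalization used in the study is not restated in this section.

```python
from math import sqrt


def spectral_moments(freqs_hz, power):
    """Return (mean, standard deviation, skewness, excess kurtosis) of a spectrum.

    freqs_hz: frequency of each spectral bin in Hz
    power:    non-negative power value of each bin (same length as freqs_hz)
    """
    total = sum(power)
    weights = [p / total for p in power]  # normalize so the weights sum to 1
    mean = sum(w * f for w, f in zip(weights, freqs_hz))
    var = sum(w * (f - mean) ** 2 for w, f in zip(weights, freqs_hz))
    sd = sqrt(var)
    skew = sum(w * (f - mean) ** 3 for w, f in zip(weights, freqs_hz)) / sd ** 3
    kurt = sum(w * (f - mean) ** 4 for w, f in zip(weights, freqs_hz)) / var ** 2 - 3.0
    return mean, sd, skew, kurt


# Toy symmetric spectrum (hypothetical values, not data from the study).
mean, sd, skew, kurt = spectral_moments([1000.0, 2000.0, 3000.0], [1.0, 2.0, 1.0])
print(mean, sd, skew, kurt)
```

Under this definition, a spectrum whose energy is concentrated at low frequencies with a long tail toward higher frequencies yields a low mean and a high positive skewness, which is the pattern the table shows for the pharyngeal.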
A main effect of consonant type on spectral mean is obtained [F(3,536) = 84.784,
p < 0.001; R² = 0.318]. The highest spectral mean value, belonging to the interdental [ð]
(p < 0.01), and the lowest value, belonging to the pharyngeal [ʕ] (p < 0.001), distinguish
them from the remaining voiced continuants. The emphatic interdental [ðˤ] and the voiced
uvular [ʁ] are not distinct from each other (p > 0.98). A main effect of consonant type on
standard deviation is also observed for voiced continuants [F(3,536) = 146.594, p < 0.001;
R² = 0.448]. The voiced pharyngeal, [ʕ], has the lowest spectral standard deviation, which
distinguishes it from the rest of the voiced continuants (p < 0.001). Standard deviation
fails to distinguish the other three continuants from each other (p > 0.14). A main effect
of consonant type on spectral skewness is also observed [F(3,536) = 177.321, p < 0.001;
R² = 0.495]. Again, only [ʕ] is distinguished from the rest of the continuants by its high
spectral skewness (p < 0.001). The remaining three voiced continuants are not
distinguishable from each other (p > 0.24). A main effect of consonant type on spectral
kurtosis is obtained as well [F(3,536) = 142.564, p < 0.001; R² = 0.441]. The exceptionally
high 30.9 kurtosis value for [ʕ] distinguishes this sound from the rest (p < .001). Kurtosis
fails to make any other significant distinctions among voiced continuants.
Figure 3.6. Spectral moments values for voiced continuants at the five sampling window locations. The values are averaged across subjects and vowel contexts.
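The effect sizes reported alongside each F value can be related to the F statistic and its degrees of freedom. The sketch below is a consistency check rather than the author's procedure: it assumes a one-way design, recovers the unadjusted R² as F·df1 / (F·df1 + df2), and then applies the standard adjustment for the number of predictors. The function names are hypothetical. Under these assumptions the results agree with the reported values to three decimals, which suggests (but does not establish) that the reported figures are adjusted R² values.

```python
def r_squared_from_f(f_value, df_between, df_within):
    """Unadjusted R^2 of a one-way ANOVA recovered from F and its degrees of freedom."""
    scaled = f_value * df_between  # equals SS_between / MS_within
    return scaled / (scaled + df_within)


def adjusted_r_squared(r2, df_between, df_within):
    """Standard adjusted R^2; the total degrees of freedom are df_between + df_within."""
    df_total = df_between + df_within
    return 1.0 - (1.0 - r2) * df_total / df_within


# F values reported in this section for the voiced continuants (df = 3, 536).
reported = {
    "mean": 84.784,
    "standard deviation": 146.594,
    "skewness": 177.321,
    "kurtosis": 142.564,
}
for moment, f_value in reported.items():
    r2 = r_squared_from_f(f_value, 3, 536)
    print(f"{moment}: adjusted R^2 = {adjusted_r_squared(r2, 3, 536):.3f}")
```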
Figure 3.6 shows the spectral moments values at the five sampling windows for
the four voiced continuants. It is clear from inspecting the graphs, particularly those of
spectral mean and spectral skewness, that voiced continuants distinguish themselves
acoustically from neighboring vowels and from each other at sampling windows 3 and 4,
which cover the middle and the end of the continuant, respectively. So, the canonical
spectral shapes of voiced continuants are achieved at the second half of the sound. Hence,
in the ensuing discussion, we will concern ourselves with the spectral moments values
averaged from window locations 3 and 4. Another set of four ANOVA tests was conducted
for the same sounds across subjects and vowel contexts. In each one of these tests,
one of the spectral moments averaged from windows 3 and 4 was the dependent variable.
The mean values of these spectral moments are shown in Table 3.5, and their distribu-
tions are shown in the box plots in Figure 3.7.
Table 3.5. Mean values of spectral moments for voiced continuants averaged from windows 3 and 4 and across speakers and vowel contexts.
Consonant          Mean (Hz)     Standard deviation (Hz)   Skewness      Kurtosis
[ð]                5,270         2,818                     0.364         0.80
[ðˤ]               4,612         2,900                     0.501         1.20
[ʁ]                4,131         2,592                     0.982         2.79
[ʕ]                2,001         1,221                     3.601         31.99
F value            99.789***     153.151***                176.579***    136.494***
(df = 3,536)
*** p < .001
Figure 3.7. Box plots of the distributions of the spectral moments scores for the four voiced continuants [ð, ðˤ, ʁ, ʕ]. The scores are averaged across subjects and vowel contexts.
Main effects of consonant type are obtained for spectral mean [F(3,536) = 99.789,
p < 0.001; R² = 0.355], spectral standard deviation [F(3,536) = 153.151, p < 0.001;
R² = 0.459], spectral skewness [F(3,536) = 176.579, p < 0.001; R² = 0.494], and spectral
kurtosis [F(3,536) = 136.494, p < 0.001; R² = 0.430]. The interdental [ð] is distinguished
by the highest spectral mean value (p < 0.05), while the pharyngeal [ʕ] is distinguished by
the lowest value (p < 0.001). The emphatic interdental [ðˤ] and the uvular [ʁ] fall in
between and are not distinct from each other. In terms of spectral standard deviation, the
two interdentals [ð, ðˤ] have the highest values. The pharyngeal [ʕ] is distinguished from
all sounds by its low standard deviation (p < 0.001). The uvular [ʁ] has a midrange value
that distinguishes it from the other sounds except [ð], whose standard deviation is only
marginally higher (p < 0.1) than that of [ʁ]. The two interdentals have the lowest skewness
and are not statistically different in that metric. Meanwhile, the high skewness of [ʕ] and
the mid-range skewness of [ʁ] distinguish these two sounds in all pair-wise comparisons
(p < 0.05). The pharyngeal [ʕ] has a very high kurtosis, which distinguishes it from all
other sounds (p < 0.001). Kurtosis values of the other three sounds are not statistically
different.
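The degrees of freedom reported throughout, (3, 536), are what a one-way ANOVA yields for four consonants and 540 tokens, which is consistent with the 135 tokens per consonant listed in the discriminant analysis tables. The following is a minimal sketch of the F computation under that assumption. The function name and the toy data are hypothetical, and the post hoc pairwise comparisons used in the study are not reproduced here.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of numeric scores."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Toy data: three groups of three scores each (not data from the study).
f_value, df_b, df_w = one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]])
print(f_value, df_b, df_w)
```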
3.3.1.6 Voiced Continuants - Individual Subjects
Variations in the voiced continuants' spectral moments rankings between the five
subjects were examined by means of 20 ANOVA tests (4 moments x 5 subjects) conducted
for the four voiced continuants across vowel contexts. In each test the dependent
variable is one of the four spectral moments averaged from windows 3 and 4. Figure 3.8
includes 20 box plots showing the score distributions of each of the four spectral mo-
ments for each subject.
Only subject 3 shows [ð] having a statistically higher spectral mean than [ðˤ]. All
other subjects show no statistical difference between the spectral means of the two sounds.
This is quite interesting since, based on the pooled results in the previous section,
spectral mean is the only metric that reflects any statistical difference between the two
sounds. All subjects show [ʕ] as having the lowest spectral mean, although, for subject 4,
[ʕ] is not statistically lower than [ʁ]. The ranking of the spectral mean value of [ʁ]
shows no stable pattern and depends on the subject. All five subjects show no difference
between [ð] and [ðˤ] in terms of spectral standard deviation. Also, with the exception of
subject 5, the subjects show no statistical difference between [ð, ðˤ] and [ʁ]. The
pharyngeal [ʕ], on the other hand, is statistically shown to have the lowest standard
deviation by all subjects. The two interdentals are also not statistically distinct from
each other in terms of their spectral skewness, as shown by all subjects. The five subjects
also show that [ʕ] always has a significantly higher skewness than all other sounds. As for
[ʁ], as was the case for spectral standard deviation, its skewness ranking is
subject-dependent. All five subjects show that the spectral kurtosis of [ʕ] is
significantly higher than that of any of the other three sounds. The kurtosis values of
[ð], [ðˤ], and [ʁ], on the other hand, do not show any significant statistical differences
for any of the subjects.
Figure 3.8. Box plots showing the distributions of the four voiced continuants' spectral moments scores (mean in kHz, standard deviation in kHz, skewness, and kurtosis) for each of the five individual subjects.
3.3.1.7 Voiced Continuants - Specific Vowel Contexts
The vowel context-conditioned variations in the rankings of the spectral moments
were examined by means of a set of 36 ANOVA tests conducted for the four voiced
continuants across subjects. Each individual test compares the values of one of the four
spectral moments in one of the nine vowel contexts (4 moments x 9 vowel contexts).
In almost all vowel contexts, the two interdentals [ð, ðˤ] are not statistically
different from each other in terms of their spectral mean. The spectral mean of the uvular
[ʁ] is not statistically different from that of at least one of the two interdentals in
all contexts. The pharyngeal [ʕ], on the other hand, is statistically distinguished from
the other three sounds by having the lowest mean in almost all vowel contexts. The
pharyngeal [ʕ] is
also significantly distinguished from the other sounds by having the lowest spectral stan-
dard deviation, the highest spectral skewness, and the highest spectral kurtosis in all
vowel contexts. The other three sounds are mostly not significantly distinct from each
other in the values of those three metrics.
3.3.1.8 Voiced Continuants - Discriminant Analysis
To assess the power of spectral moments in classifying the four voiced continuants,
a discriminant analysis was conducted. The predictors used in the analysis were the four
spectral moments values averaged from sampling windows 3 and 4. Table 3.6 shows that the
overall correct classification rate for voiced continuants is quite poor (51.7%). Only [ʕ]
enjoys a high classification rate (82.2%), while [ð], [ðˤ], and [ʁ] are frequently
confused with each other. Among the latter three continuants, cases of misclassification
as one another are near, or above, the 25% chance level (as there are four categories in
the classification function). None of those three sounds is accurately classified in more
than 48.9% of its actual instances. Meanwhile, misclassifications of [ʕ] as one of the
three sounds [ð, ðˤ, ʁ], and vice versa, are all below the 25% chance level.
Table 3.6. Results of the discriminant analysis for the voiced continuants based on the four spectral moments' values combined together as predictors. The moments' values are averaged from sampling windows 3 and 4 and across speakers and vowel contexts. The numbers represent the totals and percentages of correctly classified sounds.
                          Predicted Group Membership
Original   Consonant   [ʕ]      [ð]      [ðˤ]     [ʁ]      Total
Count      [ʕ]         111      1        1        22       135
           [ð]         2        66       32       35       135
           [ðˤ]        5        42       49       39       135
           [ʁ]         15       31       36       53       135
%          [ʕ]         82.2     0.7      0.7      16.3     100
           [ð]         1.5      48.9     23.7     25.9     100
           [ðˤ]        3.7      31.1     36.3     28.9     100
           [ʁ]         11.1     23.0     26.7     39.3     100
51.7% of original grouped cases correctly classified.
Based on the standardized canonical discriminant function coefficients, none of
the four moments contributes significantly to the classification function. This is not sur-
prising given that the overall classification percentage is quite low to start with.
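The comparison against the 25% chance level can be made explicit. The sketch below lists every misclassification in Table 3.6 whose rate exceeds chance for four equally sized categories. The labels and the function name are illustrative, and the two counts of 1 in the pharyngeal row are inferred here from the reported 0.7% rather than read directly from the table.

```python
# Confusion counts from Table 3.6 (rows: original consonant, columns: predicted).
LABELS = ["pharyngeal", "plain_interdental", "emphatic_interdental", "uvular"]
COUNTS = [
    [111, 1, 1, 22],  # the two counts of 1 are inferred from the reported 0.7%
    [2, 66, 32, 35],
    [5, 42, 49, 39],
    [15, 31, 36, 53],
]


def confusions_above_chance(labels, counts):
    """Return (original, predicted, rate in %) for off-diagonal cells above chance."""
    chance = 100.0 / len(labels)  # 25% for four categories
    flagged = []
    for i, row in enumerate(counts):
        total = sum(row)
        for j, n in enumerate(row):
            rate = 100.0 * n / total
            if i != j and rate > chance:
                flagged.append((labels[i], labels[j], round(rate, 1)))
    return flagged


flagged = confusions_above_chance(LABELS, COUNTS)
for original, predicted, rate in flagged:
    print(f"{original} classified as {predicted}: {rate}%")
```

All flagged cells involve only the two interdentals and the uvular. No confusion involving the pharyngeal exceeds chance, in line with the description above.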
3.3.1.9 Voiceless Stops - Pooled Data
Another set of four ANOVA tests was conducted for the four voiceless stops [t],
[tˤ], [k], and [q] across the five subjects and the nine vowel contexts. In each test one
of the four spectral moments (averaged from the two window locations) served as the
dependent variable. The averaged spectral moments values are shown in Table 3.7.
Table 3.7. Mean values of spectral moments for voiceless stops averaged across speakers, window locations, and vowel contexts.
Consonant          Mean (Hz)     Standard deviation (Hz)   Skewness      Kurtosis
[t]                4,986         2,433                     0.937         2.28
[tˤ]               3,870         2,116                     1.159         3.60
[k]                4,244         2,515                     1.282         3.93
[q]                3,821         2,300                     0.982         3.36
F value            23.670***     8.583***                  4.102**       3.629*
(df = 3,536)
* p < .05; ** p < .01; *** p < .001
A main effect of consonant type on spectral mean is observed for voiceless stops
[F(3,536) = 23.670, p < 0.001; R² = 0.112]. Only [t] is significantly distinguished from
all other stops by its high spectral mean (p < .001). A smaller distinction exists between
[q] and [k], in which the spectral mean of the former is marginally lower than that of the
latter (p < 0.1). Spectral mean is unable to distinguish [tˤ] from either [k] or [q]
(p > 0.12). There is a main effect of consonant type on the spectral standard deviation of
voiceless stops [F(3,536) = 8.583, p < 0.001; R² = 0.040]. The emphatic stop [tˤ] is
distinguished from [t] and [k] by the lowest standard deviation value (p < 0.01). The
latter two are not significantly different from each other (p > 0.8). The mid-range
spectral standard deviation value of [q] is not significantly different from that of
either [t] or [tˤ] (p > 0.18) and is only marginally lower than that of [k] (p < 0.1). A
main effect of consonant type on the spectral skewness of voiceless stops [F(3,536) =
4.102, p < 0.01; R² = 0.017] is obtained. The only significant distinction made by skewness
is between [t] and [k], in which the former has a significantly lower spectral skewness
than the latter (p < 0.05). There is a less significant distinction whereby the spectral
skewness of [q] is marginally lower than that of [k] (p < 0.01). No other distinctions are
made by the spectral skewness of stops. A main effect of consonant type on spectral
kurtosis is obtained for voiceless stops [F(3,536) = 3.629, p < 0.05; R² = 0.014]. Kurtosis
makes only one distinction, between [t] and [k] (p < 0.05). The remaining pair-wise
comparisons show no significant differences.
Figure 3.9 shows the four spectral moments values at both sampling window locations
for the four voiceless stops. When comparing the results of the spectral mean generated at
windows 1 and 2 in the case of voiceless stops, it is evident that window 1 is the location
where all stops distinguish themselves from the following vowel. It is clear that the
spectral means of all four stops are higher at the stop release than at the stop-vowel
junction. Unlike the case of continuants, however, the remaining spectral moments do not
reflect a similar trend. Nevertheless, because window 1 covers only the stop release, and
because of the aforementioned fact regarding spectral mean, our attention in the following
discussion will concentrate on this window location alone as the location of the canonical
spectral shape of the stop. The spectral moments calculated at window 1 are listed in
Table 3.8. Figure 3.10 shows the box plots that illustrate the distribution of the spectral
moments scores for the four voiceless stops as calculated from that window location. For
each spectral moment, an ANOVA test was conducted for the four voiceless stops across
subjects and vowel contexts with that moment (calculated at window 1) as the dependent
variable.
Figure 3.9. Spectral moments values for voiceless stops at the two sampling window locations. The values are averaged across subjects and vowel contexts.
Table 3.8. Mean values of spectral moments for voiceless stops calculated at window 1 and averaged across speakers and vowel contexts.
Consonant          Mean (Hz)     Standard deviation (Hz)   Skewness      Kurtosis
[t]                5,872         2,324                     0.825         1.68
[tˤ]               4,544         1,956                     1.403         4.84
[k]                4,784         2,569                     1.050         2.38
[q]                4,603         2,499                     0.649         1.52
F value            24.504***     19.627***                 10.212***     12.809***
(df = 3,536)
*** p < .001
The tests show main effects of consonant type on spectral mean [F(3,536) = 24.504,
p < 0.001; R² = 0.116], spectral standard deviation [F(3,536) = 19.627, p < 0.001;
R² = 0.094], spectral skewness [F(3,536) = 10.212, p < 0.001; R² = 0.049], and spectral
kurtosis [F(3,536) = 12.809, p < 0.001; R² = 0.062]. Spectral mean is only capable of
distinguishing [t] from the rest of the stops. This plain stop has the highest spectral
mean (p < 0.001). The remaining stops are not significantly different from each other.
Standard deviation can only distinguish the emphatic [tˤ] from all other stops (p < 0.01).
The remaining stops are not statistically different aside from the fact that [t] has a
marginally lower standard deviation than [k] (p < 0.1). The spectral skewness of [tˤ] is
significantly higher than those of [t] and [q] (p < 0.01). The latter two are not distinct
from each other. The skewness of [k] has a mid-range value that is not statistically
different from those of [t] and [tˤ] and only marginally higher than that of [q] (p < 0.1).
The spectral kurtosis of [tˤ] is significantly higher than the kurtosis of any other stop
(p < 0.01). The other three stops are not statistically different.
Figure 3.10. Box plots of the distributions of the spectral moments scores for the four voiceless stops [t, tˤ, k, q]. The scores are averaged across subjects and vowel contexts.
3.3.1.10 Voiceless Stops - Individual Subjects
Variations in the rankings of the spectral moments between the individual subjects
were examined by means of 20 ANOVA tests (4 moments x 5 subjects) conducted across vowel
contexts. In each test the dependent variable is one of the four spectral moments
(calculated at window 1). Figure 3.11 includes 20 box plots showing the score distributions
of each of the four spectral moments for each subject.
Figure 3.11. Box plots showing the distributions of the four voiceless stops' spectral moments scores (mean in kHz, standard deviation in kHz, skewness, and kurtosis) for each of the five individual subjects.
Aside from subject 3, all subjects show that [t] has a significantly higher spectral
mean than its emphatic counterpart [tˤ]. Subject 3 shows that the two are not statistically
different. It should be noted that the voice onset time (VOT) produced by subject 3
following [t] (71 ms, on average) was markedly longer than that of the other subjects (from
30 to 49 ms, on average). The VOT following [tˤ] for subject 3 (19 ms, on average),
meanwhile, was in line with those of the other subjects (from 10 to 20 ms, on average). It
is possible that the extra duration between the stop release and the start of the vowel for
subject 3 has the effect of distancing and weakening the coarticulatory effects of the
following vowel. These effects should be able to enhance the difference between the
acoustic signatures of the plain [t] and the emphatic [tˤ]. This way, the acoustic
differences between these two stops seem to be neutralized for subject 3. There are no
stable patterns for the ranking of spectral standard deviation scores of the four stops
across the five subjects. Standard deviation rankings of voiceless stops seem to be highly
subject-dependent. The same is somewhat true regarding the rankings of spectral skewness.
However, the skewness of the emphatic stop [tˤ] is either statistically the highest or
among the highest. Kurtosis results are quite similar to those of skewness: [tˤ] has either
the highest or one of the highest values, while the rankings of the other stops are
subject-dependent.
For the most part, there is a great deal of inter-subject variability in the rankings
of the voiceless stops' spectral moments generated from the release portion. It is possible
that the variability in the duration of the release among the four stops results in
substantial variability in the frequency resolution captured by the analysis window. We
should keep in mind, though, that spectral moments are not the only method used in this
experiment. These four stops are tested again in §3.3.2.3 using the multi-band spectral
analysis.
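The link between release duration and frequency resolution can be illustrated with the general property that an unpadded analysis window of duration T resolves frequency in steps of 1/T. The sketch below is only an illustration of that relation: the window durations are hypothetical, the function name is invented for this example, and the actual windowing parameters of the experiment are not restated in this section.

```python
def bin_spacing_hz(window_duration_s):
    """Spacing between DFT frequency bins for an unpadded window of this duration."""
    return 1.0 / window_duration_s


# Hypothetical window durations in milliseconds (not values from the study).
for duration_ms in (5, 10, 20):
    spacing = bin_spacing_hz(duration_ms / 1000.0)
    print(f"{duration_ms} ms window: about {spacing:.0f} Hz between spectral bins")
```

A window that must fit inside a short release therefore yields a coarser spectrum than one that fits inside a long release, which could add variability to moments computed from it.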
3.3.1.11 Voiceless Stops - Specific Vowel Contexts
Variations in spectral moments rankings based on vowel contexts were also in-
spected by means of a set of 36 ANOVA tests (4 moments x 9 vowel contexts) conducted
across subjects. Each individual test compares the values of one of the four spectral mo-
ments for the four voiceless stops in one of the nine vowel contexts.
The identity of the following vowel causes some consistent variation in the values
and rankings of the four spectral moments. This is to be expected given the fact that the
stop release is occasionally a rather short acoustic landmark that resides close to the
coarticulatory influence of the ensuing vowel. The results show that the values and
rankings of the stop release spectral moments obtained before the two vowels [i, a] are
different from those obtained before [u]. When the following vowel is either [i] or [a],
the spectral mean of [t] is mostly significantly higher than those of the other stops,
which are not statistically different from each other. When the following vowel is [u], the
stops are not statistically different from each other. The spectral standard deviations of
the four voiceless stops are mostly not significantly different from each other before the
vowels [i, a]. Before [u], the standard deviations of the two back stops [k] and [q] are
mostly significantly higher than those of the two alveolars [t] and [tˤ]. The velar stop
[k] generally has the highest spectral skewness before the vowels [i, a], but not always
significantly. The remaining stops are mostly not different from each other in those
environments. Before [u], the two alveolar stops almost always have significantly higher
spectral skewness than the two back stops. Before [i, a], either the spectral kurtosis of
[k] is statistically the highest or the stops do not differ from one another. Before [u],
on the other hand, [tˤ] always has the highest kurtosis while the remaining stops are not
different from each other.
It is clear, then, that neither spectral standard deviation nor spectral skewness
reflects any consistent acoustic difference between [t] and [tˤ]. Spectral mean does
reflect a distinction between these two stops, but only before [i] and [a]. The only
spectral moment that always reflects a distinction is kurtosis. The emphatic [tˤ] has a
consistently higher kurtosis than its plain counterpart [t]. When taking into account the
rather consistent vowel effects along with the high inter-subject variability in the values
and rankings of the spectral moments of the four voiceless stops, it is possible that a
subject x vowel (i/a vs. u) x consonant analysis would produce better results. However,
this pursuit lies outside the main purpose of this study and is better left for future
consideration.
3.3.1.12 Voiceless Stops - Discriminant Analysis
Voiceless stops fare poorly as a group in discriminant analysis functions. As Table
3.9 shows, only [t] is correctly classified in a substantial number of cases (more than
74%), while the overall classification percentage is low (50.9%). Spectral mean and
spectral skewness emerge as the only major contributors in the classification. The importance
of spectral mean is most likely a result of its rather stable relation with [t]. This stop is
consistently associated with high spectral mean. This is the likely reason for its compara-
tively high individual classification rate. The importance of spectral skewness, mean-
while, is most likely due to its slightly notable ability to distinguish alveolar stops from
back stops.
Table 3.9. Results of the discriminant analysis for the voiceless stops based on the four spectral moments' values combined together as predictors. The moments' values are calculated from sampling window 1 and across speakers and vowel contexts. The numbers represent the totals and percentages of correctly classified sounds.
                          Predicted Group Membership
Original   Consonant   [k]      [q]      [t]      [tˤ]     Total
Count      [k]         37       35       21       42       135
           [q]         37       75       5        18       135
           [t]         10       0        101      24       135
           [tˤ]        19       27       27       62       135
%          [k]         27.4     25.9     15.6     31.1     100
           [q]         27.4     55.6     3.7      13.3     100
           [t]         7.4      0.0      74.8     17.8     100
           [tˤ]        14.1     20.0     20.0     45.9     100
50.9% of original grouped cases correctly classified.
3.3.1.13 Voiced Stops - Pooled Data
As there are only two voiced stops in this study, [d] and [dˤ], ANOVA is not a viable
choice since post hoc comparisons require a minimum of three groups. Therefore, four
t-tests are used to compare the means of the values of each of the four spectral moments
for both stops across subjects and vowel contexts. The values are listed in Table 3.10.
There are significant differences between the two stops in spectral mean [t = 6.464
(df = 268), p < 0.001], skewness [t = 5.137 (df = 268), p < 0.001], and kurtosis [t = 3.168
(df = 268), p < 0.01]. The spectral standard deviation values of the two voiced stops are
not significantly different [t = 0.253 (df = 268), p > 0.8]. Figure 3.12 shows the spectral
moments values at the two window locations for the two voiced stops.
Table 3.10. Mean values of spectral moments for voiced stops averaged across speakers, window locations, and vowel contexts.
Consonant          Mean (Hz)     Standard deviation (Hz)   Skewness      Kurtosis
[d]                4,187         2,084                     1.346         4.93
[dˤ]               3,477         2,064                     0.834         2.98
t (df = 268)       6.464***      0.253                     5.137***      3.168**
** p < .01; *** p < .001
The averaged spectral moments scores calculated from window 1 alone are listed in
Table 3.11. The box plots in Figure 3.13 represent the distribution of the spectral moments
scores for the two voiced stops obtained from sampling window 1. As was the case for the
voiceless stops, this window is treated as the location of the canonical spectral shape of
the voiced stops. This is supported by the fact that, for both stops, the spectral mean
value at this window (Figure 3.12) is higher than the one obtained at the stop-vowel
junction.
Each spectral moment was used as the dependent variable in a t-test conducted for
the two stops across subjects and vowel contexts. The spectral moments values are
calculated from window 1 alone. The tests show significant differences between the two
stops in the values of spectral mean [t = 4.850 (df = 268), p < 0.001] and spectral
skewness [t = 4.688 (df = 268), p < 0.001]. The two stops are not statistically different
in terms of their spectral standard deviation [t = 0.668 (df = 268), p > 0.5] nor their
spectral kurtosis [t = 1.352 (df = 268), p = 0.177].
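The reported degrees of freedom (268) are what an independent two-sample t-test with pooled variance yields for two groups of 135 tokens each (135 + 135 - 2). The following minimal sketch shows that computation. Treating the tests as pooled-variance tests is an assumption made here, and the function name and toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_test(sample_a, sample_b):
    """Return (t, df) for an independent two-sample t-test with pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (statistics.variance uses n - 1).
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    t_value = (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t_value, df


# Toy data (not values from the study).
t_value, df = pooled_t_test([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(t_value, df)
```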
Figure 3.12. Spectral moments values for voiced stops at the two sampling window locations. The values are averaged across subjects and vowel contexts.
Table 3.11. Mean values of spectral moments for voiced stops calculated at window 1 and averaged across speakers and vowel contexts.
Consonant          Mean (Hz)     Standard deviation (Hz)   Skewness      Kurtosis
[d]                4,448         2,068                     1.407         4.84
[dˤ]               3,859         2,012                     0.845         3.69
t (df = 268)       4.850***      0.668                     4.688***      1.353
*** p < .001
Figure 3.13. Box plots of the distributions of the spectral moments scores for the two voiced stops [d, dˤ]. The scores are averaged across subjects and vowel contexts.
3.3.1.14 Voiced Stops - Individual Subjects
A total of 20 t-tests were conducted on the spectral moment scores of the two voiced
stops for each subject individually to test whether all of the above stated rankings are
stable across subjects (4 moments x 5 subjects). Figure 3.14 presents box plots of those
scores for each subject. All five subjects show that the spectral mean of [d] is
significantly higher than that of [dˤ] and that the spectral standard deviations of both
stops are not statistically different. The relative spectral skewness of the two stops is
subject-dependent: subjects 1, 3, and 5 show that [d] has a significantly higher skewness,
while the other two subjects show no statistical skewness difference between the two stops.
Spectral kurtosis does not distinguish the two stops from each other.
Figure 3.14. Box plots showing the distributions of the two voiced stops' spectral moments scores (mean in kHz, standard deviation in kHz, skewness, and kurtosis) for each of the five individual subjects.
3.3.1.15 Voiced Stops - Specific Vowel Contexts
A set of 36 t-tests was also conducted to assess the variations in spectral moments
rankings based on vowel contexts (4 moments x 9 vowel contexts). The results show that
in the vast majority of vowel contexts, there are no, or only marginal, statistical
differences between [d] and [dˤ] in terms of spectral mean, standard deviation, and
kurtosis. The only exceptions are the aCu environment, where [d] has a significantly
higher spectral mean, and the uCa environment, where the spectral kurtosis of [d] is
significantly higher as well. As for spectral skewness, in the vowel contexts that include
the vowel [u] at either or both vowel positions, with the exception of the aCu context,
[d] has a higher skewness than [dˤ]. In all remaining contexts, the two stops are not
statistically distinct from each other.
3.3.1.16 Voiced Stops - Discriminant Analysis
As shown in Table 3.12, the discriminant analysis classified the two voiced stops rather
well overall (more than 78% of the cases). Misclassification does not exceed 28.1% for
either stop. The standardized canonical discriminant function coefficients reveal that
spectral mean and spectral skewness are the most prominent predictors. This is to be
expected given the statistical rankings of spectral mean in the subject-based data
analysis (§3.3.1.14) and spectral skewness in the vowel-based data analysis (§3.3.1.15).
Table 3.12. Results of the discriminant analysis for the voiced stops based on the four spectral moments' values combined together as predictors. The moments' values are calculated from sampling window 1 and across speakers and vowel contexts. The numbers represent the totals and percentages of correctly classified sounds.
                              Predicted Group Membership
           Consonant          [d]      [dˤ]     Total
Original   Count   [d]        116      19       135
                   [dˤ]       38       97       135
           %       [d]        85.9     14.1     100
                   [dˤ]       28.1     71.9     100
78.9% of original grouped cases correctly classified.
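The percentages in Table 3.12 follow directly from the counts. As a minimal sketch (the function names and the `d_emph` label are illustrative and not part of the original analysis, which presumably used a statistics package), the row percentages and the overall rate can be recomputed as follows:

```python
# Counts from Table 3.12: rows are the actual consonant, columns the predicted one.
# Row/column order: plain [d], emphatic [d] (labelled "d_emph" here for illustration).
counts = [
    [116, 19],  # actual plain [d]
    [38, 97],   # actual emphatic [d]
]


def row_percentages(matrix):
    """Express each cell as a percentage of its row total, rounded to one decimal."""
    return [[round(100 * cell / sum(row), 1) for cell in row] for row in matrix]


def overall_accuracy(matrix):
    """Percentage of all cases that fall on the diagonal (correct classifications)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return round(100 * correct / total, 1)


percentages = row_percentages(counts)
accuracy = overall_accuracy(counts)
print(percentages, accuracy)
```

Because both stops contribute the same number of tokens (135 each), the overall rate equals the average of the two per-consonant rates.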
3.3.2 Multi-Band Spectra
3.3.2.1 Voiceless Continuants
Since the canonical shapes of voiceless continuants were judged to be at the beginning
and middle portions of the frication noise, multi-band spectra were generated using
40-ms full Hamming windows at locations that correspond exactly to windows 2 and 3 of
the previous method. That is, the first window covers the first 40 ms of the continuant
while the second window covers the middle 40 ms. For each continuant, the intensity
values at each frequency band were averaged across windows, subjects, and vowel
contexts. The averaged raw intensity values and the averaged normalized intensity values
for the eleven frequency bands in the four voiceless continuants covered by this method
are listed in Tables 3.13 and 3.14, respectively. The raw intensity value of a given
frequency band is the relative intensity at that band as yielded directly by the
multi-band spectrum, while the normalized value is the raw value minus the average of
the raw intensity values of all four continuants at the same frequency band. Figure
3.15 shows four histograms that reproduce the overall shapes of the multi-band spectra of
the four voiceless continuants.
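The normalization just described can be illustrated with the band 1 values from Table 3.13. The sketch below is only an illustration: the dictionary keys and the function name are hypothetical labels, and because the published means are already rounded, other bands may differ from Table 3.14 by 0.1 when recomputed this way.

```python
# Raw band 1 intensities from Table 3.13. Keys are illustrative labels for
# [s], emphatic [s], the uvular fricative, and the voiceless pharyngeal.
raw_band1 = {"s": -64.5, "s_emph": -63.5, "chi": -53.4, "hbar": -51.8}


def normalize_band(raw):
    """Subtract the mean raw intensity of all consonants at this band from each value."""
    band_mean = sum(raw.values()) / len(raw)
    return {name: round(value - band_mean, 1) for name, value in raw.items()}


normalized = normalize_band(raw_band1)
print(normalized)
```

Positive normalized values mark a consonant as stronger than the group average at that band, negative values as weaker, which is what makes the low-frequency prominence of the gutturals visible in Table 3.14.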
Table 3.13. Mean relative intensity values at the 11 frequency bands for the four voiceless continuants averaged from the two sampling windows across speakers and vowel contexts.
                                    Frequency Band
        1      2      3      4      5      6      7      8      9      10     11
[s]   -64.5  -53.7  -44.5  -34.4  -27.2  -26.2  -31.4  -28.7  -29.3  -24.2  -24.8
[sˤ]  -63.5  -54.2  -44.8  -33.6  -27.2  -26.5  -32.3  -29.3  -30.1  -25.4  -25.9
[χ]   -53.4  -45.9  -46.0  -37.5  -40.6  -42.7  -49.3  -47.0  -48.2  -40.8  -40.7
[ħ]   -51.8  -42.5  -38.4  -42.2  -45.6  -50.1  -59.3  -54.7  -58.2  -51.8  -49.8
Table 3.14. Mean normalized relative intensity values at the 11 frequency bands for the four voiceless continuants averaged from the two sampling windows across speakers and vowel contexts. The normalized value at each band is the result of subtracting the averaged raw value of the band across all continuants from the raw value of the band for a given continuant.
                                    Frequency Band
        1      2      3      4      5      6      7      8      9      10     11
[s]    -6.2   -4.6   -1.1    2.5    8.0   10.3   11.7   11.2   12.2   11.4   10.5
[sˤ]   -5.2   -5.1   -1.4    3.3    8.0    9.9   10.9   10.7   11.4   10.2    9.4
[χ]     4.9    3.2   -2.6   -0.6   -5.4   -6.3   -6.2   -7.1   -6.7   -5.2   -5.4
[ħ]     6.5    6.7    5.0   -5.3  -10.4  -13.8  -16.2  -14.8  -16.8  -16.2  -14.5
The normalized intensity value for each band was used as the dependent variable in a
one-way ANOVA conducted for the four continuants across subjects and vowel contexts.
This resulted in 11 individual ANOVA tests. A main effect of continuant type on the
normalized intensity value is observed at all 11 bands [F(3,536) ranges from 43.877 to
610.271, p < 0.001]. Nine of the subsequent eleven Scheffé post hoc pair-wise comparisons
show that individual normalized intensity values succeeded in distinguishing continuant
primary places of articulation (p < 0.01). The two alveolars [s] and [sˤ] are not
significantly different from each other in any of the eleven bands (p ranges from .437 to
1.000). The high similarities between the two sibilants are quite visible in Table 3.13
and Figure 3.15. The normalized intensity values for this pair of sounds are lower in the
low-frequency bands and higher in the high-frequency bands than those of [χ] and [ħ]. The
voiceless pharyngeal [ħ] follows a pattern opposite to that of the alveolars. Namely, it
has the highest normalized intensity values at the low-frequency bands and the lowest
values at the high-frequency bands. At band 1, the intensity value for [ħ] is not
significantly different from that of [χ] (p > .1). Meanwhile, the normalized intensity at
band 3 fails to distinguish the uvular [χ] from the two sibilants [s] and [sˤ] (p = 0.237
and 0.453, respectively).
[Figure 3.15: four panels, one each for [s], [sˤ], [χ], and [ħ]; x-axis: Frequency Band (kHz), bands 1 to 11; graphic not recoverable from the source scan.]
Figure 3.15. Four histograms replicating the multi-band spectra of the four voiceless continuants. The values for the data bars are the relative intensity values of the eleven frequency bands in actual multi-band spectra (averaged from the two Hamming windows across subjects and vowel contexts). The error bars represent one standard deviation.
3.3.2.2 Voiceless Continuants - Discriminant Analysis
To weigh the ability of the gross spectral shapes of voiceless continuants, in the
form of multiple frequency bands, to classify these sounds, a discriminant analysis was
conducted. The normalized intensity values at each of the 11 frequency bands, averaged
from the two sampling window locations, were used as predictors.
Table 3.15. Results of the discriminant analysis for the voiceless continuants based on the normalized intensity values at each of the 11 frequency bands, averaged from the two sampling window locations, combined together as predictors. The numbers represent the totals and percentages of correctly classified sounds. The data is averaged across speakers and vowel contexts.
                              Predicted Group Membership
           Consonant          [ħ]      [s]      [sˤ]     [χ]      Total
Original   Count   [ħ]        134      0        0        1        135
                   [s]        0        83       52       0        135
                   [sˤ]       0        57       78       0        135
                   [χ]        0        0        0        135      135
           %       [ħ]        99.3     .0       .0       .7       100
                   [s]        .0       61.5     38.5     .0       100
                   [sˤ]       .0       42.2     57.8     .0       100
                   [χ]        .0       .0       .0       100.0    100
79.6% of original grouped cases correctly classified.
Table 3.15 shows an overall correct classification rate of 79.6%. The two gutturals
[χ] and [ħ] are almost always correctly classified (100% and 99.3% correct classification
rates, respectively). The two sibilants, on the other hand, are frequently misclassified
as each other. The plain [s] is correctly classified in only 61.5% of the cases, while its
emphatic counterpart [sˤ] is classified correctly in only 57.8% of the cases. All cases of
misclassification of these two sounds (38.5% and 42.2%, respectively) were as each other.
So while multi-band spectra were quite successful in classifying main places of
articulation, they fail to classify continuants based on the presence or absence of a
secondary place of articulation. Analyzing the standardized canonical discriminant
function coefficients indicates that band 1 (0 to 1 kHz), band 3 (2 to 3 kHz), and band 7
(6 to 7 kHz) weigh in almost equally as the most prominent contributors. It seems that
these three bands show the least amount of variability across the sample speech tokens.
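The point that the errors are confined to the sibilant pair can be checked by collapsing [s] and its emphatic counterpart into one class in the Table 3.15 counts. The following is a minimal sketch with hypothetical names; the single [ħ] token classified as [χ] is inferred here from the 0.7% entry in the percentage rows.

```python
# Counts from Table 3.15 (rows: actual, columns: predicted).
labels = ["hbar", "s", "s_emph", "chi"]
counts = [
    [134, 0, 0, 1],   # voiceless pharyngeal; the 1 is inferred from the 0.7% cell
    [0, 83, 52, 0],   # plain [s]
    [0, 57, 78, 0],   # emphatic [s]
    [0, 0, 0, 135],   # uvular fricative
]


def accuracy(matrix):
    """Percentage of cases on the diagonal, rounded to one decimal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return round(100 * correct / sum(map(sum, matrix)), 1)


def merge_classes(matrix, names, group, merged_name):
    """Collapse every class listed in `group` into a single class `merged_name`."""
    rename = lambda n: merged_name if n in group else n
    new_names = []
    for n in names:
        if rename(n) not in new_names:
            new_names.append(rename(n))
    index = {n: i for i, n in enumerate(new_names)}
    merged = [[0] * len(new_names) for _ in new_names]
    for i, actual in enumerate(names):
        for j, predicted in enumerate(names):
            merged[index[rename(actual)]][index[rename(predicted)]] += matrix[i][j]
    return merged, new_names


four_way = accuracy(counts)
merged, merged_labels = merge_classes(counts, labels, {"s", "s_emph"}, "sibilant")
three_way = accuracy(merged)
print(four_way, merged_labels, merged, three_way)
```

Under these counts, treating the two sibilants as one class leaves a single misclassified token out of 540.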
3.3.2.3 Voiceless Stops
Tables 3.16 and 3.17 show the averaged intensity values and the averaged normalized
intensity values for the eleven frequency bands in the four voiceless stops [t], [tˤ],
[k], and [q]. Figure 3.16 shows four histograms that reflect the overall shapes of the
multi-band spectra of the four voiceless stops.
Table 3.16. Mean relative intensity values at the 11 frequency bands for the four voiceless stops averaged across speakers and vowel contexts.
                                    Frequency Band
        1      2      3      4      5      6      7      8      9      10     11
[t]   -54.5  -48.1  -40.3  -36.2  -34.4  -36.7  -43.5  -41.6  -42.8  -36.0  -34.9
[tˤ]  -50.3  -47.0  -42.6  -39.2  -38.9  -44.0  -52.9  -51.9  -52.8  -47.3  -46.2
[k]   -49.9  -42.8  -39.0  -38.2  -37.7  -45.3  -49.6  -46.7  -47.7  -40.7  -39.0
[q]   -47.1  -45.6  -45.3  -38.7  -44.4  -46.4  -53.8  -52.5  -53.8  -46.8  -46.5
Table 3.17. Mean normalized relative intensity values at the 11 frequency bands for the four voiceless stops averaged across speakers and vowel contexts. The normalized value at each band is the result of subtracting the averaged raw value of the band across all stops from the raw value of the band for a given stop.
                                    Frequency Band
        1      2      3      4      5      6      7      8      9      10     11
[t]    -4.1   -2.2    1.5    1.9    4.5    6.4    6.5    6.5    6.5    6.7    6.8
[tˤ]    0.1   -1.1   -0.8   -1.2    0.0   -0.9   -2.9   -3.7   -3.5   -4.6   -4.6
[k]     0.6    3.1    2.8   -0.1    1.1   -2.2    0.3    1.5    1.6    2.0    2.6
[q]     3.4    0.2   -3.5   -0.7   -5.6   -3.3   -3.9   -4.3   -4.5   -4.1   -4.9
The normalized intensity value for each band was used as the dependent variable in a
one-way ANOVA conducted for the four stops across subjects and vowel contexts, for a
total of 11 individual ANOVAs. A main effect of stop type on the normalized intensity
value is observed at all 11 bands [F(3,536) ranges from 4.346 to 76.013, p < 0.01]. The
alveolar stop [t] is distinguished by the lowest intensity value at band 1 (p < .001) and
the highest values at bands 5 through 11 (p < .01). By contrast, [q] is distinguished by
the highest band 1 value (p ≤ .01). For the most part, bands 4 through 11 show a
consistent pattern where [t] is distinguished by the highest values and [k] by mid-range
values, while [tˤ] and [q] have the lowest values, which are not significantly different
except at band 5, where the value for [q] is significantly lower than that of [tˤ]. When
comparing the plain/emphatic pair [t, tˤ], aside from bands 2 and 3, the two sounds are
different at all other bands (p < 0.05).
[Figure 3.16: four panels, one each for [t], [tˤ], [k], and [q]; x-axis: Frequency Band (kHz), bands 1 to 11; graphic not recoverable from the source scan.]
Figure 3.16. Four histograms replicating the multi-band spectra of the four voiceless stops. The values for the data bars are the relative intensity values of the eleven frequency bands in actual multi-band spectra (averaged from the burst Hamming window across subjects and vowel contexts). The error bars represent one standard deviation.
3.3.2.4 Voiceless Stops - Discriminant Analysis
As seen in Table 3.18, an overall correct classification rate of 68.3% is obtained
for voiceless stops. Each of the four stops is correctly classified in at least 63.7% of the
cases. The uvular stop [q] is misclassified as either [k] or [tˤ]. Meanwhile, [t] is misclassified
mainly as [tˤ], while [k] and [tˤ] exhibit no specific misclassification patterns.
When analyzing the standardized canonical discriminant function coefficients, bands 3, 9,
and 10 are the only insignificant contributors to the classification tasks. Among the eight
bands that show significant contributions, no subset stands out as a more prominently
contributing group.
Table 3.18. Results of the discriminant analysis for the voiceless stops based on the normalized intensity values at each of the 11 frequency bands combined together as predictors. The numbers represent the totals and percentages of correctly classified sounds. The data is averaged across speakers and vowel contexts.
                              Predicted Group Membership
           Consonant          [k]      [q]      [t]      [tˤ]     Total
Original   Count   [k]        86       14       14       21       135
                   [q]        17       93       0        25       135
                   [t]        7        3        104      21       135
                   [tˤ]       11       23       15       86       135
           %       [k]        63.7     10.4     10.4     15.6     100
                   [q]        12.6     68.9     0.0      18.5     100
                   [t]        5.2      2.2      77.0     15.6     100
                   [tˤ]       8.1      17.0     11.1     63.7     100
68.3% of original grouped cases correctly classified.
3.3 Discussion and Conclusions
The results of this experiment indicate that there are no reliable and salient acous-
tic differences between the canonical spectral shapes of Arabic emphatic and non-
emphatic fricatives. As for stops, the results show strong spectral differences between
emphatics and non-emphatics. These results, however, are not conclusive due to unbal-
anced coarticulatory interference from adjacent vowels into the acoustic analysis as
pointed out below. The results further show that the spectral shapes of Arabic uvular con-
tinuants show strong fricative-like qualities.
The results strongly indicate that the spectral shapes of the two sibilants [s, sˤ] are
very similar. This similarity is reflected in both the spectral moments and the multi-band
spectra analyses and is very stable across subjects and vowel contexts. As for the two
interdentals [ð, ðˤ], while the pooled data show that these two sounds have statistically
distinct spectral mean values, four of the five individual subjects show no difference
between the spectral means of the two sounds. As for the remaining metrics, the two
interdentals are generally not distinct from each other. This, however, follows from the
fact that voiced non-sibilant fricatives, as a whole, do not reflect substantial spectral
differences. In general, it seems that the primary articulation in emphatic fricatives
masks any potential impact of the secondary articulation on the acoustic identity of the
sound signal.
Tables 3.19 and 3.20 show the classification rates of emphatic and non-emphatic
consonant pairs based on spectral moments and multi-band spectra, respectively. The
overall accurate classification rate of the pair [s, sˤ] using spectral moments as
predictors is around chance (54.1%). This means that [s] is almost totally
indistinguishable from [sˤ] based on their spectral moments values. Multi-band spectra
fare better. Here, an overall correct classification rate of 68.5% is achieved. It is
clear that this analysis method benefits from the larger number of predictors (intensity
values in 11 bands) when compared to spectral moments (4 moments). Nevertheless, the fact
that almost one third of the actual instances of each member of the [s, sˤ] pair is
misclassified as the other, despite the use of 11 predictors in a test that involves only
two sounds, is an indication that the frication portions of [s] and [sˤ] are acoustically
similar. Recall also that the differences between the intensity values of these two
sounds at all 11 bands are statistically insignificant. The classification results for
the pair [ð, ðˤ] show that, much like the classification of the two sibilants using
multi-band spectra, about one third of the actual instances of each interdental are
misclassified as the other. We conclude, therefore, that the canonical spectral shapes of
Arabic plain/emphatic continuant pairs do not include reliable acoustic correlates to the
phonetic difference between them.
Table 3.19. Results of the discriminant analyses for the plain/emphatic consonant pairs based on the spectral moments values as predictors. The numbers represent the percentages of correctly classified sounds. The data was averaged across speakers and vowel contexts.
                 Predicted Group Membership
Original %       [s]      [sˤ]
   [s]           48.9     51.1
   [sˤ]          40.7     59.3
   X̄ = 54.1
Original %       [ð]      [ðˤ]
   [ð]           62.2     37.8
   [ðˤ]          31.1     68.9
   X̄ = 65.6
Original %       [t]      [tˤ]
   [t]           81.5     18.5
   [tˤ]          27.4     72.6
   X̄ = 77.0
Original %       [d]      [dˤ]
   [d]           85.9     14.1
   [dˤ]          28.1     71.9
   X̄ = 78.9
Table 3.20. Results of the discriminant analyses for the plain/emphatic voiceless consonant pairs based on the normalized relative intensity values of the multi-band spectra as predictors. The numbers represent the percentages of correctly classified sounds. The data was averaged across speakers and vowel contexts.
                 Predicted Group Membership
Original %       [s]      [sˤ]
   [s]           68.9     31.1
   [sˤ]          31.9     68.1
   X̄ = 68.5
Original %       [t]      [tˤ]
   [t]           86.7     13.3
   [tˤ]          13.3     86.7
   X̄ = 86.7
By comparison, members of the plain/emphatic stop pair [t, tˤ] are well distinguished
by both methods, as are [d, dˤ], for which only spectral moments were used. The
relatively high classification rates of plain/emphatic stop pairs should be viewed with
caution, however. One possible reason for these high rates is that stop releases for
emphatic stops are followed by shorter VOTs (see §2.2.1), bringing them closer to
coarticulatory influence from following vowels. In fact, there were instances where the
start of a vowel following an emphatic was immediately attached to the stop burst. In such
cases, a substantial portion of the ensuing vowel was covered by the analysis window.
Furthermore, variability in VOT between members of the stop pairs results in variability
in the frequency resolution available in the analysis windows. A long VOT can allow more
after-burst frication to be admitted into the analysis window. Meanwhile, a short VOT
leaves little or no frication available to the window. This inconsistent influence makes
any judgment based on the canonical spectral information of stops questionable. The high
classification rates of emphatic/non-emphatic stop pairs are, therefore, inconclusive.
Broadly speaking, the results reflect the general spectral shapes and articulatory
qualities of the consonants being investigated. Alveolar sibilant sounds typically have
diffuse rising spectra with intense high frequency energy and very little or no energy at
the low frequencies. The high frequency concentrations of energy in [s] shift the energy
center of gravity in its spectrum higher giving this sound the highest spectral mean. The
spectrum is also tilted higher at the high frequencies and lower at the lower frequencies
yielding negative skewness. A diffuse spectrum, when treated as a normal distribution
density curve, is expected to have relatively thicker tails than a compact spectrum. This
explains the low kurtosis value for [s]. Spectral moments results posted here for the sibi-
lant fricative [s] are in line with those reported in other accounts of spectral moments in
obstruents such as Jongman et al. (2000), Jassem (1995), and Tomiak (1990). The agree-
ment in sibilant spectral mean and standard deviation is especially notable.
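The reasoning above treats a power spectrum as a probability distribution over frequency. As a minimal sketch of that idea (the function name, the toy spectra, and the choice to report kurtosis without subtracting 3 are assumptions made here for illustration, not details taken from this study), the four moments can be computed as follows:

```python
import math


def spectral_moments(freqs_hz, power):
    """Treat a power spectrum as a distribution over frequency and return
    (mean, standard deviation, skewness, kurtosis).

    Assumption: kurtosis is the plain fourth standardized moment (a normal
    curve gives 3); some studies report it with 3 subtracted.
    """
    total = sum(power)
    weights = [p / total for p in power]  # normalize so the weights sum to 1
    mean = sum(f * w for f, w in zip(freqs_hz, weights))
    variance = sum(w * (f - mean) ** 2 for f, w in zip(freqs_hz, weights))
    sd = math.sqrt(variance)
    skewness = sum(w * (f - mean) ** 3 for f, w in zip(freqs_hz, weights)) / sd ** 3
    kurtosis = sum(w * (f - mean) ** 4 for f, w in zip(freqs_hz, weights)) / sd ** 4
    return mean, sd, skewness, kurtosis


# Toy spectra (made-up values): energy rising toward high frequencies, as
# described for sibilants, and its mirror image with energy concentrated low.
freqs = [1000.0, 2000.0, 3000.0, 4000.0]
rising = spectral_moments(freqs, [1.0, 2.0, 3.0, 4.0])
falling = spectral_moments(freqs, [4.0, 3.0, 2.0, 1.0])
print(rising, falling)
```

In this toy case the rising spectrum has the higher mean and a negative skewness, while the falling spectrum has the lower mean and a positive skewness, which matches the direction of the contrast drawn in the text.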
Jongman et al. (2000) and Tomiak (1990) report high classification rates of [s] in
their discriminant analysis results. In contrast, Forrest et al. (1988) report relatively low
classification rates for fricatives in general, especially [s]. Notice that the low individual
classification rates reported here for [s] and [sˤ] reflect mostly the constant
misclassification of these two sounds as each other. Overall, however, classification of
these two sibilants as a single class of sounds is quite high. The two sibilants are
rarely misclassified as non-sibilants (6.7% for [s] and 5.2% for [sˤ]) when using spectral
moments. Misclassification of sibilants as non-sibilants drops to 0% when using multi-band
spectra.
The lower spectral mean of the glottal [h] reported in Tomiak (1990) and, to a
lesser extent, the voiceless velar fricative [x] in Jassem (1995), when compared to
less oral fricatives is a pattern that holds for Arabic gutturals as the present study shows.
Among the gutturals, the pharyngeal [ʕ] has no airflow turbulence during its articulation,
which is typical of a voiced approximant (Catford 1977). Compared to vowels, voiced
approximants have much weaker energy at the higher frequencies due to the narrower
vocal tract opening (Bickley and Stevens 1987, Stevens 1998). This causes almost all of
the acoustic energy in [ʕ] to be concentrated in the lower 3 kHz frequency range. As a
result, treating the power spectrum of this sound as a normal distribution yields a very
sharply peaked curve over the low frequencies with a very narrow or no tail extending
into the high frequency range. This is reflected by the low spectral mean and standard
deviation as well as the high skewness and kurtosis. The latter is markedly higher in
comparison with the other voiced continuants. Similarly, in the voiceless pharyngeal [ħ],
the bulk of spectral energy is located at lower frequencies than in fricatives. As a
result, [ħ] exhibits the lowest spectral mean and the highest skewness among the four
voiceless continuants. The noticeably high kurtosis is due to the smaller presence of
energy as we move higher in the frequency scale. The approximant qualities displayed by
[ħ], however, are less drastic than those displayed by [ʕ]. The Arabic pharyngeal pair
[ħ, ʕ] clearly reflects the typical acoustic properties of approximants as defined by
Catford (1977). Namely, only the voiceless member of the pair has some turbulence in the
airflow during its articulation, while no turbulence is present during the articulation
of the voiced member. This stands in opposition to fricative voiced/voiceless pairs, where
turbulence is always present. The present study provides acoustic support for the claims
of Laufer and Condax (1979, 1981), Catford (1977), and Ladefoged and Maddieson (1996) that
Arabic pharyngeals are approximants rather than fricatives.
The uvular [χ] is articulated near the velar region, where sounds typically have
compact, well-formed spectra with mid-frequency peaks. Therefore, this sound exhibits a
mid-frequency spectral mean and slightly high skewness and kurtosis. Likewise, the
voiced uvular [ʁ] generally has a comparatively more compact spectrum with a mid to
low frequency peak. This explains the relatively high skewness and kurtosis values
associated with this sound when compared to the interdentals [ð, ðˤ]. The two uvular
continuants exhibit spectral qualities that are clearly more similar to those of
fricatives than to those of approximants. While this is not very evident in the case of
[χ] (since voiceless approximants also involve airstream turbulence), the spectral
qualities of [ʁ] clearly indicate that uvular continuants are fricatives. Notice that
[ʁ] is not even close to [ʕ] in the values of any of the spectral moments metrics. The
substantially higher standard deviation and much lower kurtosis in uvular [ʁ] indicate
that the sound energy in the spectrum of this sound is widely dispersed. This points to
the presence of airflow turbulence as a result of a narrow fricative constriction. This
finding does not agree with the view that Arabic uvulars are approximants.¹⁴ There are
significant theoretical ramifications to this finding. Recall from Chapter 2 that
McCarthy (1994) proposes that the identification of the guttural class for the purposes
of the OCP-based restrictions on Arabic morpheme structure is achieved by taking the
feature [pharyngeal] in conjunction with the feature [+approximant]. Since, in McCarthy's
view, all gutturals are approximants and emphatics are not, members of the two classes
are allowed to cooccur. The present finding nullifies the claim that all gutturals are
approximants and demands a new reasoning for the free cooccurrence of gutturals with
emphatics. This topic is addressed in detail in Chapter 6.¹⁵
14 One might argue that there is a fricative/approximant variation in the surface realization of Arabic uvular continuants and that these sounds are still underlyingly approximants surfacing as fricatives in the tokens tested. This is highly doubtful, though, given that in the test tokens the consonants are placed in VCV environments. This is the most likely environment for the supposed approximant variants to surface due to articulatory undershoot. Arabic uvular continuants show a strong tendency to surface as fricatives and should be classified as such.
The relationships between continuant types and their spectral qualities are not as
sharply defined for voiced continuants as for voiceless continuants. While voiceless
continuants as a group are generally very well classified by both the spectral moments
and the multi-band spectra methods, the voiced uvular [ʁ] and the two interdentals
[ð, ðˤ] were frequently misclassified as one another. In general, the three fricatives
[ð, ðˤ, ʁ] show smaller differences in terms of their spectral moments values when
compared to voiceless fricatives. Non-sibilant spectra are known for a lack of strong
characteristic cues (Harris 1958, La Riviere et al. 1975). Voiced fricatives typically
share low frequency voicing bands and formant-like structures reflecting energy
modulation by the vibrating vocal folds. Thus, the presence of voicing in these sounds
seems to produce a leveling effect in the distribution of acoustic energy. The voiced
pharyngeal [ʕ], meanwhile, had a high correct classification rate of 82.2%. It seems
that, while spectral moments are not very capable of distinguishing voiced fricative
spectra from one another, they are quite capable of distinguishing voiced fricatives from
voiced approximants.
15 I should note here that McCarthy (1994) bases his classification of all Arabic gutturals, including uvular continuants, on Clements' (1990) modification of Catford's (1977) definition of approximants cited above. Clements requires all non-approximants to involve oral stricture. This stipulation is made by Clements solely to exclude nasals. As explained in Chapter 7, recent research on pharyngeal and laryngeal articulations reveals that there are several stricture possibilities in the pharynx ranging from stop to approximant.
The spectral moments scores for stops were either not different from each other or
highly dependent on the subject and the following vowel. It is not surprising, then, that,
contrary to the high classification rates of voiceless stops reported by Forrest et al.
(1988), the present study shows poor classification rates. It is possible that the types
of stops investigated in the two studies play a major part in this discrepancy. Forrest
et al.'s focus was on the three stops [p, t, k], whose places of articulation are, more or
less, evenly and substantially spaced in the vocal tract. By contrast, the present study
focuses on the four stops [t, tˤ, k, q], two of which ([t, tˤ]) have the same primary
place of articulation while the other two ([k, q]) are produced at points of articulation
that are very close to each other. In general, though, the plain alveolar stop [t] was
well distinguished from the rest of the stops. In the discriminant analysis, this stop
was correctly classified in 74.8% of its actual instances, making it the only
well-classified voiceless stop. As noted, the velar [k] and the uvular [q] are produced
at closely neighboring points of articulation. Meanwhile, the emphatic [tˤ] seems to be
affected by the coarticulatory effect of the following vowel. Vowels following emphatics
usually have a low F2. The coarticulatory effect of this drop in F2 on the preceding [tˤ]
would most likely be in the form of a drop in the center of gravity of the rising
spectrum typical of an alveolar stop. These articulatory configurations of the three
stops [k, q, tˤ] conspire to bring their acoustic qualities close to each other.
When comparing the two spectral analysis methods used in this experiment,
multi-band spectra perform better in classifying obstruent places of articulation. The
overall correct classification rate of voiceless continuants when using multi-band spectra
was slightly higher than when using spectral moments (79.6% vs. 71.1%). Multi-band
spectra achieved almost 100% correct classification rate for all sounds if we regard the
two sibilants as a single class since the two non-sibilants were almost never misclassified.
With spectral moments, on the other hand, the two non-sibilants were misclassified in
8.9% to 14.8% of their actual incidents. Multi-band spectra fared better than spectral
moments in the overall classification of voiceless stops (68.3% vs. 50.9%). Both methods
were able to accurately classify [t] better than other stops. While the classification rates
for voiceless stops are generally low, the difference between the two methods is substan-
tial. Unlike spectral moments, which are essentially predicated on the translation of the
gross energy distribution into a smoothed out probability density curve, multi-band spec-
tra capture not only the gross distribution of energy in the power spectra, but also the
relative prominence of energy concentrations in the different stretches of spectral fre-
quency. In this sense, multi-band spectra preserve what is known about traditional power
spectra while expressing it as a small number of variables. Additionally, multi-band spec-
tra allow for the calculation of a prototypical power spectrum for a given voiceless ob-
struent by averaging the multi-band spectra from a number of sound samples from differ-
ent speakers. While still in need of further investigation involving more speakers, speech
sounds, and languages, multi-band spectra look like a promising new analysis method for
objective, quantitative characterization of voiceless obstruent power spectra.
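The contrast drawn above can be illustrated with a minimal sketch. It treats a power spectrum as a probability distribution to obtain its first four moments, and separately summarizes the same spectrum as mean energy in a few frequency bands. The band edges, the two toy spectra, and the function names are assumptions made for illustration only; they are not the parameters or the procedure used in this experiment, and the band summary is only a rough reading of what a multi-band spectrum captures.

```python
import math

def spectral_moments(freqs_hz, power):
    """Return (centre of gravity, sd, skewness, excess kurtosis) of a power spectrum."""
    total = sum(power)
    weights = [p / total for p in power]  # normalise so the weights sum to 1
    mean = sum(f * w for f, w in zip(freqs_hz, weights))
    var = sum((f - mean) ** 2 * w for f, w in zip(freqs_hz, weights))
    sd = math.sqrt(var)
    skew = sum((f - mean) ** 3 * w for f, w in zip(freqs_hz, weights)) / sd ** 3
    kurt = sum((f - mean) ** 4 * w for f, w in zip(freqs_hz, weights)) / sd ** 4 - 3.0
    return mean, sd, skew, kurt

def band_means(freqs_hz, power, edges_hz):
    """Mean power in each band [edges_hz[i], edges_hz[i + 1])."""
    means = []
    for low, high in zip(edges_hz[:-1], edges_hz[1:]):
        values = [p for f, p in zip(freqs_hz, power) if low <= f < high]
        means.append(sum(values) / len(values) if values else 0.0)
    return means

# Two hypothetical spectra on a 0-8000 Hz grid (500 Hz steps).
freqs = [500.0 * i for i in range(17)]
one_peak = [10.0 if f == 4000.0 else 1.0 for f in freqs]
two_peaks = [5.5 if f in (2000.0, 6000.0) else 1.0 for f in freqs]
edges = [0.0, 2500.0, 5500.0, 8500.0]  # assumed band edges, illustration only

cog_one = spectral_moments(freqs, one_peak)[0]
cog_two = spectral_moments(freqs, two_peaks)[0]
bands_one = band_means(freqs, one_peak, edges)
bands_two = band_means(freqs, two_peaks, edges)
```

In this toy case the two spectra share the same centre of gravity, while the band summaries show where the energy is concentrated. The higher moments would also differ here, so the example speaks only to the first moment taken alone.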
3.4. Summary
This chapter investigates the canonical spectral qualities of MSA consonants. The
results show that Arabic emphatic/non-emphatic continuant pairs are only slightly and
inconsistently distinguished from each other based on their canonical spectra. As for em-
phatic/non-emphatic stop pairs, their canonical spectra do distinguish them substantially.
However, the fact that there are also dynamic differences between emphatic stops and
non-emphatic ones (in the form of shorter VOTs following the former compared to the
latter) undermines those distinctions. Overall, canonical spectral cues are not considered
a reliable source for acoustic distinctions between emphatics and non-emphatics. The
present experiment, therefore, offers modern and objective support for the earlier studies
cited in Chapter 2, which generally found little spectral shape distinction between
emphatic and non-emphatic consonant pairs.
This chapter also provides experimental support for the claim that Arabic pharyn-
geals are approximants, not fricatives. However, the findings strongly suggest that the
two Arabic uvular continuants should be classified as fricatives. This particular finding
provides further challenges to the phonological views that crucially classify Arabic
uvular continuants as approximants.
Having investigated the acoustic qualities of the consonants themselves, we move
in the next chapter to another possible source for acoustic distinction between emphatics
and non-emphatics: the coarticulatory effect of the consonants in question on neighboring
vowels.
CHAPTER 4
Experiment Two:
Anticipatory and Carryover Consonant-Vowel Coarticulation
4.1 Overview
Experiment One did not yield reliably salient acoustic cues that could distinguish
emphatic sounds from their non-emphatic counterparts. This excludes the canonical spec-
tral qualities of consonants as a potential source for acoustic data that can be used to
achieve the main goals of this study.16 The present experiment investigates the
anticipatory and carryover coarticulatory effects of MSA emphatics, non-emphatics, and
gutturals on adjacent vowels. The main goal is to characterize these effects then compare
and contrast them. We need to find out if these effects represent consistent and reliably
salient cues for the secondary articulation in emphatics. We also need to know if the coar-
ticulatory effects of emphatics on adjacent vowels resemble the effects of gutturals. In
light of the previous acoustic accounts of Arabic emphatics cited in Chapter 2, we predict
the coarticulatory effects of emphatics on adjacent vowels to be very different from
those of non-emphatics. We also predict emphatics and gutturals to show different coar-
ticulatory effects on neighboring vowels. Hence we formulate two test hypotheses. The
first hypothesis being tested is that the acoustic coarticulatory effects of emphatics on
adjacent vowels are different from the effects of non-emphatics. The second hypothesis
being tested is that emphatics and gutturals have different acoustic coarticulatory
effects on adjacent vowels.
16 Of course, the finding in Experiment One that uvular continuants are fricatives, not approximants, is quite crucial to our main goals. See Chapter 6.
The claim that cues for the articulation of consonants can be reflected in their
coarticulatory effects on adjacent vowels follows directly from the central concepts of the
acoustic theory (Fant 1960). The shape of the vocal tract when producing a specific
sound filters the noise energy and gives a specific acoustic signature to that sound. When
moving from one sound to another, the vocal organs cannot snap instantly from one con-
figuration to another. They shift instead from the specific configuration they assume dur-
ing the production of one sound to the configuration necessary for the production of a
following sound. The temporal morphing between the two configurations is preserved as
acoustic transitions from the first sound to the next one. These transitions are quite visi-
bly reflected in the formant frequencies of vowels.
Early perceptual studies on vowel formant transition patterns in synthetic CV syl-
lables have stressed the importance of those transitions as cues to consonant articulation.
Harris (1958) found that, while listeners could separate the sibilant pair [s, ʃ] from the
non-sibilant pair [f, θ] based on frication noise, differentiating between the members of
the second pair depended on the transition in the adjacent vowel. So if [θ] was followed by
a vowel spliced from the syllable [fV], it was perceived as [f], and vice versa. Meanwhile,
the two sibilants [s, ʃ] were almost always accurately classified regardless of the ensuing
vowel. The highly salient noise portions of the sibilants can differentiate them from other
fricatives as well as from each other. Similar results were achieved for the voiced
fricatives [v, ð, z, ʒ].
Liberman et al. (1954) found that listeners were able to distinguish between the
three voiced stops [b, d, g] and, separately, between the three voiceless stops [p, t, k] in
synthetic CV syllables based on the size of formant transitions as well as the type of
following vowel. Somewhat similar observations were obtained for the three nasals
[m, n, ŋ] in synthetic VC syllables, yielding relative consistency in the perception of
consonantal place of articulation based on similar transition sizes and directions
regardless of the consonant type. These findings also indicate that the patterns of
place-cueing formant transitions are similar, in terms of pattern of movement and general
size, in both CV and VC contexts.
Delattre et al. (1955) propose a hypothetical fixed starting point for the second
formant frequency, or locus, that is fixed for every consonant (with certain exceptions)
across vowels but varies from one consonant to another. An F2 transition, therefore, is
basically the path that F2 must travel from that starting point to the steady state of the
vowel. Delattre et al. found that the loci for the stops [b, d] were approximately at 720 Hz
and 1800 Hz, respectively. Further research by Stevens and House (1956) found that
possible F2 loci for velar stops range from 600 Hz to 2500 Hz. Citing articulatory
investigation of English stops by Dembowski (1998), Kent and Read (2002) attribute this variation
to the inconsistent location of the point of constriction for velar stops. Although the
works cited here investigate stops, Delattre et al. suggest that the same results should be
obtained for other consonants. It should be noted though that the value of transition loci
as invariant cues for consonant place of articulation has been questioned. An extensive
investigation by Kewley-Port (1982) of different vowel transition-based metrics, including
formant frequency loci obtained from natural speech, indicates that F2 loci, aside from
those obtained for alveolars, are not very dependable cues for place of articulation across
different vowel contexts. She points out, however, that her data "support the well-known
claim that the direction and extent of the F2 formant transitions is an important place cue
for most but not all of the vowels examined" (p. 386).
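The locus idea described above can be sketched in a few lines: given a fixed F2 locus for a consonant and the F2 steady state of a vowel, the transition is the path between the two. The two locus values are those attributed above to Delattre et al. (1955); the vowel steady-state values used in the usage lines are round, hypothetical figures chosen only for illustration, and the function name is likewise an assumption.

```python
# F2 loci (Hz) for [b] and [d] as reported above for Delattre et al. (1955).
F2_LOCUS_HZ = {"b": 720.0, "d": 1800.0}

def f2_transition(consonant, vowel_f2_hz):
    """Return (extent in Hz, direction) of the F2 path from the locus to the vowel."""
    locus = F2_LOCUS_HZ[consonant]
    extent = vowel_f2_hz - locus
    if extent > 0:
        direction = "rising"
    elif extent < 0:
        direction = "falling"
    else:
        direction = "flat"
    return extent, direction

# Hypothetical steady-state F2 values for a front and a back vowel.
front_vowel_f2 = 2300.0
back_vowel_f2 = 900.0

d_front = f2_transition("d", front_vowel_f2)
d_back = f2_transition("d", back_vowel_f2)
b_back = f2_transition("b", back_vowel_f2)
```

Under these assumed values, the same consonant yields a rising transition into one vowel and a falling transition into another, which is why a fixed locus, rather than the transition shape itself, was proposed as the consonant-specific quantity.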
The correlation between consonant place of articulation and formant transitions
discussed above can be extended to include secondary articulations as well. In fact, the
effects of secondary articulations on the transitions and steady states of neighboring vow-
els are sometimes more prominent than those of primary articulations. The works cited in
Chapter 2 indicate that F2 transitions next to emphatic sounds are always lower than
those next to non-emphatics. In a review of other types of secondary articulations by
Ladefoged and Maddieson (1996), acoustic cues for those articulations were visible on
the spectrograms of neighboring vowels. Contrastive labialization in Pohnpeian generally
causes a drop in F2 transition (and in some contexts F1 transition as well) of an adjacent
vowel. Velarization in Marshallese consonants is also accompanied by a sizable drop in
F2 transition of an adjacent vowel compared to plain consonants. Russian palatalized
consonants are usually associated with a high adjacent F2 transition. The acoustic cues
for secondary articulations are not always equally realized in the anticipatory (VC) and
carryover (CV) directions. In the languages they discussed, Ladefoged and Maddieson
note that while the effects of velarization on vowel transitions are almost equal in both
directions, labialization and palatalization are observed more in the carryover than in the
anticipatory contexts.
In all of the previous examples, the coarticulatory effects of consonantal secon-
dary articulations on adjacent vowels override the effects of primary articulations. It is
possibly a consequence of the tongue assuming the configuration for the secondary ar-
ticulation prior to the release of the consonant into the vowel in CV contexts (or starting
to take shape for the secondary articulation during the preceding vowel prior to the con-
sonantal closure in VC contexts). The articulators, therefore, do not start simply from the
primary articulation configuration and continue into the vowel, but rather start from a
complex articulatory configuration involving two places at once.
4.2 Methods
4.2.1 Subjects
The same five subjects who participated in Experiment One also participated in
this experiment. The experimental phrases for this experiment were intermixed with those
for Experiment One, and the subjects were not aware of the existence of separate experi-
ments.
4.2.2 Stimuli
The set of stimuli for this experiment consisted of real MSA words containing the
four emphatic coronals [tˤ, dˤ, ðˤ, sˤ], their non-emphatic counterparts [t, d, ð, s], the
seven gutturals [q, χ, ʁ, ħ, ʕ, h, ʔ], and the velar stop [k]. The test words present these
sounds along with the three vowels [i, a, u] in #CV and VC# contexts. In compiling the
test paradigm, the same general guidelines for selecting the test words in Experiment One
(explained in §3.2.2) were followed. The only exception to this was the use of geminated
vowels rather than single vowels. The reason for this choice is that geminate vowels gen-
erally have more evident steady state portions than do single vowels. In Arabic, single
vowels are quite short and, in many cases, appear fully transitional, making the
identification of a steady state highly subjective. Since this experiment addresses the
coarticulatory effect of consonants on both the transition and the steady state portions of
adjacent vowels, the frequent lack of well-defined steady state portions in single vowels
makes them less convenient than geminate vowels. The two interdentals [ð, ðˤ] were excluded from
the #CV portion of this experiment since, for most CV combinations, no real words con-
taining these sounds initially followed by geminate vowels were found. The resulting test
paradigm consisted of 90 words ((14 x 3) + (3 x 16)). The set of stimuli is listed in Ap-
pendix B.
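The size of the test paradigm follows from simple counting, which the short sketch below reproduces. The counts are taken from the description of the stimuli above; the variable names are illustrative only.

```python
# Counts taken from the description of the stimuli above.
N_VOWELS = 3            # [i, a, u]
N_CONSONANTS_VC = 16    # all test consonants occur in VC# words
N_CONSONANTS_CV = 14    # the two interdentals are excluded from #CV words

cv_words = N_CONSONANTS_CV * N_VOWELS   # 14 x 3
vc_words = N_VOWELS * N_CONSONANTS_VC   # 3 x 16
total_words = cv_words + vc_words

print(cv_words, vc_words, total_words)
```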
The VC# words were presented in the carrier phrase "?alkalimtu hiya _"
("The word is _"). As was the case in Experiment One, the subjects were instructed to
drop the optional tense and case inflectional suffixes. This was done to avoid the influ-
ence of the suffix sounds (across the consonant) on the vowel. As for the #CV words, the
words were presented in the modified carrier phrase "_ hiya 'lkalimah" ("_is the
word."). This modification is intended to isolate the CV pair at the left edge of the phrase
and avoid any coarticulatory influence from another sound across the consonant. Given
that the test word is topicalized in the test phrase (as if to introduce the word to someone),
it was possible to ask the subjects to drop the case and tense markings in this position as
well. Again, this was done to avoid any interference from other sounds that belong to the
suffixes.
4.2.3 Procedures
It was noted earlier that the material for this experiment was presented to the sub-
jects intermixed with the material for Experiment One. Hence, the procedures for this ex-
periment were the same as those for Experiment One.
4.2.4 Acoustic Analysis
The sound analysis software Praat (Boersma and Weenink 1992) was used to
automatically generate formant tracks using the Burg algorithm. The tracks were calcu-
lated using a succession of 25-ms Gaussian analysis windows. This window length was
found to be ideal, given that shorter windows result in misreads by the formant tracker,
while longer windows risk averaging away short formant transitions. The temporal dis-
tance between the centers of each two analysis windows was 5 ms. The experiment fo-
cuses on the first two formants only. As noted in §2.2.1, El-Dalee (1984) found that
changes in F3 are inconsistent while Giannini and Pettorino (1982) report that F3 loci
next to emphatics are not different from those next to non-emphatics. Formant values
were measured at the end of vowel transitions in VC# sequences and at the beginning of
vowel transitions in #CV sequences as well as at the steady state portion of the vowel in
both environments. These acoustic landmarks were identified based on visual inspection
of the waveform and the spectrogram of the sound token. Auditory verification was
added in cases where the vowel-consonant boundary was difficult to pinpoint. Figure 4.1
illustrates the locations of the acoustic landmarks of interest to this experiment. After
their identification, the landmarks were recorded as time points and annotated to a Praat
TextGrid file. A specially written Praat script referred to these files and automatically re-
corded the values of F1 and F2. For vowel transitions, formant readings were made 2.5
ms inside the vowel from the time point recorded in the TextGrid file rather than exactly
at the vowel-consonant boundary. This modification was done to avoid false formant val-
ues as a result of incorrect tracking of the formants or incorrect averaging of vowel-edge-
based and consonant-edge-based formant reading points. The script then stores the for-
mant readings as a text file. The text file was then converted to proper formats of the
spreadsheet software Microsoft Excel (Microsoft Corp. 1985) and the statistical analysis
software SPSS (SPSS, Inc. 1989).
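The measurement step described above can be sketched as follows. This is not the Praat script used in the experiment; it is a minimal Python illustration that assumes a formant track is available as a list of frames spaced 5 ms apart and that a reading is taken from the frame nearest to the shifted time point. The frame-picking rule, the data layout, and all names are assumptions.

```python
INSET_S = 0.0025  # 2.5 ms inside the vowel, as described above

def reading_time(boundary_s, context):
    """Shift a vowel-consonant boundary time 2.5 ms into the vowel."""
    if context == "CV":   # the vowel follows the boundary
        return boundary_s + INSET_S
    if context == "VC":   # the vowel precedes the boundary
        return boundary_s - INSET_S
    raise ValueError("context must be 'CV' or 'VC'")

def nearest_frame(track, time_s):
    """Return the (time_s, f1_hz, f2_hz) frame closest in time to time_s."""
    return min(track, key=lambda frame: abs(frame[0] - time_s))

# Hypothetical formant track with frames every 5 ms: (time_s, F1_hz, F2_hz).
track = [
    (0.100, 300.0, 2050.0),
    (0.105, 310.0, 2120.0),
    (0.110, 318.0, 2200.0),
]

onset_frame = nearest_frame(track, reading_time(0.1035, "CV"))
offset_frame = nearest_frame(track, reading_time(0.1085, "VC"))
```

Moving the reading point inward in this way keeps the measurement on the vowel side of the boundary, where the formant tracker is less likely to be reading consonantal energy.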
Figure 4.1. Cursor locations at vowel steady states in the CV (b) and VC (c) contexts as well as at the vowel transition edges in the two contexts (a and d, respectively).
4.2.5 Reliability
To estimate the intra-judge reliability of formant measurements, 63 CV tokens
and 72 VC tokens (10% of the total files in both cases) were randomly selected by
random-number-generating software and re-analyzed following the same procedures
explained in §4.2.4 above. In terms of F1 values at the transition and the steady state
portions, the correlations between the original and the retested tokens were above 0.98 for
both the CV and the VC environments. Agreements within 50 Hz were between 94.4%
and 100%. As for F2 values at the transition and the steady state portions, the correlations
between the original and the retested tokens were above 0.99 for both phonetic environ-
ments. Agreements within 50 Hz were between 81.8% and 100%. The measurements
were judged reliable.
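The two reliability measures reported above, a correlation between original and repeated measurements and the percentage of pairs agreeing within 50 Hz, can be computed as in the sketch below. The formula is the ordinary Pearson correlation, which is an assumption, since the text does not name the coefficient. The measurement values are hypothetical and serve only to exercise the functions.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def percent_within(x, y, tolerance_hz=50.0):
    """Percentage of measurement pairs that differ by at most tolerance_hz."""
    hits = sum(1 for a, b in zip(x, y) if abs(a - b) <= tolerance_hz)
    return 100.0 * hits / len(x)

# Hypothetical F2 readings (Hz): first pass and re-analysis of the same tokens.
original = [2294.0, 1306.0, 2095.0, 1270.0, 1578.0]
retest = [2300.0, 1290.0, 2150.0, 1275.0, 1570.0]

r = pearson_r(original, retest)
agreement = percent_within(original, retest)
```

The two measures are complementary: a correlation can stay high even when individual readings drift by a fixed amount, whereas the within-tolerance percentage is sensitive to such shifts.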
4.3 Results
Tables 4.1, 4.2, and 4.3 list the averaged formant frequency values for the three
vowels at all measurement locations in the VC and CV contexts. For each of the three
vowels [i, u, a] in the VC sequence, a set of four one-way ANOVAs was conducted for
all 16 consonants across all subjects. The dependent variable for the first ANOVA was
the frequency value at the steady state portion of F1 (F1vowel), for the second it was the
frequency value at the steady state portion of F2 (F2vowel), for the third it was the
frequency value at the offset of F1 (F1offset), and for the fourth it was the frequency value at
the offset of F2 (F2offset). A similar set of ANOVAs was conducted for the three vowels in
the CV sequences for only 14 consonants for the reasons noted in §4.2.2. The dependent
variables for these ANOVA tests were the frequency values at the onsets of F1 (F1onset)
and F2 (F2onset) as well as at the steady state portions of F1 and F2. Scheffé post hoc
pair-wise comparisons were conducted as well to establish objective comparisons between the
mean formant frequency values next to all consonants.
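As a minimal sketch of the analysis just described, the code below computes a one-way ANOVA F statistic from grouped values and converts an F statistic with its degrees of freedom into an adjusted R-squared. The analysis itself was run in SPSS; this code is only an illustration, and the toy groups are hypothetical. The idea that the R2 values reported in §4.3.1 are adjusted R-squared values is an inference from the arithmetic, not something the text states.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

def adjusted_r2_from_f(f_stat, df_between, df_within):
    """Adjusted R-squared implied by a one-way ANOVA F statistic."""
    r2 = f_stat * df_between / (f_stat * df_between + df_within)
    return 1.0 - (1.0 - r2) * (df_between + df_within) / df_within

# Hypothetical toy data: three groups of three values each.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
toy_f, toy_df1, toy_df2 = one_way_anova(toy_groups)

# F(15,224) = 16.812 is the value reported for F1vowel of [i] in section 4.3.1.1.
implied_r2 = adjusted_r2_from_f(16.812, 15, 224)
```

With 16 consonant groups and 240 observations, the degrees of freedom are 15 and 224, which matches the F(15,224) values reported for the VC analyses. Under the adjusted R-squared reading, the F value of 16.812 implies a value of about 0.498.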
Table 4.1. Average formant frequency values (Hz) for the vowel [i] obtained at mid-vowel and transition edge locations in both VC and CV contexts containing all 16 consonants. The two interdentals [ð] and [ðˤ] were not included in the CV contexts.
        VC                                     CV
        F1vowel  F1offset  F2vowel  F2offset   F1vowel  F1onset  F2vowel  F2onset
[t]     318      296       2294     2099       379      307      2386     2285
[tˤ]    386      437       2354     1306       360      436      2310     1408
[d]     308      279       2341     2095       331      290      2293     2088
[dˤ]    339      393       2251     1270       362      403      2200     1243
[ð]     331      325       2264     1807
[ðˤ]    354      354       2224     1179
[s]     334      326       2286     2102       321      292      2327     2105
[sˤ]    386      355       2364     1578       365      387      2294     1481
[k]     320      284       2267     2319       332      287      2309     2309
[q]     351      472       2261     1538       368      423      2293     1922
[χ]     363      344       2282     1878       369      351      2263     2193
[ʁ]     347      427       2227     1478       350      390      2342     1736
[ħ]     340      441       2261     2120       345      373      2330     2216
[ʕ]     322      547       2257     1834       346      453      2310     2013
[h]     330      317       2342     2358       324      302      2334     2328
[ʔ]     309      324       2335     2260       320      320      2306     2293
Table 4.2. Average formant frequency values (Hz) for the vowel [a] obtained at mid-vowel and transition edge locations in both VC and CV contexts containing all 16 consonants. The two interdentals [ð] and [ðˤ] were not included in the CV contexts.
        VC                                     CV
        F1vowel  F1offset  F2vowel  F2offset   F1vowel  F1onset  F2vowel  F2onset
[t]     730      495       1594     1730       730      510      1594     1761
[tˤ]    722      544       1205     1084       757      615      1179     1074
[d]     712      419       1626     1724       688      423      1679     1800
[dˤ]    725      492       1216     1045       704      506      1094     1029
[ð]     694      440       1586     1566
[ðˤ]    725      474       1187     1032
[s]     697      533       1631     1719       727      441      1610     1704
[sˤ]    744      680       1139     1185       751      577      1186     1052
[k]     715      566       1572     1860       737      499      1639     1876
[q]     747      639       1442     1196       759      406      1201     775
[χ]     737      750       1511     1446       761      672      1288     1238
[ʁ]     727      510       1497     1310       736      534      1253     1212
[ħ]     745      808       1601     1691       766      782      1577     1621
[ʕ]     733      778       1633     1590       769      803      1594     1606
[h]     712      766       1599     1632       748      696      1559     1641
[ʔ]     706      750       1619     1556       708      767      1608     1619
Table 4.3. Average formant frequency values (Hz) for the vowel [u] obtained at mid-vowel and transition edge locations in both VC and CV contexts containing all 16 consonants. The two interdentals [ð] and [ðˤ] were not included in the CV contexts.
        VC                                     CV
        F1vowel  F1offset  F2vowel  F2offset   F1vowel  F1onset  F2vowel  F2onset
[t]     376      348       823      1564       378      325      960      1310
[tˤ]    413      404       885      963        426      431      812      921
[d]     360      307       935      1642       370      329      920      1713
[dˤ]    413      385       821      863        405      406      848      979
[ð]     365      320       997      1541
[ðˤ]    414      355       805      894
[s]     425      315       756      1430       403      338      962      1464
[sˤ]    383      389       765      1083       421      388      859      967
[k]     390      341       745      808        402      346      868      937
[q]     386      423       914      747        404      412      823      769
[χ]     394      404       898      857        402      390      806      814
[ʁ]     423      384       759      828        411      414      815      738
[ħ]     405      448       783      1398       396      427      928      1248
[ʕ]     398      504       841      1441       395      460      945      1295
[h]     382      376       750      891        385      340      814      911
[ʔ]     377      398       899      875        382      382      795      793
[Figure 4.2 panels: [Vt], [Vtˤ], [Vd], [Vdˤ], [Vs], [Vsˤ], [Vð], [Vðˤ]. Vertical axis: frequency (Hz), 0 to 2500. Horizontal axis: mid and offset measurement points for [i], [a], and [u]. Plotted series: F1 and F2.]
Figure 4.2. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] preceding the four Arabic plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ]. Formant values are averages across speakers. The error bars represent ± one standard deviation.
[Figure 4.3 panels: [Vk], [Vχ], [Vħ], [Vh], [Vq], [Vʁ], [Vʕ], [Vʔ]. Vertical axis: frequency (Hz), 0 to 2500. Horizontal axis: mid and offset measurement points for [i], [a], and [u]. Plotted series: F1 and F2.]
Figure 4.3. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] preceding the velar [k], the three uvulars [q, χ, ʁ], the two pharyngeals [ħ, ʕ] and the two laryngeals [h, ʔ]. Formant values are averages across speakers. The error bars represent ± one standard deviation.
4.3.1 Anticipatory (VC) Coarticulation
Figures 4.2 and 4.3 represent simplified formant transition tracks of the vowels [i,
u, a] when preceding the 16 consonants. These tracks are basically interpolation lines
connecting the mean formant frequency values at the middle of the steady state portion
and at the offset of the vowel. Error bars (± one standard deviation) were added at the
data points.
4.3.1.1 Anticipatory Coarticulation in F1
The statistical analysis of variance shows that, for the vowel [i], main effects of
consonant type are obtained on both F1vowel [F(15,224) = 16.812, p < 0.001, R² = 0.498]
and F1offset [F(15,224) = 34.429, p < 0.001, R² = 0.677]. The subsequent Scheffé pair-wise
post hoc comparisons are summarized in Tables 4.4 and 4.5. In terms of F1vowel values, the
comparisons yield a small number of significant differences between different consonants.
F1vowel is generally higher (albeit only slightly in most cases) next to emphatics and
uvulars compared to the rest of the sounds. F1vowel values before [tˤ] and [sˤ] are
significantly higher than those before their non-emphatic counterparts (p < 0.01), while
[dˤ] and [ðˤ] are not significantly different from their non-emphatic counterparts in terms
of preceding F1vowel (p > 0.4). F1offset is generally higher next to emphatics and gutturals
as opposed to plain oral consonants. Most pair-wise comparisons of F1offset values do not
reflect significant differences. There seems to be an association between high F1offset
values and the four gutturals [q, ʁ, ħ, ʕ], especially the latter. Emphatics are also
occasionally correlated with high F1offset values. However, this correlation is not
consistent across all emphatics. The two laryngeals are not associated with high F1offset values.
Table 4.4. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1vowel values of the vowel [i] in the context [iC].
[Pairwise significance matrix for the 16 consonants [ð, ðˤ, t, tˤ, d, dˤ, s, sˤ, k, q, χ, ʁ, ħ, ʕ, h, ʔ]; the cell alignment did not survive reproduction.]
Table 4.5. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1offset values of the vowel [i] in the context [iC].
[Pairwise significance matrix for the 16 consonants [ð, ðˤ, t, tˤ, d, dˤ, s, sˤ, k, q, χ, ʁ, ħ, ʕ, h, ʔ]; the cell alignment did not survive reproduction.]
* p < 0.05   ** p < 0.01   *** p < 0.001   - no significant difference
For [a], no main effect of consonant type on F1vowel is obtained [F(15,224) = 1.067,
p > 0.3, R² = 0.004]. Accordingly, as shown in Table 4.6, pair-wise comparisons indicate
that the values of F1vowel next to all 16 consonants are not significantly different from
each other (p > 0.9). By contrast, a main effect on F1offset is obtained [F(15,224) = 34.138,
p < 0.001, R² = 0.675]. The subsequent Scheffé pair-wise post hoc comparisons are
summarized in Table 4.7. There is a strong correlation between F1offset values of [a] and
the low gutturals (pharyngeals and laryngeals), the uvular [χ], and the emphatic [sˤ].
F1offset values next to those sounds are significantly higher than those next to most of the
other sounds (p < 0.01). The rest of the consonants are not significantly different from
each other in terms of preceding F1offset values.
For [u], main effects of consonant type are obtained on both F1vowel [F(15,224) =
7.047, p < 0.001, R² = 0.275] and F1offset [F(15,224) = 15.268, p < 0.001, R² = 0.472].
However, the subsequent Scheffé post hoc comparisons (summarized in Tables 4.8 and
4.9) reveal only a few significant pair-wise differences among the F1vowel and F1offset
values next to the 16 consonants. Aside from the alveolar pair [s, sˤ], emphatics are
generally accompanied by slightly higher F1vowel values than their non-emphatic
counterparts. The differences are not significant for the pairs [t, tˤ] and [ð, ðˤ] (p > 0.15)
and only marginally significant for the pair [d, dˤ] (p < 0.1). As for F1offset, the value
before the voiceless pharyngeal [ħ] is consistently higher compared to those before plain
oral consonants, while the value before [ʕ] is significantly higher than those before most
of the other consonants (p < 0.05). Among the three uvulars, only [q] is distinguished from
some of the plain orals by a higher F1offset value. Meanwhile, the plain orals [t], [d], [ð],
[s], and [k] are accompanied by the lowest F1offset values.
Table 4.6. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1vowel values of the vowel [a] in the context [aC].
[Pairwise significance matrix for the 16 consonants [ð, ðˤ, t, tˤ, d, dˤ, s, sˤ, k, q, χ, ʁ, ħ, ʕ, h, ʔ]; the cell alignment did not survive reproduction.]
Table 4.7. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1offset values of the vowel [a] in the context [aC].
[Pairwise significance matrix for the 16 consonants [ð, ðˤ, t, tˤ, d, dˤ, s, sˤ, k, q, χ, ʁ, ħ, ʕ, h, ʔ]; the cell alignment did not survive reproduction.]
* p < 0.05   ** p < 0.01   *** p < 0.001   - no significant difference
Table 4.8. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1vowel values of the vowel [u] in the context [uC].
[Pairwise significance matrix for the 16 consonants [ð, ðˤ, t, tˤ, d, dˤ, s, sˤ, k, q, χ, ʁ, ħ, ʕ, h, ʔ]; the cell alignment did not survive reproduction.]
Table 4.9. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F1offset values of the vowel [u] in the context [uC].
[Pairwise significance matrix for the 16 consonants [ð, ðˤ, t, tˤ, d, dˤ, s, sˤ, k, q, χ, ʁ, ħ, ʕ, h, ʔ]; the cell alignment did not survive reproduction.]
* p < 0.05   ** p < 0.01   *** p < 0.001   - no significant difference
4.3.1.2 Anticipatory Coarticulation in F2
Main effects of consonant type on both F2vowel [F(15,224) = 2.338, p < 0.01, R² =
0.077] and F2offset [F(15,224) = 162.479, p < 0.001, R² = 0.910] are obtained for [i].
However, F2vowel values next to all 16 consonants are not significantly different from each
other, as shown by the Scheffé post hoc comparisons in Table 4.10. The distinction between
emphatics and non-emphatics is clearly reflected by F2offset values, as shown in Table 4.11.
The four emphatics as well as the two uvulars [q] and [ʁ] cause substantial drops in F2
transitions of [i], resulting in significantly lower F2offset values than the rest of
the consonants (p < 0.05). The mid-range F2offset values preceding the two gutturals [χ]
and [ʕ] as well as the interdental [ð] distinguish them from most of the remaining
consonants, while the values before [ħ] and [ʔ] are in the same range as those before plain
orals. The highest F2offset value is the one preceding [h], distinguishing this sound from
most of the other sounds.
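The size of the F2 drop described above can be read directly off Table 4.1. The sketch below takes the VC-context F2vowel and F2offset means for [i] next to the four plain coronals and their emphatic counterparts and computes the change from mid-vowel to offset. The values are copied from Table 4.1; the ASCII labels (for example "t_emph" for the emphatic counterpart of [t], "dh" for the interdental) and the function names are assumptions made for readability.

```python
# (F2vowel, F2offset) in Hz for [i] in the VC context, copied from Table 4.1.
F2_VC_I = {
    "t": (2294, 2099), "t_emph": (2354, 1306),
    "d": (2341, 2095), "d_emph": (2251, 1270),
    "dh": (2264, 1807), "dh_emph": (2224, 1179),
    "s": (2286, 2102), "s_emph": (2364, 1578),
}

PLAIN = ["t", "d", "dh", "s"]
EMPHATIC = [c + "_emph" for c in PLAIN]

def f2_change(consonant):
    """F2 change (Hz) from mid-vowel to vowel offset; negative means a drop."""
    mid, offset = F2_VC_I[consonant]
    return offset - mid

def mean_change(consonants):
    """Mean F2 change (Hz) across a list of consonants."""
    return sum(f2_change(c) for c in consonants) / len(consonants)

plain_mean = mean_change(PLAIN)
emphatic_mean = mean_change(EMPHATIC)
```

On these table means, F2 falls by roughly 270 Hz on average before the plain coronals and by roughly 965 Hz before the emphatics. This is simple arithmetic on averaged values and does not replace the significance tests reported in the text.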
For [a], significant main effects of consonant type are obtained on both F2vowel
[F(15,224) = 144.518, p < 0.001, R² = 0.900] and F2offset [F(15,224) = 146.468, p < 0.001,
R² = 0.901]. The subsequent Scheffé post hoc comparisons are summarized in Tables
4.12 and 4.13. The values of F2vowel are significantly lower before the four emphatics [tˤ],
[dˤ], [ðˤ], and [sˤ] than before any other consonant (p < 0.001). The two uvulars [χ] and
[ʁ] also cause significant lowering of F2vowel that is generally milder than the lowering
caused by emphatics. In several cases, these two uvulars are distinguished from
non-emphatics and low gutturals. The uvular stop [q] is accompanied by a lower mid-range
F2vowel that distinguishes it from all other sounds except the other uvulars. Plain orals,
pharyngeals, and laryngeals are mostly not distinctly different from each other in terms of
Table 4.10. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2vowel values of the vowel [i] in the context [iC].
[o] [01'] [t] [t\'] [d] [s] [sl'] [k] [q] [X] [B"] [h] [CJ] [h] [?]
[oJ
[t l
[d]
[s]
[k] [q] [X]
[h] [CJ] [h] [?]
Table 4.11. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2offset values of the vowel [i] in the context [iC].
[oJ [01'] [t l [t\'] [d] [s] [ [k] [q] [X] [B' l [h] [CJ] [h] [?] [oJ [()I'] *** [t l *** *** [t\'] *** *** [d] *** *** ***
*** *** *** [s] *** *** *** *** [s\'] * *** *** *** *** *** *** [k] *** *** *** * *** *** [q] ** *** *** * *** ** *** *** [X] *** * *** *** * *** *** *** [B'] *** *** *** *** *** *** *** [h] *** *** *** *** *** *** * *** ['l] *** ** *** ** *** ** ** *** *** *** *** [h] *** *** ** *** ** *** ** *** *** *** *** * *** [?] *** *** *** *** *** *** *** *** *** * p < 0.05 ** p < 0.01 *** p < 0.001 - no significant difference
Table 4.12. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2vowel values of the vowel [a] in the context [aC].
[oJ [o\'J [t] [dJ [di'J [sJ [kl [qJ [xl [H'] [hJ ['i'J [hJ [?J
[oJ [oi'J *** [t l *** [tl'] *** *** [d] *** ***
-- -------------------------
[dl'] *** *** *** [s] *** *** *** [sl'] *** *** *** *** [k] *** *** *** *** [q] *** *** *** *** *** *** *** *** ** [X] *** *** * *** * *** [H'] *** *** ** *** *** *** [h] *** *** *** *** *** ['I] *** *** *** *** *** ** *** [h] *** *** *** *** *** [?] *** *** *** *** *** **
Table 4.13. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2offset values of the vowel [a] in the context [aC].
[o] WJ [t] [d] [s] [si·J [k] [q] [Xl [H"] [h] ['i'] [h] [?]
[oJ [oi'J *** [t] ***
*** *** [d] *** ***
*** *** *** [s] *** *** ***
*** *** *** *** [k] *** *** *** *** *** [q] *** * *** *** *** *** [X] *** *** *** *** *** *** *** *** *** [K] *** *** *** *** *** *** *** *** [h] *** *** *** *** * *** *** *** [l] *** *** *** *** *** *** *** [h] *** *** *** *** *** *** ** *** [?] *** * *** * *** *** *** *** *** * p < 0.05 **p<0.01 *** p < 0.001 - no significant difference
F2vowel value of the preceding [a]. The F2offset values accompanying the four emphatics as
well as the uvular [q] are significantly lower than those accompanying almost all of the
remaining consonants (p < 0.001). The values of F2offset before [χ] and [ʁ] are not as low,
but are still significantly lower than those before several other consonants. These
mid-range F2offset values distinguish the uvular fricatives from the majority of the
remaining sounds. The plain orals [t, d, ð, s, k] and the lower gutturals [ħ, ʕ, h, ʔ] are
accompanied by the highest F2offset values. These two classes are mostly not distinct from
each other.
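The Scheffé criterion behind the pair-wise comparisons in these tables can be stated compactly. The formula below is the standard textbook form rather than a quotation from this study; the values k = 16 and N = 240 are read off the reported degrees of freedom F(15,224), and equal cell sizes (15 tokens per consonant) are an assumption.

```latex
F_S = \frac{(\bar{x}_i - \bar{x}_j)^2}{MS_W \left( \frac{1}{n_i} + \frac{1}{n_j} \right)},
\qquad
\text{significant at level } \alpha \text{ if } F_S > (k - 1)\, F_{\alpha;\, k-1,\, N-k}
```

Here \bar{x}_i and \bar{x}_j are the mean formant values for two consonants, MS_W is the within-group mean square from the analysis of variance, and n_i, n_j are the numbers of tokens per consonant. Because the critical value is multiplied by k − 1 = 15, the criterion is conservative for simple pair-wise contrasts.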
Significant main effects of consonant type on both F2vowel [F(15,224) = 15.134, p
< 0.001, R² = 0.470] and F2offset [F(15,224) = 146.124, p < 0.001, R² = 0.901] are also
obtained for [u]. The post hoc comparisons, summarized in Tables 4.14 and 4.15, show that
the values of F2vowel establish only a few significant differences between the consonants.
Before the emphatic [ðˤ], F2vowel is significantly lower than before [ð] (p < 0.001), but
this is mainly because [ð] is preceded by the highest F2vowel among all 16 consonants. All
other emphatic/non-emphatic pairs are not significantly different from each other, even
though the stop [d] is also preceded by a substantially high F2vowel. The picture is
different for F2offset values. Here, the values before all emphatics are significantly
lower than those before their non-emphatic counterparts (p < 0.001). Emphatics, uvulars,
laryngeals, and the velar [k] are accompanied by lower F2offset values than the remaining
sounds. Among that group, the three uvulars and [k] are associated with the lowest F2offset
values. Plain coronals and pharyngeals are accompanied by the highest F2offset values and
are mostly not distinct from each other.
Table 4.14. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2vowel values of the vowel [u] in the context [uC].
[oJ [cFJ [tJ [dJ [sJ [kJ [qJ [Xl [K] [nJ ['i'J [hJ [?J [oJ
[t] ***
[d] ***
[s] *** *** *** **
[k] *** *** [q] ** * ** [X] * [K] *** *** * [nJ *** * ['i'] * [h] *** *** ** * [?] * * *
Table 4.15. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 16 consonants in terms of F2offset values of the vowel [u] in the context [uC].
[oJ Wl [tJ [dJ [di'J [s] [kJ [qJ [Xl [H] [nJ ['i'J [hJ [?J [oJ [61'] *** [t l *** [tl'] *** *** [d] *** ***
*** *** *** [s] *** *** * ***
*** *** *** ** *** [k] *** *** *** *** *** [q] *** *** ** *** *** *** [X] *** *** *** *** ** [K] *** *** *** *** *** [nJ *** *** *** *** *** *** *** *** *** ['I] *** *** * *** *** *** *** *** *** [h] *** *** *** *** * *** *** [7] *** *** *** *** * *** *** * p < 0.05 ** p < 0.01 *** p < 0.001 -no significant difference
In sum, F1vowel values make very few emphatic/non-emphatic distinctions. These
values do not distinguish gutturals as a class from non-gutturals. High F1offset values are
consistently associated with pharyngeals. High F1offset values are also occasionally
associated with emphatics and uvulars, but this association is not as consistent as the
association with pharyngeals. Laryngeals are associated with high F1offset values in the
vowel [a] alone. F2vowel distinguishes emphatics from their non-emphatic counterparts in
the vowel [a] alone. F2offset, meanwhile, is very capable of distinguishing emphatics from
non-emphatics. Uvulars are also associated with low F2offset values. These values are
either low or in the mid-range in [i] and [a]. In [u], F2offset values preceding uvulars are
lower than those preceding all other sounds, including emphatics. Pharyngeals are
preceded by F2offset values that are slightly lower than, or in the same range as, those
preceding plain coronals. F2offset values preceding laryngeals depend on the vowel, since
these sounds have no particular coarticulatory effects on adjacent vowels.
4.3.1.3 Anticipatory Coarticulation - Discriminant Analysis
For the most part, the previous two sections indicate that the values of F1 and F2
at the vowel steady state portions reflect only minor distinctions among consonants. In
contrast, F1 and F2 transitions reflect some systematic distinctions. To assess the
capabilities of F1offset and F2offset as acoustic coarticulatory metrics for categorizing
the consonant classes being investigated, a number of discriminant analysis tests were
performed in which these values were used as predictors. Rather than looking at the 16
consonants individually, the categorization tasks were done for classes of sounds. The four
emphatics were grouped together and so were the four non-emphatic coronals, the three
uvulars, and the two pharyngeals. The velar [k] and the two laryngeals were excluded from
these tests since their F1offset and F2offset values are highly dependent on the vowel type.
Table 4.16. Discriminant analysis results for the four classes of Arabic sounds, emphatics, plain coronals, pharyngeals, and uvulars based on the values of F1 transitions in VC contexts.

                    Predicted Group Membership
Vowel   Original   Emph.   Phar.   Plain.   Uvu.    Total
[i]     Emph.      41.7    11.7    21.7     25.0    100.0
        Phar.      13.3    73.3     3.3     10.0    100.0
        Plain.     21.7     0.0    78.3      0.0    100.0
        Uvu.       28.9    28.9    15.6     26.7    100.0
        x̄ = 54.4%
[a]     Emph.      25.0    10.0    48.3     16.7    100.0
        Phar.       0.0    83.3     0.0     16.7    100.0
        Plain.     13.3     0.0    70.0     16.7    100.0
        Uvu.       24.4    26.7    15.6     33.3    100.0
        x̄ = 49.7%
[u]     Emph.      26.7    10.0    25.0     38.3    100.0
        Phar.       6.7    66.7     6.7     20.0    100.0
        Plain.     25.0     0.0    71.7      3.3    100.0
        Uvu.       11.1    26.7    20.0     42.2    100.0
        x̄ = 50.3%
Results of the first three discriminant analysis tests are summarized in Table 4.16.
In each test, F1offset of one of the three individual vowels serves as the predictor. The
overall correct classification rates predicted by F1offset are not higher than 54.4%. The
results show that only pharyngeals and plain coronals are relatively well categorized in
the three vowel environments. F1offset values before these two classes of sounds are at the
two ends of the scale: pharyngeals are preceded by the highest values while plain coronals
are preceded by the lowest. The high classification rate of plain coronals is especially
notable given that the analysis of variance tests reported earlier do not particularly
distinguish these sounds based on the F1offset values preceding them. Emphatics and uvulars
are frequently misclassified as each other. This is due to the mid-range F1offset values of
these sounds. It appears that the mid-range values preceding emphatics are not distinct
enough from those preceding their plain counterparts. For this reason, emphatics are also
frequently misclassified as plain coronals. Meanwhile, emphatics are not frequently
misclassified as pharyngeals. Based on F1offset values, uvulars are misclassified as
pharyngeals above the 25% chance level in all three vowels. The most likely cause for this
is the rather consistently high F1offset values preceding the uvular stop [q].
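The overall rate reported under each vowel in Table 4.16 can be recovered from the per-class rates once class sizes are taken into account. The sketch below does this for the [i] context. The class sizes are an assumption, inferred as 15 tokens per consonant from the degrees of freedom reported for the analyses of variance; the names used are illustrative only.

```python
# Per-class correct classification rates (%) for F1offset in the [i] context,
# taken from the diagonal of Table 4.16.
CORRECT_RATE_PERCENT = {
    "emphatic": 41.7,
    "pharyngeal": 73.3,
    "plain": 78.3,
    "uvular": 26.7,
}

# ASSUMPTION: tokens per class, inferred as 15 tokens per consonant
# (4 emphatics, 2 pharyngeals, 4 plain coronals, 3 uvulars).
ASSUMED_CLASS_SIZES = {
    "emphatic": 60,
    "pharyngeal": 30,
    "plain": 60,
    "uvular": 45,
}


def overall_correct_rate(rate_percent, class_sizes):
    """Pool per-class hit rates into one overall percentage, weighted by class size."""
    correct_tokens = sum(
        round(rate_percent[name] / 100.0 * class_sizes[name]) for name in class_sizes
    )
    return 100.0 * correct_tokens / sum(class_sizes.values())


print(round(overall_correct_rate(CORRECT_RATE_PERCENT, ASSUMED_CLASS_SIZES), 1))
```

Weighting by class size matters here because the four classes are not equally large; a simple average of the four diagonal percentages would give a slightly different figure.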
Table 4.17. Discriminant analysis results for the four classes of Arabic sounds, emphatics, plain coronals, pharyngeals, and uvulars based on the values of F2 transitions in VC contexts.

                    Predicted Group Membership
Vowel   Original   Emph.   Phar.   Plain.   Uvu.    Total
[i]     Emph.      75.0     3.3     0.0     21.7    100.0
        Phar.       0.0    33.3    46.7     20.0    100.0
        Plain.      0.0    26.7    61.7     11.7    100.0
        Uvu.       26.7     8.9    11.1     53.3    100.0
        x̄ = 59.5%
[a]     Emph.      83.3     0.0     0.0     16.7    100.0
        Phar.       0.0    50.0    43.3      6.7    100.0
        Plain.      0.0    43.3    55.0      1.7    100.0
        Uvu.       20.0    11.1     0.0     68.9    100.0
        x̄ = 66.2%
[u]     Emph.      60.0     6.7     0.0     33.3    100.0
        Phar.       0.0    63.3    36.7      0.0    100.0
        Plain.      0.0    33.3    66.7      0.0    100.0
        Uvu.       24.4     0.0     0.0     75.6    100.0
        x̄ = 66.2%
Table 4.17 shows the classification results of the three discriminant analysis tests
in which F2offset values of the three individual vowels serve as the predictors. The overall
classification rates, which range between 59.5% and 66.2%, are noticeably higher than
those obtained with F1offset values. After [i] and [a], the classification rates of
emphatics (75% and 83.3%, respectively) are substantially higher than the classification
rates of the other sounds. This is a predictable outcome given that emphatics are
consistently preceded by very low F2offset values in comparison with the other sounds. After
[u], emphatics are accurately classified in 60% of their actual occurrences. Notably, in
all three vowel contexts, there were no cases of misclassification of emphatics as plain
coronals or vice versa. Uvulars are also relatively well classified after [a] and, more
evidently, after [u]. Recall from the previous section that F2offset values of [u] were
lower before uvulars than even those before emphatics. It seems that this association is
strong enough to profile uvulars as a class next to [u]. Overall, emphatics and uvulars are
frequently misclassified as each other. This follows from the fact that F2offset values
before these two classes are lower than those before plain orals and pharyngeals. The
latter two classes are also frequently misclassified as each other.
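The pattern of confusions described above follows from how a discriminant rule behaves with one predictor. Assuming equal prior probabilities and a shared within-class variance, a linear discriminant rule with a single predictor reduces to assigning each token to the class with the nearest mean. The sketch below illustrates this; the class means are hypothetical values chosen only to mimic the ordering described in the text, not measurements from this study.

```python
# HYPOTHETICAL class means of F2offset in Hz, chosen only to mimic the ordering
# described in the text (emphatics and uvulars low, plain coronals and
# pharyngeals high). They are not measurements from this study.
HYPOTHETICAL_MEAN_F2_OFFSET_HZ = {
    "emphatic": 1100.0,
    "uvular": 1300.0,
    "plain": 1700.0,
    "pharyngeal": 1750.0,
}


def classify_by_nearest_mean(f2_offset_hz, class_means):
    """Assign a token to the class whose mean is closest to its F2offset value."""
    return min(class_means, key=lambda name: abs(f2_offset_hz - class_means[name]))


# A low-F2 token falls between the two low-F2 classes.
print(classify_by_nearest_mean(1190.0, HYPOTHETICAL_MEAN_F2_OFFSET_HZ))
# A high-F2 token falls between the two high-F2 classes.
print(classify_by_nearest_mean(1730.0, HYPOTHETICAL_MEAN_F2_OFFSET_HZ))
```

When two class means lie close together, small token-to-token variation is enough to move a token across the boundary between them, while classes at opposite ends of the scale are essentially never exchanged.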
The analysis of variance and discriminant function analysis tests show that F1offset
cannot classify emphatics as a class of sounds distinct from their non-emphatic
counterparts (plain coronals). F2offset, meanwhile, is shown to be capable of accurately
classifying the two classes. To verify these findings, another discriminant analysis test
was conducted. In this test, which involved only the classes of emphatics and plain
coronals, F1offset and F2offset in all three vowels were combined and used together as
predictors. The test shows a very high overall accurate classification rate of 93.1%.
Inspecting the standardized canonical discriminant function coefficients reveals that this
high rate is largely contributed by F2offset. Notice that when we used F2offset values alone
earlier, there were no cases of misclassification between emphatics and plain coronals.
Adding F1offset values as a predictor actually drops the classification rate slightly.
The results show that F2offset is, by far, the most solid acoustic cue for the secondary
articulation in emphatics in the VC context. Emphatics are consistently and reliably
associated with low F2offset values in preceding vowels. F1offset does not have any
significant role in distinguishing emphatic sounds from plain coronals. Overall,
pharyngeals are preceded by high F1offset values. For the most part, uvulars are associated
with mid-range values for both F1offset and F2offset in preceding vowels. There are two
exceptions to this trend. First, the uvular stop [q] is associated with F1offset values that
are as high as those preceding pharyngeals. Second, after [u], all uvulars are associated
with F2offset values that are even lower than those preceding emphatics.
4.3.2 Carryover (CV) Coarticulation in F1
Figures 4.4 and 4.5 represent simplified formant transition tracks of the vowels [i,
u, a] when following the 14 consonants covered in this portion of the study.
[Figure 4.4 plot: panels [tV], [dV], and [sV]; y-axis frequency (Hz), 0 to 2500; x-axis vowel onset and midpoint for [i], [a], and [u]; series F1 and F2.]
Figure 4.4. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] following the three Arabic plain coronals [t, d, s] and their emphatic counterparts [tˤ, dˤ, sˤ]. Formant values are averages across speakers. The error bars represent ± one standard deviation.
[Figure 4.5 plot: panels [kV], [qV], [χV], [ʁV], [ħV], [ʕV], [hV], and [ʔV]; y-axis frequency (Hz), 0 to 2500; x-axis vowel onset and midpoint for [i], [a], and [u]; series F1 and F2.]
Figure 4.5. Simplified first and second formant tracks of the three Arabic vowels [i, a, u] following the velar [k], the three uvulars [q, χ, ʁ], the two pharyngeals [ħ, ʕ] and the two laryngeals [h, ʔ]. Formant values are averages across speakers. The error bars represent ± one standard deviation.
4.3.2.1 Carryover Coarticulation in F1
A main effect of consonant type on F1onset [F(13,196) = 25.734, p < 0.001, R² = 0.606]
is obtained for the vowel [i]. The subsequent Scheffé post hoc comparisons (summarized in
Table 4.18) indicate that, in most cases, F1onset values of the vowel [i] are able to
distinguish the three emphatics and the gutturals [q], [ʁ], [ħ], and [ʕ] from the remaining
sounds. The values of F1onset following all three emphatics are significantly higher in
comparison with their non-emphatic counterparts (p < 0.01). The highest F1onset value is the
one following the pharyngeal [ʕ], distinguishing it from all other sounds, including its
voiceless counterpart [ħ]. The uvular [χ] has a mid-range F1onset value that distinguishes it
from only [tˤ] and [ʕ]. While the analysis of variance indicates that there is also a main
effect of consonant type on F1vowel [F(13,196) = 5.897, p < 0.001, R² = 0.233] of the
vowel [i], the post hoc comparisons (Table 4.19) reveal that all F1vowel values are very
close to each other. The only significant pair-wise differences are that the value after
[t] is higher than those after both [s] and [ʔ] (p < 0.05).
For the vowel [a], main effects of consonant type on both F1onset [F(13,196) = 71.665,
p < 0.001, R² = 0.815] and F1vowel [F(13,196) = 2.669, p < 0.001, R² = 0.094] are
obtained. The Scheffé post hoc comparisons (Table 4.20) show that F1onset values in [a]
are capable of distinguishing the low gutturals (pharyngeals and laryngeals) from the
remaining sounds in most cases. F1onset values following the guttural sounds, except for [q]
and [ʁ], are significantly higher compared to most of the consonants under investigation.
The highest values occur after [h], [ʕ], and [ʔ]. The three emphatics are generally
accompanied by F1onset values that are higher than those accompanying their plain
counterparts. While [sˤ] is followed by an F1onset value that is significantly higher than
after [s], F1onset af-
Table 4.18. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1onset values of the vowel [i] in the context [Ci].
[t] [d] [s] [k] [q] [Xl [K] [n] ['i'] [h] [?]
[t l ***
[d] *** ** ***
[s] *** *** [ sl'] * ** ** [k] *** *** *** [q] *** *** *** *** [X] * [ff] * ** *** *** [ll] * * * ['I] *** *** *** *** *** * [h] *** *** ** *** ** *** [?] *** * *** ***
Table 4.19. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1vowel values of the vowel [i] in the context [Ci].
[t] [ti'J [d] [s] [k] [q] [Xl [K] [n] ['i'] [h] [?]
[t l
[d]
[s] *
[k]
[q]
[X]
[ff]
[nJ ['I] [h]
[?] * * p < 0.05 ** p < 0.01 *** p < 0.001 -no significant difference
Table 4.20. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1onset values of the vowel [a] in the context [Ca].
[t] [tl'] [d] [s] [k] [q]. [X] [B'] [n] ['I] [h] [?]
[t] [tl']
[d] *** [dl'] * [s] *** [sl'] *** *** [k] * [q] *** *** [X] *** *** *** *** *** *** [ff] * ** *** [nJ *** *** *** *** *** *** *** *** * *** ['I] *** *** *** *** *** *** *** *** ** *** [h] *** *** *** *** * *** *** *** [?] *** *** *** *** *** *** *** *** ***
Table 4.21. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1vowel values of the vowel [a] in the context [Ca].
[t] [d] [s] [k] [q] [X] [ff] [n] ['I] [h] [?]
[t] [tl']
[d] [dl']
[s] [sl']
[k]
[q]
[X] [B']
[nJ ['I] [h]
[?] * p < 0.05 ** p < 0.01 *** p < 0.001 · -no significant difference
ter [tˤ] is only marginally higher than after [t] (p < 0.1) and the difference is not
significant for the pair [d, dˤ]. At the steady state portion of [a], no significant
differences are detected among the 14 consonants in terms of F1vowel values (Table 4.21).
There are main effects of consonant type on both F1onset [F(13,196) = 16.203, p <
0.001, R² = 0.486] and F1vowel [F(13,196) = 5.304, p < 0.001, R² = 0.211] for the vowel
[u]. The subsequent post hoc comparisons (Table 4.22) reveal relatively few significant
pair-wise differences achieved by F1onset. The comparisons indicate that F1onset is able to
distinguish the subset of the emphatic [tˤ] along with the gutturals [q], [ʁ], [ħ], and [ʕ].
Those sounds are followed by higher F1onset values than some of the remaining sounds.
F1onset values after the two emphatics [tˤ] and [dˤ] are significantly higher when compared
to the values after [t] and [d] (p < 0.05), while [sˤ] is followed by an F1onset value that
is higher than the one following [s], but not significantly so. The highest F1onset value
follows [ʕ], while the plain orals along with [h] are followed by the lowest values. As for
F1vowel values, most pair-wise post hoc comparisons reflect no significant differences
(Table 4.23).
Table 4.22. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1onset values of the vowel [u] in the context [Cu].
[t] [d] [s] [ [k] [q] [X] [ff] [nJ [1] [h] [?]
[t]
[tl'] *** [d] *** [dl'] * * [s] ***
[k] ** [q] ** ** * [X]
[ ff] ** ** * [nJ *** *** ** ** [1] *** *** *** *** [h] *** * ** *** [?] *
Table 4.23. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F1vowel values of the vowel [u] in the context [Cu].
[t] [t'l] [d] [s] [k] [q] [X] [ff] [nJ [1] [h] [?] [t]
[U] * [d] ** [dl']
[s] [ * [k]
[q]
[X]
[ff] [nJ [1] [h]
[?] * p < 0.05 ** p < 0.01 *** p < 0.001 - no significant difference
4.3.2.2 Carryover Coarticulation in F2
For the vowel [i], main effects of consonant type on both F2onset [F(13,196) = 81.183,
p < 0.001, R² = 0.833] and F2vowel [F(13,196) = 2.616, p < 0.01, R² = 0.091] are
obtained. The Scheffé post hoc comparisons (summarized in Table 4.24) reveal that
F2onset is capable of distinguishing the three emphatics. The F2onset values following the
three emphatics are significantly lower than the ones accompanying the majority of the
remaining sounds. The uvular [ʁ] is followed by a mid-range F2onset value that distinguishes
it from all other sounds except [sˤ] and [q]. The same is true, but to a lesser extent, for
[q], which is accompanied by a mean F2onset value that falls between those accompanying
plain coronals and the one accompanying [ʁ]. For the most part, the remaining consonants
are not significantly different from each other. Almost none of the F2vowel values are
significantly distinct from each other (Table 4.25).
There are main effects of consonant type on both F2onset [F(13,196) = 236.833, p <
0.001, R² = 0.936] and F2vowel [F(13,196) = 91.173, p < 0.001, R² = 0.849] for the vowel
[a]. The differences between emphatics and the remaining sounds are reflected by the
Scheffé post hoc comparisons of F2onset values, summed up in Table 4.26. The three
emphatics are followed by significantly lower F2onset values than those that follow the
other sounds (p < 0.001). The two uvulars [χ, ʁ] are distinguished from almost all other
sounds by the mid-range F2onset values that follow them. The uvular stop [q] is followed by
the lowest F2onset value among all the sounds investigated here. On the other end of the
scale, the five plain orals [t], [d], [ð], [s], and [k] are followed by the highest F2onset
values. As is the case at the onset of the vowel, the three emphatics and the three uvulars
are associated with F2vowel values (Table 4.27) that are significantly lower than the
remaining
Table 4.24. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 14 consonants in terms of F2onset values of the vowel [i] in the context [Ci].
[t] [tq [d] [s] [k] [q] Lxl [K] [h] ['I] [h] [?]
[t] [t\'] *** [d] *** [dl'] *** *** [s] *** *** [s\'] *** *** *** [k] *** *** *** [q] *** *** *** *** *** [X] *** *** *** [IS'] *** ** *** *** *** *** *** [h] *** *** *** * *** ['I] *** *** *** * * [h] *** *** *** *** *** ** [?] *** *** *** *** *** *
Table 4.25. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2vowel values of the vowel [i] in the context [Ci].
[t] [tl'] [d] [s] [k] [q] [X] [K] [h] ['I] [h] [?]
[t]
[d]
* [s]
[k]
[q]
[X]
[IS']
[h]
['I] [h]
[7] * p < 0.05 **p<O.Ol *** p < 0.001 -no significant difference
Table 4.26. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2onset values of the vowel [a] in the context [Ca].
[t] [t'l] [d] [s] [s\·] [k] [q] [X] [B'] [h] [)] [h] [?] [t]
[t\'] *** [d] *** [d\'] *** *** [s] *** *** [s\'] *** *** *** [k] *** *** * *** [q] *** *** *** *** *** *** *** [X] *** * ***' *** *** ** *** *** [B'] *** *** ** *** * *** *** [h] *** ** *** *** *** *** *** *** [)] *** *** *** *** *** *** *** *** [h] *** * *** *** *** *** *** *** [?] *** ** *** *** *** *** *** ***
Table 4.27. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2vowel values of the vowel [a] in the context [Ca].
[t] [t"] [d] [s] [s\'] [k] [q] [X] [B"] [h] ['!] [h] [?]
[t]
[F] *** [d] *** [d\'] *** *** [s] *** *** [ s\'] *** *** *** [k] *** *** *** [q] *** *** *** *** [X] *** *** *** *** *** [K] *** *** * *** *** [h] *** *** *** *** *** *** [)] *** *** *** *** *** *** [h] *** *** *** *** *** *** [?] *** *** *** *** *** *** * p < 0.05 ** p < 0.01 *** p < 0.001 - no significant difference
sounds (p < 0.001). These two classes are mostly not distinct from each other. Plain orals
and lower gutturals are also not distinct from each other.
For [u], main effects of consonant type on both F2onset [F(13,196) = 130.082, p <
0.001, R² = 0.889] and F2vowel [F(13,196) = 13.357, p < 0.001, R² = 0.435] are obtained.
Here, too, F2onset is able to distinguish emphatics from their plain counterparts, as
reflected by the post hoc comparisons (Table 4.28). The values of F2onset following the
three emphatics, the three uvulars, the two laryngeals, and the velar [k] are significantly
lower than those following the three plain coronals and the two pharyngeals (p < 0.001).
The lowest F2onset values are those that follow the two uvulars [q, ʁ], which are, in some
cases, even significantly lower than the values following emphatics. Few pair-wise
distinctions are achieved by F2vowel values (Table 4.29). The only emphatic/non-emphatic
distinction is that [tˤ] was followed by a significantly lower F2vowel value than [t]
(p < 0.01), while the values after the remaining emphatics are only slightly lower than
those after their non-emphatic counterparts. Also, the values following the three uvulars
and the two laryngeals are lower than those following the two pharyngeals, but mostly not
significantly.
To sum up, F1onset values in vowels following emphatics are inconsistently higher
than those that follow plain coronals. Similarly, uvulars, when compared to plain
coronals, are followed by inconsistently higher F1onset values. The only class of sounds
that is consistently accompanied by significantly high F1onset values is the pharyngeals.
Emphatics are consistently associated with very low F2onset values. The same is true for
uvulars. However, before [i] and [a], the F2onset values following uvulars are mostly not as
low as those following emphatics. Before [u], uvulars are followed by the lowest F2onset
values among all 14
Table 4.28. Summary of the degree of statistical significance as expressed by the Scheffe post hoc pair-wise comparisons of the 14 consonants in terms of F2onset values of the vowel [ u] in the context [Cu].
[t] [d) [s) [ [k] [q] [X) (B) [h) [1) [h) [7] [t]
*** [d) *** *** [d\') *** *** [s] *** *** *** [ s\') *** *** *** [k] *** *** *** [q] *** *** ** *** ** [X] *** *** *** (B) *** * *** *** *** *** * [h) *** *** *** *** *** *** *** *** [1] *** *** *** *** *** *** *** *** [h) *** *** *** *** *** [?] *** *** * *** *** ***
Table 4.29. Summary of the degree of statistical significance as expressed by the Scheffé post hoc pair-wise comparisons of the 14 consonants in terms of F2vowel values of the vowel [u] in the context [Cu].
[t] [d) [dl') [s] [ [k] [q] [X] (B) [h) [1] [h) [?] [t]
[tl') ** [d)
[s] ***
[k] [q] ** ** [X] *** *** (B) ** ** [h) * * [1] ** * ** ** [h) ** ** ** [?] *** * *** ** *** * p < 0.05 ** p < 0.01 *** p < 0.001 - no significant difference
sounds. In terms of F2onset values, pharyngeals and plain coronals are quite similar to each
other. Laryngeals do not have any specific coarticulatory effects on neighboring vowels.
4.3.2.3 Carryover Coarticulation - Discriminant Analysis
To weigh the capabilities of F1onset and F2onset in CV sequences to categorize the
consonant classes under investigation, a number of discriminant analysis tests were
performed in which these values were used as predictors. As with the previous discriminant
analysis tests in §4.3.1.3, the categorization tasks were done for four classes of sounds:
emphatics, plain coronals, uvulars, and pharyngeals. The velar [k] and the two laryngeals
were excluded from these tests since the F1onset and F2onset values following them are
highly dependent on the vowel type.
Results of the first three discriminant analyses are summarized in Table 4.30.
In each test, F1onset of one of the three individual vowels serves as the only predictor. The
overall correct classification rates predicted by F1onset are not higher than 56.4%. Plain
coronals are the only set of sounds that are relatively well classified in all three vowel
contexts. This is because these sounds are generally followed by low F1onset values.
The high F1onset values that follow pharyngeals cause them to be accurately classified before
[a] and, to a lesser extent, before [u]. Before [i], pharyngeals are misclassified as uvulars
in 30% of their actual occurrences. Emphatics and uvulars are never well classified as
sound classes. Their mid-range values do not make them stand out among the sound
classes investigated. In general, there are numerous instances where emphatics and uvulars
are misclassified as pharyngeals, as plain coronals, or as each other.
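Why mid-range classes fare poorly with a single predictor can be seen in a minimal sketch. With one predictor, equal prior probabilities, and a shared within-class variance, linear discriminant analysis reduces to assigning each token to the class whose mean is nearest. The code below implements only that reduced rule; it is not the statistical procedure used to produce Table 4.30, and the function names and all values are hypothetical, chosen for illustration rather than taken from the measurements of this study.

```python
from statistics import mean


def fit_class_means(training):
    """Return the mean predictor value for each class label."""
    return {label: mean(values) for label, values in training.items()}


def classify(value, class_means):
    """Assign a token to the class whose mean is nearest to its value."""
    return min(class_means, key=lambda label: abs(value - class_means[label]))


# Hypothetical F1onset values in Hz (illustrative only, not data from the study).
training = {
    "plain": [300, 320, 340],
    "emphatic": [400, 430, 460],
    "uvular": [420, 450, 480],
    "pharyngeal": [600, 640, 680],
}
class_means = fit_class_means(training)

print(classify(310, class_means))  # low value: nearest to the plain mean
print(classify(650, class_means))  # high value: nearest to the pharyngeal mean
print(classify(436, class_means))  # mid-range value: two class means are close
print(classify(446, class_means))  # a 10 Hz shift changes the assigned class
```

In this toy setting the classes at the extremes are separated cleanly, while the two classes with neighboring means trade tokens after small changes in the predictor, which is the pattern described above for emphatics and uvulars.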
Table 4.30. Discriminant analysis results for the four classes of Arabic sounds, emphatics, plain coronals, pharyngeals, and uvulars based on the values of F1 transitions in CV contexts.
                      Predicted Group Membership (%)
Vowel   Original    Emph.    Phar.    Plain.   Uvu.     Total
[i]     Emph.         6.7     46.7     15.6     31.1    100.0
        Phar.        16.7     46.7      6.7     30.0    100.0
        Plain.        0.0      0.0     95.6      4.4    100.0
        Uvu.         11.1     24.4     15.6     48.9    100.0
        X = 49.7%
[a]     Emph.        51.1      6.7     20.0     22.2    100.0
        Phar.         6.7     93.3      0.0      0.0    100.0
        Plain.        6.7      0.0     73.3     20.0    100.0
        Uvu.         28.9     15.6     35.6     20.0    100.0
        X = 56.4%
[u]     Emph.         4.4     40.0     13.3     42.2    100.0
        Phar.         6.7     70.0      6.7     16.7    100.0
        Plain.        0.0      0.0     77.8     22.2    100.0
        Uvu.         24.4     22.2      6.7     46.7    100.0
        X = 47.9%
Table 4.31. Discriminant analysis results for the four classes of Arabic sounds, emphatics, plain coronals, pharyngeals, and uvulars based on the values of F2 transitions in CV contexts.
                      Predicted Group Membership (%)
Vowel   Original    Emph.    Phar.    Plain.   Uvu.     Total
[i]     Emph.        91.1      0.0      0.0      8.9    100.0
        Phar.         0.0     16.7     50.0     33.3    100.0
        Plain.        0.0     35.6     46.7     17.8    100.0
        Uvu.         22.2     20.0     24.4     33.3    100.0
        X = 49.7%
[a]     Emph.        60.0      0.0      0.0     40.0    100.0
        Phar.         0.0     83.3     16.7      0.0    100.0
        Plain.        0.0     15.6     84.4      0.0    100.0
        Uvu.         33.3     11.1      0.0     55.6    100.0
        X = 69.7%
[u]     Emph.        86.7      2.2      0.0     11.1    100.0
        Phar.        20.0     56.7     23.3      0.0    100.0
        Plain.        0.0     26.7     73.3      0.0    100.0
        Uvu.         13.3      0.0      0.0     86.7    100.0
        X = 77.6%
Table 4.31 shows the classification results of the three discriminant analysis tests
where F2onset of the three individual vowels is used as the predictor. The overall classification
rates range between 49.7% and 77.6%. Before [i], emphatics are very accurately
classified due to the very low F2onset values that follow them. Meanwhile, the other three classes
are often confused with each other. The range of F2onset values following uvulars reaches
lower than those following plain coronals and pharyngeals, which causes some uvulars
to be misclassified as emphatics. Before [a], both pharyngeals and plain coronals
are well classified: the former class by mid-range values and the latter class by high
values. Emphatics and uvulars are followed by similarly low ranges of F2onset values,
causing them to be misclassified as each other quite frequently. Before [u], uvulars are
followed by the lowest F2onset values and are highly accurately classified. Low F2onset values
also follow emphatics and enable them to be well classified. The high F2onset values
following plain coronals enable those sounds to be correctly classified in over 73% of
their actual occurrences. Only pharyngeals are poorly classified, because the F2onset values that
follow them range between the high values following plain coronals and the low
values following emphatics. For this reason, pharyngeals are frequently misclassified as
members of those two classes.
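The overall rates (X) reported in Tables 4.30 and 4.31 can be related to the per-class rates by weighting each class by its number of tokens. The sketch below assumes 45 tokens each for emphatics, plain coronals, and uvulars and 30 for pharyngeals. These class sizes are not stated in this section; they are inferred from the granularity of the percentages (for example, 6.7% is 3 of 45 and 16.7% is 5 of 30) and should be treated as an assumption, as should the function name.

```python
# Assumed class sizes (inferred from the percentages, not stated in the text).
CLASS_SIZES = {"Emph.": 45, "Phar.": 30, "Plain.": 45, "Uvu.": 45}


def overall_rate(correct_pct, class_sizes=CLASS_SIZES):
    """Token-weighted percentage of correct classifications across classes."""
    total = sum(class_sizes.values())
    # Convert each rounded percentage back to a whole number of tokens.
    correct = sum(round(class_sizes[c] * correct_pct[c] / 100.0) for c in class_sizes)
    return 100.0 * correct / total


# Diagonal (correctly classified) percentages for [u] in Table 4.31.
f2_u = {"Emph.": 86.7, "Phar.": 56.7, "Plain.": 73.3, "Uvu.": 86.7}
# Diagonal (correctly classified) percentages for [a] in Table 4.30.
f1_a = {"Emph.": 51.1, "Phar.": 93.3, "Plain.": 73.3, "Uvu.": 20.0}

print(round(overall_rate(f2_u), 1))
print(round(overall_rate(f1_a), 1))
```

Under these assumed class sizes the two calls give 77.6 and 56.4, matching the overall rates reported for those two tests.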
There are few occasions where emphatics are confused with their plain counterparts
when using F1onset as the predictor, and no occasions of confusion when using F2onset.
As we did in §4.3.1.3, the capabilities of these two acoustic metrics to classify emphatics
and non-emphatics were examined using another discriminant analysis test. In this test,
F1onset and F2onset in all three vowels were combined and used together as predictors. A
high overall accurate classification rate of 89.3% was achieved. Based on the standardized
canonical discriminant function coefficients, this high rate is contributed mainly by
F2onset.
The previous ANOVA and discriminant analysis tests reveal that, as was the case
in the VC context, F2onset is a very solid acoustic cue for the secondary articulation in
emphatics in the CV context. Emphatics are consistently followed by low F2onset values in
vowels. F1onset does not have as strong a role in distinguishing emphatics from
non-emphatics. However, high F1onset values correlate rather highly with pharyngeals. F2onset
values following pharyngeals and plain coronals are generally not substantially different
from each other. Following uvulars, F1onset values are at the mid-range. F2onset values,
meanwhile, vary depending on the vowel. In [i], they are close to those following plain
coronals. In [a], they are as low as those following emphatics. In [u], they are even lower
than those following emphatics.
4.4 Discussion and Conclusions
The results of the present experiment show that the coarticulatory acoustic effects
of MSA emphatics on neighboring vowels distinguish these sounds very reliably from
their non-emphatic counterparts. The main coarticulatory correlate of emphatics is a sizeable
drop in the F2 transition in adjacent vowels. While emphatics are also generally associated
with higher F1 transitions than those accompanying non-emphatics, this association
is neither as salient nor as consistent as the association between emphatics and F2
drops. Like emphatics, uvulars are also associated with lower F2 and higher F1 transitions
in adjacent vowels. However, the magnitudes of these changes are not consistent.
The size of the F2 drop next to uvulars depends on the vowel type. The only sounds associated
fairly consistently with high F1 transitions are pharyngeals. Laryngeals, meanwhile,
are not associated with any specific transitions in adjacent vowels.
Pharyngeals are categorized rather accurately based on the high F1 values that accompany
them. These sounds are produced with a narrow constriction at the lowest part
of the pharynx. Recall from Figure 1.1 that a constriction near that area corresponds to
the node of F1, explaining the high value of that formant. Next to the vowel [a], laryngeals
show a strong association with high F1 transitions as well. However, next to [i] and
[u], no such effects are detectable. This is not unexpected given that laryngeals typically
do not include any supraglottal configuration of their own. The high F1 in [a] next to
laryngeals is that of the vowel itself, which has the highest F1 among the three Arabic vowels.
This finding differs from that of Zawaydeh (1999), who reports that laryngeals, like
pharyngeals, are also associated with high F1 in neighboring vowels. As a matter of fact,
Zawaydeh reports that all Arabic gutturals are associated with high F1 values in the
transitions and steady states of adjacent vowels. She uses this association as a phonetic basis
for the grouping of Arabic gutturals into a single natural class. It was pointed out in §2.2.4
that Zawaydeh's finding in regard to laryngeals is quite puzzling. Associating laryngeals
with high F1 in adjacent vowels goes against the tenets of the acoustic theory. These
sounds do not have any constriction above the glottis, so what causes a rise in F1? It was
hypothesized in §2.2.4 that Zawaydeh's subjects were possibly raising their larynges during
[h] and [ʔ] or producing those sounds with wider mouth openings than normal. A
raised larynx shortens the vocal tract and raises essentially all formant frequencies. Wider
openings at the lips correspond to widening at the antinodes of all formants, raising all of
them. I repeat here that Zawaydeh originally reported her results for the vowel [a] alone,
using five subjects. In order to reject the notion that the high F1 next to laryngeals
is that of the vowel itself, she tested F1 in [i] using two of her subjects. The results
reported in the present work are obtained from five subjects for all consonant-vowel
combinations. I therefore question Zawaydeh's findings and reject the proposal that high
F1 values are a viable argument for the grouping of Arabic gutturals.
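The statement that a raised larynx raises essentially all formant frequencies can be illustrated with the standard idealization of the vocal tract as a uniform tube closed at the glottis and open at the lips, whose resonances are F_n = (2n - 1)c / 4L. The sketch below is a textbook approximation and not a model used in this study; the speed of sound and the two tract lengths are illustrative assumptions, as is the function name.

```python
SPEED_OF_SOUND_CM_S = 35000.0  # assumed round value for warm, moist air


def tube_formants(length_cm, count=3, c=SPEED_OF_SOUND_CM_S):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other."""
    return [(2 * n - 1) * c / (4.0 * length_cm) for n in range(1, count + 1)]


neutral = tube_formants(17.5)    # hypothetical neutral vocal tract length
shortened = tube_formants(16.5)  # hypothetical length with a raised larynx

for n, (f_long, f_short) in enumerate(zip(neutral, shortened), start=1):
    print(f"F{n}: {f_long:.0f} Hz -> {f_short:.0f} Hz")
```

Because the tube length appears in the denominator of every resonance, shortening the tube raises all formants together. A rise in F1 alone, as observed next to pharyngeals, therefore calls for a local constriction rather than an overall shortening of the tract.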
The most consistent association noted in this experiment is the one between emphatics
and F2 transitions of neighboring vowels. Emphatics are always accompanied by
significantly lower F2 transitions in adjacent vowels in comparison with their plain
counterparts. This cue extends to the steady state portion of [a] only. The resistance of [i]
to the spread of emphasis is well documented in the literature (see §2.2.1). It appears that
the antagonism between the articulatory demands on the tongue dorsum during the
production of [i] (fronting) and the secondary articulation in emphatics (retraction) blocks
any further extension of emphasis into [i]. As for [u], this vowel already involves a
retraction of the tongue dorsum, which is reflected by its characteristically low F2, causing
emphasis to spread vacuously to the steady state. However, results of the statistical analyses
of variance as well as the discriminant analyses reported in this study indicate that
emphasis spread to the transitional portion of the vowel is sizable and consistent enough
to be considered a very reliable acoustic indicator for the presence of an emphatic consonant.
It seems that the spread of emphasis to [u] is expressed acoustically in that it overrides
the raised consonant-vowel transition associated with non-emphatic coronals. F1
transitions next to emphatics are, generally speaking, higher than those next to plain coronals.
However, the ANOVA and discriminant analysis tests show that these differences
are not always significant. Contrary to the assertion of Giannini and Pettorino (1982),
who relied on speech samples from a single subject and did not use any form of statistical
analysis, F1 is not a viable acoustic cue of the presence of an emphatic consonant. The F1
transitions neighboring emphatics are neither as consistently nor as sizably high as those
neighboring pharyngeals. It appears that emphatics, unlike pharyngeals, do not involve active
tongue root retraction, which would result in a substantial and consistent raising of
F1. The mildly and inconsistently high F1 transitions neighboring emphatics point to subtle
retraction of the tongue root as a byproduct of the overall retraction of the tongue dorsum.17
17 It might be argued that emphatics underlyingly involve active tongue root retraction that is lost in a phonetic reorganization of the articulatory qualities of these sounds (Kingston and Diehl 1994). This is quite doubtful, though, since such a modification goes against the role of phonetic reorganization, which is to minimize articulatory effort and/or maximize acoustic contrast. An added tongue root retraction would significantly raise F1 and enhance the acoustic distinction between emphatics and their non-emphatic counterparts. It can also be said that tongue root retraction would facilitate the retraction of the whole tongue mass including the dorsum. See §6.3.2 and §6.4 below for cases where the two retractions co-exist.
The previously stated findings in regard to the association between emphatics
and both F1 and F2 along with their articulatory interpretations strongly support
El-Dalee's (1984) similar views. The present experiment also extends parts of those claims
to include uvulars. Like emphatics, uvulars are generally associated with low F2 transitions
on neighboring vowels in both directions. However, the size of the F2 drop next to uvulars
depends largely on the vowel. In [i] and [a], the values of F2 transitions next to uvulars
are either comparable to the values next to emphatics or range between the values
next to plain coronals and the values next to emphatics. In [u], F2 transitions next to uvulars
are lower than those next to any other sound class, including emphatics. Recall from
Experiment One that we claim that uvulars, when adjacent to the vowel [u], are retracted
further back towards the uvular region. When adjacent to [u], the spectral mean of the
uvular [χ] is in the same range as the spectral mean of the sibilants [s, sˤ]. This was
attributed to greater constriction in uvular fricatives as a result of further backing of the
tongue. The very low F2 transitions in [u] next to uvulars support this claim. Further
backing of the tongue dorsum in the upper pharynx results in more constriction near the
first antinode of F2. This results in a more pronounced drop in this formant. F1 transitions
next to uvulars are in the same range as those next to emphatics. These transitions
are neither as high nor as consistent as the transitions next to pharyngeals. It seems that
uvulars, like emphatics and unlike pharyngeals, do not involve independent tongue root
retraction as part of their articulation.
Notice in Figures 4.2 and 4.4 that F2 in [i] and [a] next to emphatics has falling
transitions as we move from the vowel's steady state towards the consonant in VC sequences
(and rising transitions as we move away from the consonant towards the vowel's
steady state in CV sequences). In [u], however, the patterns are reversed. Here, F2 is
slightly higher at the transition than at the steady state. However, F2 transitions in [u]
next to emphatics remain the lowest with the exception of those next to uvulars. It seems
that emphatics do not merely cause a drop in F2 of an adjacent vowel, but rather cause
that formant to start at a somewhat fixed point regardless of the vowel type. That fixed
point is lower than the prototypical F2 values of [i] and [a] but slightly higher than the
prototypical F2 value of [u]. This phenomenon is quite similar to the locus concept discussed
in §4.1. In this sense, the emphatic coronals [tˤ, dˤ, ðˤ, sˤ] share a second formant
locus different from the one shared by their non-emphatic coronal counterparts [t, d, ð,
s]. Obrecht (1961) investigated the perceptual relevance of emphatic loci in Lebanese
Arabic and found that the perceptual "zone of velarization" lies between 1000 Hz and
1400 Hz. The average F2 locus for emphatics reported by Obrecht is around 1200 Hz
compared to a locus around 1800 Hz for non-emphatics. Using speech samples from a
single Iraqi subject, Giannini and Pettorino (1982) report an emphatic F2 locus of 1000
Hz compared to 2000 Hz for non-emphatics.
To find out if a stable F2 hub (locus) exists for all emphatics, the F2 transition
edges in VC sequences were compared by means of an ANOVA test. A similar test was
done for the CV sequence. In both cases, the tests were conducted for the emphatic
sounds across vowel contexts with F2 transition serving as the dependent variable. Two
similar tests were performed for the non-emphatic coronals. Figure 4.6 displays the
means and standard deviations of F2 transitions averaged for all vowels in the two
contexts VC and CV, separately, next to emphatics and non-emphatics. ANOVA results
for the emphatics in the VC context show a main effect of consonant type
[F(3,176) = 14.997; p < 0.001]. Scheffé post hoc tests, however, indicate that [tˤ], [dˤ], and [ðˤ] are
not significantly distinct from each other in terms of mean F2 transition value of the
preceding vowels. Mean F2 value preceding [sˤ] is significantly higher than those preceding
the other emphatics (p < 0.01). In the CV context, which does not include [ðˤ], no main
effect of consonant type was detected [F(2,132) = 1.469; p = 0.234]. This was reflected
by the subsequent post hoc comparisons, which revealed no pair-wise differences. The
test results for the non-emphatics in the VC context show a main effect of consonant type
[F(3,176) = 5.174; p < 0.01]. Scheffé post hoc tests indicate that the only pair-wise differences
involve F2 before [ð], which is significantly lower than the values before [t] (p <
0.05) and [d] (p < 0.01). No main effect of consonant type was detected
[F(2,132) = 1.542; p = 0.218] in the CV context, which does not include [ð]. Consequently, no
pair-wise differences were indicated by the subsequent post hoc comparisons. The tests
indicate that both emphatics and non-emphatics show little variability in F2 transition values.
This is an indication that there are two separate F2 loci for Arabic coronals: one for
emphatics (around 1100 Hz) and another for non-emphatics (around 1700 Hz).
[Figure 4.6 plot not reproduced. Two panels with a vertical frequency axis from 500 to 2500 Hz. The non-emphatic panel is labeled [Vt], [Vd], [Vs], [Vð], [tV], [dV], [sV]; the surviving labels of the emphatic panel are [Vtˤ], [Vdˤ], [Vsˤ], [Vðˤ].]
Figure 4.6. Mean F2 transitions next to the non-emphatic coronals and their emphatic counterparts. The formant transition values are averaged across vowels and speakers. The error bars represent ± one standard deviation.
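The consonant-type comparisons above are one-way ANOVAs, in which F is the ratio of the between-group mean square to the within-group mean square, with k - 1 and N - k degrees of freedom. The reported F(3,176) is consistent with four consonants and 180 tokens, and F(2,132) with three consonants and 135 tokens. The sketch below computes F from scratch as an illustration of that ratio; the function name and the sample values are hypothetical and are not data from this study.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical F2 transition values in Hz for three consonants.
groups = [[1100, 1120, 1080], [1090, 1110, 1130], [1200, 1220, 1240]]
f_value, df_b, df_w = one_way_anova(groups)
print(f"F({df_b},{df_w}) = {f_value:.2f}")
```

A small F, as in the CV comparisons above, means that the consonant means differ little relative to the spread of tokens within each consonant, which is what a shared locus predicts.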
Figures 4.7 through 4.10 represent plots of the loci estimates for consonants that
are believed to possess them. It is notable here that no loci were plotted for [k], a sound
that is believed to possess more than one. The reason is simply that the high velar locus
associated with non-back vowels cannot be estimated simply by averaging F2 transitions
of adjacent vowels. This locus is higher than F2 in both [i] and [a] as indicated by the
directions of their formant tracks. As for the low locus, Arabic has only one back vowel,
[u], which also makes averaging of values inapplicable. Perceptual experiments like those
conducted by Delattre et al. (1955) and Stevens and House (1956) are probably required
to estimate these loci. Another notable issue is that, unlike emphatics, uvulars, which also
spread emphasis, do not appear to have any single F2 locus. As noted earlier, the edges of
F2 transitions next to uvulars, while generally low, depend on the vowel. It seems that
uvulars, like velars, adapt their specific point of articulation to the articulation of the
adjacent vowel. This is quite different from the highly stable F2 loci associated with
emphatics. The association between F2 drop and emphatics is more rigid than the association
between F2 drop and uvulars. This indicates that the dorsal retraction implicated in
the articulation of emphatics is more articulatorily stable and less prone to coarticulatory
influence from adjacent vowels than the dorsal retraction in uvulars. Finally, it is quite
interesting that the two pharyngeals [ħ, ʕ] appear to have a common locus as the only
gutturals to do so. To pursue this possibility, two t-tests were conducted to compare the
means of F2 transitions in the two pharyngeals. One test was for the VC context and the
other for the CV context. For the VC context, the t-test shows a significant difference
between the means of F2 transitions next to the two pharyngeals [t = 2.205 (df = 88); p <
0.05], while for the CV context, no significant difference exists between the two
pharyngeals [t = 0.718 (df = 88); p = 0.475]. It appears, then, that the proposed 'pharyngeal
locus' (located around 1600 Hz) is realized more visibly on the right edge of the pharyngeal
consonants. This locus is possibly a consequence of the [a]-like position (in the
front-back continuum) assumed by the tongue body during pharyngeals. This is supported
by the flat F2 transitions between pharyngeals and the vowel [a]. X-ray tracings in Delattre
(1971) and Ghazeli (1977) also support this claim. Apparently, this position is an
articulatory target that does not adapt very much to neighboring vowels. If it did, these
sounds would be as articulatorily transparent as [h], resulting in no locus.
[Figure 4.7 plot not reproduced. Surviving panel labels: [Vt], [Vd], [Vs], [Vð], [Vdˤ], [Vsˤ]; vertical axis ticks at 0, 500, 1000, 1500, and 2000.]
Figure 4.7. Stylized second formant tracks of the three Arabic vowels [i, a, u] preceding the four Arabic plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ]. The figure also illustrates the location of the second formant locus at the left edge of each consonant. Each locus is calculated by averaging the values of F2 transition offsets of the three vowels.
[Figure 4.8 plot not reproduced. Surviving panel labels: [Vk], [Vq], [Vʁ], [Vʕ], [Vh], [Vʔ]; vertical axis ticks at 0, 500, 1000, 1500, and 2000.]
Figure 4.8. Stylized second formant tracks of the three Arabic vowels [i, a, u] preceding the Arabic velar [k] as well as the seven gutturals [q, χ, ʁ, ħ, ʕ, h, ʔ].
[Figure 4.9 plot not reproduced. Surviving panel labels: [tV], [dV], [sV]; vertical axis ticks at 0, 500, 1000, 1500, and 2000.]
Figure 4.9. Stylized second formant tracks of the three Arabic vowels [i, a, u] following the three Arabic plain coronals [t, d, s] and their emphatic counterparts [tˤ, dˤ, sˤ]. The figure also illustrates the location of the second formant locus at the left edge of each consonant. Each locus is calculated by averaging the values of F2 transition offsets of the three vowels.
[Figure 4.10 plot not reproduced. Surviving panel labels: [kV], [qV], [χV], [ʁV], [ħV], [ʕV], [hV], [ʔV]; vertical axis ticks at 0, 500, 1000, 1500, and 2000.]
Figure 4.10. Stylized second formant tracks of the three Arabic vowels [i, a, u] following the Arabic velar [k] as well as the seven gutturals [q, χ, ʁ, ħ, ʕ, h, ʔ].
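The two pharyngeal comparisons reported above are independent-samples t-tests. The reported df = 88 corresponds to n1 + n2 - 2, which is consistent with 45 tokens per pharyngeal. A minimal sketch of the pooled-variance t statistic follows, assuming equal variances in the two groups. The function name and sample values are hypothetical, and the sketch returns only t and df; obtaining a p value would additionally require the t distribution, which is left out here.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Hypothetical F2 transition values in Hz next to two consonants.
a = [1600, 1620, 1640, 1580]
b = [1550, 1570, 1590, 1610]
t_value, df = pooled_t(a, b)
print(f"t = {t_value:.3f} (df = {df})")
```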
Among the three articulatory descriptions of the secondary articulation in Arabic
emphatics discussed in the Introduction, pure pharyngealization is the one least favored
by the acoustic evidence reported here. The acoustic correlate of emphaticness is not
consistent with the articulatory proposal that emphatics and uvulars involve a constriction in
the pharynx by retracting the tongue root (Davis 1995, Rose 1996). Tongue root retraction
is associated with high F1 values while emphatics and uvulars are not significantly
associated with high F1 in adjacent vowels. The correlation between low pharyngeal
constriction and high F1 was predicted by Halle and Stevens (1969) and later proven true in
experimental investigations of several languages. We have already seen such an effect from
Arabic pharyngeals. Similarly, in their reviews of Tsakhur, Udi, and !Xóõ, three languages
belonging to different families which possess pharyngealized vowels, Ladefoged
and Maddieson (1996) reported a correlation between pharyngealization and raising of F1,
but no specific correlation with F2. Also, in languages that have vowel sets involving
retraction of the tongue root as well as vowel sets involving the reverse form of this
articulation, i.e. advancing of the tongue root (ATR), the acoustic correlates of these added
articulations are strongly reflected by F1. In a study of one such language, Kwawu, a
dialect of Akan, Hess (1992) found that the most consistent acoustic difference between
[+ATR] and [-ATR] vowel pairs is that members of the former class had lower F1 values
than members of the latter class. Similar acoustic effects were also reported by Fulop et
al. (1998) for Degema, an Edoid language, which also contrasts [+ATR] and [-ATR]
vowels. The same acoustic effect was also shown to be acquired by plain vowels when
neighboring primarily pharyngeal consonants. Ladefoged and Maddieson display
spectrograms of pharyngeal-containing words from Burkikhan, an Agul dialect, that clearly
show that F1 at the transition of the vowel following the pharyngeal is quite high (around
1000 Hz in one case) before gradually falling as the vowel production progresses.
It was mentioned in the Introduction that Arabic emphatics have also been described
as velarized (Obrecht 1961, Catford 1977) and uvularized (McCarthy 1994,
Zawaydeh 1999). Velarization is compatible with the acoustic effects that characterize
emphatics. Contrastively velarized consonants in languages such as Marshallese (Choi
1995, Ladefoged and Maddieson 1996) and Russian (Bolla 1981 18) as well as
non-contrastively velarized consonants such as the 'dark' [l] in English (Sproat and Fujimura
1993) cause significant lowering of F2 in neighboring vowels. Ladefoged and Maddieson
(1996) also display spectrograms and spectral slices of a Marshallese velarized [mˠ] and
its plain counterpart [m] showing a sizable drop in the second spectral peak of the
velarized nasal as opposed to the plain one. Uvularization as a secondary articulation implies
one of two possibilities. The first is that a uvularized sound has a secondary articulation
that is an exact copy of the articulation of [χ] or [ʁ], including the soft palate
participation. The second, and more likely one, is that the sound involves a secondary articulation
in the form of retracting and raising the tongue dorsum towards the uvular region. Either
way, uvularization is a superset of velarization since primarily uvular sounds are considered
complex pharyngeal-velar sounds. Uvularization, then, is expected to display the
same acoustic effects as velarization, which is exactly what Catford (1977) states.
18 Bolla actually refers to non-palatalized Russian sounds as pharyngealized. However, those sounds are widely known currently as velarized. See Chapter 6 of this dissertation for more on this topic.
Among the three articulatory descriptions of the emphatic secondary articulation
in Arabic, pharyngealization must be excluded. This characterization fails both
descriptively and explanatorily in the fields of phonological representation and phonetic
articulatory-acoustic correspondence. Among the two remaining candidates, velarization and
uvularization, the latter has never been reported in any language other than Arabic.
Furthermore, according to the Handbook of the International Phonetic Association, there is
no IPA symbol for uvularization. As stated earlier, uvularization can be understood as the
combination of two concomitant secondary articulations: velarization and pharyngealization.
Other reported double secondary articulations include labiovelarization and
labiopalatalization. Ladefoged and Maddieson (1996) explain that the former is actually
what takes place in the majority of labialized sounds. This fact is quite understandable
when we consider plain labialization to be the addition of a [w]- or [u]-like articulation to
a primary articulation. The sounds [w] and [u] always involve a velar raising of the back
of the tongue. As for labiopalatalization, Ladefoged and Maddieson note that it is in fact
an allophonic variation of labiovelarization occurring in the context of front vowels.
In these two types of articulation, velarization or palatalization coexists with labialization,
the most widely encountered secondary articulation cross-linguistically. It is
counterintuitive to expect two relatively rare types of secondary articulations, velarization and
pharyngealization, to coexist as a double secondary articulation in the same sound. When
we add this point to the phonological arguments against pharyngealization (which is a
component of uvularization) presented in Chapter 2, the case for uvularization in
Arabic emphatics weakens. This point is explored in more detail in the next two chapters.
4.5 Summary
The acoustic results in this experiment show that emphatics differ greatly from
their non-emphatic counterparts in terms of their coarticulatory impact on adjacent vow-
els. The most pronounced difference is that emphatics cause large F2 drops in the transi-
tion portions of adjacent vowels while non-emphatics do not. Generally, uvulars show
similar effects on adjacent vowels as do emphatics. These results are interpreted as an
acoustic reflection of the dorsal retraction in both emphatics and uvulars. However, the
size and stability of these effects are different between these two sound classes. While F2
drops next to emphatics are associated with a highly stable low F2 locus, they vary in size
and locus next to uvulars depending on the vowel. The lack of stability in this acoustic
correlate next to uvulars is interpreted as a less rigid association between uvulars, as op-
posed to emphatics, and dorsal retraction. While dorsal retraction in emphatics is highly
stable regardless of the adjacent vowel, it is more adaptable to the vowel environment in
uvulars. This indicates that the articulatory implementation of dorsal retraction in em-
phatics and uvulars might be different.
Pharyngeals are reliably associated with high F1 transitions in adjacent vowels.
The rises in F1 in vowels adjacent to emphatics and uvulars, which are frequently reported
in previous acoustic works, are neither as sizeable nor as consistent as the F1 rise next
to pharyngeals. Meanwhile, laryngeals are not associated with any particular transitions
in adjacent vowels. These findings are interpreted as indications that the only Arabic
sounds that involve active tongue root retraction are pharyngeals. The milder and unstable
F1 rises next to emphatics and uvulars are interpreted as indications of small and
inconsistent tongue root retractions that follow as by-products of the general retractions of
the tongue dorsum. These findings also do not support the claim that high F1 in neighboring
vowels is a potential acoustic grouping factor for the class of Arabic gutturals.
Some of these results and claims are investigated further in the following chapter.
The most important point to elaborate on is the difference in how dorsal retraction
is implemented in emphatics and uvulars. Such differences should be reflected in the ways
those sounds impact vowel-to-vowel coarticulation.
CHAPTER 5
Experiment Three:
Vowel-to-Vowel Coarticulation
5.1 Overview
The previous two experiments examine the more direct acoustic correlates of the
articulatory properties of consonants. A less direct consonantal articulatory-acoustic cor-
relation that has been discussed in the literature is located in the patterns of influence of
an intervocalic consonant on the coarticulatory interaction between vowels flanking that
consonant in VCV sequences. In particular, consonants that place articulatory demands
on the tongue dorsum tend to minimize trans-consonantal vowel-to-vowel coarticulation
compared to consonants that do not. Such influences are often measurable both articula-
torily and acoustically. This interaction provides an experimental opportunity to investi-
gate and compare the articulatory traits of Arabic emphatics and gutturals. In particular,
testing this phenomenon could potentially enhance our understanding of the similarities
and differences between the dorsal retractions in both emphatics and uvulars. The present
experiment looks at how these two sets of sounds, as well as the other sound classes in-
vestigated in this dissertation, affect vowel-to-vowel coarticulation in order to determine
the presence and size of the coarticulatory effects they allow between flanking vowels in
VCV contexts. There are two hypotheses being tested here. The first hypothesis is that,
unlike their non-emphatic counterparts, emphatics should resist vowel-to-vowel coarticu-
lation. This hypothesis is based on the presence of dorsal retraction in Arabic emphatic
coronals. The second hypothesis is that there are acoustic differences between emphatics
and uvulars in terms of their effects on vowel-to-vowel coarticulation. This hypothesis
follows from the more constant and less flexible dorsal retraction in emphatics as op-
posed to uvulars and the absence of any dorsal retraction in other Arabic gutturals.
Vowel-to-vowel coarticulation across an intervening consonant is a subject that
has aroused some experimental and theoretical interest in recent years. The phenomenon
was first reported by Öhman (1966). In an acoustic study of vowel-stop-vowel sequences
in Swedish, Öhman noticed that the shapes of formant transitions in the first vowel
depend not only on the following consonant, but also on the type of the second vowel. The
transition patterns suggest that the articulatory configuration of the first vowel proceeds
to the configuration of the second vowel in a smooth diphthongal movement as if the
consonant did not exist. Öhman found a similar behavior in English, but not in Russian,
vowel-stop-vowel sequences.
A more extensive investigation of Russian VCV sequences by Purcell (1979)
confirms Öhman's findings and adds that Russian palatalized and velarized consonants also
inhibit vowel-to-vowel coarticulation in the reverse direction. However, in a comparison
between vowel-to-vowel coarticulation in English, Russian, Bulgarian, and Polish, Choi
and Keating (1991) found that palatalized consonants in all three Slavic languages permit
vowel-to-vowel coarticulation. It should be noted, nevertheless, that in one of Choi and
Keating's figures (Figure 7, p. 83), English permits more sizable amounts of coarticulation
in both directions than do any of the other three languages.
To explain these findings, Öhman proposed that vowels and consonants employ
different articulatory control mechanisms. He explains that a VCV sequence is not simply
a linear succession of three sounds. Rather, such a sequence is primarily a consonant
superimposed on a vowel layer in which one articulatory setting flows into the next. The
lack of V2 effects on V1 in Russian, Öhman explains, is due to the fact that, unlike Swedish
and English stops, Russian stops involve either palatal or velar secondary articulations.
These secondary articulations are vocalic by nature and impose motor demands on
the articulatory control mechanism of vowels, thus interrupting the flow from V1 to V2.
Keating (1985), on the other hand, adopts an explanatory model based on
autosegmental phonological features. She proposes that the phonetic implementation of
a speech sound has access to the phonological specification of that segment. If a segment
is phonologically underspecified for a certain feature, this underspecification persists into
phonetics. Coarticulation is then treated as spreading of features. In English VbV
sequences, for example, the lingual features of V1 spread to V2 since English [b] is not
specified for any vocalic lingual features to block the spreading. This spreading is
implemented in the final shape of the sound as an interpolation between the articulatory targets
of V1 and V2. Sounds with a secondary place of articulation such as Russian palatalized
consonants, on the other hand, use the vowel feature tier to project their secondary
articulation features. These sounds, then, do not permit vowel-to-vowel coarticulation since
they block feature spreading from one vowel to the other. This model predicts that any
consonant with a secondary articulation should block vowel-to-vowel coarticulation.
Instead of a two-leveled view of coarticulation in which discrete, timeless phonological
units are reinterpreted as articulatory maneuvers with extrinsic timing, Fowler
(1980) and Fowler and Saltzman (1993) propose that the phonological constituents be
treated as dynamic phonetic gestures with predefined intrinsic timing structure. In this
alternate view, the intrinsic timing structures of the gestures involved in the coarticulating
segments overlap leading the segments to be coproduced. The size of coproduction be-
tween gestures employing the same articulator is less than if they employ independent
articulators. It also depends on how much the current gesture 'resists' the effects of the
coarticulating gesture. This model, then, does not exclude the possibility that conflicting
gestures could coarticulate.
The concept of 'coarticulation resistance' was first introduced by Bladon and
Al-Bamerni (1976). It was proposed to account for the different degrees of spatial
coarticulatory effects allowed by speech segments. Bladon and Al-Bamerni propose that the
phonological specification for segments, as well as certain boundary types, be assigned a
numerically valued feature for coarticulation resistance. To capture cross-linguistic
differences in coarticulation resistance, they suggest that such a feature can be language-,
or even dialect-, specific. Recasens (1985) argues that resistance to coarticulation should
be based on the constraints forced on the articulators by the speech segments rather than
on the linguistic status of those segments. Recasens et al. (1993) found that Catalan and
Italian alveopalatals and palatals resist vowel-induced coarticulation more so than
alveolars. They attribute this finding to the fact that the tongue dorsum is more involved in the
production of alveopalatals and palatals than in the production of alveolars. The same
explanation is proposed by Recasens et al. (1996) to account for their finding that Catalan
velarized [l] resists vowel-dependent coarticulation more so than German non-velarized
[l]. Recasens suggests that the same explanations should apply for vowel-to-vowel
coarticulation resistance by intervening consonants.
5.2 Methods
5.2.1 Subjects
The same five subjects who participated in the previous two experiments also
took part in this experiment.
5.2.2 Stimuli
The set of stimuli that was used for Experiment One was also used in the present
experiment. However, the present experiment adds the two laryngeals [h] and [ʔ],
yielding a paradigm of 144 test words (3 x 16 x 3). Refer to Appendix C for a list of the words
used in this experiment.
5.2.3 Procedures
The stimuli for all three experiments were intermixed and presented to the sub-
jects together in the same recording session. Therefore, the experimental procedures fol-
lowed here are identical to those followed in the previous two experiments.
5.2.4 Acoustic Analysis
As was the case in Experiment Two, the sound analysis software Praat (Boersma
and Weenink 1992) was used to automatically generate formant tracks using the Burg
algorithm. The tracks were calculated using a succession of 25-ms Gaussian analysis
windows. The temporal distance between the centers of each two analysis windows was 5
ms. For the V1C- portion of the V1CV2 sequence, F2 was measured at the end of the vowel
transition and for the -CV2 portion of the sequence F2 was measured at the beginning of
the vowel transition. These acoustic landmarks were identified based on visual inspection
of the waveform and the spectrogram of the sound token. Auditory verification was
added in cases where the vowel-consonant boundary was difficult to pinpoint. The cursor
was placed at the identified landmark which was subsequently recorded as a time point.
The time points for all sound files were annotated to a single Praat TextGrid file. A spe-
cially written Praat script referred to the TextGrid file and automatically recorded the val-
ues of F2 transitions. Formant readings were made 2.5 ms inside the vowels from the
time point recorded in the TextGrid file rather than exactly at the vowel-consonant
boundary. As explained in Experiment Two, this modification was done to avoid false
formant values as a result of incorrect tracking of the formants at the consonant-vowel
junction or incorrect averaging of vowel-edge-based and consonant-edge-based formant
reading points. The script then stored the formant readings as a text file. This file was
then converted to proper formats of the spreadsheet software Microsoft Excel (Microsoft
Corp. 1985) and the statistical analysis software SPSS (SPSS, Inc. 1989).
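
The measurement step described above can be illustrated with a minimal sketch. It is not the author's Praat script, which is not reproduced in this chapter. It only shows how a reading point 2.5 ms inside the vowel could be derived from an annotated boundary and matched to the nearest frame of a formant track sampled every 5 ms. The function names, the example boundary time, and the placement of the first frame center are all hypothetical.

```python
OFFSET_S = 0.0025      # readings are taken 2.5 ms inside the vowel (section 5.2.4)
FRAME_STEP_S = 0.005   # distance between centers of successive analysis windows


def measurement_time(boundary_s, vowel_side):
    """Return the time (s) at which F2 is read for one annotated boundary.

    vowel_side is 'before' when the vowel precedes the boundary (V1C-)
    and 'after' when the vowel follows it (-CV2).
    """
    if vowel_side == "before":
        return boundary_s - OFFSET_S
    if vowel_side == "after":
        return boundary_s + OFFSET_S
    raise ValueError("vowel_side must be 'before' or 'after'")


def nearest_frame(frame_times, t):
    """Index of the formant-track frame whose center is closest to time t."""
    return min(range(len(frame_times)), key=lambda i: abs(frame_times[i] - t))


# Hypothetical track: first frame centered at 12.5 ms (an assumption),
# then one frame every 5 ms.
frame_times = [0.0125 + FRAME_STEP_S * k for k in range(100)]

# Hypothetical V1-consonant boundary annotated at 200 ms.
t_read = measurement_time(0.200, "before")
frame_index = nearest_frame(frame_times, t_read)
print(f"read F2 at {t_read * 1000:.1f} ms, frame {frame_index}")
```

Moving the reading point inward by half a frame step keeps it clear of the consonant-vowel junction, which is the motivation given in the text.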
5.2.5 Reliability
To estimate the intra-judge reliability of formant measurements, 216 sound files
(10% of the total files) were randomly selected by random number generating software
and re-analyzed following the same procedures explained in section 5.2.4 above. The
correlations between the original and the retested F2 values at the transition portions were
above 0.99 for both V1 and V2. Agreements within 50 Hz were 90.3% for V1 and
80.1% for V2. The measurements were judged reliable.
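
The two reliability measures reported above, a correlation coefficient and the percentage of retested values falling within 50 Hz of the original, can be sketched as follows. This is an illustration only. The F2 values are invented for the example, the use of Pearson's r is an assumption (the text does not name the coefficient), and treating a difference of exactly 50 Hz as agreement is likewise an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def percent_within(xs, ys, tol_hz=50.0):
    """Percentage of pairs whose absolute difference is at most tol_hz."""
    hits = sum(1 for x, y in zip(xs, ys) if abs(x - y) <= tol_hz)
    return 100.0 * hits / len(xs)


# Hypothetical original and retested F2 readings (Hz) for five files.
original = [2135.0, 1384.0, 1781.0, 1076.0, 1718.0]
retest = [2120.0, 1401.0, 1790.0, 1139.0, 1705.0]

r = pearson_r(original, retest)
agreement = percent_within(original, retest)
print(f"r = {r:.3f}, agreement within 50 Hz = {agreement:.1f}%")
```

The two measures answer different questions: the correlation can stay very high even when a fair share of individual readings differ by more than 50 Hz, which is why both are reported.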
5.3 Results
To determine whether a certain consonant allows vowel-to-vowel coarticulation,
for each of the three target vowels [i, a, u], the mean F2 transition values when the source
trans-consonantal vowel is [i] are compared against the mean F2 transition values when
the source trans-consonantal vowel is [u]. These two source vowels are expected to have
opposite effects on the F2 transition of the target vowel since [i] has the highest F2 and
[u] has the lowest. The F2 transition values were compared using t-tests. Six t-tests were
conducted for every consonant, three in each direction (three anticipatory target vowels
and three carryover target vowels), for a total of 96 t-tests (3 tests x 2 directions x 16
consonants).
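
As a rough illustration of one such comparison, the sketch below computes a pooled-variance two-sample t statistic from summary statistics. Several points are assumptions rather than facts stated in the text: that the t-tests were independent-samples tests with pooled variance, and that each cell held 15 tokens (inferred from the reported df = 28). The means and standard deviations are those listed in Table 5.1 for [i] before [t]; because they are rounded, the result is only expected to approximate the values reported in the text.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Student's two-sample t statistic with pooled variance; returns (t, df)."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Assumption: 15 tokens per cell, inferred from df = 28.
N_PER_CELL = 15

# F2 transition (Hz) of target [i] before [t]: source V2 = [i] vs. V2 = [u].
t_value, df = pooled_t(2135, 59, N_PER_CELL, 1980, 30, N_PER_CELL)

# Three target vowels x two directions x sixteen consonants.
n_tests = 3 * 2 * 16

print(f"t({df}) = {t_value:.2f}; number of t-tests = {n_tests}")
```

A positive t indicates that F2 of the target vowel is higher when the trans-consonantal vowel is [i] than when it is [u], which is the direction expected if the consonant lets coarticulation through.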
5.3.1 Anticipatory Vowel-to-Vowel Coarticulation
The mean F2 values of the transitions in the target vowels in the anticipatory di-
rection are listed in Tables 5.1 and 5.2. Figures 5.1 and 5.2 show the means and standard
deviations of F2 transitions in the three Arabic vowels [i, a, u] as they occur before the
sixteen consonants being investigated. The source trans-consonantal vowel in each case is
either [i] or [u].
Table 5.1. Second formant frequency means (and standard deviations) in Hz of the V1 transitions preceding the four MSA emphatics and their non-emphatic counterparts.

Target      Source      Intervocalic Consonant
Vowel (V1)  Vowel (V2)  [t]         [tˤ]        [d]         [dˤ]        [s]         [sˤ]        [ð]         [ðˤ]
[i]         [i]         2135 (59)   1384 (149)  2189 (106)  1290 (92)   2112 (43)   1503 (156)  2052 (74)   1224 (93)
[i]         [u]         1980 (30)   1344 (142)  1987 (58)   1248 (59)   1920 (71)   1532 (156)  1747 (121)  1177 (85)
[a]         [i]         1781 (132)  1076 (101)  1794 (65)   1050 (48)   1785 (83)   1079 (102)  1753 (81)   1045 (69)
[a]         [u]         1706 (71)   1043 (92)   1696 (52)   998 (90)    1645 (91)   1114 (53)   1561 (51)   1018 (71)
[u]         [i]         1718 (157)  1070 (68)   1692 (96)   946 (60)    1677 (120)  978 (118)   1621 (95)   891 (43)
[u]         [u]         1536 (58)   972 (56)    1518 (145)  892 (44)    1409 (117)  1003 (72)   1356 (69)   930 (71)

Table 5.2. Second formant frequency means (and standard deviations) in Hz of the V1 transitions preceding the MSA gutturals and the velar stop [k].

Target      Source      Intervocalic Consonant
Vowel (V1)  Vowel (V2)  [k]         [q]         [χ]         [ʁ]         [ħ]         [ʕ]         [h]         [ʔ]
[i]         [i]         2412 (89)   1585 (46)   1888 (93)   1740 (138)  2127 (112)  1989 (189)  2362 (94)   2312 (111)
[i]         [u]         2108 (158)  1261 (64)   1394 (87)   1066 (76)   1905 (91)   1544 (90)   1492 (140)  1701 (252)
[a]         [i]         1990 (146)  1096 (52)   1487 (57)   1516 (97)   1826 (102)  1739 (130)  1937 (117)  1906 (83)
[a]         [u]         1423 (76)   1107 (75)   1262 (119)  1027 (78)   1661 (90)   1412 (122)  1213 (197)  1291 (109)
[u]         [i]         1125 (117)  830 (49)    897 (46)    908 (98)    1468 (203)  1465 (152)  1312 (142)  1345 (126)
[u]         [u]         949 (89)    834 (62)    938 (69)    651 (57)    1484 (103)  1232 (97)   999 (80)    960 (53)

Figure 5.1. Anticipatory V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across the four plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ]. The effects are expressed as the mean value of F2 transition of the first vowel when the trans-consonantal vowel is either [i] or [u]. Formant values are averages across speakers. The error bars represent ± one standard deviation.
[Figure: eight panels, [t], [tˤ], [d], [dˤ], [s], [sˤ], [ð], [ðˤ], each showing iCV, aCV, and uCV; x-axis: Second Vowel (V2), [i] or [u]; y-axis: 500 to 2500 Hz.]

Figure 5.2. Anticipatory V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across the velar [k], the three uvulars [χ, ʁ, q], the two pharyngeals [ħ, ʕ], and the two laryngeals [h, ʔ]. The effects are expressed as the mean value of F2 transition of the first vowel when the trans-consonantal vowel is either [i] or [u]. Formant values are averages across speakers. The error bars represent ± one standard deviation.
[Figure: eight panels, [k], [q], [χ], [ʁ], [ħ], [ʕ], [h], [ʔ], each showing iCV, aCV, and uCV; x-axis: Second Vowel (V2), [i] or [u]; y-axis: 500 to 2500 Hz.]

The four non-emphatic coronals [t, d, ð, s] allow anticipatory vowel-to-vowel
coarticulation in all three vowels. In 11 of the 12 comparisons involving these consonants,
the value of F2 transition in V1 is significantly higher when V2 is [i] than when it
is [u] [t ranges from 3.880 to 9.025, (df = 28); p < 0.01]. The only exception is that,
before [t], the F2 transition of [a] is only marginally higher when V2 is [i] than when it
is [u] [t = 1.962, (df = 28); p < 0.1]. The velar [k] also allows anticipatory
vowel-to-vowel coarticulation in all three vowels [t ranges from 4.619 to 13.368, (df = 28);
p < 0.001]. The four emphatics [tˤ, dˤ, ðˤ, sˤ], on the other hand, tend to block
vowel-to-vowel anticipatory coarticulation. Of the 12 t-tests conducted for VCV sequences
involving emphatics, nine showed no significant differences between the mean F2 transition
value in V1 based on the identity of V2 [t ranges from -1.806 to 1.485, (df = 28); p ≥
0.149]. The emphatic stops [dˤ] and [tˤ] allow F2 transition in [u] to coarticulate with V2
[t = 2.810 and 4.288, respectively, (df = 28); p < 0.01]. The mean value of F2 transition
of [a] before [dˤ] is marginally higher when V2 is [i] than when it is [u] [t = 1.950,
(df = 28); p < 0.1].
Among the three uvulars, [ʁ] allows F2 transitions in all three vowels to coarticulate
with V2 [t ranges from 8.754 to 16.559, (df = 28); p < 0.001]. The voiceless fricative
[χ] allows V2 to coarticulate with F2 transition in preceding [i] and [a] [t = 15.003 and
6.596, respectively, (df = 28); p < 0.001] but not in preceding [u] [t = -1.874, (df = 28);
p = 0.071]. The uvular stop [q] allows coarticulation from V2 into F2 transition of [i]
[t = 15.977, (df = 28); p < 0.001] but blocks it from affecting [a] [t = -0.470, (df = 28);
p = 0.642] and [u] [t = -0.170, (df = 28); p = 0.866]. Five of the six t-tests involving the two
pharyngeals [ħ] and [ʕ] show that these two sounds allow anticipatory coarticulation
from V2 into F2 transition of V1 [t ranges from 4.711 to 8.231, (df = 28); p < 0.001]. The
one exception concerns the vowel [u] before [ħ] where no coarticulation is detected
[t = -0.267, (df = 28); p = 0.791]. Both laryngeals, [h] and [ʔ], allow large degrees of
anticipatory vowel-to-vowel coarticulation in VCV sequences [t ranges from 7.454 to 19.974,
(df = 28); p < 0.001].
5.3.2 Carryover Vowel-to-Vowel Coarticulation
The mean F2 values of the transitions in the target vowels in the carryover direc-
tion are listed in Tables 5.3 and 5.4. Figures 5.3 and 5.4 show the means and standard de-
viations of F2 transitions of the three Arabic vowels [i, a, u] occurring after the sixteen
consonants. In each case, the trans-consonantal vowel is either [i] or [u].
All four plain coronals [t, d, ð, s] allow coarticulation from V1 into F2 transition
of V2 in all VCV sequences [t ranges from 2.049 to 13.781, (df = 28); p ≤ 0.05]. The same
is true for the velar stop [k] [t ranges from 2.679 to 7.158, (df = 28); p < 0.05]. In eight of
the 12 t-tests conducted for VCV sequences in which the consonant was one of the four
emphatic coronals [tˤ, dˤ, ðˤ, sˤ], coarticulation from V1 was permitted into F2 transition
of V2. Both [tˤ] and [ðˤ] allow carryover coarticulation in all VCV sequences [t ranges
from 2.493 to 3.914, (df = 28); p < 0.05]. The voiced emphatic stop [dˤ] permits significant
carryover coarticulatory effects when V2 is either [a] or [u] [t = 3.720 and 2.098,
respectively, (df = 28); p < 0.05] but only marginally significant effects when V2 is [i] [t =
1.890, (df = 28); p < 0.1]. The emphatic alveolar fricative [sˤ] exhibits the most resistance
to carryover coarticulation. This sound blocks coarticulation from affecting [i] or [a] [t =
1.435 and 1.306, respectively, (df = 28); p ≥ 0.162] and allows only marginally significant
effects on [u] [t = 2.017, (df = 28); p < 0.1].
Table 5.3. Second formant frequency means (and standard deviations) in Hz of the V2 transitions following the MSA emphatics and their non-emphatic counterparts.

Target      Source      Intervocalic Consonant
Vowel (V2)  Vowel (V1)  [t]         [tˤ]        [d]         [dˤ]        [s]         [sˤ]        [ð]         [ðˤ]
[i]         [i]         2251 (83)   1532 (160)  2183 (64)   1324 (131)  2164 (79)   1586 (300)  2116 (70)   1246 (93)
[i]         [u]         1968 (83)   1315 (142)  1894 (89)   1238 (119)  1843 (52)   1461 (155)  1841 (33)   1123 (96)
[a]         [i]         1844 (44)   1062 (66)   1888 (65)   1042 (76)   1746 (67)   1079 (80)   1783 (74)   1019 (88)
[a]         [u]         1694 (76)   985 (62)    1727 (62)   947 (64)    1624 (76)   1034 (105)  1628 (59)   944 (57)
[u]         [i]         1449 (147)  970 (62)    1720 (70)   975 (82)    1510 (109)  996 (81)    1560 (82)   951 (82)
[u]         [u]         1356 (96)   911 (68)    1544 (96)   912 (84)    1382 (48)   930 (97)    1374 (125)  881 (63)

Table 5.4. Second formant frequency means (and standard deviations) in Hz of the V2 transitions following the MSA gutturals and the velar stop [k].

Target      Source      Intervocalic Consonant
Vowel (V2)  Vowel (V1)  [k]         [q]         [χ]         [ʁ]         [ħ]         [ʕ]         [h]         [ʔ]
[i]         [i]         2321 (116)  1787 (182)  2273 (114)  1784 (88)   2219 (103)  2075 (164)  2391 (98)   2328 (112)
[i]         [u]         2062 (79)   1546 (127)  1711 (146)  1278 (165)  1897 (120)  1702 (108)  1867 (165)  1682 (248)
[a]         [i]         1992 (96)   1107 (74)   1249 (97)   1300 (81)   1676 (105)  1688 (75)   1861 (76)   1848 (98)
[a]         [u]         1783 (135)  1110 (71)   1182 (112)  1031 (101)  1468 (114)  1404 (99)   1375 (108)  1124 (127)
[u]         [i]         1021 (112)  877 (47)    931 (62)    880 (58)    1353 (120)  1311 (68)   1032 (94)   1162 (215)
[u]         [u]         905 (124)   849 (36)    768 (81)    679 (54)    1277 (90)   1208 (122)  947 (32)    847 (88)

Figure 5.3. Carryover V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across the four plain coronals [t, d, ð, s] and their emphatic counterparts [tˤ, dˤ, ðˤ, sˤ]. The effects are expressed as the mean value of F2 transition of the second vowel when the trans-consonantal vowel is either [i] or [u]. Formant values are averages across speakers. The error bars represent ± one standard deviation.
[Figure: eight panels, [t], [tˤ], [d], [dˤ], [s], [sˤ], [ð], [ðˤ], each showing VCi, VCa, and VCu; x-axis: First Vowel (V1), [i] or [u]; y-axis: 500 to 2500 Hz.]

Figure 5.4. Carryover V-to-V coarticulatory effects on the three Arabic vowels [i, a, u] across [k], the three uvulars [χ, ʁ, q], the two pharyngeals [ħ, ʕ], and the two laryngeals [h, ʔ]. The effects are expressed as the mean value of F2 transition of the second vowel when the trans-consonantal vowel is either [i] or [u]. Formant values are averages across speakers. The error bars represent ± one standard deviation.
[Figure: eight panels, [k], [q], [χ], [ʁ], [ħ], [ʕ], [h], [ʔ], each showing VCi, VCa, and VCu; x-axis: First Vowel (V1), [i] or [u]; y-axis: 500 to 2500 Hz.]

The voiced uvular [ʁ] permits substantial carryover coarticulation in all VCV
sequences [t ranges from 8.026 to 10.486, (df = 28); p < 0.001]. The voiceless fricative [χ]
allows V1 to coarticulate with F2 transition in following [i] and [u] [t = 11.719 and 6.203,
respectively, (df = 28); p < 0.001] but permits only marginally significant effects on F2
transition in following [a] [t = 1.757, (df = 28); p < 0.1]. The uvular stop [q] allows
significant carryover coarticulatory effects into [i] [t = 4.194, (df = 28); p < 0.001], marginal
effects into [u] [t = 1.798, (df = 28); p < 0.1], but no significant effects into [a]
[t = -0.108, (df = 28); p = 0.915]. Five of the six VCV sequences involving the pharyngeals
[ħ] and [ʕ] reflect significant carryover vowel-to-vowel coarticulation [t ranges from
2.869 to 8.888, (df = 28); p < 0.01]. The only exception is in the sequence uħV where
only a marginally significant coarticulatory effect takes place [t = 1.959, (df = 28); p < 0.1].
All six VCV sequences in which the consonant is one of the two laryngeals [h] or [ʔ]
reflect significant carryover coarticulation from V1 [t ranges from 3.281 to 17.439,
(df = 28); p < 0.01].
Figure 5.5. Sizes of anticipatory and carryover vowel-to-vowel coarticulatory effects across the sixteen Arabic consonants under investigation. The sizes of the effects are measured by subtracting the value of F2 transition of the target vowel when the trans-consonantal vowel is [u] from the value when the trans-consonantal vowel is [i]. F2 transition values are averaged across speakers.
[Figure: bar chart with one pair of bars (Anticipatory, Carryover) per consonant; y-axis: -100 to 700 Hz.]

Figure 5.5 shows the averaged sizes of anticipatory and carryover coarticulation
allowed by each of the 16 sounds under investigation. Each data bar represents the
subtraction of the averaged F2 value in all vowels when the trans-consonantal vowel is
[u] from the value when the trans-consonantal vowel is [i]. So, for example, the size of
overall anticipatory effect allowed by [t] is calculated by subtracting the average of F2
transition values in all three vowels in the V1 position when V2 is [u] from the average
F2 transition in all three vowels in the V1 position when V2 is [i]. The larger the number,
the more sizable the coarticulatory effect permitted by the intervening consonant. It is
clear from the figure that laryngeals allow the strongest coarticulatory effects. Strong
effects are also allowed by the two pharyngeals and the five plain oral consonants. The
three uvulars show large variability in the size of permitted coarticulatory effects. The
coarticulatory effects allowed by [q] are relatively small compared to [ʁ], which allows
very sizable effects. The effects permitted by [χ] are in between. The four emphatics
clearly resist coarticulatory effects more than any other class of sounds. This is especially
true in the anticipatory direction where the effect permitted by [sˤ] actually has a negative
value.
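
The effect-size calculation described for Figure 5.5 can be sketched as follows. The F2 means are copied from Tables 5.1 and 5.2 (V1 transitions, anticipatory direction) for three of the sixteen consonants. The data layout and the function name are illustrative choices, not part of the original analysis, and because the tabled means are rounded the results are approximate.

```python
# Mean F2 transitions (Hz) of V1, copied from Tables 5.1 and 5.2.
# Layout: consonant -> {target vowel: (mean when V2 = [i], mean when V2 = [u])}
ANTICIPATORY_F2 = {
    "t": {"i": (2135, 1980), "a": (1781, 1706), "u": (1718, 1536)},
    "sˤ": {"i": (1503, 1532), "a": (1079, 1114), "u": (978, 1003)},
    "h": {"i": (2362, 1492), "a": (1937, 1213), "u": (1312, 999)},
}


def effect_size(by_target):
    """Average F2 across target vowels with source [i], minus the same with source [u]."""
    n = len(by_target)
    mean_with_i = sum(pair[0] for pair in by_target.values()) / n
    mean_with_u = sum(pair[1] for pair in by_target.values()) / n
    return mean_with_i - mean_with_u


for consonant, cells in ANTICIPATORY_F2.items():
    print(f"[{consonant}]: {effect_size(cells):.0f} Hz")
```

With these inputs the sketch yields roughly 137 Hz for [t], -30 Hz for [sˤ], and 636 Hz for [h]. The last two correspond to the extremes of the anticipatory range.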
Excluding emphatics, the degree of anticipatory coarticulation permitted by seven
of the remaining 12 consonants is larger than that of carryover coarticulation. The reverse
is true for the other five consonants. Across all segment types, the sizes of carryover
coarticulatory effects are less variable and less extreme than those of anticipatory effects.
F2 differences in the carryover direction range from 78 to 562 Hz, while in the anticipatory
direction the range is between -30 and 636 Hz. It is interesting to note that while the
size of carryover effects allowed by the four non-emphatic coronals is mostly constant,
the two fricatives [s] and [ð] allow more anticipatory effects than the two stops [t] and
[d]. Meanwhile, all of the four emphatics allow more carryover than anticipatory
coarticulatory effects.
5.4 Discussion and Conclusions
The results in this experiment indicate that MSA emphatic consonants strongly
resist vowel-to-vowel coarticulation in both the anticipatory and the carryover directions
while their non-emphatic counterparts do not. Like non-emphatic coronals, pharyngeals,
laryngeals, and the velar [k] allow substantial vowel-to-vowel coarticulation. The impact
of uvulars on vowel-to-vowel coarticulation, meanwhile, depends on their degree of
constriction. Larger degrees of constriction in uvulars correspond to higher
vowel-to-vowel coarticulation resistance.
The results show that the presence and size of vowel-to-vowel coarticulation per-
mitted by a consonant depends largely on how much tongue participation is implicated in
the production of the consonant. The two laryngeals [h] and [?] feature little or no tongue
involvement in their production. It is therefore not unexpected that these two sounds al-
low the most sizable vowel-to-vowel coarticulation in all sequences in both directions
among the consonants investigated. During the production of laryngeals, the tongue dor-
sum is free to move smoothly from the position it assumes in V, to that in V2• The two
pharyhgeals [h, )] and the four plain coronals [t, d, 6, s] are produced with active tongue
participation. However, this participation is limited to the tongue root in the case of the
pharyngeals and the tongue tip/blade in the case of the plain coronals. The tongue dor-
sum, meanwhile, is not actively involved in the production of these sounds. Since the
tongue dorsum is the main articulator of vowels, it is understandable that pharyngeals and
plain coronals allow vowel-to-vowel coarticulation due to their minimal interference with
the dorsal maneuvers. The size of vowel-to-vowel coarticulatory effects permitted by the
two pharyngeals [ħ, ʕ] is somewhat lower than expected since their main articulation
takes place in the lower pharynx away from the tongue dorsum. However, it was noted in
Experiment Two that the tongue dorsum is apparently more active during the production
of these two sounds than one would think. In Experiment Two it was reported that there
was a 'pharyngeal locus' that represents an acoustic hub from which F2 of a following
vowel in a CV sequence starts. The claim presented in Experiment Two that the tongue
body assumes an [a]-like articulatory target located in the middle of the front/back axis
receives additional support here. If the tongue body were totally passive during the articulation
of pharyngeals, we would notice more sizable vowel-to-vowel coarticulatory effects,
approaching in size those noticed for laryngeals. It seems that the [a]-like target of
the tongue body slightly limits the movement of the tongue dorsum. However, since this
target is located around the middle of the front/back dimension, it is almost always lo-
cated close to the middle of the interpolation line that extends from the articulatory target
of V1 to that of V2. The overall result of this situation is that pharyngeals almost always
allow significant vowel-to-vowel coarticulatory effects, but those effects are not as siz-
able as those allowed by laryngeals in which the tongue body is completely free.
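
The geometric reasoning above can be illustrated with a toy one-dimensional sketch. It is not part of the experiment: the front/back coordinates assigned to the vowels and to the two consonantal targets are hypothetical values chosen for illustration, and the `detour` function is an assumed simplification. It measures how much extra front/back travel an intervening target adds to the path from V1 to V2; a target lying between the two vowels adds none.

```python
# Toy 1-D model of the front/back axis (0.0 = front, 1.0 = back).
# All coordinates are hypothetical illustration values, not measurements.
VOWEL_POSITION = {"i": 0.0, "a": 0.5, "u": 1.0}

PHARYNGEAL_TARGET = 0.5   # assumed [a]-like target in the middle of the axis
RETRACTED_TARGET = 1.2    # assumed target behind every vowel, for contrast


def detour(v1, v2, target):
    """Extra front/back travel caused by passing through `target`
    on the way from vowel v1 to vowel v2."""
    start, end = VOWEL_POSITION[v1], VOWEL_POSITION[v2]
    via_target = abs(start - target) + abs(target - end)
    direct = abs(start - end)
    return via_target - direct


if __name__ == "__main__":
    # Print the detour for each vowel pair under both assumed targets.
    for v1, v2 in [("i", "u"), ("u", "i"), ("i", "a"), ("a", "u")]:
        print(v1, v2,
              round(detour(v1, v2, PHARYNGEAL_TARGET), 2),
              round(detour(v1, v2, RETRACTED_TARGET), 2))
```

Under these assumed values the mid-axis target never forces the tongue body off the V1-to-V2 path, whereas a target behind all the vowels always does. The sketch covers only the front/back dimension, so it does not capture the slight limitation attributed above to the [a]-like target.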
The velar stop [k] is produced with active tongue dorsum participation, yet it al-
lows considerable vowel-to-vowel coarticulation in both directions; more so than plain
coronals, as a matter of fact. However, it is widely acknowledged that plain velar stops
greatly adapt the position of their point of constriction to the vocalic context. This way, in
a VkV environment the point of velar constriction during the production of [k] is decided
through interpolation between the two dorsal settings for the flanking vowels. Therefore,
the adaptable placement of the tongue dorsum during [k] imposes few restrictions on the
tongue movement.
The four emphatic coronals [tˤ, dˤ, ðˤ, sˤ] exhibit the most resistance to vowel-to-
vowel coarticulation. These sounds involve active tongue backing that constrains the
tongue dorsum greatly and prevents smooth transitioning from V1 to V2. It is important to
note, however, that this resistance is not obtained for all VCV sequences containing emphatic
consonants. While blocking of anticipatory vowel-to-vowel coarticulation is
found in 10 of 12 VCˤV sequences, only four of 12 sequences exhibit vowel-to-vowel
coarticulation blocking in the carryover direction. Moreover, while other groups of
sounds exhibit generally larger coarticulatory effects in the anticipatory than in the carry-
over direction, emphatics exhibit the opposite pattern. This is probably due to enhance-
ment by mechanico-inertial effects which, by nature, flow in the same direction as carry-
over coarticulation does.
The vowel-to-vowel coarticulatory effects recorded for uvulars vary greatly
among the members of this guttural subset. The voiced fricative [ʁ] permits large coarticulatory
effects in both directions in all VʁV sequences. The voiceless fricative [χ]
permits vowel-to-vowel coarticulation in four of the six VχV sequences covering both
directions. Size-wise, the overall amount of vowel-to-vowel coarticulatory effects allowed
by [χ] is quite substantial and compares well to the pharyngeal approximants. The
uvular stop [q], on the other hand, shows strong resistance to vowel-to-vowel coarticula-
tion. Among the six VqV sequences covering both directions, only two sequences show
significant vowel-to-vowel coarticulatory effects. It seems that vowel-to-vowel coarticu-
lation resistance depends not only on the involvement of the tongue dorsum in the articu-
lation of the intervening consonant but also on the degree of constriction executed by the
dorsum. The degree of constriction involved in the production of fricatives is less than
that involved in the production of stops. Within fricatives, voiceless sounds typically in-
volve more constriction than voiced sounds in order to produce audible turbulence in the
airflow. Apparently, uvular resistance to vowel-to-vowel coarticulation follows the same
ranking (stop > voiceless fricative > voiced fricative). This proposal is logical if we take
into account Recasens' (1985) argument that coarticulation resistance is based on the
physical constraints placed on the articulators by a segment's articulation. The amount of
physical involvement of the articulators required by the production of a stop is more than
the amount required by the production of a homorganic fricative. The higher physical requirements
yield higher articulatory constraints on the articulators. The result is that [q]
exhibits high resistance to vowel-to-vowel coarticulatory effects, [χ] mostly allows such
effects, while [ʁ] is highly transparent to them.
The different patterns of influence on vowel-to-vowel coarticulation between uvu-
lars and emphatics indicate that the types of articulatory constraints each class imposes
on the tongue dorsum are different. Even though the secondary articulation in emphatics
is approximant by definition, it constrains the tongue more so than do the articulations of
the two uvulars [ʁ, χ]. Ghazeli (1977) and Delattre (1971) report that the two uvular
fricatives involve some adjustment in the position of the velum in that it is spread flat
over the back of the tongue in the case of [χ] and is curled downward towards the back of the
tongue in the case of [ʁ]. Furthermore, the general direction of the tongue dorsum movement
during uvulars as reported by Ghazeli and others involves considerable raising
along with the retraction. These are indications that the articulatory muscle that is most
active physically during the production of uvulars is the palatoglossus. As explained in
Chapter 2, this muscle originates from the soft palate and inserts into the back of the
tongue and works to either lower the soft palate or raise the back of the tongue. Inspect-
ing Figure 2.1, which shows the extrinsic muscles of the tongue, clearly reveals that the
palatoglossus is located around the area towards which the tongue moves during the articulation
of uvulars. The styloglossus also seems to be variably active during the production
of uvulars. During the articulation of [χ] and [ʁ], the styloglossus is probably responsible
for cradling the tongue backwards to achieve the noticeable tongue retraction.
Additionally, the velar depressor muscles execute the downward movement of the velum
towards the inside of the oral cavity during the articulation of [ʁ]. For [χ] and [q], the flattened
and seemingly tensed shape of the velum is most likely produced by the velar elevator
and tensor muscles.
Electromyographic (EMG) studies indicate that three extrinsic muscles of the
tongue (the genioglossus, the styloglossus, and the hyoglossus) are the only muscles necessary
for achieving the plain articulations of the different vowels (Maeda and Honda
1994, Honda et al. 1992). The action of these muscles is sufficient to cover all articulatory
movements that span the vowel space. If the articulation of the uvulars [χ] and [ʁ] involves
mainly the palatoglossus and the velar muscles, it becomes understandable why
these sounds do not exhibit much resistance to vowel-to-vowel coarticulation. These
sounds interfere only slightly with the actions of the muscle group responsible for execut-
ing vowel articulations. This leaves these muscles relatively free to transition smoothly
from one vowel configuration to the other. The secondary articulation of emphatics,
though somewhat variable, generally shows more retraction and less raising than the ones
involved in [χ] or [ʁ]. Giannini and Pettorino (1984) suggest that the extrinsic muscles of
the tongue execute the secondary articulation in emphatics. Indeed, the pattern and axis
of tongue dorsum movements during emphatics are more in line with the actions of the styloglossus
and the hyoglossus. The fact that the styloglossus can raise the tongue (in addition
to retracting it) and that the hyoglossus can lower the tongue (also in addition to retracting
it) might explain the variability in the elevation of the retracted tongue dorsum
during emphatics (compare, in particular, Ghazeli's 1977 x-ray tracings of and
Both these muscles are actively involved in the production of vowels. Since emphatics
place their own articulatory demands on them, the two muscles are physically constrained
during the consonant in the sequence, which greatly limits their freedom in moving
from one vowel position to the other. Similarly, during the production of the uvular stop
[q], participation of the styloglossus is apparently more active than during [χ] or [ʁ] since
this sound involves more retraction than the two fricatives. Moreover, the occlusion during
[q] is between the tongue dorsum and the soft palate in an arched stretch that spans
the horizontal as well as the vertical axes. A combined raising and backing of the tongue
is necessary for this occlusion to be effective. The more active involvement of the sty-
loglossus during [q] is a logical explanation for why this sound is the only uvular that is
highly resistant to vowel-to-vowel coarticulation.
The previous rationalization of the differences between the uvular fricatives on
the one hand and the emphatic consonants and the uvular stop on the other hand depends
on viewing main articulators from the angle of the individual muscles that control them.
With this view in mind we conclude that the secondary articulation in emphatics is sig-
nificantly different from the articulation of uvulars. During emphatics, the tongue dorsum
is controlled by the extrinsic muscles of the tongue. During uvulars, on the other hand,
the tongue dorsum is controlled primarily by the palatoglossus and secondarily by the
styloglossus. This view challenges existing proposals which suggest an articulatory
equivalence between the dorsal articulations in these two classes of sounds.
When inspecting the VCV coarticulatory aspects of emphatics and uvulars it ap-
pears that "coarticulatory aggression" is encountered between classes of sounds but not
necessarily within the members of the same class. The term "coarticulatory aggression"
was coined by Fowler and Saltzman (1993) to refer to the observation that sounds that
resist coarticulatory influence from other sounds the most are the same sounds that tend
to spread their own coarticulatory effects to other sounds. When comparing the findings
of the present experiment with those of Experiment Two, we notice that emphatics and
the uvular [q] spread the strongest coarticulatory effects to neighboring vowels and also
exhibit the most resistance to vowel-to-vowel coarticulation. However, when comparing
emphatic sounds to each other and uvulars to each other, this pattern is not necessarily
observable. Among the four emphatics, [sˤ] is generally accompanied by the least sizable
and most variable drop in F2 transition in neighboring vowels in CV and VC environments.
This emphatic, however, exhibits the most resistance to vowel-to-vowel coarticulation.
Moreover, among the three uvulars, the coarticulatory effect of [ʁ] on neighboring
vowels is clearly more sizable than that of [χ]; yet the latter exhibits more vowel-to-
vowel coarticulation resistance than the former. This leads me to suggest that coarticula-
tory aggression should be considered as a property of sound classes rather than individual
sounds.
5.5 Summary
This experiment shows that Arabic non-emphatic coronals, pharyngeals, laryn-
geals, and the velar [k] allow significant amounts of anticipatory and carryover vowel-to-
vowel coarticulatory effects. Arabic emphatics, on the other hand, show strong and rela-
tively consistent resistance to such effects. The impacts of uvulars on vowel-to-vowel
coarticulatory effects follow from the degrees of constriction involved in their articulations.
The voiced uvular fricative [ʁ] involves a relatively mild degree of constriction.
Consequently, this sound permits great degrees of vowel-to-vowel coarticulatory effects.
The voiceless fricative [χ] involves a slightly higher degree of constriction and, as a result,
allows similar, but not as substantial, effects. The uvular stop [q] involves the highest
degree of constriction, causing this sound to be mostly opaque to vowel-to-vowel coarticulation.
These results are interpreted as indications that the dorsal involvements in the articulations
of emphatics and uvulars are not completely similar. When coupled with the
articulatory studies cited in Chapter 2, the results suggest that the tongue dorsum during
the articulation of Arabic emphatics is pulled back through the action of both the styloglossus
and the hyoglossus muscles. Both these muscles are employed by vowel articulations.
An intervening emphatic in VCˤV sequences would therefore place its own articulatory
demands on part of the set of muscles executing the articulations of the
flanking vowels. Hence, the articulatory movement from one vowel to the other would
not be a free one. Uvulars, meanwhile, seem to involve only the styloglossus. The
hyoglossus cannot be implicated in their articulation since the outcome of its contraction
is antagonistic to the tongue raising necessary for their articulation. The degree of in-
volvement of the styloglossus in uvulars depends on their degree of constriction. Since
the stop [q] requires the most contraction by the styloglossus in order to achieve full occlusion,
this uvular interferes the most with vowel-to-vowel coarticulation. The tongue
raising implicated in the articulation of uvulars seems to be, at least partially, the respon-
sibility of the palatoglossus muscle. It should also be kept in mind that all uvulars involve
active participation by the soft palate that is absent in emphatics.
The articulatory details discussed in Experiments Two and Three should be re-
flected in the phonological representations of these sounds. The following chapter dis-
cusses this issue in detail and formalizes the phonological representations of these
sounds. These representations are then shown to be more adequate at addressing the chal-
lenges faced by the previous proposals discussed in §2.4. Also, an alternative reasoning
for the grouping of Arabic gutturals into a single natural class is presented.
CHAPTER 6
Implications and Alternatives
The goal of this chapter is to motivate phonological representations for Arabic
emphatics and gutturals in the light of the phonetic findings of the previous three chap-
ters. To start with, the articulatory inferences from the acoustic data obtained in the three
experiments are reviewed and elaborated on. Those articulatory views are compared and
contrasted with the articulatory presumptions behind the existing formal proposals for the
representations of Arabic emphatics and gutturals. Along the way, an alternative basis for
the grouping of the different subsets of Arabic gutturals into one phonological natural
class is proposed. Formal phonological representations for Arabic emphatics and guttur-
als are then presented. These proposals are tested against the phonological constraints and
rules discussed in Chapter 2; namely, OCP-based morpheme structure constraints on Ara-
bic roots and guttural-conditioned vowel lowering. It was stated in Chapter 2 that these
rules and constraints pose severe challenges to existing representational proposals. It is
shown that the alternative representations provided here are more capable of handling
those challenges. Finally, the classifications of emphatic and guttural inventories in Tigre,
an Ethio-Semitic language, and St'at'imcets, an Interior Salish language, are touched on.
These languages differ from Arabic in how emphatics and gutturals are grouped
into natural classes by place of articulation. Such differences have to be reflected
in the phonological representations of their emphatic and guttural sounds. It is argued that
there are phonetic differences among these languages that underlie the representational
differences.
6.1 Emphatic and Guttural Articulations
Recall from Chapter 2 that emphatics and gutturals cooccur in Arabic consonantal
roots. According to previous representational proposals, this fact stands in violation of the
place OCP since both classes possess similar place or articulator terminal features or
class nodes. Such similarities are expected to trigger place OCP violations and, as a con-
sequence, restrict the cooccurrence of emphatics with gutturals. It was also mentioned in
Chapter 2 that McCarthy (1994) proposes that the domain of the place OCP applicability
should be limited by the conjunction of the feature [approximant] to the feature [pharyn-
geal]. In essence, Arabic gutturals are identified, for OCP purposes, as [pharyngeal,
+approximant]. Under McCarthy's proposal, since emphatics are not approximants, they
are excluded from this domain and their cooccurrence with gutturals is not restricted
since it results in no violation of the place OCP. Crucial to the success of this proposal is
that all Arabic guttural subclasses be classified as approximants. According to Catford
(1977), approximants are consonants in which the vocal tract narrowing, when compared
to fricatives, "is somewhat larger, and flow through it is turbulent only when voiceless,
otherwise it is non-turbulent" (p. 122). Clements (1990) refines the definition of ap-
proximants to include "any sound produced with an oral tract stricture open enough so
that airflow through it is turbulent only if it is voiceless" (p. 293). This definition quali-
fies all gutturals to be approximants (since they do not involve any oral constrictions) and
supports McCarthy's argument. However, recent investigations of pharyngeal/laryngeal
articulations (Esling 1996, 1999; Edmondson et al. 2005) show that the articulatory pos-
sibilities in the pharyngeal cavity include approximants, fricatives, stops, and even trills.
We cannot, therefore, presume that a non-oral articulation is by default a non-fricative
one. The power spectra of pharyngeals, as shown in Experiment One (Chapter 3), support
the claims that these sounds are approximants. Laryngeals do not involve any supraglottal
constriction of any type and thus trivially qualify as approximants. Uvular continuants, on
the other hand, possess fricative-like acoustic attributes pointing to the involvement of
substantial airflow turbulence in their production. Arabic uvular continuants are clearly
fricatives. As non-approximants (i.e., [-approximant]), uvulars are expected to cooccur
with other gutturals as per McCarthy's proposal. Moreover, uvulars should not cooccur
with emphatics. Neither of these expectations materializes, as seen in Table 2.2. The
articulatory finding about uvulars poses serious challenges to McCarthy's arguments
since it reveals incompatibilities between his phonological designation of uvular contin-
uants and their phonetic reality.
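
The mismatch described above can be made concrete with a small sketch. It encodes one possible reading of McCarthy's restriction, stated here as an assumption rather than as his formalism: two root consonants trigger a place OCP violation when both are [pharyngeal] and they agree in the value of [approximant]. The feature tables, the function name `ocp_restricted`, and the selection of example segments are all hypothetical illustration choices.

```python
# Feature values under McCarthy's (1994) designation: all gutturals,
# including uvular continuants, are treated as [+approximant].
MCCARTHY = {
    "ħ":  {"pharyngeal": True, "approximant": True},   # pharyngeal
    "h":  {"pharyngeal": True, "approximant": True},   # laryngeal
    "χ":  {"pharyngeal": True, "approximant": True},   # uvular continuant
    "tˤ": {"pharyngeal": True, "approximant": False},  # emphatic
}

# Same table, but with the uvular continuant re-classified as a fricative,
# i.e. [-approximant], following the acoustic findings discussed above.
PHONETIC = {seg: dict(feats) for seg, feats in MCCARTHY.items()}
PHONETIC["χ"]["approximant"] = False


def ocp_restricted(a, b, features):
    """Assumed rule: cooccurrence is restricted when both segments are
    [pharyngeal] and carry the same value for [approximant]."""
    fa, fb = features[a], features[b]
    return (fa["pharyngeal"] and fb["pharyngeal"]
            and fa["approximant"] == fb["approximant"])


if __name__ == "__main__":
    # Compare the predictions of the two feature tables for three pairs.
    for pair in [("χ", "ħ"), ("χ", "tˤ"), ("tˤ", "ħ")]:
        print(pair,
              "McCarthy:", ocp_restricted(*pair, MCCARTHY),
              "phonetic:", ocp_restricted(*pair, PHONETIC))
```

Under this reading, re-classifying the uvular continuant as [-approximant] lifts the predicted restriction between uvulars and pharyngeals and introduces one between uvulars and emphatics, which are the two expectations that the text reports as not borne out in Table 2.2.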
The consistent and sizable correlation between higher F1 transitions and Arabic
pharyngeals, as shown in Experiment Two (Chapter 4), strongly suggests that pharyngeals
are produced with an actively retracted tongue root (i.e., [+RTR]). A retraction in
the tongue root produces a constriction in the pharynx near the node of F1 (see Figure
1.1) as reflected by the high value of that formant frequency. However, although uvulars
are also associated with relatively high F1 transitions in neighboring vowels, the raising
in the case of uvulars is not as substantial, nor as consistent, as that of pharyngeals. A
logical explanation for this is that the tongue root in uvulars is not actively retracted.
The tongue root is backed only as a side-effect of the retraction of the tongue dorsum
since these structures are closely attached together. With regard to laryngeals, the results
show no particular association with high F1 transitions. This is in line with the predictions
of acoustic-articulatory models since laryngeals do not involve any specific
supra-glottal narrowings of their own. Therefore, any claim pertaining to high F1 as an
acoustic correlate common among all gutturals (McCarthy 1994, Zawaydeh 1999) ex-
plaining their grouping into a natural class in phonology is challenged.
Experiment Two also shows that emphatics, like uvulars, are associated with
high F1 transitions in neighboring vowels that are not as substantial nor as consistent as
the transitions next to pharyngeals. It seems that the tongue root retraction in emphatics is
also a byproduct of the general retraction of the tongue dorsum and not an independent
gesture. The most robust and most consistent acoustic correlate to Arabic emphatics is a
drop in F2 at the transition of the neighboring vowels. The raising of F1 adjacent to em-
phatics is not as crucial to the difference between emphatics and non-emphatics. While
these findings have been reported in numerous works, the present study provides stronger
evidence, in the form of linear discriminant analysis, of the capabilities of F1 raising and
F2 lowering to classify sounds on the basis of the presence or absence of emphaticness.
The inconsistency and smaller magnitude of change in the values of F1 at the transitions
of vowels neighboring emphatics pose a serious challenge to the views that Arabic emphatics
are [+RTR] sounds (Davis 1995, Rose 1996). If the tongue root were actively retracted
during emphatics, we would expect them to consistently correspond to substantially
high F1 transitions on neighboring vowels. Furthermore, since the first node of F2
is located near the node of F1, we would also expect the [+RTR] specification in emphatics
to trigger higher F2 transitions on neighboring vowels. Since neither acoustic outcome
is what takes place in reality, the claim that the tongue root produces the secondary
articulation in emphatics has to be dismissed. McCarthy (1994), Norlin (1987), and
Zawaydeh (1999) suggest that it is the tongue dorsum that produces the pharyngeal con-
striction in emphatics. McCarthy and Zawaydeh make similar suggestions for uvulars.
Both of these claims are supported by the results in Experiment Two. A constriction produced
by a backed tongue dorsum takes place higher in the pharynx than where [+RTR]
articulations are expected. Such constriction corresponds to the antinode of F2 which
would explain the low F2 in the transitions of vowels neighboring emphatics and uvulars.
This effect is stronger for emphatics since they involve a larger dorsal retraction than do
uvulars. I would also add here that the palatine dorsum lowering observed in emphatics
(but not uvulars) by Ali and Daniloff (1972) and Ghazeli (1977) is very important to the
acoustic correlate in question. This lowering widens the vocal tract near the second node
of F2. Since widening has the opposite effect of constriction, the result is an enhancement
of the acoustic product of the upper pharynx constriction.
The results of Experiment Two favor the dismissal of any precise pharyngeal con-
striction likeness between pharyngeals on the one hand and emphatics and uvulars on the
other. The tongue root, which is controlled by the pharyngeal constrictors, is the most
logical source of the pharyngeal constriction in the former, while the latter two classes
achieve their pharyngeal constrictions through the backing of the tongue dorsum, which
is controlled by the extrinsic muscles of the tongue. But does this, then, support the view
that emphatics are uvularized? The notion of a 'uvularized' sound means one of two pos-
sibilities. The first is that the sound has a secondary articulation involving the retraction
and raising of the tongue dorsum towards the uvula. The second is that the sound has a
secondary articulation that is a full-fledged copy of primarily uvular sounds in the sense
that there is a raising and retraction of the tongue dorsum with a concomitant action (curl-
ing or flattening) of the soft palate. The articulatory studies cited in Chapter 2 do not sup-
port either one of those possibilities. During emphatic articulation, the tongue dorsum is
retracted almost horizontally towards the posterior wall of the pharynx while the soft pal-
ate shows no peculiar action. The results of Experiment Three help us understand the ar-
ticulatory differences between emphatics and uvulars (and gutturals in general). Arabic
emphatics show strong resistance to vowel-to-vowel coarticulation. This finding suggests
that these sounds employ the same set of articulatory muscles as vowels. The direction of
the tongue back movement suggests that the muscles in action are the hyoglossus and the
styloglossus. Both of these two muscles pull back the tongue mass into the pharynx. As
far as the vertical placement of the tongue, these two muscles have effects that are in in-
direct opposition to each other. The styloglossus raises the back of the tongue while the
hyoglossus lowers it. This is the possible reason behind the variability in the height of the
tongue dorsum during emphatics. Compare the vertically balanced, or slightly lowered,
tongue dorsum during [f1] in Al-Ani's (1970) x-ray images as well as during both and
in Ghazeli's (1977) images to the lowered dorsum during in the study of Ali and
Daniloff's (1972) images as well as during in Ghazeli's images. The vector sum of
the simultaneous pull by the hyoglossus and the styloglossus is a general retraction on the
horizontal axis accompanied by slight variability in vertical positioning. All of the possi-
ble locations of the retracted tongue dorsum during emphatics occur near the first anti-
node of F2 which ensures a relatively stable acoustic effect.
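
The vector-sum argument can be sketched numerically. The pull directions below (30 degrees above and below the horizontal) and the activation levels are hypothetical values chosen only to illustrate the reasoning; they are not measurements of the styloglossus or the hyoglossus.

```python
import math

# Axes: x = backward (retraction), y = upward (raising).
# Hypothetical unit pull directions, assumed symmetric about the horizontal.
STYLOGLOSSUS = (math.cos(math.radians(30)), math.sin(math.radians(30)))    # back and up
HYOGLOSSUS = (math.cos(math.radians(-30)), math.sin(math.radians(-30)))    # back and down


def net_pull(styloglossus_activation, hyoglossus_activation):
    """Vector sum of the two pulls, each scaled by its activation level."""
    x = (styloglossus_activation * STYLOGLOSSUS[0]
         + hyoglossus_activation * HYOGLOSSUS[0])
    y = (styloglossus_activation * STYLOGLOSSUS[1]
         + hyoglossus_activation * HYOGLOSSUS[1])
    return x, y


if __name__ == "__main__":
    # Balanced, styloglossus-dominant, and hyoglossus-dominant activations.
    for sg, hg in [(1.0, 1.0), (1.0, 0.8), (0.8, 1.0)]:
        x, y = net_pull(sg, hg)
        print(f"styloglossus={sg}, hyoglossus={hg}: back={x:.2f}, up={y:.2f}")
```

With balanced activation the vertical components cancel and only retraction remains; a modest imbalance in either direction leaves the retraction largely unchanged while shifting the dorsum slightly up or down, which is the pattern of variability described above.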
We mentioned in discussing the results of Experiment Two that Arabic emphatics
show similar acoustic coarticulatory effects on neighboring vowels as those reported for
velarized sounds (Ladefoged and Maddieson 1996). This particular similarity merits fur-
ther elaboration. Russian is one language that has a set of velarized sounds. A rich source
of articulatory and acoustic evidence on Russian is Bolla (1981). Figure 6.1 shows
Bolla's x-ray tracings of Russian velarized [lˠ] and [rˠ] as well as their palatalized counterparts
[lʲ] and [rʲ] for the sake of comparison.
Figure 6.1. X-ray tracings of palatalized and velarized laterals and liquids in Russian (Bolla 1981, plates 76-79).
We can clearly see that Russian velarized
sounds display tongue dorsum retraction into the upper pharynx that is quite similar to the
retraction involved in Arabic emphatics. In fact, Bolla refers to these Russian variants as
'pharyngealized'. It seems that the articulatory configurations of Arabic emphatics are
not as rare or as unique as one would think when reviewing a large portion of the re-
search done on them. It seems also that none of the articulation-based terms used to refer
to Arabic emphatics mentioned in Chapter 1 are accurate (see footnote 19). The term 'pharyngealized' is
better suited for sounds where the secondary articulation takes place lower in the pharynx
like Tsakhur vowels (Hess 1992). Such articulations are more likely executed by the pharyngeal
constrictor muscles, not the tongue muscles. I propose here calling Arabic emphatics
(as well as Russian velarized sounds, English dark [ɫ], and similar sounds) 'oropharyngealized'
in reference to the general place where the retracted tongue dorsum
moves to. I, therefore, call for a new IPA diacritic to symbolize this articulation. The diacritic
[ˤ] used in the Handbook of the IPA (1999), as well as throughout this dissertation,
to symbolize secondary articulation in emphatics is more suited for true pharyngealized
sounds such as those in Tsakhur. The diacritic [ˠ] is derived from the corresponding symbol
[ɣ] used for voiced velar fricatives. This symbol would be misleading since the
tongue dorsum movement during emphatics does not involve any substantial raising to-
wards the velar region.
19 'Velarized' seems to be relatively more appropriate than the terms pharyngealized and uvularized since it groups emphatics with phonetically similar sounds encountered in other languages. However, all three terms are not accurate articulatorily. Norlin (1987) rejects the term velarization based on the incompatibility between the real acoustic effects of emphaticness on neighboring vowels and the effects yielded by vocal tract models of velarization. It should be noted, however, that Norlin interprets the term 'velarization' literally (raised tongue dorsum towards the velar region) as shown in his vocal tract model implementations.
Further evidence against the claim that the secondary emphatic articulation is a
retraction of the tongue root is provided by Experiment Three (Chapter 5). In this ex-
periment, emphatics exhibit strong resistance to vowel-to-vowel coarticulation. This is to
be expected since their articulation largely employs the tongue dorsum which is the main
articulator of vowels. Had the tongue root been the implicated articulator, we would ex-
pect more substantial vowel-to-vowel coarticulation across emphatics since the articula-
tion of Arabic vowels does not implicate the tongue root. Uvulars range from strongly
opaque to vowel-to-vowel coarticulation to highly transparent to it. The ranking of resistance
to vowel-to-vowel coarticulation among uvulars falls in a specific order: [q] > [χ] >
[ʁ]. This is also the same ranking of the degree of constriction involved in those three
sounds. For the tongue back to execute a stronger degree of constriction in the uvular
area, it has to be raised higher and brought back further. The articulatory studies cited
earlier show exactly that. Raising and backing is what contracting the styloglossus does
to the tongue dorsum. So, this muscle has to be implicated in the articulation of all three
uvulars. However, since uvulars as a group allow vowel-to-vowel coarticulation, their
articulation cannot be executed primarily by the styloglossus as suggested in McCarthy
(1994). If the styloglossus were the main contributor in uvular articulation, we would expect
much stronger resistance to vowel-to-vowel coarticulation since the styloglossus is a
very active muscle during vowel articulation. The degree of tongue body retraction in
uvulars is mostly less than in emphatics. On the other hand, the tongue dorsum is raised
considerably in uvulars as opposed to emphatics. There has to be a difference in the mus-
cles articulating both sounds. Raising the back of the tongue in the manner displayed by
uvulars is most likely due to the contraction of the palatoglossus. This is a very natural
assumption given the fact that this muscle is also affiliated with the soft palate, which is also
actively involved in uvular articulations. The exclusion of the hyoglossus from uvulars
makes further sense now. The hyoglossus pulls the tongue dorsum downwards away from
the uvular point of articulation in an action that is fully antagonistic to that of the pala-
toglossus (Seikel et al. 1997). So, emphatic articulation is most likely carried out by two
muscles, the styloglossus and the hyoglossus, that are also active in the production of
vowels while uvulars are produced mainly by the action of the palatoglossus along with
the velar depressors (for [ʁ]) and velar tensors (for [χ] and [q]). The styloglossus is argued
here to be also involved in uvulars to maintain a constriction narrow enough for the
production of obstruents. The magnitude of styloglossus contribution increases in line
with the degree of constriction required by the uvular sound. Since the styloglossus is
also active in vowel production, uvulars resist vowel-to-vowel coarticulation in a manner
that reflects the degree to which this muscle is involved in their production.
In sum, the pharyngeal constrictions involved in emphatics and gutturals are the
products of different muscular mechanisms. Laryngeals are produced by the constriction
or spreading of the vocal folds which are controlled by the intrinsic muscles of the larynx.
Pharyngeals are clearly produced with a retracted tongue root which is controlled by the
pharyngeal constrictors. Uvulars are produced by complex velar and lingual maneuvers.
In [ʁ], the velum is curled downwards and inside the mouth. In [χ] and [q] the velum is
pulled upwards and flattened. The tongue dorsum in all three uvulars is pulled up and
backwards. These maneuvers are executed by both lingual muscles (styloglossus and
palatoglossus) and velar muscles (tensors and depressors as well as the palatoglossus
again). The secondary articulation in emphatics is a retraction of the tongue dorsum as a
product of the contraction of both the hyoglossus and the styloglossus. The acoustic find-
ings also show that Arabic pharyngeals are approximants while uvular continuants are
fricatives. Following is a discussion of the phonological ramifications of these articula-
tory details.
6.2 Alternative Basis for the Guttural Natural Class
As explained in Chapter 2, McCarthy (1994) excludes the possibility that
Arabic gutturals share a common active articulator. He offers instead the place of articu-
lation feature [pharyngeal] to represent these sounds phonologically. McCarthy justifies
this feature, which covers a wide articulatory area, on the basis of Perkell's (1980) pro-
posals that distinctive features are "orosensory patterns" that provide feedback informa-
tion specific to each articulatory movement and that are linked to consistent acoustic
characteristics. McCarthy rationalizes that the neuro-sensorily-impoverished pharynx (in-
cluding the larynx) should be treated as a single place of articulation. After explaining his
main proposal, McCarthy adds:
"The orosensory target model is not the only possible approach to the
problem posed by the feature [pharyngeal]. One obvious alternative is that
the pharynx has a uniform characterization in motoric terms. Clearly this
is not true at the lowest level: the uvular constriction is presumably made
primarily by a gesture of the styloglossus, while the true pharyngeals ʕ
and ħ are formed by the pharyngeal constrictors and the glottals are made
by the intrinsic muscles of the larynx. But it is certainly possible that these
consonants form a motoric unity at some much higher level." (p. 201)
McCarthy's assumption that uvulars are primarily controlled by the styloglossus
causes him to miss the "higher level" of "motoric unity" among the various gutturals. The
rationalization for the differences between the articulations of uvulars and emphatics proposed
earlier depends on dissecting the main articulators into the components of their
musculature. So the articulatory maneuvers and constraints are viewed as properties of
the individual muscles rather than the complex speech organs that are controlled by several
muscles. Interestingly, this point of view has another important explanatory potential
as far as the unification of gutturals into a single natural class is concerned. When consid-
ering the innervation sources of the individual articulatory muscles, we notice that the
tongue muscles (including the styloglossus and the hyoglossus) receive their motor in-
nervation from the XII hypoglossal cranial nerve (Zemlin 1968, Perkins and Kent 1986,
Seikel et al. 1997). Meanwhile, motor innervation of the palatoglossus and most velar
muscles is supplied by the X vagus cranial nerve. An exception is the tensor veli palatini
which is innervated by the mandibular branch of the V trigeminal nerve. The X vagus
also supplies motor innervation for most of the pharyngeal muscles including the pharyn-
geal constrictor muscles that are the main active muscles during the articulation of
pharyngeals. One notable exception is the stylopharyngeus muscle which is innervated by
the IX glossopharyngeal cranial nerve. The intrinsic muscles of the larynx that control the
vocal folds and glottal spreading or constriction also receive their motor innervation from
the X vagus. So the main muscles in the pharyngeal region that are implicated in the ar-
ticulations of the three guttural subgroups share a main source of innervation: the X
vagus cranial nerve. This shared neuromotor conduit that is not involved in the produc-
tion of oral sounds should be seriously considered when asking why three sound sub-
classes that are produced at three different points of articulation are viewed as compo-
nents of a single class of sounds.
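The innervation argument above can be restated as a small lookup. The following is only an illustrative sketch: the muscle-to-nerve pairs are those stated in this section, while the names MOTOR_NERVE, MAIN_MUSCLE, EMPHATIC_SECONDARY, and nerves are hypothetical labels introduced here for the illustration.

```python
# Motor innervation as stated in the text above; all identifiers are
# illustrative assumptions, not part of any published model.
MOTOR_NERVE = {
    "styloglossus": "XII hypoglossal",
    "hyoglossus": "XII hypoglossal",
    "palatoglossus": "X vagus",
    "tensor veli palatini": "V trigeminal",
    "pharyngeal constrictors": "X vagus",
    "stylopharyngeus": "IX glossopharyngeal",
    "intrinsic laryngeal muscles": "X vagus",
}

# Main muscle argued in this chapter to execute each guttural subclass.
MAIN_MUSCLE = {
    "uvular": "palatoglossus",
    "pharyngeal": "pharyngeal constrictors",
    "laryngeal": "intrinsic laryngeal muscles",
}

# Muscles argued to execute the secondary articulation of emphatics.
EMPHATIC_SECONDARY = ("styloglossus", "hyoglossus")


def nerves(muscles):
    """Return the set of motor nerves supplying the given muscles."""
    return {MOTOR_NERVE[m] for m in muscles}


guttural_nerves = nerves(MAIN_MUSCLE.values())
emphatic_nerves = nerves(EMPHATIC_SECONDARY)
print(guttural_nerves)  # {'X vagus'}
print(emphatic_nerves)  # {'XII hypoglossal'}
```

Under these assumptions the three guttural subclasses collapse onto a single nerve, while the emphatic secondary articulation falls entirely outside it.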
This neuromotor-based grouping of gutturals, along with the previous explanations
of vowel-to-vowel coarticulatory patterns involving Arabic uvulars and emphatics,
is largely in accord with Joos' (1948) "Overlapping Innervation Wave Theory", re-advocated
recently by Lindblom and Sussman (2002) and Lindblom et al. (2002). Joos'
concept is based on the assumption that speech segments are the result of individual neu-
ral waves sent simultaneously but individually from the speech control center to the in-
volved articulators. These waves increase and diminish in time and overlap with the
waves of neighboring segments. Resolving different neural instructions carried to the ar-
ticulator by neighboring waves is simply a matter of combination or subtraction. If the
two waves place non-contradicting demands on the articulator, the vector sum of the two
movements is executed. If the two articulatory demands are in contradiction with each
other, the weaker wave is subtracted from the stronger wave and the result gets executed
by the articulator. The innervation waves are static and invariant at the highest level of
organization. Joos refers to these abstract forms as 'neuremes'-his equivalent to the
phoneme. Aside from the theoretical details of this model, its basic concepts are not very
different from some of the more recent models that have been proposed to explain coar-
ticulation; mostly those of Fowler and Recasens (see §5.1).
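The combination and subtraction rule described above can be sketched numerically. This is a minimal illustration and not Joos' own formalization: demands are modeled as two-dimensional displacement vectors, and the test for "contradiction" (a negative dot product) as well as the function name resolve are assumptions made for this sketch.

```python
import math


def resolve(wave_a, wave_b):
    """Combine two articulatory demands given as (dx, dy) displacements.

    Assumption of this sketch: two demands contradict each other when
    their dot product is negative, i.e. they pull in opposing directions.
    """
    ax, ay = wave_a
    bx, by = wave_b
    if ax * bx + ay * by >= 0:
        # Non-contradicting demands: the vector sum is executed.
        return (ax + bx, ay + by)
    # Contradicting demands: the weaker wave's strength is subtracted from
    # the stronger wave's, and the stronger wave's direction is kept.
    if math.hypot(ax, ay) >= math.hypot(bx, by):
        strong, weak = wave_a, wave_b
    else:
        strong, weak = wave_b, wave_a
    s = math.hypot(*strong)
    w = math.hypot(*weak)
    scale = (s - w) / s
    return (strong[0] * scale, strong[1] * scale)


print(resolve((1.0, 0.0), (0.0, 2.0)))   # (1.0, 2.0)
print(resolve((4.0, 0.0), (-1.0, 0.0)))  # (3.0, 0.0)
```

In the second call the stronger demand prevails with reduced magnitude, which is the kind of outcome appealed to when a strong consonantal wave interrupts the transition between flanking vowels.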
The neural waves to the guttural articulators travel through separate pathways
from the neural waves to the lingual articulators. Hence, the neuromotor articulatory in-
structions for a pure guttural sound would not clash with those for vowels. This is why, in
Experiment Three, vowel articulations move relatively smoothly from one vowel to the
other through an intervening laryngeal or pharyngeal. In the secondary articulation of
emphatics, the implicated muscles and neural pathways are the same ones implicated in
the articulations of vowels. This way, the neural instructions to the articulators clash, resulting
in an interruption of the articulatory transition from one vowel to the next through
an intervening emphatic. In uvulars, both neural pathways (lingual and guttural) are em-
ployed. However, the strength of the signal traveling through the lingual neural pathway
depends on the required degree of constriction. In [q], the degree of constriction is maxi-
mal. Hence, the lingual neural signal is relatively strong and is capable of interrupting the
exchange between the neural instructions of the flanking vowels. This translates into
strong resistance to vowel-to-vowel coarticulation. In [ʁ], the degree of constriction is
minimal, meaning the lingual neural wave is weak and is easily overridden by those of the
flanking vowels. This is why [ʁ] allows substantial vowel-to-vowel coarticulation. The
voiceless [χ] falls in between, both in lingual involvement and in transparency to
vowel-to-vowel coarticulation.
Clearly, then, the main muscles argued here to be involved in the articulation of
uvulars are closely related, from a neuromotoric angle, to those involved in the articula-
tion of both pharyngeals and laryngeals. We can, therefore, attribute those muscles to a
single active articulator.20 This articulator extends from the anterior faucial pillars to the
20 Kenstowicz (1994) briefly makes a similar suggestion noting that "[a] staunch supporter of the articulator model might speculate that the Glottal and Tongue Root articulators run along the same pathway and branch more superficially: the relevant sense of 'articulator' would be defined at this more abstract level" (p. 457). Zawaydeh (1999) also generally refers to the pharynx as an active articulator. She provides no convincing reasoning, however.
larynx, inclusively. This view can be considered as a complement to Perkell's and
McCarthy's identification of distinctive features as orosensory patterns. Like all guttural
articulations, all lingual articulations share a single source of motoric innervation (the XII
hypoglossal). However, unlike the pharynx, the tongue is divided into two distinct active
articulators: the tongue blade and the tongue dorsum. This disparity can be understood
when considering the lower degree of tactile sharpness available to the general pharyn-
geal articulator as opposed to that available to the lingual articulators. The more detailed
sensory feedback in the oral cavity maximizes the efficiency of the XII hypoglossal nerve
as a single conduit of motoric innervation to two active articulators.
The guttural components in the formal representations of Arabic gutturals should
be used in reference to this common articulator. So, based on the implicated muscles and
their motoric innervation sources, laryngeals, pharyngeals, and uvulars should be re-
garded as involving guttural components (corresponding to the intrinsic laryngeal mus-
cles, the pharyngeal constrictors, and the palatoglossus with the soft palate muscles along
with the X vagus nerve which innervates those muscles). Uvulars should also contain a
dorsal component (corresponding to the styloglossus muscle and the XII hypoglossal
nerve which innervates all of the tongue muscles). On the other hand, emphatics should
involve only a dorsal component (corresponding to the styloglossus and the hyoglossus as
well as the XII hypoglossal nerve). Since none of the muscles innervated by the X vagus
nerve are argued to be actively involved in the articulation of Arabic emphatics, no gut-
tural components should be present in their formal representations.
6.3 Formal Representations
Formal phonological representations of Arabic emphatics and gutturals are pre-
sented here in the light of the previous elaboration on the articulatory traits of those
sounds. The general representational framework adopted here is that of Halle et al.'s
(2000) Revised Articulator Theory (RAT) presented in ( 1) in the Introduction. There are
two main reasons for this choice. First, the anatomical organization of the vocal tract is
reflected accurately in this model. Most notably, the grouping of the Tongue Root with
the Larynx under the Guttural node reflects the close relationship between the muscles
controlling those organs, as pointed out by the authors. This particular point strongly agrees
with the explanations and arguments in §6.2 above regarding the articulatory affinities
among guttural sounds. Second, the model introduces a formal way of reflecting the dif-
ferences between primary and secondary articulations that avoids the shortcomings of
competing proposals. The latter point will be explained shortly.
I propose here some changes to Halle et al.'s (2000) model. First, the term 'Place'
should be replaced with 'Oral'.21 Using the term 'Place' conveys the idea that no portion
of the vocal tract is considered an encompassing place of articulation but the oral tract.
Clearly the pharynx and the larynx are other vocal tract portions where speech articula-
tions are made. Moreover, the term 'Oral' highlights the dichotomy of the vocal tract into
two main articulatory zones, the oral tract and the guttural tract. Alternatively, McCarthy
(1994) suggests retaining the Place node and bifurcating it into an Oral node and the fea-
21 The similar term "Oral Cavity" is used by Clements (1987) to refer to the class node that dominates the node Place (basically the equivalent of Place in the Halle et al. 2000 model) and the stricture feature [±continuant].
ture [pharyngeal] (see (16)). According to McCarthy, this solution is motivated in part by
some phonological rules of vowel-to-vowel assimilation that are blocked by Oral consonants
but not guttural ones. Note, however, that Halle et al.'s model does not need to resort
to such a stipulation since there is an inherent separation between the Oral and the Guttural
nodes. Second, I propose changing the name of the articulator node 'Tongue Root'
to 'Pharynx'. While the term 'Tongue Root' is not formally problematic, it remains a
misnomer and a potential source of confusion. Tongue root-based articulations are clearly
not mainly executed by any of the tongue muscles. Rather such articulations are con-
trolled by the pharyngeal constrictors. Thus, using the term 'Pharynx' instead adds to the
clarity of the phonetic naturalness of the model. Lastly, I suggest adding the feature [ap-
proximant] to the class features [consonantal] and [sonorant]. According to Clements
(1990), approximants act as a group in some phonological processes. Both McCarthy
(1994) and Padgett (1995) argue that this feature interacts with OCP-based restrictions on
morpheme well-formedness. The modified feature tree is shown in (20).
The Halle et al. (2000) model acknowledges six articulators and represents them
by the nodes Lips, Tongue Blade, Tongue Body, Soft Palate, Tongue Root, and Larynx.
The first three articulator nodes are grouped under the higher level node Place while the
last two are grouped under Guttural. One feature under each articulator node is unary
while the rest are binary. The unary features are the six articulator features [labial], [cor-
onal], [dorsal], [rhinal], [radical], and [glottal]. The presence of any of those features in
the representation of a phoneme indicates that its respective articulator is the designated
articulator for that phoneme. If two of those features are present, the phoneme is considered
a complex sound with two primary articulations. The authors use the phoneme /k͡p/
(20) Modified version of Halle et al.'s (2000) feature tree.

[consonantal] [sonorant] [approximant]
  |-- [suction]
  |-- [continuant]
  |-- [strident]
  |-- [lateral]
  |-- Oral
  |     |-- Lips: [round] [labial]
  |     |-- Tongue Blade: [distributed] [coronal]
  |     |-- Tongue Body: [high] [low] [back] [dorsal]
  |-- Soft Palate: [nasal] [rhinal]
  |-- Guttural
        |-- Pharynx: [ATR] [RTR] [radical]
        |-- Larynx: [spread gl] [constricted gl] [stiff vf] [slack vf] [glottal]
as an example. For this sound, both [labial] and [dorsal] are specified under their respec-
tive articulator nodes indicating that the sound is labiodorsal. If one of the articulations in
a complex sound is primary and the other is secondary, only the articulator feature of the
primary articulation is present. The example used by the authors for this type of sounds is
the phoneme /kʷ/. In this case, only the feature [dorsal] is present indicating that this
sound is primarily dorsal. To indicate that the sound is a labialized dorsal, the feature
[+round] is specified under the Lips articulator without the presence of the feature [la-
bial].
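The distinction between primary and secondary articulations described above can be sketched as a simple data structure. This is an illustrative encoding only, not Halle et al.'s formalism: a segment is modeled as a mapping from articulator nodes to the features specified under them, node names follow the modified tree (with 'Pharynx' in place of 'Tongue Root'), and the identifiers ARTICULATOR_FEATURE, primary_nodes, and secondary_nodes are hypothetical.

```python
# Unary articulator feature associated with each articulator node.
ARTICULATOR_FEATURE = {
    "Lips": "labial",
    "Tongue Blade": "coronal",
    "Tongue Body": "dorsal",
    "Soft Palate": "rhinal",
    "Pharynx": "radical",
    "Larynx": "glottal",
}


def primary_nodes(segment):
    """Nodes whose articulator feature is present (designated articulators)."""
    return sorted(
        node for node, feats in segment.items()
        if ARTICULATOR_FEATURE[node] in feats
    )


def secondary_nodes(segment):
    """Nodes that carry features but lack their articulator feature."""
    return sorted(
        node for node, feats in segment.items()
        if feats and ARTICULATOR_FEATURE[node] not in feats
    )


# Doubly articulated labiodorsal stop: both articulator features present.
kp = {"Lips": {"labial"}, "Tongue Body": {"dorsal"}}
# Labialized dorsal stop: [+round] under Lips without [labial].
kw = {"Lips": {"+round"}, "Tongue Body": {"dorsal"}}

print(primary_nodes(kp), secondary_nodes(kp))  # ['Lips', 'Tongue Body'] []
print(primary_nodes(kw), secondary_nodes(kw))  # ['Tongue Body'] ['Lips']
```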
In this sense, Arabic emphatics have the articulator feature [coronal] present un-
der the articulator node Tongue Blade since Arabic emphatics are primarily coronal. The
feature [+back] is specified under the Tongue Body articulator node signifying the pres-
ence of a secondary backing of the tongue dorsum in Arabic emphatics. It should be
noted that this representational proposal for Arabic emphatics is not new. In illustrating
the superiority of the RAT model over Herzallah's (1990) V-Place-based model in repre-
senting Arabic emphatics, Halle et al. (2000) offer a formal proposal containing the fea-
ture [+back] under the Tongue Body node. I adopt this representation here with the minor
replacement of Place with Oral in accordance with the general modification to the model
stated earlier. The modified representation is displayed in (21).
(21) Representation of Arabic emphatics (slightly modified from Halle et al. 2000:409).

        tˤ dˤ ðˤ sˤ
             |
            Oral
           /    \
       Blade    Body
         |        |
       [cor]   [+back] [-high] [-low]
The relevance of the features [-high] and [-low] is not of great importance for our spe-
cific purposes. It can be argued, however, that these two features are a phonological re-
flection of the articulatory vertical equilibrium of the simultaneous use of two muscles to
execute the dorsal retraction: the styloglossus, which is also a tongue back elevator, and
the hyoglossus, which is also a tongue back depressor.
In the same context where Halle et al. present their representation for emphatics,
they propose the representation shown in (22) for Arabic uvulars, which treats them primarily
(and solely) as velars.
(22) Halle et al.'s representation of Arabic uvulars (2000:409)

      χ ʁ q
        |
      Place
        |
       Body
        |
[dors] [+back] [-high] [-low]
In this representation, the main articulation of uvulars is identical to the secondary articu-
lation of emphatics. It appears that Halle et al. (2000) agree with the view that Arabic
emphatics are 'uvularized' coronals which is a view that this dissertation rejects. The
acoustic evidence presented here favors the well-established view that uvulars are complex
segments that have both a dorsal and a radical component (see, in particular, Elorrieta
1991). However, unlike the prominent claim that Arabic uvulars are doubly articulated
(McCarthy 1994, Davis 1995, Zawaydeh 1999), I argue here that Arabic uvulars,
including the stop [q], are primarily guttural and secondarily dorsal. Phonetic evidence for
this comes from the timing of tongue movements involved in the articulation of uvulars.
Ladefoged and Maddieson (1996) note that, in double articulated sounds, both articula-
tions are simultaneous. It was mentioned in Chapter 2 that Delattre (1971) notes in his x-
ray study that there is a two-staged movement by the tongue body during the articulation
of Arabic uvulars. The first is a horizontal sliding backwards and the second is a raising
of the retracted tongue body towards the soft palate. It is clear that the two articulations
involved in the production of Arabic uvulars are not simultaneous. Furthermore, accord-
ing to Ladefoged and Maddieson (1996), only stops and nasals may be doubly articu-
lated. For fricatives produced by two gestures, one of those gestures has to be considered
a secondary articulation. Secondary articulations usually start before and end after pri-
mary ones (emphasis spread comes to mind here). It is logical to assume that the backing
movement is the secondary articulation in uvulars while the raising is the primary articulation.
It has already been explained that tongue retraction is a result of contraction of the
styloglossus (keeping in mind that the hyoglossus cannot be contracted during uvulars). I
therefore consider Arabic uvulars to be primarily guttural and secondarily dorsal. The
designated articulator feature for their primary articulation is [radical] under the Pharynx
node. Hence, I give uvulars the phonological representation in (23).
(23) Alternative representation of Arabic uvulars

                  χ ʁ q
                /       \
            Oral         Guttural
              |             |
            Body         Pharynx
              |             |
[+back] [-high] [-low]   [radical] [-RTR]
Notice that I designate uvulars as [-RTR]. This goes against several previous claims
(Davis 1995, Halle 1995, Rose 1996, Shahin 1997, Zawaydeh 1999). Previous proposals
that Arabic uvulars are [+RTR] rely, for the most part, on Ghazeli's (1977) x-ray images.
These images are interpreted as depictions of a tongue root retraction in uvulars. How-
ever, a look at the tongue root and epiglottis locations in Ghazeli's images of plain oral
consonants reveals that these locations in Ghazeli's subject (himself) start at rest points
that are further back than usual. It is only logical that a backing of the tongue dorsum,
such as what we see in uvulars, results in further backing of the epiglottis and tongue root,
to which it is attached. Given the already backed start locations, the added by-product
retraction might seem subjectively large. On the other hand, if we consider Delattre's
(1971) x-ray images of Arabic uvulars we see no substantial backing of the tongue
root and epiglottis. Additionally, the acoustic evidence reported in Experiment Two
shows that F1 transitions in vowels neighboring uvulars, while generally high, are not as
high as in vowels neighboring pharyngeals. The claim that uvulars are [+RTR] is unsub-
stantiated phonetically.
As for Arabic pharyngeals and laryngeals, I give them the representations in (24).
We have argued earlier that only pharyngeals involve an active retraction of the root of
the tongue (through the action of the pharyngeal constrictors). Accordingly, the feature
[+RTR] is present only in these sounds. The whole supra-laryngeal tract is passive during
laryngeals. Therefore, these sounds are considered [-RTR].
(24) Alternative representations of Arabic pharyngeals and laryngeals

       ħ ʕ                    h ʔ
        |                      |
     Guttural               Guttural
        |                      |
     Pharynx                Pharynx
        |                      |
[radical] [+RTR]        [radical] [-RTR]
Now that we have introduced the alternative proposals for the formal phonologi-
cal representations of Arabic emphatics and gutturals, let us review the capability of these
proposals to overcome the descriptive and analytical shortcomings of the previous pro-
posals. We will discuss how these alternatives handle the Arabic MSCs of root cooccur-
rence restrictions and guttural-conditioned vowel lowering.
6.3.1 Arabic Morpheme Structure Constraints Revisited
It was mentioned in §2.5 that existing representational proposals predict that Ara-
bic emphatics and gutturals should not cooccur in the same roots since the secondary ar-
ticulation in emphatics and the primary articulation in gutturals are similar at the level of
the terminal features ([pharyngeal] - McCarthy 1991, 1994), class nodes (Pharyngeal -
Rose 1996), or Lower Vocal Tract node (LVT - Zawaydeh 1999), triggering an OCP
violation. In reality, Arabic emphatics and gutturals cooccur quite freely as seen in Table
2.2. The alternative representations proposed here overcome this problem trivially since
the secondary articulation in emphatics is fundamentally different from the primary ar-
ticulations in gutturals. However, there are two problems in this context that we still need
to address. The first problem concerns the free cooccurrence of emphatics and uvulars
while both share dorsal components. It was noted in §2.5 that combinations of emphatics
and velars or uvulars and velars are avoided in Arabic roots. We can easily say that since
velars are dorsal sounds and both emphatics and uvulars are secondarily dorsal, the pres-
ence of an emphatic or a uvular in an Arabic root that also has a velar stands in violation
of the OCP. But why do emphatics and uvulars cooccur rather freely when both have
secondary dorsal components? The second problem concerns the free cooccurrence of the
uvular stop [q] with low gutturals (pharyngeals and laryngeals). We would expect the
cooccurrence of [q], as a guttural itself, with other gutturals to be significantly more restricted
than what Table 2.2 shows. We propose here some modifications to the domain
and mechanism for the application of place OCP to address these two problems in order.
Let us start with the issue of emphatic-uvular unrestricted cooccurrence. We men-
tioned in §2.5 that Pierrehumbert (1993) requires the place OCP effect to exclude secon-
dary articulations. This requirement, in its present shape, is too loose and would wrongly
predict that velars should cooccur freely with emphatics as well as with uvulars. I would
like to modify this requirement by stating that place OCP effect excludes secondary ar-
ticulations only if both affected articulations are secondary. What this means is that a
primary articulation cannot cooccur with a similar articulation whether it is primary or
secondary. On the other hand, a secondary articulation can cooccur with a similar secondary
articulation. To make this requirement satisfiable, it is necessary to modify the level at
which the place OCP operates. In previous accounts, the place OCP applies to individual
autosegmental place features (Mester 1986; McCarthy 1988, 1991, 1994; Yip 1989 - see
also Padgett 1995 for the role of stricture features in place OCP application). I propose
here that the place OCP functions at the level of the articulator class nodes in Arabic
roots. In the Halle et al. (2000) articulator-based model a terminal articulator feature is
basically a label indicating that its dominating articulator class node is the designated articulator
executing the stricture features of the segment. So there is some equivalence
between the articulator features and their dominating articulator nodes. There is an addi-
tional benefit to this proposal. Place OCP affects only articulator features but not binary
features like [±back]. McCarthy (1986; cited by Padgett 1995) argues that articulator fea-
tures that are subject to place OCP effects must be privative. While this requirement is
inherited in the Halle et al. (2000) model used here, arguing for class nodes as the domain
over which place OCP operates discards the need for this requirement to begin with.
The presence of a terminal articulator feature in a sequence of sounds makes the
tier of that feature, along with the tier of its dominating articulator node, visible for the
place OCP. If a certain articulator feature is not present in any one of a sequence of pho-
nemes, the tier of its dominating node is invisible to the place OCP. So, if two segments
employ an articulator which is secondary in both cases, these segments are not avoided
since the relevant articulator nodes are not visible to the effect of the OCP. If, on the
other hand, the articulation in question is secondary in one sound but primary in the
other, the relevant tier on which the articulator node is projected would be visible and the
OCP would take effect. So, the place OCP can be formulated as seen in (25).
(25) Place OCP

    * X                 X
      |                 |
Artic. Node Y     Artic. Node Y
      |                 |
   [artic.]         ([artic.])
In (25), the placement of the terminal articulator feature in the second phoneme in parentheses
indicates that the presence of the terminal articulator feature in one or both phonemes
triggers an OCP violation at the node level since it takes only one terminal articulator
feature to expose the tier of its dominating node to OCP operation. All Arabic gutturals
share the same articulator ([radical]), causing the node Pharynx to be visible to
place OCP effects. Thus, their cooccurrence in the same root is restricted since it violates
the place OCP as shown in (26; irrelevant nodes and features are omitted).
(26)  * X             X
        |             |
     Pharynx       Pharynx
        |             |
    [radical]     [radical]
The cooccurrence of emphatics with the velar [k] (and similarly the cooccurrence of uvu-
lars with [k]) is restricted since the Tongue Body node is exposed to the effects of the
place OCP. This exposure is due to the existence of the dependent articulator feature
[dorsal] in [k]. An example is given in (27).
(27)  * tˤ               k
        |                |
   Tongue Body      Tongue Body
                         |
                     [dorsal]
As for the cooccurrence of emphatics and uvulars, neither class has the feature [dorsal] in
its representation causing the tier of the articulator node Tongue Body to be invisible for
place OCP purposes as shown in (28). Therefore, emphatics and uvulars may cooccur
freely in Arabic roots.
(28) X
Tongue Body Tongue Body
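The node-level formulation of the place OCP in (25) and its applications in (26) through (28) can be sketched procedurally. The following is a hedged illustration rather than a full implementation of the proposal: segments are encoded as mappings from articulator nodes to features, following the representations in (21), (23), and (24); the velar [k] is assumed to carry only [dorsal] under Tongue Body; stricture-based refinements are deliberately ignored; and the names ARTICULATOR_FEATURE and ocp_violations are hypothetical.

```python
ARTICULATOR_FEATURE = {
    "Lips": "labial",
    "Tongue Blade": "coronal",
    "Tongue Body": "dorsal",
    "Soft Palate": "rhinal",
    "Pharynx": "radical",
    "Larynx": "glottal",
}


def ocp_violations(seg_a, seg_b):
    """Return the articulator nodes on which two root consonants clash.

    A shared node counts as a violation only if its articulator feature
    is present in at least one of the two segments, as in (25).
    """
    shared = set(seg_a) & set(seg_b)
    return sorted(
        node for node in shared
        if ARTICULATOR_FEATURE[node] in (seg_a[node] | seg_b[node])
    )


emphatic = {"Tongue Blade": {"coronal"},
            "Tongue Body": {"+back", "-high", "-low"}}
uvular = {"Tongue Body": {"+back", "-high", "-low"},
          "Pharynx": {"radical", "-RTR"}}
pharyngeal = {"Pharynx": {"radical", "+RTR"}}
velar_k = {"Tongue Body": {"dorsal"}}  # assumed minimal encoding of [k]

print(ocp_violations(uvular, pharyngeal))  # ['Pharynx']
print(ocp_violations(emphatic, velar_k))   # ['Tongue Body']
print(ocp_violations(emphatic, uvular))    # []
```

Because this sketch has no access to stricture features, it treats every pair of gutturals as a violation and therefore does not capture the relatively free cooccurrence of [q] with low gutturals.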
Specifying the applicability domain of the place OCP over the tier of articulator
class nodes, as we do here, is, in some ways, procedurally similar to the treatment of root
cooccurrence restrictions in Padgett's (1995) 'articulator group' proposal. Padgett proposes
that the elements addressed by the place OCP are the articulators. Upon identifying
articulator similarities, the OCP mechanism then checks for similarities in "OCP-subsidiary
features" which are stipulated in the language for a given articulator.
We turn now to the free cooccurrence of the uvular stop [q] with low gutturals.
We saw earlier that the cooccurrence of two coronals in Arabic roots is not avoided
unless both are sonorants, fricatives, or stops. McCarthy (1994, and references therein)
uses the limiting statements in (29) to govern the domain within which the OCP may
apply to restrict the root cooccurrence of Arabic coronals.
(29) Applicability domain of the OCP for Arabic coronals (from McCarthy 1994:206)
     a. [coronal] / ___             b. [coronal] / ___
                   [αcontinuant]                  [αsonorant]
The first statement denotes that OCP-based restrictions apply to two coronals only if they
share the same feature specification for [continuant]. The second statement indicates that
OCP-based restrictions apply to two coronals only if they share the same feature specifi-
cation for [sonorant]. Going back to the issue at hand, notice that Arabic low gutturals are
approximants while uvulars (including [q]) are not. Meanwhile, all Arabic gutturals,
excluding [q], are continuants. The feature specifications for [approximant] and
[continuant] for Arabic gutturals are listed in (30).
(30) Stricture feature specifications for Arabic gutturals
               [approx]   [cont]
     q            −          −
     χ ʁ          −          +
     ħ ʕ          +          +
     h ʔ          +          +
Notice that, in terms of articulatory stricture, the uvular stop [q], on the one hand, and the
low gutturals, on the other hand, are maximally different. Both feature specifications for
these two sides are opposite each other. Uvular continuants stand in the middle, sharing
one feature specification with each side. It is possible to use those differences and rela-
tions in stricture features to refine our understanding of place OCP-based restrictions on
the cooccurrence of gutturals. Interaction between stricture and place OCP application is
well documented (see Padgett 1995 and references therein). According to Padgett, a
language stipulates which features interact with place OCP applicability on a given
articulator. Whereas this stipulation is integrated in a process of 'checking' for OCP applicability
in Padgett's theory, I follow McCarthy in stating the stipulation as independent limiting
statements whose satisfaction is a prerequisite for the application of the place OCP.
Based on these two constriction features, I would like to present the statement in (31) to
limit the place OCP applicability domain for Arabic gutturals.
(31) Applicability domain of the OCP for Arabic gutturals
     [radical] / ___
                [αcont] or [αapprox]
According to this statement, two guttural segments must share the same specification for
[approximant] or [continuant] to trigger a place OCP violation. The uvular stop [q] does
not cooccur with either [χ] or [ʁ] since all three sounds are [-approximant]. All gutturals,
excluding [q], are [+continuant], explaining their rare cooccurrences. On the other hand,
[q] does not share any of the feature specifications for [approximant] and [continuant]
with pharyngeals and laryngeals. The cooccurrence of [q] with low gutturals, therefore,
lies outside the limitations of (31).
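The limiting statement in (31) can also be expressed as a simple check. The following Python sketch is an informal illustration under stated assumptions: the feature values follow the description of Arabic gutturals given in the text, while the dictionary layout and the function name within_ocp_domain are hypothetical and introduced only for this example. Two gutturals fall inside the applicability domain of the place OCP when they agree in their value for [approximant] or in their value for [continuant].

```python
# Stricture values as described in the text: True stands for "+", False for "-".
STRICTURE = {
    "q": {"approx": False, "cont": False},
    "χ": {"approx": False, "cont": True},
    "ʁ": {"approx": False, "cont": True},
    "ħ": {"approx": True, "cont": True},
    "ʕ": {"approx": True, "cont": True},
    "h": {"approx": True, "cont": True},
    "ʔ": {"approx": True, "cont": True},
}


def within_ocp_domain(seg1, seg2):
    """True if two gutturals agree in [approx] or in [cont], as (31) requires."""
    return any(
        STRICTURE[seg1][feature] == STRICTURE[seg2][feature]
        for feature in ("approx", "cont")
    )


print(within_ocp_domain("q", "χ"))  # True: both are [-approximant]
print(within_ocp_domain("χ", "ʕ"))  # True: both are [+continuant]
print(within_ocp_domain("q", "ħ"))  # False: no shared stricture value
```

Only the pairing of [q] with a pharyngeal or laryngeal returns False, which matches the observation that [q] cooccurs freely with low gutturals.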
In sum, the proposed representations for Arabic emphatics and gutturals, in con-
junction with the formulations in (25) and (31) offer more adequate explanations for the
patterns of root cooccurrences involving emphatics and gutturals than previous proposals.
We turn next to the issue of guttural lowering in vowels.
6.3.2 Guttural Lowering Revisited
In this section, we take a look at the issue of vowel lowering in the neighborhood
of gutturals but not emphatics. Section 2.3.2 reviewed some of the phonological evidence
presented by a number of phonologists illustrating that, in several languages, vowels are
lowered to [a] or epenthetic vowels surface as [a] when adjacent to gutturals. McCarthy
(1994) treats this as a spreading of the feature [pharyngeal] from the guttural to the
vowel. Emphatics, which in McCarthy's proposals also have the feature [pharyngeal], do
not trigger such lowering effects. McCarthy vaguely appeals to the link between ap-
proximants and low vowels to justify this disparity. In the alternative proposals we pre-
sent here, vowel lowering can be treated as the spreading of the feature [radical] from
gutturals to target vowels. The spreading rule is formulated in (32). Since emphatics un-
derlyingly lack the articulator node Pharynx, they do not cause vowel lowering.
(32) Guttural Vowel Lowering
[radical]
To explain the relationship between the low vowel [a] and the articulator [radical], I use
Calabrese's (1993, cited in Halle et al. 2000) 'equivalency relations'. According to Halle
et al., "[t]his idea is implemented formally by positing that Universal Grammar includes a
special set of rules whose adoption ... is favored over the adoption of other rules". Halle
et al. posit a rule which equates [dorsal, -back] in consonants with [coronal] in vowels.
Following this reasoning, we can posit a similar rule which states that [radical] in
consonants is equivalent to [dorsal, +low] in vowels.
There remains an important issue to be tackled here. While emphatics generally
do not trigger vowel lowering, there is a process discussed in Herzallah (1990) and
McCarthy (1994) where, in some varieties of Arabic, vowel lowering appears adjacent to
both gutturals and emphatics. This process, known as ʔimāla 'raising and fronting',
is reported in Northern Palestinian Arabic and Syrian Arabic, among other Eastern Ara-
bic dialects. In these dialects, the feminine suffix surfaces as [-i] or [-e] unless the stem-
final consonant is a guttural, an emphatic, or a contextually emphaticised [r]. Evidence
from Northern Palestinian, as shown in Herzallah (1990:136-137), is presented in (33).
(33) ʔimāla in Northern Palestinian Arabic (Herzallah 1990:136-137)
     Plain stems                 Emphatic stems                 Guttural stems
     hilm-i   'one dream'        batˤtˤ-a   'a duck'            falq-a    'a piece, or a half'
     sall-i   'a basket'         buusˤ-a    'a bamboo stick'    salx-a    'one skinning'
     barz-i   'projection'       buuzˤ-a    'ice cream'         mam-a     'loitering'
     samak-i  'a fish'           marˤrˤ-a   'once'              fallah-a  'a peasant'
     marʒ-i   'a small plain'    fariidˤ-a  'an obligation'     zariiʕ-a  'plants, offspring'
     farʃ-i   'a mattress'                                      walh-a    'love, or sudden awakening'
Herzallah analyses this process as the spreading of [pharyngeal] to the suffix
vowel. This process seemingly challenges my present argument that emphatics do not
have a pharyngeal component. Notice, however, that the fact that [r] causes vowel lowering
only when contextually emphaticised indicates that the representational constituent
whose spreading triggers ʔimāla does not have to be part of that sound's underlying
representation.
22 Herzallah uses the symbol [a] for the emphaticised version of [a].
To account for this process, I propose here that Eastern Arabic dialects have a specific
rule that assigns [radical] to any sound that has the feature set [+back, -high, -low].
This rule is ordered before the vowel lowering rule in (32). In some sense, this proposal is
the reverse of McCarthy's (1994) proposal that the feature [dorsal] is redundantly as-
signed to emphatics. McCarthy's proposal is not a natural one since not all pharyngeal
constrictions are executed by the tongue body. Recall that in pharyngeals the tongue dor-
sum is not usually retracted, while in laryngeals the whole tongue is virtually passive. It
makes little sense, then, to assume that a redundancy rule would assign [dorsal] to a [pha-
ryngeal] sound only if it is an emphatic. The reverse rule proposed here is more natural
since it can be linked to the fact that retracting the tongue dorsum necessarily results in an
oropharyngeal narrowing since the tongue mass would have no other place to move to.
Furthermore, the link between sounds that involve a retracted tongue dorsum and guttural
articulations is not rare as the following section shows.
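The ordering of the two rules discussed above can be made concrete with a short sketch. The Python code below is an informal illustration rather than a formal implementation of the analysis: the feature dictionaries, the specific feature values assigned to the example consonants, and the names assign_radical and feminine_suffix are assumptions made for this example. The redundancy rule first assigns [radical] to any segment specified as [+back, -high, -low]; the lowering rule in (32) then spreads [radical] to the suffix vowel, which surfaces as [a] instead of [i].

```python
# Hypothetical feature bundles for stem-final consonants (values are illustrative).
PLAIN_M = {"back": False, "high": False, "low": False}
VELAR_K = {"back": True, "high": True, "low": False}
EMPHATIC_T = {"back": True, "high": False, "low": False}
PLAIN_R = {"back": False, "high": False, "low": False}
EMPHATICISED_R = {"back": True, "high": False, "low": False}  # [+back] acquired in context
GUTTURAL_X = {"back": True, "high": False, "low": False, "radical": True}


def assign_radical(segment):
    """Redundancy rule: a [+back, -high, -low] segment receives [radical]."""
    result = dict(segment)
    if (segment.get("back") is True
            and segment.get("high") is False
            and segment.get("low") is False):
        result["radical"] = True
    return result


def feminine_suffix(stem_final, has_redundancy_rule=True):
    """Apply the redundancy rule (if the dialect has it), then vowel lowering."""
    segment = assign_radical(stem_final) if has_redundancy_rule else stem_final
    # Lowering as in (32): [radical] spreads to the suffix vowel.
    return "a" if segment.get("radical") else "i"


print(feminine_suffix(PLAIN_M))         # i
print(feminine_suffix(EMPHATIC_T))      # a
print(feminine_suffix(EMPHATICISED_R))  # a
print(feminine_suffix(EMPHATIC_T, has_redundancy_rule=False))  # i
```

Setting has_redundancy_rule to False models a variety without the proposed rule, where only underlying gutturals condition the lowered suffix vowel.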
6.4 A Note on Ethio-Semitic and Interior Salish
Having discussed the alternative representations of Arabic emphatics and guttur-
als, we turn our attention briefly to Ethio-Semitic and Interior Salish. Phonological data
from both language families has already been reviewed in Chapter 2. The main issue we
discuss here is that both language families seem to treat their emphatics, in terms of place
of articulation, as members of the guttural natural class to the exclusion of laryngeals.
Laryngeals in these languages are considered placeless. This is different from the case of
Arabic, where emphatics are not usually grouped with gutturals as a natural class while
laryngeals are (as reflected by the free cooccurrence of the members of both classes in
consonantal roots as well as most cases of vowel lowering). Ethio-Semitic evidence for this
difference comes from Tigre (Lowenstamm and Prunet 1985, 1987; cited in Rose 1996)
where pharyngeals and emphatics (ejectives) trigger vowel lowering, but not laryngeals.
Interior Salish evidence is cited in §2.3.1 where, in Moses-Columbian, a pharyngeal
sound cannot be followed by another pharyngeal, a uvular, or an emphatic (retracted al-
veolar) in the same root. No such restriction on laryngeals exists. The differences in
sound groupings among the three languages (Arabic, Tigre, and Moses-Columbian) merit
some consideration. I point out here that there are phonetic differences among the three
languages as far as emphatics and gutturals are concerned. These differences call for a
reconsideration of the formal representations of emphatics and gutturals in the three lan-
guages.
We start with Tigre. The emphatic sounds in Tigre are known to be ejectives. This
means that the airstream pressure source in the articulation of Tigre emphatics is glottalic
rather than pulmonic. According to Catford (1977), in the glottalic initiation of airstream
pressure "the glottis is tightly closed, the larynx is jerked upwards by action of the extrin-
sic laryngeal muscles, which attach the larynx to the hyoid bone and other structures
above it". Catford also adds that, "[t]here may, in addition, be some secondary sphinc-
teric compression of the pharynx" (p. 68). This maneuver pushes the air volume above
the glottis outwards. This laryngeal action employs both the intrinsic and the extrinsic
muscles of the larynx. The intrinsic muscles execute the glottal closure while the extrinsic
muscles raise the whole larynx to push the air volume. The sphincteric narrowing of the
pharynx is caused by the pharyngeal constrictors. According to Seikel et al. (1997), the
laryngeal elevators include the extrinsic muscles of the larynx as well as the
thyropharyngeus muscle, which is part of the large inferior pharyngeal constrictor muscle. Both
the intrinsic muscles of the larynx and the pharyngeal constrictors are guttural muscles
according to the understanding we established here. It is safe to assume, then, that Tigre
emphatics involve a guttural articulation not present in Arabic emphatics. Proposed rep-
resentations for Tigre pharyngeals and emphatics are shown in (34). Following Rose
(1996), Tigre ejectives are represented as [+RTR]. This feature receives phonetic backing
from Catford's (1977) description cited earlier. The presence of a guttural component in
Tigre gutturals and emphatics should explain their treatment as a natural class in
phonology.
(34) Representation of Tigre emphatics (ejectives) and pharyngeals.
     a. Ejectives                      b. Pharyngeals
        Oral        Guttural              Guttural
         |             |                     |
      T. Blade      Pharynx               Pharynx
         |             |                   /    \
      [coronal]     [+RTR]          [radical]  [+RTR]
With regard to Interior Salish, the acoustic work by Bessell and Czaykowska-Higgins
(1992) shows that there are strong phonetic similarities between emphatics (or
retracted alveolars, as they are often called), on the one hand, and the uvular and pharyn-
geal gutturals, on the other. They provide acoustic vowel space charts showing the effects
of those sounds on vowels. The charts clearly show that emphatics, uvulars, and
pharyngeals in Salish correspond to substantially high F1 transitions and substantially low F2
transitions. In Arabic, by comparison, we saw in Chapter 4 (Experiment Two) that
emphatics correspond to substantially low F2 transitions and mildly high F1 transitions.
Arabic pharyngeals, meanwhile, correspond to substantially high F1 transitions and
slightly lowered F2 transitions. These results suggest that there is a difference in the
articulatory nature of emphatics and gutturals in Arabic and Salish. A recent ultrasonic
study of St'át'imcets (Lillooet Salish) by Namdaran (forthcoming) strongly backs up
this suggestion. The relatively large group of guttural sounds in St'át'imcets is generally
similar to that in other Interior Salish dialects. Namdaran reports that emphatics in
St'át'imcets are articulated with a retracted tongue root. Retraction of the tongue dorsum
in these sounds seems to be dialect-dependent. This is different from Arabic, where the
tongue root is only retracted as a byproduct of the strong tongue dorsum retraction.
Furthermore, Namdaran notes that the tongue root retraction in St'át'imcets [ʕ] is not severe
(cf. Arabic pharyngeals). This sound also involves a substantially retracted tongue
dorsum, somewhat similar to the uvular [q]. In St'át'imcets pharyngeals, the pharyngeal
constriction takes place in the upper pharynx. As a matter of fact, Namdaran describes
the St'át'imcets sounds [ʕ, ʕʷ, ʕ', ʕ'ʷ] as pharyngealized uvulars. Arabic [ʕ] is articulated
with a largely retracted tongue root while the tongue dorsum is held in a mid oral position
(Ghazeli 1977, Delattre 1971). There are clearly fundamental differences between
Arabic and St'át'imcets in the articulations of emphatics and gutturals. Both classes of
sounds in St'át'imcets involve active retraction of the tongue root and, depending on the
dialect, active retraction of the tongue dorsum. Representations of St'át'imcets
emphatics and pharyngeals are shown in (35). The representation of St'át'imcets emphatics is
essentially a version of the representation given by Namdaran (forthcoming) modified
slightly to fit the Halle et al. (2000) model.
(35) Representations of St'át'imcets retracted alveolars and pharyngealized uvulars.
     a. Retracted Alveolars            b. Pharyngealized Uvulars
        Oral        Guttural              Oral         Guttural
         |             |                   |              |
      T. Blade      Pharynx             T. Body        Pharynx
         |             |                   |            /    \
      [coronal]     [+RTR]              [+back]  [radical]  [+RTR]
The reasoning behind the grouping of Tigre and St'át'imcets gutturals into a single
natural class is less abstract than the one provided for Arabic in §6.2. The natural
class of gutturals in Tigre and St'át'imcets includes all sounds that are produced with a
retracted tongue root (i.e., [+RTR]). Laryngeals are excluded from these classes since
they do not involve any supraglottal constriction of their own.
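The feature-based definition of the guttural class in these two languages can be illustrated as a simple membership test. The Python sketch below is illustrative only: the feature sets paraphrase the representations in (34) and (35), laryngeals are encoded with an empty set to reflect their placeless status, and the names INVENTORIES and natural_class are assumptions introduced for this example.

```python
# Hypothetical, simplified feature sets based on the representations in (34) and (35).
INVENTORIES = {
    "Tigre": {
        "ejectives": {"coronal", "+RTR"},
        "pharyngeals": {"radical", "+RTR"},
        "laryngeals": set(),  # treated as placeless
    },
    "St'át'imcets": {
        "retracted alveolars": {"coronal", "+RTR"},
        "pharyngealized uvulars": {"+back", "radical", "+RTR"},
        "laryngeals": set(),  # treated as placeless
    },
}


def natural_class(language, feature="+RTR"):
    """Return the sound classes of a language that carry the given feature."""
    classes = INVENTORIES[language]
    return sorted(name for name, features in classes.items() if feature in features)


print(natural_class("Tigre"))         # ['ejectives', 'pharyngeals']
print(natural_class("St'át'imcets"))  # ['pharyngealized uvulars', 'retracted alveolars']
```

In both inventories the laryngeals are left out because they carry no [+RTR] specification, which is the exclusion the text describes.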
6.5 Summary
This chapter puts together the results of the previous three acoustic experiments
and arrives at a more detailed understanding of the articulatory traits of Arabic
emphatic and guttural sounds. These details are used to motivate alternative formal
representations of these sounds. Emphatics are argued to be primarily coronal and secondarily
dorsal sounds. Uvulars are argued to be primarily radical and secondarily dorsal. Pharyn-
geals and laryngeals are argued to be radical sounds. Only pharyngeals are argued to be
[+RTR] sounds in Arabic. The three guttural subclasses are argued to be produced by a
single active articulator, the pharynx, based on high-level motoric unity among the differ-
ent guttural muscles producing those subclasses. The formal representations presented
here are shown to be better at handling the theoretical challenges facing existing propos-
als. Emphatics are allowed to cooccur freely with gutturals since emphatics do not share a
pharyngeal component with gutturals. While both emphatics and uvulars have dorsal
components, it is argued that since these components are representations of secondary
articulations in both classes of sounds, they do not trigger any OCP violations. Hence, the
cooccurrence of emphatics and uvulars is not restricted. The free cooccurrence of the
uvular stop [q], which is considered here to be a guttural, with low gutturals is argued to
be a consequence of maximal distinction between this stop and the low gutturals in terms
of stricture features.
Vowel lowering in guttural contexts is argued to be the result of spreading the
articulator feature [radical]. Emphatics do not generally cause vowel lowering since they
lack the feature [radical]. However, to account for cases where both emphatics and
gutturals cause vowel lowering in some Eastern Arabic dialects, it is argued here that a
redundancy rule in such dialects assigns [radical] to any sound with the feature
specification [+back, -high, -low]. In these dialects, the assigned feature [radical] in emphatics
and the contextually emphaticised [r], along with the underlying feature [radical] in
gutturals, is spread onto the target vowel by a single vowel lowering rule.
Two other languages that have emphatic and guttural sounds are briefly discussed:
Tigre and St'át'imcets. What is interesting about these two languages is that
they group emphatics with the gutturals and exclude laryngeals. The phonological
identities of emphatics and gutturals in these two languages have to be different from those in
Arabic. It is argued here that there are phonetic differences that underlie these
phonological differences. Tigre emphatics are ejectives, meaning they involve a pharyngeal action
of larynx raising. Furthermore, these sounds are argued elsewhere to be [+RTR], an
articulatory property not uncommon in ejective sounds. Recent articulatory investigations
of St'át'imcets show that its emphatics (or retracted alveolars) always involve a
retracted tongue root and occasionally involve a retracted dorsum. The pharyngeals in this
language involve a sizeable dorsal retraction along with the tongue root retraction, causing
their narrowest constriction to take place in the upper pharynx. Emphatics in both
languages are argued to include radical components representing their active secondary
pharyngeal articulations. The natural class of gutturals in Tigre and St'át'imcets is
therefore defined as the set of sounds that have the feature [+RTR]. This admits emphatics and
pharyngeals (and uvulars in the case of St'át'imcets; Tigre has no uvulars) and
excludes laryngeals.
CHAPTER 7
Conclusion and Future Directions
This dissertation is motivated primarily by the descriptive and analytical inade-
quacies facing existing formal representations of Arabic emphatic and guttural sounds. It
is believed that these inadequacies are a direct consequence of a lack of understanding of
the phonetic similarities and differences among those sounds. This dissertation aims to
further our understanding of the phonetic qualities of those sounds in order to arrive at
alternative formal representations that are more capable of handling the relevant phono-
logical phenomena such as root cooccurrence restrictions and vowel lowering. The dis-
sertation also seeks to propose a more elaborate reasoning for the grouping of the three
Arabic guttural sub-classes (uvulars, pharyngeals, and laryngeals), which are produced at
distinct points of articulation, into a single natural class. To achieve these goals, the dis-
sertation relies on a set of three acoustic experiments using speech samples from Modern
Standard Arabic. The dependence on acoustic data to further our knowledge about articu-
lation follows from the well-established theories and models of acoustic-articulatory rela-
tions.
The first acoustic experiment focuses on two gaps in the acoustic literature on
Arabic emphatics and gutturals. The first gap is the lack of any extensive and objective
acoustic research on the spectral differences between Arabic emphatic consonants and
their non-emphatic counterparts. The spectral shapes of consonants are among the most
important correlates to articulation. The possibility that consonantal spectral correlates to
emphaticness could be detected is in need of extensive, modern research. The second gap
is the lack of any reliable phonetic characterization of the consonantal status of Arabic
uvular continuants. These sounds are mostly characterized as fricatives. However, in
some phonological views, these sounds are classified as approximants. A principled pho-
netic judgment is needed in this matter. This experiment addresses these two issues by
describing and comparing the spectral shapes of the consonants in question using two
spectral analysis methods: spectral moments and multi-band spectra. The latter is a novel
tool being introduced for the first time in this dissertation. The results indicate that no
highly reliable acoustic correlates to emphaticness can be located in the spectral shapes of
the consonants. The canonical spectra of consonants are, therefore, not considered reli-
able sources for emphatic/non-emphatic acoustic differences. More importantly, the re-
sults show that Arabic uvular continuants have the spectral qualities of fricatives. This
finding has major impacts on any phonological views that are predicated on classification
of all Arabic gutturals as approximants.
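For readers unfamiliar with spectral moments, the general technique can be sketched in a few lines. The Python code below is a generic illustration under stated assumptions and does not reproduce the exact analysis procedure used in the experiment: it treats a power spectrum as a probability distribution over frequency and computes its first four moments (centre of gravity, standard deviation, skewness, and excess kurtosis). The function name spectral_moments and the toy spectra are hypothetical. The multi-band spectral method is not sketched here, since its details are given elsewhere in the dissertation.

```python
import math


def spectral_moments(freqs, power):
    """Return (mean, sd, skewness, excess kurtosis) of a power spectrum.

    freqs: frequency of each spectral bin in Hz.
    power: non-negative power in each bin (same length as freqs).
    """
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum must contain some energy")
    weights = [p / total for p in power]  # normalise to a distribution
    mean = sum(f * w for f, w in zip(freqs, weights))
    variance = sum((f - mean) ** 2 * w for f, w in zip(freqs, weights))
    sd = math.sqrt(variance)
    skewness = sum((f - mean) ** 3 * w for f, w in zip(freqs, weights)) / sd ** 3
    kurtosis = sum((f - mean) ** 4 * w for f, w in zip(freqs, weights)) / sd ** 4 - 3
    return mean, sd, skewness, kurtosis


# Toy three-bin spectra, for illustration only.
symmetric = spectral_moments([1000.0, 2000.0, 3000.0], [1.0, 2.0, 1.0])
low_tilted = spectral_moments([1000.0, 2000.0, 3000.0], [3.0, 2.0, 1.0])
print(symmetric[0])   # centre of gravity of the symmetric spectrum
print(low_tilted[0])  # lower centre of gravity when energy is concentrated low
```

A spectrum with more energy at low frequencies yields a lower first moment and a positive skewness, which is the kind of shape difference such measures are meant to capture.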
The second acoustic experiment focuses on the coarticulatory impact of emphatics,
non-emphatics, and gutturals on the formant frequencies of adjacent vowels. The different
coarticulatory impacts of these sounds on the F1 and F2 of adjacent vowels are
interpreted as indications of the articulatory qualities of those sounds. The results show that
the most reliable coarticulatory correlate to emphaticness is a substantially low and stable
F2 locus in the vowel adjacent to the emphatic sound. Uvulars are also associated with
low F2 transitions. These transitions, however, do not point to identifiable F2 loci in
adjacent vowels. The size of the F2 drop in uvulars depends on the identity of the
vowel. Pharyngeals, meanwhile, are associated with consistently high F1 transitions.
While emphatics and uvulars are also generally accompanied by high F1 transitions,
these transitions are not as high nor as consistent as those accompanying pharyngeals.
These findings are interpreted as indications that the only sounds in Arabic where the
tongue root is actively retracted are pharyngeals. Emphatics and uvulars, meanwhile, are
produced with a retracted tongue dorsum. Any tongue root retraction in emphatics and
uvulars is considered a by-product of the dorsal retraction. These findings pose chal-
lenges to the phonological views that represent Arabic emphatics and uvulars as [+RTR]
sounds. Furthermore, the more stable association between low F2 transitions and emphat-
ics, as opposed to uvulars, is interpreted as an indication that the dorsal retractions in em-
phatics and uvulars are not quite similar.
The third acoustic experiment investigates the coarticulatory effects of two flank-
ing vowels on each other across an intervening consonant which was a plain oral, an em-
phatic, or a guttural. Such coarticulatory effects are usually influenced by the articulatory
restriction placed on the tongue dorsum by the intervening consonant. The results show
that plain orals and low gutturals allow significant amounts of anticipatory and carryover
vowel-to-vowel coarticulatory effects. Arabic emphatics, on the other hand, strongly and
consistently resist these effects. The impacts of uvular gutturals on vowel-to-vowel coar-
ticulation depend on the degrees of constriction involved in their articulations. The
voiced uvular fricative [ʁ], which involves mild constriction, permits large degrees of
vowel-to-vowel coarticulatory effects. The voiceless fricative [χ], which involves a
higher degree of constriction, allows coarticulatory effects that are not as substantial. The
uvular stop [q], which involves the highest degree of constriction, strongly resists vowel-
to-vowel coarticulation. These results are interpreted as indications that the involvement
of the tongue dorsum in the articulation of emphatics is not fully similar to that in uvulars.
The tongue dorsum during the articulation of Arabic emphatics seems to be pulled back
through the action of both the styloglossus and the hyoglossus muscles. Both of these
muscles are employed in vowel articulations, explaining why these consonants strongly resist
vowel-to-vowel coarticulation. By comparison, the dorsal retraction in uvulars seems to
involve only the styloglossus since contraction of the hyoglossus is directly antagonistic
to the tongue raising necessary for their articulation. The degree of involvement of the
styloglossus in uvulars depends on their degree of constriction. Since the stop [q] requires
the most contraction by the styloglossus, this uvular interferes the most with
vowel-to-vowel coarticulation. The tongue raising during the articulation of uvulars seems to be, at
least partially, the responsibility of the palatoglossus muscle. All three uvulars also in-
volve active participation by the soft palate.
The articulatory perspectives gained from the three acoustic experiments are over-
viewed in Chapter 6 and used to revise the phonological representations of Arabic em-
phatics and gutturals. Emphatics are argued to include a dorsal component in their repre-
sentation, but no pharyngeal component. Uvulars are argued to include both dorsal and
pharyngeal components. Pharyngeals and laryngeals should include pharyngeal compo-
nents only. Formal representations reflecting these views are presented and shown to be
more adequate at handling root cooccurrence restrictions and vowel lowering in Arabic
than existing proposals. The chapter also introduces a new reasoning for the grouping of
Arabic gutturals into a single natural class. The three guttural subclasses are argued to be
produced by a single active articulator, the pharynx, based on high-level motoric unity
among the different guttural muscles producing those subclasses. The chapter also briefly
discusses two other languages that possess emphatic and guttural sounds, Tigre and
St'át'imcets. These two languages group emphatics with the gutturals into a single
natural class and exclude laryngeals. It is argued that there are phonetic differences that
underlie these phonological differences. Unlike Arabic emphatics, Tigre and St'át'imcets
emphatics seem to involve active pharyngeal articulations. The natural class of gutturals
in Tigre and St'át'imcets is defined as the set of sounds that have the feature [+RTR]. This
admits emphatics and excludes laryngeals.
Much of the mischaracterization of the articulatory properties of the sounds inves-
tigated in this dissertation stems from the physical proximity and dependency between
the precise articulators executing those sounds. In future research, these sounds can be
subjected to extensive and direct articulatory studies. A very telling method would be
electromyography (EMG). This method can measure and diagnose individual muscle ac-
tivities with precision. This method is used to study speech sound articulation. However,
there are not many published works in this field as far as pharyngeal articulations are
concerned. A possible reason is that EMG applications are quite invasive since they re-
quire the insertion of electrodes in the forms of small pins or plates into the musculature.
One could imagine how difficult this would be if the musculature being studied is in a
more internal organ like the pharynx or the larynx. Thus, finding and applying an alterna-
tive to EMG that is less invasive would add great depth to our understanding of the nu-
ances of speech production.
Studies of muscle fiber types in the different articulatory muscles could also be
very informative as far as the distinctions among articulators are concerned. Muscle fi-
bers are generally classified into slow-twitch Type I fibers and fast-twitch Type II fibers.
Dr. Raymond Kent brought to my attention the possibility of grouping and classifying
articulators based on the distribution of different types of fibers in their controlling mus-
cles (see Kent 2004). The oral-guttural dichotomy could potentially be based on the ratio
of slow-twitch fibers to fast-twitch fibers present in the musculatures of those cavities. So
far, no studies dedicated to that goal have been conducted.
Staying in the field of articulatory studies, one can see that most articulatory
works reviewed in this dissertation employ x-rays. This method, while still considered
very helpful for studying speech production, poses some health risks to subjects. Pro-
longed exposure to radiation is not a liability experimenters and participants are easily
willing to take. The risks are aggravated if one intends to study speech production in mo-
tion. One downside to this is that all x-ray studies cited here had very few subjects, or
even just one. At times it would be only the experimenters themselves who are willing to
take the risk. Computerized tomography (CT) scans also use x-rays and, therefore, inherit
the same health risks. Recently, use of ultrasound imaging and magnetic resonance imag-
ing (MRI) as methodological alternatives has generated increasing interest. These meth-
ods carry much less health risk than x-rays. The advent of 3-D ultrasound and motion
"open MRI" carries with it exciting new methodological potential for speech sciences.
Employing these new methods can enrich our understanding of the details of pharyngeal
articulations and speech production as a whole.
This dissertation introduces the multi-band spectral (MBS) method as a refine-
ment of the venerable FFT method for the purpose of characterizing obstruent power
spectra. This method shows real potential as an economical, quantitative alternative to FFT
spectra. Given that the application of this method in the present dissertation is limited to
eight sounds, five subjects, and one language, extending its use to more sounds,
more subjects, and a variety of languages is essential. Doing so would test the
method's cross-subject, cross-examiner, and cross-linguistic reliability.
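The general idea of replacing a full FFT spectrum with a handful of band-level numbers can be illustrated with a minimal sketch. It does not reproduce the MBS procedure used in this dissertation, whose band definitions are given in the earlier chapters; the equal-width band edges, the function names, and the synthetic 3 kHz test tone below are all assumptions made only for illustration.

```python
import cmath
import math


def power_spectrum(samples):
    """Return the one-sided power spectrum of a real signal (naive DFT)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def band_proportions(samples, sample_rate, band_edges_hz):
    """Share of total spectral power in each [low, high) frequency band.

    band_edges_hz is a hypothetical list of band boundaries in Hz; it is
    not taken from the dissertation.
    """
    n = len(samples)
    power = power_spectrum(samples)
    bin_hz = sample_rate / n
    totals = [0.0] * (len(band_edges_hz) - 1)
    for k, p in enumerate(power):
        freq = k * bin_hz
        for b in range(len(totals)):
            if band_edges_hz[b] <= freq < band_edges_hz[b + 1]:
                totals[b] += p
                break
    grand_total = sum(totals)
    if grand_total == 0:
        return totals
    return [t / grand_total for t in totals]


# Assumed test signal: a 3 kHz tone sampled at 16 kHz (256 samples).
RATE = 16000
tone = [math.sin(2 * math.pi * 3000 * t / RATE) for t in range(256)]
EDGES = [0, 2000, 4000, 6000, 8001]  # four illustrative bands up to Nyquist
shares = band_proportions(tone, RATE, EDGES)
```

Summarizing a spectrum this way yields a few numbers per token that can be compared directly across subjects and languages, which is the kind of cross-subject and cross-linguistic testing suggested above.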
Finally, a cross-linguistic articulatory and phonological comparison of guttural
sounds is needed in order to provide more solid grounding for their phonological repre-
sentations. As we briefly see in Chapter 6 of this dissertation, some guttural sounds in
unrelated, or partially related, languages that are phonologically presumed to be equivalents
are substantially different articulatorily. This can result in theoretical, descriptive, and
analytical problems. Given that phonological representations of speech sounds are, in
general, phonetically grounded, starting out with a solid, thorough, and methodologically
advanced phonetic (articulatory and acoustic) understanding of the similarities and differ-
ences among the guttural sounds in different languages should be the first step before
characterizing them phonologically. I believe that this notion is true for all speech sounds
in general. In this manner, observable and testable facts can play an even larger role in
specifying abstract, higher-level phonological units.
REFERENCES
Al-Ani, S. (1970). Arabic phonology; an acoustical and physiological investigation. The Hague: Mouton.
Ali, L. & Daniloff, R. (1972). A contrastive cinefluorographic investigation of the articulation of emphatic-non emphatic cognate consonants. Studia Linguistica, 26, 81-105.
Ali, L. & Daniloff, R. (1974). The perception of coarticulated emphaticness. Phonetica, 29, 225-231.
Alwan, A. (1989). Perceptual cues for place of articulation for the voiced pharyngeal and uvular consonants. Journal of the Acoustical Society of America, 86, 549-556.
Anderson, S. (1985). Phonology in the twentieth century: theories of rules and theories of representations. Chicago: University of Chicago Press.
Ar-Razi, Muhammad Ibn Abi Bakr. (1976). Mukhtar Al-Sihah. (M. Khatir, Ed.). Cairo: al-Hay'ah al-Misriyah al-'Ammah lil-Kitab. (Original work undated).
Avery, P. & Rice, K. (1989). Segment structure and coronal underspecification. Phonology, 6, 179-200.
Baalbaki, R. (1995). Al-Mawrid: A modern Arabic-English dictionary. Beirut: Dar El-Ilm Lilmalayin.
Bateson, M. (1967). Arabic language handbook. Washington: Center for Applied Linguistics.
Bessell, N. (1992). Towards a phonetic and phonological typology of postvelar articulations, Ph.D. Dissertation, University of British Columbia.
Bessell, N. & Czaykowska-Higgins, E. (1992). Interior Salish evidence for placeless laryngeals. Proceedings of North Eastern Linguistics Society. 22, 35-49.
BIAS, Inc. (1996). Peak LE. Computer software.
Bickley, C. & Stevens, K. (1987). Effects of a vocal-tract constriction on the glottal source: data from voiced consonants. In T. Baer, C. Sasaki and K. Harris (eds.), Laryngeal function in phonation and respiration. Boston: College-Hill Press.
Bladon, R. & Al-Bamerni, A. (1976). Coarticulation resistance in English /l/. Journal of Phonetics, 4, 137-150.
Blevins, J. (2004). Evolutionary phonology: The emergence of sound patterns. Cambridge: Cambridge University Press.
Blumstein, S. & Stevens, K. (1979). Acoustic invariance in speech production: Evidence from measurements of the spectral characteristics of stop consonants. Journal of the Acoustical Society of America, 66, 1001-1017.
Boersma, P. & Weenink, D. (1992). PRAAT. Computer program.
Boff Dkhissi, M-C. (1983). Contribution à l'étude expérimentale des consonnes d'arrière de l'arabe classique (locuteurs marocains). Travaux de l'Institut de Phonétique de Strasbourg, 15, 1-363.
Bolla, K. (1981). A conspectus of Russian speech sounds. Köln: Böhlau.
Brame, M. (1972). On the abstractness of phonology: Maltese? In M. Brame (ed.), Contributions to Generative Phonology, 22-61. Austin: University of Texas Press.
Broselow, E. (1979). Cairene Arabic syllable structure. Linguistic Analysis, 5, 542-582.
Butcher, A. & Ahmad, K. (1987). Aerodynamic characteristics of pharyngeal consonants in Iraqi Arabic. Phonetica, 44, 156-172.
Card, E. (1983). A phonetic and phonological study of Arabic emphasis, Ph.D. Dissertation, Cornell University.
Catford, J. (1977). Fundamental problems in phonetics. Bloomington: Indiana University Press.
Chiba, T. & Kajiyama, M. (1958). The vowel, its nature and structure. Tokyo: Phonetic Society of Japan. (Originally published in 1941).
Choi, J. (1995). An acoustic-phonetic underspecification account of Marshallese vowel allophony. Journal of Phonetics, 23, 323-347.
Choi, J. & Keating, P. (1991). Vowel-to-vowel coarticulation in three Slavic languages. University of California Working Papers in Phonetics, 78, 78-86.
Chomsky, N. & Halle, M. (1968). The sound pattern of English: Studies in language. New York: Harper & Row.
Clark, J. & Yallop, C. (1995). An introduction to phonetics and phonology. Cambridge, Massachusetts: Blackwell.
Clements, G. N. (1985). The geometry of phonological features. Phonology Yearbook, 1985, 225-252.
Clements, G. N. (1987). Phonological feature representation and the description of intrusive stops. CLS 23: Parasession on Autosegmental and Metrical Phonology. Chicago. 29-50.
Clements, G. N. (1990). The role of the sonority cycle in core syllabification. In J. Kingston and M. Beckman (eds.), Papers in Laboratory Phonology I, 283-333. Cambridge: Cambridge University Press.
Davis, S. (1995). Emphasis spread in Arabic and grounded phonology. Linguistic Analysis, 26, 465-498.
Delattre, P. (1971). Pharyngeal features in the consonants of Arabic, German, Spanish, French, and American English. Phonetica, 23, 129-155.
Delattre, P., Liberman, A., & Cooper, F. (1955). Acoustic loci and transitional cues for consonants. Journal of the Acoustical Society of America, 27, 769-773.
Dembowski, J. (1998). Articulator point variability in the production of oral stop consonants, Ph.D. Dissertation, University of Wisconsin-Madison.
Edmondson, J., Esling, J., Harris, J., & Huang, T. (2005). A laryngoscopic study of glottal and epiglottal/pharyngeal stop and continuant articulations in Amis--an Austronesian language of Taiwan. Language and Linguistics, 6, 381-396.
El-Dalee, M. (1984). The feature of retraction in Arabic, Ph.D. Dissertation, Indiana University.
El-Halees, Y. (1985). The role of F1 in the place-of-articulation distinction in Arabic. Journal of Phonetics, 13, 287-298.
Elorrieta, J. (1991). The feature specification of uvulars. Proceedings of the West Coast Conference on Formal Linguistics. 10, 139-149.
Esling, J. (1996). Pharyngeal consonants and the aryepiglottic sphincter. Journal of the International Phonetic Association, 26, 65-88.
Esling, J. (1999). The IPA categories "pharyngeal" and "epiglottal": Laryngoscopic observations of pharyngeal articulations and larynx height. Language and Speech, 42, 349-372.
Evers, V., Reetz, H., & Lahiri, A. (1998). Crosslinguistic acoustic categorization of sibilants independent of phonological status. Journal of Phonetics, 26, 345-370.
Fant, G. (1960). Acoustic theory of speech production. The Hague: Mouton & Co.
Forrest, K., Weismer, G., Milenkovic, P., & Dougall, R. (1988). Statistical analysis of word-initial voiceless obstruents: Preliminary data. Journal of the Acoustical Society of America, 84, 115-123.
Fowler, C. (1980). Coarticulation and theories of extrinsic timing. Journal of Phonetics, 8, 113-133.
Fowler, C. & Saltzman, E. (1993). Coordination and coarticulation in speech production. Language and Speech, 36, 171-195.
Fre Woldu, K. (1981). Facts regarding Arabic emphatic consonant production. Reports from Uppsala University Department of Linguistics, 7.
Fulop, S., Kari, E., & Ladefoged, P. (1998). An acoustic study of the tongue root contrast in Degema vowels. Phonetica, 55, 80-98.
Garnes, S. (1975). An acoustic analysis of double articulations in Ibibio. Ohio State University Working Papers in Linguistics, 20, 44-55.
Ghazeli, S. (1977). Back consonants and backing coarticulation in Arabic, Ph.D. Dissertation, University of Texas at Austin.
Giannini, A. & Pettorino, M. (1982). The emphatic consonants in Arabic: Speech laboratory report IV. Naples: Istituto Universitario Orientale.
Goldsmith, J. (1976). Autosegmental phonology, Ph.D. Dissertation, Massachusetts Institute of Technology.
Grossman, R. (1964). Sensory innervation of the oral mucosae. Journal of the Southern California State Dental Association, 32, 128-133.
GoldWave, Inc. (2002). GoldWave. Computer software.
Halle, M. (1983). On distinctive features and their articulatory implementation. Natural Language & Linguistic Theory, 1, 91-105.
Halle, M. (1995). Feature geometry and feature spreading. Linguistic Inquiry, 26, 1-46.
Halle, M., Hughes, G., & Radley, J. (1957). Acoustic properties of stop consonants. Journal of the Acoustical Society of America, 29, 107-116.
Halle, M. & Stevens, K. (1969). On the feature [Advanced Tongue Root]. Quarterly Progress Report of the MIT Research Laboratory in Electronics, 94, 209-215.
Halle, M., Vaux, B., & Wolfe, A. (2000). On feature spreading and the representation of place of articulation. Linguistic Inquiry, 31, 387-444.
Harris, K. (1958). Cues for the discrimination of American English fricatives in spoken syllables. Language and Speech, 1, 1-7.
Hayward, K. & Hayward, R. (1989). 'Guttural': arguments for a new distinctive feature. Transactions of the Philological Society, 87, 179-193.
Heath, J. (1987). Ablaut and ambiguity: phonology of a Moroccan Arabic dialect. Albany: State University of New York Press.
Heinz, J. & Stevens, K. (1961). On the properties of voiceless fricative consonants. Journal of the Acoustical Society of America, 33, 589-596.
Herzallah, R. (1990). Aspects of Palestinian Arabic phonology: a nonlinear approach, Ph.D. Dissertation, Cornell University.
Hess, S. (1992). Assimilatory effects in a vowel harmony system: An acoustic analysis of advanced tongue root in Akan. Journal of Phonetics, 20, 475-492.
Holes, C. (1994). Arabic. In R. Asher (ed.), The Encyclopedia of Language and Linguistics, 191-194. Oxford: Pergamon Press.
Honda, K., Kusakawa, N., & Kakita, Y. (1992). An EMG analysis of sequential control cycles of articulatory activity during utterances. Journal of Phonetics, 20, 53-63.
Hughes, G. & Halle, M. (1956). Spectral properties of fricative consonants. Journal of the Acoustical Society of America, 28, 303-310.
Ibn al-Jazari, Muhammad. (1986). Al-Tamhid Fi 'Ilm Al-Tajwid. (Gh. Hamad, Ed.). Beirut: Maktabat al-Ma'arif. (Original work undated).
Ibn Jinni, Abu al-Fath 'Uthman. (1954). Sirr sina'at al-i'rab. (M. As-Saqqa, M. Az-Zafzaaf, I. Mustafa, & A. Amin, Eds.). Cairo: Mustafa Lubabi Al-Halabi & Sons.
IPA. (1999). Handbook of the International Phonetic Association: A guide to the use of the International Phonetic Alphabet. Cambridge: Cambridge University Press.
Jakobson, R., Fant, G., & Halle, M. (1952). Preliminaries to speech analysis: The distinctive features and their correlates. Cambridge, MA: Acoustics Laboratory, Massachusetts Institute of Technology.
Jakobson, R. & Halle, M. (1956). Fundamentals of language. The Hague: Mouton.
Jassem, W. (1995). The acoustic parameters of Polish voiceless fricatives: An analysis of variance. Phonetica, 52, 251-258.
Jongman, A., Wayland, R., & Wong, S. (2000). Acoustic properties of English fricatives. Journal of the Acoustical Society of America, 108, 1252-1263.
Joos, M. (1948). Acoustic Phonetics. Language: Journal of the Linguistic Society of America, Language Monographs 24, 1-136.
Kardach, J., Wincowski, R., Metz, D., Schiavetti, N., Whitehead, R., & Hillenbrand, J. (2002). Preservation of place and manner cues during simultaneous communication: a spectral moments perspective. Journal of Communication Disorders, 35, 533-542.
Kaye, A. (1990). Arabic. In Comrie, B. (Ed.), The World's Major Languages, 664-685. New York: Oxford University Press.
Keating, P. (1985). CV phonology, experimental phonetics, and coarticulation. UCLA Working Papers in Phonetics, 62, 1-13.
Kenstowicz, M. (1994). Phonology in generative grammar. Cambridge, Massachusetts: Blackwell.
Kent, R. (2004). The uniqueness of speech among motor systems. Clinical Linguistics & Phonetics, 18, 495-505.
Kent, R. & Read, C. (2002). Acoustic analysis of speech. Albany, NY: Singular.
Kewley-Port, D. (1982). Measurement of formant transitions in naturally produced stop consonant-vowel syllables. Journal of the Acoustical Society of America, 72, 379-389.
Keyser, S. & Stevens, K. (1994). Feature geometry and the vocal tract. Phonology, 11, 207-236.
Kingston, J. & Diehl, R. (1994). Phonetic knowledge. Language, 70, 419-454.
Kuriyagawa, F. (1984). The Features of /k/ and /q/ in Cairo Standard Arabic. Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, 18, 65-73.
Ladefoged, P. & Maddieson, I. (1996). The sounds of the world's languages. Oxford: Blackwell.
LaRiviere, C., Winitz, H., & Herriman, E. (1975). The distribution of perceptual cues in English prevocalic fricatives. Journal of Speech and Hearing Research, 18, 613-622.
Laufer, A. & Baer, T. (1988). The emphatic and pharyngeal sounds in Hebrew and in Arabic. Language and Speech, 31, 181-205.
Laufer, A. & Condax, I. (1979). The epiglottis as an articulator. Journal of the International Phonetic Association, 9, 50-56.
Laufer, A. & Condax, I. (1981). The function of the epiglottis in speech. Language and Speech, 24, 39-62.
Leben, W. (1973). Suprasegmental phonology, Ph.D. Dissertation, Massachusetts Institute of Technology.
Lehn, W. (1963). Emphasis in Cairo Arabic. Language: Journal of the Linguistic Society of America, 34, 29-39.
Liberman, A., Delattre, P., & Cooper, F. (1954). The role of consonant-vowel transitions in the perception of the stop and nasal consonants. Psychological Monographs, 68, 1-13.
Lieberman, P. & Blumstein, S. (1988). Speech physiology, speech perception, and acoustic phonetics. Cambridge: Cambridge University Press.
Lindblom, B. & Sussman, H. (2002). Principal components analysis of tongue shapes in symmetrical VCV utterances. Fonetik 2002 (TMH-QPSR), 44, 1-4. Stockholm.
Lindblom, B., Sussman, H., Modarresi, G., & Burlingame, B. (2002). The trough effect: Implications for speech motor programming. Phonetica, 59, 245-262.
Lowenstamm, J. & Prunet, J-F. (1985). Tigre vowel harmonies. Paper presented at the 16th Annual Conference on African Linguistics. Yale University.
Lowenstamm, J. & Prunet, J-F. (1987). Vertical harmonies in Tigre. Paper presented at the 18th Annual Conference on African Linguistics. UQAM.
Maeda, S. & Honda, K. (1994). From EMG to formant patterns of vowels: The implication of vowel spaces. Phonetica, 51, 17-29.
McCarthy, J. (1979). Formal problems in Semitic phonology and morphology, Ph.D. Dissertation, Massachusetts Institute of Technology.
McCarthy, J. (1986). OCP effects: Gemination and antigemination. Linguistic Inquiry, 17(2), 207-263.
McCarthy, J. (1988). Feature geometry and dependency: A review. Phonetica, 45, 84-108.
McCarthy, J. (1991). Semitic gutturals and distinctive feature theory. In B. Comrie and M. Eid (eds.), Perspectives on Arabic Linguistics, III, 63-91. Amsterdam: Benjamins.
McCarthy, J. (1994). The phonetics and phonology of Semitic pharyngeals. In P. Keating (ed.), Phonological Structure and Phonetic Form: Papers in Laboratory Phonology III. Cambridge: Cambridge University Press.
McCawley, J. (1967). Le rôle d'un système de traits phonologiques dans une théorie du langage. Langages, 8, 112-123.
Mester, R. (1986). Studies in tier structure, Ph.D. Dissertation, University of Massachusetts at Amherst.
Microsoft Corp. (1985). Excel. Computer software.
Microsoft Corp. (1987). PowerPoint. Computer software.
Milenkovic, P. (2000). TF32. Computer program.
Namdaran, N. (forthcoming). Retraction in St'at'imcets (Lillooet Salish): An ultrasonic investigation, Masters Thesis, University of British Columbia.
Nittrouer, S. (1995). Children learn separate aspects of speech production at different rates: Evidence from spectral moments. Journal of the Acoustical Society of America, 97, 520-530.
Norlin, K. (1987). A phonetic study of emphasis and vowels in Egyptian Arabic: Lund University, the Department of Linguistics Working Papers 30.
Obrecht, D. (1961). Effects of the second formant in the perception of velarization in Lebanese Arabic, Ph.D. Dissertation, University of Pennsylvania.
Odden, D. (1986). On the role of the Obligatory Contour Principle in phonological theory. Language, 62, 353-383.
Odden, D. (1988). Anti anti-gemination and the OCP. Linguistic Inquiry, 19, 451-475.
Ohman, S. (1966). Coarticulation in VCV utterances: Spectrographic measurements. Journal of the Acoustical Society of America, 39, 151-168.
Padgett, J. (1995). Stricture in feature geometry. Stanford: CSLI Publications.
Palmer, J. (1993). Anatomy for speech and hearing. Baltimore: Williams & Wilkins.
Penfield, W. & Rasmussen, T. (1950). The cerebral cortex of man. New York: Macmillan.
Perkell, J. (1971). Physiology of speech production: A preliminary study of two suggested revisions of the features specifying vowels. MIT Research Laboratory of Electronics Quarterly Progress Report, 102, 123-139.
Perkins, W. & Kent, R. (1986). Functional anatomy of speech, language and hearing: A primer. San Diego: College-Hill Press.
Pickett, J. (1999). The acoustics of speech communication: Fundamentals, speech perception theory, and technology. Needham Heights, MA: Allyn & Bacon.
Pierrehumbert, J. (1993). Dissimilarity in the Arabic verbal roots. Proceedings of North Eastern Linguistics Society. 23, 367-381.
Purcell, E. (1979). Formant frequency patterns in Russian VCV utterances. Journal of the Acoustical Society of America, 66, 1691-1702.
Recasens, D. (1985). Coarticulatory patterns and degrees of coarticulatory resistance in Catalan CV sequences. Language and Speech, 28, 97-114.
Recasens, D., Farnetani, E., Fontdevila, J., & Pallares, M. (1993). An electropalatographic study of alveolar and palatal consonants in Catalan and Italian. Language and Speech, 36, 213-234.
Recasens, D., Fontdevila, J., & Pallares, M. (1996). Linguopalatal coarticulation and alveolar-palatal correlations for velarized and non-velarized /l/. Journal of Phonetics, 24, 165-185.
Ringel, R. (1970). Oral region two-point discrimination in normal and myopathic subjects. Second Symposium on Oral Sensation and Perception. Springfield, Ill: Charles C. Thomas. 309-321.
Rose, S. (1996). Variable laryngeals and vowel lowering. Phonology, 13, 73-117.
Saussure, F. (1966). Course in general linguistics. (W. Baskin, Trans.). New York: McGraw-Hill. (Original work published 1915).
Seikel, J., King, D., & Drumright, D. (1997). Anatomy and physiology for speech and language. San Diego: Singular Publishing Group.
Semaan, K. (1963). Arabic phonetics; Ibn Sina's Risalah on the points of articulation of the speech-sounds. Arthur Jeffery memorial monographs, no. 2. Lahore: Sh. Muhammad Ashraf.
Shahin, K. (1997). Postvelar harmony: An examination of its bases and cross linguistic variation, Ph.D. Dissertation, University of British Columbia.
Shahin, K. (1996). Accessing pharyngeal place in Palestinian Arabic.
Sibawayh. (1898). Kitab Sibawayh. Baghdad: Al-Muthanna Library. (Original work undated).
Sproat, R. & Fujimura, O. (1993). Allophonic variation in English /l/ and its implications for phonetic implementation. Journal of Phonetics, 21, 291-311.
SPSS, Inc. (1989). SPSS. Computer program.
Stevens, K. (1989). On the quantal nature of speech. Journal of Phonetics, 17, 3-45.
Stevens, K. (1993). Models for the production and acoustics of stop consonants. Speech Communication, 13, 367-375.
Stevens, K. (1999). Articulatory-acoustic-auditory relationships. In W. Hardcastle and J. Laver (eds.), The handbook of phonetic sciences. Oxford: Blackwell.
Stevens, K. (1998). Acoustic Phonetics. Cambridge, Massachusetts: The MIT Press.
Stevens, K. & Blumstein, S. (1978). Invariant cues for place of articulation in stop consonants. Journal of the Acoustical Society of America, 64, 1358-1368.
Stevens, K. & House, A. (1955). Development of a quantitative description of vowel articulation. Journal of the Acoustical Society of America, 27, 484-493.
Stevens, K. & House, A. (1956). Studies of formant transitions using a vocal tract analog. Journal of the Acoustical Society of America, 28, 578-585.
Stevens, K. & House, A. (1961). An acoustical theory of vowel production and some of its implications. Journal of Speech and Hearing Research, 4, 303-320.
Strevens, P. (1960). Spectra of fricative noise in human speech. Language and Speech, 3, 32-49.
Tabain, M. (1998). Non-sibilant fricatives in English: Spectral information above 10kHz. Phonetica: International Journal of Speech Science, 55, 107-130.
Trubetskoi, N. (1969). Principles of phonology. Berkeley: University of California Press.
Vaux, B. (1993). Is ATR a laryngeal feature? Ms. Harvard University.
Wehr, H. & Cowan, J. (1979). A dictionary of modern written Arabic: (Arabic-English). Wiesbaden: Harrassowitz.
Yip, M. (1989). Feature geometry and cooccurrence restrictions. Phonology, 6, 349-374.
Younes, M. (1982). Problems in the segmental phonology of Palestinian Arabic, Ph.D. Dissertation, University of Texas at Austin.
Younes, M. (1993). Emphasis spread in two Arabic dialects. In M. Eid and C. Holes (eds.), Perspectives on Arabic Linguistics, V, 119-145. Amsterdam: Benjamins.
Zawaydeh, B. (1997). An acoustic analysis of uvularization spread in Ammani-Jordanian Arabic. Studies in the Linguistic Sciences, 27, 185-200.
Zawaydeh, B. (1999). The phonetics and phonology of gutturals in Arabic, Ph.D. Dissertation, Indiana University.
Zemlin, W. (1968). Speech and hearing science; anatomy and physiology. Englewood Cliffs, N.J.: Prentice-Hall.
APPENDIX A
Stimuli - Experiments One and Three
Carrier phrase:
"ʔalkalimtu hiya _"
"The word is _"
IPA             Arabic Script   Gloss
kitaab          [illegible]     'book'
θiqah           [illegible]     'trust'
ʔixaaʔ          [illegible]     'fraternity'
ʔiʁaaθah        [illegible]     'aide/rescue'
ziħaaf          [illegible]     'picking up speed'
[illegible]     [illegible]     'mountain passes'
[illegible]     [illegible]     'frame'
ʃidaad          [illegible]     'harsh ones'
[illegible]     [illegible]     'lighting'
kisaf           [illegible]     'pieces'
[illegible]     [illegible]     'quorum'
[illegible]     [illegible]     'bones'
sikak           [illegible]     'streets'
kutib           [illegible]     'was written'
kutub           [illegible]     'books'
kutal           [illegible]     'masses'
θuqif           [illegible]     'became understood'
suquf           [illegible]     'ceilings'
buqaʕ           [illegible]     'spots'
ʔuxið           [illegible]     'was taken'
duxuul          [illegible]     'entrance'
duxaan          [illegible]     'smoke'
luʁuub          [illegible]     'fatigue'
θuʁaaʔ          [illegible]     'bleating'
buħiθ           [illegible]     'was researched'
suħub           [illegible]     'clouds'
tuħaf           [illegible]     'works of art'
buʕiθ           [illegible]     'was sent'
ʃuʕab           [illegible]     'branches'
wuʕuud          [illegible]     'promises (n.)'
[illegible]     [illegible]     'was crossed out'
[illegible]     [illegible]     'frames'
[illegible]     [illegible]     'fresh dates'
budiʔ           [illegible]     'was started'
sudus           [illegible]     'one sixth'
dʒudad          [illegible]     'new ones'
[illegible]     [illegible]     'was digested'
[illegible]     [illegible]     'heated grilling stones'
[illegible]     [illegible]     'curiosity'
kusib           [illegible]     'was won'
ʔusus           [illegible]     'bases'
[illegible]     [illegible]     'was phlebotomized'
[illegible]     [illegible]     'statues'
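kuðib           [illegible]     'was lied'
nuður           [illegible]     '(warning) signs'
[illegible]     [illegible]     'shreds'
[illegible]     [illegible]     'shady'
[illegible]     [illegible]     'systems'
[illegible]     [illegible]     'shaded'
[illegible]     [illegible]     'was spilled'
nukat           [illegible]     'jokes'
sukuut          [illegible]     'silence'
fatak           [illegible]     'exterminate'
batuul          [illegible]     'virgin'
waqiħ           [illegible]     'rude'
θaquf           [illegible]     'understood'
faqad           [illegible]     'lost'
saxun           [illegible]     'became hot'
ʔaxað           [illegible]     'took'
ʔaxiir          [illegible]     'last'
ʃaʁif           [illegible]     'in love (with)'
ʃaʁab           [illegible]     'riot'
ʃaʁuuf          [illegible]     'deeply in love (with)'
ʃaħub           [illegible]     'became pale'
dʒaħad          [illegible]     'denied'
taʕib           [illegible]     'became tired'
[illegible]     [illegible]     'sent'
faʕuul          [illegible]     'does a lot'
[illegible]     [illegible]     'became slower'
[illegible]     [illegible]     'crossed out'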
nadim           [illegible]     'regretted'
ʔadab           [illegible]     'literature'
waduud          [illegible]     'affectionate'
nadˤidʒ         [illegible]     'ripened'
fadˤul          [illegible]     'became virtuous'
nadˤab          [illegible]     'diminished'
ʔasif           [illegible]     'felt sorry'
kasul           [illegible]     'became lazy'
kasab           [illegible]     'won'
fasˤad          [illegible]     'phlebotomized'
[illegible]     [illegible]     'able to see'
kaðib           [illegible]     'lying'
kaðab           [illegible]     'lied'
kaðuub          [illegible]     'frequent liar'
ʃaðˤif          [illegible]     'hard'
naðˤuf          [illegible]     'became clean'
ʃaðˤaf          [illegible]     'hardship'
fakih           [illegible]     'humorous'
sakab           [illegible]     'spilled'
ʔakuul          [illegible]     'one who eats a lot'
mabiituh        [illegible]     'his spending of the night'
mabiitii        [illegible]     'my spending of the night'
fariiquh        [illegible]     'his team'
fariiqii        [illegible]     'my team'
[illegible]     [illegible]     'his cooking'
[illegible]     [illegible]     'my cooking'
jastasiiʁuh     [illegible]     '(he) finds (it) enjoyable'
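tastasiiʁii     [illegible]     '(you- f. sg.) find enjoyable'
juziiħuh        [illegible]     '(he) removes it'
tuziiħii        [illegible]     '(you- f. sg.) remove it'
[illegible]     [illegible]     'he sells it'
[illegible]     [illegible]     '(you- f. sg.) sell it'
[illegible]     [illegible]     '(he) unveils'
[illegible]     [illegible]     '(you- f. sg.) unveil'
takiiduh        [illegible]     'deceive (him)'
takiidii        [illegible]     '(you- f. sg.) deceive'
[illegible]     [illegible]     'his patient'
[illegible]     [illegible]     'my patient'
dʒaliisuh       [illegible]     'his companion'
dʒaliisii       [illegible]     'my companion'
[illegible]     [illegible]     'his shirt'
[illegible]     [illegible]     'my shirt'
[illegible]     [illegible]     'his wine'
[illegible]     [illegible]     'my wine'
[illegible]     [illegible]     'harming'
[illegible]     [illegible]     'his keeper'
[illegible]     [illegible]     'my keeper'
diikuh          [illegible]     'his rooster'
diikii          [illegible]     'my rooster'
buuʁit          [illegible]     'was taken by surprise'
kuusah          [illegible]     'zucchini'
[illegible]     [illegible]     'one inch'
faatik          [illegible]     'exterminator'
baaħiθ          [illegible]     'researcher'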
ʃaatˤiʔ         [illegible]     'beach'
faasˤuuljaa     [illegible]     'kidney beans'
APPENDIX B
Stimuli - Experiment Two
1. VC# context stimuli.
Carrier phrase:
"ʔalkalimtu hiya _"
"The word is _"
IPA             Arabic Script   Gloss
jabiit          [illegible]     'spend the night'
siiq            [illegible]     'was led'
jaʃiix          [illegible]     'grow old'
jaziiʁ          [illegible]     'go astray'
jasiiħ          [illegible]     'tour'
jabiiʕ          [illegible]     'sell'
jatiih          [illegible]     'get lost'
siiʔ            [illegible]     'was annoyed'
jumiitˤ         [illegible]     '(he) unveils'
biid            [illegible]     'deserts'
mariidˤ         [illegible]     'patient'
kiis            [illegible]     'bag'
qamiisˤ         [illegible]     'shirt'
[illegible]     [illegible]     'seek refuge for someone'
[illegible]     [illegible]     'keeper'
abiik           [illegible]     'your father (object of prep.)'
taabuut         [illegible]     'coffin'
suuq            [illegible]     'market'
jaduux          [illegible]     'faint'
jaruuʁ          [illegible]     'evade'
jabuuħ          [illegible]     'disclose'
kuuʕ            [illegible]     'elbow'
fuuh            [illegible]     'his mouth'
suuʔ            [illegible]     'evil'
jasuutˤ         [illegible]     'whip'
suud            [illegible]     'blacks'
furuudˤ         [illegible]     'obligations'
muus            [illegible]     'razor'
buusˤ           [illegible]     'reed'
[illegible]     [illegible]     'seek refuge'
[illegible]     [illegible]     'well kept'
abuuk           [illegible]     'your father (subjective)'
baat            [illegible]     'spent the night'
saaq            [illegible]     'leg'
daax            [illegible]     'fainted'
zaaʁ            [illegible]     'went astray'
saaħ            [illegible]     'toured'
ðaaʕ            [illegible]     'spread'
taah            [illegible]     'got lost'
saaʔ            [illegible]     'became worse'
sijaatˤ         [illegible]     'whips'
saad            [illegible]     'prevailed'
rijaadˤ         [illegible]     'gardens'
daas            [illegible]     'stepped on'
baasˤ           [illegible]     'bus'
[illegible]     [illegible]     'drizzle'
[illegible]     [illegible]     'well keeping'
abaak           [illegible]     'your father (object of verb)'
2. #CV context stimuli.
Carrier phrase:
"_ hiya ʔalkalimah"
"_ is the word."
IPA             Arabic Script   Gloss
tiin            [illegible]     'figs'
tuut            [illegible]     'berries'
taab            [illegible]     'repented'
qiis            [illegible]     'was measured'
quut            [illegible]     'nourishment'
qaas            [illegible]     'measured'
xiirah          [illegible]     'elite'
xuuðah          [illegible]     'helmet'
xaab            [illegible]     'failed'
ʁiid            [illegible]     'delicate women'
ʁuul            [illegible]     'ogre'
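ʁaab            [illegible]     'failed to appear'
ħiik            [illegible]     'was knit'
ħuut            [illegible]     'whale'
ħaak            [illegible]     'knitted'
ʕiid            [illegible]     'holiday'
ʕuud            [illegible]     'twig'
ʕaad            [illegible]     'returned'
hiib            [illegible]     'was feared'
huud            [illegible]     'Hud' (name of a prophet)
haab            [illegible]     'feared'
ʔiiðaaʔ         [illegible]     'harming (n.)'
ʔuutii          [illegible]     'was given'
ʔaat            [illegible]     'coming'
[illegible]     [illegible]     'perfume'
[illegible]     [illegible]     'bricks'
[illegible]     [illegible]     'became pleasant'
diik            [illegible]     'rooster'
duud            [illegible]     'worms'
daaʔ            [illegible]     'disease'
[illegible]     [illegible]     'scarcity'
[illegible]     [illegible]     'narrowness'
[illegible]     [illegible]     'became lit'
siiʔ            [illegible]     'was annoyed'
suuʔ            [illegible]     'evil'
saaʔ            [illegible]     'became evil'
[illegible]     [illegible]     'reputation'
[illegible]     [illegible]     'wool'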
sˤaad           [illegible]     'hunted'
kiis            [illegible]     'bag'
kuub            [illegible]     'cup'
kaad            [illegible]     'nearly (did something)'
APPENDIX C
Additional Stimuli - Experiment Three
The following words are added to the stimuli in Appendix A to form the set of stimuli for
Experiment Three.
Carrier phrase:
"ʔalkalimtu hiya _"
"The word is _"
IPA             Arabic Script   Gloss
dʒihaad         [illegible]     'struggle'
miʔaat          [illegible]     'hundreds'
ʃuhid           [illegible]     'was attended'
ʃuhub           [illegible]     'shooting stars'
suhaad          [illegible]     'insomnia'
suʔil           [illegible]     'was asked'
ʃuʔuun          [illegible]     'matters'
suʔaal          [illegible]     'question'
ʃahid           [illegible]     'attended'
ðahab           [illegible]     'gold'
mahuul          [illegible]     'frightful'
saʔim           [illegible]     'got tired of'
daʔab           [illegible]     'persevered'
daʔuub          [illegible]     'persevering (n.)'
ʃabiihuh        [illegible]     'his look-alike'
ʃabiihii        [illegible]     'my look-alike'
jusiiʔuh        [illegible]     'does (it) badly'
tusiiʔii        [illegible]     '(you- f. sg.) do badly'