7/30/2019 A computational phonology of Russian - Peter A. Chew.pdf
A Computational Phonology of Russian
by Peter A. Chew
ISBN: 1-58112-178-4
DISSERTATION.COM
Parkland, FL USA 2003
A Computational Phonology of Russian
Copyright 2000 Peter A. Chew
All rights reserved.
Dissertation.com
USA 2003
ISBN: 1-58112-178-4
www.Dissertation.com/library/1121784a.htm
A Computational Phonology of Russian
Peter Chew
Jesus College, University of Oxford
D. Phil. dissertation, Michaelmas 1999
Abstract
This dissertation provides a coherent, synchronic, broad-coverage, generative phonology of Russian. I test the grammar empirically in a number of ways to determine its goodness of fit to Russian. In taking this approach, I aim to avoid making untested (or even incoherent) generalizations based on only a handful of examples. In most cases, the tests show that there are exceptions to the theory, but at least we know what the exceptions are, a baseline is set against which future theories can be measured, and in most cases the percentage of exceptional cases is reduced to below 5%.

The principal theoretical outcomes of the work are as follows. First, I show that all of the phonological or morphophonological processes reviewed can be described by a grammar no more powerful than context-free.

Secondly, I exploit probabilistic constraints in the syllable structure grammar to explain why constraints on word-marginal onsets and codas are weaker than on word-internal onsets and codas. I argue that features such as [±initial] and [±final], and extraprosodicity, are unnecessary for this purpose.
Third, I claim that /v/ should be lexically unspecified for the feature [±sonorant], and that the syllable structure grammar should fill in the relevant specification based on its distribution. This allows a neat explanation of the voicing assimilation properties of /v/, driven by phonotactics.
Fourth, I argue that jers in Russian should be regarded as morphological objects, not segments in the phonological inventory. Testing the grammar suggests that while epenthesis cannot be regarded as a major factor in explaining vowel-zero alternations, it might be used to explain a significant minority of cases.

Fifth, I suggest that stress assignment in Russian is essentially context-free, resulting from the intersection of morphological and syllable structure constraints. I show that my account of stress assignment is simpler than, but just as general as, the best of the three existing theories tested.

Finally, this dissertation provides new insight into the nature and structure of the Russian morphological lexicon. An appendix of 1,094 morphemes and 1,509 allomorphs is provided, with accentual and jer-related morphological information systematically included.
_______________________________
A Computational Phonology of Russian
by
Peter Chew
University of Oxford
Jesus College
Michaelmas 1999
_______________________________
Thesis submitted for the degree of Doctor of Philosophy at the University of Oxford
Acknowledgements
I would like to thank my supervisor, John Coleman, for his help. Without his encouragement and support even before I embarked upon this research, I would doubtless now be a well-paid but bored chartered accountant. Auditing linguistic theories has proved to be more rewarding in many ways than auditing financial statements, and I am confident that the choice of leaving my previous job to pursue this research was the right one.

It would not have been possible to complete this D. Phil. without the support of my wife, Lynn. She has always been there to give practical suggestions, as a sounding board for ideas, and simply as a partner in life, sharing encouraging and discouraging times together. God could not have given me a better wife.

My parents have also been a great practical help, babysitting almost weekly, having us round for meals, and generally helping reduce the stress in our lives. Although Jonathan, who was born 15 months before I submitted this thesis, has taken time from my studies, we are very grateful for his arrival. I cannot think of a better way of spending my time, and I cannot imagine a better son.

A number of people have read drafts of my work or listened to me, giving helpful advice which enabled me to sharpen my thoughts and improve the way in which I expressed them. Thanks (in alphabetical order) to Dunstan Brown, Bob Carpenter, Andrew Hippisley, Mary MacRobert, Stephen Parkinson, Burton Rosner, Irina Sekerina, Andrew Slater, and Ian Watson. Andrew Slater has also provided invaluable technical support. I often feel that he puts the rest of us to shame with his good humour, helpfulness, and a constant willingness to go the extra mile.

My friends at the Cherwell Vineyard Christian Fellowship have provided a dependable support network which has kept Lynn and me going through not always easy times. First and foremost, they have encouraged us to keep looking towards the one without whom we can do nothing. However, I know I will also look back with fond memories on the laughs we have had with Richard and Janet Remmington, Evan and Eowyn Robertson, Judy Irving, and others, on Thursday evenings.

Finally, I would like to thank my college, Jesus College, for providing very generous financial support throughout my time at Oxford. And without the financial support of the Arts and Humanities Research Board (formerly the British Academy), I would not have undertaken this research project in the first place.
List of abbreviations and symbols
General symbols
/ /  enclose phonemic representations
[ ]  enclose phonetic representations
+  morpheme boundary
.  syllable boundary
ˈ  word-stress in IPA transcriptions (stress falls on the vowel to the right)
Subscripts within obliques (e.g. /…c+…rn+…in/) denote morphological tokenization, classifying individual morphs; a classificatory subscript outside the obliques (e.g. /…/rn) denotes a single morpheme
σ  syllable
Ø  the empty string
anter  anterior
C  any consonant
CFG  context-free grammar
cons  consonantal
cont  continuant
coron  coronal
DCG  (Prolog) Definite Clause Grammar
del_rel  delayed release
init  initial
later  lateral
OT  Optimality Theory
PSG  phrase structure grammar
sonor  sonorant
SSG  Sonority Sequencing Generalization
V  any vowel
vfv  vocal fold vibration
voc  vocalic
Symbols used in morphological tokenization
r*  root
s  suffix
c*  clitic
i  inflectional ending
p  pronominal
a  adjectival
n  substantival
v  verbal
d  durative process
r*  resultative process
i  iterative process
c*  completed process

*No ambiguity arises with respect to the use of non-unique symbols, because the meaning of each symbol is also dependent on its position; full details are given in section 3.2.1.2.
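The positional disambiguation described in the note above can be pictured with a small sketch (Python; the lookup tables restate the symbol list above, while the function and slot names are invented for illustration and are not part of the dissertation's notation):

```python
# Position-dependent decoding of tokenization symbols: the same letter
# classifies a morph in one slot but a process in another (cf. section
# 3.2.1.2). Slot names "morph" and "process" are illustrative only.

MORPH_SYMBOLS = {"r": "root", "s": "suffix", "c": "clitic",
                 "i": "inflectional ending", "p": "pronominal",
                 "a": "adjectival", "n": "substantival", "v": "verbal"}
PROCESS_SYMBOLS = {"d": "durative process", "r": "resultative process",
                   "i": "iterative process", "c": "completed process"}

def decode(symbol, slot):
    """Resolve a classificatory symbol given its positional slot."""
    table = MORPH_SYMBOLS if slot == "morph" else PROCESS_SYMBOLS
    return table[symbol]

print(decode("r", "morph"))    # root
print(decode("r", "process"))  # resultative process
```

The point is simply that no symbol is ambiguous once its position is known, so a non-unique inventory of single letters suffices.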
Table of contents
Acknowledgements
List of abbreviations and symbols
Table of contents
Table of figures
List of tables
Chapter 1: Introduction
    1.1 Introduction
    1.2 Why computational linguistics?
    1.3 The framework
        1.3.1 Phrase-structure grammar
        1.3.2 Context-free grammar
    1.4 The methodology
    1.5 The dataset used for the tests
    1.6 Summary
Chapter 2: Syllable structure
    2.1 Overview and aims
    2.2 The syllable in phonological theory
        2.2.1 Sonority and syllable structure
        2.2.2 Morpheme structure constraints or syllable structure constraints?
        2.2.3 Syllable structure assignment
            2.2.3.1 Kahn's (1976) syllable structure assignment rules
            2.2.3.2 Itô's (1986) method of syllable structure assignment
            2.2.3.3 Syllable structure assignment in Optimality Theory
            2.2.3.4 Phrase-structure analysis of syllable structure
            2.2.3.5 Syllable structure assignment: conclusions
    2.3 A linear grammar of Russian syllable structure
        2.3.1 The phonological inventory of Russian
            2.3.1.1 Preliminaries: controversial issues
            2.3.1.2 The classification system
        2.3.2 The syllable structure rules
    2.4 A heuristic for deciding between multiple syllabifications
    2.5 Extensions to the grammar
        2.5.1 Further phonological features
        2.5.2 Four phonological processes in Russian
            2.5.2.1 Consonant-vowel interdependencies
            2.5.2.2 Reduction of unstressed vowels
            2.5.2.3 Word-final devoicing
            2.5.2.4 Voicing assimilation
        2.5.3 A test of the extensions to the grammar
    2.6 Summary
Chapter 3: Morphological structure
    3.1 Introduction and aims
        3.1.1 Generative approaches to word-formation
        3.1.2 Morphology and context-free grammar
    3.2 A linear grammar of Russian word-formation
        3.2.1 The morphological inventory of Russian
            3.2.1.1 Preliminaries: controversial issues
            3.2.1.2 The classification system
        3.2.2 The word-formation rules
            3.2.2.1 Words with no internal structure
            3.2.2.2 Nouns
            3.2.2.3 Verbs
            3.2.2.4 Prefixation
    3.3 Vowel-zero alternations in context-free grammar
    3.4 A heuristic for deciding between multiple morphological analyses
        3.4.1 Assigning costs to competing analyses
        3.4.2 Should the cost mechanism be based on hapax legomena?
    3.5 Tests of the word-formation grammar
        3.5.1 Test of coverage of the word-formation grammar
        3.5.2 Test of the grammar's treatment of vowel-zero alternations
    3.6 Conclusion
Chapter 4: Stress assignment: three existing theories
    4.1 Introduction
        4.1.1 Two approaches to stress in Russian: the Slavist and the generative approaches
        4.1.2 Aims of this chapter
    4.2 Three theories of stress assignment
        4.2.1 Halle (1997)
        4.2.2 Melvold (1989)
        4.2.3 Zaliznjak (1985)
    4.3 Derivational theories and underdeterminacy
        4.3.1 Computing underlying accentuations by 'brute force'
        4.3.2 'Backwards phonology' and the Accent Learning Algorithm
            4.3.2.1 A concise encoding of 'solutions'
            4.3.2.2 Formalization of the Accent Learning Algorithm
            4.3.2.3 A small-scale demonstration of the ALA on a non-problem combination
            4.3.2.4 Problem words
            4.3.2.5 Modifications to the ALA to allow for different theories
            4.3.2.6 Conclusions from the ALA
        4.3.3 Unique specification of the morpheme inventory by defaults
    4.4 Tests to ascertain the coverage of the three theories
        4.4.1 Test of Halle's theory on non-derived nouns
        4.4.2 Test of Halle's theory on non-derived and derived nouns
        4.4.3 Test of Melvold's theory on non-derived and derived nouns
        4.4.4 Test of Melvold's theory on nouns, non-reflexive verbs and adjectives
        4.4.5 Test of Zaliznjak's theory on nominative singular derived nouns
        4.4.6 Test of Melvold's theory on nominative singular derived nouns
        4.4.7 Analysis of errors in Melvold's and Zaliznjak's theories
    4.5 Summary
Chapter 5: Stress assignment: a new analysis
    5.1 Introduction
    5.2 Context-free phonology and stress in Russian
        5.2.1 Encoding which morpheme determines stress
        5.2.2 Polysyllabic morphemes
        5.2.3 Post-accentuation
        5.2.4 Jer stress retraction
        5.2.5 Plural stress retraction
        5.2.6 Dominant unaccented morphemes
        5.2.7 Concluding comments about the context-free phonology
    5.3 A test of the entire grammar
    5.4 Conclusions
Appendix 1: Russian syllable structure grammar
Appendix 2: Russian word-formation grammar
Appendix 4: Morphological inventory
Appendix 5: The computational phonology as a Prolog Definite Clause Grammar
References
Table of figures
Figure 1. The Chomsky Hierarchy
Figure 2. Classification of analyses of an imperfect grammar
Figure 3. Tree-structure for /…/
Figure 4. Lattice showing the hierarchy of Russian phoneme classes
Figure 5. The Russian vowel system
Figure 6. The Russian vowel system in unstressed positions
Figure 7. Partial syllabic structure of pretonic /…/ after a […back] consonant
Figure 8. Tree-structure for /…/
Figure 9. Parse tree for /…/
Figure 10. Examples of subtrees from Figure 9
Figure 11. Morphological tokenization of /…/
Figure 12. Parse tree for /…/
Figure 13. Oliverius's (1976) tokenization of ženščina 'woman'
Figure 14. Parse tree for ženščina 'woman'
Figure 15. Three alternative representations of /…/
Figure 16. Representation of the morpheme /…/ 'weasel'
Figure 17. Structure of …
Figure 18. Structure of …
Figure 19. Structure of …
Figure 20. Parse tree for … (with log probabilities)
Figure 21. Rank-frequency graph
Figure 22. Analysis of coverage of morphology grammar
Figure 23. Parse tree for …
Figure 24. Morphological/phonological structure of /…/
Figure 25. The constraint pool
Figure 26. Morphological/phonological structure of /…/
Figure 27. Morphological/phonological structure of /…/
Figure 28. Morphological/phonological structure of /…/
List of tables
Table 1. Types of rules permitted by grammars in the Chomsky Hierarchy
Table 2. Analysis of words in on-line corpus
Table 3. Russian morpheme structure constraints on consonant clusters
Table 4. Reanalysis of morpheme-medial clusters using syllable structure
Table 5. Phonological inventories of different scholars
Table 6. The phonemic inventory of Russian
Table 7. Classification of Russian phonemic inventory
Table 8. Distribution of word-initial onsets by type
Table 9. Distribution of word-final codas by type
Table 10. Further coda rules
Table 11. Exhaustive list of initial clusters not accounted for
Table 12. Exhaustive list of final clusters not accounted for
Table 13. The twelve most frequently applying onset, nucleus and coda rules
Table 14. Feature matrix to show classification of Russian phonemes and allophones with respect to all features
Table 15. Allophonic relationships in consonant-vowel sequences
Table 16. Allophones of /…/ and /…/
Table 17. Results of phoneme-to-allophone transcription test
Table 18. Classification system for substantival inflectional morphs
Table 19. Further categories of morphological tokenization
Table 20. Summary of results of parsing 11,290 words
Table 21. Derivations of six Russian words in accordance with Halle (1997)
Table 22. Derivations of five Russian words in accordance with Melvold (1989)
Table 23. Possible solutions for stol 'table' (nom. sg.)
Table 24. Possible solutions for …
Table 29. Demonstration that Melvold's theory is problematic
Table 30. Demonstration that Zaliznjak's theory is problematic
Table 31. Number of candidate accentuations against …
Table 32. Ranking of underlying morpheme forms
Table 33. Results of testing Halle's theory on non-derived words
Table 34. Results of testing Halle's theory on non-derived and derived nouns
Table 35. Results of testing Melvold's theory on non-derived and derived nouns
Table 36. Results of testing Melvold's theory
Table 37. Results of testing Zaliznjak's theory
Table 38. Results of testing Melvold's theory
Table 39. Analysis of words incorrectly stressed by Melvold's theory
Table 40. Analysis of words incorrectly stressed by Zaliznjak's theory
Table 41. Exceptions common to Zaliznjak's and Melvold's theories
Table 42. Prefixed nouns stressed incorrectly by Zaliznjak
Table 43. Words derived from prefixed stems
Table 44. Further words derived from prefixed stems
Table 45. Results of testing the overall phonology for its ability to assign stress
Table 46. Results of testing Melvold's theory on 4,416 nouns
Chapter 1: Introduction
1.1 Introduction
This dissertation provides a coherent, synchronic, broad-coverage, generative
account of Russian phonology. By broad-coverage, I mean that it will cover a
number of phonological phenomena (stress assignment, syllabification, vowel-zero
alternations, word-final devoicing, voicing assimilation, vowel reduction, and
consonant-vowel interdependencies) within a single constrained grammar. While I
have not attempted to deal exhaustively with all the phonological problems of interest
in Russian (for example, I do not attempt to account for all morphophonological
alternations), the current work covers those areas which have attracted the most
attention in the literature on Russian phonology.
While all these aspects of Russian phonology have been richly documented,
generally they have been dealt with in isolation; the one notable exception to this is
Halle's (1959) Sound Pattern of Russian. The following quotation (op. cit., p. 44)
serves to show that Halle's account of Russian phonology is also intended to be
broad-coverage in the sense just outlined:
When a phonological analysis is presented, the question always arises as to what extent the proposed analysis covers the pertinent data. It is clearly impossible in a description to account for all phonological manifestations in the speech of even a single speaker, since the latter may (and commonly does) use features that are characteristic of different dialects and even foreign languages. (E.g., a speaker of Russian may distinguish between nasalized and nonnasalized vowels in certain [French] phrases which form an integral part of his habitual conversational repertoire.) If such facts were to be included, all hopes for a systematic description would have to be abandoned. It is, therefore, better to regard such instances as deviations to be treated in a separate section and to restrict the main body of the grammar to those manifestations which can be systematically described.
The aim of the current work is thus substantially the same as that of Halle
(1959). However, in the forty years since then there have been a number of advances,
both linguistic and technological, which allow us to take a fresh (and perhaps more
rigorous) look at some of the same phenomena which Halle and others attempted to
describe. In the late 1950s and early 1960s Chomsky and co-workers pioneered work
in developing a formal theory of language (Chomsky 1959, 1963, 1965); this work
established clearly-defined links between linguistics, logic and mathematics, and was
also foundational in computer science in the sense that the principles it established
have also been applied in understanding computer programming languages. These
advances make it possible to formulate a theory of Russian phonology, just as Halle
did, but to test it empirically by implementing the theory as a computer program and
using it to process very large numbers of words. Moreover, since the technological
advances which make it possible to do this owe a great deal to Chomsky's work,
transition from generative grammar to computational grammar can be a comparatively
straightforward one.
One of the defining features of generative grammar is the emphasis on
searching for cross-linguistic patterns. Without denying the value of language-specific
grammar, Chomsky and Halle (1968) (to many the canonical work of generative
phonology) illustrates this thinking:
...we are not, in this work, concerned exclusively or even primarily with the facts of English as such. We are interested in these facts for the light they shed on linguistic theory (on what, in an earlier period, would have been called universal grammar) and for what they suggest about the nature of mental processes in general... We intend no value judgment here; we are not asserting that one should be primarily concerned with universal grammar and take an interest in the particular grammar of English only insofar as it provides insight into universal grammar and psychological theory. We merely want to make it clear that this is our point of departure in the present work; these are the considerations that have determined our choice of topics and the relative importance given to various phenomena. (p. viii)
The emphasis on cross-linguistic generalization, characteristic of Chomsky's
work, has characterized generative linguistics ever since: indeed, there is a
considerable branch of linguistics (Zwicky 1992 is an example) which abstracts
completely away from language-specific data. (This branch deals in what Zwicky
1992: 328 refers to as frameworks as opposed to theories.) While frameworks have
their place (indeed, a theory cannot exist without a framework), the difficulty is
always that frameworks cannot be verified without theories. In this light, Chomsky
and Halle (1968) claimed to establish both a cross-linguistic framework and a theory
about English phonology.
The focus of this description is on ensuring that the phonology of Russian
proposed is both internally consistent and descriptively adequate – that is, that it
makes empirically correct predictions about Russian – rather than on attempting to
develop any particular linguistic framework. Exciting possibilities are open in this line
of research thanks to the existence of computer technology. It is possible to state
grammatical rules in a form which has the rigour required of a computer program, and
once a program is in place, large corpora can be quickly processed. Thus the
phonology of Russian presented here is computational simply because of the
advantages in speed and coverage that this approach presents.
Establishing that a linguistic theory can be implemented as a computer
program and verifying its internal consistency in this way is a valuable exercise in
itself, but non-computational linguists may be sceptical: some may argue that this
kind of approach does not contribute anything to linguistics per se. Whether or not
this criticism is well-founded (and I believe it is not), I hope that this dissertation
will satisfy even the more stringent critics by making a number of key contributions to
linguistic knowledge. These are as follows.
First, I propose that both the distribution of /v/ and its behaviour with respect
to voicing assimilation can be explained if /v/, unlike all other segments in the
phonological inventory of Russian, is lexically unspecified for the feature [±sonorant].
The syllable structure rules determine whether /v/ is [+sonorant] or [−sonorant], and
this in turn determines how /v/ assimilates in voice to adjacent segments.
Second, I suggest that the greater latitude allowed in word-marginal onsets and
codas, which is a feature of Russian and other languages (cf. Rubach and Booij 1990),
can be explained naturally by a probabilistic syllable structure grammar. This
approach allows features such as [±initial] and [±final] (cf. Dirksen 1993) to be
dispensed with.
Third, I show that vowel-zero alternations in Russian cannot fully be
explained by a Lexical-Phonology-style account (such as that proposed by Pesetsky
ms 1979) alone, nor can they be the result of epenthesis alone. I show empirically that
a combination of factors, including (1) the morphophonological principles discovered
by Pesetsky, (2) epenthesis, and (3) etymology, governs vowel-zero alternations.
Fourth, I show that Russian stress can be accounted for with a high rate of
accuracy by existing generative theories such as that of Melvold (1989), but I suggest
a simpler theory which accounts for the same data with as good a rate of accuracy.
The theory which I propose regards stress assignment as resulting from the interaction
of morphological and syllable structure: existing generative theories do not
acknowledge syllable structure as playing any role in Russian stress assignment. An
integral part of my theory is a comprehensive inventory of morphemes together with
the accentual information which is lexically specified for each morpheme. The
inventory which I propose, which is arrived at by computational inference, includes
1,094 morphemes and 1,509 allomorphs, while the longest existing list of this type, as
far as I am aware, is the index of approximately 250 suffixes in Redkin (1971).
The structure of this dissertation is as follows. In this chapter, I set out in
detail the concepts which are foundational to the whole work: the role which
computation plays in my work (1.2), the framework which I use (1.3), and the
methodology which underlies my work (1.4). Then, I discuss in detail aspects of the
syllable structure and morphological structure of Russian in Chapters 2 and 3
respectively, in each case developing a formally explicit grammar module which can
be shown to be equivalent to a finite state grammar. Chapter 4 describes in detail three
theories of stress assignment in Russian. These are tested computationally to ascertain
which is the most promising. Each of Chapters 2-4 begins with a section reviewing
the relevant literature. Finally, in Chapter 5, I describe how the principal features of
the preferred theory from Chapter 4 can be incorporated into a synthesis of the
grammars developed in Chapters 2 and 3. The result is an integrated, internally
consistent, empirically well-grounded grammar, which accounts for a variety of
different aspects of Russian phonology.
1.2 Why computational linguistics?
In this dissertation, computation is used as a tool. Any tool has limitations, of
course: a large building cannot be built with a power drill alone, and, to be sure, there
are problems in linguistics which computation is ill-suited to solve. On the other hand,
anyone who has a power drill will try to find appropriate uses for it. Likewise, I aim
to use computation for the purposes for which it is best suited. This, then, is not a
dissertation about computational linguistics; it is a dissertation that uses computation
as a tool in linguistics.
What, then, are the strengths of computational tools in linguistics? Shieber
(1985: 190-193), noting that the usefulness of computers is often taken for granted by
computational linguists, lists three roles that the computer can play in the evaluation
of linguistic analyses: the roles of straitjacket (forcing rigorous consistency and
explicitness, and clearly delineating the envelope of a theory), touchstone (indicating
the correctness and completeness of an analysis), and mirror (objectively reflecting
everything in its purview). In short, the process of implementing a grammar
computationally forces one to understand in detail the mechanisms by which a
grammar assigns structure. Shieber states, for example, that
we have found that among those who have actually attempted to write a computer-interpretable grammar, the experience has been invaluable in revealing real errors that had not been anticipated by the Gedanken-processing typically used by linguists to evaluate their grammars – errors usually due to unforeseen interactions of various rules or principles. (p. 192)
This has also been my experience in developing the current phonology of
Russian. In particular, areas such as stress assignment involve the interaction of a
number of different grammar modules, and, as Shieber states, decisions in one part of
the grammar, while internally consistent, may not cohere with interacting decisions in
another part (Shieber 1985: 190). Problems of this kind cannot always feasibly be
foreseen without actually implementing and testing a theory on a corpus of data.
Another perhaps self-evident strength of computers is their ability to process
large volumes of data quickly: once a grammar has been implemented, the processing
can take place without intensive effort on the part of the researcher. While in principle
generative theories can be implemented and tested by hand, the volume of data that
typically has to be processed to achieve significant results means that this is an
extremely tedious and time-consuming, if not impracticable, task. Clearly,
computational techniques shift the burden for the researcher from data processing to
the more interesting task of developing theories, identifying exceptions quickly, and
debugging the theory as appropriate.
Because the discipline of computational linguistics is still relatively young, it
is perhaps understandable that many existing theories have neither been implemented
nor tested computationally, but now that the means to validate theories are widely
available, it is less justifiable for new theories still to be proposed in linguistics
without being empirically tested: the widespread practice of testing a few interesting
cases is unreliable and is no substitute for an exhaustive check (Bird 1995: 14). It
seems that at this stage in linguistic research, the efforts of linguists would be better
directed towards implementing and testing existing theories rather than proposing new
alternatives, since otherwise it cannot be demonstrated that the new alternatives
measure up any better to the criteria of coverage, constrainedness and ability to
integrate than the theories which they replace.
It is also worth noting the limitations of computational analysis (which I set as
the limits for this dissertation). Ultimately, computers follow instructions rather than
making judgements, and while they are very good at evaluating grammars for
consistency and descriptive adequacy, they cannot test for explanatory adequacy
unless the programmer supplies the necessary information (that is, a standard against
which to measure the accuracy of structures assigned by a grammar to strings). The
judgement about the nature of the correct structures is a question of psychology, and
therefore I do not claim that the current phrase-structure context-free phonology of
Russian is a psychological model. In this, my approach is exactly the same as that of
Gazdar, Klein, Pullum and Sag (1985):
We make no claims, naturally enough, that our grammatical theory is eo ipso a psychological theory. Our grammar of English is not a theory of how speakers think up things to say and put them into words. Our general linguistic theory is not a theory of how a child abstracts from the surrounding hubbub of linguistic and nonlinguistic noises enough evidence to gain a mental grasp of the structure of a natural language. Nor is it a biological theory of the structure of an as-yet-unidentified mental organ. It is irresponsible to claim otherwise for theories of this general sort...

Thus we feel it is possible, and arguably proper, for a linguist (qua linguist) to ignore matters of psychology. But it is hardly possible for a psycholinguist to ignore language... If linguistics is truly a branch of psychology (or even biology), as is often unilaterally asserted by linguists, it is so far the branch with the greatest pretensions and the fewest reliable results... So far, linguistics has not fulfilled its own side of the interdisciplinary bargain. (p. 5)
1.3 The framework
1.3.1 Phrase-structure grammar
In this dissertation, phonology and morphology, as modules of grammar, have
the function of enumerating or generating (the words of a) language. This view of
grammatical modules is entirely in accordance with traditional generative linguistics
(e.g. Chomsky and Miller 1963: 283-285). More precisely, a phonological grammar
should be able to generate all and only the phonological words of a natural language;
similarly, a word-formation grammar should enumerate all the morphological words
(p-forms, in the terminology of Zwicky 1992: 334) of a natural language.1 The same
1 As noted by Booij and Rubach (1984), there may well not be a one-to-one mapping between morphological words and phonological words – well-known examples from Russian are preposition-noun phrases, all of which have a single stress (e.g. 9&:!8%8!*/%
grammar that enumerates the forms of a language should also be able to assign them a
structural description (that is, parse them). These functions are clearly fulfilled by
phrase-structure grammars (PSGs), since in a PSG each rule can equivalently be
thought of as a partial structure, and each derivation can be represented as a directed
graph.
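To make the point concrete, here is a toy sketch in Python (not the grammar developed in Chapters 2 and 3; the categories and segments are invented for illustration). The rule set is pure data, and each rule is itself a partial tree, so a single generic routine can assign a structural description to a string:

```python
# A toy phrase-structure grammar: each rule pairs a mother category with a
# sequence of daughters, so every rule is itself a partial tree.
RULES = {
    "Syll":    [["Onset", "Rhyme"], ["Rhyme"]],
    "Rhyme":   [["Nucleus", "Coda"], ["Nucleus"]],
    "Onset":   [["t"], ["s"], ["st"]],
    "Nucleus": [["a"], ["o"]],
    "Coda":    [["n"], ["t"]],
}

def splits(s, n):
    """All ways of cutting s into n contiguous non-empty pieces."""
    if n == 1:
        if s:
            yield (s,)
        return
    for i in range(1, len(s)):
        for rest in splits(s[i:], n - 1):
            yield (s[:i],) + rest

def parse(cat, s):
    """Return a tree (category, daughters) deriving s from cat, or None."""
    if cat not in RULES:                 # terminal symbol
        return cat if s == cat else None
    for daughters in RULES[cat]:
        for pieces in splits(s, len(daughters)):
            trees = [parse(d, p) for d, p in zip(daughters, pieces)]
            if all(t is not None for t in trees):
                return (cat, trees)
    return None

print(parse("Syll", "tan"))
# ('Syll', [('Onset', ['t']), ('Rhyme', [('Nucleus', ['a']), ('Coda', ['n'])])])
```

Nothing in RULES is specific to parsing: the same declarative rule set could equally drive generation, which is the property exploited in the next paragraphs.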
The ability of a grammar to parse (that is, provide some structural description
for the word) does not necessarily imply its ability to parse correctly. As Chomsky
and Miller (1963: 297) state, we have no interest, ultimately, in grammars that
generate a natural language correctly but fail to generate the correct set of structural
descriptions. A grammar which is able to assign structural descriptions to all relevant
well-formed utterances in a language is said to meet the condition of descriptive
adequacy, while a grammar which meets the more stringent requirement of assigning
correct structural descriptions to all well-formed utterances is said to meet the
condition of explanatory adequacy. In general, it is considerably harder to prove or
disprove a grammar's explanatory adequacy than its descriptive adequacy, since the
former is a matter not just of linguistic data but of psychology as well (Chomsky
1965: 18-27). Moreover, it is important to realize that a parse should not necessarily
be considered incorrect just because it was unanticipated: such a parse may in fact be
a possible but unlikely parse. These factors all mean that establishing whether a given
grammar assigns correct structural descriptions is not always straightforward, and is
often a matter of judgement.
Conversely, English words of the form non-X (where X stands for an adjective) are a single morphological word, but two phonological words.
Essentially, there are three good reasons for formulating a theory within the
framework of PSG. First, PSGs are the standard means of assigning hierarchical
constituent structure to strings, which is widely and uncontroversially regarded as an
important function of linguistics. The literature on phrase-structure grammar has been
developed over approximately 40 years, and owes much to Chomsky's interest in
establishing a formal foundation for linguistics (e.g. Chomsky 1959, Chomsky and
Miller 1963, Chomsky 1963).
A second strength of the PSG formalism is that it has a straightforward
declarative interpretation. Phrase-structure grammar rules can equally validly be
seen as partial descriptions of surface representations or descriptions of information
structures, in Brown, Corbett, Fraser, Hippisley and Timberlake's (1996)
terminology. Specifically, context-free phrase-structure rules can be represented
graphically as tree structures (Coleman 1998: 99).
Third, there is a transparent relationship between PSGs and Definite Clause
Grammars (DCGs)2. This is perhaps the greatest advantage of using the PSG
formalism, because it means that a PSG can easily be implemented and tested
computationally. DCGs are a particular type of formalism available as part of the
programming language Prolog. For details of the workings of Prolog and DCGs, the
reader is invited to refer to a textbook on Prolog, such as Clocksin and Mellish
(1981). Here, it is sufficient to appreciate that DCGs can fulfil the functions of parsing
and generation, because Prolog is a declarative programming language. Thus, if a
2 DCGs are capable of defining recursively enumerable languages and CFGs are capable of defining only context-free languages (which are a subset of the set of recursively enumerable languages). Thus, to be more precise, the type of DCG used to implement the theory proposed in this dissertation is a restricted type of DCG.
particular grammar is implemented as a DCG, it is possible to test the grammar
computationally to determine whether it describe[s] all and only the possible forms
of a language (Bird, Coleman, Pierrehumbert and Scobbie 1992). Throughout this
dissertation, I describe computational tests of this kind to determine whether the
different aspects of the grammar are accurate representations of the facts of the
language.
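The recognition-and-generation symmetry that makes DCGs attractive can be sketched without Prolog. The following Python analogue is a hedged toy (the grammar, its categories, and the brute-force enumeration strategy are invented for illustration; a real DCG does not enumerate the language to recognize a string):

```python
from itertools import product

# A toy grammar used declaratively in both directions: generate()
# enumerates the words it admits, accepts() recognizes a candidate string.
RULES = {
    "Word": [["C", "V"], ["C", "V", "Word"]],   # one or more CV syllables
    "C":    [["b"], ["d"]],
    "V":    [["a"], ["u"]],
}

def generate(cat, depth):
    """Set of strings derivable from cat within a recursion bound."""
    if cat not in RULES:                  # terminal symbol
        return {cat}
    if depth == 0:
        return set()
    out = set()
    for daughters in RULES[cat]:
        alternatives = [generate(d, depth - 1) for d in daughters]
        for combo in product(*alternatives):
            out.add("".join(combo))
    return out

def accepts(cat, s):
    """Recognition via the same rules: is s among the derivable strings?"""
    return s in generate(cat, depth=2 * max(len(s), 1))

print(sorted(generate("Word", 3)))
```

The single rule set thus fulfils both functions named above – parsing (here, recognition) and generation – which is exactly the reversibility property of CFGs discussed in 1.3.2.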
1.3.2 Context-free grammar
Having established in section 1.3.1 why I use the framework of PSG, I now
move on to explain the significance of my claim that nothing more powerful than a
context-free grammar (CFG) is necessary to describe the facts of Russian phonology.
The claim that CFG is sufficient is in contrast to McCarthy (1982), for example, who
claims that phonology is context-sensitive (p. 201). (Coleman 1998: 81 observes that
his phonology is an unrestricted rewriting system, since it is a context-sensitive
grammar with deletion: see McCarthy's (1) on p. 201.) In other respects, however,
McCarthy's aim is comparable to mine: McCarthy aims to provide a fair degree of
coverage, particularly in Hebrew phonology and Arabic morphology (p. 2), including
stress assignment.
CFGs are one of a number of types of grammar formalism in the Chomsky
Hierarchy (Chomsky 1959), represented in Figure 1. All of these grammar formalisms
are members of the family of PSGs. The place of a particular grammar within the
hierarchy is determined by the type of rules included in the grammar, as shown in
Table 1 (adapted from Coleman 1998: 79).
Figure 1. The Chomsky Hierarchy

[Nested classes, innermost to outermost: linear grammar (type 3) ⊂ context-free grammar (type 2) ⊂ context-sensitive grammar (type 1) ⊂ unrestricted grammar (type 0)]

Table 1. Types of rules permitted by grammars in the Chomsky Hierarchy3

  Type  Grammar            Rule types allowed  Conditions on symbols
  0     Unrestricted       A → B               A ∈ (VT ∪ VN)*, B ∈ (VT ∪ VN)*
  1     Context-sensitive  A → B / C _ D       A ∈ VN, B ∈ (VT ∪ VN)+, C, D ∈ (VT ∪ VN)*
  2     Context-free       A → B               A ∈ VN, B ∈ (VT ∪ VN)*
  3     Right linear       A → aB              A ∈ VN, B ∈ (VN ∪ {ε}), a ∈ VT
  3     Left linear        A → Ba              A ∈ VN, B ∈ (VN ∪ {ε}), a ∈ VT

3 Note to Table 1: Following Chomsky (1959) and Coleman (1998), VT represents the set of terminal symbols, VN the set of non-terminal symbols, X* a sequence of zero or more Xs, X+ a sequence of one or more Xs, and ε the empty string.
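Read as executable checks, the conditions of Table 1 can be sketched as a small classifier. This is a toy sketch: the alphabets are invented, and the context-sensitive row is approximated by the equivalent non-contracting condition rather than the A → B / C _ D schema:

```python
# A sketch of Table 1 as executable checks over toy alphabets.
VT = {"a", "b"}          # terminal symbols
VN = {"S", "A", "B"}     # non-terminal symbols

def rule_type(lhs, rhs):
    """Lowest Chomsky type (3, 2, 1 or 0) whose conditions the rule
    lhs -> rhs meets; lhs and rhs are tuples of symbols."""
    if len(lhs) == 1 and lhs[0] in VN:
        if len(rhs) == 1 and rhs[0] in VT:
            return 3                     # A -> a
        if len(rhs) == 2 and rhs[0] in VT and rhs[1] in VN:
            return 3                     # right-linear: A -> aB
        if len(rhs) == 2 and rhs[0] in VN and rhs[1] in VT:
            return 3                     # left-linear: A -> Ba
        return 2                         # context-free: A -> (VT ∪ VN)*
    if 1 <= len(rhs) and len(lhs) <= len(rhs):
        return 1                         # non-contracting, as in type 1
    return 0                             # unrestricted (e.g. deletion)

print(rule_type(("A",), ("a", "B")))       # 3
print(rule_type(("A",), ("a", "B", "b")))  # 2
print(rule_type(("A", "B"), ("a",)))       # 0
```

The last example is the kind of deletion rule which, as noted above of McCarthy's system, pushes a grammar to type 0.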
Because there has been a considerable amount of work carried out in phrase-structure
grammar, the properties of different types of PSG in the Chomsky Hierarchy are by
now well understood. These properties are important to consider when formulating a
theory, for reasons which will now be made clear.
On a very general level, the more restricted the grammar formalism, the better.
This follows, essentially, from the principle of Occams razor: as Coleman (1998: 80)
points out, the goal in developing a formal theory of natural-language syntax or
phonology, is to use a type of grammar which is as powerful as necessary, but as
restrictive as possible. It should be acknowledged, however, that context-free
grammars can in practice have a cost compared to more powerful types of grammars,
in that more powerful grammars may describe the same phenomena more simply
(with fewer features or more general rules, for example), and may even be able to
parse and generate more efficiently in some cases (Weinberg 1988).
However, there are other, perhaps more psychological, arguments in support
of choosing a grammar formalism no more powerful than context-free. Bresnan and
Kaplan (1982) set out a number of constraints that they suggest linguistic theory
should impose on the class of possible grammars, and CFGs adhere to all but one of
these constraints. The one constraint which CFG does not adhere to is the
universality constraint, which assumes that the procedure for grammatical
interpretation, mG, is the same for all natural language grammars G (Bresnan and
Kaplan 1982: xlvii). It is significant that Bresnan and Kaplan's grounds for stating
that CFG does not adhere to this constraint come from syntax, not phonology:
Bresnan, Kaplan, Peters, and Zaenen 1982 have shown that there is no context-free phrase-structure grammar that can correctly characterize the parse trees of Dutch. The problem lies in Dutch cross-serial constructions, in which the verbs are discontinuous from the verb phrases that contain their arguments... The results of
Bresnan, Kaplan, Peters, and Zaenen 1982 show that context-free grammars cannot provide a universal means of representing these phenomena. (p. xlix)
Of the other constraints, one is the creativity constraint. One of the claimed
contributions of generative grammar to linguistics was the observation that if a
grammar is to be an equally valid model both of linguistic perception and production,
it should be able not only to assign structure to strings, but also to generate strings
(hence the term generative grammar). This observation is, for example, one of the
foundational tenets of Chomsky (1957). As noted by Matthews (1974: 219),
generative linguistics was partly a reaction to structuralist linguistics, which (it was
claimed) emphasized assignment of structure at the expense of generation. Despite the
emphasis of generative linguists upon the generative, it is notable that context-sensitive grammars and those more powerful are not necessarily reversible (Bear
1990). However, CFGs do always have the property of reversibility: that is, they can
be used either for generation or recognition.
Another constraint which CFGs satisfy is Bresnan and Kaplan's reliability
constraint: that is, they can always accept or reject strings in a finite amount of time.
One of the properties of context-free (and more restricted) languages is that of
decidability (alternatively known as computability, Turing-decidability or
recursiveness). A language L is decidable if there is an algorithm for determining
membership in L; in other words, L is decidable if there is a grammar which can
decide whether a string is well- or ill-formed (a member of L or not) in a finite
amount of time. Languages of type m, where m ≤ 1, are not necessarily decidable, but
those of type n, where n > 1, are always decidable. Bresnan and Kaplan argue that
natural languages must be decidable, since:
It is plausible to suppose that the ideal speaker can decide grammaticality by evaluating whether a candidate string is assigned (well-formed) grammatical relations or not. The syntactic mapping can thus be thought of as reliably computing whether or not any string is a well-formed sentence of a natural language. This motivates the reliability constraint – that the syntactic mapping must provide an effectively computable characteristic function for each natural language. (p. xl)
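Decidability for CFGs is constructive: for a grammar in Chomsky normal form, the CYK algorithm decides membership in O(n³) time, so every string is accepted or rejected in a finite, bounded number of steps. The grammar below is a toy example (its language is the strings aⁿb, n ≥ 1), not a fragment of the Russian phonology:

```python
# CYK recognition for a toy grammar in Chomsky normal form.
BINARY = {("S", ("A", "B")), ("B", ("A", "B"))}   # rules X -> Y Z
LEXICAL = {("A", "a"), ("B", "b")}                # rules X -> terminal

def cyk(s, start="S"):
    """Decide, in finitely many steps, whether s is in the language."""
    n = len(s)
    if n == 0:
        return False
    # table[i][j] = categories deriving the substring of length j+1 at i
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(s):
        table[i][0] = {x for (x, t) in LEXICAL if t == ch}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for cut in range(1, span):
                left = table[i][cut - 1]
                right = table[i + cut][span - cut - 1]
                for (x, (y, z)) in BINARY:
                    if y in left and z in right:
                        table[i][span - 1].add(x)
    return start in table[0][n - 1]

print([w for w in ("ab", "aab", "ba", "bb") if cyk(w)])   # ['ab', 'aab']
```

The triple loop over spans, start positions, and cut points is what bounds the running time, and hence guarantees the reliability constraint for any CFG.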
The principal objection which has been raised to this assumption, and one
which is noted by Bresnan and Kaplan, is that native speakers often do not do well at
parsing garden path constructions such as The canoe floated down the river sank and
The editor the authors the newspaper hired liked laughed. However, they suggest,
plausibly, that these constructions do not disprove their hypothesis. After all, they
argue, speaker-hearers can disambiguate these sentences and recover from the garden
paths, given more (but not infinite) time, and possibly a pencil and paper.
A third reason for choosing the formalism of CFG is that the ordering of the
rules of CFGs will not affect the way in which they function or their end result
(although the ordering of application of rules may have an effect on the outcome). All
forms and constraints in CFGs are partial descriptions of surface representations, there are no rules which do not ultimately constrain surface forms, all constraints must be compatible and apply equally, and any ordering of constraints will describe the same surface form
(Scobbie, Coleman and Bird 1996). The motivation for this Order-free Composition
Constraint, as Bresnan and Kaplan (1982: xlv) call it, is the fact that complete
representations of local grammatical relations are effortlessly, fluently, and reliably
constructed for arbitrary segments of sentences (Bresnan and Kaplan 1982: xlv).
Again, this does not hold for all types of grammar.
There are thus a number of reasons why it is desirable to restrict a grammar so
that it is no more powerful than context-free. To summarize, these are as follows:
- CFGs are a relatively restricted class of grammar, and we would like to choose the most restricted theory which will account for the facts;
- CFGs have a generative as well as a recognitive interpretation;
- CFGs are Turing-decidable;
- the rules of CFGs need not be ordered in any particular way;
- although CFGs have been shown to be unable to cope with all aspects of syntax, there is no evidence to suggest that they are insufficient as far as phonology is concerned.
1.4 The methodology
Generative linguists often claim that linguistics is a science. This claim is
made for phonology, for example, in Halle (1959: 24). What is meant by this?
Sommerstein (1977: 9) answers this question with respect to phonology as follows:
In science we frame and test hypotheses. It does not matter in the least how these hypotheses are arrived at in the first place; it is the exception rather than the rule for an interesting hypothesis to be reached by a mechanical procedure, such as phonemic analysis essentially is. Rather, what makes a hypothesis scientific or unscientific is whether it can be stated what kind of empirical evidence will tend to disconfirm it, and what kind will definitely refute it. And there is no reason why this general scientific principle should not be valid for phonological analysis.
Thus any grammar we propose has the status of a scientific theory that
attempts to account for observed linguistic data. On a philosophical level, the data exist independent of any grammar; in other words, the existence of sentences, words,
etc., in a language does not depend on our ability to formulate grammar rules to
account for them. The only way of determining how well a grammar really does fit
the data is to test it empirically. One way in which scientific methodology can work is
incrementally: we look at the cases where a theory does not fit the data and modify
the theory accordingly. One would hope that the coverage of each successive theory
advanced using this kind of methodology would eventually approach 100%.
I shall now elucidate what is meant here by the coverage of a linguistic
theory. As we saw in 1.3.1, a given grammar may be descriptively but not
explanatorily adequate, but the converse is not possible. It may also be neither
descriptively nor explanatorily adequate, which means that it fails altogether to assign
a structural description to some utterances. For an imperfect grammar of this type, the
set of correctly parsed utterances will be a subset of the set of parsed utterances,
which in turn will be a subset of the set of all utterances, as Figure 2 illustrates.
Figure 2. Classification of analyses of an imperfect grammar
There are three measures that we shall be interested in. The first of these is coverage,
the number of utterances in Q as a percentage of the number of words in P. The
second is correctness or structural coherence, the number of utterances in R as a
percentage of the number in P. The third is the number of utterances in R as a
percentage of the number in Q. Arguably, the second of these is the best overall
P: All words
Q: Words assigned some structural description
R: Words assigned the correct structural description
measure, but as we do not always have access to data which tell us what the correct
structures are, in some cases we have to use the first instead. The third measure will
be most relevant in Chapter 5, where we need to separate the issues of morphological
structure and stress assignment in order to be able to do a like-for-like comparison
between the phrase-structure phonology proposed in this dissertation and Melvold's
theory.
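In symbols, the three measures reduce to ratios over the counts of the nested sets in Figure 2 (a trivial sketch; the counts below are invented placeholders, not results from later chapters):

```python
# The three evaluation measures over the nested sets P ⊇ Q ⊇ R of
# Figure 2, given raw counts of each set.
def measures(p, q, r):
    """p = all words; q = words parsed; r = words parsed correctly."""
    assert p >= q >= r >= 0 and p > 0
    return {
        "coverage":          q / p,                 # |Q| / |P|
        "correctness":       r / p,                 # |R| / |P|
        "correct_of_parsed": r / q if q else 0.0,   # |R| / |Q|
    }

print(measures(10000, 9600, 9000))
# {'coverage': 0.96, 'correctness': 0.9, 'correct_of_parsed': 0.9375}
```

Coverage can be computed for any test run, whereas correctness requires an independent standard for the correct structures, which is why the first measure substitutes for the second where no such standard is available.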
The methodology that underlies the current work is also incremental. In
subsequent chapters I advance theories about the syllable structure and morphological
structure of Russian words which are arrived at by trial and error: see, for example,
(91) in 3.2. The process of actually checking the descriptive adequacy of a grammar is
straightforward and well-suited to computational processing, since the latter is fast
and reliable.
1.5 The dataset used for the tests
In order to test a grammar computationally, it is necessary to have some kind
of lexical database which one can use as the dataset for the tests. As a minimum, the
database used in the research described here has to give the following information for
every word therein:
- A phonological transcription
- The position of the word-stress
- The position of all morpheme boundaries within the word
- The part of speech
Additional information which would have been desirable for each word in the
corpus, but was unobtainable on a systematic basis, was as follows:
- A phonetic transcription
- The position of all syllable boundaries within the word
Although there are many existing electronic corpora for different languages
(including Russian), the requirements of the research described in this dissertation
were such that no existing electronic corpus was adequate for the purpose. Thus part
of the preliminary work necessary was to compile a purpose-made lexical database. In
this section, I discuss how I did this.
Oliverius (1976) contains a list of 2,493 morphologically tokenized words.
However, these words are all headwords. There are two major reasons why it is
desirable to extend Oliverius's list to include inflected forms. First, if the dataset is
restricted to the words in Oliverius (1976), a large portion of the vocabulary of
Russian (all the inflected forms) is missed. This is unacceptable because the current
dissertation explicitly deals with the application of phonological theories of Russian
to the output of both derivation and inflection. Secondly, the larger the dataset used as
the basis for testing theories, the greater the level of significance the results will have.
One way of computationally extending the list to include inflected forms
would be to compute the inflected forms (with stress) from the head-words and
information about their stress patterns. This information can all be found in Zaliznjak
(1977), which is available in an electronic version. A program could be written to go
through the list of words in Oliverius (1976), matching each to the relevant entry in
Zaliznjak (1977), and generating the appropriate inflected forms. Although it could be
automated, even this approach would be a large undertaking, primarily because of the
thoroughness of Zaliznjak's description: the key in Zaliznjak which explains the
meanings of the tags to each entry takes up a significant amount of space in the
dictionary (132 pages). This information is not included in the electronic version, and
it would all have somehow to be input manually if the inflected forms of all entries in
the dictionary were to be generated computationally.
Fortunately, however, this was unnecessary. One of the products of the
research carried out by Brown and his colleagues at the University of Surrey (Brown,
Corbett, Fraser, Hippisley and Timberlake 1996) is a theorem dump listing the
inflected forms of 1,536 nouns. This file includes comprehensive information about
word-stress, but the words are only partly morphologically tokenized (since the
stem-inflection junction is given, but stem-internal morpheme junctions are not).
In order to ensure that all possible forms from the University of Surrey
theorem dump were fully morphologically tokenized, each headword from the
theorem dump was matched to headwords from Oliverius (1976) and the
morphological tokenization of inflected forms was extrapolated from the
morphological tokenization of the headword, by the procedure outlined in (1):
(1) (a) For each headword (e.g. #8!&%!$;0%2! fool) in Oliverius (1976), find
whether it is a noun by searching through the on-line version of
Zaliznjak (1977), which provides part-of-speech information.
(b) For each (headword) noun identified by (a), search for all related
inflected forms in the theorem dump. For #8!&%!$;0%2! fool, these
would be as follows:
#8!&%) (nom./acc. pl.)
#8!&%& (gen. sg.)      #8!&%,( (gen. pl.)
#8!&%8 (dat. sg.)      #8!&%&+ (dat. pl.)
#8!&%,+ (instr. sg.)   #8!&%&+) (instr. pl.)
#8!&%" (loc. sg.)      #8!&%&: (loc. pl.)
(c) Pair the headword with its morphological tokenization, which is known
from the information in Oliverius (1976) (for example, !$;0%2! would
be paired with the tokenization !$;0ra+%2sn+in1!4), and deduce the
noun-stem by removing the inflectional ending (in this case, zero). The
noun-stem of !$;0%2! would thus be !$;0ra+%2sn!.
(d) Morphologically parse the inflected forms using the parsing
information about the stem from (c), and parsing whatever is to the
right of the stem as the inflectional ending. In this example, the
inflected forms would be parsed !$;0ra+%2sn+%in!, !$;0ra+%2sn+;in!,
!$;0ra+%2sn+6+in!, etc. More detailed information on how inflectional
endings are categorized and distinguished is given in section 3.2.1.2.
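Steps (c) and (d) of this procedure can be sketched roughly as follows. This is my own illustration under a deliberately simplified '+'-separated tokenization and a transliterated, made-up example word, not the dissertation's actual annotated notation (for which see section 3.2.1.2).

```python
# Illustrative sketch of steps (c)-(d) of procedure (1), assuming a
# simplified '+'-separated morphological tokenization.

def deduce_stem(headword, tokenization, ending=""):
    """Step (c): remove the (possibly zero) inflectional ending from the
    headword, then keep the whole morphemes of the tokenization that
    make up the remaining surface string (the noun-stem)."""
    surface = headword[:len(headword) - len(ending)] if ending else headword
    morphemes = tokenization.split("+")
    while morphemes and "".join(morphemes) != surface:
        morphemes.pop()  # drop trailing (inflectional) morphemes
    return "+".join(morphemes)

def parse_inflected(stem_tok, form):
    """Step (d): parse whatever follows the stem as the inflectional ending."""
    surface = stem_tok.replace("+", "")
    assert form.startswith(surface), "form does not contain the stem"
    ending = form[len(surface):]
    return stem_tok + ("+" + ending if ending else "")
```

For a headword with a zero ending, the stem tokenization is simply the headword's own tokenization, and each inflected form is then parsed as the stem plus whatever follows it.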
The procedure in (1) was automated, except in the case of nouns which exhibit
a vowel-zero alternation within the stem (such as ,%',!62/46! window [nom. sg.],
,%,'!/6264! windows [gen. pl.]). The morphological tokenization for these forms
was input manually.
As it turned out, 967 of the 2,493 words in Oliverius (1976) were nouns; 835
of these were included in the theorem dump. Some of these nouns are identified by
the theorem dump as having incomplete paradigms, so the number of inflected forms
including head-words identified by step (b) of (1) was 9,633 (slightly less than 12 ×
835 = 10,020).
4 The notation is explained fully in section 3.2.1.2.
The morphologically parsed inflected forms were combined with the rest of
the morphologically parsed head-words in Oliverius (1976), giving a sample of fully
morphologically parsed words as in Table 2.
Table 2. Analysis of words in on-line corpus

Category    Head-words or      In Oliverius   In theorem   Number of
            inflected forms    (1976)         dump         words
Non-nouns   Head-words         yes            -            1,525
Nouns       Head-words         yes            -            132
Nouns       Head-words         yes            yes          835
Nouns       Inflected forms    -              yes          8,798
Total                                                      11,290
Regrettably, the on-line corpus of 11,290 word-forms does not include any
inflected forms for non-nouns, which means that the results presented in this
dissertation will have greatest weight in their applicability to nouns. But it would not
be fair to say that this dissertation is limited in its scope to nouns, because, as can be
seen from Table 2, the number of non-nouns is great enough that statistically
significant results can still be achieved. When more comprehensive electronic corpora
of Russian become available, it will certainly be interesting to see whether re-running
some of my tests on these corpora gives results in line with those I report here;
presumably, the null hypothesis would be that this will be the case.
1.6 Summary
In this chapter, I have established the approach which I employ in developing
a computational phonology of Russian, and dealt with various issues relating to my
perspective. To summarize, the aim in subsequent chapters is to formulate a broad-
coverage phonology, which is generative, context-free, coherent, and makes
predictions that can be shown empirically to be correct, or at least a good first
approximation at correctness. To the extent that this aim succeeds, this work will fill
an important gap in the literature to date, as no other work of which I am aware meets
all these criteria simultaneously.
Chapter 2: Syllable structure
2.1 Overview and aims
This chapter presents a sonority-based syllable structure grammar of Russian.
As well as aiming to advance a specific proposal about Russian, I also aim in this
chapter to contribute to the general debate on syllabification in two ways.
First, because the grammar is implemented as a Prolog DCG and tested for its
coverage of a corpus of Russian words, I am able to identify a list of exceptions to the
Sonority Sequencing Generalization (SSG), which is widely accepted in one form or
another as the standard means of accounting for phonotactic constraints. The list of
exceptions is comprehensive with respect to the dataset tested, so the test allows us to
quantify precisely how problematic Russian is for the SSG.
Secondly, we shall see further evidence that it is worthwhile to include a
formal definition of the term 'syllable' in a phonology, as Fudge (1969) suggests: it is
not enough to refer to the syllable without explicitly defining it, as in Chomsky and
Halle (1968). The syllabification grammar outlined here is put to work in a variety of
areas of Russian phonology: it takes on a role as a structure in which to apply
phonotactic constraints, a role familiar from Kahn (1976); it is also the structure for
the implementation of other phonological constraints, such as assimilation, word-final
devoicing, consonant-vowel interdependencies and vowel reduction; and, as will
become apparent in Chapter 5, it takes on a novel role in stress assignment (novel,
because no other treatment of Russian stress hinges on syllable structure in the way
which I suggest).
To my knowledge, there are no comprehensive treatments of Russian syllable
structure comparable to the one proposed in this chapter. Bondarko (1969) is a
proposal, based on experimental measurements of relative durations of consonants
and vowels in the speech chain, that all consonants (and consonant clusters) in
Russian (except for cluster-initial !7!) syllabify together with the following vowel,
meaning that almost all Russian syllables are open. If this is true, this would amount
to a comprehensive proposal on Russian syllable structure, but the problem with
Bondarko's proposal is that it says nothing about the kinds of clusters that cannot
occur syllable-initially. In other words, the evidence that Bondarko examines excludes
evidence about the phonotactic constraints of Russian: for example, Bondarko's
theory does not explain why no Russian words begin with !42!. This kind of
consideration is the starting-point of this dissertation; after all, a generative grammar
must be able not only to assign syllable structure, but also to generate legal structures
and rule out illegal ones. Thus the grammar I propose, contrary to Bondarko (1969),
suggests that a number of different types of closed syllable can occur in Russian.
The remainder of this chapter is organized as follows. Section 2.2 reviews the
literature on syllable theory. Sections 2.3-2.4 describe a phrase-structure sonority-
based theory about Russian syllable structure. This theory is a linear (i.e. Type 3)
grammar, with all the advantages this brings (see section 1.3.2). However, the nature
of the constraints needed to account for Russian syllable structure is far from obvious.
The primary aim of the discussion in sections 2.3-2.4 is to establish what these
constraints are, rather than debating the issue of how syllable structure is assigned. I
then move on in section 2.5 to select four key aspects of Russian phonology which
have attracted attention in the literature: the constraints on the consonant clusters
permissible in Russian, assimilation in voicing and palatalization, word-final
devoicing and reduction of unstressed vowels. For each of these problem areas, I set
out what appear to be the facts as generally accepted: the aim of this section is to
show that these facts need not be seen as divorced from syllabification, but an account
of them can be integrated into the existing PSG of Russian syllable structure. Indeed,
in some cases, there is a clear advantage in this kind of integration. For example, the
properties of !"! with respect to voicing assimilation are most simply explained by
taking into account the features which the syllable structure grammar assigns to !"!.
The result is that a single grammar fulfils a variety of functions, assigning syllable
structure, mapping phonemic representations to phonetic representations, and, as we
shall see in Chapter 5, acting as an indispensable component in a theory about stress
assignment in Russian.
2.2 The syllable in phonological theory
The syllable is by no means a recent construct. It was discussed as a unit of
linguistic organization in, for example, Whitney (1865), Sievers (1881), Jespersen
(1904), de Saussure (1916), Grammont (1933), Bloomfield (1933) and Hockett
(1955). Bloomfield, for example, states that 'the ups and downs of syllabication play
an important part in the phonetic structure of all languages' (p. 121; Bloomfield's
emphasis). It was in the 1950s and 1960s that the status of the syllable was both
implicitly and explicitly questioned in generative phonology: implicitly, by its notable
absence in Halle (1959)5 and Chomsky and Halle (1968), and explicitly, in Kohler
(1966: 346-348). As Fudge (1969: 261-262) points out:
Chomsky and Halle (1968) continually invoke syllables, monosyllables, disyllables,
etc. in their less formal discussions (in the text frequently, but sometimes also within
the systems of rules proposed), and even postulate a feature Syllabic which would
characterize all segments constituting a syllable peak (354). Unfortunately, none of
these terms are made explicit in the text or in the rules … The term 'syllable' does
not even figure in the index of Chomsky and Halle (1968).
In fact, we may state that it is not satisfactory to deal with the structure of one
element in terms of statements designed to deal with the structure of an essentially
different and only indirectly related element. If we want to state syllable-structure,
we must explicitly introduce the element syllable into our linguistic description, and
state its relations to the other elements of the phonological hierarchy; it is precisely
this which Chomsky and Halle (1968) fail to do.
From that time, partly as a reaction to Chomsky and Halle's work,
phonological theory has swung back towards endorsing the syllable. Indeed, even
before Halle (1959), Haugen (1956: 215-216) writes of the syllable that one would be
'tempted to deny its existence, or at least its linguistic status, as some have done, were
it not for its wide persistence as a feature of most linguistic descriptions … those who
attempt to avoid the syllable in their distributional statements are generally left with
unmanageable or awkward masses of material'. This shortcoming of Chomsky and
Halle's theory is pointed out not only by Fudge (1969), who argues that the element
'syllable' should be made explicit, but also by Hooper (1972) and Vennemann (1972);
the latter uses evidence from languages other than English to advocate the
incorporation of syllable boundaries and syllables in phonological descriptions (p. 2).
Perhaps the best-known work pointing out the inadequacies of Chomsky and Halle
(1968), though, is Kahn (1976): Kahn states that in describing productive
phonological processes he was hampered by the absence of a generative theory of
syllabification (p. 17). Kahn observed, in particular, that the phonotactic constraints
of English could be accounted for indirectly but simply by considering syllable
structure (pp. 40-41, 57-58). Clements and Keyser (1983), endorsing Kahn's
hierarchical analysis of the syllable, argued however that syllabicity was not a
property of segments per se as Kahn suggested (Kahn 1976: 39), but rather involves
the relationship between a segment and its neighbors on either side (Clements and
Keyser 1983: 5): to account for this, they proposed analyzing syllables in terms of
three tiers, the syllable tier and segmental tier (as in Kahn 1976) and an additional CV
tier. Selkirk (1984) follows Clements and Keyser in rejecting [±syllabic] as a feature
of segments.

5 For further discussion of the absence of the syllable in Halle (1959), see section 2.2.2.
Despite the criticisms of certain aspects of Kahn's approach, it has generally
been acknowledged since Kahn (1976) that the syllable is an indispensable unit of
linguistic organization. For example, a contemporary description of Slavic prosody,
Bethin (1998), makes the following statement:
We find that many prosodic features are restricted to or expressed on syllables, that
certain restrictions on permissible consonant and vowel sequences are best described
as holding within a syllable, that there are phonological and morphological processes
which seem to be conditioned by the syllable, and that many of these processes count
syllables but do not, as a rule, count phonemes or segments. (p. 192)
It seems, therefore, that the syllable is here to stay in linguistic theory, and in
particular that an account of syllable structure is an essential part of a generative
phonology of Russian. One aim of this chapter, therefore, is to put forward a specific
grammar of Russian syllable structure as part of the overall phonology proposed in
this dissertation. This grammar is explicit about what Russian syllables are; it does
state its relations to the other elements of the phonological grammar, as Fudge puts
it, and because the theory is implemented computationally and tested for its coverage,
a standard is set against which future proposals can be measured.
2.2.1 Sonority and syllable structure
The notion of the syllable is inextricably linked to that of sonority, which has
for more than a century been believed by phonologists to be an important factor in the
structure of syllables (Whitney 1865, Sievers 1881: 159-160, Jespersen 1904: 186-
187, de Saussure 1916: 71ff. and Grammont 1933: 98-104). Essentially, the idea is
that segments can be categorized with respect to sonority: those that are more
sonorous tend to stand closer to the centre of the syllable, and those that are less
sonorous closer to the margin. Clements (1990: 284) notes that this principle
expresses a strong cross-linguistic tendency, and represents one of the highest-order
explanatory principles of modern phonological theory. However, there are a number
of questions about sonority which to date have not been answered. Essentially, these
have to do with (a) how sonority is defined, and (b) at what linguistic level sonority
holds (Clements 1990: 287, Bethin 1998: 19-21).
As far as the first of these is concerned, there have been various attempts at
defining sonority. Bloomfield (1933: 120-121) equated sonority with the loudness of
segments (the extent to which some sounds strike the ear more forcibly than others);
another proposal is that sonority can be derived from basic binary categories, identical
to the major class features of standard phonological theory (Selkirk 1984, Clements
1990); and some have suggested that sonority does not have any absolute or
consistent phonetic properties (e.g. Hooper 1976: 198, 205-206). Even ignoring the
question of how sonority is defined phonetically, there is disagreement on what the
sonority hierarchy is; and until this issue is resolved, as Selkirk points out,
discovering the phonetic correlates of sonority will be difficult. For example,
Clements (1990: 292-296) proposes a hierarchy where obstruents are less sonorous
than nasals, nasals less sonorous than liquids, and liquids less sonorous than glides.
Glides, in turn, are seen as non-syllabic vowels. On the other hand, Selkirk (1984:
112) sets out a more detailed hierarchy, as follows:
(2) Sounds (in order of decreasing sonority)
a
e = o
i = u
r
l
m = n
s
v = z = ð
f = θ
b = d = g
p = t = k
Whatever the exact classification of sounds by sonority, it seems to be a
general rule that for each peak in sonority in a string of phonemes, there will be a
syllable (Bloomfield 1933, Selkirk 1984, Clements 1990). Perhaps the best-known
formulation of the sonority principle is Selkirk's (1984: 116) Sonority Sequencing
Generalization (SSG):
In any syllable, there is a segment constituting a sonority peak that is preceded and/or
followed by a sequence of segments with progressively decreasing sonority values.
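To make the formulation concrete, here is a rough sketch (my own, not the dissertation's Prolog grammar) of a checker for this strict reading of the SSG; the five-level numeric sonority scale is a simplifying assumption for illustration, not Selkirk's full hierarchy.

```python
# Sketch of an SSG check: sonority must rise strictly to a single peak
# and then fall strictly. The scale below (obstruents < nasals <
# liquids < glides < vowels) is a simplified assumption.

SONORITY = {
    **{c: 1 for c in "ptkbdgfvsz"},  # obstruents
    **{c: 2 for c in "mn"},          # nasals
    **{c: 3 for c in "lr"},          # liquids
    **{c: 4 for c in "jw"},          # glides
    **{c: 5 for c in "aeiou"},       # vowels
}

def obeys_ssg(syllable):
    """True iff the segment string has one sonority peak, with
    progressively decreasing sonority on either side of it."""
    son = [SONORITY[seg] for seg in syllable]
    peak = son.index(max(son))
    rising = all(son[i] < son[i + 1] for i in range(peak))
    falling = all(son[i] > son[i + 1] for i in range(peak, len(son) - 1))
    return rising and falling
```

On this strict reading, an onset cluster whose sonority falls towards the vowel (say, liquid + obstruent + vowel) fails the check, which is exactly the kind of exception to the SSG that the corpus test described earlier in this chapter is designed to enumerate.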
In this formulation, the syllabicity of segments depends on their position,
rather than on any inherent phonological property of their own (Clements and Keyser
1983: 4-5, Selkirk 1984: 108, Blevins 1995, Bethin 1998): sonority peaks simply
align with syllable peaks. This offers an explanation in terms of syllable structure for
the fact that glides and approximants can function as either consonants or vowels, a
fact that was noted as early as Siev