28
Dynamic Typology Michael Cysouw LMU Munich

Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

Dynamic TypologyMichael Cysouw

LMU Munich

Page 2: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

Reasons for Typological Similarity

• Non-historical reasons‣ Universals‣ Chance

• Historical reasons‣ Common descent‣ Borrowing, substrate, etc.

Page 3: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

The Importance of History

• The actual languages as attested in this world are not necessarily reflecting the possible human languages

• History (very probably) is still a significant factor for the explanation of current typological distributions

Page 4: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

Order of Object and Verb(Mathew Dryer 2005 from WALS)

Page 5: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

LETTERdoi:10.1038/nature09923

Evolved structure of language shows lineage-specifictrends in word-order universalsMichael Dunn1,2, Simon J. Greenhill3,4, Stephen C. Levinson1,2 & Russell D. Gray3

Languages vary widely but not without limit. The central goal oflinguistics is to describe the diversity of human languages andexplain the constraints on that diversity. Generative linguists fol-lowing Chomsky have claimed that linguistic diversity must beconstrained by innate parameters that are set as a child learns alanguage1,2. In contrast, other linguists following Greenberg haveclaimed that there are statistical tendencies for co-occurrence oftraits reflecting universal systems biases3–5, rather than absoluteconstraints or parametric variation. Here we use computationalphylogenetic methods to address the nature of constraints onlinguistic diversity in an evolutionary framework6. First, contraryto the generative account of parameter setting, we show that theevolution of only a few word-order features of languages arestrongly correlated. Second, contrary to the Greenbergian general-izations, we show that most observed functional dependenciesbetween traits are lineage-specific rather than universal tendencies.These findings support the view that—at least with respect to wordorder—cultural evolution is the primary factor that determineslinguistic structure, with the current state of a linguistic systemshaping and constraining future states.

Human language is unique amongst animal communication sys-tems not only for its structural complexity but also for its diversity atevery level of structure and meaning. There are about 7,000 extantlanguages, some with just a dozen contrastive sounds, others with morethan 100, some with complex patterns of word formation, others withsimple words only, some with the verb at the beginning of the sentence,some in the middle, and some at the end. Understanding this diversityand the systematic constraints on it is the central goal of linguistics. Thegenerative approach to linguistic variation has held that linguisticdiversity can be explained by changes in parameter settings. Each ofthese parameters controls a number of specific linguistic traits. Forexample, the setting ‘heads first’ will cause a language both to placeverbs before objects (‘kick the ball’), and prepositions before nouns(‘into the goal’)1,7. According to this account, language change occurswhen child learners simplify or regularize by choosing parameter set-tings other than those of the parental generation. Across a few genera-tions such changes might work through a population, effectinglanguage change across all the associated traits. Language changeshould therefore be relatively fast, and the traits set by one parametermust co-vary8.

In contrast, the statistical approach adopted by Greenbergian linguistssamples languages to find empirically co-occurring traits. These co-occurring traits are expected to be statistical tendencies attributable touniversal cognitive or systems biases. Among the most robust of thesetendencies are the so-called ‘‘word-order universals’’3 linking the orderof elements in a clause. Dryer has tested these generalizations on aworldwide sample of 625 languages and finds evidence for some of theseexpected linkages between word orders9. According to Dryer’s reformu-lation of the word-order universals, dominant verb–object orderingcorrelates with prepositions, as well as relative clauses and genitives

after the noun, whereas dominant object–verb ordering predicts post-positions, relative clauses and genitives before the noun4. One generalexplanation for these observations is that languages tend to be consist-ent (‘harmonic’) in their order of the most important element or ‘head’of a phrase relative to its ‘complement’ or ‘modifier’3, and so if the verbis first before its object, the adposition (here preposition) precedes thenoun, while if the verb is last after its object, the adposition follows thenoun (a ‘postposition’). Other functionally motivated explanationsemphasize consistent direction of branching within the syntactic struc-ture of a sentence9 or information structure and processing efficiency5.

To demonstrate that these correlations reflect underlying cognitiveor systems biases, the languages must be sampled in a way that controlsfor features linked only by direct inheritance from a commonancestor10. However, efforts to obtain a statistically independent sampleof languages confront several practical problems. First, our knowledgeof language relationships is incomplete: specialists disagree about high-level groupings of languages and many languages are only tentativelyassigned to language families. Second, a few large language familiescontain the bulk of global linguistic variation, making sampling purelyfrom unrelated languages impractical. Some balance of related, unre-lated and areally distributed languages has usually been aimed for inpractice11,12.

The approach we adopt here controls for shared inheritance byexamining correlation in the evolution of traits within well-establishedfamily trees13. Drawing on the powerful methods developed in evolu-tionary biology, we can then track correlated changes during the his-torical processes of language evolution as languages split and diversify.Large language families, a problem for the sampling method describedabove, now become an essential resource, because they permit theidentification of coupling between character state changes over long timeperiods. We selected four large language families for which quantitativephylogenies are available: Austronesian (with about 1,268 languages14

and a time depth of about 5,200 years15), Indo-European (about 449languages14, time depth of about 8,700 years16), Bantu (about 668 or522 for Narrow Bantu17, time depth about 4,000 years18) and Uto-Aztecan (about 61 languages19, time-depth about 5,000 years20).Between them these language families encompass well over a third ofthe world’s approximately 7,000 languages. We focused our analyses onthe ‘word-order universals’ because these are the most frequently citedexemplary candidates for strongly correlated linguistic features, withplausible motivations for interdependencies rooted in prominent formaland functional theories of grammar.

To test the extent of functional dependencies between word-ordervariables, we used a Bayesian phylogenetic method implemented in thesoftware BayesTraits21. For eight word-order features we comparedcorrelated and uncorrelated evolutionary models. Thus, for each pairof features, we calculated the likelihood that the observed states of thecharacters were the result of the two features evolving independently,and compared this to the likelihood that the observed states were theresult of coupled evolutionary change. This likelihood calculation was

1Max Planck Institute for Psycholinguistics, Post Office Box 310, 6500 AH Nijmegen, The Netherlands. 2Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Kapittelweg 29,6525 EN Nijmegen, The Netherlands. 3Department of Psychology, University of Auckland, Auckland 1142, New Zealand. 4Computational Evolution Group, University of Auckland, Auckland 1142, NewZealand.

5 M A Y 2 0 1 1 | V O L 4 7 3 | N A T U R E | 7 9

Macmillan Publishers Limited. All rights reserved©2011

Page 6: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

Dynamic Typology

• It is not the actual frequencies that matter

• It is the stable distribution that matters

• A stable distribution is a situation in which just as many languages change from A to B as change from B to A.

• The extent to which the actual is different from the stable situation signals an effect of history

Page 7: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

Type A Type B

50 200

probability of change: 20%

probability of change: 5%

1010

Stable distribution

Page 8: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

Type A Type B

50 200

probability of change: 20%

probability of change: 5%

1010

Instable distribution

probability of change: 10%5

Page 9: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

Type A Type B

83 167

probability of change: 10%

probability of change: 5%

8.38.3

Expected stable distribution

5/10+5×250= =10/10+5×250

Page 10: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

Estimating Transition Probabilities

• Are transitions probabilities measurable ?‣ there is no assumption here why a change happens‣ might be internal or external (borrowing, substrate)‣ the question is whether specific features of language

are more amenable to change than others

• If yes: ‣ use group internal variation to estimate

transition probabilities

Page 11: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

The world’s genera

Page 12: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

A “sample” of the world’s languages

40% blue60% red

Page 13: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

A real sample stratified for genealogical

grouping

How to get probabilities of change ...

Page 14: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

Change from blue to red:

p = 2/12 = 17%

Page 15: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

Change from red to blue:

p = 2/18 = 11%

Page 16: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

blue: 17/11+17 = 61% (actual 40 %)

red: 11/11+17 = 39 % (actual 60 %)

Expected Stable Distribution:

Page 17: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

Problem:

Page 18: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

Problem:

‣ Originally red with one change to blue ?

‣ Originally blue with two changes to red ?

Page 19: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

probability of any change happening

= α· frequency (blue) + β

For groups of three languages:

α = 3· (pblue→red − pred→blue)

β = 3· pred→blue · (1 − pblue→red)

Elena Maslova’s proposal

Page 20: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

probability of contact triples = α· frequency (blue) + β

For groups of three languages A, B, C:– A+B related– B+C geographically close, not related– A ‘contact triple’ is a set in which A≠B=C

α = (pblue→red − pred→blue)

β = pred→blue · (1 − pred→blue)

Elena Maslova’s proposal for contact

Page 21: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

WALS frequencies

Stab

le e

stim

ates

Page 22: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

LETTERdoi:10.1038/nature09923

Evolved structure of language shows lineage-specifictrends in word-order universalsMichael Dunn1,2, Simon J. Greenhill3,4, Stephen C. Levinson1,2 & Russell D. Gray3

Languages vary widely but not without limit. The central goal oflinguistics is to describe the diversity of human languages andexplain the constraints on that diversity. Generative linguists fol-lowing Chomsky have claimed that linguistic diversity must beconstrained by innate parameters that are set as a child learns alanguage1,2. In contrast, other linguists following Greenberg haveclaimed that there are statistical tendencies for co-occurrence oftraits reflecting universal systems biases3–5, rather than absoluteconstraints or parametric variation. Here we use computationalphylogenetic methods to address the nature of constraints onlinguistic diversity in an evolutionary framework6. First, contraryto the generative account of parameter setting, we show that theevolution of only a few word-order features of languages arestrongly correlated. Second, contrary to the Greenbergian general-izations, we show that most observed functional dependenciesbetween traits are lineage-specific rather than universal tendencies.These findings support the view that—at least with respect to wordorder—cultural evolution is the primary factor that determineslinguistic structure, with the current state of a linguistic systemshaping and constraining future states.

Human language is unique amongst animal communication sys-tems not only for its structural complexity but also for its diversity atevery level of structure and meaning. There are about 7,000 extantlanguages, some with just a dozen contrastive sounds, others with morethan 100, some with complex patterns of word formation, others withsimple words only, some with the verb at the beginning of the sentence,some in the middle, and some at the end. Understanding this diversityand the systematic constraints on it is the central goal of linguistics. Thegenerative approach to linguistic variation has held that linguisticdiversity can be explained by changes in parameter settings. Each ofthese parameters controls a number of specific linguistic traits. Forexample, the setting ‘heads first’ will cause a language both to placeverbs before objects (‘kick the ball’), and prepositions before nouns(‘into the goal’)1,7. According to this account, language change occurswhen child learners simplify or regularize by choosing parameter set-tings other than those of the parental generation. Across a few genera-tions such changes might work through a population, effectinglanguage change across all the associated traits. Language changeshould therefore be relatively fast, and the traits set by one parametermust co-vary8.

In contrast, the statistical approach adopted by Greenbergian linguistssamples languages to find empirically co-occurring traits. These co-occurring traits are expected to be statistical tendencies attributable touniversal cognitive or systems biases. Among the most robust of thesetendencies are the so-called ‘‘word-order universals’’3 linking the orderof elements in a clause. Dryer has tested these generalizations on aworldwide sample of 625 languages and finds evidence for some of theseexpected linkages between word orders9. According to Dryer’s reformu-lation of the word-order universals, dominant verb–object orderingcorrelates with prepositions, as well as relative clauses and genitives

after the noun, whereas dominant object–verb ordering predicts post-positions, relative clauses and genitives before the noun4. One generalexplanation for these observations is that languages tend to be consist-ent (‘harmonic’) in their order of the most important element or ‘head’of a phrase relative to its ‘complement’ or ‘modifier’3, and so if the verbis first before its object, the adposition (here preposition) precedes thenoun, while if the verb is last after its object, the adposition follows thenoun (a ‘postposition’). Other functionally motivated explanationsemphasize consistent direction of branching within the syntactic struc-ture of a sentence9 or information structure and processing efficiency5.

To demonstrate that these correlations reflect underlying cognitiveor systems biases, the languages must be sampled in a way that controlsfor features linked only by direct inheritance from a commonancestor10. However, efforts to obtain a statistically independent sampleof languages confront several practical problems. First, our knowledgeof language relationships is incomplete: specialists disagree about high-level groupings of languages and many languages are only tentativelyassigned to language families. Second, a few large language familiescontain the bulk of global linguistic variation, making sampling purelyfrom unrelated languages impractical. Some balance of related, unre-lated and areally distributed languages has usually been aimed for inpractice11,12.

The approach we adopt here controls for shared inheritance byexamining correlation in the evolution of traits within well-establishedfamily trees13. Drawing on the powerful methods developed in evolu-tionary biology, we can then track correlated changes during the his-torical processes of language evolution as languages split and diversify.Large language families, a problem for the sampling method describedabove, now become an essential resource, because they permit theidentification of coupling between character state changes over long timeperiods. We selected four large language families for which quantitativephylogenies are available: Austronesian (with about 1,268 languages14

and a time depth of about 5,200 years15), Indo-European (about 449languages14, time depth of about 8,700 years16), Bantu (about 668 or522 for Narrow Bantu17, time depth about 4,000 years18) and Uto-Aztecan (about 61 languages19, time-depth about 5,000 years20).Between them these language families encompass well over a third ofthe world’s approximately 7,000 languages. We focused our analyses onthe ‘word-order universals’ because these are the most frequently citedexemplary candidates for strongly correlated linguistic features, withplausible motivations for interdependencies rooted in prominent formaland functional theories of grammar.

To test the extent of functional dependencies between word-ordervariables, we used a Bayesian phylogenetic method implemented in thesoftware BayesTraits21. For eight word-order features we comparedcorrelated and uncorrelated evolutionary models. Thus, for each pairof features, we calculated the likelihood that the observed states of thecharacters were the result of the two features evolving independently,and compared this to the likelihood that the observed states were theresult of coupled evolutionary change. This likelihood calculation was

1Max Planck Institute for Psycholinguistics, Post Office Box 310, 6500 AH Nijmegen, The Netherlands. 2Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Kapittelweg 29,6525 EN Nijmegen, The Netherlands. 3Department of Psychology, University of Auckland, Auckland 1142, New Zealand. 4Computational Evolution Group, University of Auckland, Auckland 1142, NewZealand.

5 M A Y 2 0 1 1 | V O L 4 7 3 | N A T U R E | 7 9

Macmillan Publishers Limited. All rights reserved©2011

Page 23: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

TsouRukai

Siraya

Sediq

CiuliAtayalSquliqAtayal

Sunda

Minangkabau

IndonesianMalayBahasaIndonesia

Iban

TobaBatakLampung

TimugonMurutMerinaMalagasy

Palauan

MunaKatobuTongkuno DWunaPopalia

Wolio

Paulohi

MurnatenAluneAlune

KasiraIrahutuLetineseBuruNamroleBay

LoniuLou

ManamKairiru

DobuanBwaidogaSaliba

KilivilaGapapaiwa

MekeoMotu

AriaMoukSengseng

KoveMaleu

Lakalai

NakanaiBileki DMaututu

NggelaKwaio

Vitu

TigakTungagTungakLavongaiNalik

KuanuaSiar

Kokota

BabatanaSisingga

BanoniTeop

NehanHapeNissan

KiribatiKusaie

Woleai

WoleaianPuluwatesePuloAnnan

PonapeanMokilese

Rotuman

TokelauRapanuiEasterIsland

TahitianModern

HawaiianMaori

FutunaFutunaWest

FutunaEastNiueSamoanTongan

FijianBau

DehuIaai

SyeErromangan

LenakelAnejomAneityum

NgunaPaameseSouth

SakaoPortOlry

BumaTeanu

Giman

AmbaiYapenMor

NgadhaManggarai

Lamaholot

ESLewa D

ESKamberaSouthern DKambera

ESUmbuRatuNggai D

NiasChamorro

Bajo

BikolNagaCity

CebuanoMamanwa

TboliTagabiliWBukidnonManobo

AgtaIlokano

IfugaoBatadBontokGuinaang

Pangasinan

Kapampangan

PaiwanO

OO

O

OO

O

O

OO

O

OO

OO

O

OOO

O

O

OO

OOO

OO

OO

OOO

OO

OO

OOO

OO

O

OO

OO

O

OOO

OO

O

OO

OO

OO

OO

O

OOO

OO

O

OO

O

OO

OO

OOOO

O

OO

O

OO

OO

O

OO

O

OO

OO

O

O

OO

O

OO

O

O

OO

OO

OO

OO

O

O

O

Austronesian

Tocharian BTocharian A

Albanian G

Greek ModAncientGreek

Armenian Mod

FrenchProvencalWalloonLadin

Italian

Catalan

SpanishPortuguese ST

Sardinian CRumanian List

Latin

Old Norse

DanishSwedish List

Riksmal

FaroeseIcelandic ST

German STDutch List

FrisianFlemishAfrikaans

Pennsylvania DutchLuxembourgish

English STOld English

Gothic

Old Church Slavonic

SlovenianSerbocroatian

MacedonianBulgarian

Lusatian ULusatian LSlovak

CzechUkrainianByelorussian

RussianPolish

LatvianLithuanian ST

Nepali ListBihariBengali

AssameseOriya

GujaratiMarathi

MarwariSindhi

LahdnaPanjabi ST

UrduHindi

SinghaleseKashmiri

Kurdish

WaziriAfghan

TadzikPersian List

WakhiOssetic

Baluchi

Scots GaelicIrish B

Welsh N

Breton STCornish

Hittite

OO

O

OO

O

OOOO

O

O

OO

OO

O

O

OO

O

OO

OO

OOO

OOO

O

O

O

OO

OO

OOOO

OO

OO

OO

OOO

OO

OO

OO

OO

OO

OO

O

OO

OO

OO

O

OO

O

OO

O

Indo-European

PipilAztecTetelcingo

Cora

OpataEudeve

GuarijioTarahumara

Yaqui

NorthernTepehuan

PapagoPimaSoutheasternTepehuan

CahuillaLuiseno

MonoNorthernPaiute

Comanche

SouthernPaiuteSouthernUte

Chemehuevi

Hopi

OO

O

OOO

OO

O

OO

OO

OO

O

OO

O

OUto-Aztecan

Makwa P311

Cewa N31bNyanja N31a

Shona S10Ngoni S45

Xhosa S41Zulu S42

Hadimu G43cKaguru G12

Gikuyu E51Haya J22H

Hima J13Rwanda J611

Luba L33Ndembu K221

Lwena K141Ndonga R22

Bira D322Doko C40D

Mongo C613MongoNkundo C611

Tiv 802

O

OO

OO

OO

OO

OOO

OO

OO

OO

OOO

OBantu

Figure 1 | Two word-order features plotted onto maximum clade credibilitytrees of the four language families. Squares represent order of adposition andnoun; circles represent order of verb and object. The tree sample underlyingthis tree is generated from lexical data16,22. Blue-blue indicates postposition,

object–verb. Red-red indicates preposition, verb–object. Red-blue indicatespreposition, object–verb. Blue-red indicates postposition, verb–object. Blackindicates polymorphic states.

RESEARCH LETTER

8 0 | N A T U R E | V O L 4 7 3 | 5 M A Y 2 0 1 1

Macmillan Publishers Limited. All rights reserved©2011

LETTERdoi:10.1038/nature09923

Evolved structure of language shows lineage-specifictrends in word-order universalsMichael Dunn1,2, Simon J. Greenhill3,4, Stephen C. Levinson1,2 & Russell D. Gray3

Languages vary widely but not without limit. The central goal oflinguistics is to describe the diversity of human languages andexplain the constraints on that diversity. Generative linguists fol-lowing Chomsky have claimed that linguistic diversity must beconstrained by innate parameters that are set as a child learns alanguage1,2. In contrast, other linguists following Greenberg haveclaimed that there are statistical tendencies for co-occurrence oftraits reflecting universal systems biases3–5, rather than absoluteconstraints or parametric variation. Here we use computationalphylogenetic methods to address the nature of constraints onlinguistic diversity in an evolutionary framework6. First, contraryto the generative account of parameter setting, we show that theevolution of only a few word-order features of languages arestrongly correlated. Second, contrary to the Greenbergian general-izations, we show that most observed functional dependenciesbetween traits are lineage-specific rather than universal tendencies.These findings support the view that—at least with respect to wordorder—cultural evolution is the primary factor that determineslinguistic structure, with the current state of a linguistic systemshaping and constraining future states.

Human language is unique amongst animal communication sys-tems not only for its structural complexity but also for its diversity atevery level of structure and meaning. There are about 7,000 extantlanguages, some with just a dozen contrastive sounds, others with morethan 100, some with complex patterns of word formation, others withsimple words only, some with the verb at the beginning of the sentence,some in the middle, and some at the end. Understanding this diversityand the systematic constraints on it is the central goal of linguistics. Thegenerative approach to linguistic variation has held that linguisticdiversity can be explained by changes in parameter settings. Each ofthese parameters controls a number of specific linguistic traits. Forexample, the setting ‘heads first’ will cause a language both to placeverbs before objects (‘kick the ball’), and prepositions before nouns(‘into the goal’)1,7. According to this account, language change occurswhen child learners simplify or regularize by choosing parameter set-tings other than those of the parental generation. Across a few genera-tions such changes might work through a population, effectinglanguage change across all the associated traits. Language changeshould therefore be relatively fast, and the traits set by one parametermust co-vary8.

In contrast, the statistical approach adopted by Greenbergian linguistssamples languages to find empirically co-occurring traits. These co-occurring traits are expected to be statistical tendencies attributable touniversal cognitive or systems biases. Among the most robust of thesetendencies are the so-called ‘‘word-order universals’’3 linking the orderof elements in a clause. Dryer has tested these generalizations on aworldwide sample of 625 languages and finds evidence for some of theseexpected linkages between word orders9. According to Dryer’s reformu-lation of the word-order universals, dominant verb–object orderingcorrelates with prepositions, as well as relative clauses and genitives

after the noun, whereas dominant object–verb ordering predicts post-positions, relative clauses and genitives before the noun4. One generalexplanation for these observations is that languages tend to be consist-ent (‘harmonic’) in their order of the most important element or ‘head’of a phrase relative to its ‘complement’ or ‘modifier’3, and so if the verbis first before its object, the adposition (here preposition) precedes thenoun, while if the verb is last after its object, the adposition follows thenoun (a ‘postposition’). Other functionally motivated explanationsemphasize consistent direction of branching within the syntactic struc-ture of a sentence9 or information structure and processing efficiency5.

To demonstrate that these correlations reflect underlying cognitiveor systems biases, the languages must be sampled in a way that controlsfor features linked only by direct inheritance from a commonancestor10. However, efforts to obtain a statistically independent sampleof languages confront several practical problems. First, our knowledgeof language relationships is incomplete: specialists disagree about high-level groupings of languages and many languages are only tentativelyassigned to language families. Second, a few large language familiescontain the bulk of global linguistic variation, making sampling purelyfrom unrelated languages impractical. Some balance of related, unre-lated and areally distributed languages has usually been aimed for inpractice11,12.

The approach we adopt here controls for shared inheritance byexamining correlation in the evolution of traits within well-establishedfamily trees13. Drawing on the powerful methods developed in evolu-tionary biology, we can then track correlated changes during the his-torical processes of language evolution as languages split and diversify.Large language families, a problem for the sampling method describedabove, now become an essential resource, because they permit theidentification of coupling between character state changes over long timeperiods. We selected four large language families for which quantitativephylogenies are available: Austronesian (with about 1,268 languages14

and a time depth of about 5,200 years15), Indo-European (about 449languages14, time depth of about 8,700 years16), Bantu (about 668 or522 for Narrow Bantu17, time depth about 4,000 years18) and Uto-Aztecan (about 61 languages19, time-depth about 5,000 years20).Between them these language families encompass well over a third ofthe world’s approximately 7,000 languages. We focused our analyses onthe ‘word-order universals’ because these are the most frequently citedexemplary candidates for strongly correlated linguistic features, withplausible motivations for interdependencies rooted in prominent formaland functional theories of grammar.

To test the extent of functional dependencies between word-ordervariables, we used a Bayesian phylogenetic method implemented in thesoftware BayesTraits21. For eight word-order features we comparedcorrelated and uncorrelated evolutionary models. Thus, for each pairof features, we calculated the likelihood that the observed states of thecharacters were the result of the two features evolving independently,and compared this to the likelihood that the observed states were theresult of coupled evolutionary change. This likelihood calculation was

1Max Planck Institute for Psycholinguistics, Post Office Box 310, 6500 AH Nijmegen, The Netherlands. 2Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Kapittelweg 29,6525 EN Nijmegen, The Netherlands. 3Department of Psychology, University of Auckland, Auckland 1142, New Zealand. 4Computational Evolution Group, University of Auckland, Auckland 1142, NewZealand.

5 M A Y 2 0 1 1 | V O L 4 7 3 | N A T U R E | 7 9

Macmillan Publishers Limited. All rights reserved©2011

Page 24: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

S1.4. Correlation Analysis To identify the relationships between the word order characters we mapped the WALS typological data onto these phylogenies using a continuous-time Markov model of trait evolution under a dependent and independent model of trait evolution. The linguistic characters were adapted from Dryerʼs word order features, published in the WALS database (Dryer 2005a-h; Section S2). The phylogenetic correlation analysis was carried out using DISCRETE, a Bayesian method of model comparison for discrete parameters implemented in BayesTraits (Pagel and Meade 2006). DISCRETE/BayesTraits estimates the posterior probability distribution of transition rates between the character states across the set of phylogenetic trees (Section S1.1) under a continuous-time Markov model of trait evolution (Pagel 1994, Pagel and Meade 2006). We calculated the fit of an independent and dependent model of trait evolution on each pair of characters. In the independent model, the linguistic features evolve separately (Figure S1). For example, a transition from having postpositions to having prepositions is not contingent on the background state of the second character (word order) and vice versa. Under this model there are four rates of change between states: Postpositions ⇒ Prepositions, Prepositions ⇒ Postpositions, Object-Verb ⇒ Verb-Object and Verb-Object ⇒ Object-Verb.

Figure S1: An independent model of linguistic trait evolution. Changes in character A (Adposition type, top) are not linked to changes in character B (word order, bottom). The magnitude of the estimated transition parameters is indicated here by arrow weight.

In contrast, the dependent model treats each pair of linguistic features as a single, four-state character, which can be represented by model with eight rates of change (Figure S2).

Figure S2: A dependent model of linguistic trait evolution. Changes in character A (Adposition type) are linked to changes in character B (word order). The magnitude of the estimated transition parameters is indicated here by arrow weight.

WWW.NATURE.COM/NATURE | 5

S1.4. Correlation Analysis To identify the relationships between the word order characters we mapped the WALS typological data onto these phylogenies using a continuous-time Markov model of trait evolution under a dependent and independent model of trait evolution. The linguistic characters were adapted from Dryerʼs word order features, published in the WALS database (Dryer 2005a-h; Section S2). The phylogenetic correlation analysis was carried out using DISCRETE, a Bayesian method of model comparison for discrete parameters implemented in BayesTraits (Pagel and Meade 2006). DISCRETE/BayesTraits estimates the posterior probability distribution of transition rates between the character states across the set of phylogenetic trees (Section S1.1) under a continuous-time Markov model of trait evolution (Pagel 1994, Pagel and Meade 2006). We calculated the fit of an independent and dependent model of trait evolution on each pair of characters. In the independent model, the linguistic features evolve separately (Figure S1). For example, a transition from having postpositions to having prepositions is not contingent on the background state of the second character (word order) and vice versa. Under this model there are four rates of change between states: Postpositions ⇒ Prepositions, Prepositions ⇒ Postpositions, Object-Verb ⇒ Verb-Object and Verb-Object ⇒ Object-Verb.

Figure S1: An independent model of linguistic trait evolution. Changes in character A (Adposition type, top) are not linked to changes in character B (word order, bottom). The magnitude of the estimated transition parameters is indicated here by arrow weight.

In contrast, the dependent model treats each pair of linguistic features as a single, four-state character, which can be represented by model with eight rates of change (Figure S2).

Figure S2: A dependent model of linguistic trait evolution. Changes in character A (Adposition type) are linked to changes in character B (word order). The magnitude of the estimated transition parameters is indicated here by arrow weight.

WWW.NATURE.COM/NATURE | 5

Independent Model

Dependent Model

Page 25: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

-50

510

1520

Bay

es F

acto

r

NUM-SBV

NUM-REL

OBV-REL

NUM-OBV

ADP-REL

REL-SBV

DEM-SBV

DEM-NUM

GEN-REL

ADJ-OBV

DEM-REL

ADP-DEM

ADJ-SBV

DEM-GEN

ADP-NUM

DEM-OBV

ADJ-ADP

ADJ-REL

ADP-SBV

GEN-SBV

GEN-OBV

ADJ-DEM

GEN-NUM

ADJ-NUM

ADP-GEN

OBV-SBV

ADJ-GEN

ADP-OBV

Page 26: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

Postpostion,object-verb

Postpostion,verb-object

Preposition,object-verb

Preposition,verb-object

Postpostion,object-verb

Postpostion,verb-object

Preposition,object-verb

Preposition,verb-object

Austronesian Indo-European

developed here could be used to explore the dependency relationshipsbetween a wide range of linguistic features. Nearly all branches oflinguistic theory have predicted such dependencies. Here we haveexamined the paradigm example (word-order universals) of theGreenbergian approach, taken also by the Chomskyan approach as‘‘descriptive generalizations that should be derived from principles ofUG [Universal Grammar]’’.1,27 What the current analyses unexpectedlyreveal is that systematic linkages of traits are likely to be the rare excep-tion rather than the rule. Linguistic diversity does not seem to be tightlyconstrained by universal cognitive factors specialized for language29.Instead, it is the product of cultural evolution, canalized by the systemsthat have evolved during diversification, so that future states lie in anevolutionary landscape with channels and basins of attraction that arespecific to linguistic lineages.

Received 10 June 2009; accepted 8 February 2011.Published online 13 April 2011.

1. Baker, M. The Atoms of Language (Basic Books, 2001).2. Chomsky, N. Lectures on Government and Binding (Foris, 1981).3. Greenberg, J. H. in Universals of Grammar (ed. Joseph H. Greenberg) 73–113 (MIT

Press, 1963).4. Dryer, M. in Language Typology and Syntactic Description Vol. I Clause Structure 2nd

edn (ed. Shopen, T.) 61–131 (Cambridge University Press, 2007).5. Hawkins, J. A. inLanguageUniversals (edsChristiansen,M.H.,Collins, C.&Edelman,

S.) 54–78 (Oxford University Press, 2009).6. Croft, W. Evolutionary linguistics. Annu. Rev. Anthropol. 37, 219–234 (2008).7. Chomsky, N. Knowledge of Language: its Nature, Origin and Use (Praeger, 1986).8. Lightfoot, D. W. The child’s trigger experience—degree-0 learnability. Behav. Brain

Sci. 12, 321–334 (1989).9. Dryer, M. S. The Greenbergian word order correlations. Language 68, 81–138

(1992).10. Mace, R. & Pagel, M. The comparative method in anthropology. Curr. Anthropol. 35,

549–564 (1994).11. Bakker, P. in Oxford Handbook of Linguistic Typology (ed. Song, J. J.) (Oxford

University Press, 2010).12. Cysouw, M. in Quantative Linguistics: An International Handbook (eds Altmann, G.,

Kohler, R. & Piotrowski, R.) 554–578 (Mouton de Gruyter, 2005).13. Pagel, M., Meade, A. & Barker, D. Bayesian estimation of ancestral character states

on phylogenies. Syst. Biol. 53, 673–684 (2004).14. Gordon, R. G. J. Ethnologue: Languages of the World 15th edn (SIL International,

2005).15. Gray, R. D., Drummond, A. J. & Greenhill, S. J. Language phylogenies reveal

expansion pulses andpauses inPacific settlement. Science 323, 479–483 (2009).

16. Gray,R. D. & Atkinson, Q. D. Language-tree divergence times support the Anatoliantheory of Indo-European origin. Nature 426, 435–439 (2003).

17. Guthrie, M. Comparative Bantu Vol. 2 (Gregg International, 1971).18. Diamond, J. & Bellwood, P. Farmers and their languages: the first expansions.

Science 300, 597–603 (2003).19. Campbell, L.American IndianLanguages:TheHistorical LinguisticsofNativeAmerica

133–138 (Oxford University Press, 1997).20. Kemp, B. M. et al. Evaluating the farming/language dispersal hypothesis with

genetic variation exhibited by populations in the Southwest and Mesoamerica.Proc. Natl Acad. Sci. USA 107, 6759–6764 (2010).

21. Pagel, M. & Meade, A. Bayesian analysis of correlated evolution of discretecharacters by reversible-jump Markov chain Monte Carlo. Am. Nat. 167, 808–825(2006).

22. Dyen, I., Kruskal, J. B. & Black, P. An Indo-European classification, a lexicostatisticalexperiment. Trans. Am. Phil. Soc. 82, 1–132 (1992).

23. Greenhill, S. J., Blust, R. & Gray, R. D. The Austronesian basic vocabulary database:from bioinformatics to lexomics. Evol. Bioinform. 4, 271–283 (2008).

24. Holden, C. J. Bantu language trees reflect the spread of farming across sub-Saharan Africa: a maximum-parsimony analysis. Proc. R. Soc. Lond. B 269,793–799 (2002).

25. Haspelmath, M., Dryer, M. S., Gil, D. & Comrie, B. The World Atlas of LanguageStructures Online (Max Planck Digital Library, 2008).

26. Raftery, A. Approximate Bayes factors and accounting for model uncertainty ingeneralised linear models. Biometrika 83, 251–266 (1996).

27. Cela-Conde, C. & Marty, G. Noam Chomsky’s minimalist program and thephilosophy of mind. An interview. Syntax 1, 19–36 (1998).

28. Reesink, G., Singer, R. & Dunn, M. Explaining the linguistic diversity of Sahul usingpopulation models. PLoS Biol. 7, e1000241 (2009).

29. Evans,N.&Levinson,S.C. Themythof language universals: languagediversity andits importance for cognitive science. Behav. Brain Sci. 32, 429–492 (2009).

Supplementary Information is linked to the online version of the paper atwww.nature.com/nature.

Acknowledgements We thank M. Liberman for comments on our initial results andF. Jordan and G. Reesink for comments on drafts of this paper. L. Campbell, J. Hill,W. Miller and R. Ross provided and coded the Uto-Aztecan lexical data.

Author Contributions R.D.G. and M.D. conceived and designed the study. S.J.G., R.D.G.and M.D.provided lexicaldataandphylogenetic trees.M.D. codedword-order data, andconducted the phylogenetic comparative analyses with S.J.G. All authors were involvedin discussion and interpretation of the results. All authors contributed to the writingwith S.C.L. and M.D. having leading roles; M.D., R.D.G. and S.J.G. produced theSupplementary Information.

Author Information Reprints and permissions information is available atwww.nature.com/reprints. The authors declare no competing financial interests.Readers are welcome to comment on the online version of this article atwww.nature.com/nature. Correspondence and requests for materials should beaddressed to M.D. ([email protected]).

Postposition,verb–object

Preposition,object–verb

Preposition,verb–object

Postposition,object–verb

Postposition,verb–object

Preposition,object–verb

Preposition,verb–object

Postposition,object–verb

Austronesian Indo-European

Figure 3 | The transition probabilities between states leading to object–verband adposition–noun alignments in Austronesian and Indo-European.Data were taken from the model most frequently selected in the analyses;probability is indicated by line weight. The state pairs across the midline of each

figure (postposition, object–verb; and preposition, verb–object) areGreenberg’s ‘harmonic’ or stable word orders. Nevertheless, each languagefamily shows tendencies for specific directions and probabilities of statetransitions.

RESEARCH LETTER

8 2 | N A T U R E | V O L 4 7 3 | 5 M A Y 2 0 1 1

Macmillan Publishers Limited. All rights reserved©2011

Page 27: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

Austronesian Indo-European

0.0

0.2

0.4

0.6

0.8

1.0

Postpositions, Object-Verb

Pro

porti

on in

sta

ble

stat

e

Austronesian Indo-European

0.0

0.2

0.4

0.6

0.8

1.0

Prepositions, Object-Verb

Pro

porti

on in

sta

ble

stat

e

Austronesian Indo-European

0.0

0.2

0.4

0.6

0.8

1.0

Prepostions, Verb-Object

Pro

porti

on in

sta

ble

stat

e

Austronesian Indo-European

0.0

0.2

0.4

0.6

0.8

1.0

Postpositions, Verb-Object

Pro

porti

on in

sta

ble

stat

e

Page 28: Dynamic Typology - cysouw.decysouw.de/home/presentations_files/cysouwDYNAMICTYPOLOGY.pdf · occurring traits are expected to be statistical tendencies attributable to universal cognitive

Looking forward

• Much more testing necessary‣ do different approaches converge on the same

estimates for transition probabilities ?

• Only very few types allowed‣ computations quickly get too large for the

amount of available data on the world’s languages