Language change in individuals and populations
Richard A Blythe
with Gareth Baxter, Bill Croft, Alistair Jones, Simon Kirby, Alan McKane, Kenny Smith and Kevin Stadler
CANES Seminar – Nov 25th 2015
Large populations, long timescales
Dunn et al 2011 Maurits & Griffiths 2014
…
State = feature, e.g. SVO vs OVS
see also Reali & Griffiths 2009, Smith & Wonnacott 2010, …
Small populations, short timescales
Input Output
            Noun-Adj     Adj-Noun
Noun-Num    443 (52%)    32 (4%)
Num-Noun    149 (17%)    227 (27%)
Culbertson et al 2012 reporting data from WALS 2008
What happens in between?
572 LANGUAGE, VOLUME 83, NUMBER 3 (2007)
matic shift between 1971 and 1984. The trajectories of these nine individuals are designated by arrows connecting their 1971 with their 1984 data points.
Figure 3. Individual percentages of [R]/([R] + [r]) for the 32 panel speakers for 1971 and 1984. Trajectories plotted for all speakers who showed a significant difference between the two years.
At the bottom of the graph, we see two groups of the 1971 speakers in the ellipses: a tight cluster of seven individuals under age thirty, none of whom uses more than 10% [R], and another group of five speakers between the ages of thirty-five and fifty, and who range between 0% and 17%. Ten of these twelve people form the stable group of Table 11 in their use of the conservative form. Their 1984 values are displayed in the two dotted ellipses to the right of each of the 1971 groups. Two of them, however, Lysiane B. (007) and Alain L. (104), make substantial changes, abandoning the virtually categorical conservative pattern they displayed in 1971. Along with the other mid-range speakers, their data appears individually in Table 12. Lysiane and Alain are among the nine speakers whose trajectories appear in Fig. 3.
A majority of mid-range speakers in 1971 (seven out of ten) had moved to the categorical or near-categorical use of innovative [R] by 1984. They are people we call 'later adopters' of the innovative variant. Five of the seven are young; five of the seven are male. As a group, we would characterize their behavior as catching up with their peers, the early adopters. The next three people listed in Table 12 are those mid-range speakers who were stable, somewhat older, behaving more as we had expected older individuals to behave. Of note is that one of them, Andre L. (065), a professional actor age twenty-seven when we met him in 1971, was the only speaker who exhibited stylistic variation (Sankoff & Blondeau 2008). In 1995 he maintained virtually the same overall level of [R] as he had in 1984: 69%. At the bottom of the table are the two individuals who moved from virtually categorical use of [r] to dominant use of [R], occupying the upper part of the variable range. Lysiane B. (007) is a case of exceptional upward social mobility, a twenty-four-year-old factory worker when we met her in 1971, a businesswoman in 1984, and when last interviewed in 1995, a successful realtor.
Montreal French /r/
Sankoff & Blondeau 2007
[Figure axis: [R] vs [r]]
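[Figure: a population of agents in states 0 0 0 0 0 0 0 1 1 1 3 3 0 at time t, with the next generation at t+1; states 0, 1 and 3 carry the labels p00, p01 and q3]
Fisher 1930, Wright 1931, Moran 1958, Boyd & Richerson 1985, Smith 2009, …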
Language learning and use in populations
The Wright-Fisher-Moran-Iterated-Learning paradigm
Let λi be the probability that each offspring acquires state i, given the state of the parent population
\[ P(n_0, n_1, \ldots, n_k; t+1) = \left( \frac{N!}{n_0! \, n_1! \cdots n_k!} \right) \lambda_0^{n_0} \lambda_1^{n_1} \cdots \lambda_k^{n_k} \]
If acquisition events are independent
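\[ x_i = \frac{n_i}{N} \]
\[ \langle \Delta x_i \rangle = \lambda_i - x_i, \qquad \langle \Delta x_i \Delta x_j \rangle = \frac{\lambda_i (\delta_{i,j} - \lambda_j)}{N} + (\lambda_i - x_i)(\lambda_j - x_j) \]
Only consistent continuous-time limit is if \( \lambda_i - x_i \sim \frac{1}{N} \)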
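The multinomial update and its first moment can be checked numerically. The following is a minimal sketch, not the speaker's code: the function names and the acquisition probabilities `lam` are hypothetical, and the counts are read off the 13-agent example population (eight agents in state 0, three in state 1, two in state 3).

```python
import random

def wright_fisher_step(counts, lam, rng):
    """One generation: each of the N offspring independently acquires
    state i with probability lam[i] (a multinomial draw)."""
    n_total = sum(counts)
    draws = rng.choices(range(len(lam)), weights=lam, k=n_total)
    new_counts = [0] * len(lam)
    for state in draws:
        new_counts[state] += 1
    return new_counts

def mean_change(counts, lam, trials, seed=0):
    """Monte Carlo estimate of <Delta x_i>, where x_i = n_i / N."""
    rng = random.Random(seed)
    n_total = sum(counts)
    x = [n / n_total for n in counts]
    totals = [0.0] * len(counts)
    for _ in range(trials):
        new_counts = wright_fisher_step(counts, lam, rng)
        for i, n in enumerate(new_counts):
            totals[i] += n / n_total - x[i]
    return [total / trials for total in totals]

if __name__ == "__main__":
    counts = [8, 3, 0, 2]          # states 0..3, N = 13
    lam = [0.5, 0.3, 0.1, 0.1]     # hypothetical acquisition probabilities
    x = [n / sum(counts) for n in counts]
    estimate = mean_change(counts, lam, trials=20000)
    for i, value in enumerate(estimate):
        print(i, round(value, 3), "expected", round(lam[i] - x[i], 3))
```

The estimates should sit close to lambda_i - x_i, up to sampling noise that shrinks with the number of trials.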
[Figure: the same population diagram, states 0 0 0 0 0 0 0 1 1 1 3 3 0 at time t and the next generation at t+1, with labels p00, p01 and q3]
λi is the probability that each offspring acquires state i
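\[ \frac{\partial}{\partial t} P(\vec{x}, t) = \sum_n a_n \hat{D}_n P(\vec{x}, t) + \frac{1}{2N} \sum_{i,j} \frac{\partial^2}{\partial x_i \partial x_j} x_i (\delta_{i,j} - x_j) P(\vec{x}, t) \]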
N is here
[Figure: frequency, x, against time, t]
\[ \lambda_i = x_i \]
\[ \frac{\partial}{\partial t} P(x, t) = \frac{1}{N} \frac{\partial^2}{\partial x^2} x(1 - x) P(x, t) \]
The time until loss (fixation) increases with N
Agents learn from a randomly-chosen parent
Two key results from neutral theory of evolution
Kimura 1984
The time until loss (fixation) increases with N
If the probability of innovation (mutation) is constant for each individual (universal), the overall rate of innovation increases with N
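The first of these results can be illustrated with a small simulation. This is a sketch under stated assumptions, not the analysis from the talk: two variants, the neutral case lambda_i = x_i, non-overlapping generations, and a 50/50 starting split. The function names, run count and population sizes are all hypothetical.

```python
import random

def time_to_fixation(n_agents, rng):
    """Generations until one of two variants is lost, starting from an
    even split, when each learner copies a randomly chosen parent."""
    count = n_agents // 2
    generations = 0
    while 0 < count < n_agents:
        x = count / n_agents
        # Each of the N offspring acquires the variant with probability x.
        count = sum(1 for _ in range(n_agents) if rng.random() < x)
        generations += 1
    return generations

def mean_time_to_fixation(n_agents, runs=200, seed=0):
    """Average fixation time over independent runs."""
    rng = random.Random(seed)
    return sum(time_to_fixation(n_agents, rng) for _ in range(runs)) / runs

if __name__ == "__main__":
    for n_agents in (10, 20, 40, 80):
        print(n_agents, mean_time_to_fixation(n_agents))
```

The printed means should grow with the population size, although individual runs vary widely.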
Does the rate of historical language change depend on the number of speakers?
Lexical gain and loss
Bromham et al 2015
Count gain and loss of cognates
Words with the same form and function
γ Lower 95% CL Max Likelihood Upper 95% CL
Gain 0.145 0.29 0.435
Loss -0.194 -0.12 -0.048
Gain & Loss -0.092 -0.03 0.033
Macroscopic Poisson process
\( \omega(k \to k \pm 1) = A N^{\gamma} \) is the best fit
A varies according to language pair
N is current language size
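To get a feel for what the fitted exponents mean, here is a minimal sketch of the rate law and the Poisson log-likelihood it implies. The function names and the values of A, N and the duration are hypothetical; only the exponent 0.29 (the maximum-likelihood value for gain) is taken from the table above.

```python
import math

def change_rate(a, n_speakers, gamma):
    """Rate omega = A * N**gamma of a single gain or loss event."""
    return a * n_speakers ** gamma

def poisson_log_likelihood(k_events, duration, a, n_speakers, gamma):
    """Log-probability of observing k events over `duration` when events
    arrive as a Poisson process with rate omega."""
    mean = change_rate(a, n_speakers, gamma) * duration
    return k_events * math.log(mean) - mean - math.lgamma(k_events + 1)

if __name__ == "__main__":
    gamma_gain = 0.29  # maximum-likelihood exponent for gain
    # A tenfold increase in N multiplies the rate by 10**gamma, whatever A is.
    ratio = change_rate(1.0, 10_000, gamma_gain) / change_rate(1.0, 1_000, gamma_gain)
    print(ratio, 10 ** gamma_gain)
    print(poisson_log_likelihood(3, 1.0, 0.5, 1_000, gamma_gain))
```

Because A varies by language pair, only the ratio of rates at different N is informative about gamma in this sketch.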
Article grammaticalisation cycles
WALS 2014
                        Definite     Indefinite
None                    243 (39%)    296 (55%)
Demonstrative / “One”   69 (11%)     112 (21%)
Distinct word           216 (35%)    102 (19%)
Affix                   92 (15%)     24 (4%)
[Figure: states labelled 0, 1, 2, 3]
Historical data: 51 languages, 6 areal groups
(Europe, Mideast, S&SW Asia, E Asia, Mesoamerica, S America)
Changes recorded over periods lasting 500 to 5000 years
Also made estimates of historical population sizes
Blythe & Croft, in prep.
Changes in state modelled as a Poisson process
γ Lower 95% CL Max Likelihood Upper 95% CL
Definite -0.316 -0.213 -0.101
Definite (excluding Mesoamerica) -0.266 -0.0706 0.136
Indefinite -0.13 0.0389 0.211
In these cases, a constant rate model is a marginally better fit
Current distribution over the world’s languages cf Maurits & Griffiths 2014
Blythe & Croft, in prep.
\[ \omega(i \to i + 1) = \frac{A N^{\gamma}}{f_i} \]
Macroscopic changes in state are at most weakly affected by language size
Thought experiment
1. Take the set of language changes in the sample
2. Assign a common population size N to each
3. Determine the likelihood of the set of changes within a microscopic model, P(N)
4. It shouldn’t matter (too much) what value of N is chosen in step 2
Blythe & Croft, in prep.
Looking for less than one order of magnitude variation in P(N) across four orders of magnitude in N
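The acceptance criterion in the thought experiment can be written down as a short check. This is only a sketch with hypothetical names and made-up likelihood profiles: a flat P(N) and one that falls off as 1/N, evaluated at population sizes spanning four orders of magnitude.

```python
import math

def orders_of_magnitude(values):
    """Spread of a set of positive values, in powers of ten."""
    return math.log10(max(values) / min(values))

def weakly_dependent_on_n(likelihoods):
    """True if P(N) varies by less than one order of magnitude."""
    return orders_of_magnitude(likelihoods) < 1.0

if __name__ == "__main__":
    # Hypothetical common population sizes N = 10^2 ... 10^6.
    population_sizes = [10 ** exponent for exponent in range(2, 7)]
    flat_profile = [1.0 for _ in population_sizes]
    one_over_n_profile = [1.0 / n for n in population_sizes]
    print(weakly_dependent_on_n(flat_profile))
    print(weakly_dependent_on_n(one_over_n_profile))
```

A likelihood that decays as 1/N spans four orders of magnitude over this range, so it fails the criterion, while an asymptotically flat one passes.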
                  Model A                   Model B        Model C
Parents           1                         2              1
Cycle mechanism   Biased mutation + noise   Novelty bias   Interaction between features
Free params       2                         2              2
Fixed params      0                         1              1
Tail of P(N)      Geometric                 Power (-2)     Power (-2)
Effect size 10x larger than observed
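\[ \frac{\partial}{\partial t} P(\vec{x}, t) = \sum_n a_n \hat{D}_n P(\vec{x}, t) + \frac{1}{2N} \sum_{i,j} \frac{\partial^2}{\partial x_i \partial x_j} x_i (\delta_{i,j} - x_j) P(\vec{x}, t) \]
N is here
a_n specify strengths of cognitive biases, assumed universal
For any statistic \( X(\vec{a}, \tau, N) = F(N\vec{a}, \tfrac{\tau}{N}) \sim F(N\vec{a}, 0) \)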
Asymptotically-flat P(N)
What if biases aren’t universal?
Consequences of scaling law
Each a priori unknown parameter contributes 1/N to tail of P(N)
Parameter needs to be known to within 0.1% to count as “known”
Blythe & Croft, in prep.
[Figure: agents in states 0 0 0 0 0 0 1 1 1 at time t, with the next generation at t+1; states 0, 1 and 3 with labels p00(σ) and p01(σ); σ = 0]
Cognitive universals generate variation in linguistic behaviour
Propagated by shared values associated with specific behaviour
Croft 2000
Other evidence
Rapid dialect formation
Baxter et al 2009
S-curve pattern of change
Blythe & Croft 2012
Origin of shared biases?
Local majority rule?
Memory?
Homophily?
Population-level change does not follow straightforwardly from individual-level change
Better understanding of the role of population size is essential