Meaning as a Selective Pressure in the Evolution of Linguistic Structure Mónica Tamariz [email protected] Language as an Evolutionary System: A Multidisciplinary Approach. Edinburgh, 13 July 2010


Meaning as a Selective Pressure in the Evolution of Linguistic Structure

Mónica Tamariz [email protected]

Language as an Evolutionary System: A Multidisciplinary Approach. Edinburgh, 13 July 2010

LANGUAGE STRUCTURE

Social network structure and dynamics

Theory of mind, Conformity bias

Memory, Processing, Perception, Production

Parsimony, Minimum effort

Information/noise

Structure of meanings

Communication & thought

Human fitness

Natural selection

Overview

•  Correlation: Form structure systematically reflects meaning structure (Synchronic corpus study)
•  Causality: The structure of forms gradually comes to reflect meaning structure (Diachronic corpus of artificial miniature languages)

1. Form-meaning systematicity in the lexicon

2. The evolution of frequency distributions in language FORMS

3. FORMS and MEANINGS: The evolution of compositionality

4. Evolutionary dynamics

1. Correlation between FORM and MEANING structure

Systematicity

•  The relationship between two spaces is systematic if the structure of one space reflects the structure of the other
•  Therefore knowing about one structure provides us with some knowledge about the other

Systematicity in language

•  Lexical items systematically reflect semantic meaning
•  Word order, inflexion, derivation etc. reflect grammatical meaning

present   past      3rd person
talk      talked    talks
play      played    plays
study     studied   studies

Systematicity in the lexicon

Onomatopoeia & sound symbolism

Glisten Glitter Glow Glimmer

Snout Sniff Snore Snout Sniffle Snarl

A systematic lexicon

Words that sound similar tend to have similar meanings

Functions of systematicity:
•  Help learn/understand new items
•  Allow creativity and generalisation

Meaning space

Distributional similarity: captures syntax & semantics (Landauer & Dumais 1997, MacDonald 2000)

Phonological similarity

[Figure: the example words kasa, kita, lima and nene arranged in two similarity spaces]

Measuring systematicity

DATA

•  A subset of the spoken part of the BNC (Shillcock et al. 2001): 1733 most frequent monosyllabic monomorphemic words
•  Three subsets of a Spanish speech corpus (Tamariz 2008), of freq >= 20:
   -  252 CVCV words
   -  146 CVCCV words
   -  148 CVCVCV words
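A minimal sketch of how a form-meaning correlation of this kind could be measured. It assumes edit distance as the form measure, Euclidean distance between hypothetical meaning vectors as the meaning measure, and a permutation (Mantel-style) test for significance. The cited studies may have used different measures; the names `edit_distance` and `systematicity` are illustrative only.

```python
import itertools
import random


def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def euclidean(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v)) ** 0.5


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def systematicity(words, meanings, n_perm=1000, seed=0):
    """Correlate pairwise form distances with pairwise meaning distances.

    Returns (r, p): the observed correlation and the share of random
    word-meaning reassignments that correlate at least as strongly.
    """
    pairs = list(itertools.combinations(range(len(words)), 2))
    form_d = [edit_distance(words[i], words[j]) for i, j in pairs]

    def r_for(order):
        mean_d = [euclidean(meanings[order[i]], meanings[order[j]]) for i, j in pairs]
        return pearson(form_d, mean_d)

    order = list(range(len(words)))
    observed = r_for(order)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        if r_for(order) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy usage: the example words from the slides with made-up 2-D meaning vectors.
words = ["kasa", "kita", "lima", "nene"]
meanings = [(0.0, 0.0), (0.2, 0.1), (1.0, 0.8), (2.0, 2.1)]  # hypothetical values
r, p = systematicity(words, meanings)
```

A positive `r` with a small `p` would correspond to "words that sound similar tend to have similar meanings" for the toy data.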

Results

(Shillcock et al. 2001, Tamariz 2008)

Words that sound similar do tend to have similar meanings

However, too much systematicity may pose a problem for comprehension!

Systematicity vs. disambiguation

kiri

kili kini

kisi

The phonological correlates of systematicity

Which elements of word form systematically relate to word meaning?

In CVCV, CVCCV and CVCVCV words, measure the correlation between meaning similarity and form similarity in terms of:

-  Consonants, vowels, stress, ...

E.g. Do words that share the first consonant tend to have similar meanings?

Results: The impact of phonological parameters on systematicity

[Figure: impact of consonant, vowel and stress parameters on systematicity for CVCV words. All values p < 0.01 except where stated.]

Impact > 0 Words sharing “tc” tend to have similar meanings

(Tamariz 2008)

Impact < 0 Words sharing “v1” tend to have DISSIMILAR meanings

Results: The impact of phonological parameters on systematicity

(Tamariz 2008)

[Figure: impact of consonant, vowel and stress parameters for CVCV, CVCCV and CVCVCV words. All values p < 0.01 except where stated.]

•  Consonants: mostly positive impact
•  Vowels: mostly negative impact
•  Stressed vowel in the penultimate syllable: negative impact
•  Other stress: positive impact

One interpretation of results

The structure of the lexicon is an adaptation to two opposed pressures. Aspects of form selectively respond to pressures:

•  To be systematic (for processing and learning): consonant structure, stress pattern
•  To avoid ambiguities derived from systematicity: vowel structure, esp. stressed vowel in penultimate syllable

(For Spanish; other languages may have found different solutions to this conflict)

One interpretation of results

The structure of the lexicon is an adaptation to two opposed pressures, and those pressures originate in the structure of meanings:

-  Coarse-grained categories (consonants, stress)
-  Fine-grained distinctions (stressed penultimate vowel)

2. Frequency distributions in language FORMS

N-gram frequency

(Tamariz, in prep)

Frequency distribution of all 1-, 2- and 3-grams in a set of words

RANDOM WORDS

risa tio suerte trabajo hotel enchufe caballo estudia joven dia fecha espana manos libro leo cuidado encanta azul autobus folleto imprimir toma

Expect a power law distribution: signature of natural languages, e.g. Zipf's law, etc.

VERB PARADIGM

amo amas ama ame amaste amo amaba amabas amaba amare amaras amara amaria amarias amaria amase amases amase amara amaras amara ame

[Figure: n-gram frequency distributions for the two word sets. Marked frequency values: 2, 4, 6, 18, 24.]

Frequency signature of structure
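A minimal sketch of the counting step behind such a distribution. It assumes that every occurrence of every 1-, 2- and 3-gram is counted, without word-boundary markers, and that the plotted quantity is the number of distinct n-grams at each frequency. These are assumptions, and the function names are illustrative, not taken from the study.

```python
from collections import Counter


def ngram_counts(words, max_n=3):
    """Count every occurrence of every 1..max_n-gram across the word set."""
    counts = Counter()
    for word in words:
        for n in range(1, max_n + 1):
            for i in range(len(word) - n + 1):
                counts[word[i:i + n]] += 1
    return counts


def frequency_spectrum(words, max_n=3):
    """Map each frequency value to the number of distinct n-grams that have it."""
    return Counter(ngram_counts(words, max_n).values())


# Usage with the verb paradigm listed on the slide.
verb_paradigm = (
    "amo amas ama ame amaste amo amaba amabas amaba amare amaras amara "
    "amaria amarias amaria amase amases amase amara amaras amara ame"
).split()
spectrum = frequency_spectrum(verb_paradigm)
for freq in sorted(spectrum):
    print(freq, spectrum[freq])
```

In a highly structured set such as a paradigm, many n-grams share the same few frequency values, which is the kind of peak the slide refers to as a frequency signature.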

N-gram frequency

(Tamariz, in prep)

Frequency distribution of all 1-, 2- and 3-grams in a set of words

RANDOM WORDS

game order would that doing shit were of topping than don but and of don the it the is it eats for be

1000 NUMBER WORDS

twentytwo twentythree twentyfour twentyfive twentysix twentyseven twentyeight twentynine thirty thirtyone thirtytwo thirtythree thirtyfour thirtyfive thirtysix thirtyseven thirtyeight thirtynine forty fortyone fortytwo fortythree fortyfour

[Figure: n-gram frequency distributions for the two word sets. Marked frequency values: 100, 300, 80, 10, 900, 1500.]

(aside)

•  Each chemical element emits in specific frequencies
•  Spectra used to identify elements

[Figure: emission spectra labelled Sun, H, He, Hg and U, shown next to the n-gram frequency values 6, 18, 24]

•  Special subsets of a language show specific frequencies
•  Spectra used to identify the quantitative structure of a sample?
   -  Tell decimal from other numeral systems?
   -  Classify morphological paradigms?

Adaptation of linguistic form to meaning?

•  Maybe the result of adaptation… maybe not
•  Look at the process of adaptation

Evolution of the n-gram freq distr

•  Data from Kirby, Cornish & Smith (2008)
•  8 diffusion chains of miniature artificial languages
•  Distinct, structured meaning space:
   -  9 SIMPLEX MEANINGS: 3 shapes, 3 colours, 3 motions
   -  27 COMPLEX MEANINGS: all the possible combinations of the above
•  Generation 0: random signals
•  Total 10 generations

[Figure: examples from Kirby, Cornish & Smith (2008), with the signals kimako, koni, kanige, kuni, winige and komako]
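The meaning space described above can be sketched as a Cartesian product. The shape and motion labels below are hypothetical placeholders; only the colour names black, blue and red appear elsewhere in these slides.

```python
from collections import Counter
from itertools import combinations, product

SHAPES = ("shape_a", "shape_b", "shape_c")      # hypothetical labels
COLOURS = ("black", "blue", "red")              # colour names used in the slides
MOTIONS = ("motion_a", "motion_b", "motion_c")  # hypothetical labels

# 9 simplex meanings and the 27 complex meanings built from them.
simplex_meanings = SHAPES + COLOURS + MOTIONS
complex_meanings = list(product(SHAPES, COLOURS, MOTIONS))

# How often each simplex value, and each pair of values, occurs in the space.
value_counts = Counter(v for m in complex_meanings for v in m)
pair_counts = Counter(p for m in complex_meanings for p in combinations(m, 2))

print(len(simplex_meanings), len(complex_meanings))
print(set(value_counts.values()), set(pair_counts.values()))
```

Each simplex value occurs in 9 of the 27 complex meanings, and each pair of values from two different dimensions occurs in 3 of them.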

Evolution of the n-gram freq distr

Kirby, Cornish & Smith (2008)

•  We know the meaning space
•  We know the whole history of the language

Chain 7 Gen 0: one of the initial random languages

kinimapi miwimi miwiniku pikuhemi mihe gepihemi nihepi wikima wimaku wikuki nipi pinipi kimaki winige kunige wigemi nipikuge miniki kikumi wige kihemiwi pimikihe kinimage miki mahekuki hema gepinini

Chain 7 Gen 10: a few generations later

miniku tupin tupim miniku tupin tupim miniku tupin tupim poi poi poi poi poi poi poi poi poi tuge tuge tuge tuge tuge tuge tuge tuge tuge

Evolution of the n-gram freq distr

(Tamariz, in prep)

Data from 4 different chains of languages

[Figure: three panels (Gen 0, Gen 3, Gen 10). x-axis: n-gram freq (1 to 30); y-axis: deviation from expected freq (-15 to 15). Frequency values labelled in the Gen 10 panel: 9, 12, 15, 18, 26, 27, 28, 29.]

Signature of "x3" or "x9" structure

Evolution of the n-gram freq distr

Kirby, Cornish & Smith (2008)

FILTERED CONDITION (prevents homonymy)

Chain 1 Gen 0: one of the initial random languages

huhunigu wakiki nihu kekewa huwa kowagu muwapo wako muko kemuniwa pokikehu niguki komuhuke hukike kokihuko powa hukeko kokeguke kihupo waguhuki koni kopo ponikiko kiwanike hukinimu pohumu kimu

Chain 1 Gen 4: a few generations later

winekuki winukuki wikekuki winekiko winekiko wikiko wineko wuneko wikeko kunkuki hunekuki kunekuki kunkiko hunekiko kunekiko kuneko huneko kuneko ponekuki punekuki ponekuki pokiko puniko pokiko poneko puneko poneko

Evolution of the n-gram freq distr

(Tamariz, in prep)

Data from 4 different chains of languages

[Figure: three panels (Gen 0, Gen 5, Gen 10). x-axis: n-gram freq (1 to 25, 1 to 40 and 1 to 46 respectively); y-axis: deviation from expected freq (-15 to 15). Frequency values labelled in the Gen 10 panel: 9, 13, 18, 27, 33, 21, 22.]

Signature of "x9" structure

Evolution of the n-gram frequency distribution

(Tamariz, in prep)

[Figure: two Gen 10 panels, one for the FILTERED CONDITION and one for the UNFILTERED CONDITION. x-axis: n-gram freq; y-axis: deviation from expected freq (-15 to 15). Labelled frequency values: 9, 12, 15, 18, 26, 28, 27, 29 in one panel and 9, 13, 18, 27, 33, 21, 22 in the other.]

The two conditions provide convergent evidence of adaptation of forms to the quantitative structure of the meaning space

3. Adding MEANING to the picture

Compositionality

(Cornish, Tamariz & Kirby, 2010)

"In a compositional system, the meaning of a complex form is a function of the meanings of the components of the form plus the rules used to combine them"

[Figure: the Chain 1 Gen 0 and Chain 1 Gen 4 languages side by side]

•  Kirby, Cornish & Smith quantified increase in structure
•  Intuitively compositional
•  This knowledge helps segment linguistic forms into meaningful units
•  Then we obtain the regularity of the mappings between each segment (word beginning/middle/end) and each meaning (shape/motion/colour)

Chain 1 Gen 4, segmented:

wi-ne-kuki   wi-nu-kuki   wi-ke-kuki
wi-ne-kiko   wi-ne-kiko   wi-kiko
wi-ne-ko     wu-ne-ko     wi-ke-ko
ku-n-kuki    hu-ne-kuki   ku-ne-kuki
ku-n-kiko    hu-ne-kiko   ku-ne-kiko
ku-ne-ko     hu-ne-ko     ku-ne-ko
po-ne-kuki   pu-ne-kuki   po-ne-kuki
po-kiko      pu-niko      po-kiko
po-ne-ko     pu-ne-ko     po-ne-ko
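One possible way to put a number on the regularity of a segment-meaning mapping is sketched below. It is an entropy-based stand-in chosen for illustration and is not claimed to be the RegMap measure used in the cited work; `conditional_entropy`, `predictability` and `mapping_regularity` are hypothetical names.

```python
from collections import Counter
from math import log, sqrt


def conditional_entropy(pairs):
    """H(X|Y) in nats for a list of (x, y) observations."""
    joint = Counter(pairs)
    y_counts = Counter(y for _, y in pairs)
    n = len(pairs)
    return -sum(c / n * log(c / y_counts[y]) for (_, y), c in joint.items())


def predictability(pairs):
    """1 - H(X|Y) / log|X|, clamped to [0, 1]; 1 means y fully determines x."""
    n_x = len({x for x, _ in pairs})
    if n_x < 2:
        return 1.0  # a single value of x is trivially predictable
    return min(1.0, max(0.0, 1.0 - conditional_entropy(pairs) / log(n_x)))


def mapping_regularity(segments, meanings):
    """Geometric mean of predictability in both directions (assumed definition)."""
    forward = list(zip(segments, meanings))   # segment given meaning
    backward = list(zip(meanings, segments))  # meaning given segment
    return sqrt(predictability(forward) * predictability(backward))


# Toy usage: one segment variant per colour gives a perfectly regular mapping.
regular = mapping_regularity(
    ["wi", "wi", "ku", "ku", "po", "po"],
    ["black", "black", "blue", "blue", "red", "red"],
)
# Segment variants unrelated to the meaning give no regularity.
unrelated = mapping_regularity(["a", "a", "b", "b"], ["x", "y", "x", "y"])
```

Computing such a score for each of the three word segments against each of the three meaning dimensions would give nine values per generation.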

Compositionality

(Tamariz, in prep)

[Figure: three panels (WORD BEGINNING, WORD MIDDLE, WORD END). x-axis: Generation (0 to 10); y-axis: RegMap (0.0 to 1.0); one line each for SHAPE, MOTION and COLOUR.]

Integrate 9 graphs into a measure of compositionality

Compositionality

Division of labour: Adaptation of word segments to capture different meanings

[Figure: WHOLE LANGUAGE (3 SEGMENTS x 3 MEANINGS). x-axis: Generation (0 to 10); y-axis: Compositionality (0.0 to 0.7).]

4. Evolutionary dynamics

N-gram frequencies

Neutral evolution or selection?

3 adaptive niches?

[Figure: line charts of frequency count (y-axis) against generation (x-axis, 0 to 10), one line per n-gram. Panel titles: second consonant, third consonant, 1st unigram, final trigram, 1st three consonants, 2nd syllable. The legend entries are not recoverable.]

The evolution of meaningful segments

Word-initial bigrams, KCS'08 chain 1

[Figure: word-initial bigrams (among them po, pu, pi, hu, ho, ku, ko, ki, wi, wa, wu, ni, ke, mu) with their counts at each generation. x-axis: Generations (0 to 10).]

(Cornish, Tamariz & Kirby, 2010)
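As a small illustration of the counts behind such a figure, the sketch below tallies word-initial bigrams for the Chain 1 Gen 4 language listed earlier in these slides. The function name is illustrative.

```python
from collections import Counter

# Chain 1, generation 4 (filtered condition), as listed in the slides.
CHAIN1_GEN4 = (
    "winekuki winukuki wikekuki winekiko winekiko wikiko wineko wuneko wikeko "
    "kunkuki hunekuki kunekuki kunkiko hunekiko kunekiko kuneko huneko kuneko "
    "ponekuki punekuki ponekuki pokiko puniko pokiko poneko puneko poneko"
).split()


def initial_bigram_counts(words):
    """Count the first two characters of each word."""
    return Counter(word[:2] for word in words)


counts = initial_bigram_counts(CHAIN1_GEN4)
print(counts.most_common())
# wi: 8, ku: 6, po: 6, hu: 3, pu: 3, wu: 1
```

Repeating the count for every generation of a chain gives one frequency trajectory per bigram.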

•  Replication - memorability

•  Variation - recombination

•  Selection - 3 meaning “niches” - vs. directed mutation

Units of evolution

•  Form frequency structure and form-meaning regularity reveal high-fitness (memorable, replicable) units

-  Perceptually salient

-  Distinct from each other

-  Meaningful

Chain 1 Gen 4

wi-ne-kuki   wi-nu-kuki   wi-ke-kuki
wi-ne-kiko   wi-ne-kiko   wi-kiko
wi-ne-ko     wu-ne-ko     wi-ke-ko
ku-n-kuki    hu-ne-kuki   ku-ne-kuki
ku-n-kiko    hu-ne-kiko   ku-ne-kiko
ku-ne-ko     hu-ne-ko     ku-ne-ko
po-ne-kuki   pu-ne-kuki   po-ne-kuki
po-kiko      pu-niko      po-kiko
po-ne-ko     pu-ne-ko     po-ne-ko

wi = black     hu / ku = blue     po / pu = red

kuki, kiko, ko = (glosses missing from the extracted text)

Units of evolution

•  Form frequency structure and form-meaning regularity reveal high-fitness (memorable, replicable) units (units of evolution with respect to meaning)
   -  Perceptually salient
   -  Distinct from each other
   -  Meaningful

In natural language:

Meanings: Semantic & Grammatical, Social prestige, Group identity, Attitude, …

Forms: Phonemes, Words, Constructions, Intonation, Accent, Volume, …

Summing up:

•  Evidence for
   –  FORM STRUCTURE REFLECTING MEANING STRUCTURE in natural language
   –  RESULTS FROM ADAPTATION from random to systematic form-meaning mappings
   –  EVOLUTIONARY dynamics
      Replication
      Variation
      Drift
      Selection
      Units of evolution