The Big Questions Need Multipurpose Portable Solutions
Laura A. Janda
CLEAR (Cognitive Linguistics: Empirical Approaches to Russian)
UiT The Arctic University of Norway

An inherent challenge

Quantitative Investigations in Theoretical Linguistics

BUT:
Theories don’t come equipped with quantitative methods.
Quantitative analysis doesn’t guarantee theoretical relevance.

A Big Question perspective

Big Questions: Transcend theory; interesting for all linguists

Theory: Helps to focus Big Questions

Operationalization: Facilitates quantitative methods

Overview
1. Big Questions
2. Theoretical perspective
3. Operationalization
4. Portable
5. Multipurpose
6. Examples
7. Infrastructure
8. Applications

1. Some Big Questions

What is the relationship between form and meaning?

What is the relationship between lexicon and grammar?

What is the structure of linguistic categories?

What is the structure of linguistic constructions?

2. Theoretical perspective: Cognitive linguistics

Minimal Assumption: language can be accounted for in terms of general cognitive strategies
• no autonomous language faculty
• no strict division between grammar and lexicon
• no a priori universals

Usage-Based: generalizations emerge from language data
• no strict division between langue and parole
• no underlying forms

Meaning is Central: holds for all language phenomena
• no semantically empty forms
• differences in behavior are motivated (but not specifically predicted) by differences in meaning

Big Questions focused by Cognitive Linguistics

What is the relationship between form and meaning?
How does form reflect meaning and vice versa?
Can we use difference in form as a measure of meaning?

What is the relationship between lexicon and grammar?
How do we account for meaning in grammar?
Can we use similar models for grammatical meanings?

What is the structure of linguistic categories?
What is the relationship between prototype and periphery?
Can we compare category structure across near-synonyms?

What is the structure of linguistic constructions?
Are constructions hierarchical or flat?
What is the relationship between constructions and fillers?

3. Operationalization: Linguistic profiles

Focused subsets of behavioral profiles (Firth 1957, Harris 1970, Hanks 1996, Geeraerts et al. 1999, Speelman et al. 2003, Divjak & Gries 2006, Gries & Divjak 2009)

Grammatical profiling: relationship between frequency distribution of forms and linguistic categories

Semantic profiling: relationship between meanings (semantic tags) and forms

Constructional profiling: relationship between frequency distribution of grammatical constructions and meaning

Radial category profiling: differences in the frequency distribution of uses across two or more near-synonyms

Collostructional profiling: relationship between a construction and the words that fill its slots

4. Portable

Linguistic profiles are portable
– across questions
– across theories
– across statistical models
– across languages

Linguistic profiles are a suite of methodological ideas that make it possible to approach Big Questions empirically from a variety of angles

Ideally results are also portable across platforms – open source, open access, available to all researchers

We will return to the issue of portability in 7. Infrastructure

5. Multipurpose

Quantitatively measured results yield real gains in our understanding of languages

These results can serve multiple purposes:
– resources for language learners and users
– (real, not statistical) machine translation
– documentation and revitalization for minority indigenous languages
– language policy

We will return to the multipurpose issue in 8. Applications

6. Examples

• Grammatical Profiles: TAM in Russian
• Semantic Profiles: “Empty” prefixes in Russian
• Constructional Profiles: SADNESS in Russian
• Radial Category Profiles: Ambipositions in North Saami

For each example we will identify:
• Big Questions
• Theoretical perspective
• Operationalization (Profiling) & statistical methods
• Portability
• Multipurpose applications

Grammatical Profiles: TAM in Russian

Janda, L. A. & Lyashevskaya, O. 2011. “Grammatical profiles and the interaction of the lexicon with aspect, tense and mood in Russian”. Cognitive Linguistics 22:4, 719-763.

Crash course in Russian TAM

Tense: Past vs. Non-Past
– Non-Past: Imperfective = Present tense vs. Perfective = Future tense

Aspect: Perfective (marked) vs. Imperfective (unmarked)
– All forms of all verbs express aspect
– “Aspectual pairs” = same lexical meaning, different aspect, e.g., pisat’ ‘write[imperfective]’ vs. napisat’ ‘write[perfective]’
– Aspectual pairs can be formed via both prefixation and suffixation (perepisat’ ‘rewrite[perfective]’ vs. perepisyvat’ ‘rewrite[imperfective]’)
– ≈1400 imperfective base stems form ≈2000 perfective aspectual partners using 16 prefixes
– These prefixes are traditionally assumed to be “empty”

Mood: imperative, infinitives in modal constructions

Grammatical Profiles: TAM in Russian

Big Questions:

What is the relationship between form and meaning?

➜ between verb inflection and grammatical meaning of aspect?

What is the relationship between lexicon and grammar?

➜ between lexical meaning of verbs and TAM?

Grammatical Profiles: TAM in Russian

Theoretical focus:

Can we measure the expression of aspect according to distribution of inflected forms?

Can we distinguish between prefixation vs. suffixation in formation of aspectual pairs?

Can we measure the attraction of lexical classes to grammatical categories?

Grammatical Profiles: TAM in Russian

Operationalization:

Grammatical profiles: frequency distribution of inflected forms
➜ Distribution of Russian verb forms according to subparadigm
➜ Distribution of Russian verbs according to subparadigm

Data: Approx. 6M verb forms from the Russian National Corpus (http://ruscorpora.ru/)

Statistics: Chi-square, Cramer’s V effect size, distribution plots

What is a grammatical profile?

Verbs have different forms:

eat      749 M
eats     121 M
eating   514 M
eaten    88.8 M
ate      258 M

The grammatical profile of eat
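A grammatical profile can be read as the relative frequency of each form in a verb's paradigm. The sketch below normalizes the counts listed for eat; it assumes that "M" stands for millions of hits, and the function name `grammatical_profile` is introduced here for illustration only.

```python
def grammatical_profile(counts):
    """Return each form's share of the total (relative frequency distribution)."""
    total = sum(counts.values())
    return {form: n / total for form, n in counts.items()}

# Counts from the slide, assumed to be in millions of hits.
eat_counts = {"eat": 749, "eats": 121, "eating": 514, "eaten": 88.8, "ate": 258}

profile = grammatical_profile(eat_counts)
for form, share in profile.items():
    print(f"{form:<7}{share:6.1%}")
```

Working with proportions rather than raw counts makes profiles of verbs with very different overall frequencies comparable.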

Grammatical Profiles of Russian Verbs

               Nonpast     Past        Infinitive  Imperative
Imperfective   1,330,016   915,374     482,860     75,717
Perfective     375,170     1,972,287   688,317     111,509

chi-squared = 947756
df = 3
p-value < 2.2e-16
effect size (Cramer’s V) = 0.399 (medium-large)
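The chi-square statistic and Cramer's V reported here can be recomputed from the aspect-by-subparadigm counts. The following is a plain-Python sketch, not the authors' original code; the function names are hypothetical, and the p-value is left out because it would require a chi-square distribution function from a statistics library.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def cramers_v(table):
    """Cramer's V: sqrt(chi2 / (N * (min(rows, cols) - 1)))."""
    stat, _ = chi_square(table)
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(stat / (n * k))

# Columns: Nonpast, Past, Infinitive, Imperative (counts from the table above).
tam = [
    [1_330_016, 915_374, 482_860, 75_717],    # Imperfective
    [375_170, 1_972_287, 688_317, 111_509],   # Perfective
]

stat, df = chi_square(tam)
print(f"chi-squared = {stat:.0f}, df = {df}, Cramer's V = {cramers_v(tam):.3f}")
```

With roughly 6M observations almost any difference is statistically significant, which is why the effect size is the more informative figure.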

Distribution of Russian verb forms according to subparadigm

Prefixation (dark) vs. suffixation (light): Statistically significant, BUT effect sizes too small (0.076 & 0.037)

Distribution of Russian verbs according to subparadigm: Imperfective verbs and their attraction to imperative

Over 200 outliers

Imperfective imperative “be doing X!”
• Polite: guest knows what to expect: razdevajtes’ ‘take off your coat’, sadites’ ‘sit down’
• Insistence: hearer is hesitant: stupajte ‘get going’, gljadite ‘look’, zabirajte ‘take’
• Insistence: hearer has not behaved properly (connection with negation): provalivaj ‘get out of here’, končaj ‘stop’, ne perebivaj ‘don’t interrupt’
• Polite requests: vyručajte ‘help’
• Kind wishes: vyzdoravlivajte ‘get well’
• Idiomatic: davajte posmotrim ‘let’s take a look’
• Idiomatic/culturally anchored: proščaj(te) ‘farewell’, soedinjajtes’ ‘unite’ (slogan), zapevaj ‘sing’ (army)

Grammatical Profiles: Findings

• Perfective verbs behave differently than imperfective verbs

• “Verb pairs” behave the same regardless of which type of morphology (prefixation vs. suffixation) is used to mark aspect

• We can identify exactly the verbs that are most attracted to various TAM combinations.

Grammatical Profiles: Portability

• Across issues:
– Grammatical profiling and gender stereotypes (Kuznetsova 2012)
• Across languages:
– Gives 96% resolution of perfective vs. imperfective for Old Church Slavonic verbs, as compared with Dostál 1954 (Eckhoff & Janda 2013)
– Planned study of grammatical profiles across 4 languages:
• Across researchers:
– All outlier verbs listed in Janda & Lyashevskaya 2011, data and code for Eckhoff & Janda 2013 on website

Problem: Janda & Lyashevskaya 2011 data not publicly archived yet

Grammatical Profiles: Multipurpose Applications

Pedagogical implications:
• Strategic combinations of verbs and subparadigms

Semantic Profiles: “Empty” prefixes in Russian

Janda, L. A. & Lyashevskaya, O. 2013. “Semantic Profiles of Five Russian Prefixes: po-, s-, za-, na-, pro-”. Journal of Slavic Linguistics 21:2, 211-258.

Semantic Profiles: “Empty” prefixes in Russian

Big Questions:

What is the relationship between form and meaning?

➜ ...between prefixes and meanings of verbs?

Are there any “empty” forms?

➜ Are prefixes empty as claimed?

Imperfective base        Prefixed perfective
sovetovat’ ‘advise’      posovetovat’ ‘advise’
varit’ ‘cook’            svarit’ ‘cook’
pisat’ ‘write’           napisat’ ‘write’
tverdet’ ‘harden’        zatverdet’ ‘harden’
gremet’ ‘thunder’        progremet’ ‘thunder’

Semantic Profiles: “Empty” prefixes in Russian

Theoretical focus:

Can we measure the relationship between prefixes and meanings of verbs?

➜ Distribution of prefixes vs. semantic groups of verbs

How do we show that “empty” forms aren’t really empty?

➜ Show that prefixes have different semantic behaviors

Semantic Profiles: “Empty” prefixes in Russian

Operationalization:

Semantic profiling: relationship between meanings (semantic tags) and forms

➜ Distribution of Russian verb prefixes vs. semantic tags

Data: 382 verbs with “empty” prefixes from the Exploring Emptiness database (http://emptyprefixes.uit.no/index.php), semantic tags independently assigned in the Russian National Corpus (http://ruscorpora.ru/)

Statistics: Chi-square, Cramer’s V effect size, Fisher Test

chi-square = 248, df = 12, p = 2.2e-16; Cramer’s V effect-size = 0.8

Attractions and repulsions measured by Fisher Test
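One way to read "attraction" and "repulsion" is to collapse the prefix-by-class table into a 2×2 table for each cell (this prefix vs. all others, this semantic class vs. all others) and run Fisher's exact test on it. The sketch below implements the two-sided test in plain Python. The counts are invented for illustration, and the collapsing step is an assumption about the procedure rather than something stated in the study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c

    def weight(k):
        # Unnormalized probability of a table whose top-left cell is k.
        return comb(row1, k) * comb(row2, col1 - k)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(k) for k in range(lo, hi + 1) if weight(k) <= observed)
    return extreme / comb(row1 + row2, col1)

def direction(a, b, c, d):
    """Compare the top-left cell with its expected count under independence."""
    expected = (a + b) * (a + c) / (a + b + c + d)
    if a > expected:
        return "attraction"
    if a < expected:
        return "repulsion"
    return "neutral"

# Hypothetical counts: verbs with one prefix vs. all other prefixes,
# inside vs. outside one semantic class.
a, b, c, d = 12, 8, 5, 25
p_value = fisher_exact_two_sided(a, b, c, d)
print(direction(a, b, c, d), p_value)
```

Because the weights are exact integers, the comparison against the observed table does not suffer from floating-point ties.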

Semantic Profiles: Findings

• Each prefix has a unique semantic profile
• Each prefix is attracted to and repulsed by a different set of semantic classes of verbs
• It is possible to establish meanings of prefixes and expectations for how prefixes combine with verbs

Semantic Profiles: Portability

All data, statistical code, lists of verbs available at: http://emptyprefixes.uit.no/semantic_eng.htm

Semantic Profiles: Multipurpose Applications

Pedagogical implications: We can design materials that reduce the burden of memorizing ≈2000 correct prefix-verb combinations

Constructional Profiles: SADNESS in Russian

Janda, L. A. & Solovyev, V. 2009. “What Constructional Profiles Reveal About Synonymy: A Case Study of Russian Words for sadness and happiness”. Cognitive Linguistics 20:2, 367-393.

Crash course in Russian case & SADNESS

Nouns are obligatorily case-marked

6 cases: Nominative, Accusative, Dative, Instrumental, Genitive, Locative

– All cases can appear with a preposition
– All cases except Locative can also appear without a preposition
– 70 constructions [(preposition) [NOUN]case]

SADNESS: 6 near-synonyms, no “umbrella term”
– grust’, melanxolija, pečal’, toska, unynie, xandra

Constructional Profiles: SADNESS in Russian

Big Questions:

What is the relationship between form and meaning?

➜ What is the relationship between words and grammatical constructions?

➜ What is the relationship between synonyms?

Constructional Profiles: SADNESS in Russian

Theoretical focus:

Can we measure the difference between synonyms in terms of distribution in grammatical constructions?

Constructional Profiles: SADNESS in Russian

Operationalization:

Constructional profiling: relationship between frequency distribution of grammatical constructions and meaning

➜ SADNESS words vs. distribution in [(preposition) [NOUN]case] constructions

Data: 500 sentences for each word from Russian National Corpus, Biblioteka Maksima Moškova

Statistics: Chi-square, Cramer’s V effect size, Hierarchical Clustering (squared Euclidean distance)
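Hierarchical clustering starts from pairwise distances between the constructional profiles. A minimal sketch of that first step, using squared Euclidean distance on relative frequencies, follows. The three profiles and their counts are made-up placeholders (word_a to word_c), not the corpus data, and a full analysis would pass the resulting distances to a clustering routine.

```python
from itertools import combinations

def to_proportions(counts):
    """Convert raw construction counts to relative frequencies."""
    total = sum(counts)
    return [n / total for n in counts]

def squared_euclidean(p, q):
    """Sum of squared differences between two equally long profiles."""
    return sum((x - y) ** 2 for x, y in zip(p, q))

# Hypothetical counts over four constructions (placeholders, not study data).
raw = {
    "word_a": [40, 30, 20, 10],
    "word_b": [38, 32, 18, 12],
    "word_c": [10, 15, 25, 50],
}
profiles = {word: to_proportions(counts) for word, counts in raw.items()}

# Distance for every unordered pair of words.
distances = {
    (w1, w2): squared_euclidean(profiles[w1], profiles[w2])
    for w1, w2 in combinations(profiles, 2)
}
closest = min(distances, key=distances.get)
print("first merge in an agglomerative clustering:", closest)
```

The pair with the smallest distance is the one an agglomerative procedure would merge first, which corresponds to the lowest join in a dendrogram.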

Chi-square = 730.35, df = 30, p < 0.0001, Cramer’s V = 0.305

‘Sadness’ Hierarchical Cluster (dendrogram; leaves in the order shown: pečal’, toska, xandra, melanxolija, grust’, unynie)

Constructional Profiles: Findings

Each synonym has a unique constructional profile

Some synonyms are closer together, others are farther apart

Constructional Profiles: Portability

• Across issues:
– Logistic regression analysis of Russian gruzit’ ‘load’ with 3 “empty” prefixes across Locative Alternation constructions (Sokolova 2012, Sokolova, Janda and Lyashevskaya 2012)
– Analysis of aspectual pairs formed by prefix pro- (Kuznetsova 2012)
• Across languages:
– North Saami anaphoric possessive constructions: reflexive pronoun vs. possessive suffix (forthcoming)
• Data published in Janda & Solovyev article; data and code for gruzit’ on website.

Problem: Janda & Solovyev 2009 – I was using SPSS and don’t have access anymore

Constructional Profiles: Multipurpose Applications

Pedagogical implications: Teach relevant constructions with near-synonyms

Possible implication for machine translation: Lexical selection informed by constructional profiles

Radial Category Profiles: Ambipositions in North Saami

Antonsen, L., Janda, L. A., & Baal, B. A. B. 2012. “Njealji davvisámi adposišuvnna geavahus” [“The Use of Four North Saami Adpositions”]. Sámi dieđalaš áigečála 2012, v. 2. 32 pp.

Janda, L. A., Antonsen, L. & Baal, B. A. B. Forthcoming. “A Radial Category Profiling Analysis of North Sámi Ambipositions”. High Desert Linguistics Society Proceedings, Volume 1. 11 pp.

Crash course in North Saami ambipositions

Unusually large number of adpositions that can appear as both prepositions and postpositions; they always use Genitive case

1. a. miehtá dálvvi         b. dálvvi miehtá
      [over winter-G]          [winter-G over]
      ‘during the winter’
2. a. čađa áiggi            b. áiggi čađa
      [through time-G]         [time-G through]
      ‘through time’
3. a. rastá joga            b. joga rastá
      [across river-G]         [river-G across]
      ‘across the river’
4. a. maŋŋel soađi          b. soađi maŋŋel
      [after war-G]            [war-G after]
      ‘after the war’

5 = North Saami

Radial Category Profiles: North Saami ambipositions

Big Questions:

What is the relationship between form and meaning?

➜ What is the relationship between position (preposition vs. postposition) and meaning?

What is the influence of majority languages (prepositional languages in West vs. postpositional languages in East)?

Is there a relationship between frequency of ambipositions and their use to distinguish meaning?

Radial Category Profiles: North Saami ambipositions

Theoretical focus:

Can we measure the difference between uses in preposition vs. postposition?

Can we model the meanings in terms of a radial category?

Can we measure dialectal differences?

Radial Category Profiles: North Saami ambipositions

Operationalization:

Radial category profiling: differences in the frequency distribution of uses across two or more near-synonyms

➜ Distribution across uses in radial category for preposition vs. postposition

Data: 100+ sentences for each position from 10M word newspaper corpus, plus exx. from literature, Bible translation

Statistics: Chi-square, Cramer’s V effect size

Radial categories: miehtá ‘over’ in newspapers

preposition: time 9%, extent 79%, motion 12%
postposition: time 95%, extent 5%

chi-squ = 170, df = 2, p < 2.2e-16; Cramer’s V = 0.85

Distribution of adpositions

χ² = 129.7, df = 2, p < 2.2e-16; Cramer’s V = 0.48

Radial Category Profiles: Findings

There is a relationship between meaning and position

Prevailing trends in majority languages do influence use of position

There seems to be a typological relationship between frequency of ambipositions and their use to distinguish meaning

Languages with few ambipositions (Germanic, Russian) do not use position distinctively

Languages with more ambipositions use them in more complex ways (North Saami > Finnish, Estonian)

Radial Category Profiles: Portability

• Across issues and languages:
– Russian prefixes vy- vs. iz- (Nesset, Endresen, Janda 2011)
– Russian prefixes o-/ob-/obo- (Baydimirova [Endresen] 2010)
• Data and code published on website.

Problem: The North Saami data and code are on a website that requires users to navigate in North Saami and Norwegian and is probably hard to find.

Radial Category Profiles: Multipurpose Applications

Pedagogical implications: Teach ambipositions with relevant meanings and nouns

Improvements to constraint grammar analyzer: Improves linguistic analysis and language technology tools, which are crucial to preserving and revitalizing the language

7. Infrastructure

Data management issues: Remember those problems with portability?

– Data analyzed in proprietary programs
– Data not publicly available or hard to navigate

http://www.youtube.com/watch?v=N2zK3sAtr-4

TROLLing

Tromsø Repository of Language and Linguistics

• International archive of data and code
• All items open-source, open access
• Searchable metadata
• Verify results, see how to implement various statistical models
• Housed at UiT library
• Connected to CLARIN (Common Language Resources and Technology Infrastructure, a networked federation of European data repositories)

8. Applications

A model for applications: http://giellatekno.uit.no/english.html