Identity's identities: An empirical study of the distributional effects of polysemy

Aalborg Languages and Linguistics Research Group Seminar on Language and Identity

Kim Ebensgaard Jensen, CGS, Aalborg University


Introduction

'Identity' is polysemous and has a number of functions:

'The informant, whose identity was protected, said that he or she was involved with Lowery and another man …' (COCA 2011 NEWS AssocPress) - [NAME, INFORMATION, BACKGROUND]

'Egan has been highly praised for her searching and unconventional narratives about modern angst and identity.' (COCA 2001 NEWS AssocPress) - [PERSONALITY]

'...it is not always biodiversity per se that explains these relationships, because the identity and biology of the species involved can influence the outcome of the interactions' (COCA 2011 ACAD Bioscience) - [CHARACTERISTICS]

'...but that she – felt so invested in her identity as a mother' (COCA 2011 SPOK CNN_Behar) - [SOCIAL ROLE]

Introduction

This study provides an analysis of the distributional behavior of each sense of 'identity' (i.e. a behavioral profile [Gries 2010, fc: §2]).

Towards an understanding of the lexical concept(s) of IDENTITY in (American) English.

Towards an understanding of the concept of IDENTITY as such in (American) Anglophone culture(s).

Introduction

DISCLAIMER!

This work is far from complete and has not yet been fully error checked – the results should be taken with a grain of salt and, if anything, merely as documentation of the method and research work in progress!

Outline

The complexity of identity
Polysemy and distribution
Data and method
    Data
    Behavioral profiling
Findings

The complexity of identity

The lexeme 'identity' and the concept it covers are tricky. As Fearon (1999) points out, 'identity' has a number of specialized uses in the humanistic and social sciences. On top of that, the lexeme figures in everyday discourse and, of course, other more specialized discourses.

The complexity of identity

The problem is that

Our present idea of "identity" is a fairly recent social construct, and a rather complicated one at that. Even though everyone knows how to use the word properly in everyday discourse, it proves quite difficult to give a short and adequate summary statement that captures the range of its present meanings. (Fearon 1999: 4)

The complexity of identity

And...

Given the centrality of the concept to so much recent research – and especially in social science where scholars take identities both as things to be explained and things that have explanatory force – this amounts almost to a scandal. At a minimum, it would be useful to have a concise statement of the meaning of the word in simple language that does justice to its present intension. (Fearon 1999: 4)

The complexity of identity

'Identity' is simply caught in the reality of language. The lexical concept IDENTITY is, in reality, a set of lexical concepts which are associated with a number of different functions and different contexts. Consequently, the 'short and adequate' statement that Fearon (1999: 4) calls for is ultimately impossible if it is to do any justice to 'identity' in all its aspects of use.

However, it is possible to analyze the use of 'identity' and to get an overview of the lexical concepts it covers and how they behave in actual language use, deploying techniques of analysis and description from corpus linguistics.

Polysemy and distribution

Polysemy: when a lexical or constructional item is associated with two or more interrelated senses.

The senses are, in the perspective of cognitive linguistics, organized in prototype categorial networks (Geeraerts 1997)

Polysemy and distribution

Meaning cannot be directly observed, but “the distributional characteristics of a linguistic expression reveal many if not most of its semantic and functional properties” (Gries 2012: 57)

Polysemy and distribution

The senses of a polysemic item may be reflected in different distributional patterns.

For instance, the INGESTION sense of 'feed' is reflected in a preference for intransitive contexts, and the PROVIDE WITH FOOD / MAKE EAT sense is reflected in a preference for transitive contexts.

Data and method

Data:

Source: COCA (2013)

2011 section

Academic texts, newspapers, magazines, fiction, speech

Query: identity, identities, identity's, identities'

Hits: 1330 (no genitives found)
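As a rough illustration of the query above, the four word forms can be matched with one regular expression. This is a minimal sketch in Python, not the COCA search interface; the names `IDENTITY_FORMS` and `count_forms` are hypothetical, and the sample sentence is invented for the example.

```python
import re
from collections import Counter

# Hypothetical pattern for the four queried forms:
# identity, identities, identity's, identities'
IDENTITY_FORMS = re.compile(
    r"\bidentit(?:y's|ies'|y|ies)(?![A-Za-z])", re.IGNORECASE
)

def count_forms(text):
    """Count each matched word form (lower-cased) in a text."""
    return Counter(m.group(0).lower() for m in IDENTITY_FORMS.finditer(text))

# Invented sample text, not taken from the corpus
sample = "Her identity and their identities differ; the identity's role is unclear."
print(count_forms(sample))
```

Note that a plural followed by a closing single quotation mark would be counted as a genitive plural by this pattern, so genitive hits would still need manual inspection.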

Data and method

A behavioral profile is a fine-grained analysis of the distributional patterns (a.k.a. behavior) of a lexical item based on the assumption

that distributional similarity reflects, or is indicative of, functional similarity; our understanding of functional similarity is rather broad, i.e., encompassing any function of a particular expression, ranging from syntactic over semantic to discourse-pragmatic. (Berez & Gries 2009: 159)

Data and method

Three steps of BP

1) Qualitative analysis
    Identification of senses
    ID tagging

2) Quantification of ID tags – using Gries (2010)

3) Evaluation of quantitative data – using Gries (2010)

Towards a behavioral profile: senses and ID tags

Senses (will definitely have to be reworked!)

• DISTINCT PERSONALITY (DistPersonality)
• INDIVIDUAL CHARACTERISTICS (IndCharacteristics)
• IDENTITY OPERATOR (IdenOperator)
• GROUP MEMBERSHIP/ALIGNMENT (GroupMemb)
• HANDLE OR USERNAME (Handle)
• NAME AND BACKGROUND INFORMATION (NameBack)
• SELF-PERCEPTION (Self)
• GEOGRAPHICAL BELONGING (GeoBel)
• UNIQUENESS (Unique)
• STATE OF EXISTENCE (SoExist)
• PERCEIVED/ASSIGNED IDENTITY (Perce)
• DEFINING FEATURE (DefFeat)
• SOCIAL ROLE (SocRole)
• GENDER AND SEXUAL ORIENTATION (GenderSex)
• CULTURAL DEFINITION OF SOMEONE/SOMETHING (Culture)

Interestingly, no instances of the SAMENESS sense were found.

Towards a behavioral profile: senses and ID tags

ID tagging: Each instance of 'identity' was assigned ID tags after identification of its sense. Each ID tag is divided into levels (a level specifies a category within the ID tag). Some examples of ID tags and levels:

• Number (plural, singular)
• Determiner (definite article, indefinite article, indefinite pronoun, possessive pronoun, zero etc.)
• Semantics of premodifier (ethnicity, ideology, gender, temporality etc.)
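One way to picture the relation between instances, ID tags and levels is as a small record type. The following is a minimal sketch in Python under my own assumptions: `TaggedInstance` and `levels_used` are hypothetical names, and the two annotations are illustrative readings of the corpus examples 'her identity as a mother' and 'modern angst and identity' from the introduction, not entries from the actual dataset.

```python
from dataclasses import dataclass, field

@dataclass
class TaggedInstance:
    """One corpus hit of 'identity': its sense plus one level per ID tag."""
    sense: str
    tags: dict = field(default_factory=dict)  # ID tag -> level

def levels_used(instances, id_tag):
    """Collect the levels observed for one ID tag across all instances."""
    return {inst.tags[id_tag] for inst in instances if id_tag in inst.tags}

# Illustrative annotations (assumed, not the author's own coding)
hits = [
    TaggedInstance("SocRole", {"Number": "singular", "Determiner": "possessive pronoun"}),
    TaggedInstance("DistPersonality", {"Number": "singular", "Determiner": "zero"}),
]
print(levels_used(hits, "Determiner"))
```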

Towards a behavioral profile: senses and ID tags

ID tagging:

Syntax:
• Syntactic function
• Determiner
• Word class of premodifier
• Postmodifier
• Diathesis

Morphology:
• Number
• Function in nominal compound
• Lexeme in modifier in nominal compound

Semantics:
• Semantics of prepositional complement in postmodifying prepositional phrase
• Semantics of premodifier

Collocation:
• Lexeme in head postmodified by 'identity'
• Lexeme in main verb when 'identity' is subject
• Lexeme in main verb when 'identity' is passive subject, object or similar syntactic function

Discourse-pragmatics:
• Textual function
• Speech act
• Domain


Quantification of ID tags

After being assigned to identified senses, the ID tags are quantified.

Quantification of ID tags is essentially the identification of association patterns - “the systematic ways in which linguistic features are used in association with other linguistic and non-linguistic features” (Biber et al. 1998: 5).

The result is the behavioral profile of 'identity'. I used BehavioralProfiles 1.01 (Gries 2010).
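The quantification step can be pictured as turning tagged instances into within-sense relative frequencies. The sketch below is a Python illustration under that assumption and is not the BehavioralProfiles program itself; `behavioral_profile` is a hypothetical name and the toy observations are invented.

```python
from collections import Counter, defaultdict

def behavioral_profile(observations):
    """Map each sense to {level: within-sense relative frequency}.

    observations: iterable of (sense, level) pairs for one ID tag.
    """
    counts = defaultdict(Counter)
    for sense, level in observations:
        counts[sense][level] += 1
    return {
        sense: {level: n / sum(levels.values()) for level, n in levels.items()}
        for sense, levels in counts.items()
    }

# Invented toy data: (sense, syntactic function) pairs
toy = (
    [("Handle", "Od")] * 3 + [("Handle", "S")]
    + [("Self", "PoM-PC")] * 2 + [("Self", "Od")] * 2
)
profile = behavioral_profile(toy)
print(profile["Handle"])  # {'Od': 0.75, 'S': 0.25}
```

Because the frequencies are relative to each sense, senses with very different numbers of hits can be compared on the same scale.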

Quantification of ID tags

Output:

Work in progress – not fully error-checked as you can see :-S

Behavioral profile

ID tag: discourse-pragmatics – domain

          Culture       DistPersonality  GenderSex     GeoBel        GroupMemb     NameBack
academic  0,8983050847  0,6373626374     0,7272727273  0,6517857143  0,8246268657  0,1639344262
fiction   0,0338983051  0,0659340659     0,0519480519  0,0446428571  0,0111940299  0,2090163934
magazine  0,0169491525  0,1318681319     0,0649350649  0,1607142857  0,0858208955  0,1557377049
news      0,0169491525  0,0695970696     0,025974026   0,1071428571  0,0634328358  0,2336065574
spoken    0,0338983051  0,0952380952     0,1298701299  0,0357142857  0,0149253731  0,237704918
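The figures in the domain table read as within-sense proportions: each column should sum to roughly 1. A minimal Python sketch of that check follows; `parse_profile` is a hypothetical helper, the leading `domain` header cell is added only to make the table parseable, and the decimal commas are converted to points before parsing.

```python
TABLE = """
domain   Culture      DistPersonality GenderSex    GeoBel       GroupMemb    NameBack
academic 0,8983050847 0,6373626374    0,7272727273 0,6517857143 0,8246268657 0,1639344262
fiction  0,0338983051 0,0659340659    0,0519480519 0,0446428571 0,0111940299 0,2090163934
magazine 0,0169491525 0,1318681319    0,0649350649 0,1607142857 0,0858208955 0,1557377049
news     0,0169491525 0,0695970696    0,025974026  0,1071428571 0,0634328358 0,2336065574
spoken   0,0338983051 0,0952380952    0,1298701299 0,0357142857 0,0149253731 0,237704918
"""

def parse_profile(table):
    """Return {sense: {level: proportion}} from a whitespace-separated table."""
    header, *rows = [line.split() for line in table.strip().splitlines()]
    senses = header[1:]
    profile = {sense: {} for sense in senses}
    for level, *cells in rows:
        for sense, cell in zip(senses, cells):
            # The slides use decimal commas; float() needs decimal points
            profile[sense][level] = float(cell.replace(",", "."))
    return profile

profile = parse_profile(TABLE)
for sense, levels in profile.items():
    print(sense, round(sum(levels.values()), 6))
```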

Behavioral profile

ID tag: syntax – function

        Culture       DistPersonality  GenderSex     GeoBel        GroupMemb     Handle        NameBack
A-PC    0,0677966102  0,1135531136     0,1168831169  0,0714285714  0,1231343284  0             0,0983606557
APP     0             0,0036630037     0             0,0267857143  0             0             0,0040983607
Co      0             0,0073260073     0             0             0             0             0
Cs      0,0338983051  0,021978022      0,025974026   0,0267857143  0,0037313433  0             0,0081967213
INP     0,1186440678  0,043956044      0,1168831169  0,1964285714  0,0820895522  0,0833333333  0,0327868852
Od      0,3050847458  0,2783882784     0,2597402597  0,25          0,2276119403  0,8333333333  0,5204918033
Oi      0             0                0             0             0             0             0
PoM-PC  0,3728813559  0,3626373626     0,2857142857  0,3571428571  0,4141791045  0             0,1352459016
PreM    0,0508474576  0,0476190476     0,012987013   0,0089285714  0,0559701493  0             0,0327868852
S       0,0508474576  0,1208791209     0,1818181818  0,0625        0,0932835821  0,0833333333  0,1680327869

ID tag: syntax – diathesis

         Culture       DistPersonality  GenderSex     GeoBel        GroupMemb     Handle        NameBack
active   0,7796610169  0,8498168498     0,7402597403  0,7410714286  0,7649253731  0,9166666667  0,8442622951
passive  0,0677966102  0,0805860806     0,0909090909  0,0357142857  0,1044776119  0             0,1229508197
NA       0,1525423729  0,0695970696     0,1688311688  0,2232142857  0,1305970149  0,0833333333  0,0327868852
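A quick way to read the syntactic-function table is to ask which function is most frequent for each sense. The following is a minimal Python sketch; the values are the table's figures rounded to four decimals (my rounding), and `dominant_function` is a hypothetical helper name.

```python
SENSES = ["Culture", "DistPersonality", "GenderSex", "GeoBel",
          "GroupMemb", "Handle", "NameBack"]

# Proportions per syntactic function, rounded to 4 decimals from the table
FUNCTIONS = {
    "A-PC":   [0.0678, 0.1136, 0.1169, 0.0714, 0.1231, 0.0,    0.0984],
    "APP":    [0.0,    0.0037, 0.0,    0.0268, 0.0,    0.0,    0.0041],
    "Co":     [0.0,    0.0073, 0.0,    0.0,    0.0,    0.0,    0.0],
    "Cs":     [0.0339, 0.0220, 0.0260, 0.0268, 0.0037, 0.0,    0.0082],
    "INP":    [0.1186, 0.0440, 0.1169, 0.1964, 0.0821, 0.0833, 0.0328],
    "Od":     [0.3051, 0.2784, 0.2597, 0.2500, 0.2276, 0.8333, 0.5205],
    "Oi":     [0.0,    0.0,    0.0,    0.0,    0.0,    0.0,    0.0],
    "PoM-PC": [0.3729, 0.3626, 0.2857, 0.3571, 0.4142, 0.0,    0.1352],
    "PreM":   [0.0508, 0.0476, 0.0130, 0.0089, 0.0560, 0.0,    0.0328],
    "S":      [0.0508, 0.1209, 0.1818, 0.0625, 0.0933, 0.0833, 0.1680],
}

def dominant_function(sense):
    """Return the syntactic function with the highest proportion for a sense."""
    column = SENSES.index(sense)
    return max(FUNCTIONS, key=lambda function: FUNCTIONS[function][column])

for sense in SENSES:
    print(sense, dominant_function(sense))
```

On these figures, Od comes out on top only for Handle and NameBack, while PoM-PC is the most frequent function for the other five senses.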

Behavioral profile

ID tag: collocate in head – 'sense'

Culture          0
DefFeat          0
DistPersonality  0,0036630037
GenderSex        0
GeoBel           0,0535714286
GroupMemb        0,026119403
Handle           0
IdenOperator     0
IndChar          0
NameBack         0
Perce            0
Self             0,2222222222
SocRole          0
SoExist          0
Unique           0

Clustering the senses of 'identity'

Behavioral profiles offer a fine-grained view of the distributional behavior of the senses of polysemic lexemes.

This offers several possibilities of gaining insights into the actual use of the lexeme in question and, of course, its senses.

Behavioral profiling may also give us an idea of the structure of the category network that relates the senses of the lexeme.

Clustering the senses of 'identity'

The behavioral profile itself technically also constitutes a usage-based network of senses, as it associates senses with distributional aspects (represented by ID tag levels) – thus offering a complex overview of entrenchment patterns.

However, because of the complexity and detail, it is difficult to get an overview of how the senses relate to each other in such a network on the basis of a fully fledged behavioral profile.

Fortunately, there are statistical methods of identifying category structures on the basis of calculations of similarity – such as cluster analysis.
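To make the idea concrete, here is a minimal pure-Python sketch of agglomerative clustering over the domain proportions from the behavioral profile. The distance measure (Euclidean) and the linkage (average) are my assumptions for illustration; the slides do not state which settings were used. The values are rounded to four decimals, and `average_linkage` and `cluster` are hypothetical names.

```python
from itertools import combinations
from math import dist

# Domain proportions (academic, fiction, magazine, news, spoken),
# rounded to 4 decimals from the behavioral profile
PROFILES = {
    "Culture":         (0.8983, 0.0339, 0.0169, 0.0169, 0.0339),
    "DistPersonality": (0.6374, 0.0659, 0.1319, 0.0696, 0.0952),
    "GenderSex":       (0.7273, 0.0519, 0.0649, 0.0260, 0.1299),
    "GeoBel":          (0.6518, 0.0446, 0.1607, 0.1071, 0.0357),
    "GroupMemb":       (0.8246, 0.0112, 0.0858, 0.0634, 0.0149),
    "NameBack":        (0.1639, 0.2090, 0.1557, 0.2336, 0.2377),
}

def average_linkage(a, b):
    """Mean Euclidean distance over all cross-cluster pairs of senses."""
    total = sum(dist(PROFILES[x], PROFILES[y]) for x in a for y in b)
    return total / (len(a) * len(b))

def cluster(names):
    """Merge the two closest clusters until one remains; return the merges."""
    clusters = [frozenset([name]) for name in names]
    history = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(*pair))
        history.append((sorted(a), sorted(b), round(average_linkage(a, b), 3)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return history

for left, right, height in cluster(PROFILES):
    print(left, "+", right, "at", height)
```

With these values, NameBack is the last sense to join the tree, since its domain distribution lies far from those of the five academically skewed senses.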

Concluding remarks

'Identity' is polysemic.

Our (provisional) behavioral profile shows that:

IDENTITY is a very complex lexical concept, or rather a set of lexical concepts, strongly associated with factors of distribution.

There is a sharp distinction between the NameBack sense and the other senses, with the latter being associated in particular with academic discourse.

The NameBack and Handle senses are more frequently direct objects than the other senses, suggesting perhaps a more concrete or entity-like conceptualization.

The Self sense collocates with 'sense' (in 'sense of') much more strongly than any other sense.

Although far from complete and still not fully error checked, this analysis also shows that behavioral profiling is an empirically powerful way to do lexicological analysis due to its multifactorial nature.

Bibliography

Berez, Andrea L. & Stefan Th. Gries (2009). In defense of corpus-based methods: a behavioral profile analysis of polysemous get in English. In Steven Moran, Darren S. Tanner & Michael Scanlon (eds.), Proceedings of the 24th Northwest Linguistics Conference. University of Washington Working Papers in Linguistics Vol. 27. Seattle, WA: Department of Linguistics. 157-166.

Biber, Douglas, Susan Conrad & Randi Reppen (1998). Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press.

Davies, Mark (2013). Corpus of Contemporary American English (COCA). corpus.byu.edu/coca/

Fearon, James D. (1999). What is identity (as we now use the word)? Unpublished manuscript, Department of Political Science, Stanford University.

Geeraerts, Dirk (1997). Diachronic Prototype Semantics: A Contribution to Historical Lexicology. Oxford: Oxford University Press.

Gries, Stefan Th. (2010). BehavioralProfiles 1.01: A program for R 2.7.1 and higher.

Gries, Stefan Th. (2012). Behavioral profiles: A fine-grained and quantitative approach in corpus-based lexical semantics. In Gonia Jarema, Gary Libben & Chris Westbury (eds.), Methodological and Analytic Frontiers in Lexical Research. Amsterdam & Philadelphia: John Benjamins. 57-80.

Gries, Stefan Th. (fc). Corpus and quantitative linguistics. In John Taylor & Jeanette Littlemore (eds.), Companion to Cognitive Linguistics. London & New York: Bloomsbury.