
On lexical ambiguity
Ágoston Tóth, PhD
University of Debrecen
tagoston@delfin.unideb.hu
Ruzomberok, 24 June 2009
Sponsored by OTKA research grant K 72983



Homonymy, polysemy

Homonymy
Bring money from the bank.
bank1: [financial institution]; bank2: [riverbank]

Polysemy
bulb: [the root of a plant] ~ [an electric lamp]

Fuzzy boundary! Workaround: maximize homonymy or maximize polysemy (Lyons 1977).

NLP lexicons (incl. WordNet): maximize homonymy
Semcor corpus “polysemy” factor: 6.6 senses/word on avg. (Mihalcea and Moldovan 2001)
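A "senses per word" figure of this kind can be illustrated with a minimal sketch. The sense inventory below is a hypothetical toy built from the slide's own examples, not WordNet or Semcor data, and averaging over the tokens covered by the inventory is only one plausible way to compute such a factor.

```python
# Toy sense inventory (hypothetical; not taken from WordNet or Semcor).
SENSES = {
    "bank": ["financial institution", "riverbank"],
    "bulb": ["root of a plant", "electric lamp"],
    "money": ["medium of exchange"],
}


def polysemy_factor(tokens, inventory):
    """Average number of listed senses per token that the inventory covers."""
    counts = [len(inventory[t]) for t in tokens if t in inventory]
    return sum(counts) / len(counts) if counts else 0.0


tokens = "bring money from the bank".split()
# Covered tokens: "money" (1 sense) and "bank" (2 senses).
print(polysemy_factor(tokens, SENSES))
```

The more homonymy a lexicon maximizes, the more entries each word form gets and the higher this average becomes.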


Cruse’s lexical semantics

Test for antagonism: we can focus on one reading at a time
Bring money from the bank.
She was wearing a light coat. (Cruse 2000)

Other tests for the presence of discrete readings.

Relatedness of senses: continuous phenomenon


A tacit premise

If a lexical item causes ambiguity, it can be disambiguated, i.e. we can pick out a “right meaning” for each lexical item in a sentence.


Plausibility of word sense disambiguation (WSD)

Word Sense Disambiguation
The selection of the “right” meaning for each lexical item in a sentence.

Senseval-3 (cf. Snyder & Palmer 2004)
26 competing systems.
Accuracy: up to 65% (best system, best case)
Always-select-the-most-frequent-sense (MFS) baseline: 61%
Human inter-annotator agreement: 72%
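To make the MFS baseline concrete, here is a minimal sketch of an always-select-the-most-frequent-sense tagger and its accuracy score. The sense labels and the tiny training and test sets are invented for illustration. They are not Senseval-3 data, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def train_mfs(tagged):
    """Map each word to the sense it carries most often in the tagged data."""
    counts = defaultdict(Counter)
    for word, sense in tagged:
        counts[word][sense] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def accuracy(model, tagged):
    """Share of (word, sense) pairs for which the model predicts the gold sense."""
    correct = sum(1 for word, sense in tagged if model.get(word) == sense)
    return correct / len(tagged)


# Invented toy data: (word, gold sense) pairs.
train = [
    ("bank", "bank1"), ("bank", "bank1"), ("bank", "bank2"),
    ("light", "pale"), ("light", "not_heavy"), ("light", "pale"),
]
test_data = [
    ("bank", "bank1"), ("bank", "bank2"),
    ("light", "pale"), ("light", "pale"),
]

model = train_mfs(train)
print(model)
print(accuracy(model, test_data))
```

The baseline never looks at context, which is why the narrow margin between it and the best competing system is striking.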


Plausibility of word sense disambiguation (WSD)

Conclusion of SemEval-2007, Task 7 (all-words disambiguation task):

“after decades of research in the field it is still unclear whether WSD can provide relevant contribution to real-world applications” (Navigli, Litkowski & Hargraves 2007:34)


Other linguistic fields with correlating findings

Lexicographical practice
“lumping is considering two slightly different patterns of usage as a single meaning”, and “splitting is … dividing or separating them into different meanings” (Kilgarriff 1997:9)

Whether lexicographers lump or split senses is a matter of tradition, editorial policy and subjective decisions.

E.g. mouth: [body part] / [mouth of a river] / [mouth of a cave] / [mouth of a bottle]
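The effect of an editorial policy on the sense count can be sketched as follows. The split list comes from the mouth example above. The lumping policy, which merges the three non-body uses under one "opening" label, is a hypothetical editorial decision made up for illustration.

```python
# Fine-grained ("split") senses of "mouth", taken from the example above.
SPLIT = ["body part", "mouth of a river", "mouth of a cave", "mouth of a bottle"]

# Hypothetical lumping policy: treat all non-body uses as one "opening" sense.
LUMP_POLICY = {
    "body part": "body part",
    "mouth of a river": "opening",
    "mouth of a cave": "opening",
    "mouth of a bottle": "opening",
}


def lump(senses, policy):
    """Merge senses according to the policy, keeping first-seen order."""
    merged = []
    for sense in senses:
        label = policy.get(sense, sense)  # senses not in the policy stay as-is
        if label not in merged:
            merged.append(label)
    return merged


print(len(SPLIT), "senses when splitting")
print(len(lump(SPLIT, LUMP_POLICY)), "senses when lumping:", lump(SPLIT, LUMP_POLICY))
```

The same word therefore has four senses or two, depending only on the policy table, which is the subjectivity the slide points to.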


Other linguistic fields with correlating findings

Theoretical linguistics: sense enumeration

Pustejovsky (1995): conventional lexicon design is based on sense-enumeration.

It cannot account for:
- the Creative Use of Words, the process of how “words assume new senses in novel contexts” (Pustejovsky 1995:39)
- the Permeability of Word Senses: “Word senses are not atomic definitions but overlap and make reference to other senses of the word” (p. 39)
- the Expression of Multiple Syntactic Forms


Other linguistic fields with correlating findings

Theoretical linguistics: the role of context

The context can influence the meaning (based on Cruse 2000:120-123):
- selection process: existing readings or established senses are selectively activated and suppressed
- coercing a meaning: when the established senses do not fit into the context, the listener is supposed to look for a matching meaning extension, possibly metaphorical or metonymical, “because of a tacit assumption that speakers are usually trying to convey an intelligible message” (p. 120).
- Meaning can be modulated in other ways
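The difference between selection and coercion can be caricatured in a few lines of code. This is only a toy sketch and not Cruse's own formalism: the cue-word sets, the overlap score and the function name are all assumptions introduced here. When no established sense matches the context, the function merely signals that a meaning extension would have to be sought.

```python
# Hypothetical lexicon: established senses with invented context cue words.
ESTABLISHED = {
    "bank": {
        "financial institution": {"money", "loan", "account"},
        "riverbank": {"river", "water", "fishing"},
    }
}


def interpret(word, context, lexicon):
    """Return ("selection", sense) if a cue matches, else ("coercion", None)."""
    context = set(context)
    scores = {
        sense: len(cues & context)
        for sense, cues in lexicon.get(word, {}).items()
    }
    best = max(scores, key=scores.get, default=None)
    if best is not None and scores[best] > 0:
        return ("selection", best)   # an established sense is activated
    return ("coercion", None)        # no fit: look for a meaning extension


print(interpret("bank", ["bring", "money", "from", "the"], ESTABLISHED))
print(interpret("bank", ["clouds", "sky"], ESTABLISHED))
```

A sense-enumerating system stops at the second case, whereas a listener is expected to carry on and construct a fitting reading.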


Other linguistic fields with correlating findings

Theoretical linguistics: the role of context

Extremist position (non-Crusian):
“The notion that words have a meaning – what Lakoff and Johnson (1980) call the “container metaphor” – is now hard to maintain. It seems that “meaning” consists of the process of meaning (Clark 1992). Words should be seen as information tokens that, among others, to some extent guide the meaning process” (Haase and Rothe-Neves 1999:291).

Cruse (2000): context-independent “pre-established senses” (p. 68) and “default readings” (p. 116) exist.


Lexical meaning and neuroscience

No generally accepted, tested and verified model of word meaning in neuroscience.

Cell assemblies are created by correlative coactivation of neurons (Hebb 1949, Pulvermüller 1999, 2001).

When a sufficient subset of the assembly is stimulated, the whole assembly ignites and then reverberates.

My hypothetical suggestion for the role of a word: igniting, maintaining and modifying the spatiotemporal activation of assemblies, and also biasing further activation.

Meaning is selected, coerced and modulated, and ambiguity gets resolved, in the intricate (but not necessarily linguistically transparent) interplay of neural activations/reverberations.
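The ignition idea can be pictured with a deliberately crude sketch: an assembly is a set of units, and it becomes fully active once a large enough fraction of it is stimulated. The threshold value of 0.5 and the unit names are arbitrary assumptions for illustration. Reverberation over time and the biasing of later activation are not modeled, and nothing here is claimed to be neurally realistic.

```python
def ignite(assembly, stimulated, threshold=0.5):
    """Return the units active after stimulation.

    If the stimulated share of the assembly reaches the (assumed) threshold,
    the whole assembly becomes active; otherwise only the stimulated members do.
    """
    assembly = set(assembly)
    hit = assembly & set(stimulated)
    if assembly and len(hit) / len(assembly) >= threshold:
        return assembly
    return hit


# Hypothetical assembly of four units.
assembly = {"n1", "n2", "n3", "n4"}
print(sorted(ignite(assembly, {"n1", "n2"})))  # half stimulated: full ignition
print(sorted(ignite(assembly, {"n1"})))        # too few: no ignition
```

On this picture a word form would act as one source of partial stimulation among several, not as a key that looks up a stored sense.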


The HunGram project

OTKA (Hungarian Scientific Research Fund) research grant for 2008-2012 (K 72983), PI: Dr. Tibor Laczkó

Objectives
1. developing a comprehensive LFG grammar of the Hungarian language (morphology, syntax, lexicon, semantic issues)
2. implementing it in XLE (LFG parser)

Lexical background: non-toy lexicon
Single entry for each word unless our grammar needs re-listing (argument structure)
Also developing an Artificial Neural Network (ANN) tool that can be trained to learn associational properties of words, partly based on information coming from the parser (morphological, syntactic) when analyzing authentic text. Goal: to acquire selectional attributes and important information about the argument structure not otherwise encoded in the lexicon/grammar.


Thank you for your attention.

Dr. Ágoston Tóth
tagoston@delfin.unideb.hu