How to Compute the Meaning of Natural Language Utterances

Patrick Hanks, Research Institute of Information and Language Processing, University of Wolverhampton


Goals of the tutorial

• To explore the relationship between meaning and phraseology.

• To explore the relationship between conventional uses of words and creative uses such as freshly coined metaphors.

• To discover factors that contribute to the dynamic power of natural language, including anomalous arguments, ellipsis, and other “exploitations” of normal usage.


Procedure

• We shall focus on verbs.

– We shall not assume that the analytic procedure developed for verbs is equally suitable for nouns.

• We shall not invent examples.

– Instead, we shall analyse data.

• We shall look at large numbers of actual uses of a verb, using concordances to a very large corpus.

• We shall ask questions such as:

– What patterns of normal use of this verb can we detect?

– What is the nature of a “pattern”?

– Does each pattern have a different meaning?

– What is the nature of lexical ambiguity, and why has it been so troublesome for NLP?


Patterns in Corpora

• When you first open a concordance, very often some patterns of use leap out at you.

– Collocations make patterns: one word goes with another.

– To see how words make meanings, we need to analyse collocations.

• The more you look, the more patterns you see.

• BUT

• When you try to formalize the patterns, you start to see more and more exceptions.

• The boundaries are fuzzy and there are many outlying cases.


Analysis of Meaning in Language

• Analysis based on predicate logic is doomed to failure:

– Words are NOT building blocks in a ‘Lego set’.

– A word does NOT denote ‘all and only’ members of a set.

– Word meaning is NOT determined by necessary and sufficient conditions for set membership.

• Instead, a prototype-based approach to the lexicon is necessary:

– mapping prototypical interpretations onto prototypical phraseology.


The linguistic ‘double-helix’ hypothesis

• A language is a system of rule-governed behaviour.

• Not one, but TWO (interlinked) sets of rules:

1. Rules governing the normal uses of words to make meanings

2. Rules governing the exploitation of norms


Exploitations

• People exploit the rules of normal usage for various purposes:

• For economy and speed:

– Conversation is quick

– Listeners (and readers) get bored easily

– Words that are ‘obvious’ can sometimes be omitted

• To say new things (reporting discoveries)

• To say old things in new ways

• For rhetoric, humour, poetry, politics …


Lexicon and prototypes

• Each word is typically used in one or more patterns of usage (valency + collocations)

• Each pattern is associated with a meaning:

– A meaning is a set of prototypical beliefs.

– In CPA (Corpus Pattern Analysis), meanings are expressed as ‘anchored implicatures’.

– Few patterns are associated with more than one meaning.

• Corpus data enables us to discover the patterns that are associated with each word.


What is a pattern? (1)

• The verb is the pivot of the clause.

• A pattern is a statement of the clause structure (valency) associated with a meaning of a verb,

– together with the typical semantic values of each argument.

– Arguments of verbs are populated by lexical sets of collocates.

• Different semantic values of arguments activate different meanings of the verb.


What is a pattern? (2)

• [[Human]] fire [[Firearm]]

• [[Human]] fire [[Projectile]]

• [[Human 1]] fire [[Human 2]]

• [[Anything]] fire [[Human]] {with enthusiasm}

• [[Human]] fire [NO OBJ]

• Etc.
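The notation above is regular enough to be read by machine. The following is a minimal sketch, not the CPA project's own tooling: the `Pattern` class, its field names, and `parse_pattern` are hypothetical, and the parser assumes a pattern of the shape subject, verb, optional object, optional `{...}` phrase.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Matches a semantic type such as [[Human]] or [[Human 1]];
# the trailing index ("1", "2") is dropped.
SEMTYPE = re.compile(r"\[\[([A-Za-z ]+?)(?: \d+)?\]\]")


@dataclass(frozen=True)
class Pattern:
    """Hypothetical container for one pattern line."""
    verb: str
    subject: str                      # semantic type of the subject
    obj: Optional[str] = None         # semantic type of the object; None for [NO OBJ]
    adverbial: Optional[str] = None   # lexical material given in {...}


def parse_pattern(text: str) -> Pattern:
    types = SEMTYPE.findall(text)
    # The verb is the first word after the subject's closing brackets.
    verb = text.split("]]", 1)[1].split()[0]
    adverbial = re.search(r"\{(.+?)\}", text)
    return Pattern(
        verb=verb,
        subject=types[0],
        obj=types[1] if len(types) > 1 else None,
        adverbial=adverbial.group(1) if adverbial else None,
    )


for line in ("[[Human]] fire [[Firearm]]",
             "[[Anything]] fire [[Human]] {with enthusiasm}",
             "[[Human]] fire [NO OBJ]"):
    print(parse_pattern(line))
```

Keeping the semantic types as plain strings leaves the question of how they relate to one another to the ontology.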


Semantic Types and Ontology

• Items in double square brackets are semantic types.

• Semantic types are being gathered together into a shallow ontology.

– (This is work in progress in the current CPA project.)

• Each type in the ontology will (eventually) be populated with a set of lexical items on the basis of what’s in the corpus under each relevant pattern.
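Populating a type from the corpus can be pictured as collecting the words that fill each argument slot of the concordance lines sorted under a pattern. The sketch below assumes the lines have already been assigned to patterns by an analyst; the `HITS` data is invented for illustration, and `populate` is a hypothetical helper, not part of any CPA software.

```python
from collections import defaultdict

# Invented toy data, not real corpus output. Each tuple is
# (subject type, subject word, object type, object word) for one
# concordance line of "fire".
HITS = [
    ("Human", "soldier", "Firearm", "rifle"),
    ("Human", "officer", "Firearm", "pistol"),
    ("Human", "archer", "Projectile", "arrow"),
    ("Human", "manager", "Human", "clerk"),
]


def populate(hits):
    """Collect, for each semantic type, the words seen filling that slot."""
    lexical_sets = defaultdict(set)
    for subj_type, subj_word, obj_type, obj_word in hits:
        lexical_sets[subj_type].add(subj_word)
        lexical_sets[obj_type].add(obj_word)
    return dict(lexical_sets)


for sem_type, words in sorted(populate(HITS).items()):
    print(sem_type, sorted(words))
```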


Shimmering lexical sets

• Lexical sets are not stable – not “all and only”.

• Example:

– [[Human]] attend [[Event]]

– [[Event]] = meeting, wedding, funeral, etc.

– But not thunderstorm, suicide.
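One possible reading of this, offered only as a sketch: membership of a lexical set is graded and tied to a particular verb slot, so it can be estimated from how often a noun actually fills that slot. The counts below are invented for illustration and are not corpus figures; `membership` is a hypothetical helper.

```python
from collections import Counter

# Invented counts of direct objects of "attend"; NOT real corpus figures.
ATTEND_OBJECTS = Counter({"meeting": 120, "wedding": 40, "funeral": 35})


def membership(word, counts):
    """Share of the slot's observed fillers accounted for by `word`."""
    total = sum(counts.values())
    return counts[word] / total if total else 0.0


# "thunderstorm" is an event, but it is never seen as the object of
# "attend" in the toy data, so its membership in this slot's set is zero.
for noun in ("meeting", "funeral", "thunderstorm"):
    print(noun, membership(noun, ATTEND_OBJECTS))
```

On this view [[Event]] names a type, while the set of nouns that realize it differs from one verb to the next.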


Meanings and boundaries

• Boundaries of all linguistic and lexical categories are fuzzy.

– There are many borderline cases.

• Instead of fussing about boundaries, we should focus on identifying prototypes.

• Then we can decide what goes with what.

– Many decisions will be obvious.

– Some decisions – especially about boundary cases – will be arbitrary.


The importance of phraseology

• “Many, if not most, meanings depend on the presence of more than one word for their realization.” – John Sinclair


The Idiom Principle (Sinclair)

• In word use, there is tension between the “terminological tendency” and the “phraseological tendency”:

– The terminological tendency: the tendency for words to have meaning in isolation.

– The phraseological tendency: the tendency for the meaning of a word to be activated by the context in which it is used.


Computing meaning (1)

• Each user of a language has a “corpus” of uses stored inside his or her head.

– These are traces of utterances that the person has seen, heard, or uttered.

• Each person’s mental corpus of English (etc.) is different.

• What all these “mental corpora” have in common is patterns.

• By analysing a huge corpus of texts computationally, we can create a pattern dictionary for use by computers as well as by people.

• In a pattern dictionary, each pattern is associated with a meaning (or a translation, or other implicature)


Computing meaning (2)

• When processing unseen text, the computer compares the actual use of each verb in the text with the inventory of patterns in the pattern dictionary, using information about a) valency, and b) semantic types of collocates.

• Exact matches are not to be expected.

• Best match wins: the pattern dictionary provides the most probable meaning (or translation) of the word in context.
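The matching step can be sketched in a few lines. This is an illustration under stated assumptions, not the tutorial's actual system: the `Entry` class, the `FIRE` entries with their meaning glosses, the `SEMTYPES` word list, and the equal-weight scoring are all hypothetical. The one idea taken from the text is that each argument is compared with each pattern, partial matches are tolerated, and the highest-scoring pattern supplies the meaning.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    subject: str          # semantic type expected for the subject
    obj: Optional[str]    # semantic type expected for the object; None = no object
    meaning: str          # illustrative gloss, not taken from the tutorial


# Hypothetical pattern dictionary entries for "fire".
FIRE = [
    Entry("Human", "Firearm", "discharge a weapon"),
    Entry("Human", "Projectile", "shoot a projectile"),
    Entry("Human", "Human", "dismiss from a job"),
    Entry("Human", None, "shoot (object left unstated)"),
]

# Hypothetical word-to-semantic-type lexicon.
SEMTYPES = {
    "soldier": {"Human"}, "boss": {"Human"}, "clerk": {"Human"},
    "rifle": {"Firearm"}, "arrow": {"Projectile"},
}


def slot_score(word, wanted):
    """1.0 if the slot filler fits the pattern's expectation, else 0.0."""
    if wanted is None or word is None:
        return 1.0 if wanted is None and word is None else 0.0
    return 1.0 if wanted in SEMTYPES.get(word, set()) else 0.0


def best_match(subject, obj, entries):
    """Return the best-fitting entry and its score in the range 0..1."""
    scored = [(slot_score(subject, e.subject) + slot_score(obj, e.obj), e)
              for e in entries]
    score, entry = max(scored, key=lambda pair: pair[0])
    return entry, score / 2


print(best_match("boss", "clerk", FIRE))      # both slots fit one pattern
print(best_match("committee", "rifle", FIRE)) # unknown subject: partial match
```

A fuller version would replace the all-or-nothing `slot_score` with graded scores, so that a near miss on a semantic type still counts for something.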
