CS 385 Fall 2006 Chapter 14 Understanding Natural Language (omit 14.4)


1

CS 385 Fall 2006, Chapter 14

Understanding Natural Language

(omit 14.4)

2

The Problem

Language is fuzzy

– I feel funny

– Fruit flies like bananas

– Is there water in the fridge?

Early history:

– dictionary translation, word by word

– out of sight, out of mind → the person is blind and insane

– did not address interrelation among words

– more to it: what you know beyond the simple meaning of a word

– Doug Lenat's CYC project (1984), now Cyc Corporation

  represent world knowledge via logic and frames

  12 years, 35 million dollars

  questionable results

  http://www.cyc.com/cyc/cycrandd/areasofrandd_dir/nlu

3

Levels of Analysis (big picture)

Prosody – rhythm and intonation of language

Phonology – the sounds which comprise language (phonemes)

– speech analysis: identify phonemes and conglomerate them into words

Morphology – the components that make up words (ing, ed, ...)

Syntax – rules for combining words into legal (syntactically correct) sentences

– used to parse a sentence

– the most successful level, because it is formalized

Semantics – attaching meaning to words, phrases, and sentences

Pragmatics – how is language usually used? "How are you?" → "fine"

World knowledge – general background necessary to interpret text or conversation

– “My thesis draft is due tomorrow” makes you think of ...?..

4

Today

Acceptance that a general conversationalist is unlikely

Scale back to interpretation in restricted applications

– MS Word grammar and style checker

– others?

Audio to text, but little interpretation

– cell phone speed dial

– United Airlines customer service

– medical transcription

– Acura TL recognizes 650 voice commands

Normal steps of linguistic analysis:

– parsing

– semantic interpretation

– expanded representation

5

Specifying a Grammar

Rewrite rules:

1. sentence ↔ np vp
2. np ↔ n
3. np ↔ art n
4. vp ↔ v
5. vp ↔ v np
6. art ↔ a
7. art ↔ the
8. n ↔ man
9. n ↔ dog
10. v ↔ likes
11. v ↔ bites

Terminals: a, the, man, dog, likes, bites

Legal sentence: a string of terminals that can be derived from these rules.
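The definition of a legal sentence can be checked mechanically. The sketch below stores the eleven rules as a Python dictionary and enumerates every string of terminals they derive; the names GRAMMAR, expansions, LEGAL and is_legal are illustrative and not from the slides. Exhaustive enumeration only works here because this grammar has no recursive rules.

```python
from itertools import product

# The eleven rewrite rules: nonterminal -> list of right-hand sides.
GRAMMAR = {
    "sentence": [["np", "vp"]],
    "np": [["n"], ["art", "n"]],
    "vp": [["v"], ["v", "np"]],
    "art": [["a"], ["the"]],
    "n": [["man"], ["dog"]],
    "v": [["likes"], ["bites"]],
}


def expansions(symbol):
    """Return every string of terminals derivable from symbol, as word tuples."""
    if symbol not in GRAMMAR:  # a terminal derives only itself
        return [(symbol,)]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Expand each symbol on the right-hand side, then combine the choices.
        parts = [expansions(s) for s in rhs]
        for combo in product(*parts):
            results.append(tuple(word for part in combo for word in part))
    return results


# Every legal sentence of this (finite) language.
LEGAL = {" ".join(words) for words in expansions("sentence")}


def is_legal(sentence):
    """True if the sentence can be derived from the rules (case-insensitive)."""
    return " ".join(sentence.lower().split()) in LEGAL
```

For example, is_legal("The man bites the dog") holds, while "bites the dog" is rejected because no rule lets a sentence start with a verb.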

6

Parse Tree for Tony growled at Bob

sentence
  noun phrase
    noun: Tony
  verb phrase
    verb: growled
    prepositional phrase
      preposition: at
      noun phrase
        noun: Bob

7

Interpret it with a Semantic Net

Construct a semantic net describing mammals:

– mammals are covered with hair

– tigers are a subclass of mammal that have stripes and growl

– Tony is a tiger

– humans are a subclass of mammal that are frightened by tigers

– Bob is a human

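Semantic net (figure), written as labelled links:

tiger –subclass→ mammal
human –subclass→ mammal
Tony –instance→ tiger
Bob –instance→ human
mammal –has prop→ hair
tiger –has prop→ stripes
tiger –has prop→ growls
tiger –frightens→ human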
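One way to make this net usable is to store it as labelled edges and follow the instance and subclass links to inherit properties. This is a minimal sketch; the edge list is read off the slide's bullets, and the function names (targets, ancestors, properties, frightened_by) are made up for illustration.

```python
# Each edge is (source, relation, target), read off the bullet list on the slide.
EDGES = [
    ("tiger", "subclass", "mammal"),
    ("human", "subclass", "mammal"),
    ("Tony", "instance", "tiger"),
    ("Bob", "instance", "human"),
    ("mammal", "has_prop", "hair"),
    ("tiger", "has_prop", "stripes"),
    ("tiger", "has_prop", "growls"),
    ("tiger", "frightens", "human"),
]


def targets(node, relation):
    """All nodes reached from node by one edge with the given relation."""
    return [t for s, r, t in EDGES if s == node and r == relation]


def ancestors(node):
    """Classes reachable by following instance/subclass links upward."""
    found = []
    frontier = targets(node, "instance") + targets(node, "subclass")
    while frontier:
        cls = frontier.pop(0)
        if cls not in found:
            found.append(cls)
            frontier.extend(targets(cls, "subclass"))
    return found


def properties(node):
    """Own properties plus those inherited from every ancestor class."""
    props = []
    for n in [node] + ancestors(node):
        props.extend(targets(n, "has_prop"))
    return props


def frightened_by(node):
    """Classes that frighten this node, judged at the class level."""
    classes = set([node] + ancestors(node))
    return [s for s, r, t in EDGES if r == "frightens" and t in classes]
```

With these edges, Tony inherits stripes and growls from tiger and hair from mammal, and Bob, being a human, is frightened by tigers.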

8

Semantic Interpretation (conceptual graph)

Growling has an agent and an object (from parse tree):

tiger: Tony ←agent– growled at –object→ person: Bob

Expanded representation of the sentence meaning (from sem. net): we know Tony is a tiger and Bob should be frightened

tiger ←agent– frightens –object→ person

do a join:

tiger: Tony ←agent– growled at –object→ person: Bob
tiger: Tony ←agent– frightens –object→ person: Bob
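The join can be sketched in a few lines if each graph is reduced to one relation with an agent and an object. This is a deliberately simplified stand-in for a conceptual graph join: it only fills a generic concept when the type labels are identical, and the dictionary layout and the function name join are assumptions made for this example.

```python
def join(sentence_graph, rule_graph):
    """Specialise rule_graph with the individuals named in sentence_graph.

    Concepts are (type, referent) pairs; a referent of None means generic.
    A role in the rule is filled only when the concept types are identical.
    Returns None when no role can be matched.
    """
    joined = {"relation": rule_graph["relation"]}
    matched = False
    for role in ("agent", "object"):
        rule_type, rule_ref = rule_graph[role]
        sent_type, sent_ref = sentence_graph[role]
        if rule_type == sent_type and rule_ref is None:
            joined[role] = (sent_type, sent_ref)  # restrict generic to individual
            matched = True
        else:
            joined[role] = (rule_type, rule_ref)
    return joined if matched else None


# Graph from the parse tree, and the general fact from the semantic net.
growled = {"relation": "growled at",
           "agent": ("tiger", "Tony"), "object": ("person", "Bob")}
frightens = {"relation": "frightens",
             "agent": ("tiger", None), "object": ("person", None)}

expanded = join(growled, frightens)
```

Here expanded states that tiger Tony frightens person Bob, which is the extra knowledge the slide attaches to the original sentence.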

9

Fig 14.2 Stages in producing an internal representation of a sentence.

10

Parsing The man bites the dog

Top down: start at sentence symbol and work down to a string of terminals

1. sentence ↔ np vp
2. np ↔ n
3. np ↔ art n
4. vp ↔ v
5. vp ↔ v np
6. art ↔ a
7. art ↔ the
8. n ↔ man
9. n ↔ dog
10. v ↔ likes
11. v ↔ bites

sentence

→ np vp

→ art n vp

→ the n vp

→ the man vp

→ the man v np

→ the man bites np

→ the man bites art n

→ the man bites the n

→ the man bites the dog

11

Resulting parse tree for “The man bites the dog”

Problem: we needed to know where we are going

Goal driven: need to back up and retrace a lot

New approach: transition net parsers
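The backing up and retracing can be seen in a small goal-driven parser. The sketch below always expands the leftmost nonterminal, tries the rules in the order they are numbered, and abandons a branch as soon as the words produced so far disagree with the input. The function names derive and top_down_parse are illustrative; the grammar is the eleven-rule one from the slides.

```python
GRAMMAR = {
    "sentence": [["np", "vp"]],
    "np": [["n"], ["art", "n"]],
    "vp": [["v"], ["v", "np"]],
    "art": [["a"], ["the"]],
    "n": [["man"], ["dog"]],
    "v": [["likes"], ["bites"]],
}


def derive(form, words, steps):
    """Depth-first leftmost derivation of words from form.

    Returns the list of sentential forms on the successful path, or None.
    """
    for i, symbol in enumerate(form):
        if symbol in GRAMMAR:  # leftmost nonterminal found
            break
    else:  # only terminals left: succeed only on an exact match
        return steps if form == words else None
    # Prune: terminals already produced must match the input prefix, and
    # every symbol yields at least one word, so the form cannot be longer.
    if form[:i] != words[:i] or len(form) > len(words):
        return None
    for rhs in GRAMMAR[symbol]:
        new_form = form[:i] + rhs + form[i + 1:]
        result = derive(new_form, words, steps + [new_form])
        if result is not None:
            return result
    return None  # every rule failed: the caller backs up


def top_down_parse(sentence):
    words = sentence.lower().split()
    return derive(["sentence"], words, [["sentence"]])
```

Calling top_down_parse("The man bites the dog") returns the same chain of forms as the derivation on the slide; the failed attempts (np ↔ n, art ↔ a, vp ↔ v, and so on) are explored and discarded along the way.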

12

Transition Net Parsers

Grammar: a set of finite state machines or transition nets

One for each nonterminal

Successful transition through the network == replacing the nonterminal by the rhs of a grammar rule

E.g. first arc in sentence ATN replaced by a path through the np ATN

13

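Transition networks (figure):

sentence: Sinitial –noun phrase→ –verb phrase→ Sfinal
np: Sinitial –art→ –n→ Sfinal, or Sinitial –n→ Sfinal
art: Sinitial –a→ Sfinal, or Sinitial –the→ Sfinal
n: Sinitial –man→ Sfinal, or Sinitial –dog→ Sfinal

The man bites the dog: the art network accepts "the" and the n network accepts "man", so the np network accepts "the man"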

14

How would you augment the grammar to allow "bite the dog"?

sentence: Sinitial –noun phrase→ –verb phrase→ Sfinal, plus a new arc Sinitial –verb phrase→ Sfinal

15

What paths would be examined to parse it?

Begin with sentence network and try to move along top arc

Go to np network

Try to move along bottom arc

Go to noun network

Try man. fail

Try dog. fail

Try to move along top arc

Go to article network

Try “a” fail

Try “the” fail

Fail article

Fail np network

sentence: Sinitial –noun phrase→ –verb phrase→ Sfinal, or Sinitial –verb phrase→ Sfinal

noun phrase: Sinitial –art→ –n→ Sfinal, or Sinitial –n→ Sfinal

16

Try to move along bottom arc

Go to vp network

Go to v network

Try likes. fail

Try bites. fail

Try bite. success

Go to np network

Go to art network

Try a. fail

Try the. succeed

Go to n network

Try man. fail

Try dog. succeed

Succeed (np)

Succeed (vp)

sentence: Sinitial –noun phrase→ –verb phrase→ Sfinal, or Sinitial –verb phrase→ Sfinal

verb phrase: Sinitial –v→ –np→ Sfinal, or Sinitial –v→ Sfinal

17

What Next?

Note, this does not build the parse tree, it just identifies correct sentences

To build a tree:

– Each terminal returns success and a tree with the terminal as a single node

– Each non-terminal network returns a set of subtrees whose root is the nonterminal symbol and whose leaves are the trees for the branches taken

Add the tree to the steps for "bite the dog"

18

Go to vp network

Go to v network

Try likes. fail

Try bites. fail

Try bite. success {return verb(bite)}

Go to np network

Go to art network

Try a. fail

Try the. succeed {return article(the)}

Go to n network

Try man. fail

Try dog. succeed {return noun(dog)}

Succeed (np)

Succeed (vp) {return vp(v(bite), np(art(the), n(dog)))}

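The tree-returning steps can be sketched as follows. Each network is written as a list of alternative paths of arc labels, which is an assumed encoding rather than the slides' own; it includes the extra sentence path and the word "bite" from the augmentation exercise. The parser keeps the first path that succeeds in each network, which is enough for this small grammar but is not full backtracking.

```python
# Each network is a list of alternative paths (sequences of arc labels).
NETWORKS = {
    "sentence": [["np", "vp"], ["vp"]],  # second path allows "bite the dog"
    "np": [["n"], ["art", "n"]],
    "vp": [["v", "np"], ["v"]],
    "art": [["a"], ["the"]],
    "n": [["man"], ["dog"]],
    "v": [["likes"], ["bites"], ["bite"]],
}


def parse(symbol, words, pos):
    """Traverse the network for symbol starting at words[pos].

    Returns (tree, next_position) on success and None on failure.
    A terminal's tree is the word itself; a nonterminal's tree is
    (symbol, [subtrees for the arcs taken]).
    """
    if symbol not in NETWORKS:  # terminal arc: compare with the input word
        if pos < len(words) and words[pos] == symbol:
            return symbol, pos + 1
        return None
    for path in NETWORKS[symbol]:
        children, p = [], pos
        for label in path:
            result = parse(label, words, p)
            if result is None:
                break  # this path fails; try the next one
            subtree, p = result
            children.append(subtree)
        else:
            return (symbol, children), p
    return None


def parse_sentence(text):
    """Return the parse tree, or None unless every word is consumed."""
    words = text.lower().split()
    result = parse("sentence", words, 0)
    if result is not None and result[1] == len(words):
        return result[0]
    return None
```

For "bite the dog" the returned tree nests v(bite) and np(art(the), n(dog)) under vp, as in the annotated trace, with vp as the only child of sentence.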

19

Pseudo-code for a transition network parser

Defined using two mutually recursive functions, parse and transition

function parse(grammar_symbol)

continued…
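The pseudo-code itself did not survive in this transcript, so what follows is a Python sketch of two mutually recursive functions with those names, not a transcription of it. The state names S0, S1 and Sf and the dictionary encoding of the networks are assumptions, and the input position is passed as an argument rather than kept in a shared pointer.

```python
# Each network maps a state to its outgoing arcs: (arc label, next state).
# "S0" is the initial state and "Sf" the final state of every network.
NETWORKS = {
    "sentence": {"S0": [("noun_phrase", "S1")], "S1": [("verb_phrase", "Sf")]},
    "noun_phrase": {"S0": [("article", "S1"), ("noun", "Sf")],
                    "S1": [("noun", "Sf")]},
    "verb_phrase": {"S0": [("verb", "S1"), ("verb", "Sf")],
                    "S1": [("noun_phrase", "Sf")]},
    "article": {"S0": [("a", "Sf"), ("the", "Sf")]},
    "noun": {"S0": [("man", "Sf"), ("dog", "Sf")]},
    "verb": {"S0": [("likes", "Sf"), ("bites", "Sf")]},
}


def parse(symbol, words, pos):
    """Return the position after recognising symbol at words[pos], or None."""
    if symbol not in NETWORKS:  # terminal: must equal the next input word
        if pos < len(words) and words[pos] == symbol:
            return pos + 1
        return None
    return transition(NETWORKS[symbol], "S0", words, pos)


def transition(network, state, words, pos):
    """Try each arc out of state; succeed once the final state is reached."""
    if state == "Sf":
        return pos
    for label, next_state in network[state]:
        after_arc = parse(label, words, pos)  # may recurse into another network
        if after_arc is not None:
            end = transition(network, next_state, words, after_arc)
            if end is not None:
                return end
    return None  # no arc works from this state


def recognise(text):
    words = text.lower().split()
    return parse("sentence", words, 0) == len(words)
```

On "Dog bites" the article network fails on "dog", the noun network then matches it, and the verb phrase succeeds through the verb-only arc.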

20

21

Fig 14.5 Trace of a transition network parse of the sentence “Dog bites.”

Figure labels: parse(sentence); transition(Noun_phrase); parse(Article): terminals don't match "Dog"; parse(Noun): terminal matches "Dog"

Red corresponds to function calls

22

14.2.3 The Chomsky Hierarchy and Context Sensitive Languages

Chomsky hierarchy:

– a hierarchy of languages by increasing linguistic complexity

– we will be concerned with context-free and context-sensitive

Context-free:

– one non-terminal symbol on the lhs of a rewrite rule

– problem: no requirement that dog is followed by bites, not bite

– e.g. no relation between dog and its appropriate verb because the two can't both be on lhs.

Is a programming language (C++) context-free?

– cast expressions

– template syntax

23

Context-Sensitive Grammars

More than one symbol on lhs → a noun and verb can be related

singular and plural are part of the spec via "number"

Example:

sentence ↔ noun_phrase verb_phrase

noun_phrase ↔ article number noun

article singular ↔ a singular

article singular ↔ the singular

article plural ↔ the plural

singular noun ↔ man singular

singular verb_phrase ↔ singular verb

singular verb ↔ bites

Parse: The man bites:

sentence

noun_phrase verb_phrase

article singular noun verb_phrase

The singular noun verb_phrase

The man singular verb_phrase

The man singular verb

The man bites

24

Data-Driven Parse?

Example:

1. sentence ↔ noun_phrase verb_phrase

2. noun_phrase ↔ article number noun

3. article singular ↔ a singular

4. article singular ↔ the singular

5. article plural ↔ the plural

6. singular noun ↔ man singular

7. singular verb_phrase ↔ singular verb

8. singular verb ↔ bites

The man bites:

Rule 8 matches bites

The man singular verb

Rule 7

The man singular verb_phrase

Rule 6

The singular noun verb_phrase

Rule 4

article singular noun verb_phrase

Rule 2

noun_phrase verb_phrase

Rule 1

sentence
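The data-driven parse can be reproduced by running the rules backwards: find a right-hand side in the current string and replace it with the left-hand side. The sketch below does this with a depth-first search, since some reductions lead to dead ends. Rule 2 is instantiated once per value of "number", which is an assumption about how the slide intends it; the names RULES and reduce_to_sentence are illustrative.

```python
# (rule number, left-hand side, right-hand side) for the eight rules.
RULES = [
    (1, ["sentence"], ["noun_phrase", "verb_phrase"]),
    (2, ["noun_phrase"], ["article", "singular", "noun"]),  # number = singular
    (2, ["noun_phrase"], ["article", "plural", "noun"]),    # number = plural
    (3, ["article", "singular"], ["a", "singular"]),
    (4, ["article", "singular"], ["the", "singular"]),
    (5, ["article", "plural"], ["the", "plural"]),
    (6, ["singular", "noun"], ["man", "singular"]),
    (7, ["singular", "verb_phrase"], ["singular", "verb"]),
    (8, ["singular", "verb"], ["bites"]),
]


def reduce_to_sentence(form, seen=None):
    """Search for reductions that turn form into ["sentence"].

    Returns a list of (rule number, resulting form) steps, or None.
    """
    if seen is None:
        seen = set()
    if form == ["sentence"]:
        return []
    key = tuple(form)
    if key in seen:  # already explored without success
        return None
    seen.add(key)
    for number, lhs, rhs in RULES:
        n = len(rhs)
        for i in range(len(form) - n + 1):
            if form[i:i + n] == rhs:
                new_form = form[:i] + lhs + form[i + n:]
                rest = reduce_to_sentence(new_form, seen)
                if rest is not None:
                    return [(number, new_form)] + rest
    return None


steps = reduce_to_sentence("the man bites".split())
```

The search first tries reducing "man singular" before the verb phrase has been built, reaches a dead end at noun_phrase verb, and backs up; the path it finally returns applies rules 8, 7, 6, 4, 2 and 1 in that order.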

25

Problems with Context-Sensitive

More rules

Obscured phrase structure, semantics mush in with syntax

Still no semantic representation

Next step: augmented transition network (ATN) parsers

Terminals and non-terminals represented as identifiers (frames) with attached features (slots)

Procedures attached to arcs of the network

– executed when ATN traverses an arc

– values assigned to grammatical features

– tests performed and transition can fail, e.g. if no number agreement
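A rough sketch of the idea: dictionary entries carry a number feature, each network is a function that assigns features to the frame it returns, and the agreement tests make a transition fail. The dictionary entries, the frame layout and the function names below are hypothetical and are not copied from the figures referenced on the following slides.

```python
# Hypothetical dictionary: each word has a part of speech and the set of
# numbers it is compatible with ("the" works with both).
DICTIONARY = {
    "a": {"pos": "article", "number": {"singular"}},
    "the": {"pos": "article", "number": {"singular", "plural"}},
    "dog": {"pos": "noun", "number": {"singular"}},
    "dogs": {"pos": "noun", "number": {"plural"}},
    "man": {"pos": "noun", "number": {"singular"}},
    "men": {"pos": "noun", "number": {"plural"}},
    "likes": {"pos": "verb", "number": {"singular"}},
    "like": {"pos": "verb", "number": {"plural"}},
    "bites": {"pos": "verb", "number": {"singular"}},
    "bite": {"pos": "verb", "number": {"plural"}},
}


def lookup(words, pos, part_of_speech):
    """Dictionary entry for words[pos] if it has the wanted part of speech."""
    if pos < len(words):
        entry = DICTIONARY.get(words[pos])
        if entry and entry["pos"] == part_of_speech:
            return entry
    return None


def noun_phrase(words, pos):
    article = lookup(words, pos, "article")
    if article:
        noun = lookup(words, pos + 1, "noun")
        if noun is None:
            return None
        number = article["number"] & noun["number"]  # arc test: agreement
        if not number:
            return None
        frame = {"type": "noun_phrase", "number": number,
                 "determiner": words[pos], "noun": words[pos + 1]}
        return frame, pos + 2
    noun = lookup(words, pos, "noun")
    if noun:
        frame = {"type": "noun_phrase", "number": noun["number"],
                 "determiner": None, "noun": words[pos]}
        return frame, pos + 1
    return None


def verb_phrase(words, pos):
    verb = lookup(words, pos, "verb")
    if verb is None:
        return None
    frame = {"type": "verb_phrase", "number": verb["number"],
             "verb": words[pos], "object": None}
    obj = noun_phrase(words, pos + 1)  # optional object noun phrase
    if obj:
        frame["object"] = obj[0]
        return frame, obj[1]
    return frame, pos + 1


def sentence(text):
    words = text.lower().split()
    np = noun_phrase(words, 0)
    if np is None:
        return None
    subject, pos = np
    vp = verb_phrase(words, pos)
    if vp is None:
        return None
    predicate, pos = vp
    # Final test: all input used, and subject and verb agree in number.
    if pos != len(words) or not (subject["number"] & predicate["number"]):
        return None
    return {"type": "sentence", "subject": subject, "verb_phrase": predicate}
```

"The dog likes a man" is accepted and yields nested frames, while "The dogs likes a man" fails at the sentence-level test and "a dogs bite" fails inside the noun phrase.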

26

Fig 14.7 Dictionary entries for a simple ATN

27

Fig 14.8 An ATN grammar that checks number agreement and builds a parse tree.

Slide annotations on the figure: ".NOUN-PHRASE", "checking for agreement →", "← typo"

28

Fig 14.8 continued from previous slide.

29

Fig 14.8 continued from previous slide.

30

Fig 14.9 Parse tree for “The dog likes a man”

31

Combining Syntax and Semantics

Build a conceptual graph from the parse tree

e.g. representation for sentence:

get representation for subject from the noun phrase

get representation for verb phrase

bind subject to agent of the graph for the verb phrase

When you reach a terminal, retrieve information from a knowledge base

– concepts, e.g. dog, man, as in a type hierarchy (next slide)

– conceptual relations as in next slide

32

Knowledge Base

Type hierarchy:

Frames for likes and bites

33

34

Parse tree → Semantic Representation

1. call sentence

2. sentence calls noun_phrase

3. noun_phrase calls noun

4. noun returns concept for dog (1)

5. article is definite →bind a marker to dog (2)

6. sentence calls verb_phrase

7. verb_phrase calls verb which retrieves frame for like (3)

8. verb_phrase calls noun_phrase which calls noun to retrieve man (4)

9. article is indefinite → leave concept generic (7)
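The steps above can be sketched as functions that walk the parse tree of "The dog likes a man". The type hierarchy, the verb frames and the tuple layout of the tree below are hypothetical stand-ins for the knowledge base figures, which are not reproduced in this transcript; only the control flow follows the listed steps.

```python
import itertools

# Hypothetical type hierarchy: each type points to its parent.
TYPE_PARENT = {"dog": "animate", "man": "animate",
               "animate": "entity", "entity": None}

# Hypothetical verb frames: relation name plus type constraints on the roles.
VERB_FRAMES = {
    "likes": {"relation": "like", "agent": "animate", "object": "entity"},
    "bites": {"relation": "bite", "agent": "dog", "object": "entity"},
}


def is_a(concept_type, ancestor):
    """True if concept_type equals ancestor or lies below it in the hierarchy."""
    while concept_type is not None:
        if concept_type == ancestor:
            return True
        concept_type = TYPE_PARENT[concept_type]
    return False


def noun_phrase(tree, markers):
    """tree is ("np", article, noun); returns a concept for the noun."""
    _, article, noun = tree
    concept = {"type": noun, "referent": None}  # generic concept
    if article == "the":  # definite article: bind an individual marker
        concept["referent"] = f"#{next(markers)}"
    return concept


def verb_phrase(tree, markers):
    """tree is ("vp", verb, object np or None); returns (graph, agent type)."""
    _, verb, object_tree = tree
    frame = VERB_FRAMES[verb]
    graph = {"relation": frame["relation"], "agent": None, "object": None}
    if object_tree is not None:
        obj = noun_phrase(object_tree, markers)
        if not is_a(obj["type"], frame["object"]):
            return None
        graph["object"] = obj
    return graph, frame["agent"]


def sentence(tree):
    """tree is ("sentence", np, vp); binds the subject to the verb's agent."""
    markers = itertools.count(1)
    _, np_tree, vp_tree = tree
    subject = noun_phrase(np_tree, markers)
    result = verb_phrase(vp_tree, markers)
    if result is None:
        return None
    graph, agent_type = result
    if not is_a(subject["type"], agent_type):
        return None  # subject violates the frame's constraint
    graph["agent"] = subject
    return graph
```

For the tree ("sentence", ("np", "the", "dog"), ("vp", "likes", ("np", "a", "man"))) the result binds dog #1 as the agent of like and leaves man generic. With these made-up frames, "the man bites the dog" is rejected because the agent of bite must be a dog.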

35

14.5 Natural Language Applications

Story understanding and question answering

– goal: a program that can read a story and answer questions

– why useful?

What can we do so far?

– parse and interpret a sentence

(perform network joins between semantic interpretation of the input and conceptual graphs in the knowledge base)

– can we expand this?

Yes

– answer questions

– scripts

– join semantic representations for multiple sentences

36

Answer Questions

Answer questions:

fido bit tony

What did fido bite tony with?

Scripts:

fido bit tony

tony has blood on his coat

A script might infer that the blood came from the bite.

37

Join Semantic Representations of Multiple Sentences

Given fido bit tony

fido has no teeth

What?

38

14.5.2 Database Front End

Information is structured

What is John Smith's salary?

select salary
from employee_salary
where employee = 'John Smith'

List the salaries of employees who work for Ed Angel

select salary
from employee_salary, manager_of_hire
where manager = 'Ed Angel' and
manager_of_hire.employee = employee_salary.employee
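To see the two queries run, the sketch below builds a throwaway in-memory SQLite database and maps each question to its query with a fixed template. The rows (including the salaries and the names Jane Doe and Pat Lee) are invented sample data, and the regular-expression templates are a crude stand-in for a real front end, which would derive the query from a parse of the question rather than from pattern matching.

```python
import re
import sqlite3


def build_demo_db():
    """Create the two tables named on the slide with made-up rows."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE employee_salary (employee TEXT, salary INTEGER);
        CREATE TABLE manager_of_hire (employee TEXT, manager TEXT);
        INSERT INTO employee_salary VALUES ('John Smith', 50000), ('Jane Doe', 60000);
        INSERT INTO manager_of_hire VALUES ('John Smith', 'Ed Angel'), ('Jane Doe', 'Pat Lee');
    """)
    return db


# Question template -> parameterised query (the two queries from the slide).
TEMPLATES = [
    (re.compile(r"what is (.+)'s salary\?", re.IGNORECASE),
     "SELECT salary FROM employee_salary WHERE employee = ?"),
    (re.compile(r"list the salaries of employees who work for (.+)", re.IGNORECASE),
     "SELECT employee_salary.salary FROM employee_salary, manager_of_hire "
     "WHERE manager_of_hire.manager = ? "
     "AND manager_of_hire.employee = employee_salary.employee"),
]


def answer(db, question):
    """Return the salaries answering the question, or None if no template fits."""
    for pattern, sql in TEMPLATES:
        match = pattern.fullmatch(question.strip())
        if match:
            return [row[0] for row in db.execute(sql, (match.group(1),))]
    return None
```

The name captured from the question is passed as a query parameter rather than pasted into the SQL text, which avoids quoting problems with names such as O'Brien.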

39

Entity-relationship diagrams Knowledge base entry

40

Database query from natural language input "Who hired john smith?"

41

14.5.3 Information extraction from the Web

42

Fig 14.20 An architecture for information extraction, from Cardie (1997).

As on preceding slide