Grammar Writing
Lecture 5
11-721 Grammars and Lexicons
Teruko Mitamura
[email protected]
www.cs.cmu.edu/~teruko
Carnegie Mellon School of Computer Science, LTI
Copyright © 2007, Carnegie Mellon. All Rights Reserved.


Schedule: November 19, 2007
• Review of “bird.gra”
• Review of “bird2.gra”
• Character-based Parsing vs. Word-based Parsing
• Morphology
• Start a new grammar exercise (4)


Bird.gra Review: General Problems
• Incomplete F-structure
• Incorrect F-structure
• Not enough constraints in the rule
• Unification problems


Incomplete F-structures
Determiner information is missing from the f-structure.
“A bird flies” and “The bird flies” showed the same F-structure:
((subj ((agreement 3sg)
        (number sg)
        (root bird)))
 (form present)
 (agreement 3sg)
 (root fly))


Complete F-structure
• Contains all the necessary grammatical information
• Should make it possible to reconstruct the original sentence
“A bird flies”
((SUBJ ((NUMBER SG)
        (AGREEMENT 3SG)
        (ROOT BIRD)
        (DET ((NUMBER SG)
              (DEFINITENESS -)
              (ROOT A)))))
 (FORM PRESENT)
 (AGREEMENT 3SG)
 (ROOT FLY))
• Some feature structures are redundant


Incomplete F-structures (2)
Grammar problem:
(<NP> <==> (<DET> <N>)
      (((x1 number) = (x2 number))
       (x0 = x2)))
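A minimal Python sketch of why the rule above drops the determiner, assuming f-structures are modelled as plain dictionaries. The function names and the `det` equation in the second rule are illustrative assumptions (the `det` slot follows the complete f-structure for “A bird flies”), not the course's reference solution.

```python
# Lexical f-structures modelled as plain dicts (an assumption of this sketch).
A = {"root": "a", "number": "sg", "definiteness": "-"}
THE = {"root": "the", "definiteness": "+"}
BIRD = {"root": "bird", "number": "sg", "agreement": "3sg"}


def numbers_agree(det, noun):
    """((x1 number) = (x2 number)): an unspecified number matches anything."""
    return det.get("number") in (None, noun.get("number"))


def np_lossy(det, noun):
    """Rule as written: number test, then (x0 = x2). x1 never reaches x0."""
    if not numbers_agree(det, noun):
        return None
    return dict(noun)


def np_with_det(det, noun):
    """Hypothetical repair: also store the determiner, ((x0 det) = x1)."""
    if not numbers_agree(det, noun):
        return None
    det_fs = dict(det)
    if "number" in noun:
        # Simplified side effect of unifying the two number values.
        det_fs["number"] = noun["number"]
    return {**noun, "det": det_fs}
```

With these definitions, `np_lossy(A, BIRD)` and `np_lossy(THE, BIRD)` are identical, while the `np_with_det` results differ in their `det` value.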


Not Enough Constraints
A singular noun without a determiner can become an NP, so “Bird flies” may parse.
(<NP> <==> (<N>)
      ((x0 = x1)))
Problem: No constraint on number. Add a constraint equation:
((x1 number) =c pl)
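The difference between `=` and `=c` can be sketched in Python, assuming the usual reading in which `=` may add a missing value while `=c` only checks a value that is already present. The helper names are hypothetical and the f-structures are plain dictionaries.

```python
def unify_value(fs, feature, value):
    """Plain '=': succeeds if the feature is absent (and adds it) or already equal."""
    if feature not in fs:
        return {**fs, feature: value}
    return dict(fs) if fs[feature] == value else None


def constrain_value(fs, feature, value):
    """'=c': succeeds only if the feature is already present with that value."""
    return dict(fs) if fs.get(feature) == value else None


def bare_np(noun):
    """<NP> <==> (<N>) with ((x1 number) =c pl) tested before (x0 = x1)."""
    return constrain_value(noun, "number", "pl")


BIRD = {"root": "bird", "number": "sg", "agreement": "3sg"}
BIRDS = {"root": "bird", "number": "pl", "agreement": "pl"}
```

Under this sketch `bare_np(BIRD)` fails and `bare_np(BIRDS)` succeeds, so “Bird flies” no longer gets an NP while “Birds fly” still does.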


Be Aware of Unification
(<NP> <==> (<DET> <N>)
      ((x0 = x1)
       (x0 = x2)))
(<DET> <--> (t h e)
       (((x0 definiteness) = +)))
(<N> <--> (b i r d)
     (((x0 root) = bird)
      ((x0 number) = sg)
      ((x0 agreement) = 3sg)))


Be Aware of Unification (cont.)
(<NP> <==> (<DET> <N>)
      ((x0 = x1)
       (x0 = x2)))
(<DET> <--> (t h e)
       (((x0 definiteness) = +)
        ((x0 root) = the)))
(<N> <--> (b i r d)
     (((x0 root) = bird)
      ((x0 number) = sg)
      ((x0 agreement) = 3sg)))
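A small Python sketch of what the two versions of the determiner entry do under standard unification, assuming f-structures are nested dictionaries with atomic leaves. The `unify` and `np_rule` functions are illustrative and are not the parser's actual implementation.

```python
def unify(a, b):
    """Unify two f-structures (nested dicts, atomic leaves). None means failure."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:
                    return None
                result[key] = merged
            else:
                result[key] = value
        return result
    return a if a == b else None


def np_rule(det, noun):
    """<NP> <==> (<DET> <N>) with (x0 = x1) followed by (x0 = x2)."""
    x0 = unify({}, det)        # (x0 = x1)
    if x0 is None:
        return None
    return unify(x0, noun)     # (x0 = x2)


THE_V1 = {"definiteness": "+"}                  # first version of the entry
THE_V2 = {"definiteness": "+", "root": "the"}   # second version, with a root
BIRD = {"root": "bird", "number": "sg", "agreement": "3sg"}
```

In this sketch `np_rule(THE_V1, BIRD)` merges the determiner's features into the noun's, while `np_rule(THE_V2, BIRD)` fails because `root` cannot be both `the` and `bird`.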


Frequently Seen Problems
• Test equations should come before action equations. Order seen in submissions:
    (x0 = x2) ;action
    ((x1 agreement) = (x2 agreement)) ;test
• No “root” info in f-structure
• When submitted:
  – Write your full name in the grammar
  – Write more comments in the grammar
  – Turn off (dmode 2) or trace
• Print out the grammar and results files.
  – lpr -P<printer name> <filename>
    e.g. lpr -Pshakthi bird.gra


Review: Bird 2 Grammar
• Goal: To learn more about unification
• Some Problems:
  – Semantic features that do not scale, e.g. ((x0 semclass) = Morris)
  – Incomplete f-structures
  – Incorrect f-structures


Grammar Exercise (3) Test Sentences
"A bird flies"
"Birds fly"
"The bird flies"
"The birds fly"
"The cat runs"
"The cats run"
"Morris runs"
"Morris meows"
"Cats meow"
"A cat meows"
"The cats meow"
"The penguins run"
"A penguin runs"


Grammar Exercise (3) Test Sentences (fail)
"A bird fly"
"A birds flies"
"Birds flies"
"Bird flies"
"The bird fly"
"The birds flies"


Test Sentences (fail)
"The cat flies"
"The cats fly"
"The cat run"
"A cat meow"
"Morris meow"
"Morris flies"
"The bird meows"
"A penguin meows"
"Penguins meow"
"The penguin flies"


Semantic Category
Bird     fly, run, *meow
Cat      *fly, run, meow
Penguin  *fly, run, *meow


Semantic Features (noun)
Bird     (sem-class bird)
Cat      (sem-class cat)
Penguin  (sem-class penguin)
---------------------------------
         (animate +)


Semantic Features (verb)
Fly   ((subj sem-class) = bird)
Meow  ((subj sem-class) = cat)
Run   ((subj animate) = +)
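The noun and verb features can be checked against the semantic category table with a short Python sketch. The dictionary layout and the `compatible` helper are assumptions made for illustration; only the feature names and values come from the slides.

```python
# Noun features: each noun has its own sem-class, and all three are animate.
NOUNS = {
    "bird": {"sem-class": "bird", "animate": "+"},
    "cat": {"sem-class": "cat", "animate": "+"},
    "penguin": {"sem-class": "penguin", "animate": "+"},
}

# Verb features: what each verb requires of its subject.
VERBS = {
    "fly": {"sem-class": "bird"},
    "meow": {"sem-class": "cat"},
    "run": {"animate": "+"},
}


def compatible(noun, verb):
    """True if every subject requirement of the verb matches the noun's features."""
    subject = NOUNS[noun]
    return all(subject.get(feature) == value
               for feature, value in VERBS[verb].items())


# Rebuild the table: the verbs each noun can take as a subject.
TABLE = {noun: [verb for verb in VERBS if compatible(noun, verb)]
         for noun in NOUNS}
```

Because `run` only asks for `(animate +)`, it accepts all three nouns, whereas `fly` and `meow` each pick out a single sem-class.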


Unification
(<N> <--> (c a t s)
     (((x0 root) = cat)
      ((x0 number) = pl)
      ((x0 animate) = +)
      ((x0 sem-class) = cat)
      ((x0 agreement) = pl)))

(<V> <--> (m e o w)
     (((x0 root) = meow)
      ((x0 agreement) = pl)
      ((x0 subj animate) = +)
      ((x0 subj sem-class) = cat)
      ((x0 form) = present)))


Unification
(<S> <==> (<NP> <VP>)
     (((x1 agreement) = (x2 agreement))
      ((x0 subj) = x1)
      (x0 = x2)))
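A Python sketch of how the S rule combines the entries for “cats” and “meow”, assuming nested dictionaries for f-structures and assuming the VP simply passes the verb's f-structure up (no VP rule is shown on the slides). The `BIRDS` entry is hypothetical and is included only to show a sem-class clash.

```python
def unify(a, b):
    """Unify two f-structures (nested dicts, atomic leaves). None means failure."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:
                    return None
                result[key] = merged
            else:
                result[key] = value
        return result
    return a if a == b else None


def s_rule(np, vp):
    """<S> <==> (<NP> <VP>): agreement test, then (x0 subj) = x1 and x0 = x2."""
    # Simplified test: only compares agreement when both sides specify it.
    if "agreement" in np and "agreement" in vp and np["agreement"] != vp["agreement"]:
        return None
    # The verb's (subj ...) requirements unify with the NP's features here.
    return unify({"subj": np}, vp)


CATS = {"root": "cat", "number": "pl", "animate": "+",
        "sem-class": "cat", "agreement": "pl"}
MEOW = {"root": "meow", "agreement": "pl", "form": "present",
        "subj": {"animate": "+", "sem-class": "cat"}}
# Hypothetical entry, by analogy with CATS.
BIRDS = {"root": "bird", "number": "pl", "animate": "+",
         "sem-class": "bird", "agreement": "pl"}
```

In this sketch `s_rule(CATS, MEOW)` yields an f-structure whose `subj` holds all of the noun's features, and `s_rule(BIRDS, MEOW)` fails on `sem-class`.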


Character-based Parsing
Morphological rules can be parsed.
Input string:
  tabeta  eat-past
  taberu  eat-present

(<v-class-1> <--> (<v-class-1> r u)
             ((x0 = x1)
              ((x0 tense) = present)))

(<v-class-1> <--> (<v-class-1> t a)
             ((x0 = x1)
              ((x0 tense) = past)))

(<v-class-1> <--> (t a b e)
             (((x0 root) = taberu)))


Japanese morphology
tabe-sase-rare-ta
eat-caus-pass-past

(<v-class-1> <--> (t a b e)
             (((x0 root) = taberu)))
(<v-class-1> <--> (<v-class-1> s a s e)
             (((x1 pass) = *undefined*)
              ((x1 tense) = *undefined*)
              (x0 = x1)
              ((x0 caus) = +)))
(<v-class-1> <--> (<v-class-1> r a r e)
             (((x1 tense) = *undefined*)
              (x0 = x1)
              ((x0 pass) = +)))
(<v-class-1> <--> (<v-class-1> t a)
             ((x0 = x1)
              ((x0 tense) = past)))

tabeta              eat-past
tabe-sase-ta        eat-caus-past
tabe-rare-ta        eat-pass-past
tabe-sase-rare-ta   eat-caus-pass-past

*tabe-rare-sase-ta  eat-pass-caus-past
*tabe-ta-sase-rare  eat-past-caus-pass
*tabe-ta-rare-sase  eat-past-pass-caus
*tabe-rare-ta-sase  eat-pass-past-caus
*tabe-sase-ta-rare  eat-caus-past-pass
*tabe-rare-sase     eat-pass-caus
*tabe-ta-sase       eat-past-caus
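The way the `*undefined*` tests enforce suffix order can be sketched in Python by stripping suffixes from the right, mirroring the left-recursive rules. This is a simplified model with hypothetical names (`SUFFIXES`, `STEMS`, `parse_verb`), not the parser itself; input strings are written without the hyphens used above to mark morpheme boundaries.

```python
# (suffix, features that must be *undefined* on x1, feature set on x0)
SUFFIXES = [
    ("sase", ("pass", "tense"), ("caus", "+")),
    ("rare", ("tense",), ("pass", "+")),
    ("ta", (), ("tense", "past")),
]
STEMS = {"tabe": {"root": "taberu"}}


def parse_verb(chars):
    """Return the f-structure for a verb string, or None if no rule sequence fits."""
    if chars in STEMS:
        return dict(STEMS[chars])
    for suffix, must_be_undefined, (feature, value) in SUFFIXES:
        if not chars.endswith(suffix):
            continue
        inner = parse_verb(chars[: -len(suffix)])   # f-structure of x1
        if inner is None:
            continue
        if any(name in inner for name in must_be_undefined):
            continue                                 # an *undefined* test failed
        if inner.get(feature, value) != value:
            continue                                 # plain '=' would clash
        return {**inner, feature: value}             # x0 = x1 plus the new feature
    return None
```

For example, `parse_verb("tabesaserareta")` returns a structure with `caus`, `pass`, and `tense` set, while `parse_verb("taberaresaseta")` returns `None` because `sase` requires `pass` to be undefined.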


Word-based Parsing
(<N> <--> (sushi)
     (((x0 root) = sushi)))
Instead of:
(<N> <--> (s u s h i)
     (((x0 root) = sushi)))
For parsing: (parse-list list of symbols $)
e.g. (parse-list '(a bird flies $))


Grammar Exercise (4)
• Start grammar exercise (4): mlb.gra
• Files are in /afs/cs/project/cmt-55/lti/Lab/Modules/GNL-721/2007/
• Test file: mlb-test.lisp


Next Class: Nov 26
• Return bird2.gra
• Return Assignment #1
• Grammar Writing Project Evaluation Criteria
• Finish mlb.gra
• Start a new exercise