Forest-based Semantic Role Labeling
Hao Xiong, Haitao Mi, Yang Liu and Qun Liu
Institute of Computing Technology, Chinese Academy of Sciences
AAAI 2010, Atlanta, 7/15/10


Page 1: Forest-based  Semantic Role Labeling

Forest-based Semantic Role Labeling

Hao Xiong, Haitao Mi, Yang Liu and Qun Liu

Institute of Computing Technology, Chinese Academy of Sciences
AAAI 2010, Atlanta, 7/15/10

Page 2: Forest-based  Semantic Role Labeling

Semantic Role Labeling

Given a sentence and its verbs
  Identify the arguments of the verbs
  Assign semantic labels (the roles they play)

Example, with the verb "sold" as predicate:
  This company: Agent
  last year: ArgMod-TeMPoral
  1000 cars: Patient
  in the U.S.: ArgMod-LOCation

PropBank (Kingsbury and Palmer 2002)

Page 3: Forest-based  Semantic Role Labeling

One Conventional Approach

the role of Celimene is played by Kim Cattrall

Roles marked on the sentence, left to right: Patient, Agent

Page 4: Forest-based  Semantic Role Labeling

One Conventional Approach

the role of Celimene is played by Kim Cattrall

Roles marked on the sentence, left to right: Patient, Agent

[Figure: parse tree of the sentence; node labels shown: S, NP, PP, VP, AUX, VBN, PP, VP, NP]

Page 5: Forest-based  Semantic Role Labeling

One Conventional Approach

the role of Celimene is played by Kim Cattrall

Roles marked on the sentence, left to right: Patient, Agent

[Figure: the same parse tree with one node questioned ("VP?"); node labels shown: S, NP, PP, VP, AUX, VBN, PP; annotation: more than 15%]

Page 6: Forest-based  Semantic Role Labeling

Solution

k-best parses
  limited scope: 2^5 < 50 < 2^6
  too much redundancy

[Figure: the k-best parse trees of the sentence, numbered 1, 2, 3, …, k]
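
The inequality on this slide can be read as a counting argument. Under the assumption (not stated on the slide) that a sentence contains n independent binary attachment ambiguities, it has 2^n parses, so a 50-best list can enumerate every combination for at most 5 such ambiguities. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def fully_covered_ambiguities(k):
    """Largest n such that 2**n <= k.

    Interpretation (an assumption, not from the slides): n independent
    binary ambiguities yield 2**n parses, so a k-best list can list all
    of them only while 2**n <= k.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return k.bit_length() - 1


n = fully_covered_ambiguities(50)
print(n, 2 ** n, 2 ** (n + 1))
```

Doubling k to 100 only raises the result from 5 to 6, which is one way to see why growing the k-best list widens the scope so slowly.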

Page 7: Forest-based  Semantic Role Labeling

Our Solution: Forest

A compact representation of many parses
By sharing common sub-derivations
Polynomial-space encoding of an exponentially large set

[Figure: a packed parse forest (left) is unpacked into the individual parse trees it encodes (right)]

Page 8: Forest-based  Semantic Role Labeling

Our Solution: Forest

A compact representation of many parses
By sharing common sub-derivations
Polynomial-space encoding of an exponentially large set

[Figure: the packed parse forest; node labels shown: S, NP, PP, VP, AUX, VBN, PP, NP, VP]
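
The packing idea can be sketched with a small hypergraph. The forest below is a hand-built guess at the figure (the figure itself is not preserved): node names such as `NP[0,2]` carry a label and a word span, and the two hyperedges under `S[0,9]` stand for the two PP attachments. `count_trees` and `synthetic_forest` are hypothetical names used only for this illustration.

```python
from functools import lru_cache
from math import prod

# node -> list of hyperedges; a hyperedge is the tuple of its tail (child) nodes.
# An empty tuple means the node is derived directly from the words it covers.
TOY_FOREST = {
    "S[0,9]": [("NP[0,4]", "VP[4,9]"),              # PP attached inside the NP
               ("NP[0,2]", "PP[2,4]", "VP[4,9]")],  # PP attached to S
    "NP[0,4]": [("NP[0,2]", "PP[2,4]")],
    "NP[0,2]": [()],
    "PP[2,4]": [()],
    "VP[4,9]": [("AUX[4,5]", "VP[5,9]")],
    "AUX[4,5]": [()],
    "VP[5,9]": [("VBN[5,6]", "PP[6,9]")],
    "VBN[5,6]": [()],
    "PP[6,9]": [()],
}


def count_trees(forest, root):
    """Number of distinct trees encoded below root, counted without unpacking."""
    @lru_cache(maxsize=None)
    def count(node):
        return sum(prod(count(t) for t in tails) for tails in forest[node])
    return count(root)


def synthetic_forest(n):
    """A root with n children, each of which has two alternative derivations."""
    forest = {"ROOT": [tuple(f"X{i}" for i in range(n))]}
    for i in range(n):
        forest[f"X{i}"] = [(f"a{i}",), (f"b{i}",)]
        forest[f"a{i}"] = [()]
        forest[f"b{i}"] = [()]
    return forest


print(count_trees(TOY_FOREST, "S[0,9]"))
big = synthetic_forest(20)
print(len(big), count_trees(big, "ROOT"))
```

The synthetic case shows the polynomial-versus-exponential gap: the node count grows linearly in n (3n + 1 nodes) while the number of encoded trees grows as 2^n.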

Page 9: Forest-based  Semantic Role Labeling

Outline

Tree-based Semantic Role Labeling
  Parsing
  Selecting candidates
  Extracting features
  Classifying
Forest-based Semantic Role Labeling
Experiments
Conclusion

Page 10: Forest-based  Semantic Role Labeling

Parsing

(S (NP (DT This) (NN company))
   (NP (JJ last) (NN year))
   (VP (VBD sold)
       (NP (CD 1000) (NNS cars))
       (PP (IN in)
           (NP (DT the) (NNP U.S.)))))

Page 11: Forest-based  Semantic Role Labeling

Selecting Candidates

[Figure: the parse tree of "This company last year sold 1000 cars in the U.S." with the predicate "sold" and the candidate constituents highlighted]

Page 12: Forest-based  Semantic Role Labeling

Extracting Features

[Figure: the parse tree of "This company last year sold 1000 cars in the U.S." with the features of one candidate listed beside it]

Path to the predicate: NP↑S↓VP↓VBN

Page 13: Forest-based  Semantic Role Labeling

Extracting Features

[Figure: the parse tree of "This company last year sold 1000 cars in the U.S." with the features of one candidate listed beside it]

Position: left

Features so far: NP↑S↓VP↓VBN, left

Page 14: Forest-based  Semantic Role Labeling

Extracting Features

[Figure: the parse tree of "This company last year sold 1000 cars in the U.S." with the features of one candidate listed beside it]

Head word: company

Features so far: NP↑S↓VP↓VBN, left, company

Page 15: Forest-based  Semantic Role Labeling

Extracting Features

[Figure: the parse tree of "This company last year sold 1000 cars in the U.S." with the features of one candidate listed beside it]

Head POS tag: NN

Features so far: NP↑S↓VP↓VBN, left, company, NN, …
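
The four features built up on the preceding slides can be sketched on a nested-tuple version of the example tree. Everything here is illustrative: nodes are addressed by hypothetical child-index tuples, and the head is found with a crude "take the rightmost child" rule, which is an assumption and not the head-finding procedure of the original system. The sketch reads the predicate tag from the tree, so its path ends in VBD.

```python
# (label, child, child, ...) for phrases; (tag, word) for preterminals.
TREE = ("S",
        ("NP", ("DT", "This"), ("NN", "company")),
        ("NP", ("JJ", "last"), ("NN", "year")),
        ("VP", ("VBD", "sold"),
         ("NP", ("CD", "1000"), ("NNS", "cars")),
         ("PP", ("IN", "in"), ("NP", ("DT", "the"), ("NNP", "U.S.")))))


def is_preterminal(node):
    return len(node) == 2 and isinstance(node[1], str)


def node_at(tree, addr):
    """Follow a tuple of child indices down from the root."""
    for i in addr:
        tree = tree[i + 1]
    return tree


def labels_along(tree, addr):
    """Labels from the root down to the addressed node, inclusive."""
    labels = [tree[0]]
    for i in addr:
        tree = tree[i + 1]
        labels.append(tree[0])
    return labels


def path_feature(tree, cand, pred):
    """Labels going up from the candidate to the lowest common ancestor, then down."""
    common = 0
    while common < min(len(cand), len(pred)) and cand[common] == pred[common]:
        common += 1
    up = labels_along(tree, cand)[common:][::-1]
    down = labels_along(tree, pred)[common + 1:]
    return "↑".join(up) + "".join("↓" + label for label in down)


def head(node):
    """Toy head rule (an assumption): descend through the rightmost child."""
    while not is_preterminal(node):
        node = node[-1]
    return node


def extract_features(tree, cand, pred):
    head_pos, head_word = head(node_at(tree, cand))
    return {
        "path": path_feature(tree, cand, pred),
        # For non-overlapping nodes, address order equals left-to-right order.
        "position": "left" if cand < pred else "right",
        "head_word": head_word,
        "head_pos": head_pos,
    }


# Candidate "This company" is child 0 of S; the predicate "sold" is child 0 of the VP.
print(extract_features(TREE, (0,), (2, 0)))
```

A real system would replace `head` with proper head-percolation rules; the rightmost-child shortcut would, for instance, pick "U.S." rather than "in" as the head of the PP.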

Page 16: Forest-based  Semantic Role Labeling

Classifying

Computing scores using a trained classifier

[Figure: the parse tree of "This company last year sold 1000 cars in the U.S." with one box of scores attached to each candidate constituent]

S(Agent)=0.1    S(Patient)=0.1  S(None)=0.5
S(AM-TMP)=0.9   S(Patient)=0.1  S(None)=0.1
S(Agent)=0.2    S(Patient)=0.8  S(None)=0.1
S(Agent)=0.8    S(Patient)=0.1  S(None)=0.1
S(AM-LOC)=0.9   S(Agent)=0.1    S(None)=0.1

Page 17: Forest-based  Semantic Role Labeling

Classifying

Best score for each constituent
Simply sort them
Choose the best label sequence

[Figure: the parse tree of "This company last year sold 1000 cars in the U.S." with only the best score kept for each candidate constituent]

S(Agent)=0.8
S(AM-LOC)=0.9
S(None)=0.5
S(AM-TMP)=0.9
S(Patient)=0.8

Page 18: Forest-based  Semantic Role Labeling

Classifying

[Figure: the parse tree of "This company last year sold 1000 cars in the U.S." with the final labels attached]

This company: Agent
last year: AM-TMP
sold: V
1000 cars: Patient
in the U.S.: AM-LOC
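
The three classifying slides can be read as: keep the best label per constituent, sort by score, and pick labels greedily. The sketch below follows that reading under several assumptions: the pairing of score boxes with constituents is inferred from the final labeling, the box whose best label is None is assigned to a hypothetical extra candidate "the U.S.", and overlapping candidates are skipped. Spans are token offsets into the example sentence.

```python
# (start, end) token span -> label scores; the pairing is an assumption.
CANDIDATES = {
    (0, 2): {"Agent": 0.8, "Patient": 0.1, "None": 0.1},    # This company
    (2, 4): {"AM-TMP": 0.9, "Patient": 0.1, "None": 0.1},   # last year
    (5, 7): {"Agent": 0.2, "Patient": 0.8, "None": 0.1},    # 1000 cars
    (7, 10): {"AM-LOC": 0.9, "Agent": 0.1, "None": 0.1},    # in the U.S.
    (8, 10): {"Agent": 0.1, "Patient": 0.1, "None": 0.5},   # the U.S. (hypothetical)
}


def overlaps(a, b):
    """True if two half-open token spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def choose_labels(candidates):
    # Step 1: best-scoring label for each constituent.
    best = []
    for span, label_scores in candidates.items():
        label = max(label_scores, key=label_scores.get)
        best.append((label_scores[label], span, label))
    # Step 2: sort by score, highest first.
    best.sort(key=lambda item: item[0], reverse=True)
    # Step 3: greedily keep non-None labels whose spans do not overlap.
    chosen = {}
    for _score, span, label in best:
        if label == "None":
            continue
        if any(overlaps(span, other) for other in chosen):
            continue
        chosen[span] = label
    return dict(sorted(chosen.items()))


print(choose_labels(CANDIDATES))
```

With these scores the result matches the labeling on the slide: Agent, AM-TMP, Patient and AM-LOC, in sentence order.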

Page 19: Forest-based  Semantic Role Labeling

Outline

Tree-based Semantic Role Labeling
Forest-based Semantic Role Labeling
  Parsing into a forest
  Selecting candidates
  Extracting features on forest
  Classifying
Experiments
Conclusion

Page 20: Forest-based  Semantic Role Labeling

Forest

the role of Celimene is played by Kim Cattrall

[Figure: the parse forest of the sentence drawn as a hyper-graph, with callouts pointing at a node and at a hyper-edge; node labels shown: S, NP, PP, VP, AUX, VBN, PP, NP, VP]

Page 21: Forest-based  Semantic Role Labeling

Selecting Candidates

the role of Celimene is played by Kim Cattrall

[Figure: the parse forest of the sentence with the candidate nodes highlighted; node labels shown: S, NP, PP, VP, AUX, VBN, PP, NP, VP]

Page 22: Forest-based  Semantic Role Labeling

Extracting features

Path to the predicate

the role of Celimene is played by Kim Cattrall

[Figure: the parse forest of the sentence with one path from a candidate node to the predicate traced]

NP↑NP↑S↓VP↓VP↓VBN

Page 23: Forest-based  Semantic Role Labeling

Extracting features

Path to the predicate

the role of Celimene is played by Kim Cattrall

[Figure: the parse forest of the sentence with two alternative paths from a candidate node to the predicate traced]

NP↑S↓VP↓VP↓VBN (shortest)
NP↑NP↑S↓VP↓VP↓VBN

Page 24: Forest-based  Semantic Role Labeling

Extracting features

Parent Label

the role of Celimene is played by Kim Cattrall

[Figure: the parse forest of the sentence with the path NP↑S↓VP↓VP↓VBN traced]

Page 25: Forest-based  Semantic Role Labeling

Extracting features

Parent Label: taken in the shortest path

the role of Celimene is played by Kim Cattrall

[Figure: the part of the parse forest lying on the shortest path NP↑S↓VP↓VP↓VBN]
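
In a forest a candidate node can reach the predicate along several paths, and the slides keep the shortest one. One way to sketch this is a breadth-first search over the hyper-graph that only allows paths of the form "up, then down" and turns around inside a single hyper-edge. The forest and its node names are a hand-built guess at the figure, and the search ignores any further consistency checks a full system might apply; both are assumptions.

```python
from collections import deque

# node -> list of hyperedges (tuples of tail nodes); () marks a leaf-level node.
FOREST = {
    "S[0,9]": [("NP[0,4]", "VP[4,9]"), ("NP[0,2]", "PP[2,4]", "VP[4,9]")],
    "NP[0,4]": [("NP[0,2]", "PP[2,4]")],
    "NP[0,2]": [()], "PP[2,4]": [()],
    "VP[4,9]": [("AUX[4,5]", "VP[5,9]")],
    "AUX[4,5]": [()],
    "VP[5,9]": [("VBN[5,6]", "PP[6,9]")],
    "VBN[5,6]": [()], "PP[6,9]": [()],
}


def label(node):
    return node.split("[")[0]


def shortest_path(forest, cand, pred):
    """Shortest up-then-down label path from cand to pred, or None."""
    parents = {}
    for head, edges in forest.items():
        for tails in edges:
            for t in tails:
                parents.setdefault(t, []).append((head, tails))
    # State: (node, still going up?, siblings reachable if we turn here).
    start = (cand, True, ())
    queue = deque([(start, label(cand))])
    seen = {start}
    while queue:
        (node, going_up, siblings), path = queue.popleft()
        if node == pred:
            return path
        moves = []
        if going_up:
            for head, tails in parents.get(node, []):
                sibs = tuple(t for t in tails if t != node)
                moves.append(((head, True, sibs), path + "↑" + label(head)))
            for t in siblings:  # turn: descend within the hyperedge we came up
                moves.append(((t, False, ()), path + "↓" + label(t)))
        else:
            for tails in forest[node]:
                for t in tails:
                    moves.append(((t, False, ()), path + "↓" + label(t)))
        for state, new_path in moves:
            if state not in seen:
                seen.add(state)
                queue.append((state, new_path))
    return None


print(shortest_path(FOREST, "NP[0,2]", "VBN[5,6]"))
```

Removing the second hyper-edge under `S[0,9]` leaves a single tree, and the same call then returns the longer path NP↑NP↑S↓VP↓VP↓VBN, which is the tree-based feature value.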

Page 26: Forest-based  Semantic Role Labeling

New Features

Parsing score (fractional value (Mi et al., 2008))
  Inside-outside
  Marginal prob.

the role of Celimene is played by Kim Cattrall

[Figure: the parse forest of the sentence with the path NP↑S↓VP↓VP↓VBN traced and one NP node annotated with its value f(NP3)]
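
A node's marginal probability can be computed with the inside-outside recursion on the hyper-graph. The sketch below is a generic version of that recursion, not the exact definition of the fractional value in Mi et al. (2008); the forest, the node names and the hyper-edge weights 0.3 and 0.2 are made up for illustration.

```python
from math import prod

# node -> list of (weight, tails); weights are hypothetical rule scores.
WEIGHTED_FOREST = {
    "S[0,9]": [(0.3, ("NP[0,4]", "VP[4,9]")),
               (0.2, ("NP[0,2]", "PP[2,4]", "VP[4,9]"))],
    "NP[0,4]": [(1.0, ("NP[0,2]", "PP[2,4]"))],
    "NP[0,2]": [(1.0, ())], "PP[2,4]": [(1.0, ())],
    "VP[4,9]": [(1.0, ("AUX[4,5]", "VP[5,9]"))],
    "AUX[4,5]": [(1.0, ())],
    "VP[5,9]": [(1.0, ("VBN[5,6]", "PP[6,9]"))],
    "VBN[5,6]": [(1.0, ())], "PP[6,9]": [(1.0, ())],
}


def inside_outside(forest, root):
    # Post-order traversal: every tail node appears before its head.
    order, visited = [], set()

    def visit(node):
        if node in visited:
            return
        visited.add(node)
        for _, tails in forest[node]:
            for t in tails:
                visit(t)
        order.append(node)

    visit(root)

    # Inside: total weight of all sub-derivations rooted at the node.
    inside = {}
    for node in order:
        inside[node] = sum(w * prod(inside[t] for t in tails)
                           for w, tails in forest[node])

    # Outside: total weight of everything around the node, heads first.
    outside = {node: 0.0 for node in order}
    outside[root] = 1.0
    for node in reversed(order):
        for w, tails in forest[node]:
            for i, t in enumerate(tails):
                rest = prod(inside[u] for j, u in enumerate(tails) if j != i)
                outside[t] += outside[node] * w * rest
    return inside, outside


def node_marginals(forest, root):
    """Share of the total derivation weight that passes through each node."""
    inside, outside = inside_outside(forest, root)
    return {n: inside[n] * outside[n] / inside[root] for n in inside}


marginals = node_marginals(WEIGHTED_FOREST, "S[0,9]")
print(round(marginals["NP[0,4]"], 3), round(marginals["NP[0,2]"], 3))
```

Here `NP[0,4]` only appears in the derivation with weight 0.3 out of a total of 0.5, while `NP[0,2]` appears in both derivations, so a classifier fed these values can tell a contested constituent from an uncontested one.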

Page 27: Forest-based  Semantic Role Labeling

Classifying

the role of Celimene is played by Kim Cattrall

[Figure: the parse forest of the sentence with one box of scores attached to each candidate node]

S(Patient)=0.8  S(Agent)=0.1    S(None)=0.2
S(Patient)=0.5  S(Agent)=0.1    S(None)=0.3
S(Agent)=0.8    S(Patient)=0.1  S(None)=0.2

Page 28: Forest-based  Semantic Role Labeling

Classifying

the role of Celimene is played by Kim Cattrall

[Figure: the parse forest of the sentence with the two winning scores kept]

S(Patient)=0.8
S(Agent)=0.8

Roles marked on the sentence, left to right: Patient, Agent

Page 29: Forest-based  Semantic Role Labeling

Outline

Tree-based Semantic Role Labeling
Forest-based Semantic Role Labeling
Experiments
Conclusion

Page 30: Forest-based  Semantic Role Labeling

Experiments

Corpus: CoNLL-2005 shared task
  Sections 02-21 of PropBank for training
  Section 24 for development set
  Section 23 for test set
Total
  43,594 sentences
  262,281 arguments

Page 31: Forest-based  Semantic Role Labeling

Experiments

Training sentences
  Parse into 1-best and forest
  Prune forest using inside-outside algorithm
  Train classifiers
Decoding sentences
  Parse into 1-best and forest
  Prune forest using inside-outside algorithm
  Use classifiers
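
The slide only says that the forest is pruned with the inside-outside algorithm. One common way to do this, shown here as an assumption and not as the authors' exact procedure, is the Viterbi variant: compute for every hyper-edge the cost of the best derivation passing through it, and drop the hyper-edge if that cost exceeds the overall best by more than a threshold. Costs stand for negative log-probabilities; the forest, the costs and the name `prune` are hypothetical.

```python
# node -> list of (cost, tails); lower cost means a more probable hyperedge.
COST_FOREST = {
    "S[0,9]": [(1.0, ("NP[0,4]", "VP[4,9]")),
               (4.0, ("NP[0,2]", "PP[2,4]", "VP[4,9]"))],
    "NP[0,4]": [(0.0, ("NP[0,2]", "PP[2,4]"))],
    "NP[0,2]": [(0.0, ())], "PP[2,4]": [(0.0, ())],
    "VP[4,9]": [(0.0, ("AUX[4,5]", "VP[5,9]"))],
    "AUX[4,5]": [(0.0, ())],
    "VP[5,9]": [(0.0, ("VBN[5,6]", "PP[6,9]"))],
    "VBN[5,6]": [(0.0, ())], "PP[6,9]": [(0.0, ())],
}


def prune(forest, root, threshold):
    """Keep hyperedges whose best derivation is within threshold of the 1-best."""
    # Post-order traversal: every tail node appears before its head.
    order, visited = [], set()

    def visit(node):
        if node in visited:
            return
        visited.add(node)
        for _, tails in forest[node]:
            for t in tails:
                visit(t)
        order.append(node)

    visit(root)

    # Viterbi inside: cost of the best sub-derivation rooted at each node.
    best_in = {}
    for node in order:
        best_in[node] = min(c + sum(best_in[t] for t in tails)
                            for c, tails in forest[node])

    # Viterbi outside: cost of the best context around each node, heads first.
    best_out = {node: float("inf") for node in order}
    best_out[root] = 0.0
    for node in reversed(order):
        for c, tails in forest[node]:
            through = best_out[node] + c + sum(best_in[t] for t in tails)
            for t in tails:
                best_out[t] = min(best_out[t], through - best_in[t])

    pruned = {}
    for node in order:
        kept = [(c, tails) for c, tails in forest[node]
                if best_out[node] + c + sum(best_in[t] for t in tails)
                - best_in[root] <= threshold]
        if kept:
            pruned[node] = kept
    return pruned


print(len(prune(COST_FOREST, "S[0,9]", 3.0)["S[0,9]"]),
      len(prune(COST_FOREST, "S[0,9]", 2.0)["S[0,9]"]))
```

A larger threshold keeps more alternative hyper-edges and therefore more trees; a threshold of zero keeps only the hyper-edges that lie on a best derivation.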

Page 32: Forest-based  Semantic Role Labeling

Features

Predicate lemma
Path to predicate
Path length
Partial path
Position
Voice
Head word/POS tag
…

Page 33: Forest-based  Semantic Role Labeling

Results on Dev Set

[Figure: chart of precision, recall and F for four systems: 1-best, 50-best, forest (p3) and forest (p5); the two forest settings are annotated with 9.63×10^5 and 5.78×10^6 respectively]

Page 34: Forest-based  Semantic Role Labeling

Results on Test Set

Page 35: Forest-based  Semantic Role Labeling

Outline

Tree-based Semantic Role Labeling
Forest-based Semantic Role Labeling
Experiments
Conclusion

Page 36: Forest-based  Semantic Role Labeling

Conclusion

Forest
  Compactly encodes exponentially many parses
  Enlarges the candidate space
  Makes richer features available
  Improves the quality significantly
  A very large forest is not necessary
  k-best lists can NOT simulate a forest
Future work
  Features on forest

Page 37: Forest-based  Semantic Role Labeling

Thank you!