Semantic role labeling
L645, Dept. of Linguistics, Indiana University
Fall 2009
Slides: cl.indiana.edu/~md7/09/645/slides/14-srl/14-srl.pdf

Semantic role labeling

Semantic role labeling (SRL):
- Indicate the semantic relations among a predicate and its participants
- Relations are typically drawn from a list of possible semantic roles

(1) [The girl on the swing]Agent [whispered]Pred to [the boy beside her]Recipient

- Provides a first-level semantic representation of a text (one possible encoding of (1) is sketched below)

SRL is used for tasks such as:
- information extraction (Surdeanu et al. 2003)
- machine translation (Komachi et al. 2006)
- question answering (Narayanan and Harabagiu 2004)
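One possible encoding of (1), as promised above: a minimal Python sketch of a predicate with labeled argument spans. The Argument and Predicate classes and the token offsets are my own illustrative choices, not a standard SRL output format.

```python
# A minimal, hypothetical representation of example (1): one predicate plus
# its labeled argument spans over the token sequence.
from dataclasses import dataclass

@dataclass
class Argument:
    role: str            # e.g., "Agent", "Recipient"
    span: tuple          # (start, end) token offsets, end exclusive

@dataclass
class Predicate:
    lemma: str
    token: int           # index of the predicate token
    arguments: list

tokens = "The girl on the swing whispered to the boy beside her".split()
analysis = Predicate(
    lemma="whisper",
    token=5,                              # "whispered"
    arguments=[
        Argument("Agent", (0, 5)),        # "The girl on the swing"
        Argument("Recipient", (7, 11)),   # "the boy beside her"
    ],
)
print([(a.role, " ".join(tokens[a.span[0]:a.span[1]])) for a in analysis.arguments])
```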

Slides based upon Màrquez et al. (2008), Semantic Role Labeling: An Introduction to the Special Issue, Computational Linguistics


Semantic resources

SRL requires corpora annotated with predicate-argument structure for training and testing data
- Gildea and Jurafsky (2002); Xue and Palmer (2004); Toutanova et al. (2005); Pradhan et al. (2005), ...

This has led to the development of statistical approaches to SRL


What are semantic roles?

Differing viewpoints, in terms of how specific they are:
- Situation-specific roles, e.g., Suspect, Authorities, Offense
- General roles, e.g., Agent, Theme, Location, Goal
- Core roles: Proto-Agent & Proto-Patient

An important question for SRL:
- What is the mapping between the predicate-argument structure determining the roles & their syntactic realization?


Regularity in semantic roles

Verb classes (Levin):
- Patterns of syntactic alternation exhibit regularity which reflects an underlying semantic similarity among verbs
- Lexical items are syntactically homogeneous & share coarse semantic properties
- No real notion of a semantic role

Frame semantics (Fillmore):
- Relates linguistic semantics to encyclopedic knowledge
- Delineates very situation-specific frames and semantic roles
- Not chosen from a pre-specified list


Characterization of participants

Frame semantics:
- Core frame elements: e.g., Suspect, Authorities, Offense
- Peripheral, or extra-thematic, elements: e.g., Manner, Time, Place
- SRL systems tend to do better on core elements

SRL tends to focus on verbs, but nouns, adjectives, & prepositions can also have frames
- e.g., proud that we finished the paper: the subordinate clause of proud is its Theme


Semantic annotation

Corpora with semantic annotation are increasingly relevant in natural language processing
- See: Baker et al. (1998); Palmer et al. (2005); Burchardt et al. (2006); Taule et al. (2005)

Need feedback on annotation schemes:
- difficult to select an underlying theory (see, e.g., Burchardt et al. 2006)
- difficult to determine certain relations, e.g., modifiers (ArgM) in PropBank (Palmer et al. 2005)
- no clear consensus on what elements to tag and how to tag them (Palmer et al. 2000)

Semantically annotated corpora also have potential as sources of linguistic data for theoretical research


Senses & Relations

Broadly speaking, there are 2 main ways to do semantic annotation:
- Lexical semantics: word senses
  - The major issue here is how to deal with polysemy
  - How many senses does each word have, and what are they?
- Compositional semantics: argument relations
  - The connection to syntax is apparent
  - Requires an inventory of argument roles & relies on the particular verb sense

These concepts are interrelated in some ways, but we are more interested in the latter


Sense Tagging the Penn Treebank
Palmer et al. (2000)

Initially tagged a 5000-word corpus (later expanded for PropBank)
- Selected WSJ articles which contained "interesting verbs" & covered a range of topics
- Sense-tagged only the verbs & headwords of arguments/adjuncts
- Used WordNet senses
  - Additionally tagged proper nouns as person, company, date, or name


Predicate-argument structure

Building from the sense annotation, they also annotated predicate-argument structure
- Added subscripts to PTB trees to indicate what semantic role a constituent plays in a sentence
  - e.g., SBJ on an NP indicates a subject role
  - e.g., TMP on a PP indicates temporal information about an event

To obtain predicate-argument annotation, verbs needed to be linked to their arguments
- Required being able to automatically determine semantic heads of phrases
- Morphological information & phrasal lexical entries were also added


Predicate-argument relations (formally)

Semantic annotation is non-uniform:

(2) [Arg1 lending practices] vary/vary.01 [Arg2-EXT widely] [ArgM-MNR by location]

An annotated instance specifies:
1. the verb sense
2. the span of each argument
3. argument label names


Some consistency issues
Dickinson and Lee (2008)

It is hard to maintain predicate-argument consistency, especially when built on top of other layers of annotation:

(3) a. coming/VBG [Arg1 months] ,
    b. coming/JJ months ,

(4) a. [Arg1 net income in its first half] rose 59 %
    b. [Arg1 net income] in its first half rose 8.9 %

(5) a. That could [Arg2-MNR substantially] reduce the value of the television assets .
    b. the proposed acquisition could [ArgM-MNR substantially] reduce competition ...


Insights

- Some verbs are ambiguous in whether they take arguments and what type of arguments they take

  (6) a. [Arg1 Analysts] had mixed responses
      b. [Arg1 Analysts] had expected Consolidated to post a slim profit ...

- Much argument identification ambiguity is rooted in difficulties resolving syntactic ambiguity

  (7) a. seeking [Arg1 a buyer] [PP for several months]
      b. seeking [Arg1 a buyer for only its shares]

- Some argument relations depend upon the sense of the verb, which depends upon other arguments of the verb

  (8) a. [Arg0 he] will return Kidder to prominence
      b. [Arg1 he] will return to his old bench


FrameNet

FrameNet is an online lexical resource for English (with FrameNets also in other languages)
- http://framenet.icsi.berkeley.edu/

FrameNet features:
- 10,000 lexical units, with more than 825 semantic frames
- 135,000 annotated sentences

We'll talk more about frame semantics momentarily ...


Frame Semantics

Frame semantics describes meaning as:
- characterized by the background knowledge necessary to understand each expression
- A frame is evoked by a word or expression
  - Coarse-grained frame descriptions generalize over different lexical items (unlike PropBank)
- Each frame has its own set of semantic roles, called frame elements
  - Participants & propositions of an abstract situation
  - Frame elements are local to individual frames, instead of using universal roles


Frame example: Statement
Frame description

This frame contains verbs and nouns that communicate the act of a speaker to address a message to some addressee using language. A number of the words can be used performatively, such as declare and insist.


Frame example: Statement
Frame elements

- speaker: Evelyn said she wanted to leave.
- message: Evelyn announced that she wanted to leave.
- addressee: Evelyn spoke to me about her past.
- topic: Evelyn's statement about her past.
- medium: Evelyn preached to me over the phone.


Frame example: Statement
Predicates

- acknowledge.v
- acknowledgement.n
- add.v
- address.v
- admission.n
- admit.v
- affirm.v
- affirmation.n
- allegation.n
- ...
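To make the pieces of a frame concrete, here is a minimal Python sketch of how a frame, its frame elements, and its frame-evoking lexical units might be bundled together. The Frame class is my own illustration, not FrameNet's actual data format; the element and lexical-unit lists simply echo the Statement examples above.

```python
# Illustrative only: a hypothetical container for one frame, not FrameNet's format.
from dataclasses import dataclass, field

@dataclass
class Frame:
    name: str
    frame_elements: list = field(default_factory=list)   # roles local to this frame
    lexical_units: list = field(default_factory=list)    # frame-evoking words

statement = Frame(
    name="Statement",
    frame_elements=["speaker", "message", "addressee", "topic", "medium"],
    lexical_units=["acknowledge.v", "acknowledgement.n", "add.v",
                   "address.v", "admit.v", "affirm.v"],
)
print(statement.frame_elements)
```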


Syntax-semantics examples

Frame semantics is between syntax and "deep" semantics

e.g., it generalizes over verbal alternations:
- [Peter]agent hit_Cause_impact [the ball]impactee .
- [The ball]impactee was hit_Cause_impact .

and over nominalizations:
- [Evelyn]speaker spoke_Statement [about her past]topic .
- [Evelyn's]speaker statement_Statement [about her past]topic


The SALSA project
Burchardt et al. (2006)

Large corpora and large domain-independent lexica can help the study of:
- lexical semantics
- syntax-semantics linking properties
- noncompositional phenomena, e.g., idiomatic & metaphoric expressions
- cross-lingual analysis & application of lexical semantic information
- particularly apt for frame semantics, as it has a common, largely language-independent word sense & role inventory

Project page: http://www.coli.uni-saarland.de/projects/salsa/
See http://www.coli.uni-saarland.de/projects/salsa/corpus/ for the release


Annotation for German

Built on top of the TIGER corpus of German
- Single flat tree for each frame
- Root node labeled with the frame name; edges with frame element names
- Frame elements refer to syntactic constituents

Communication_response:

(9) "[S Schlecht]Message", antwortet [NP die Branche]Speaker [PP im Chor].
    "Badly," answers the industry in unison.

Annotation proceeds one predicate at a time & all instances of a predicate are annotated


Compositionality

- Support Verb Constructions
- Idioms: annotate the complete multiword unit as the frame-evoking element
- Metaphors, e.g., unter die Lupe nehmen 'to put (lit. take) under a magnifying glass':
  - Source frame models the syntactic realization patterns (e.g., Taking)
  - Target frame models the understood meaning (e.g., Scrutiny)
- Vagueness: annotators can assign more than 1 label (for frames or frame elements)


Effect of role inventory on SRL

It remains somewhat of an open question what effect the role inventory has on SRL.
- Gildea and Jurafsky (2002) mapped FrameNet frames into abstract thematic roles
  - Their system used these roles without degradation

PropBank:
- Is it too domain-specific?
- Are the roles easier to label than FrameNet's?


Approaches to Automatic SRL

SRL has two tasks:
- Identify the boundaries of the arguments of the verb predicate (argument identification)
- Label the arguments with semantic roles (argument classification)

Most common architecture:
1. Filtering/pruning the set of arguments
2. Local scoring of argument candidates
3. Global scoring of argument candidates


Filtering

Filter/prune the set of argument candidates
- Candidates can be continuous or discontinuous
  - This means any subsequence of words can be considered a candidate
- Typically use simple heuristics to reduce the space of candidates (a sketch of one such heuristic follows):
  - Xue and Palmer (2004): collect sister constituents of a predicate as possible arguments
  - Move up the tree, collecting sisters, all the way to the top
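The following is a rough Python sketch of the Xue-and-Palmer-style pruning described above, using a toy (label, children) tuple tree. The function name, tree encoding, and path representation are my own simplifications; the original heuristic also has special handling (e.g., for coordination) that is omitted here.

```python
# A simplified sketch of Xue-and-Palmer-style pruning on a toy constituency tree.
# Trees are (label, children) tuples; leaves are plain strings.

def prune_candidates(tree, path_to_predicate):
    """Collect the sisters of the predicate and of each of its ancestors,
    walking up to the root; those constituents become argument candidates."""
    candidates = []
    for depth in range(len(path_to_predicate), 0, -1):
        node = tree
        for i in path_to_predicate[:depth - 1]:   # walk down to the current parent
            node = node[1][i]
        spine_index = path_to_predicate[depth - 1]
        for j, sister in enumerate(node[1]):
            if j != spine_index:                  # keep sisters, skip the spine node
                candidates.append(sister)
    return candidates

# (S (NP The girl) (VP (VBD whispered) (PP (TO to) (NP the boy))))
tree = ("S", [("NP", ["The", "girl"]),
              ("VP", [("VBD", ["whispered"]),
                      ("PP", [("TO", ["to"]), ("NP", ["the", "boy"])])])])

# path_to_predicate = child indices from the root down to the predicate (VBD).
print(prune_candidates(tree, path_to_predicate=[1, 0]))
# -> [('PP', ...), ('NP', ['The', 'girl'])]
```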


Local scoring

Locally score argument candidates via a function that outputs probabilities/confidence scores for each possible label (a toy example follows)
- Also include a "no-argument" (NONE) label
- Candidates are treated independently of each other

Notes:
- Feature selection tends to be more crucial than the choice of classification algorithm
- Argument identification & classification can be treated jointly or separately
  - Separately = a pipeline of argument/no-argument decisions + specific labels
  - Useful features may be different for the 2 tasks
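Here is a toy illustration of local scoring, assuming scikit-learn is available: each candidate is represented as a feature dict and classified independently, with NONE as one of the possible labels. The four hand-made training examples and the feature names are invented purely for demonstration.

```python
# Toy local scoring: one independent classification decision per candidate,
# with NONE among the labels. Training data here is made up for illustration.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

train_feats = [
    {"phrase_type": "NP",   "position": "left",  "voice": "active"},
    {"phrase_type": "NP",   "position": "right", "voice": "active"},
    {"phrase_type": "PP",   "position": "right", "voice": "active"},
    {"phrase_type": "ADVP", "position": "right", "voice": "active"},
]
train_labels = ["Arg0", "Arg1", "ArgM-LOC", "NONE"]

vec = DictVectorizer()
clf = LogisticRegression(max_iter=1000)
clf.fit(vec.fit_transform(train_feats), train_labels)

def local_scores(candidate_feats):
    """Return a {label: probability} dict for one candidate, scored on its own."""
    probs = clf.predict_proba(vec.transform([candidate_feats]))[0]
    return dict(zip(clf.classes_, probs))

print(local_scores({"phrase_type": "NP", "position": "left", "voice": "active"}))
```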


Global scoring

Joint (or global) scoring combines the predictions of the local scores to obtain a good overall structure (a greedy sketch follows)
- Dependencies among several arguments of the same predicate can be exploited:
  - arguments do not overlap
  - core arguments do not repeat
  - etc.
- Could rerank or use probabilistic models to obtain structured output
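As one concrete (if simplistic) example, the following greedy decoder of my own enforces the two constraints mentioned above over locally scored candidates; real systems typically use reranking, ILP, or dynamic programming instead.

```python
# A toy greedy decoder (my own, not a published algorithm) that turns local
# scores into a globally consistent set of arguments, enforcing two of the
# constraints from the slide: spans may not overlap, core args may not repeat.

def is_core(label):
    return label.startswith("Arg") and not label.startswith("ArgM")

def global_decode(scored_candidates):
    """scored_candidates: list of (span, {label: score}), span = (start, end)."""
    options = sorted(((score, span, label)
                      for span, scores in scored_candidates
                      for label, score in scores.items() if label != "NONE"),
                     reverse=True)
    chosen, used_core = [], set()
    for score, span, label in options:
        overlaps = any(span[0] < end and start < span[1]
                       for _, (start, end) in chosen)
        if not overlaps and not (is_core(label) and label in used_core):
            chosen.append((label, span))
            if is_core(label):
                used_core.add(label)
    return chosen

scored = [((0, 5), {"Arg0": 0.7, "Arg1": 0.2, "NONE": 0.1}),
          ((7, 11), {"Arg1": 0.6, "Arg0": 0.3, "NONE": 0.1}),
          ((3, 8), {"ArgM-TMP": 0.4, "NONE": 0.6})]
print(global_decode(scored))   # -> [('Arg0', (0, 5)), ('Arg1', (7, 11))]
```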


Variations on common architecture

- Do only local scoring
- Skip directly to joint scoring
- Fourth step of fixing common errors
- Fourth step of enforcing coherence in the solution


System combination

Combine:
- the output of several independent basic SRL systems
- several outputs from the same SRL system
  - change the input annotations or other internal parameters

Could combine the best among competing full solutions or combine fragments of alternative solutions (a simple voting sketch follows)
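As a minimal illustration, the sketch below (my own, not from the slides) combines several systems' outputs by voting over individual (span, label) pairs, i.e., a fragment-level combination; note that the result is not guaranteed to satisfy global constraints such as non-overlapping spans.

```python
# A toy fragment-level combination: keep any labeled span proposed by at least
# `min_votes` of the input systems. Purely illustrative.
from collections import Counter

def combine_by_voting(system_outputs, min_votes=2):
    """system_outputs: iterable of outputs, each a set of (span, label) pairs."""
    votes = Counter(pair for output in system_outputs for pair in set(output))
    return {pair for pair, count in votes.items() if count >= min_votes}

sys_a = {((0, 5), "Arg0"), ((7, 11), "Arg1")}
sys_b = {((0, 5), "Arg0"), ((7, 11), "Arg2")}
sys_c = {((0, 5), "Arg0"), ((7, 11), "Arg1"), ((6, 7), "ArgM-DIS")}
print(combine_by_voting([sys_a, sys_b, sys_c]))
# -> {((0, 5), 'Arg0'), ((7, 11), 'Arg1')}
```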


Feature engineering

Given a verb and a candidate argument, three types of features are used (a toy extractor follows):

1. Features that characterize the candidate argument & its context
   - e.g., phrase type, headword, governing category of the constituent
2. Features that characterize the verb predicate & its context
   - e.g., lemma, voice, & subcategorization pattern of the verb
3. Features that characterize the relation between the candidate & the predicate
   - e.g., left/right position of the constituent w.r.t. the verb, category path between them

There are lots of extensions to this (e.g., syntactic frame, syntactic path variants, semantic relations/selectional preferences) ...

We'll talk more next time: read the Pradhan et al. paper
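To show what such a feature representation might look like, here is a toy extractor; in a real system these values would be read off a parse tree, and all field names and example values below are my own illustrative assumptions.

```python
# A toy feature extractor producing the three feature groups named on the slide.
# Inputs are plain dicts standing in for information from a parse tree.

def extract_features(predicate, candidate):
    """Return a flat feature dict of the kind a local classifier consumes."""
    return {
        # 1. candidate argument & its context
        "phrase_type": candidate["phrase_type"],
        "headword": candidate["headword"],
        "governing_category": candidate["governing_category"],
        # 2. verb predicate & its context
        "predicate_lemma": predicate["lemma"],
        "voice": predicate["voice"],
        "subcat": predicate["subcat"],
        # 3. relation between candidate and predicate
        "position": "left" if candidate["start"] < predicate["index"] else "right",
        "path": candidate["path_to_predicate"],
    }

predicate = {"lemma": "whisper", "voice": "active",
             "subcat": "VP->VBD PP", "index": 5}
candidate = {"phrase_type": "NP", "headword": "girl", "governing_category": "S",
             "start": 0, "path_to_predicate": "NP↑S↓VP↓VBD"}
print(extract_features(predicate, candidate))
```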


Evaluation

The CoNLL-2005 shared task showed:
- F1 scores around 80%
- Argument identification accounts for most of the errors
  - systems recall about 81% of the correct unlabeled arguments
  - about 95% of these are assigned the correct semantic role
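For reference, labeled precision, recall, and F1 are computed over labeled argument spans, roughly as in the sketch below: an argument counts as correct only if both its span and its role label match the gold annotation. The tuple format and the example numbers are invented for illustration.

```python
# A minimal sketch of labeled-argument precision/recall/F1.

def evaluate(gold, predicted):
    """gold, predicted: sets of (sentence_id, predicate, span, label) tuples."""
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = {(1, "whisper.01", (0, 5), "Arg0"), (1, "whisper.01", (7, 11), "Arg2")}
pred = {(1, "whisper.01", (0, 5), "Arg0"), (1, "whisper.01", (7, 11), "Arg1")}
print(evaluate(gold, pred))   # -> (0.5, 0.5, 0.5)
```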


References

Baker, Collin F., Charles J. Fillmore and John B. Lowe (1998). The Berkeley FrameNet Project. In Proceedings of ACL-98. Montreal, pp. 86–90.

Burchardt, Aljoscha, Katrin Erk, Anette Frank, Andrea Kowalski, Sebastian Padó and Manfred Pinkal (2006). The SALSA corpus: a German corpus resource for lexical semantics. In Proceedings of LREC-06. Genoa.

Dickinson, Markus and Chong Min Lee (2008). Detecting Errors in Semantic Annotation. In Proceedings of the 6th Language Resources and Evaluation Conference (LREC 2008). Marrakech, Morocco.

Gildea, Daniel and Daniel Jurafsky (2002). Automatic Labeling of Semantic Roles. Computational Linguistics 28(4), 245–288.

Komachi, Mamoru, Masaaki Nagata and Yuji Matsumoto (2006). Phrase Reordering for Statistical Machine Translation Based on Predicate-Argument Structure. In Proceedings of the International Workshop on Spoken Language Translation. Kyoto, Japan, pp. 77–82.

Morante, Roser and Antal van den Bosch (2007). Memory-Based Semantic Role Labeling of Catalan and Spanish. In Proceedings of RANLP-07. pp. 388–394.

Narayanan, Srini and Sanda Harabagiu (2004). Question Answering based on Semantic Structures. In International Conference on Computational Linguistics (COLING 2004). Geneva, Switzerland.

Palmer, Martha, Hoa Trang Dang and Joseph Rosenzweig (2000). Sense Tagging the Penn Treebank. In Proceedings of the Second Language Resources and Evaluation Conference, LREC-00. Athens. http://verbs.colorado.edu/~mpalmer/papers/lrec00.ps.gz.

Palmer, Martha, Daniel Gildea and Paul Kingsbury (2005). The Proposition Bank: An Annotated Corpus of Semantic Roles. Computational Linguistics 31(1), 71–105.

Pradhan, Sameer, Kadri Hacioglu, Valerie Krugler, Wayne Ward, James H. Martin and Daniel Jurafsky (2005). Support Vector Learning for Semantic Argument Classification. Machine Learning 60(1), 11–39.

Surdeanu, Mihai, Sanda Harabagiu, John Williams and Paul Aarseth (2003). Using Predicate-Argument Structures for Information Extraction. In Proceedings of ACL-03.

Taule, M., J. Aparicio, J. Castellví and M.A. Martí (2005). Mapping syntactic functions into semantic roles. In Proceedings of TLT-05. Barcelona.

Toutanova, Kristina, Aria Haghighi and Christopher Manning (2005). Joint Learning Improves Semantic Role Labeling. In Proceedings of ACL-05. Ann Arbor, Michigan, pp. 589–596.

Xue, Nianwen and Martha Palmer (2004). Calibrating Features for Semantic Role Labeling. In Dekang Lin and Dekai Wu (eds.), Proceedings of EMNLP 2004. Barcelona, pp. 88–94.