Semantic rolelabeling
L645
SRL
Linguistics
Semantic resourcesPropbank
FrameNet
Effect
Approaches
References
Semantic role labeling
L645
Dept. of Linguistics, Indiana University
Fall 2009
Semantic role labeling
Semantic role labeling (SRL):
- Indicate the semantic relations among a predicate and its participants
- Relations are typically drawn from a list of possible semantic roles

(1) [The girl on the swing]Agent [whispered]Pred to [the boy beside her]Recipient

- Provides a first-level semantic representation of a text

SRL used for tasks such as:
- information extraction (Surdeanu et al. 2003)
- machine translation (Komachi et al. 2006)
- question answering (Narayanan and Harabagiu 2004)
Slides based upon Marquez et al (2008), Semantic Role Labeling: An
Introduction to the Special Issue, Computational Linguistics
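As an illustration of what an SRL analysis looks like as data, example (1) can be stored as labeled token spans. This is only a minimal sketch: the `RoleSpan` class, the end-exclusive span convention, and the `render` helper are assumptions made for illustration, not part of any SRL toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleSpan:
    start: int   # index of the first token (inclusive)
    end: int     # index one past the last token (exclusive)
    label: str   # role name, e.g. "Agent"


TOKENS = "The girl on the swing whispered to the boy beside her".split()

# Hand-written analysis of example (1); "to" is left unlabeled as on the slide.
SPANS = [
    RoleSpan(0, 5, "Agent"),
    RoleSpan(5, 6, "Pred"),
    RoleSpan(7, 11, "Recipient"),
]


def render(tokens, spans):
    """Rebuild the bracketed notation used in example (1)."""
    out, i = [], 0
    for span in sorted(spans, key=lambda s: s.start):
        out.extend(tokens[i:span.start])  # unlabeled tokens before the span
        out.append("[" + " ".join(tokens[span.start:span.end]) + "]" + span.label)
        i = span.end
    out.extend(tokens[i:])  # unlabeled tokens after the last span
    return " ".join(out)


print(render(TOKENS, SPANS))
```

Storing spans rather than bracketed strings keeps the text and the annotation separate, which is convenient when several predicates in one sentence each get their own set of roles.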
Semantic resources
SRL requires corpora annotated with predicate-argument structure for training and testing data
- Gildea and Jurafsky (2002); Xue and Palmer (2004); Toutanova et al. (2005); Pradhan et al. (2005), ...

This has led to the development of statistical approaches to SRL
What are semantic roles?
Differing viewpoints, in terms of how specific they are:
- Situation-specific roles, e.g., Suspect, Authorities, Offense
- General roles, e.g., Agent, Theme, Location, Goal
- Core role: Proto-Agent & Proto-Theme

An important question for SRL:
- What is the mapping between the predicate-argument structure determining the roles & their syntactic realization?
Regularity in semantic roles
Verb classes (Levin):
- Patterns of syntactic alternation exhibit regularity which reflects an underlying semantic similarity among verbs
- Lexical items are syntactically homogeneous & share coarse semantic properties
- No real notion of a semantic role

Frame semantics (Fillmore):
- Relates linguistic semantics to encyclopedic knowledge
- Delineates very situation-specific frames and semantic roles
- Not chosen from a pre-specified list
Characterization of participants
Frame semantics:
- Core frame elements: e.g., Suspect, Authorities, Offense
- Peripheral, or extra-thematic, elements: e.g., Manner, Time, Place
- SRL systems tend to do better on core elements

SRL tends to focus on verbs, but nouns, adjectives, & prepositions also can have frames
- e.g., proud that we finished the paper: Theme subordinate clause of proud
Semantic annotation
Corpora with semantic annotation are increasingly relevant in natural language processing
- See: Baker et al. (1998); Palmer et al. (2005); Burchardt et al. (2006); Taule et al. (2005)

Need feedback on annotation schemes:
- difficult to select an underlying theory (see, e.g., Burchardt et al. 2006)
- difficult to determine certain relations, e.g., modifiers (ArgM) in PropBank (Palmer et al. 2005)
- Not a clear consensus on what elements to tag and how to tag them (Palmer et al. 2000)

Semantically-annotated corpora also have potential as sources of linguistic data for theoretical research
Senses & Relations
Broadly speaking, there are 2 main ways to do semantic annotation:
- Lexical semantics: word senses
  - The major issue here is how to deal with polysemy
  - How many senses does each word have, and what are they?
- Compositional semantics: argument relations
  - The connection to syntax is apparent
  - Requires an inventory of argument roles & relies on the particular verb sense

These concepts are interrelated in some ways, but we are more interested in the latter
Sense Tagging the Penn Treebank
Palmer et al. (2000)
Initially tagged a 5000-word corpus (later expanded for Propbank)
- Selected WSJ articles which contained “interesting verbs” & covered a range of topics
- Sense-tagged only the verbs & headwords of arguments/adjuncts
- Used WordNet senses
  - Additionally tagged proper nouns as person, company, date, or name
Predicate-argument structure
Building from the sense annotation, they also annotated predicate-argument structure
- Added subscripts to PTB trees to indicate what semantic role a constituent plays in a sentence
  - e.g., SBJ on an NP indicates a subject role
  - e.g., TMP on a PP indicates temporal information about an event

To obtain predicate-argument annotation, verbs needed to be linked to their arguments
- Required being able to automatically determine semantic heads of phrases
- Morphological information & phrasal lexical entries were also added
Predicate-argument relations (formally)
Semantic annotation is non-uniform:
(2) [Arg1 lending practices] vary/vary.01 [Arg2-EXT widely] [ArgM-MNR by location]
1. the verb sense
2. the span of each argument
3. argument label names
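The three components listed above can be made concrete by encoding example (2) as a small record. This is a sketch under assumptions: the `Proposition` and `Argument` classes and their field names are hypothetical and do not reproduce the PropBank file format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Argument:
    label: str             # 3. argument label name, e.g. "Arg1" or "ArgM-MNR"
    span: Tuple[int, int]  # 2. span of the argument (token offsets, end-exclusive)


@dataclass
class Proposition:
    predicate_index: int   # position of the verb in the sentence
    roleset: str           # 1. the verb sense, e.g. "vary.01"
    arguments: List[Argument]


tokens = "lending practices vary widely by location".split()

# Example (2), transcribed by hand from the slide.
prop = Proposition(
    predicate_index=2,
    roleset="vary.01",
    arguments=[
        Argument("Arg1", (0, 2)),
        Argument("Arg2-EXT", (3, 4)),
        Argument("ArgM-MNR", (4, 6)),
    ],
)


def argument_text(tokens, arg):
    """Return the words covered by an argument span."""
    start, end = arg.span
    return " ".join(tokens[start:end])


for arg in prop.arguments:
    print(arg.label, "->", argument_text(tokens, arg))
```

Keeping the sense, the spans, and the labels in separate fields makes it possible to check each kind of information on its own, which matters for the consistency issues discussed next.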
Some consistency issues
Dickinson and Lee (2008)
It is hard to maintain predicate-argument consistency, especially when built on top of other layers of annotation:

(3) a. coming/VBG [Arg1 months] ,
    b. coming/JJ months ,

(4) a. [Arg1 net income in its first half] rose 59 %
    b. [Arg1 net income] in its first half rose 8.9 %

(5) a. That could [Arg2-MNR substantially] reduce the value of the television assets .
    b. the proposed acquisition could [ArgM-MNR substantially] reduce competition ...
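One simple way to surface cases like (5) is to group annotated arguments by predicate and argument string and flag any group that received more than one label. The sketch below is a deliberately simplified illustration and should not be read as the method of Dickinson and Lee (2008); the function name and the input tuples are assumptions.

```python
from collections import defaultdict


def find_label_variation(annotations):
    """Flag (predicate, argument text) pairs that carry more than one label.

    annotations: iterable of (predicate_lemma, argument_text, label) tuples.
    """
    seen = defaultdict(set)
    for lemma, text, label in annotations:
        seen[(lemma, text.lower())].add(label)
    return {key: sorted(labels) for key, labels in seen.items() if len(labels) > 1}


# Hand-transcribed from examples (4) and (5) on the slide.
annotations = [
    ("reduce", "substantially", "Arg2-MNR"),           # (5a)
    ("reduce", "substantially", "ArgM-MNR"),           # (5b)
    ("rise", "net income in its first half", "Arg1"),  # (4a)
    ("rise", "net income", "Arg1"),                    # (4b)
]

print(find_label_variation(annotations))
```

Note that this check only catches label variation as in (5). The span disagreement in (4) goes unnoticed because the two argument strings differ, so a fuller check would also need to compare overlapping spans.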
Insights
- Some verbs are ambiguous in whether they take arguments and what type of arguments they take

  (6) a. [Arg1 Analysts] had mixed responses
      b. [Arg1 Analysts] had expected Consolidated to post a slim profit ...

- Much argument identification ambiguity is rooted in difficulties resolving syntactic ambiguity

  (7) a. seeking [Arg1 a buyer] [PP for several months]
      b. seeking [Arg1 a buyer for only its shares]

- Some argument relations depend upon the sense of the verb, which depends upon other arguments of the verb

  (8) a. [Arg0 he] will return Kidder to prominence
      b. [Arg1 he] will return to his old bench
FrameNet
FrameNet is an online lexical resource for English (with FrameNets also in other languages)
- http://framenet.icsi.berkeley.edu/

FrameNet features
- 10,000 lexical units, with more than 825 semantic frames
- 135,000 annotated sentences

We’ll talk more about frame semantics momentarily ...
Frame Semantics
Frame semantics describes meaning as:
- characterized by the background knowledge necessary to understand each expression
- A frame is evoked by a word or expression
  - Coarse-grained frame descriptions generalize over different lexical items (unlike Propbank)
- Each frame has its own set of semantic roles, called frame elements
  - Participants & propositions of an abstract situation
  - Frame elements are local to individual frames, instead of using universal roles
Frame example: Statement
Frame description
This frame contains verbs and nouns that communicate the act of a speaker to address a message to some addressee using language. A number of the words can be used performatively, such as declare and insist.
Frame example: Statement
Frame elements
- speaker: Evelyn said she wanted to leave.
- message: Evelyn announced that she wanted to leave.
- addressee: Evelyn spoke to me about her past.
- topic: Evelyn’s statement about her past.
- medium: Evelyn preached to me over the phone.
Frame example: Statement
Predicates
- acknowledge.v
- acknowledgement.n
- add.v
- address.v
- admission.n
- admit.v
- affirm.v
- affirmation.n
- allegation.n
- ...
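The Statement example can be summarized as one record holding the frame name, its frame elements, and the lexical units that evoke it. The following is a minimal sketch; the `Frame` class and both helper functions are hypothetical and are not the FrameNet data format or API, and the lexical-unit set contains only the items shown on the slide.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Frame:
    name: str
    frame_elements: Set[str]  # roles local to this frame
    lexical_units: Set[str]   # predicates that evoke the frame


STATEMENT = Frame(
    name="Statement",
    frame_elements={"speaker", "message", "addressee", "topic", "medium"},
    lexical_units={
        "acknowledge.v", "acknowledgement.n", "add.v", "address.v",
        "admission.n", "admit.v", "affirm.v", "affirmation.n", "allegation.n",
    },
)


def frames_evoked_by(lexical_unit: str, frames: List[Frame]) -> List[str]:
    """Names of all frames that list the given lexical unit."""
    return [f.name for f in frames if lexical_unit in f.lexical_units]


def unknown_roles(frame: Frame, labels: List[str]) -> List[str]:
    """Labels that are not frame elements of this frame."""
    return sorted(set(labels) - frame.frame_elements)


print(frames_evoked_by("admit.v", [STATEMENT]))
print(unknown_roles(STATEMENT, ["speaker", "agent"]))
```

Because frame elements are local to a frame, a label such as "agent" is rejected here even though it would be valid in another frame.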
Syntax-semantics examples
Frame semantics is between syntax and “deep” semantics
e.g., generalizes over verbal alternations:
- [Peter]agent [hit]cause_impact [the ball]impactee .
- [The ball]impactee was [hit]cause_impact .

and over nominalizations:

- [Evelyn]speaker [spoke]statement [about her past]topic .
- [Evelyn’s]speaker [statement]statement [about her past]topic
The SALSA project
Burchardt et al. (2006)
Large corpora and large domain-independent lexica can help the study of:

- lexical semantics
- syntax-semantics linking properties
- noncompositional phenomena, e.g., idiomatic & metaphoric expressions
- cross-lingual analysis & application of lexical semantic information
  - particularly apt for frame semantics, as it has a common, largely language-independent word sense & role inventory

Project page: http://www.coli.uni-saarland.de/projects/salsa/
See http://www.coli.uni-saarland.de/projects/salsa/corpus/ for the release
Annotation for German
Built on top of the TIGER corpus of German
- Single flat tree for each frame
- Root node labeled with frame name; edges with frame element names
- Frame elements refer to syntactic constituents

Communication response:

(9) [S “Schlecht”]Message , antwortet [NP die Branche]Speaker [PP im Chor] .
    “Badly” , answers the industry section in unison .

Annotation proceeds one predicate at a time & all instances of a predicate are annotated
Compositionality
- Support Verb Constructions
- Idioms: annotate complete multiword unit as frame-evoking element
- Metaphors, e.g., unter die Lupe nehmen: to put (lit. take) under a magnifying glass:
  - Source frame models syntactic realization patterns (e.g., Taking)
  - Target frame models the understood meaning (e.g., Scrutiny)
- Vagueness: annotators can assign more than 1 label (for frames or frame elements)
Effect of role inventory on SRL
It is an open question what effect the role inventory has on SRL.
- Gildea and Jurafsky (2002) mapped FrameNet frames into abstract thematic roles
  - System used these roles without degradation

PropBank:
- Is it too domain-specific?
- Are the roles easier to label than FrameNet’s?
Approaches to Automatic SRL
SRL has two tasks:
- Identify the boundaries of the arguments of the verb predicate (argument identification)
- Label arguments with semantic roles (argument classification)

Most common architecture:
1. Filtering/pruning the set of arguments
2. Local scoring of argument candidates
3. Global scoring of argument candidates
Filtering
Filter/prune the set of argument candidates
- Can be continuous or discontinuous
  - This means any subsequence of words can be considered a candidate
- Typically use simple heuristics to reduce the space of candidates:
  - Xue and Palmer (2004): collect sister constituents of a predicate as possible arguments
  - Move up the tree, collecting sisters, all the way to the top
Local scoring
Locally score argument candidates via a function that outputs probabilities/confidence scores for each possible label
- Also include “no-argument” (NONE) label
- Candidates are treated independently of each other

Notes:
- Feature selection tends to be more crucial than choice of classification algorithm
- Argument identification & classification can be treated jointly or separately
  - Separately = pipeline of argument/no-argument + specific label
  - Useful features may be different for the 2 tasks
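A local scorer of this kind can be sketched as a linear model followed by a softmax, applied to each candidate on its own. Everything specific below is an assumption for illustration: the label set, the feature strings, and the hand-set weights stand in for whatever a trained classifier would provide.

```python
import math

LABELS = ["Arg0", "Arg1", "ArgM-TMP", "NONE"]  # NONE = "no-argument"

# Hypothetical hand-set weights; a real system would learn these from data.
WEIGHTS = {
    "Arg0": {"phrase=NP": 1.0, "position=before": 2.0},
    "Arg1": {"phrase=NP": 1.0, "position=after": 2.0},
    "ArgM-TMP": {"phrase=PP": 1.5},
    "NONE": {"phrase=MD": 3.0},
}


def softmax(scores):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def score_candidate(features, weights=WEIGHTS):
    """Score one candidate independently of all other candidates."""
    raw = [sum(weights[label].get(f, 0.0) for f in features) for label in LABELS]
    return dict(zip(LABELS, softmax(raw)))


probs = score_candidate(["phrase=NP", "position=before"])
print(max(probs, key=probs.get), probs)
```

Treating identification and classification jointly, as here, means NONE competes directly with the role labels; the pipeline alternative would first run a binary argument/no-argument decision and then score only the survivors.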
Global scoring
Joint (or global) scoring combines the predictions of local scoring to get a good structure
- Dependencies among several arguments of the same predicate can be exploited
  - arguments do not overlap
  - core arguments do not repeat
  - etc.
- Could rerank or use probabilistic models to obtain structured output
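The two constraints named above can be enforced with a simple greedy pass over locally scored candidates. This is a stand-in chosen for brevity, not the reranking or probabilistic models mentioned on the slide, and a greedy pass does not guarantee the best-scoring consistent structure. The candidate tuples and the `CORE` label set are assumptions.

```python
CORE = {"Arg0", "Arg1", "Arg2", "Arg3", "Arg4", "Arg5"}


def overlaps(a, b):
    """True if two end-exclusive spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def greedy_decode(scored):
    """Pick labeled spans in order of confidence, subject to the constraints.

    scored: list of (span, label, score) tuples from a local scorer.
    """
    chosen = []
    for span, label, _score in sorted(scored, key=lambda x: -x[2]):
        if label == "NONE":
            continue
        if any(overlaps(span, s) for s, _ in chosen):
            continue  # arguments do not overlap
        if label in CORE and any(l == label for _, l in chosen):
            continue  # core arguments do not repeat
        chosen.append((span, label))
    return sorted(chosen)


scored = [
    ((0, 1), "Arg0", 0.9),
    ((0, 3), "Arg0", 0.6),      # overlaps the better Arg0 span
    ((4, 7), "Arg1", 0.8),
    ((5, 7), "Arg1", 0.7),      # overlaps the better Arg1 span
    ((7, 9), "ArgM-TMP", 0.5),
    ((9, 10), "Arg0", 0.4),     # would repeat a core argument
]

print(greedy_decode(scored))
```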
Variations on common architecture
- Do only local scoring
- Skip directly to joint scoring
- Fourth step of fixing common errors
- Fourth step of enforcing coherence in solution
System combination
Combine:
- the output of several independent SRL basic systems
- several outputs from the same SRL system
  - Change input annotations or other internal parameters

Could combine the best among competing full solutions or combine fragments of alternative solutions
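Combining fragments of alternative solutions can be illustrated with a plain vote over labeled spans. This is one possible sketch rather than a method described in the slides; the function name, the vote threshold, and the three system outputs are invented for the example.

```python
from collections import Counter


def combine_by_vote(system_outputs, min_votes):
    """Keep each (span, label) fragment proposed by at least min_votes systems."""
    votes = Counter()
    for output in system_outputs:
        votes.update(set(output))  # each system votes once per fragment
    return sorted(arg for arg, n in votes.items() if n >= min_votes)


# Hypothetical outputs of three SRL systems for the same predicate.
system_outputs = [
    [((0, 1), "Arg0"), ((4, 7), "Arg1")],
    [((0, 1), "Arg0"), ((5, 7), "Arg1")],
    [((0, 1), "Arg0"), ((4, 7), "Arg1"), ((7, 9), "ArgM-TMP")],
]

print(combine_by_vote(system_outputs, min_votes=2))
```

Voting on fragments can still produce overlapping or repeated arguments when the systems disagree on boundaries, so in practice the combined set would need the same kind of global constraints used in joint scoring.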
Feature engineering
Given a verb and a candidate argument, three types of features are used:

1. Features that characterize the candidate argument & its context
   - e.g., phrase type, headword, governing category of the constituent
2. Features that characterize the verb predicate & its context
   - e.g., lemma, voice, & subcategorization pattern of the verb
3. Features that characterize the relation between the candidate & the predicate
   - e.g., L/R position of the constituent w.r.t. the verb, category path between them

There are lots of extensions to this (e.g., syntactic frame, syntactic path variants, semantic relation/selectional preferences) ...
We’ll talk more next time: read the Pradhan et al. paper
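Two of the relational features in item 3, the category path and the left/right position, can be computed from a constituency tree as sketched below. The `Node` class, the arrow notation, the choice to write the path from the candidate towards the predicate, and the toy tree are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self):
        for child in self.children:
            child.parent = self


def ancestors(node):
    """The node itself followed by its ancestors up to the root."""
    chain = [node]
    while chain[-1].parent is not None:
        chain.append(chain[-1].parent)
    return chain


def category_path(candidate, predicate):
    """Labels from the candidate up to the lowest common ancestor, then down."""
    up, down = ancestors(candidate), ancestors(predicate)
    down_ids = [id(n) for n in down]
    # First node above the candidate that also dominates the predicate.
    lca_up = next(i for i, n in enumerate(up) if id(n) in down_ids)
    lca_down = down_ids.index(id(up[lca_up]))
    up_part = [n.label for n in up[:lca_up + 1]]
    down_part = [n.label for n in reversed(down[:lca_down])]
    return "↑".join(up_part) + "".join("↓" + lab for lab in down_part)


def leaves(node):
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def position(candidate, predicate, root):
    """'before' if the candidate starts to the left of the predicate, else 'after'."""
    order = [id(n) for n in leaves(root)]
    first = order.index(id(leaves(candidate)[0]))
    return "before" if first < order.index(id(leaves(predicate)[0])) else "after"


# (S (NP he) (VP (MD will) (VP (VB return) (PP to his old bench))))
np, vb, pp = Node("NP"), Node("VB"), Node("PP")
tree = Node("S", [np, Node("VP", [Node("MD"), Node("VP", [vb, pp])])])

print(category_path(np, vb), position(np, vb, tree))
print(category_path(pp, vb), position(pp, vb, tree))
```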
Evaluation
CoNLL-2005 shared task showed:
- F1 scores around 80%
- Argument identification accounts for most of the errors
  - recall about 81% of correct unlabeled arguments
  - about 95% of these are assigned the correct semantic role
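For reference, precision, recall, and F1 over labeled arguments can be computed as below. This is a minimal sketch and not the official CoNLL-2005 scorer; the gold and predicted sets are invented. The last lines multiply the two figures from the slide, on the assumption that they compose multiplicatively, to estimate the share of gold arguments that are both found and correctly labeled.

```python
def precision_recall_f1(gold, predicted):
    """Score predicted (span, label) pairs against gold pairs."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Invented example: one boundary error on the Arg1 span.
gold = [((0, 1), "Arg0"), ((4, 7), "Arg1"), ((7, 9), "ArgM-TMP")]
predicted = [((0, 1), "Arg0"), ((5, 7), "Arg1"), ((7, 9), "ArgM-TMP")]

p, r, f1 = precision_recall_f1(gold, predicted)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")

# Slide figures: about 81% of arguments identified, about 95% of those labeled correctly.
labeled_recall = 0.81 * 0.95
print(f"approximate labeled recall: {labeled_recall:.2f}")
```

The boundary error counts twice, once as a missed gold argument and once as a spurious prediction, which is one reason identification errors weigh so heavily on the final score.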
References
Baker, Collin F., Charles J. Fillmore and John B. Lowe (1998). The Berkeley FrameNet Project. In Proceedings of ACL-98. Montreal, pp. 86–90.
Burchardt, Aljoscha, Katrin Erk, Anette Frank, Andrea Kowalski, Sebastian Pado and Manfred Pinkal (2006). The SALSA corpus: a German corpus resource for lexical semantics. In Proceedings of LREC-06. Genoa.
Dickinson, Markus and Chong Min Lee (2008). Detecting Errors in Semantic Annotation. In Proceedings of the 6th Language Resources and Evaluation Conference (LREC 2008). Marrakech, Morocco.
Gildea, Daniel and Daniel Jurafsky (2002). Automatic Labeling of Semantic Roles. Computational Linguistics 28(4), 245–288.
Komachi, Mamoru, Masaaki Nagata and Yuji Matsumoto (2006). Phrase Reordering for Statistical Machine Translation Based on Predicate-Argument Structure. In Proceedings of the International Workshop on Spoken Language Translation. Kyoto, Japan, pp. 77–82.
Morante, Roser and Antal van den Bosch (2007). Memory-Based Semantic Role Labeling of Catalan and Spanish. In Proceedings of RANLP-07. pp. 388–394.
Narayanan, Srini and Sanda Harabagiu (2004). Question Answering based on Semantic Structures. In International Conference on Computational Linguistics (COLING 2004). Geneva, Switzerland.
Palmer, Martha, Hoa Trang Dang and Joseph Rosenzweig (2000). Sense Tagging the Penn Treebank. In Proceedings of the Second Language Resources and Evaluation Conference, LREC-00. Athens. http://verbs.colorado.edu/~mpalmer/papers/lrec00.ps.gz.
Palmer, Martha, Daniel Gildea and Paul Kingsbury (2005). The Proposition Bank: An Annotated Corpus of Semantic Roles. Computational Linguistics 31(1), 71–105.
Pradhan, Sameer, Kadri Hacioglu, Valerie Krugler, Wayne Ward, James H. Martin and Daniel Jurafsky (2005). Support Vector Learning for Semantic Argument Classification. Machine Learning 60(1), 11–39.
Surdeanu, Mihai, Sanda Harabagiu, John Williams and Paul Aarseth (2003). Using Predicate-Argument Structures for Information Extraction. In Proceedings of ACL-03.
Taule, M., J. Aparicio, J. Castellví and M.A. Martí (2005). Mapping syntactic functions into semantic roles. In Proceedings of TLT-05. Barcelona.
Toutanova, Kristina, Aria Haghighi and Christopher Manning (2005). Joint Learning Improves Semantic Role Labeling. In Proceedings of ACL-05. Ann Arbor, Michigan, pp. 589–596.
Xue, Nianwen and Martha Palmer (2004). Calibrating Features for Semantic Role Labeling. In Dekang Lin and Dekai Wu (eds.), Proceedings of EMNLP 2004. Barcelona, pp. 88–94.