Introduction Our Approach Experiments Conclusion

Memory-Based Acquisition of Argument Structures and its Application to Implicit Role Detection

Christian Chiarcos & Niko Schenk
{chiarcos,n.schenk}@em.uni-frankfurt.de

Applied Computational Linguistics Lab (ACoLi)
Institut für Informatik und Mathematik
Goethe-University Frankfurt am Main, Germany

September 2, 2015



Background

Traditional Semantic Role Labeling (SRL) detects events (e.g., verbal or nominal predicates) and associated semantic roles (agent, theme, recipient, ...) as overtly (explicitly) realized in the current sentence.


An Example from Roth & Frank (2013)

“El Salvador [...] still has troops in Iraq. Nicaragua and Honduras have withdrawn their troops.”


Labeling Explicit Roles

PropBank-style (Palmer et al., 2005) SRL parse for the first sample sentence.


The Phenomenon of Implicit Roles

Note that the filler for the implicit role A2-DIR is not overtly realized in this sentence (but can be resolved to Iraq in the first sentence).


Problem Statement

Many role realizations are suppressed on the surface level.

Traditional SRL cannot detect them:
It is restricted to isolated sentences, whereas implicit roles require discourse analysis.

Downstream NLU applications would greatly benefit from implicit roles as 'supplementary' information:
Recognizing textual entailment, summarization, question answering, or, e.g., as input to Abstract Meaning Representation (Banarescu et al., 2013).


Current Issues in Implicit Role Detection

Previous approaches combined costly, specialized lexical resources (e.g., VerbNet, FrameNet) with word-specific heuristics and/or static templates:

Example:
Predicate: withdraw
Lexical entry: remove-10.1, A0/Agent, A1/Theme, A2/Source.

→ Given the definition of the lexical entry, define all roles as implicit which are not overtly realized in the sentence.
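The template rule above amounts to a set difference between the roles listed in the lexical entry and the roles realized in the sentence. A minimal sketch, assuming a hypothetical one-entry lexicon (`LEXICON`) and a hypothetical helper name; only the entry for withdraw comes from the slide:

```python
# Hypothetical mini-lexicon; the single entry mirrors the slide's example.
LEXICON = {
    "withdraw": {
        "class": "remove-10.1",
        "roles": {"A0": "Agent", "A1": "Theme", "A2": "Source"},
    }
}


def implicit_roles_by_template(predicate, realized_roles, lexicon=LEXICON):
    """Return every role of the lexical entry that is not overtly realized."""
    entry = lexicon[predicate]
    return sorted(set(entry["roles"]) - set(realized_roles))


# "Nicaragua and Honduras have withdrawn their troops." realizes A0 and A1.
print(implicit_roles_by_template("withdraw", {"A0", "A1"}))  # ['A2']
```

Every unrealized role is returned with equal certainty, which is the inflexibility that pattern-based methods are criticized for.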


Pattern-based methods perform reasonably well, but they have drawbacks:

1 They are inflexible and absolute. (Not all candidate implicit roles are equally likely to be missing.)

2 They are expensive. (They require handcrafted, idiosyncratic rules and language-specific lexical resources.)

3 Earlier studies restricted implicit information to core roles. (Non-core roles also provide valid information.)


1 Introduction

2 Our Approach

3 Experiments
    Experiment 1
    Experiment 2

4 Conclusion


General Idea & Motivation

We argue that implicit role detection

1 should be probabilistic instead of rule-based (memory-based).

2 should be domain-independent.

3 should be based on a generic role set (PropBank, to obtain better generalizations).


We propose a novel strategy for detecting implicit roles (not their fillers):

Our models use large corpora with automated annotations for explicit roles, only to capture the distribution of predicates and their associated roles.


Memory-Based Learning of Argument Structures

Generate probabilistic models from the WaCkypedia_EN corpus (Baroni et al., 2009) by annotating it with SRL information. For each predicate, derive all probabilities

$$P(r \mid R, \mathit{predicate})$$

where $\mathcal{R}$ is the role inventory of the parser, $R \subseteq \mathcal{R}$ a (sub)set of explicitly realized semantic roles, and $r \in \mathcal{R} \setminus R$ an arbitrary semantic role.

Example: $P(\mathrm{A2} \mid \{\mathrm{A0}, \mathrm{A1}\}, \mathit{withdraw}) = 0.84$

We train two types of models: sense-distinguished (SD) and sense-ignorant (SI) models (cf. play.01.n or play.n).
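The slides do not spell out how the probabilities are estimated. One plausible reading is a relative frequency over the SRL-annotated corpus: among all instances of a predicate that realize at least the roles in R, the share that also realize r. The sketch below assumes exactly that; the function names and the four-sentence toy corpus are hypothetical and do not reproduce the 0.84 reported above.

```python
from collections import Counter
from itertools import combinations


def count_role_sets(instances):
    """Count, per predicate, every subset of the explicitly realized roles."""
    counts = Counter()
    for predicate, roles in instances:
        roles = sorted(set(roles))
        for size in range(len(roles) + 1):
            for subset in combinations(roles, size):
                counts[(predicate, frozenset(subset))] += 1
    return counts


def role_probability(counts, role, realized, predicate):
    """Relative-frequency estimate of P(role | realized, predicate)."""
    realized = frozenset(realized)
    if role in realized:
        return 0.0  # r must come from the inventory minus R
    seen = counts[(predicate, realized)]
    if seen == 0:
        return 0.0  # unseen context: no evidence either way
    return counts[(predicate, realized | {role})] / seen


# Hypothetical toy corpus: (predicate, explicitly realized roles).
toy_corpus = [
    ("withdraw", {"A0", "A1", "A2"}),
    ("withdraw", {"A0", "A1", "A2"}),
    ("withdraw", {"A0", "A1"}),
    ("withdraw", {"A0"}),
]
counts = count_role_sets(toy_corpus)
p = role_probability(counts, "A2", {"A0", "A1"}, "withdraw")  # 2 of 3 instances
```

A sense-distinguished model would use keys such as `withdraw.01` instead of the bare lemma; the counting itself stays the same.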


Experiments

Evaluation data:
SemEval Shared Task 10 (Ruppenhofer et al., 2010), the de facto standard, with a training/test split. Literary texts (novels) with manual implicit role annotations.


Experiment 1

Simplified setting:
Only predicates with one explicit role and one implicit role.

Task Description:
Predict the single missing role $n_i$ for each predicate.

Approach:
Train two gold models on the complete PropBank and NomBank corpora, and different models on varying sizes of the WaCkypedia dump. Return

$$n_i = \arg\max_{n \in \mathcal{R} \setminus R} P(n \mid o_j, \mathit{predicate}),$$

where $R = \{o_j\}$ contains the predicate's single explicit role and $\mathcal{R} = \{\mathrm{A0}, \ldots, \mathrm{A4}\} \supset R$ is the role inventory.
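The decision rule can be sketched as a plain argmax over the remaining roles. The probability table below is hypothetical and merely stands in for a trained SD or SI model; `predict_missing_role` and `table_prob` are illustrative names.

```python
ROLE_INVENTORY = ("A0", "A1", "A2", "A3", "A4")

# Hypothetical probabilities P(n | o_j, predicate), keyed by (predicate, o_j).
TOY_TABLE = {
    ("withdraw", "A1"): {"A0": 0.70, "A2": 0.25, "A3": 0.03, "A4": 0.02},
}


def table_prob(role, explicit_role, predicate):
    """Look up P(role | explicit_role, predicate); unseen events get 0.0."""
    return TOY_TABLE.get((predicate, explicit_role), {}).get(role, 0.0)


def predict_missing_role(prob, predicate, explicit_role, inventory=ROLE_INVENTORY):
    """Return the role n in the inventory minus {o_j} that maximizes P(n | o_j, predicate)."""
    candidates = [role for role in inventory if role != explicit_role]
    return max(candidates, key=lambda role: prob(role, explicit_role, predicate))


print(predict_missing_role(table_prob, "withdraw", "A1"))  # A0
```

In this sketch, ties are broken in favor of the role that comes first in the inventory.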


Experiment 1 – Finding the Single Implicit Role

[Figure: plot of accuracy (y-axis, 0.0 to 0.9) against the number of training sentences (x-axis, 0.1k to 1,000k) for the classifiers MC, PB, SD, and SI; a further curve label, NB, appears in the plot.]

Prediction accuracies for verbal predicates. Majority class baseline (MC) and PropBank (PB) gold model. The log-scaled x-axis only refers to the SD and SI models.


Experiment 1 – Results

The trend is the same for both verbs and nouns:

1.) Performance of SD and SI models increases with the training size.

2.) SD models perform better than SI models.

3.) SD models significantly outperform the gold model for verbs (p < .01).


Experiment 2 – A Realistic Setting

Task Description:
For all predicate instances (≈1,000) and their explicit roles (any combination of A0 to A4, or none), predict the correct set of implicit roles.

Evaluate against the Ruppenhofer et al. (2010) SemEval data.
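Since each instance receives a set of predicted implicit roles, precision, recall, and F1 can be computed over role sets. The slides do not state the averaging scheme; the sketch below assumes micro-averaging over all instances, and the example data is hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Micro-averaged precision, recall, and F1 over per-instance role sets."""
    true_positives = sum(len(p & g) for p, g in zip(predicted, gold))
    n_predicted = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    precision = true_positives / n_predicted if n_predicted else 0.0
    recall = true_positives / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical predictions and gold annotations for three predicate instances.
predicted = [{"A2"}, {"A1", "A2"}, set()]
gold = [{"A2"}, {"A1"}, {"A0"}]
precision, recall, f1 = precision_recall_f1(predicted, gold)
```

Here two of three predicted roles are correct and two of three gold roles are found, so all three scores equal 2/3.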


Experiment 2 – Our Classifiers

Baseline (A)
Predicts implicit roles randomly, emulating their frequency distribution in the test set.

Supervised Classifiers (B)
Predict the most frequent implicit role pattern obtained from the training section, either for each predicate (B1) or for each predicate + its overt role(s) (B2).

Mildly Supervised Classifiers (C)
Use information only from automatically annotated explicit roles from WaCkypedia. Different thresholds are introduced for the number of explicit roles; they are estimated by optimizing the F1 score on the training section of the SemEval data.

Hybrid Classifiers (CB)
Combine supervised and mildly supervised classifiers.
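A rough sketch of the B and C classifier families, under stated assumptions: B stores the most frequent implicit role pattern per key (the predicate for B1, the predicate plus its overt roles for B2), and C is read here as thresholding P(r | R, predicate) with one cutoff per number of explicit roles. All function names, the training triples, and the probability function are hypothetical; how the hybrid CB combines the two is not specified on the slide and is not modeled.

```python
from collections import Counter, defaultdict


def _key(predicate, overt, use_overt_roles):
    """B1 keys on the predicate alone, B2 on the predicate plus its overt roles."""
    return (predicate, frozenset(overt)) if use_overt_roles else predicate


def train_pattern_model(training, use_overt_roles):
    """Store the most frequent implicit role pattern for every key."""
    patterns = defaultdict(Counter)
    for predicate, overt, implicit in training:
        patterns[_key(predicate, overt, use_overt_roles)][frozenset(implicit)] += 1
    return {k: c.most_common(1)[0][0] for k, c in patterns.items()}


def predict_pattern(model, predicate, overt, use_overt_roles):
    """Return the stored pattern, or the empty set for unseen keys."""
    return model.get(_key(predicate, overt, use_overt_roles), frozenset())


def predict_by_threshold(prob, predicate, overt, thresholds,
                         inventory=("A0", "A1", "A2", "A3", "A4")):
    """C-style rule: a role is implicit if its probability reaches the cutoff
    chosen for this number of explicit roles."""
    overt = frozenset(overt)
    cutoff = thresholds[len(overt)]
    return frozenset(r for r in inventory
                     if r not in overt and prob(r, overt, predicate) >= cutoff)


# Hypothetical training triples: (predicate, overt roles, implicit roles).
training = [
    ("withdraw", {"A0", "A1"}, {"A2"}),
    ("withdraw", {"A0", "A1"}, {"A2"}),
    ("withdraw", {"A0"}, set()),
    ("withdraw", {"A0"}, set()),
    ("withdraw", {"A0"}, set()),
]
b1 = train_pattern_model(training, use_overt_roles=False)
b2 = train_pattern_model(training, use_overt_roles=True)
```

On this toy data, B1 predicts no implicit role for withdraw because the empty pattern is the most frequent one overall, while B2 predicts A2 once A0 and A1 are known to be overt. That shows why conditioning on the overt roles can help.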


Experiment 2 – Results

All classifiers demonstrate significant improvements over the baseline.

Supervised classifiers are very accurate (precision).

Mildly supervised classifiers outperform the supervised classifiers (F1, recall).
Performance increases with an increasing number of parameters, i.e., one should incorporate a maximum of available information on explicit roles.
Encoding the distinction between verbal/nominal predicates increases the F1 score.

Hybrid classifiers achieve the best results.

The recognition rate of implicit roles (81%) outperforms the state of the art (66%, Laparra and Rigau, 2012).


Summary & Conclusion

A novel, memory-based approach to implicit role detection (uses ONLY explicit annotations to infer implicit information).

Flexible (probabilistic).

Language- and domain-independent.

State of the art in terms of recognition rate, but still suffers in precision.

Future work:
Extend the memory-based strategy to implicit role resolution + replicate on FrameNet roles.


References

[1] Collin F. Baker, Charles J. Fillmore, and John B. Lowe. 1998. The Berkeley FrameNet Project. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics - Volume 1, ACL '98, pages 86–90, Stroudsburg, PA, USA. Association for Computational Linguistics.

[2] Martha Palmer, Daniel Gildea, and Paul Kingsbury. 2005. The Proposition Bank: An Annotated Corpus of Semantic Roles. Comput. Linguist., 31(1):71–106, March.

[3] Josef Ruppenhofer, Caroline Sporleder, Roser Morante, Collin Baker, and Martha Palmer. 2010. SemEval-2010 Task 10: Linking Events and Their Participants in Discourse. In Proceedings of the 5th International Workshop on Semantic Evaluation, SemEval '10, pages 45–50, Stroudsburg, PA, USA. Association for Computational Linguistics.

[4] Walter Daelemans and Antal van den Bosch. 2009. Memory-Based Language Processing. Cambridge University Press, New York, NY, USA, 1st edition.

[5] Marco Baroni, Silvia Bernardini, Adriano Ferraresi, and Eros Zanchetta. 2009. The WaCky Wide Web: A Collection of Very Large Linguistically Processed Web-Crawled Corpora. Language Resources and Evaluation, 43(3):209–226.

[6] Egoitz Laparra and German Rigau. 2012. Exploiting Explicit Annotations and Semantic Types for Implicit Argument Resolution. In Sixth IEEE International Conference on Semantic Computing, ICSC 2012, Palermo, Italy. IEEE Computer Society.


[7] Laura Banarescu, Claire Bonial, Shu Cai, Madalina Georgescu, Kira Griffitt, Ulf Hermjakob, Kevin Knight, Philipp Koehn, Martha Palmer, and Nathan Schneider. 2013. Abstract Meaning Representation for Sembanking. Proc. Linguistic Annotation Workshop.

[8] Christian Chiarcos and Niko Schenk. 2015. Memory-Based Acquisition of Argument Structures and its Application to Implicit Role Detection. In Proceedings of SIGDIAL 2015, Prague, Czech Republic.


Appendix — DNI vs. INI Classification Results
