Learning Statistical Scripts with LSTM Recurrent Neural Networks
Karl Pichotta & Raymond J. Mooney
Department of Computer Science, The University of Texas at Austin
AAAI 2016


Page 1: Learning Statistical Scripts with LSTM Recurrent Neural Networks

Learning Statistical Scripts with

LSTM Recurrent Neural Networks

Karl Pichotta & Raymond J. Mooney
Department of Computer Science, The University of Texas at Austin

AAAI 2016

1

Page 2

Motivation

• Following the Battle of Actium, Octavian invaded Egypt. As he approached Alexandria, Antony's armies deserted to Octavian on August 1, 30 BC.

• Did Octavian defeat Antony?

2


Page 4

Motivation

• Antony’s armies deserted to Octavian ⇒ Octavian defeated Antony

• Not simply a paraphrase rule!

• Need world knowledge.

4

Page 5

Scripts

• Scripts: models of events in sequence.

• “Event”: verb + arguments.

• Events don’t appear in text randomly, but according to world dynamics.

• Scripts try to capture these dynamics.

• Enable automatic inference of implicit events, given events in text (e.g. Octavian defeated Antony).

5

Page 6

Outline

• Background

• Methods

• Experiments

• Conclusion

6


Page 7

Outline

• Background

• Statistical Scripts

• Recurrent Neural Nets


7

Page 8

Background: Statistical Scripts

• Statistical Scripts: Statistical Models of Event Sequences.

• Non-statistical scripts date back to the 1970s [Schank & Abelson 1977].

• Statistical script learning is a small-but-growing subcommunity [e.g. Chambers & Jurafsky 2008].

• Model the probability of an event given prior events.

8

Page 9

Background: Statistical Script Learning

9

Millions of Documents → NLP Pipeline (Syntax, Coreference) → Millions of Event Sequences → Train a Statistical Model

Page 10

Background: Statistical Script Inference

10

New Test Document → NLP Pipeline (Syntax, Coreference) → Single Event Sequence → Query Trained Statistical Model → Inferred Probable Events

Page 11

Outline

• Background

• Statistical Scripts

• Recurrent Neural Nets


11

Page 12

Background: RNNs

• Recurrent Neural Nets (RNNs): neural nets with cycles in the computation graph.

• RNN Sequence Models: Map inputs x1, …, xt to outputs o1, …, ot via learned latent vector states z1, …, zt.

12
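The recurrence described above can be sketched concretely. A minimal NumPy illustration with toy dimensions and random weights standing in for learned parameters (all names and sizes here are illustrative, not from the paper):

```python
import numpy as np

def rnn_step(x_t, z_prev, W_xz, W_zz, W_zo, b_z, b_o):
    """One step of a simple (Elman-style) RNN: new latent state z_t from
    input x_t and the previous state; output o_t read off from z_t."""
    z_t = np.tanh(W_xz @ x_t + W_zz @ z_prev + b_z)
    o_t = W_zo @ z_t + b_o  # logits; a softmax would follow for prediction
    return z_t, o_t

# Toy dimensions: 4-dim inputs, 3-dim latent state, 5-dim outputs.
rng = np.random.default_rng(0)
W_xz = rng.normal(size=(3, 4))
W_zz = rng.normal(size=(3, 3))
W_zo = rng.normal(size=(5, 3))
b_z, b_o = np.zeros(3), np.zeros(5)

z = np.zeros(3)          # initial latent state z_0
outputs = []
for x in rng.normal(size=(6, 4)):  # a length-6 input sequence x_1..x_6
    z, o = rnn_step(x, z, W_xz, W_zz, W_zo, b_z, b_o)
    outputs.append(o)
```

The key point is that the same weights are applied at every timestep, with the latent state carrying information forward.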

Page 13

Background: RNNs

[Figure: Elman RNN unrolled over time — inputs x1, x2, …, xt; latent states z1, z2, …, zt; outputs o1, o2, …, ot.]

13

[Elman 1990]

Page 14

Background: RNNs

• Hidden Unit can be arbitrarily complicated, as long as we can calculate gradients!

14

[Figure: unrolled RNN schematic — each input x1, …, xT feeds an RNN unit, producing outputs o1, …, oT.]

Page 15

Background: LSTMs

• Long Short-Term Memory (LSTM): More complex hidden RNN unit. [Hochreiter & Schmidhuber, 1997]

• Explicitly addresses two issues:

• Vanishing Gradient Problem.

• Long-Range Dependencies.

15
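A minimal sketch of one LSTM step, using the standard gate formulation (this is the textbook cell, not necessarily the exact variant used in the paper; all dimensions and weights are toy values):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W maps [x_t; h_prev] to the stacked input/forget/output
    gate and candidate pre-activations. The largely additive cell-state
    update c_t is what mitigates vanishing gradients and lets information
    persist across long ranges."""
    d = h_prev.shape[0]
    a = W @ np.concatenate([x_t, h_prev]) + b
    i = sigmoid(a[:d])           # input gate
    f = sigmoid(a[d:2*d])        # forget gate
    o = sigmoid(a[2*d:3*d])      # output gate
    g = np.tanh(a[3*d:])         # candidate cell values
    c_t = f * c_prev + i * g     # gated, additive memory update
    h_t = o * np.tanh(c_t)
    return h_t, c_t

rng = np.random.default_rng(0)
x_dim, h_dim = 4, 3
W = rng.normal(size=(4 * h_dim, x_dim + h_dim))
b = np.zeros(4 * h_dim)

h, c = np.zeros(h_dim), np.zeros(h_dim)
for x in rng.normal(size=(5, x_dim)):  # a length-5 input sequence
    h, c = lstm_step(x, h, c, W, b)
```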

Page 16

Background: LSTMs

• LSTMs have recently been successful on many hard NLP tasks:

• Machine Translation [Kalchbrenner & Blunsom 2013, Bahdanau et al. 2015].

• Captioning Images/Videos [Donahue et al. 2015, Venugopalan et al. 2015].

• Language Modeling [Sundermeyer et al. 2012, Kim et al. 2016].

• Question Answering [Hermann et al. 2015, Gao et al. 2015].

• Parsing [Vinyals et al. 2015].

16

Page 17

Outline

• Background

• Methods

• Experiments

• Conclusion

17



Page 19

LSTM Script models

• Train an LSTM sequence model on event sequences.

• Events are (verbs + arguments).

• Arguments can have noun info, coref info, or both.

• To infer events, the model generates likely events given the observed sequence.

19
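The event representation on these slides can be illustrated in code. A hypothetical sketch of flattening (verb, subject, object, prep-object, preposition) tuples into a single token stream for the LSTM, using “.” for empty argument slots as in the later example slides (the function name is illustrative, not from the paper):

```python
def flatten_events(events, null="."):
    """Flatten a list of (verb, subj, obj, prep_obj, prep) 5-tuples into
    one token sequence; empty slots become the null marker "."."""
    tokens = []
    for verb, subj, obj, pobj, prep in events:
        tokens.extend([verb, subj or null, obj or null,
                       pobj or null, prep or null])
    return tokens

# The "Mary married Matthew / she loved him" example from the slides:
events = [("marry", "mary", "matthew", "21", "at"),
          ("love", "she", "him", None, None)]
tokens = flatten_events(events)
# tokens -> ['marry', 'mary', 'matthew', '21', 'at',
#            'love', 'she', 'him', '.', '.']
```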

Page 20

LSTM Script models

• Mary’s late husband Matthew, whom she married at 21 because she loved him, …

20

[marry, mary, matthew, at, 21]; [love, she, him]

[Figure: unrolled RNN schematic — inputs x1, …, xT feed RNN units, producing outputs o1, …, oT.]

Page 21

LSTM Script models

• Mary’s late husband Matthew, whom she married at 21 because she loved him, …

21

[marry, mary, matthew, at, 21]; [love, she, him]

[Figure: the flattened event tokens — marry, mary, matthew, at, 21, love, she, him — fed into the LSTM one per timestep, producing outputs o1, …, o10.]

Page 22

LSTM Script models

• Mary’s late husband Matthew, whom she married at 21 because she loved him, …

• Each token is tagged with its role (verb, subj, obj, prep-obj, prep) and an entity/coreference ID (here mary/she = 1, matthew/him = 2; “-” marks no entity).

22

[Figure: the LSTM as a next-token model — each input token (marry, mary, matthew, at, 21, love, she, him) is used to predict the following token in the event sequence.]

Page 23

Outline

• Background

• Methods

• Experiments

• Conclusion

23


Page 24

Experimental Setup

• Train on English Wikipedia.

• Use Stanford CoreNLP to extract event sequences.

• Train LSTM using Batch Stochastic Gradient Descent with Momentum.

• To infer next events, have the LSTM generate the highest-probability additional events.

24
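The momentum update mentioned above, in a minimal sketch (the hyperparameter values and the quadratic toy objective are illustrative, not the paper's):

```python
def momentum_step(params, grads, velocity, lr=0.1, mu=0.9):
    """One SGD-with-momentum update: the velocity accumulates a decaying
    running sum of gradients, and parameters move along the velocity."""
    new_v = [mu * v - lr * g for v, g in zip(velocity, grads)]
    new_p = [p + v for p, v in zip(params, new_v)]
    return new_p, new_v

# Toy objective 0.5 * p^2, whose gradient is simply p.
params, velocity = [1.0], [0.0]
for _ in range(3):
    grads = [params[0]]
    params, velocity = momentum_step(params, grads, velocity)
# After 3 steps: p goes 1.0 -> 0.9 -> 0.72 -> 0.486
```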

Page 25

Evaluation

• “Narrative Cloze” (Chambers & Jurafsky, 2008): from an unseen document, hold one event out, try to infer it given remaining document.

• “Recall at k” (Jans et al., 2012): make the top k inferences, calculate recall of held-out events.

• (More metrics in the paper.)

25

Page 26

Evaluation

• Three systems:

• Unigram: Always guess the most common events.

• Bigram: Variations of Pichotta & Mooney (2014).

• Uses event co-occurrence counts.

• Best published system on the task.

• LSTM: LSTM script system (this work).

26

Page 27

Results: Predicting Verbs & Coreference Info

Recall at 25 for inferring Verbs & Coref info:

• Unigram: 0.101

• Bigram: 0.124

• LSTM: 0.152

27

Page 28

Results: Predicting Verbs & Nouns

Recall at 25 for inferring Verbs & Nouns:

• Unigram: 0.025

• Bigram: 0.037

• LSTM: 0.061

28

Page 29

Human Evaluations

• Solicit judgments on individual inferences on Amazon Mechanical Turk.

• Have annotators rate inferences from 1-5 (or mark “Nonsense,” scored 0).

• More interpretable.

29

Page 30

Results: Crowdsourced Eval

Average annotator rating (scale: 0 = “Nonsense”, 1 = “Very Unlikely”, 2 = “Unlikely”, 3 = “Neutral”, 4 = “Likely”):

• Random: 0.87

• Bigram: 2.87

• LSTM: 3.67

30

Page 31

Generated “Story”

Generated event tuples → English descriptions:

• (bear, ., ., kingdom, into) → Born into a kingdom, …

• (attend, she, brown, graduation, after) → she attended Brown after graduation

• (earn, she, master, university, from) → She earned her Masters from the University

• (admit, ., she, university, to) → She was admitted to a University

• (receive, she, bachelor, university, from) → She had received a bachelors from a University

• (involve, ., she, production, in) → She was involved in the production

• (represent, she, company, ., .) → She represented the company.

31

Page 32

Conclusion

• Presented a method for inferring implicit events with LSTMs.

• Superior performance on reconstructing held-out events and inferring novel events.

32

Page 33

Thanks!

33