40
lti Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

Embed Size (px)

Citation preview

Page 1: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Extracting Simplified Statementsfor Factual Question Generation

Michael Heilman and Noah A. Smith

1

Page 2: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Automatic Factual Question Generation (QG)

Input: textOutput: questions for reading assessment (e.g., for a closed-book quiz)

2

We focus on sentence-level factual questions.We focus on sentence-level factual questions.

Page 3: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

…Prime Minister Vladimir V. Putin, the country's paramount leader, cut short a trip to Siberia, returning to Moscow to oversee the federal response. Mr. Putin built his reputation in part on his success at suppressing terrorism, so the attacks could be considered a challenge to his stature….

The ProblemIn complex sentences, facts can be presented with varied and complex linguistic constructions.

3

Page 4: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

In complex sentences, facts can be presented with varied and complex linguistic constructions.

…Prime Minister Vladimir V. Putin, the country's paramount leader, cut short a trip to Siberia, returning to Moscow to oversee the federal response. Mr. Putin built his reputation in part on his success at suppressing terrorism, so the attacks could be considered a challenge to his stature….

The Problem

4

main clause

Page 5: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

In complex sentences, facts can be presented with varied and complex linguistic constructions.

…Prime Minister Vladimir V. Putin, the country's paramount leader, cut short a trip to Siberia, returning to Moscow to oversee the federal response. Mr. Putin built his reputation in part on his success at suppressing terrorism, so the attacks could be considered a challenge to his stature….

The Problem

5

main clause appositive

Page 6: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

In complex sentences, facts can be presented with varied and complex linguistic constructions.

…Prime Minister Vladimir V. Putin, the country's paramount leader, cut short a trip to Siberia, returning to Moscow to oversee the federal response. Mr. Putin built his reputation in part on his success at suppressing terrorism, so the attacks could be considered a challenge to his stature….

The Problem

6

main clause appositive

participial phrase

Page 7: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

In complex sentences, facts can be presented with varied and complex linguistic constructions.

…Prime Minister Vladimir V. Putin, the country's paramount leader, cut short a trip to Siberia, returning to Moscow to oversee the federal response. Mr. Putin built his reputation in part on his success at suppressing terrorism, so the attacks could be considered a challenge to his stature….

The Problem

7

main clause appositive

participial phraseconjunction of clauses

Page 8: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

In complex sentences, facts can be presented with varied and complex linguistic constructions.

• Prime Minister Vladimir V. Putin cut short a trip to Siberia.• Prime Minister Vladimir V. Putin was the country's

paramount leader.• Prime Minister Vladimir V. Putin returned to Moscow to

oversee the federal response. • Mr. Putin built his reputation in part on his success at

suppressing terrorism.• The attacks could be considered a challenge to his stature.

The Problem

8

Output:Output:

Page 9: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

The Rest of the Talk

Input: complex sentenceOutput: set of simple declarative sentences

Our method:• Uses rules to extract and simplify sentences• Is motivated by linguistic knowledge• Outperformed a sentence compression baseline

9

Easier to convert into questionsEasier to convert into questions

Page 10: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Outline

• Introduction and motivation• Our Approach• Simplification and extraction operations• Evaluation• Conclusions

10

Page 11: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Alternative: Sentence CompressionInput: Complex sentenceOutput: Simpler sentence that conveys the main

point.

Suitable for QG?• Only one output per input• Most methods only delete words

11

Knight & Marcu 2000; Dorr et al. 2003;McDonald 2006; Clarke 2008; Martins & Smith 2009;inter alia

Knight & Marcu 2000; Dorr et al. 2003;McDonald 2006; Clarke 2008; Martins & Smith 2009;inter alia

Page 12: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Our Approach

• We extract and simplify multiple statements from complex sentences.

• We include operations for various syntactic constructions.– encoded with pattern matching rules for trees

12

Similar work: Klebanov et al. 2004Similar work: Klebanov et al. 2004

Page 13: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Example: Extracting from Appositives

13

Input: Putin, the Russian Prime Minister, visited Moscow.

Desired Output: Putin was the Russian Prime Minister.

Page 14: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Example: Extracting from Appositives

14

NP

Putin visited

VBD

NP

ROOT

S

,

VP

, ,

, NP

Siberia

NP

the Russian Prime Minister

(mainverb)(appositive)(noun)

Page 15: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Example: Extracting from Appositives

15

NP < (NP=noun !$-- NP $+ (/,/ $++ NP|PP=appositive !$CC|CONJP)) >> (ROOT << /^VB.*/=mainverb)

NP

Putin visited

VBD

NP

ROOT

S

,

VP

, ,

, NP

Siberia

NP

the Russian Prime Minister

(mainverb)(appositive)(noun)

Page 16: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Example: Extracting from Appositives

16

NP

Putin visited

VBDNP

the Russian Prime Minister

Page 17: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Example: Extracting from Appositives

17

NP

Putin was

VBDNP

the Russian Prime Minister

Singular past tense form of beSingular past tense form of be

Page 18: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Example: Extracting from Appositives

18

was

VBDNP

Putin

NP

the Russian Prime Minister

S

ROOT

VP

Page 19: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Implementation

• Representation: phrase structure trees from the Stanford Parser

• Syntactic rules are written in the Tregex tree searching language– Tregex operators encode tree relations such as

dominance, sisterhood, etc.

19

Klein & Manning 2003Klein & Manning 2003

Levy & Andrew 2006Levy & Andrew 2006

Page 20: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Outline

• Introduction and motivation• Our Approach• Simplification and extraction operations• Evaluation• Conclusions

20

Page 21: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Encoding Linguistic Knowledge

Given an input sentence A that is assumed true, we aim to extract sentences B that are also true.

Our operations are informed by two phenomena:• semantic entailment • presupposition

21

Page 22: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Semantic Entailment

A entails B:B is true whenever A is true.

22

Levinson 1983Levinson 1983

Page 23: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

A: However, Jefferson did not believe the Embargo Act, which restricted trade with Europe, would hurt the American economy.

Simplification by Removing Modifiers

23

Entailment holds when removing certain types of modifiers.Entailment holds when removing certain types of modifiers.

Page 24: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

A: However, Jefferson did not believe the Embargo Act, which restricted trade with Europe, would hurt the American economy.

Simplification by Removing Modifiers

24

Entailment holds when removing certain types of modifiers.Entailment holds when removing certain types of modifiers.

discourse marker non-restrictive relative clause

Page 25: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

A: However, Jefferson did not believe the Embargo Act, which restricted trade with Europe, would hurt the American economy.

Simplification by Removing Modifiers

25

B: Jefferson did not believe the Embargo Act would hurt the American economy.

Entailment holds when removing certain types of modifiers.Entailment holds when removing certain types of modifiers.

discourse marker non-restrictive relative clause

Page 26: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Extracting from Conjunctions

26

In most clausal and verbal conjunctions, the individual conjuncts are entailed.In most clausal and verbal conjunctions, the individual conjuncts are entailed.

A: Mr. Putin built his reputation in part on his success at suppressing terrorism, so the attacks could be considered a challenge to his stature.

B2: The attacks could be considered a challenge to his stature.

B1: Mr. Putin built his reputation in part on his success at suppressing terrorism.

Page 27: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Extracting from Presuppositions

In some constructions, B is true regardless of whether the main clause of sentence A is true.•i.e., B is presupposed to be true.

In some constructions, B is true regardless of whether the main clause of sentence A is true.•i.e., B is presupposed to be true.

27

Levinson 1983Levinson 1983

A: Hamilton did not like Jefferson, the third U.S. President.

B: Jefferson was the third U.S. President.

negation of main clausenegation of main clause

Page 28: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Presupposition TriggersMany presuppositions have clear syntactic or lexical associations.

28

Trigger Examplenon-restrictive appositives Jefferson, the third U.S. President,

… non-restrictive relative clauses

Jefferson, who was the third U.S. President…

participial modifiers Jefferson, being the third U.S. President, …

temporal subordinate clauses

Before Jefferson was the third U.S. President, …

Jefferson was the third U.S. President.Jefferson was the third U.S. President.

Page 29: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

(Over)simplified PseudocodeTake as input a tree t.Extract a set of declarative sentence trees Textracted from

constructions in t.

For each t’ in Textracted :Simplify t’ by removing modifiers.Extract trees Tconjuncts from conjunctions in t’.

For each tconjunct in Tconjuncts :Tresult = Tresult {tconjunct}

Return Tresult

29

by entailment

by entailment

primarily by presuppositionprimarily by presupposition

Page 30: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Outline

• Introduction and motivation• Our Approach• Simplification and extraction operations• Evaluation• Conclusions

30

Page 31: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Baselines• HedgeTrimmer– A rule-based sentence compression algorithm– Iteratively performs simplifying operations until

the input is less than a specified length (15 here).

• “Main clause only”– Only the simplified main clause extracted by the

full system.

31

Dorr et al. 2003Dorr et al. 2003

Both baselines produce one output per input.Both baselines produce one output per input.

Page 32: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Research Questions1. How long are the simplified outputs and how

many are there?– Extracted statements from 25 previously unseen

Encyclopedia Britannica articles about cities.

2. How well do the extracted statements cover the information in the input texts?– % of input words in at least one output.

32

Barzilay & Elhadad 2003Barzilay & Elhadad 2003

Page 33: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Results: Length & Coverage

33

Input Texts

Hedge-Trimmer

Main clause only

Full

Sentences per input

1.00 1.00 1.00 1.63

Sentence length

23.5 12.4 15.4 13.4

Word coverage (%)

100.0 61.4 69.1 87.9

Page 34: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Research Questions

3. How well does our system preserve fluency and correctness?– Two raters judged simplified outputs for fluency

and correctness using 1-5 scales.– We averaged the raters’ scores.

34

Inter-rater agreement: r = .92 for fluencyr = .82 for correctness

Inter-rater agreement: r = .92 for fluencyr = .82 for correctness

Page 35: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Results: Fluency & Correctness

Input Texts

Hedge-Trimmer

Main clause only

Full

Fluency - 3.53 4.72 4.75Correctness - 3.75 4.71 4.67Rated 5 for both (%)

- 44.0 78.7 75.0

35

Differences between HedgeTrimmer and Full are statistically significant (p < .05).Differences between HedgeTrimmer and Full are statistically significant (p < .05).

Page 36: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Outline

• Introduction and motivation• Our Approach• Simplification and extraction operations• Evaluation• Conclusions

36

Page 37: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Conclusions

• Method for extracting simplified declarative statements from complex sentences.

• Outperformed a text compression baseline.– More outputs and better coverage– Higher % of fluent and correct outputs

• Future work: evaluation of this as a component in a QG system.

37

Heilman & Smith 2010Heilman & Smith 2010

Page 38: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

Questions?

38

Demo & code release available on my website.http://www.cs.cmu.edu/~mheilman

Demo & code release available on my website.http://www.cs.cmu.edu/~mheilman

Page 39: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

A Whale of a Sentence

“As they narrated to each other their unholy adventures, their tales of terror told in words of mirth; as their uncivilized laughter forked upwards out of them, like the flames from the furnace; as to and fro, in their front, the harpooneers wildly gesticulated with their huge pronged forks and dippers; as the wind howled on, and the sea leaped, and the ship groaned and dived, and yet steadfastly shot her red hell further and further into the blackness of the sea and the night, and scornfully champed the white bone in her mouth, and viciously spat round her on all sides; then the rushing Pequod, freighted with savages, and laden with fire, and burning a corpse, and plunging into that blackness of darkness, seemed the material counterpart of her monomaniac commander's soul.”

39

Melville 1851Melville 1851

Gold standard parse:Gold standard parse:

133 word sentence from Moby Dick:133 word sentence from Moby Dick:

Page 40: Extracting Simplified Statements for Factual Question Generation Michael Heilman and Noah A. Smith 1

lti

A Whale of a Sentence

40

1. The rushing Pequod seemed the material counterpart of her monomaniac commander's soul.2. They narrated to each other their unholy adventures.3. Their uncivilized laughter forked upwards out of them.4. The harpooneers wildly gesticulated with their huge pronged forks and dippers in their front.5. The wind howled on.6. The sea leaped.7. The ship groaned.8. The ship dived.9. The ship steadfastly shot her red hell further and further into the blackness of the sea and the night.10. The ship scornfully champed the white bone in her mouth.11. The ship viciously spat round her on all sides.12. The rushing Pequod was freighted with savages.13. The rushing Pequod was laden with fire.14. The rushing Pequod was burning a corpse.15. The rushing Pequod was plunging into that blackness of darkness.16. Their unholy adventures were their tales of terror told in words of mirth.

System output:System output: