Page 1: When logical inference helps in determining textual entailment (and when it doesn’t)

Johan Bos & Katja Markert

Linguistic Computing Laboratory, Dipartimento di Informatica

Università di Roma “La Sapienza”

Natural Language Processing Group, Computer Science Department
University of Leeds

© Johan Bos, April 2006

Page 2: Aristotle’s Syllogisms

All men are mortal.

Socrates is a man.

-------------------------------

Socrates is mortal.

ARISTOTLE 1 (TRUE)
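
For reference, the syllogism in the first-order notation used later in the talk:

  ∀x (man(x) → mortal(x)) ∧ man(socrates)  ⊨  mortal(socrates)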

Page 3: Talk Outline

• Hybrid system combining:
  – Shallow semantic approach
  – Deep semantic approach
• Machine Learning
  – Features of both approaches are combined in one classifier

Page 4: Shallow Semantic Analysis

• Primarily based on word overlap (a rough sketch follows below)
• Using weighted lemmas
• Weights correspond to inverse document frequency
  – Web as corpus
  – WordNet for synonyms
• Additional features
  – Number of words in T and H
  – Type of dataset
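
A minimal sketch of the core overlap score, assuming lemmatised token lists for T and H and a table of inverse document frequencies; the function and variable names are illustrative, not the actual system’s:

  import math

  def idf_weighted_overlap(t_lemmas, h_lemmas, idf, synonyms=None):
      """Share of the hypothesis' IDF mass that is covered by the text."""
      synonyms = synonyms or {}
      t_set = set(t_lemmas)
      covered = total = 0.0
      for lemma in set(h_lemmas):
          weight = idf.get(lemma, math.log(1e6))   # unseen words count as rare
          total += weight
          # direct match, or match via a (WordNet-style) synonym set
          if lemma in t_set or set(synonyms.get(lemma, ())) & t_set:
              covered += weight
      return covered / total if total else 0.0

The resulting score, together with the lengths of T and H and the dataset/task type, is then handed to the classifier as the shallow feature set.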

Page 5: Deep Semantic Analysis

• Compositional Semantics
  – How to build semantic representations for the text and hypothesis
  – Do this in a systematic way
• Logical Inference
  – FOL theorem proving
  – FOL model building

Page 6: Compositional Semantics

• The Problem: Given a natural language expression, how do we convert it into a logical formula?
• Frege’s principle: The meaning of a compound expression is a function of the meaning of its parts.

Page 7: Compositional Semantics

• We need a theory of syntax, to determine the parts of a natural language expression

• We will use CCG

• We need a theory of semantics, to determine the meaning of the parts

• We will use DRT

• We need a technique to combine the parts
• We will use the Lambda-calculus

Page 8: Combinatory Categorial Grammar

• CCG is a lexicalised theory of grammar (Steedman 2001)
• Deals with complex cases of coordination and long-distance dependencies
• Lexicalised, hence easy to implement
  – English wide-coverage grammar
  – Fast robust parser available

Page 9: Discourse Representation Theory

• Well-understood semantic formalism
  – Scope, anaphora, presupposition, tense, etc.
  – Kamp ’81, Kamp & Reyle ’93, Van der Sandt ’92
• Semantic representations (DRSs) can be built using traditional tools
  – Lambda calculus
  – Underspecification
• Model-theoretic interpretation
  – Inference possible
  – Translation to first-order logic

Page 10: CCG/DRT example

Lexical categories and lambda-DRS entries for “a spokesman lied”
(DRSs written linearly as [referents | conditions], “;” for DRS merge):

  NP/N : a          λp.λq.([x | ] ; p(x) ; q(x))
  N    : spokesman  λz.[ | spokesman(z)]
  S\NP : lied       λX.X(λy.[e | lie(e), agent(e,y)])

Page 11: CCG/DRT example

Forward application (FA) of the determiner to the noun:

  NP : a spokesman
  λp.λq.([x | ] ; p(x) ; q(x)) (λz.[ | spokesman(z)])

Page 12: CCG/DRT example

After β-reduction:

  NP : a spokesman
  λq.([x | ] ; [ | spokesman(x)] ; q(x))

Page 13: CCG/DRT example

After merging the two DRSs:

  NP : a spokesman
  λq.([x | spokesman(x)] ; q(x))

Page 14: CCG/DRT example

Backward application (BA) of the verb to the subject NP:

  S : a spokesman lied
  λX.X(λy.[e | lie(e), agent(e,y)]) (λq.([x | spokesman(x)] ; q(x)))

Page 15: CCG/DRT example

After β-reduction:

  S : a spokesman lied
  [x | spokesman(x)] ; [e | lie(e), agent(e,x)]

Page 16: CCG/DRT example

After the final merge, the DRS for the whole sentence:

  S : a spokesman lied
  [x, e | spokesman(x), lie(e), agent(e,x)]
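
The derivation can be replayed with a toy encoding, sketched below: DRSs as (referents, conditions) pairs, lexical entries as ordinary lambdas, and merge as concatenation. This only illustrates the compositional idea, not the actual Boxer/C&C machinery, and variable handling is simplified (the determiner just picks the name “x”):

  def drs(refs, conds):
      return (list(refs), list(conds))

  def merge(d1, d2):                       # the ";" operator on the slides
      return (d1[0] + d2[0], d1[1] + d2[1])

  # Lexical entries, mirroring the lambda-DRSs above
  a         = lambda p: lambda q: merge(merge(drs(["x"], []), p("x")), q("x"))
  spokesman = lambda z: drs([], [f"spokesman({z})"])
  lied      = lambda subj: subj(lambda y: drs(["e"], ["lie(e)", f"agent(e,{y})"]))

  np = a(spokesman)   # forward application
  s  = lied(np)       # backward application
  print(s)            # (['x', 'e'], ['spokesman(x)', 'lie(e)', 'agent(e,x)'])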

Page 17: The Clark & Curran Parser

• Use standard statistical techniques
  – Robust wide-coverage parser
  – Clark & Curran (ACL 2004)
• Grammar derived from CCGbank
  – 409 different categories
  – Hockenmaier & Steedman (ACL 2002)
• Results: 96% coverage on the WSJ
  – Bos et al. (COLING 2004)

Page 18: Logical Inference

• How do we perform inference with DRSs?
  – Translate DRS into first-order logic
  – Use off-the-shelf inference engines
• What kind of inference engines?
  – Theorem Prover: Vampire (Riazanov & Voronkov 2002)
  – Model Builder: Paradox
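
For example, under the standard DRS-to-FOL translation (discourse referents become existentially quantified variables, conditions are conjoined), the DRS built for “a spokesman lied” on the previous slides becomes:

  [x, e | spokesman(x), lie(e), agent(e,x)]   ↦   ∃x ∃e (spokesman(x) ∧ lie(e) ∧ agent(e,x))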

Page 19: Using Theorem Proving

• Given a textual entailment pair T/H:
  – Produce DRSs for T and H
  – Translate these DRSs into FOL
  – Give to the theorem prover: T' ⇒ H'

• If a proof is found, then T entails H

• Good results for examples with:

– apposition, relative clauses, coordination

– intersective adjectives, noun noun compounds

– passive/active alternations

Page 20: Example (Vampire: proof)

On Friday evening, a car bomb exploded outside a Shiite mosque in Iskandariyah, 30 miles south of the capital.
-----------------------------------------------------
A bomb exploded outside a mosque.

RTE-2 112 (TRUE)
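
Hand-simplified first-order forms for this pair (the exact predicates, and the nn relation assumed here for the compound, are illustrative rather than Boxer’s literal output; modifiers such as the time and the distance phrase are omitted):

  T':  ∃x ∃y ∃m ∃e (bomb(x) ∧ car(y) ∧ nn(y,x) ∧ mosque(m) ∧ shiite(m) ∧ explode(e) ∧ agent(e,x) ∧ outside(e,m))
  H':  ∃x ∃m ∃e (bomb(x) ∧ mosque(m) ∧ explode(e) ∧ agent(e,x) ∧ outside(e,m))

Every conjunct of H' is matched by a conjunct of T' under the same witnesses, so the prover finds a proof of T' ⇒ H'.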

Page 21: Example (Vampire: proof)

Initially, the Bundesbank opposed the introduction of the euro but was compelled to accept it in light of the political pressure of the capitalist politicians who supported its introduction.
-----------------------------------------------------
The introduction of the euro has been opposed.

RTE-2 489 (TRUE)

Page 22: Background Knowledge

• Many examples in the RTE dataset require additional knowledge
  – Lexical knowledge
  – Linguistic knowledge
  – World knowledge
• Generate background knowledge for T and H in first-order logic
• Give this to the theorem prover: (BK & T') ⇒ H'

Page 23: Lexical Knowledge

• We use WordNet as a start to get additional knowledge
• All of WordNet is too much, so we create MiniWordNets (a rough sketch follows below)
  – Based on hyponym relations
  – Remove redundant information
  – Conversion into first-order logic
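
A rough sketch of the idea using NLTK’s WordNet interface; the output format and the choice of the first sense are simplifications, and the real MiniWordNets are restricted to the synsets actually relevant to T and H:

  from nltk.corpus import wordnet as wn

  def hyponym_axioms(word):
      """One ISA axiom per step of the hypernym chain of the first sense."""
      synset = wn.synsets(word)[0]            # crude: take the first sense
      path = synset.hypernym_paths()[0]       # from the root down to the synset
      names = [s.lemmas()[0].name() for s in path]
      return [f"all X ({low}(X) -> {high}(X))"
              for high, low in zip(names, names[1:])]

  for axiom in hyponym_axioms("mosque"):
      print(axiom)
  # prints the chain of ISA axioms from the top of WordNet down to mosque(X)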

Page 24: Linguistic Knowledge

• Manually coded rules
  – Possessives
  – Active/passive alternation
  – Noun-noun compound interpretation

Page 25: Linguistic & World Knowledge

• Manually coded 115 rules
  – Spatial knowledge
  – Causes of death
  – Winning prizes or awards
  – Family relations
  – Diseases
  – Producers
  – Employment
  – Ownership

Page 26: Knowledge at work

• Background Knowledge: ∀x (soar(x) → rise(x))

Crude oil prices soared to record levels.

-----------------------------------------------------

Crude oil prices rise.

RTE 1952 (TRUE)

Page 27: Troubles with theorem proving

• Theorem provers are extremely precise

• They won’t tell you when there is “almost” a proof

• Even if there is a little background knowledge missing, Vampire will say:

NO

Page 28: Vampire: no proof

RTE 1049 (TRUE)

Four Venezuelan firefighters who were traveling to a training course in Texas were killed when their sport utility vehicle drifted onto the shoulder of a highway and struck a parked truck.
----------------------------------------------------------------
Four firefighters were killed in a car accident.

Page 29: Using Model Building

• Need a robust way of inference
• Use model builders
  – Paradox (Claessen & Sorensson 2003)
  – Mace (McCune)
• Produce minimal model by iteration of domain size
• Use size of models to determine entailment
  – Compare size of model of T and T&H
  – If the difference is small, then it is likely that T entails H

Page 30: Using Model Building

• Given a textual entailment pair T/H with text T and hypothesis H:
  – Produce DRSs for T and H
  – Translate these DRSs into FOL
  – Generate background knowledge
  – Give this to the model builder:
      i) BK & T'
      ii) BK & T' & H'
• If the models for i) and ii) are similar in size, then T entails H (a rough sketch follows below)
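
A schematic version of the size comparison; in the actual system these numbers feed the classifier rather than a hard yes/no rule, and the threshold here is purely illustrative:

  def model_size_features(size_t, size_th):
      """Domain sizes of the minimal models for BK & T' and BK & T' & H'."""
      absolute = size_th - size_t
      relative = absolute / size_th if size_th else 0.0
      return absolute, relative

  def likely_entailment(size_t, size_th, threshold=0.1):
      # If adding H' barely grows the minimal model, H adds little new
      # information beyond T, which is evidence for entailment.
      _, relative = model_size_features(size_t, size_th)
      return relative < threshold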

Page 31: Features for Classifier

• Features from deep analysis:
  – proof (yes/no)
  – inconsistent (yes/no)
  – domain size, model size
  – domain size difference, absolute and relative
  – model size difference, absolute and relative
• Combine this with features from shallow approach
• Machine learning tool: WEKA
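
For illustration only, one combined training instance might look as follows before being exported (e.g. to an ARFF file) for WEKA; the attribute names and values are invented for this sketch:

  instance = {
      # deep features
      "proof": "yes",
      "inconsistent": "no",
      "domain_size": 12,
      "model_size": 19,
      "domain_size_diff_abs": 1,
      "domain_size_diff_rel": 0.08,
      "model_size_diff_abs": 2,
      "model_size_diff_rel": 0.10,
      # shallow features
      "weighted_overlap": 0.81,
      "words_in_t": 31,
      "words_in_h": 7,
      "task": "IE",
      # gold label
      "entailment": True,
  }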

Page 32: RTE-2 Results

Accuracy by task:

Task   Shallow   Deep
IE     0.51      0.55
IR     0.66      0.64
QA     0.57      0.53
SUM    0.74      0.71
All    0.62      0.61

Page 33: Conclusions

• Why relatively low results?
  – Recall for the proof feature is low
  – Most proofs are also found by word overlap
  – Same for small domain size differences
• Not only bad news
  – Deep analysis more consistent across different datasets

Page 34: Future Stuff

• Error analysis!
  – Difficult: dataset not focussed
  – Many different sources of errors
  – Prepare more focussed datasets for system development?
• Use better techniques for using numeric features
• Improve linguistic analysis
• More background knowledge!