Unsupervised Acquisition of Axioms to Paraphrase Noun Compounds and Genitives CICLING 2012, New Delhi Anselmo Peñas NLP & IR Group, UNED, Spain Ekaterina Ovchinnikova USC – Information Science Institute, USA



UNED

nlp.uned.es

Texts omit information

Humans optimize language generation effort

We omit information that we know the reader is able to predict and recover

Our research goal is to make explicit the omitted information in texts


Implicit predicates

In particular, some noun compounds and genitives are used in such a way

In these cases, we want to recover the implicit predicates. For example:

• Morning coffee -> coffee drunk in the morning
• Malaria mosquito -> mosquito that carries malaria


How to find the candidates?

Nakov & Hearst 2006: search the web
• N1 N2 -> N2 THAT * N1
• Malaria mosquito -> mosquito THAT * malaria

Here we use Proposition Stores
• Harvest a text collection that will serve as context
• Parse documents
• Count N-V-N, N-V-P-N, N-P-N, … structures
• Build Proposition Stores (Peñas & Hovy, 2010)


Proposition Stores

Example: propositions that relate bomb, attack

• npn:[bomb:n, in:in, attack:n]:13.
• nvpn:[bomb:n, explode:v, in:in, attack:n]:11.
• nvnpn:[bomb:n, kill:v, people:n, in:in, attack:n]:8.
• npn:[attack:n, with:in, bomb:n]:8.
• …

All of them could be paraphrases for the noun compound “bomb attack”
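The lookup behind this example can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the in-memory `STORE`, the `candidates` and `surface` helpers, and the tuple layout are all assumptions made for the sketch; only the four entries and their counts come from the slide.

```python
from collections import Counter

# Hypothetical in-memory proposition store: (pattern, tokens) -> count.
# The four entries mirror the bomb/attack example on the slide.
STORE = Counter({
    ("npn", ("bomb:n", "in:in", "attack:n")): 13,
    ("nvpn", ("bomb:n", "explode:v", "in:in", "attack:n")): 11,
    ("nvnpn", ("bomb:n", "kill:v", "people:n", "in:in", "attack:n")): 8,
    ("npn", ("attack:n", "with:in", "bomb:n")): 8,
})


def candidates(store, n1, n2):
    """Return (count, pattern, tokens) for propositions mentioning both nouns."""
    wanted = {f"{n1}:n", f"{n2}:n"}
    hits = [(count, pattern, tokens)
            for (pattern, tokens), count in store.items()
            if wanted <= set(tokens)]
    # Most frequent first; ties are broken by pattern and tokens for determinism.
    return sorted(hits, key=lambda h: (-h[0], h[1], h[2]))


def surface(tokens):
    """Drop the part-of-speech suffix to get a readable paraphrase."""
    return " ".join(t.rsplit(":", 1)[0] for t in tokens)


for count, pattern, tokens in candidates(STORE, "bomb", "attack"):
    print(count, pattern, surface(tokens))
```

Ranking by raw count is the simplest choice; the later slides replace it with probabilities conditioned on the semantic class of a named entity.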


NE Semantic Classes

Now, what happens if we have a Named Entity?

Shakespeare’s tragedy -> write

Why? Consider

• John’s tragedy
• Airbus’ tragedy


NE Semantic Classes

We are considering the “semantic classes” of the NE

Shakespeare -> writer
writer, tragedy -> write


Class-Instance relations

Fortunately, relevant semantic classes are pointed out in texts through well-known structures

• appositions, copulative verbs, “such as”, …

Here we take advantage of dependency parsing to get class-instance relations

[Figure: three dependency patterns linking a common noun (NN) and a proper noun (NNP): nn, appos, be]


Class-Instance relations

World News

has_instance(leader,'Yasir':'Arafat'):1491.
has_instance(spokesman,'Marlin':'Fitzwater'):1001.
has_instance(leader,'Mikhail':'S.':'Gorbachev'):980.
has_instance(chairman,'Yasir':'Arafat'):756.
has_instance(agency,'Tass'):637.
has_instance(leader,'Radovan':'Karadzic'):611.
has_instance(adviser,'Condoleezza':'Rice'):590.
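To work with facts like these programmatically, they first have to be read back into (class, instance, count) triples. The sketch below assumes the line format is exactly the one visible in the listing, `has_instance(<class>,'Tok1':'Tok2'…):<count>.`; the `FACT` regular expression and the `parse_facts` and `classes_of` helpers are hypothetical names introduced for illustration.

```python
import re

# Assumed fact format, inferred from the listing on the slide:
#   has_instance(<class>,'Tok1':'Tok2'...):<count>.
FACT = re.compile(r"has_instance\((\w+),((?:'[^']*':?)+)\):(\d+)\.")


def parse_facts(text):
    """Yield (class, instance, count) triples from has_instance facts."""
    for cls, inst, count in FACT.findall(text):
        # Multi-token names are stored as 'A':'B'; join them with spaces.
        tokens = re.findall(r"'([^']*)'", inst)
        yield cls, " ".join(tokens), int(count)


def classes_of(facts, instance):
    """Map each class observed for `instance` to its count."""
    return {c: n for c, i, n in facts if i == instance}


sample = (
    "has_instance(leader,'Yasir':'Arafat'):1491."
    "has_instance(chairman,'Yasir':'Arafat'):756."
    "has_instance(agency,'Tass'):637."
)
facts = list(parse_facts(sample))
print(classes_of(facts, "Yasir Arafat"))
```

The same instance can belong to several classes (here, leader and chairman), which is why the later probability estimates sum over classes.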


So far

Propositions: <p,a> | P(p,a)
• p: predicate
• a: list of arguments <a1 … an>
• P(p,a): joint probability

Class-instance relations: <c,i> | P(c,i)
• c: class
• i: instance
• P(c,i): joint probability
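The slides do not say how these probabilities are estimated from the stored counts. One common choice, assumed here purely for illustration, is relative frequency, from which the conditional P(c|i) follows by normalizing over the classes observed for i:

```latex
P(c, i) \approx \frac{\mathrm{count}(c, i)}{\sum_{c', i'} \mathrm{count}(c', i')},
\qquad
P(c \mid i) = \frac{P(c, i)}{\sum_{c'} P(c', i)}
            \approx \frac{\mathrm{count}(c, i)}{\sum_{c'} \mathrm{count}(c', i)}

% Worked example, assuming leader (1491) and chairman (756) were the
% only classes observed for the instance Yasir Arafat:
P(\text{leader} \mid \text{Yasir Arafat}) \approx \frac{1491}{1491 + 756} \approx 0.664
```

In the full store an instance may have more classes than the two in the World News listing, so the numeric value is only indicative.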


Probability of a predicate

Let’s consider the following example: Favre pass

Assume the text has pointed out he is a quarterback

What is Favre doing with the pass?

The same as other quarterbacks
• The quarterbacks we observed before in the background collection (Proposition Store)


Probability of a predicate

Favre pass -> p | P(p|i)
Favre -> quarterback | P(c|i)
quarterback, pass -> throw | P(p|c)

$$P(p \mid i) = \sum_{c \ni i} P(c \mid i)\, P(p \mid c)$$

We already have: P(c|i), from the class-instance relations

We need to estimate: P(p|c) (what other quarterbacks do with passes)
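The estimate on this slide, P(p|i) as a sum over the classes c of i of P(c|i)·P(p|c), can be sketched as follows. This is an illustration under assumptions, not the authors' code: the function name `predicate_given_instance` is hypothetical, all numbers are invented toy values, and the second noun of the compound ("pass") is held fixed, so P(p|c) here stands for the distribution of predicates linking class c to "pass".

```python
def predicate_given_instance(p_class_given_inst, p_pred_given_class):
    """Compute P(p|i) = sum over classes c of i of P(c|i) * P(p|c)."""
    scores = {}
    for cls, p_c in p_class_given_inst.items():
        # Classes with no observed predicates contribute nothing.
        for pred, p_p in p_pred_given_class.get(cls, {}).items():
            scores[pred] = scores.get(pred, 0.0) + p_c * p_p
    return scores


# Toy values, invented for illustration only.
p_c_given_favre = {"quarterback": 0.8, "player": 0.2}
p_p_given_c = {
    "quarterback": {"throw": 0.7, "complete": 0.3},
    "player": {"throw": 0.1, "catch": 0.9},
}

scores = predicate_given_instance(p_c_given_favre, p_p_given_c)
best = max(scores, key=scores.get)
print(best, scores)
```

Because every class contributes in proportion to P(c|i), an instance whose classes never occur with the target noun gets an empty result, which matches the "paraphrase not found" case discussed in the results.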


Probability of a predicate

quarterback pass -> p | P(p|c)
• Steve:Young pass -> throw | P(p|i)
• Culpepper pass -> complete | P(p|i)
• …

$$P(p \mid c) = \sum_{i \in c} P(i \mid c)\, P(p \mid i)$$

We already have P(i|c), from the class-instance relations, and P(p|i) comes from previous observation: Proposition Store
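The class-level estimate, P(p|c) as a sum over the instances i of c of P(i|c)·P(p|i), can be sketched the same way. The counts below are invented, the helper names `normalize` and `predicate_given_class` are hypothetical, and turning counts into probabilities by relative frequency is an assumption; the slides only say that both inputs come from the background collection.

```python
from collections import defaultdict


def normalize(counts):
    """Turn a {key: count} mapping into relative frequencies."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def predicate_given_class(p_inst_given_class, p_pred_given_inst):
    """Compute P(p|c) = sum over instances i of c of P(i|c) * P(p|i)."""
    scores = defaultdict(float)
    for inst, p_i in p_inst_given_class.items():
        for pred, p_p in p_pred_given_inst.get(inst, {}).items():
            scores[pred] += p_i * p_p
    return dict(scores)


# Invented counts: how often each instance was seen as a quarterback ...
inst_counts = {"Steve Young": 30, "Culpepper": 10}
# ... and which predicates linked each instance to "pass".
pred_counts = {
    "Steve Young": {"throw": 8, "complete": 2},
    "Culpepper": {"throw": 1, "complete": 3},
}

p_i_given_c = normalize(inst_counts)
p_p_given_i = {i: normalize(c) for i, c in pred_counts.items()}
p_p_given_quarterback = predicate_given_class(p_i_given_c, p_p_given_i)
print(p_p_given_quarterback)
```

Weighting by P(i|c) lets frequently observed quarterbacks dominate the class profile, so a single unusual instance does not skew what "quarterbacks do with passes".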


Evaluation

We want to address the following questions:

• Do we find the paraphrases required to enable Textual Entailment?
• Do all the noun-noun dependencies need to be paraphrased?
• How frequently do NEs appear in them?


Experimental setting

Proposition Store from
• 216,303 World News
• 7,800,000 sentences parsed

RTE-2 (Recognizing Textual Entailment)
• 83 entailment decisions depend on noun-noun paraphrases
• 77 different noun-noun paraphrases


Results

How frequently do NEs appear in these pairs?
• 82% of paraphrases contain at least one NE
• 62% are paraphrasing NE-N (e.g. Vikings quarterback)


Results

Do all the noun-noun dependencies need to be paraphrased?

No, only 54% in our test set

Some compounds encode semantic relations such as:
• 12% are locative relations (e.g. New York club)
• Temporal relations (e.g. April 23rd strike, Friday semi-final)
• Class-instance relations (e.g. quarterback Favre)
• Measure, …

Some are trivial: 27% are paraphrased with “of”


Results

Do we find the paraphrases required to enable Textual Entailment?

Yes, in 63% of non-trivial cases

Proposition type   Paraphrase
NPN                Jackson trial ↔ trial against Jackson
                   engine problem ↔ problem with engine
NVN                U.S. Ambassador ↔ Ambassador represents the U.S.
                   ETA bombing ↔ ETA carried_out bombing
NVNPN              wife of Joseph Wilson ↔ wife is married to Joseph Wilson
NVPN               Vietnam veteran ↔ veteran comes from Vietnam
                   Shapiro’s office ↔ Shapiro work in office
                   Germany's people ↔ people live in Germany
                   Abu Musab al-Zarqawi's group ↔ group led by Abu Musab al-Zarqawi


Results

RTE-2 pair 485: Paraphrase not found

United Nations vehicle ↔ United Nations produces vehicles

United Nations doesn’t share any class with the instances that “produce vehicles”

Toyota vehicle -> develop, build, sell, produce, make, export, recall, assemble, …


Conclusions

A significant proportion of noun-noun dependencies include Named Entities

Some noun-noun dependencies don’t require the retrieval of implicit predicates

The proposed method is sensitive to different NEs
• Different NEs retrieve different predicates

Current work: selecting the most relevant paraphrase according to the text
• We are exploring weighted abduction


Thanks!