Unsupervised Acquisition of Axioms to Paraphrase Noun Compounds and Genitives
CICLing 2012, New Delhi
Anselmo Peñas, NLP & IR Group, UNED, Spain
Ekaterina Ovchinnikova, USC Information Sciences Institute, USA
UNED
nlp.uned.es
Texts omit information
Humans optimize language generation effort: we omit information that we know the receiver is able to predict and recover
Our research goal is to make the omitted information in texts explicit
Implicit predicates
In particular, some noun compounds and genitives are used in this way
In these cases, we want to recover the implicit predicates. For example:
• morning coffee -> coffee drunk in the morning
• malaria mosquito -> mosquito that carries malaria
How to find the candidates?
Nakov & Hearst 2006: search the web
• N1 N2 -> N2 THAT * N1
• malaria mosquito -> mosquito THAT * malaria
Here we use Proposition Stores instead:
• Harvest a text collection that will serve as context
• Parse the documents
• Count N-V-N, N-V-P-N, N-P-N, … structures
• Build Proposition Stores (Peñas & Hovy, 2010)
Proposition Stores
Example: propositions that relate bomb, attack
• npn:[bomb:n, in:in, attack:n]:13.
• nvpn:[bomb:n, explode:v, in:in, attack:n]:11.
• nvnpn:[bomb:n, kill:v, people:n, in:in, attack:n]:8.
• npn:[attack:n, with:in, bomb:n]:8.
• …
All of them could be paraphrases of the noun compound “bomb attack”
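A proposition store of this kind can be thought of as counts over typed argument tuples harvested from parses. A minimal sketch (not the authors' implementation; the hand-written tuples below stand in for real dependency parses):

```python
from collections import Counter

# Toy proposition store: map (structure type, argument tuple) -> count.
# Real stores are built from millions of parsed sentences (Peñas & Hovy, 2010);
# here the "parses" are hand-written tuples for illustration.
parsed_structures = [
    ("npn",  ("bomb", "in", "attack")),
    ("nvpn", ("bomb", "explode", "in", "attack")),
    ("npn",  ("bomb", "in", "attack")),
    ("npn",  ("attack", "with", "bomb")),
]

store = Counter(parsed_structures)

def propositions_relating(store, n1, n2):
    """Return all stored propositions whose arguments mention both nouns."""
    return {prop: c for prop, c in store.items()
            if n1 in prop[1] and n2 in prop[1]}

print(propositions_relating(store, "bomb", "attack"))
```

Retrieving every proposition that mentions both nouns of a compound yields the candidate paraphrases, ranked by their counts.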
NE Semantic Classes
Now, what happens if we have a Named Entity?
Shakespeare’s tragedy -> write
Why? Consider:
• John’s tragedy
• Airbus’ tragedy
NE Semantic Classes
We are considering the “semantic classes” of the NE
Shakespeare -> writer
writer, tragedy -> write
Class-Instance relations
Fortunately, relevant semantic classes are pointed out in texts through well-known structures
• appositions, copulative verbs, “such as”, …
Here we take advantage of dependency parsing to get class-instance relations, matching NNP–NN pairs linked by:
• nn (noun compound modifier)
• appos (apposition)
• be (copula)
Class-Instance relations
World News
has_instance(leader,'Yasir':'Arafat'):1491.
has_instance(spokesman,'Marlin':'Fitzwater'):1001.
has_instance(leader,'Mikhail':'S.':'Gorbachev'):980.
has_instance(chairman,'Yasir':'Arafat'):756.
has_instance(agency,'Tass'):637.
has_instance(leader,'Radovan':'Karadzic'):611.
has_instance(adviser,'Condoleezza':'Rice'):590.
…
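The extraction over the dependency patterns above can be sketched as follows. This is a toy illustration, not the authors' pipeline: the edge tuples are hand-written, and the convention that the NNP side is always the instance is a simplifying assumption (real parsers attach nn and appos in parser-specific directions):

```python
from collections import Counter

# Toy dependency edges: (head, relation, dependent, head POS tag, dep POS tag).
edges = [
    ("Favre", "nn", "quarterback", "NNP", "NN"),        # "quarterback Favre"
    ("leader", "appos", "Arafat", "NN", "NNP"),         # "the leader, Arafat"
    ("Shakespeare", "be", "writer", "NNP", "NN"),       # "Shakespeare is a writer"
]

has_instance = Counter()
for head, rel, dep, htag, dtag in edges:
    if rel in {"nn", "appos", "be"}:
        # Simplifying convention: the proper noun (NNP) is the instance,
        # the common noun (NN) is the class, in either attachment direction.
        if htag == "NNP" and dtag == "NN":
            has_instance[(dep, head)] += 1
        elif htag == "NN" and dtag == "NNP":
            has_instance[(head, dep)] += 1

print(has_instance)
```

Aggregating these counts over a large corpus yields the has_instance table shown above.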
So far
Propositions: <p, a> with P(p, a)
p: predicate
a: list of arguments <a1 … an>
P(p, a): joint probability
Class-instance relations: <c, i> with P(c, i)
c: class
i: instance
P(c, i): joint probability
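Both joint distributions follow directly from the stored counts by maximum-likelihood estimation. A small sketch with hypothetical counts (the numbers echo the earlier example slides but are illustrative only):

```python
from collections import Counter

# Hypothetical counts in the spirit of the two stores.
prop_counts = Counter({("explode_in", ("bomb", "attack")): 11,
                       ("kill_in", ("bomb", "people", "attack")): 8})
ci_counts = Counter({("leader", "Arafat"): 1491,
                     ("chairman", "Arafat"): 756})

def joint(counts):
    """Maximum-likelihood joint probabilities from raw counts."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

P_pa = joint(prop_counts)   # P(p, a)
P_ci = joint(ci_counts)     # P(c, i)
```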
Probability of a predicate
Let’s consider the following example: Favre pass
Assume the text has pointed out he is a quarterback
What is Favre doing with the pass? The same as other quarterbacks
• The quarterbacks we observed before in the background collection – the Proposition Store
Probability of a predicate
Favre pass -> p | P(p|i)
Favre -> quarterback | P(c|i)
quarterback, pass -> throw | P(p|c)
We already have P(c|i), from the class-instance store, normalized over the n classes observed for instance i:
P(c_k|i) = P(c_k, i) / Σ_{j=1..n} P(c_j, i)
We need to estimate P(p|c) (what other quarterbacks do with passes), so that:
P(p|i) = Σ_c P(c|i) · P(p|c)
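The marginalization over classes can be spelled out on the Favre example. All probabilities below are made-up illustrative numbers, not values from the paper's proposition store:

```python
# Hypothetical distributions for the "Favre pass" example.
P_c_given_i = {"quarterback": 0.8, "player": 0.2}   # P(c | Favre)
P_p_given_c = {                                     # P(p | c) for "c pass"
    "quarterback": {"throw": 0.7, "complete": 0.3},
    "player":      {"throw": 0.4, "complete": 0.6},
}

def predicate_dist_for_instance(P_c_given_i, P_p_given_c):
    """P(p|i) = sum over classes c of P(c|i) * P(p|c)."""
    dist = {}
    for c, pc in P_c_given_i.items():
        for p, pp in P_p_given_c[c].items():
            dist[p] = dist.get(p, 0.0) + pc * pp
    return dist

print(predicate_dist_for_instance(P_c_given_i, P_p_given_c))
# throw: 0.8*0.7 + 0.2*0.4 = 0.64;  complete: 0.8*0.3 + 0.2*0.6 = 0.36
```

With these numbers, "throw" wins, matching the intuition that Favre behaves like the quarterbacks observed in the background collection.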
Probability of a predicate
quarterback pass -> p | P(p|c)
• Steve:Young pass -> throw | P(p|i)
• Culpepper pass -> complete | P(p|i)
• …
We already have P(i|c), normalized over the n instances observed for class c:
P(i_k|c) = P(c, i_k) / Σ_{j=1..n} P(c, i_j)
and P(p|i) comes from previous observation: the Proposition Store
P(p|c) = Σ_i P(i|c) · P(p|i)
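This is the dual estimation: predicates for a class are aggregated from its observed instances. Again, the probabilities are illustrative assumptions only:

```python
# Hypothetical distributions for estimating P(p | quarterback) from instances.
P_i_given_c = {"Steve:Young": 0.5, "Culpepper": 0.5}   # P(i | quarterback)
P_p_given_i = {                                        # P(p | i) for "i pass"
    "Steve:Young": {"throw": 0.9, "complete": 0.1},
    "Culpepper":   {"throw": 0.3, "complete": 0.7},
}

def predicate_dist_for_class(P_i_given_c, P_p_given_i):
    """P(p|c) = sum over instances i of P(i|c) * P(p|i)."""
    dist = {}
    for i, pi in P_i_given_c.items():
        for p, pp in P_p_given_i[i].items():
            dist[p] = dist.get(p, 0.0) + pi * pp
    return dist

print(predicate_dist_for_class(P_i_given_c, P_p_given_i))
# throw: 0.5*0.9 + 0.5*0.3 = 0.6;  complete: 0.5*0.1 + 0.5*0.7 = 0.4
```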
Evaluation
We want to address the following questions:
• Do we find the paraphrases required to enable Textual Entailment?
• Do all the noun-noun dependencies need to be paraphrased?
• How frequently do NEs appear in them?
Experimental setting
Proposition Store built from 216,303 World News documents
• 7,800,000 sentences parsed
RTE-2 (Recognizing Textual Entailment)
• 83 entailment decisions depend on noun-noun paraphrases
• 77 different noun-noun paraphrases
Results
How frequently do NEs appear in these pairs?
• 82% of paraphrases contain at least one NE
• 62% are paraphrasing NE-N (e.g. Vikings quarterback)
Results
Do all the noun-noun dependencies need to be paraphrased? No, only 54% in our test set
Some compounds encode semantic relations such as:
• 12% are locative relations (e.g. New York club)
• Temporal relations (e.g. April 23rd strike, Friday semi-final)
• Class-instance relations (e.g. quarterback Favre)
• Measure, …
Some are trivial: 27% are paraphrased with “of”
Results
Do we find the paraphrases required to enable Textual Entailment? Yes, in 63% of non-trivial cases
Proposition type | Paraphrase
NPN   | Jackson trial ↔ trial against Jackson; engine problem ↔ problem with engine
NVN   | U.S. Ambassador ↔ Ambassador represents the U.S.; ETA bombing ↔ ETA carried_out bombing
NVNPN | wife of Joseph Wilson ↔ wife is married to Joseph Wilson
NVPN  | Vietnam veteran ↔ veteran comes from Vietnam; Shapiro’s office ↔ Shapiro works in office; Germany’s people ↔ people live in Germany; Abu Musab al-Zarqawi’s group ↔ group led by Abu Musab al-Zarqawi
Results
RTE-2 pair 485: paraphrase not found
United Nations vehicle ↔ United Nations produces vehicles
United Nations doesn’t share any class with the instances that “produce vehicles”
Toyota vehicle -> develop, build, sell, produce, make, export, recall, assemble, …
Conclusions
A significant proportion of noun-noun dependencies include Named Entities
Some noun-noun dependencies don’t require the retrieval of implicit predicates
The method proposed is sensitive to different NEs: different NEs retrieve different predicates
Current work: select the most relevant paraphrase according to the text
• We are exploring weighted abduction