A Generative Model for Parsing Natural Language to Meaning Representations
Wei Lu, Hwee Tou Ng, Wee Sun Lee
National University of Singapore
Luke S. Zettlemoyer
Massachusetts Institute of Technology
Classic Goal of NLP: Understanding Natural Language
• Mapping Natural Language (NL) to Meaning Representations (MR)
How many states do not have rivers ?
(Examples: natural language sentences paired with their meaning representations)
Meaning Representation (MR)
NL:  How many | states | do not | have | rivers | ?

QUERY:answer(NUM)
  NUM:count(STATE)
    STATE:exclude(STATE STATE)
      STATE:state(all)
      STATE:loc_1(RIVER)
        RIVER:river(all)
MR production
• Meaning representation production (MR production)
• Example:
NUM:count(STATE)
• Semantic category: NUM
• Function symbol: count
• Child semantic category: STATE
• At most 2 child semantic categories
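As a concrete illustration, the pieces of an MR production can be recovered with a small parser. This is a sketch, not the authors' code: the function name and the convention that child semantic categories are the uppercase tokens (lowercase arguments such as `all` are terminals) are assumptions.

```python
import re

def parse_mr_production(s):
    """Split an MR production such as 'NUM:count(STATE)' into its
    semantic category, function symbol, and child semantic categories."""
    m = re.match(r"(\w+):(\w+)\(([^)]*)\)$", s)
    if m is None:
        raise ValueError("not an MR production: " + s)
    category, function, args = m.groups()
    # Assumption: child categories are the uppercase tokens; lowercase
    # arguments such as 'all' are terminal symbols, not children.
    children = [a for a in args.split() if a.isupper()]
    assert len(children) <= 2  # at most 2 child semantic categories
    return category, function, children

print(parse_mr_production("NUM:count(STATE)"))
# -> ('NUM', 'count', ['STATE'])
print(parse_mr_production("STATE:exclude(STATE STATE)"))
# -> ('STATE', 'exclude', ['STATE', 'STATE'])
```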
Task Description
• Training data: NL-MR pairs
• Input: a new NL sentence
• Output: an MR
Challenge
• Mapping of individual NL words to their associated MR productions is not given in the NL-MR pairs
Mapping Words to MR Productions
NL:  How many | states | do not | have | rivers | ?

QUERY:answer(NUM)
  NUM:count(STATE)
    STATE:exclude(STATE STATE)
      STATE:state(all)
      STATE:loc_1(RIVER)
        RIVER:river(all)
Talk Outline
• Generative model
  – Goal: a flexible model that can parse a wide range of input sentences
  – Efficient algorithms for EM training and decoding
  – In practice: the correct output is often in the top-k list, but is not always the best-scoring option
• Reranking
  – Global features
• Evaluation
  – The generative model combined with the reranking technique achieves state-of-the-art performance
Hybrid Tree
NL-MR pair, shown with its hybrid sequences:

NL:  How many | states | do not | have | rivers | ?

QUERY:answer(NUM)
  NUM:count(STATE)
    STATE:exclude(STATE STATE)
      STATE:state(all)
      STATE:loc_1(RIVER)
        RIVER:river(all)
Model Parameters
NL:  How many | states | do not | have | rivers | ?

QUERY:answer(NUM)
  NUM:count(STATE)
    STATE:exclude(STATE STATE)
      STATE:state(all)
      STATE:loc_1(RIVER)
        RIVER:river(all)

P(w,m,T) = P(QUERY:answer(NUM) | -, arg=1)
         * P(NUM ? | QUERY:answer(NUM))
         * P(NUM:count(STATE) | QUERY:answer(NUM), arg=1)
         * P(How many STATE | NUM:count(STATE))
         * P(STATE:exclude(STATE STATE) | NUM:count(STATE), arg=1)
         * P(STATE1 do not STATE2 | STATE:exclude(STATE STATE))
         * P(STATE:state(all) | STATE:exclude(STATE STATE), arg=1)
         * P(states | STATE:state(all))
         * P(STATE:loc_1(RIVER) | STATE:exclude(STATE STATE), arg=2)
         * P(have RIVER | STATE:loc_1(RIVER))
         * P(RIVER:river(all) | STATE:loc_1(RIVER), arg=1)
         * P(rivers | RIVER:river(all))

w: the NL sentence; m: the MR; T: the hybrid tree

MR model parameters: ρ(m'|m, arg=k)
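Since P(w,m,T) is a long product of probabilities, an implementation would typically sum log-probabilities to avoid numerical underflow. A minimal sketch: one toy factor per term in the product above, with made-up values that are assumptions, not trained parameters.

```python
import math

# One toy factor per conditional term in P(w,m,T); values are assumptions.
factors = [
    0.9,   # P(QUERY:answer(NUM) | -, arg=1)
    0.3,   # P(NUM ? | QUERY:answer(NUM))
    0.5,   # P(NUM:count(STATE) | QUERY:answer(NUM), arg=1)
    0.2,   # P(How many STATE | NUM:count(STATE))
    0.4,   # P(STATE:exclude(STATE STATE) | NUM:count(STATE), arg=1)
    0.1,   # P(STATE1 do not STATE2 | STATE:exclude(STATE STATE))
    0.6,   # P(STATE:state(all) | STATE:exclude(STATE STATE), arg=1)
    0.7,   # P(states | STATE:state(all))
    0.4,   # P(STATE:loc_1(RIVER) | STATE:exclude(STATE STATE), arg=2)
    0.3,   # P(have RIVER | STATE:loc_1(RIVER))
    0.8,   # P(RIVER:river(all) | STATE:loc_1(RIVER), arg=1)
    0.5,   # P(rivers | RIVER:river(all))
]

log_p = sum(math.log(f) for f in factors)  # numerically stable form
p = math.prod(factors)                     # direct product, same value
assert abs(math.exp(log_p) - p) < 1e-12
```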
Model Parameters
P(How many STATE | NUM:count(STATE))
  = P(m w Y | NUM:count(STATE))
  * P(How | NUM:count(STATE), BEGIN)
  * P(many | NUM:count(STATE), How)
  * P(STATE | NUM:count(STATE), many)
  * P(END | NUM:count(STATE), STATE)

w: the NL sentence; m: the MR; T: the hybrid tree

Pattern parameters: Φ(r|m)
Hybrid Patterns
#RHS   Hybrid Pattern          # Patterns
0      M -> w                  1
1      M -> [w] Y [w]          4
2      M -> [w] Y [w] Z [w]    8
       M -> [w] Z [w] Y [w]    8

• M is an MR production, w is a word sequence
• Y and Z are respectively the first and second child MR production
• Note: [] denotes optional
Model Parameters
P(How many STATE | NUM:count(STATE))
  = P(m w Y | NUM:count(STATE))
  * P(How | NUM:count(STATE), BEGIN)
  * P(many | NUM:count(STATE), How)
  * P(STATE | NUM:count(STATE), many)
  * P(END | NUM:count(STATE), STATE)

w: the NL sentence; m: the MR; T: the hybrid tree

Emission parameters: θ(t|m,Λ)
Assumptions: Model I, II, III
Emission of "BEGIN How many STATE END" from NUM:count(STATE):

Model I   (Unigram): Θ(t_i | M, Λ) = P(t_i | M)
Model II  (Bigram):  Θ(t_i | M, Λ) = P(t_i | M, t_{i-1})
Model III (Mixgram): Θ(t_i | M, Λ) = [P(t_i | M, t_{i-1}) + P(t_i | M)] * 0.5
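The three assumptions can be written as a single emission function. A minimal sketch: the probability tables below hold toy values (assumptions), not trained parameters.

```python
# Toy emission tables (illustrative values, not trained parameters).
unigram = {("How", "NUM:count(STATE)"): 0.2}
bigram  = {("How", "NUM:count(STATE)", "BEGIN"): 0.4}

def theta(t, M, prev, model):
    """Emission probability of token t from MR production M, where prev is
    the previous token in the hybrid sequence (the context Lambda)."""
    p_uni = unigram.get((t, M), 0.0)
    p_bi  = bigram.get((t, M, prev), 0.0)
    if model == "I":    # unigram model
        return p_uni
    if model == "II":   # bigram model
        return p_bi
    # Model III (mixgram): equal-weight interpolation of the two
    return 0.5 * (p_bi + p_uni)

print(theta("How", "NUM:count(STATE)", "BEGIN", "III"))
```

Model III backs off gracefully: when the bigram context is unseen, the unigram term still contributes half its mass.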
Model Parameters
• MR model parameters: Σ_{m_i} ρ(m_i | m_j, arg=k) = 1
  They model the meaning representation.
• Emission parameters: Σ_t Θ(t | m_j, Λ) = 1
  They model the emission of words and semantic categories of MR productions; Λ is the context.
• Pattern parameters: Σ_r Φ(r | m_j) = 1
  They model the selection of hybrid patterns.
Parameter Estimation
• MR model parameters are easy to estimate
• Learning the emission parameters and pattern parameters is challenging
• Inside-outside algorithm with EM
  – Naïve implementation: O(n^6 * m)
  – n: number of words in an NL sentence
  – m: number of MR productions in an MR
• Improved efficient algorithm
  – Two-layer dynamic programming
  – Improved time complexity: O(n^3 * m)
Decoding
• Given an NL sentence w, find the optimal MR M*:
    M* = argmax_m P(m | w)
       = argmax_m Σ_T P(m, T | w)
       = argmax_m Σ_T P(w, m, T)
• Instead, we find the most likely hybrid tree:
    M* = argmax_m max_T P(w, m, T)
• Similar DP techniques are employed
• Implemented an exact top-k decoding algorithm
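The gap between the exact objective (summing over trees) and the most-likely-tree approximation can be seen on a toy example. The candidate MRs, trees, and scores below are made-up assumptions purely for illustration:

```python
from collections import defaultdict

# Toy joint scores P(w,m,T) for candidate (MR, tree) pairs (assumptions).
scores = {
    ("mr_a", "tree1"): 0.30,
    ("mr_a", "tree2"): 0.25,
    ("mr_b", "tree3"): 0.40,
}

best_tree = defaultdict(float)  # max over trees, per MR
total = defaultdict(float)      # sum over trees, per MR
for (m, T), p in scores.items():
    best_tree[m] = max(best_tree[m], p)
    total[m] += p

m_viterbi = max(best_tree, key=best_tree.get)  # argmax_m max_T P(w,m,T)
m_exact   = max(total, key=total.get)          # argmax_m sum_T P(w,m,T)
print(m_viterbi, m_exact)
# -> mr_b mr_a
```

Here the two objectives disagree: mr_a wins under the exact sum (0.55 vs 0.40) but mr_b has the single best tree, which is why the correct output may rank below the top under the approximation, motivating reranking over the top-k list.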
Reranking
• Weakness of the generative model
  – Lacks the ability to model long-range dependencies
• Reranking with the averaged perceptron
  – Output space: hybrid trees from the exact top-k (k=50) decoding algorithm for each training/testing instance's NL sentence
  – Single correct reference: output of the Viterbi algorithm for each training instance
  – Feature functions: features 1-5 are indicator functions, while feature 6 is real-valued
  – Threshold b that prunes unreliable predictions even when they score the highest, to optimize F-measure
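A minimal sketch of averaged-perceptron reranking under the setup above. The feature dicts, data, and function names are toy assumptions; the paper's actual feature functions and threshold b are omitted.

```python
def score(weights, feats):
    """Dot product of a weight vector and a sparse feature dict."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())

def train(instances, epochs=5):
    """instances: list of (candidates, gold_index), where candidates are
    sparse feature dicts for the top-k outputs of the base model."""
    w, w_sum, n = {}, {}, 0
    for _ in range(epochs):
        for cands, gold in instances:
            pred = max(range(len(cands)), key=lambda i: score(w, cands[i]))
            if pred != gold:
                # Standard perceptron update toward the reference candidate
                for f, v in cands[gold].items():
                    w[f] = w.get(f, 0.0) + v
                for f, v in cands[pred].items():
                    w[f] = w.get(f, 0.0) - v
            # Accumulate weights after every example for averaging
            for f, v in w.items():
                w_sum[f] = w_sum.get(f, 0.0) + v
            n += 1
    return {f: v / n for f, v in w_sum.items()}

# Toy data: two instances, each with two candidate outputs
data = [
    ([{"good": 1.0}, {"bad": 1.0}], 0),
    ([{"bad": 1.0}, {"good": 1.0}], 1),
]
w = train(data)
assert score(w, {"good": 1.0}) > score(w, {"bad": 1.0})
```

Averaging the weight vector over all updates, rather than keeping the final one, is the standard trick that makes the perceptron far more stable as a reranker.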
Reranking Features: Examples
Feature 1 (Hybrid Rule): an MR production and its child hybrid sequence
Feature 2 (Expanded Hybrid Rule): an MR production and its child hybrid sequence, expanded
Feature 3 (Long-range Unigram): an MR production and an NL word appearing below it in the tree
Feature 4 (Grandchild Unigram): an MR production and its grandchild NL word
Feature 5 (Two-Level Unigram): an MR production, its parent production, and its child NL word
Feature 6 (Model Log-Probability): the logarithm of the base model's joint probability, log(P(w,m,T))
Related Work
• SILT (2005) by Kate, Wong, and Mooney
  – A system that learns deterministic rules to transform either sentences or their syntactic parse trees into meaning structures
• WASP (2006) by Wong and Mooney
  – A system motivated by statistical machine translation techniques
• KRISP (2006) by Kate and Mooney
  – A discriminative approach where meaning representation structures are constructed from the natural language strings hierarchically
Evaluation Metrics
• Precision = (# correct output structures) / (# output structures)
• Recall = (# correct output structures) / (# input sentences)
• F-measure = 2 / (1/Precision + 1/Recall)
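The three metrics follow directly from the counts; a small sketch (the function name and the example counts are assumptions for illustration):

```python
def prf(n_correct, n_output, n_inputs):
    """Precision, recall, and F-measure (harmonic mean) as defined above."""
    precision = n_correct / n_output
    recall = n_correct / n_inputs
    f = 2.0 / (1.0 / precision + 1.0 / recall)
    return precision, recall, f

# e.g. 80 correct structures among 90 outputs, for 100 input sentences:
p, r, f = prf(80, 90, 100)
print(round(p, 3), round(r, 3), round(f, 3))
# -> 0.889 0.8 0.842
```

Note that recall divides by the number of input sentences, not outputs: a system may decline to produce a structure for some inputs, which raises precision but lowers recall.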
Comparison over three models
• I/II/III: Unigram/Bigram/Mixgram model; +R: with reranking
• Reranking is shown to be effective
• Overall, model III with reranking performs the best
Evaluations
Model     Geoquery (880)        Robocup (300)
          Prec.  Rec.   F       Prec.  Rec.   F
I         81.3   77.1   79.1    71.1   64.0   67.4
II        89.0   76.0   82.0    82.4   57.7   67.8
III       86.2   81.8   84.0    70.4   63.3   66.7
I + R     87.5   80.5   83.8    79.1   67.0   72.6
II + R    93.2   73.6   82.3    88.4   56.0   68.6
III + R   89.3   81.5   85.2    82.5   67.7   74.4
Comparison with other models
On Geoquery:
• Able to handle more than 25% of the inputs that could not be handled by previous systems
• Error reduction rate of 22%
Evaluations
System        Geoquery (880)        Robocup (300)
              Prec.  Rec.   F       Prec.  Rec.   F
SILT          89.0   54.1   67.3    83.9   50.7   63.2
WASP          87.2   74.8   80.5    88.9   61.9   73.0
KRISP         93.3   71.7   81.1    85.2   61.9   71.7
Model III + R 89.3   81.5   85.2    82.5   67.7   74.4
Evaluations
System        English                  Spanish
              Prec.   Rec.   F         Prec.   Rec.   F
WASP          95.42   70.00  80.76     91.99   72.40  81.03
Model III + R 91.46   72.80  81.07     95.19   79.20  86.46
Comparison on other languages
• Achieves performance comparable to previous system
System        Japanese                 Turkish
              Prec.   Rec.   F         Prec.   Rec.   F
WASP          91.98   74.40  82.86     96.96   62.40  75.93
Model III + R 87.56   76.00  81.37     93.82   66.80  78.04
Contributions
• Introduced a hybrid tree representation framework for this task
• Proposed a new generative model that can be applied to the task of transforming NL sentences to MRs
• Developed a new dynamic programming algorithm for efficient training and decoding
• The approach, augmented with reranking, achieves state-of-the-art performance on benchmark corpora, with a notable improvement in recall
Questions?