Learning-Based Argument Structure Analysis of Event-Nouns in Japanese
Mamoru Komachi, Ryu Iida, Kentaro Inui and Yuji Matsumoto
Graduate School of Information Science
Nara Institute of Science and Technology, JAPAN
19 September 2007
Our goal
Our city, destroyed by the atomic bomb
Our city was destroyed by the atomic bomb
The atomic bomb destroyed our city
the destruction of our city by the atomic bomb (nominalization)
IE, MT, Summarization, …
destroy: CAUSE = The atomic bomb, UNDERGOER = Our city
Argument structure of event-nouns
Logical cases for event-nouns are often not marked by case markers
Kanojo-kara denwa-ga ki-ta
She-ABL phone-NOM come-PAST
(She phoned me.)
come: NOM = phone, ABL = she
phone: NOM = she, DAT = (me)
Task setting
Tom-ga kinou denwa-o ka-tta
Tom-NOM yesterday phone-ACC buy-PAST
(Tom bought a phone yesterday.)
1. Event classification (determine event-hood)
2. Argument identification
buy: NOM = Tom, ACC = phone
phone: ? ? (to be determined)
Outline
Introduction
Argument structure analysis of event-nouns
  Event classification
  Argument identification
Conclusion
Future work
Unsupervised learning of patterns
Encode each instance as a tree and learn contextual patterns as sub-trees with a boosting algorithm called BACT (Kudo and Matsumoto, 2004)
Positive (having eventhood): … persuasion, destruction, … e.g. "… conducted destruction of documents …"
Negative (not having eventhood): … chair, desk, … e.g. "… a little chair around …"
Encode each instance in a flat tree using surface text, POS, dependency relations, etc.
[Figure: flat-tree encoding of the two examples, with node labels Verb, Common noun, Adj, Prep and relation labels Same phrase, Depends]
Experiments of event classification
Method: Classify eventhood of event-nouns by Support Vector Machines
Data: 80 news articles (800 sentences), 1,237 event-nouns (590 have eventhood)
Features:
Grammatical features, e.g. HeadPOS: CommonNoun
Semantic features, e.g. SemanticCategory: Animate
Contextual features, e.g. FollowsVerbalNoun: 1
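The slides do not say how these features are fed to the SVM. A minimal sketch, assuming a simple one-hot encoding of "name=value" strings, is shown here; the helper names `build_vocabulary` and `encode` and the second instance are hypothetical and only illustrate the idea.

```python
def build_vocabulary(instances):
    """Assign an integer index to every distinct 'name=value' feature string."""
    vocab = {}
    for features in instances:
        for name, value in features.items():
            key = f"{name}={value}"
            if key not in vocab:
                vocab[key] = len(vocab)
    return vocab


def encode(features, vocab):
    """Return a dense 0/1 vector; features missing from the vocabulary are ignored."""
    vector = [0] * len(vocab)
    for name, value in features.items():
        index = vocab.get(f"{name}={value}")
        if index is not None:
            vector[index] = 1
    return vector


instances = [
    # Feature values taken from the slide.
    {"HeadPOS": "CommonNoun", "SemanticCategory": "Animate", "FollowsVerbalNoun": 1},
    # Hypothetical second instance, added only to show differing values.
    {"HeadPOS": "CommonNoun", "SemanticCategory": "Inanimate", "FollowsVerbalNoun": 0},
]

vocab = build_vocabulary(instances)
vectors = [encode(features, vocab) for features in instances]
```

Each resulting vector could then be passed to any SVM implementation that accepts numeric feature vectors.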
Results of event classification
                          Prec.  Rec.   F
Baseline (predominant)    60.4   88.2   71.7
Proposed (unsupervised)   73.3   80.2   76.6
Baseline: uses the first (predominant) sense determined by corpus statistics (NAIST Text Corpus)
Proposed: machine learning based classifier
Precision = correct / event-nouns classified as having event-hood by the system
Recall = correct / all event-nouns having event-hood in the corpus
Outperforms the baseline in precision and F by using contextual patterns
Can improve further by adding more data
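The precision, recall, and F definitions can be sketched as follows. This assumes F is the usual harmonic mean of precision and recall, which reproduces the F column from the reported precision and recall values. The function names and the toy label lists are illustrative, not taken from the slides.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f(gold, predicted):
    """gold / predicted: lists of booleans, True = 'has event-hood'."""
    correct = sum(1 for g, p in zip(gold, predicted) if g and p)
    n_predicted = sum(1 for p in predicted if p)
    n_gold = sum(1 for g in gold if g)
    precision = correct / n_predicted if n_predicted else 0.0
    recall = correct / n_gold if n_gold else 0.0
    return precision, recall, f_measure(precision, recall)


# Toy labels (hypothetical) to exercise the definitions.
gold = [True, True, False, True, False]
predicted = [True, False, True, True, False]
p, r, f = precision_recall_f(gold, predicted)

# Consistency check against the reported rows (values in percent).
baseline_f = round(f_measure(60.4, 88.2), 1)
proposed_f = round(f_measure(73.3, 80.2), 1)
```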
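Argument identification
Build a classifier using the tournament model (Iida et al., 2006)
日本 政府 による 民間 支援 が 活性 化 した。
Japanese government-BY private sector support-NOM activate-PAST
(The support for the private sector by the Japanese government was activated.)
Candidates: 日本 (Japan), 政府 (government), 民間 (private sector), 活性 (activation)
Training (L/R = which side of the pair is the correct argument): (政府, 民間) → L; (政府, 活性) → L; (日本, 政府) → R
Decoding: (民間, 活性) → L: 民間; (政府, 民間) → L: 政府; (日本, 政府) → R: 政府
Result: 政府 is identified as the NOM argument of 支援(する) (support)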
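The decoding step of the tournament can be sketched as a sequence of pairwise matches in which the winner advances. The sketch below processes candidates from right to left, as in the example on this slide. The `prefer` function and `toy_scores` are hypothetical stand-ins for the trained pairwise classifier, whose actual features are not shown here.

```python
def tournament_decode(candidates, prefer):
    """Run pairwise matches from the rightmost candidate to the leftmost.

    candidates: noun candidates in sentence order.
    prefer(left, right): returns "L" or "R", the side judged more likely
    to be the argument.
    Returns the final winner and the list of matches played.
    """
    winner = candidates[-1]
    matches = []
    for challenger in reversed(candidates[:-1]):
        label = prefer(challenger, winner)  # challenger is on the left
        matches.append((challenger, winner, label))
        if label == "L":
            winner = challenger
    return winner, matches


# Hypothetical scores standing in for a trained classifier.
toy_scores = {"日本": 0.2, "政府": 0.9, "民間": 0.5, "活性": 0.1}


def prefer(left, right):
    return "L" if toy_scores[left] > toy_scores[right] else "R"


winner, matches = tournament_decode(["日本", "政府", "民間", "活性"], prefer)
```

With these toy scores the matches follow the same L, L, R sequence as the decoding example, and 政府 wins the tournament.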
Calculation of PMI using pLSI
Estimate point-wise mutual information using Probabilistic Latent Semantic Indexing (Hofmann, 1999), where noun n depends on verb v through case marker c (Fujita et al., 2004)
$$P(\langle v,c \rangle, n) = \sum_{z \in Z} P(\langle v,c \rangle \mid z) \, P(n \mid z) \, P(z)$$
$$\mathrm{PMI}(\langle v,c \rangle, n) = \log \frac{P(\langle v,c \rangle, n)}{P(\langle v,c \rangle) \, P(n)}$$
Example: "… pay for the shoes" gives the pair <pay, for> and the noun shoes
Dimension reduction by a hidden class z
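A minimal sketch of the PMI computation, assuming the pLSI parameters P(z), P(<v,c> | z), and P(n | z) have already been estimated. The marginals are obtained by summing over the hidden class z, and the natural logarithm is assumed because the slide does not state the base. All probabilities below are made-up toy values, not results from the paper.

```python
import math


def joint(vc, n, p_vc_given_z, p_n_given_z, p_z):
    """P(<v,c>, n) = sum over z of P(<v,c> | z) P(n | z) P(z)."""
    return sum(p_vc_given_z[z][vc] * p_n_given_z[z][n] * p_z[z] for z in p_z)


def marginal(x, p_x_given_z, p_z):
    """P(x) = sum over z of P(x | z) P(z)."""
    return sum(p_x_given_z[z][x] * p_z[z] for z in p_z)


def pmi(vc, n, p_vc_given_z, p_n_given_z, p_z):
    """PMI(<v,c>, n) = log( P(<v,c>, n) / (P(<v,c>) P(n)) )."""
    numerator = joint(vc, n, p_vc_given_z, p_n_given_z, p_z)
    denominator = marginal(vc, p_vc_given_z, p_z) * marginal(n, p_n_given_z, p_z)
    return math.log(numerator / denominator)


# Toy parameters with two hidden classes (hypothetical values).
p_z = {"z1": 0.5, "z2": 0.5}
p_vc_given_z = {
    "z1": {("pay", "for"): 0.8, ("eat", "ACC"): 0.2},
    "z2": {("pay", "for"): 0.2, ("eat", "ACC"): 0.8},
}
p_n_given_z = {
    "z1": {"shoes": 0.9, "bread": 0.1},
    "z2": {"shoes": 0.1, "bread": 0.9},
}

pmi_shoes = pmi(("pay", "for"), "shoes", p_vc_given_z, p_n_given_z, p_z)
pmi_bread = pmi(("pay", "for"), "bread", p_vc_given_z, p_n_given_z, p_z)
```

In this toy setting <pay, for> shares a hidden class with shoes, so its PMI with shoes is positive and its PMI with bread is negative.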
Case alignment dictionary
(ACC_event, oshie-ru (teach)) = DAT_pred → NOM_event
kare-ga kanojo-ni benkyo-o oshie-ta
he-NOM her-DAT study-ACC teach-PAST
(He taught a lesson to her.)
Case alternation:
kanojo-ga benkyo-sita
she-NOM study-PAST
(She studied.)
In NomBank, 20% of the arguments that occur outside the NP are in support verb constructions (Jiang and Ng, 2006)
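One way to picture the dictionary is as a lookup from (case of the event-noun, support verb) to a mapping between the predicate's cases and the event-noun's cases. The sketch below encodes only the single entry shown on this slide. The data layout and the name `project_arguments` are assumptions made for illustration, not the authors' implementation.

```python
# (case of the event-noun, support verb) -> {predicate case: event-noun case}
CASE_ALIGNMENT = {
    ("ACC", "oshie-ru"): {"DAT": "NOM"},
}


def project_arguments(event_case, support_verb, predicate_args,
                      dictionary=CASE_ALIGNMENT):
    """Map the predicate's arguments onto the event-noun's cases.

    predicate_args: {case: filler} for the support verb.
    Returns {event-noun case: filler}; empty if no dictionary entry applies.
    """
    mapping = dictionary.get((event_case, support_verb), {})
    return {
        event_role: predicate_args[pred_case]
        for pred_case, event_role in mapping.items()
        if pred_case in predicate_args
    }


# kare-ga kanojo-ni benkyo-o oshie-ta (He taught a lesson to her.)
predicate_args = {"NOM": "kare", "DAT": "kanojo", "ACC": "benkyo"}
event_args = project_arguments("ACC", "oshie-ru", predicate_args)
```

Here the DAT argument of oshie-ru (kanojo) becomes the NOM argument of benkyo, matching the alternation kanojo-ga benkyo-sita.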
Experiments of argument identification
Method: Apply the Japanese zero-anaphora resolution model (Iida et al., 2006) to the argument identification task
Both tasks lack case markers
Event classification = anaphoricity determination task
Data: 137 articles for training and 150 articles for testing (event-nouns: 722, NOM: 722, ACC: 278, DAT: 72)
Features
Feature type   Example                   Instance
Lexical        WordForm                  日本
Grammatical    POS                       ProperNoun
Semantic       CoocScore (PMI)           <支援(する), ガ>, 日本 → 2.80
Positional     NPDependsOnSupportVerb    0 (日本政府による)
Example sentence:
日本 政府 による 民間 支援 が 活性化 した。
Japanese government-BY private sector support-NOM activate-PAST
(The support for the private sector by the Japanese government was activated.)
Accuracy of argument identification
Feature       NOM    ACC    DAT
Baseline      60.5   79.7   73.0
+SVC          64.2   78.0   71.4
+COOC         67.1   80.1   74.6
+SVC +COOC    68.3   80.1   74.6
SVC: support verb construction; COOC: co-occurrence
Combining the case alignment dictionary and co-occurrence statistics improved accuracy over the baseline for all three cases (NOM: 60.5 → 68.3, ACC: 79.7 → 80.1, DAT: 73.0 → 74.6)
Related work
Jiang and Ng (2006): Built a maxent classifier for NomBank (Meyers et al., 2004) based on features for PropBank (Palmer et al., 2005)
Xue (2006): Used Chinese TB
Liu and Ng (2007): Applied Alternating Structure Optimization (ASO) to the task of argument identification
Conclusion
Defined argument structure analysis of event-nouns in Japanese
Proposed an unsupervised approach that learns contextual patterns of event-nouns for the event classification task
Performed argument identification using co-occurrence statistics and syntactic clues
Future work
Explore a semi-supervised approach to the event classification task
Use more lexical resources for the argument identification task