HW7 Extracting Arguments for %

Ang Sun [email protected]
March 25, 2012


Outline

• File Format
• Training
  – Generating Training Examples
  – Extracting Features
  – Training of MaxEnt Models
• Decoding
• Scoring


File Format

• Statistics Canada said service-industry <ARG1> output </ARG1> in August <SUPPORT> rose </SUPPORT> 0.4 <PRED class="PARTITIVE-QUANT"> % </PRED> from July .
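The annotated line above can be read into plain tokens plus role positions. The following is a minimal sketch, assuming tokens and tags are whitespace-separated exactly as in the example; the function name `parse_annotated` is hypothetical and not part of the assignment.

```python
import re

def parse_annotated(line):
    """Split an annotated sentence into tokens and a role -> token-index map."""
    tokens, roles = [], {}
    current = None  # role of the tag we are currently inside, if any
    # A piece is either a whole tag (which may contain spaces) or a plain token.
    for piece in re.findall(r"<[^>]+>|\S+", line):
        if piece.startswith("</"):
            current = None                      # closing tag
        elif piece.startswith("<"):
            current = piece[1:-1].split()[0]    # "<PRED class=...>" -> "PRED"
        else:
            if current is not None:
                roles.setdefault(current, []).append(len(tokens))
            tokens.append(piece)
    return tokens, roles

line = ('Statistics Canada said service-industry <ARG1> output </ARG1> in August '
        '<SUPPORT> rose </SUPPORT> 0.4 <PRED class="PARTITIVE-QUANT"> % </PRED> '
        'from July .')
tokens, roles = parse_annotated(line)
print(tokens)
print(roles)
```

Indices are 0-based positions in the tag-free token list, so the ARG1, SUPPORT, and PRED positions can be reused by later steps.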

• Generating Training Examples
  – Positive example
    • Each sentence has exactly one positive example: the token annotated as ARG1

• Generating Training Examples
  – Negative examples: two methods
    • Method 1: consider any token that has one of the following POS tags (each tag is followed by its count):

      NN 1150, NNS 905, NNP 205, JJ 25, PRP 24, CD 21, DT 16, NNPS 13, VBG 2, FW 1, IN 1, RB 1, VBZ 1, WDT 1, WP 1

      Too many negative examples!
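Method 1 can be sketched as a simple POS filter. This is an illustration only: the function name `label_candidates`, the decision to skip the predicate token itself, and the POS tags for tokens the slides do not tag are all assumptions.

```python
# POS tags listed for Method 1.
CANDIDATE_POS = {"NN", "NNS", "NNP", "JJ", "PRP", "CD", "DT", "NNPS",
                 "VBG", "FW", "IN", "RB", "VBZ", "WDT", "WP"}

def label_candidates(pos_tags, arg1_index, pred_index):
    """Return (token_index, 'Y'/'N') pairs for one sentence."""
    examples = []
    for i, pos in enumerate(pos_tags):
        if i == pred_index:
            continue                      # assumption: the predicate is never a candidate
        if i == arg1_index:
            examples.append((i, "Y"))     # the single positive example
        elif pos in CANDIDATE_POS:
            examples.append((i, "N"))     # every other matching token is negative
    return examples

# Tags for: Statistics Canada said service-industry output in August rose 0.4 % from July .
# (partly assumed; the slides only give tags for some of these tokens)
pos_tags = ["NNP", "NNP", "VBD", "NN", "NN", "IN", "NNP", "VBD", "CD", "NN", "IN", "NNP", "."]
examples = label_candidates(pos_tags, arg1_index=4, pred_index=9)
print(examples)
```

Even on this short sentence the filter yields several negatives for the one positive, which is the imbalance the slide warns about.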

• Generating Training Examples
  – Negative examples: two methods
    • Method 2: consider only head tokens

• Extracting Features: token features (candidate "output", from the File Format example)
  f: candToken=output
  f: tokenBeforeCand=service-industry
  f: tokenAfterCand=in
  f: tokensBetweenCandPRED=in_August_rose_0.4
  f: numberOfTokensBetween=4
  f: exisitVerbBetweenCandPred=true
  f: exisitSUPPORTBetweenCandPred=true
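The token-level features can be computed from token positions alone. Below is a rough sketch, assuming 0-based indices for the candidate, the predicate, and any SUPPORT tokens; the function name `token_features`, the boundary markers `<S>` and `</S>`, and the rule that a POS tag starting with "VB" counts as a verb are assumptions.

```python
def token_features(tokens, pos_tags, cand, pred, support=()):
    """Token-level features for one candidate/predicate pair."""
    lo, hi = sorted((cand, pred))
    between = range(lo + 1, hi)           # positions strictly between the two tokens
    return {
        "candToken": tokens[cand],
        "tokenBeforeCand": tokens[cand - 1] if cand > 0 else "<S>",
        "tokenAfterCand": tokens[cand + 1] if cand + 1 < len(tokens) else "</S>",
        "tokensBetweenCandPRED": "_".join(tokens[i] for i in between),
        "numberOfTokensBetween": len(between),
        "exisitVerbBetweenCandPred": any(pos_tags[i].startswith("VB") for i in between),
        "exisitSUPPORTBetweenCandPred": any(i in support for i in between),
    }

tokens = ["Statistics", "Canada", "said", "service-industry", "output", "in",
          "August", "rose", "0.4", "%", "from", "July", "."]
# POS tags partly assumed; the slides only give tags for some tokens.
pos_tags = ["NNP", "NNP", "VBD", "NN", "NN", "IN", "NNP", "VBD", "CD", "NN", "IN", "NNP", "."]
feats = token_features(tokens, pos_tags, cand=4, pred=9, support={7})
for name, value in feats.items():
    print(f"{name}={value}")
```

Note that Python prints booleans as `True`/`False`; lowercase them when writing feature lines if you want to match the spelling on the slides.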

• Extracting Features: POS features
  f: candTokenPOS=NN
  f: posBeforeCand=NN
  f: posAfterCand=IN
  f: possBetweenCandPRED=IN_NNP_VBD_CD

• Extracting Features: chunk features
  f: BIOChunkChain=I-NP_B-PP_B-NP_B-VP_B-NP_I-NP
  f: chunkChain=NP_PP_NP_VP_NP
  f: candPredInSameNP=False
  f: candPredInSameVP=False
  f: candPredInSamePP=False
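The chunk features can be derived from per-token BIO chunk tags. The sketch below assumes well-formed BIO tags, reads "same NP/VP/PP" at the chunk level (the slides do not say whether chunks or parse-tree phrases are meant), and uses a hypothetical function name `chunk_features`.

```python
def chunk_features(bio_tags, cand, pred):
    """Chunk-level features for one candidate/predicate pair."""
    lo, hi = sorted((cand, pred))
    span = bio_tags[lo:hi + 1]                   # tags from candidate to predicate, inclusive
    chain = []
    for k, tag in enumerate(span):
        kind = tag.split("-", 1)[-1]             # "B-NP" -> "NP", "O" -> "O"
        if k == 0 or not tag.startswith("I-"):   # a new chunk starts here
            chain.append(kind)
    # Both tokens share a chunk only if every tag after the first continues it.
    same_chunk = all(tag.startswith("I-") for tag in span[1:])
    first_kind = span[0].split("-", 1)[-1]
    feats = {"BIOChunkChain": "_".join(span), "chunkChain": "_".join(chain)}
    for kind in ("NP", "VP", "PP"):
        feats[f"candPredInSame{kind}"] = same_chunk and first_kind == kind
    return feats

# BIO tags for the example sentence; tags outside the output..% span are assumed.
bio_tags = ["B-NP", "I-NP", "B-VP", "B-NP", "I-NP", "B-PP", "B-NP",
            "B-VP", "B-NP", "I-NP", "B-PP", "B-NP", "O"]
chunk_feats = chunk_features(bio_tags, cand=4, pred=9)
print(chunk_feats)
```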

• Extracting Features: parse-tree path
  f: shortestPathBetweenCandPred=NP_NP-SBJ_S_VP_NP-EXT
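One way to obtain such a path is to climb from the candidate to the lowest common ancestor and then descend to the predicate. The sketch below uses a hand-built tree fragment chosen to be consistent with the path above; the node ids, the fragment itself, the omission of POS preterminals, and the function names are all assumptions, not the parse used in the assignment.

```python
def ancestors(node, parent):
    """List of ancestors of `node`, nearest first."""
    chain = []
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain

def shortest_path(cand, pred, parent, label):
    """Labels on the tree path from cand up to the common ancestor and down to pred."""
    up = ancestors(cand, parent)
    down = ancestors(pred, parent)
    down_set = set(down)
    for k, node in enumerate(up):
        if node in down_set:                  # lowest common ancestor
            lca_pos = k
            break
    else:
        raise ValueError("tokens do not share an ancestor")
    lca = up[lca_pos]
    path = up[:lca_pos + 1] + list(reversed(down[:down.index(lca)]))
    return "_".join(label[n] for n in path)

# Hypothetical fragment: (S (NP-SBJ (NP ... output) (PP ...)) (VP rose (NP-EXT 0.4 %) ...))
label = {"n1": "S", "n2": "NP-SBJ", "n3": "NP", "n4": "PP", "n5": "VP", "n6": "NP-EXT"}
parent = {"output": "n3", "n3": "n2", "n4": "n2", "n2": "n1",
          "n5": "n1", "n6": "n5", "%": "n6"}
path = shortest_path("output", "%", parent, label)
print(path)
```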

• Training of MaxEnt Model
  – Each training example is one line
    • candToken=output . . . . . class=Y
    • candToken=Canada . . . . . class=N
  – Put all examples in one file, the training file
  – Use the MaxEnt wrapper or the program you wrote in HW5 to train your relation extraction model
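Turning a feature dictionary into one line per example might look like the sketch below. The space-separated `name=value` layout mirrors the examples above, but the exact format your MaxEnt wrapper expects is an assumption to check; `format_example` is a hypothetical helper.

```python
def format_example(features, label=None):
    """Render one example as a single line; omit the class when label is None."""
    parts = [f"{name}={value}" for name, value in features.items()]
    if label is not None:
        parts.append(f"class={label}")
    return " ".join(parts)

train_lines = [
    format_example({"candToken": "output", "candTokenPOS": "NN"}, "Y"),
    format_example({"candToken": "Canada", "candTokenPOS": "NNP"}, "N"),
]
print("\n".join(train_lines))
```

Calling the same helper with no label gives the unlabeled lines needed at decoding time.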


Decoding

• For each sentence
  – Generate testing examples the same way you generated training examples
    • One example per feature line, without the class=Y/N label
  – Apply your trained model to each testing example
  – Choose the example with the highest probability returned by your model as the ARG1
  – So every sentence must end up with exactly one ARG1
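The selection step is an argmax over the candidates of one sentence. In this sketch `prob_yes` stands in for your trained model and is assumed to return the probability of class Y for a feature dictionary; both it and `choose_arg1` are hypothetical names.

```python
def choose_arg1(candidates, prob_yes):
    """candidates: list of (token_index, features). Returns the best token index."""
    if not candidates:
        raise ValueError("a sentence needs at least one candidate")
    best_index, _ = max(candidates, key=lambda c: prob_yes(c[1]))
    return best_index

# Toy scorer used only to exercise the function; a real model replaces it.
def toy_prob_yes(features):
    return 0.9 if features["candToken"] == "output" else 0.1

candidates = [(1, {"candToken": "Canada"}),
              (4, {"candToken": "output"}),
              (6, {"candToken": "August"})]
predicted = choose_arg1(candidates, toy_prob_yes)
print(predicted)
```

Because the argmax always returns one candidate, every sentence receives exactly one ARG1 regardless of how low the top probability is.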


Scoring

• You are required to tag exactly one ARG1 per sentence, so your system will be evaluated on accuracy
  – Accuracy = #correct_ARG1s / #sentences
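The accuracy formula can be checked with a few lines of code. This sketch assumes predictions and gold answers are given as one ARG1 token index per sentence, in the same order; the function name is hypothetical.

```python
def accuracy(predicted, gold):
    """Fraction of sentences whose predicted ARG1 matches the gold ARG1."""
    if len(predicted) != len(gold):
        raise ValueError("need exactly one prediction per sentence")
    if not gold:
        raise ValueError("no sentences to score")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)

# Three sentences, two correct ARG1 choices.
score = accuracy([4, 2, 7], [4, 3, 7])
print(f"{score:.3f}")
```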


Good Luck!