Summary #10: Application of Automatic Thesaurus Extraction for Computer Generation of Vocabulary Questions (Heilman, 2007)

DESCRIPTION

This is a summary of the paper titled "Application of Automatic Thesaurus Extraction for Computer Generation of Vocabulary Questions" by Heilman (2007).

  • SUMMARY BY YUNI SUSANTI

    Application of Automatic Thesaurus Extraction for Computer Generation of Vocabulary Questions (Heilman, 2007)

    1 Introduction

    The goal of this work is to automatically generate a specific type of vocabulary assessment item: one that addresses paradigmatic relations (synonyms, antonyms, and words that are similar but more or less specific in meaning) for a target word. The paper also applies an automatic thesaurus extraction technique to build the knowledge base, rather than relying on a manually created lexical database such as WordNet.

    2 Methodology

    2.1 Automatic thesaurus extraction

    In this paper, automatic thesaurus extraction is used to build a knowledge base for generating related-word questions. The basic idea is the distributional hypothesis: words that occur in similar contexts tend to have similar meanings. For example, the words milk, juice, and cup all appear in contexts about meals. However, the first two are likely to be the object of a verb such as drink, while cup is not. This kind of dependency relationship is what the technique exploits. Here are the steps:

    Use a dependency parser to create a dependency triple for each dependency relation extracted from a large corpus of text. For example, from the sentence "I drank milk this morning", the triple would be (drink, obj, milk).

    ...
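To make the distributional idea concrete, here is a minimal Python sketch built on the milk, juice, and cup example. It is not the paper's procedure: this summary elides the remaining steps, so the hand-written triples, the function names, and the use of cosine similarity over raw counts are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical (head, relation, dependent) triples, standing in for parser
# output over a large corpus. These are illustrative, not data from the paper.
TRIPLES = [
    ("drink", "obj", "milk"),
    ("drink", "obj", "juice"),
    ("pour", "obj", "milk"),
    ("pour", "obj", "juice"),
    ("fill", "obj", "cup"),
    ("wash", "obj", "cup"),
]


def context_vectors(triples):
    """Map each dependent word to a count of the (head, relation) contexts it occurs in."""
    vectors = {}
    for head, relation, dependent in triples:
        vectors.setdefault(dependent, Counter())[(head, relation)] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (an assumed measure)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


vectors = context_vectors(TRIPLES)
milk_juice = cosine(vectors["milk"], vectors["juice"])
milk_cup = cosine(vectors["milk"], vectors["cup"])
print(milk_juice > milk_cup)  # milk and juice share dependency contexts; cup does not
```

In this toy data, milk and juice occur as objects of the same verbs, so they come out as more similar to each other than either is to cup, which matches the intuition described above.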

    2.2 Generating the questions

    The question is built from the template "Which set of words is related to word w?" The correct response consists of the three most similar words that are not morphologically related to the target word (for example, infrequent would be excluded for the target word frequent). The distractors are chosen randomly from words with the same part of speech (POS).
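The question-building step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the substring test for morphological relatedness is a crude stand-in (the summary does not say how the paper checks it), and the number of distractor sets, the exclusion of similar words from the distractor pool, and all function and parameter names are hypothetical.

```python
import random

TEMPLATE = "Which set of words is related to {word}?"


def morphologically_related(a, b):
    """Crude stand-in check (an assumption): one word contains the other."""
    a, b = a.lower(), b.lower()
    return a in b or b in a


def make_question(target, ranked_similar, same_pos_pool, n_distractor_sets=3, rng=None):
    """Build one related-word question.

    ranked_similar: words ordered from most to least similar to the target.
    same_pos_pool: candidate distractor words sharing the target's POS.
    n_distractor_sets: assumed value; the summary does not give the count.
    """
    rng = rng or random.Random()

    # Correct response: top three similar words not morphologically related.
    answer = [w for w in ranked_similar if not morphologically_related(target, w)][:3]
    if len(answer) < 3:
        raise ValueError("need at least three unrelated similar words")

    # Distractors: random same-POS words. Excluding words from the similarity
    # list is an extra assumption, made so a distractor is not accidentally related.
    excluded = set(ranked_similar) | {target}
    pool = [w for w in same_pos_pool
            if w not in excluded and not morphologically_related(target, w)]
    needed = 3 * n_distractor_sets
    if len(pool) < needed:
        raise ValueError("not enough distractor candidates")
    sampled = rng.sample(pool, needed)
    distractors = [sampled[i:i + 3] for i in range(0, needed, 3)]

    options = [answer] + distractors
    rng.shuffle(options)
    return {
        "stem": TEMPLATE.format(word=target),
        "options": options,
        "answer_index": options.index(answer),
    }
```

For the target word frequent and a ranked list beginning with infrequent, the sketch skips infrequent and takes the next three words as the correct response.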

    3 Evaluation

    The effectiveness of the thesaurus extraction technique was evaluated by comparing its output (the thesaurus produced by the system) to a traditional thesaurus.

    The quality of the generated questions was assessed manually by teachers, and in a task-based evaluation with native and non-native speakers of English.