AutoEval and Missplel: Two Generic Tools for Automatic Evaluation

Johnny Bigert, Linus Ericson, Anton Solis

Nada, KTH, Stockholm, Sweden

Contact: johnny@kth.se

www.nada.kth.se/theory/humanlang/tools.html

Manual evaluation

Time-consuming, tedious, error-prone

Computers are good at repetitive tasks, humans are not

Unavoidable in some situations

Automatic evaluation

Cheap, fast, accurate, easily reproducible

Incorporated in the development of most NLP systems

Automatic evaluation

AutoEval: simplifies the construction of (NLP system) evaluation

Missplel: introduces human-like errors into text

AutoEval

"I write evaluation code myself in all our NLP projects"

"Why would I need AutoEval?"

AutoEval

Our point exactly

Repetition of:

  Input and output file handling
  XML parsing and XML output
  Error handling, malformed input
  Data storage, management and processing

AutoEval

Features – avoids repetition:

  Handles input (XML/structured plain text) and generates output (XML)
  Handles data storage and processing

...and also:

  Generic and extendible script language
  Efficient

AutoEval

Script language:

  Simple C-like syntax
  Powerful
  Modules and macros in repository files
  Extendible, add your own functions

AutoEval

Example of configuration and script language:

<root>
  <files>
    <file format="plain" type="in" name="datafile">TnT.wt</file>
    <file format="xml" type="out" name="outfile">out.xml</file>
  </files>
  <process>
    field(file("datafile"), "\t", "\n", var("word"), var("tag"));
    inc(cnt("tot"));
    inc(cnt(lookup("tag")));
  </process>
  <processonce>
    outputintcon(out("outfile"), cntmap("global"), "global");
  </processonce>
</root>
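For readers unfamiliar with the script language, the counting done by the <process> block above can be approximated in Python. This is a minimal sketch and not AutoEval's implementation: it assumes one word/tag pair per line, separated by a tab, as the field(...) call suggests. The function name count_tags and the sample data are hypothetical.

```python
from collections import Counter


def count_tags(lines):
    """Count all word/tag pairs ("tot") and the occurrences of each tag.

    Assumption: each non-empty line is "word<TAB>tag". A line without a
    tab raises ValueError when unpacked.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        word, tag = line.split("\t", 1)
        counts["tot"] += 1  # plays the role of inc(cnt("tot"))
        counts[tag] += 1    # plays the role of inc(cnt(lookup("tag")))
    return counts


# Hypothetical sample data, not taken from TnT.wt.
sample = ["Letters\tNN2\n", "would\tVM0\n", "be\tVBI\n", "letters\tNN2\n"]
print(count_tags(sample))
```

Note that this sketch stores the total under the same key space as the tags, so a tag literally named "tot" would collide with the total.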

AutoEval

The result:

<evaloutput date="Mon May 26 12:37:39 2003">
<global>
  <var name="tot">14119</var>
  <var name="ab">714</var>
  <var name="ab.kom">44</var>
  <var name="ab.pos">149</var>
  <var name="ab.suv">24</var>
  ...
  <var name="vb.sup.akt">117</var>
  <var name="vb.sup.sfo">35</var>
</global>
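The shape of the output above (an evaloutput element with a date attribute, a global element, and one var element per counter) can be reproduced with Python's standard library. This is a sketch under assumptions: the function name counts_to_xml is hypothetical, and the date format is inferred from the single example shown, not from AutoEval's documentation.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def counts_to_xml(counts, when=None):
    """Serialize a name -> integer mapping in the layout shown above."""
    when = when or datetime.now()
    # Date layout inferred from the example output (an assumption).
    root = ET.Element("evaloutput", date=when.strftime("%a %b %d %H:%M:%S %Y"))
    scope = ET.SubElement(root, "global")
    for name, value in counts.items():
        var = ET.SubElement(scope, "var", name=name)
        var.text = str(value)
    return ET.tostring(root, encoding="unicode")


print(counts_to_xml({"tot": 14119, "ab": 714, "ab.kom": 44}))
```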

Missplel

Missplel is a highly configurable tool to introduce human-like spelling errors

Language, PoS tag set, character set and keyboard layout independent

All you need is a word/tag/lemma dictionary

Missplel

Performance errors – Damerau:

Keyboard mistypes (Damerau, 1964):

  Insertion, deletion, substitution, transposition of letters
  wellcvome, wellcme, wellcpme, wellcmoe

Result:

  a new existing/non-existing word
  word class (PoS tag) change or not
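The four Damerau operations listed above are simple string edits. The sketch below shows each one, plus a helper that picks an operation uniformly at random. It is an illustration only: all function names are hypothetical, and the uniform choice of operation and letter is a simplification that ignores keyboard layout and any confusion matrix.

```python
import random
import string


def insert(word, i, ch):
    """Insert ch before position i."""
    return word[:i] + ch + word[i:]


def delete(word, i):
    """Remove the letter at position i."""
    return word[:i] + word[i + 1:]


def substitute(word, i, ch):
    """Replace the letter at position i with ch."""
    return word[:i] + ch + word[i + 1:]


def transpose(word, i):
    """Swap the letters at positions i and i + 1."""
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def random_damerau_error(word, rng=random, alphabet=string.ascii_lowercase):
    """Apply one randomly chosen operation (uniform choice is an assumption)."""
    ops = ["insert"]
    if word:
        ops += ["delete", "substitute"]
    if len(word) >= 2:
        ops.append("transpose")
    op = rng.choice(ops)
    if op == "insert":
        return insert(word, rng.randrange(len(word) + 1), rng.choice(alphabet))
    if op == "delete":
        return delete(word, rng.randrange(len(word)))
    if op == "substitute":
        return substitute(word, rng.randrange(len(word)), rng.choice(alphabet))
    return transpose(word, rng.randrange(len(word) - 1))


print(insert("welcome", 4, "v"), delete("welcome", 4),
      substitute("welcome", 4, "p"), transpose("welcome", 4))
```

Whether the result is an existing word, and whether its PoS tag changes, would then be decided by a dictionary lookup, which this sketch leaves out.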

Missplel

Competence errors – split compounds:

May alter the semantics of a sentence

  Kycklinglever – chicken liver
  Kyckling lever – chicken is alive

Settings of split compound elements:

  Minimum length?
  Allowed PoS tag?
  Found in dictionary?
  Word class change?
  etc.
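Two of the settings above, minimum length and dictionary membership, are enough to sketch how split points might be enumerated. This is an illustrative sketch, not Missplel's algorithm: the function name, the parameter names, the default values and the toy dictionary are all assumptions, and the PoS tag checks are omitted.

```python
def split_candidates(word, dictionary, min_word_length=6, min_part_length=3):
    """Return every (first, second) split where both parts are dictionary words.

    Words shorter than min_word_length are never split, and both parts must
    have at least min_part_length letters. All parameters are hypothetical.
    """
    lower = word.lower()
    if len(lower) < min_word_length:
        return []
    candidates = []
    for i in range(min_part_length, len(lower) - min_part_length + 1):
        first, second = lower[:i], lower[i:]
        if first in dictionary and second in dictionary:
            candidates.append((first, second))
    return candidates


# Toy dictionary for illustration only.
toy_dictionary = {"kyckling", "lever"}
print(split_candidates("Kycklinglever", toy_dictionary))
```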

Missplel

Competence errors – sound errors:

Letter level, e.g. sound-alike errors

Regular expression rules:

  (.+)ei(.+)    @1ie@2
  receive -> recieve
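The rule above pairs a regular expression with a replacement template in which @1 and @2 refer to the captured groups. A minimal Python sketch of applying such a rule follows. It assumes the pattern must match the whole word and that @ followed by a digit is the only template syntax. The function name apply_rule is hypothetical.

```python
import re


def apply_rule(pattern, template, word, escape="@"):
    """Apply one rewrite rule to a word, or return None if it does not match.

    Assumption: the pattern has to match the entire word.
    """
    match = re.fullmatch(pattern, word)
    if match is None:
        return None
    # Replace each @N in the template with the text of capture group N.
    return re.sub(re.escape(escape) + r"(\d)",
                  lambda ref: match.group(int(ref.group(1))),
                  template)


print(apply_rule(r"(.+)ei(.+)", "@1ie@2", "receive"))
```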

Missplel

Competence errors – syntax errors:

Word/letter level

Form new words from PoS tags, missing/doubled words etc.

Regular expression rules:

<rule ex="slutat skrika - slutat skrikit">
  <match>vb\.sup\.akt(.*) vb\.inf.*</match>
  <to>vb.sup.akt@1 vb.sup.akt</to>
</rule>
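One plausible reading of the rule above is that the match expression is tested against a sequence of adjacent PoS tags, the to template yields the new tags, and each changed word is regenerated from its lemma and new tag. The sketch below implements that reading only. It is an assumption about how such a rule could be applied, not a description of Missplel's internals. The function name, the (word, tag, lemma) token layout, the forms lookup table and the tag vb.inf.akt are hypothetical.

```python
import re


def apply_tag_rule(match_pattern, to_template, tokens, forms):
    """Rewrite the first window of tokens whose tags match the rule.

    tokens: list of (word, tag, lemma) tuples.
    forms:  dict mapping (lemma, tag) to an inflected word form.
    Returns a new token list, or None if the rule cannot be applied.
    """
    width = len(to_template.split(" "))  # number of tags the rule rewrites
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        tag_string = " ".join(tag for _, tag, _ in window)
        match = re.fullmatch(match_pattern, tag_string)
        if match is None:
            continue
        # Expand @N references in the template into captured tag parts.
        new_tags = re.sub(r"@(\d)",
                          lambda ref: match.group(int(ref.group(1))),
                          to_template).split(" ")
        rewritten = []
        for (word, tag, lemma), new_tag in zip(window, new_tags):
            new_word = word if new_tag == tag else forms.get((lemma, new_tag))
            if new_word is None:
                break  # no known form for this lemma and tag; try next window
            rewritten.append((new_word, new_tag, lemma))
        else:
            return tokens[:start] + rewritten + tokens[start + width:]
    return None


# Illustrative data for the example in the rule's "ex" attribute.
tokens = [("slutat", "vb.sup.akt", "sluta"), ("skrika", "vb.inf.akt", "skrika")]
forms = {("skrika", "vb.sup.akt"): "skrikit"}
print(apply_tag_rule(r"vb\.sup\.akt(.*) vb\.inf.*", "vb.sup.akt@1 vb.sup.akt",
                     tokens, forms))
```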

Missplel

Input (word, tag):

  Letters   NN2
  would     VM0
  be        VBI
  welcome   AJ0-NN1

Output (word, tag, error description):

  Litters   NN2   damerau/wordexist-notagchange
  would     VM0   ok
  bee       NN1   sound/wordexist-tagchange
  welcmoe   ERR   damerau/nowordexist-tagchange

Missplel

<input>
  <filename>TnT.wt</filename>
  <expression>([^\t]+)\t([^\t]+)([^\r\n]*).*</expression>
</input>
<output>
  <filename>output.wte</filename>
  <!-- %1% Word, %2% Tag, %3% Lemma, %4% Rest of line, %5% Error descr -->
  <format>%1% %2% %5%</format>
  <description>
    <noError>ok</noError>
    <existingWord>exist</existingWord>
    <nonExistingWord>noexist</nonExistingWord>
    <wordChange>-wordch</wordChange>
    <noWordChange>-nowordch</noWordChange>
    <tagChange>-tagch</tagChange>
    <noTagChange>-notagch</noTagChange>
  </description>
</output>
...
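The input expression and the output format above can be tried out directly in Python. The sketch below parses one input line with the same regular expression and fills the %N% placeholders of the format string. It is an illustration under assumptions: the function names are hypothetical, only the word, tag and error description fields are filled, and the line ending is stripped before matching because the second group would otherwise capture it.

```python
import re

# The expression from the <input> section above.
INPUT_EXPRESSION = re.compile(r"([^\t]+)\t([^\t]+)([^\r\n]*).*")


def format_line(template, fields):
    """Replace each %N% placeholder with fields[N]; unknown N becomes empty."""
    return re.sub(r"%(\d)%", lambda m: fields.get(int(m.group(1)), ""), template)


def process_line(line, error_description="ok", template="%1% %2% %5%"):
    """Parse one input line and render it with the output template.

    Returns None when the line does not match the input expression.
    """
    match = INPUT_EXPRESSION.match(line.rstrip("\r\n"))
    if match is None:
        return None
    fields = {1: match.group(1), 2: match.group(2), 5: error_description}
    return format_line(template, fields)


print(process_line("Letters\tNN2\n"))
```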

Missplel

...
<options>
  <unknownTag>unknown</unknownTag>
  <unknownLemma>unknownLemma</unknownLemma>
  <escapeChar>@</escapeChar>
  <spaceChar> </spaceChar>
  <wordChar>'</wordChar>
  <sentenceSeparatorTag>mad</sentenceSeparatorTag>
  <maxErrorsInSentence>30</maxErrorsInSentence>
  <configDir>felstava/conf/</configDir>
</options>
<wordlist>
  <create>
    <filename>Swedish.cwtl</filename>
    <expression>.+\t([^\t]+)\t([^\t]+)\t+([^\t]+)</expression>
  </create>
  <wordfile>outfile.gz</wordfile>
  <tagfile>tagfile</tagfile>
</wordlist>
...

Missplel

...
<damerau>
  <reportName>damerau</reportName>
  <active>yes</active>
  <probability>10.0</probability>
  <confusionMatrix>confusionfile</confusionMatrix>
  <subst>1</subst>
  <ins>1</ins>
  <del>1</del>
  <transp>1</transp>
  <allowExistingWords>no</allowExistingWords>
  <forceAllowWords>no</forceAllowWords>
  <allowTagChange>yes</allowTagChange>
  <forceAllowTag>no</forceAllowTag>
</damerau>
...
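A configuration section like the one above is plain XML, so its values can be read with Python's standard library. The sketch below loads a shortened copy of the damerau section into a dictionary. The function name and the dictionary keys are hypothetical, and the meaning of the subst, ins, del and transp values is not stated on the slide, so they are kept as raw numbers.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the <damerau> section shown above.
DAMERAU_XML = """<damerau>
  <reportName>damerau</reportName>
  <active>yes</active>
  <probability>10.0</probability>
  <subst>1</subst>
  <ins>1</ins>
  <del>1</del>
  <transp>1</transp>
  <allowExistingWords>no</allowExistingWords>
  <allowTagChange>yes</allowTagChange>
</damerau>"""


def load_damerau_settings(xml_text):
    """Read the section into a dict, mapping yes/no values to booleans."""
    node = ET.fromstring(xml_text)

    def text(tag):
        return (node.findtext(tag) or "").strip()

    return {
        "report_name": text("reportName"),
        "active": text("active") == "yes",
        "probability": float(text("probability")),
        # Raw values; their interpretation is not given in the source.
        "operation_values": {op: float(text(op))
                             for op in ("subst", "ins", "del", "transp")},
        "allow_existing_words": text("allowExistingWords") == "yes",
        "allow_tag_change": text("allowTagChange") == "yes",
    }


print(load_damerau_settings(DAMERAU_XML))
```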

Missplel

...
<splitCompound>
  <reportName>split</reportName>
  <active>no</active>
  <probability>99.0</probability>
  <splitUnknownWords>yes</splitUnknownWords>
  <splitThreshold>50</splitThreshold>
  <minWordLength>6</minWordLength>
  <minSplitWordLength>3</minSplitWordLength>
  <factors>
    <wordLength>1</wordLength>
    <inDictionaryFirst>10</inDictionaryFirst>
    <inDictionarySecond>10</inDictionarySecond>
    <tagAllowed>10</tagAllowed>
    <tagMatchFirst>0</tagMatchFirst>
    <tagMatchSecond>15</tagMatchSecond>
  </factors>
</splitCompound>
...

Missplel

...
<soundError>
  <reportName>sound</reportName>
  <active>no</active>
  <filename>sound.test</filename>
  <probability>100.0</probability>
  <expression>(.+)\t(.+)\t(.+)</expression>
  <allowExistingWords>yes</allowExistingWords>
  <forceAllowWords>no</forceAllowWords>
  <allowTagChange>yes</allowTagChange>
  <forceAllowTag>no</forceAllowTag>
</soundError>
...

Missplel

...
<syntaxError>
  <reportName>introduced</reportName>
  <active>no</active>
  <filename>error.rules</filename>
  <probability>100.0</probability>
  <allowExistingWords>yes</allowExistingWords>
  <forceAllowWords>no</forceAllowWords>
  <allowTagChange>yes</allowTagChange>
  <forceAllowTag>no</forceAllowTag>
</syntaxError>

Applications

AutoEval has been used to evaluate

  Parsers
  PoS taggers
  PoS majority/ensemble tagging

Missplel has been used to evaluate

  Spell checkers
  Grammar checkers
  Robustness of parsers and taggers

Licence

AutoEval and Missplel are open source under the GNU General Public Licence

Source code available at www.nada.kth.se/theory/humanlang/tools.html
