MinorThird: University of Seoul, Artificial Intelligence Laboratory, Kwak Byeol-saem (semix2@naver.com)

MinorThird

A collection of Java classes for:
- Storing text
- Annotating text
- Learning to extract entities
- Categorizing text

What's Different About MinorThird

Differs from existing NLP and learning toolkits:

- Combines tools for annotating and visualizing text with state-of-the-art learning methods
- Contains methods to visualize both training data and the performance of classifiers
  - Facilitates debugging
- Integrated with text manipulation tools
  - Possible to track and visualize the transformation of text data into machine learning data
- Architected to support active learning and on-line learning
  - Should facilitate integration of learning methods into agents

Components

TextBase
- A collection of documents

TextLabels
- Logical assertions about documents in a TextBase
- A type of stand-off annotation
  - The annotations are completely independent of the text
- Assert a category or property for a word, a document, or a subsequence of words (a span)
- Produced by human labelers or by a learned program
- Can encode
  - syntactic properties, such as shallow parses or POS tags
  - semantic properties, such as the functional role that entities play in a sentence
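To make the stand-off idea concrete, here is a minimal sketch in Java. The class and method names (`StandoffDemo`, `Labels`, `addSpan`, `spansOfType`) are hypothetical and chosen only for illustration; they are not MinorThird's actual API. The point is that labels refer to token offsets, so the text itself is never modified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of stand-off annotation; these are NOT MinorThird's classes.
final class StandoffDemo {
    /** A labeled span: token offsets [start, end) into one document, plus a type. */
    static final class SpanLabel {
        final String docId;
        final int start;
        final int end;
        final String type;

        SpanLabel(String docId, int start, int end, String type) {
            this.docId = docId;
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    /** Holds assertions about documents without touching the documents. */
    static final class Labels {
        private final List<SpanLabel> spans = new ArrayList<>();

        void addSpan(String docId, int start, int end, String type) {
            spans.add(new SpanLabel(docId, start, end, type));
        }

        List<SpanLabel> spansOfType(String type) {
            List<SpanLabel> out = new ArrayList<>();
            for (SpanLabel s : spans) {
                if (s.type.equals(type)) out.add(s);
            }
            return out;
        }
    }

    /** Recover the text of a span from the unmodified token array. */
    static String textOf(String[] tokens, SpanLabel s) {
        return String.join(" ", Arrays.copyOfRange(tokens, s.start, s.end));
    }

    public static void main(String[] args) {
        String[] tokens = "Alice Smith visited Seoul".split(" ");
        Labels labels = new Labels();
        labels.addSpan("doc1", 0, 2, "person"); // e.g. asserted by a human labeler
        labels.addSpan("doc1", 0, 1, "NNP");    // e.g. asserted by a POS tagger
        for (SpanLabel s : labels.spansOfType("person")) {
            System.out.println(s.type + ": " + textOf(tokens, s));
        }
    }
}
```

Because the two labels overlap on the same tokens without conflict, syntactic and semantic annotations can coexist over one document.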

Components

Repository
- Annotated TextBases are accessed in a single uniform way, even though they may be stored in one of several schemes
- A Repository can be configured to hold a set of TextLabels and their associated TextBases
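The "uniform access over several storage schemes" idea can be sketched as an interface with interchangeable implementations. Everything below (`RepositoryDemo`, `CorpusSource`, `register`, `load`) is a hypothetical simplification for illustration, not MinorThird's Repository API; it also stores only documents, whereas the real Repository pairs TextLabels with TextBases.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names are illustrative and are NOT MinorThird's Repository API.
final class RepositoryDemo {
    /** Uniform access: callers load documents the same way however they are stored. */
    interface CorpusSource {
        String[] loadDocuments();
    }

    /** One storage scheme: documents held as separate in-memory strings. */
    static final class InMemorySource implements CorpusSource {
        private final String[] docs;

        InMemorySource(String... docs) {
            this.docs = docs.clone();
        }

        public String[] loadDocuments() {
            return docs.clone();
        }
    }

    /** Another storage scheme: a single blob with one document per line. */
    static final class LineDelimitedSource implements CorpusSource {
        private final String blob;

        LineDelimitedSource(String blob) {
            this.blob = blob;
        }

        public String[] loadDocuments() {
            return blob.split("\n");
        }
    }

    /** Maps a key to a source, hiding the storage scheme from the caller. */
    static final class Repository {
        private final Map<String, CorpusSource> sources = new HashMap<>();

        void register(String key, CorpusSource source) {
            sources.put(key, source);
        }

        String[] load(String key) {
            CorpusSource source = sources.get(key);
            if (source == null) {
                throw new IllegalArgumentException("unknown key: " + key);
            }
            return source.loadDocuments();
        }
    }

    public static void main(String[] args) {
        Repository repo = new Repository();
        repo.register("memos", new InMemorySource("first memo", "second memo"));
        repo.register("mail", new LineDelimitedSource("first mail\nsecond mail"));
        for (String key : new String[] {"memos", "mail"}) {
            System.out.println(key + ": " + repo.load(key).length + " documents");
        }
    }
}
```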

Mixup (MinorThird Information eXtraction and Understanding Program)
- A special-purpose annotation language
- Moderately complex hand-coded annotation programs can be implemented with Mixup
- Based on the widely used notion of cascaded finite-state transducers
- Includes some powerful features
  - A GUI debugging environment
  - Escape to Java
  - A kind of subroutine call mechanism
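The cascade idea, where each stage reads the annotations produced by the previous one, can be illustrated in plain Java. This is a hypothetical two-stage sketch and not Mixup syntax or MinorThird code: stage 1 marks capitalized tokens, and stage 2 looks only at those marks to propose multi-token "name" spans. The names `CascadeDemo`, `markCapitalized`, and `findNameSpans` are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical two-stage annotation cascade; NOT Mixup syntax or MinorThird API.
final class CascadeDemo {
    /** Stage 1: mark each token as capitalized or not. */
    static boolean[] markCapitalized(String[] tokens) {
        boolean[] cap = new boolean[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            cap[i] = !tokens[i].isEmpty() && Character.isUpperCase(tokens[i].charAt(0));
        }
        return cap;
    }

    /**
     * Stage 2: reads only the stage-1 marks and emits [start, end) offsets
     * for each maximal run of two or more capitalized tokens.
     */
    static List<int[]> findNameSpans(boolean[] cap) {
        List<int[]> spans = new ArrayList<>();
        int i = 0;
        while (i < cap.length) {
            if (!cap[i]) {
                i++;
                continue;
            }
            int start = i;
            while (i < cap.length && cap[i]) {
                i++;
            }
            if (i - start >= 2) {
                spans.add(new int[] {start, i});
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        String[] tokens = "yesterday Alice Smith visited Seoul".split(" ");
        boolean[] cap = markCapitalized(tokens);
        for (int[] span : findNameSpans(cap)) {
            System.out.println("name span: tokens " + span[0] + " to " + (span[1] - 1));
        }
    }
}
```

Keeping each stage small and dependent only on earlier output is what lets moderately complex annotators be assembled from simple rules.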