RDIET: Recognition and Discovery of Information from Extracted Text

Prakhyath Rai

III Semester M.Tech. (CSE), NMAMIT, Nitte

Under the guidance of:

Mr. Vijay Murari

Asst. Professor, Dept. of CSE, NMAMIT, Nitte

Text Mining Framework


Outline

Introduction

I/O Model for Text Mining

Literature Survey

Problem Statement

Architecture Diagram

Filtering Process

Screenshots

References

CONTENTS

Department of CSE, NMAMIT, Nitte


Introduction

Text mining is a discovery process.

Text mining is used to extract relevant information, knowledge, or patterns from sources that are in unstructured form.

• Text Mining: handles semi-structured or unstructured data

• Data Mining: handles structured data


Introduction Cont.

Extract and discover knowledge hidden in text automatically

Aid domain experts by automatically:

• identifying concepts

• extracting facts/relations

• discovering implicit links

• generating hypotheses


Input-Output Model for Text Mining

Input: Documents

Text Mining Technique

Output: Patterns, Connections, Trends
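The input-output model above can be sketched in a few lines. This is a minimal illustration, not the RDIET implementation: the function name `mine`, the stop-word list, and the threshold of two documents are all assumptions made for the example. Here "patterns" are terms that recur across documents, "connections" are term pairs that co-occur in more than one document, and "trends" are per-year term counts.

```python
from collections import Counter
from itertools import combinations

# Assumed stop-word list, kept tiny for the example.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "for", "from", "is", "to"}

def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    words = (w.strip(".,;:!?()").lower() for w in text.split())
    return [w for w in words if w and w not in STOPWORDS]

def mine(documents, min_docs=2):
    """documents: list of (year, text) pairs.

    Returns (patterns, connections, trends):
      patterns    - terms found in at least min_docs documents
      connections - term pairs co-occurring in at least min_docs documents
      trends      - {year: Counter of document frequency per term}
    """
    term_counts = Counter()
    pair_counts = Counter()
    trends = {}
    for year, text in documents:
        terms = set(tokenize(text))  # count each term once per document
        term_counts.update(terms)
        pair_counts.update(combinations(sorted(terms), 2))
        trends.setdefault(year, Counter()).update(terms)
    patterns = [t for t, n in term_counts.items() if n >= min_docs]
    connections = [p for p, n in pair_counts.items() if n >= min_docs]
    return patterns, connections, trends

# Input: documents (hypothetical titles with a year attached).
docs = [
    (2012, "Pattern discovery for text mining"),
    (2012, "Techniques for text mining"),
    (2014, "Pattern and cluster mining in text data"),
]
patterns, connections, trends = mine(docs)
print(sorted(patterns))
print(sorted(connections))
print(trends[2012]["mining"], trends[2014]["mining"])
```

Counting each term once per document (document frequency) keeps a single long document from dominating the output; a real system would likely add stemming and a proper stop-word list.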


Literature Survey

Dramatic growth rate of digital data


Literature Survey Cont.

Information Extraction (IE)

Knowledge Discovery from Databases (KDD)

Knowledge Discovery from Text (KDT)

High specificity, i.e. the low-frequency problem

Misinterpretation of low-frequency patterns
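The low-frequency problem can be made concrete with a small calculation. The sketch below is an illustration under assumed data (the four toy documents and the helper names `support` and `best_support_by_length` are hypothetical): it measures support as the fraction of documents containing every term of a pattern, and shows that the best support achievable drops as patterns get longer, that is, more specific.

```python
from itertools import combinations

def support(pattern, docs):
    """Fraction of documents that contain every term in the pattern."""
    hits = sum(1 for d in docs if pattern <= d)  # subset test
    return hits / len(docs)

def best_support_by_length(docs, max_len=3):
    """For each pattern length, the highest support any pattern of that length reaches."""
    vocab = sorted(set().union(*docs))
    return {
        k: max(support(frozenset(c), docs) for c in combinations(vocab, k))
        for k in range(1, max_len + 1)
    }

# Toy corpus: each document is reduced to its set of terms.
docs = [
    frozenset({"text", "mining", "pattern"}),
    frozenset({"text", "mining", "cluster"}),
    frozenset({"text", "summarization"}),
    frozenset({"data", "mining"}),
]
result = best_support_by_length(docs)
print(result)
```

Because any document containing a long pattern also contains all of its sub-patterns, support can only stay equal or fall as terms are added. A highly specific pattern therefore rests on few documents, which is why conclusions drawn from it are easy to misinterpret.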


Problem Statement

RDIET (Recognition and Discovery of Information from Extracted Text) demonstrates a framework for text mining.

RDIET brings together:

• IE

• KDT

• Standard Rule Induction
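One way IE and rule induction could fit together is sketched below. This is an assumption for illustration, not the RDIET design: the slot vocabulary `SLOT_TERMS`, the keyword-lookup extractor, and the single-antecedent association rules standing in for "standard rule induction" are all hypothetical choices. The IE step turns free text into structured slot=value items, and the induction step looks for rules that hold across the extracted records.

```python
from itertools import permutations

# Hypothetical slot vocabulary; a real IE system would learn or curate this.
SLOT_TERMS = {
    "language": {"java", "python", "sql"},
    "area": {"web", "database"},
}

def extract(text):
    """Toy IE step: turn free text into a set of slot=value items."""
    words = {w.strip(".,").lower() for w in text.split()}
    return {f"{slot}={w}" for slot, terms in SLOT_TERMS.items() for w in words & terms}

def induce_rules(records, min_support=0.5, min_confidence=0.8):
    """Toy rule induction: rules 'a -> b' as (a, b, support, confidence) tuples."""
    n = len(records)
    items = sorted(set().union(*records))
    rules = []
    for a, b in permutations(items, 2):
        both = sum(1 for r in records if a in r and b in r)
        has_a = sum(1 for r in records if a in r)
        if has_a and both / n >= min_support and both / has_a >= min_confidence:
            rules.append((a, b, both / n, both / has_a))
    return rules

# Hypothetical input texts.
texts = [
    "Database developer with SQL experience.",
    "SQL and Java for database work.",
    "Web developer, Python.",
    "Database administrator, SQL.",
]
records = [extract(t) for t in texts]  # IE: text -> structured records
rules = induce_rules(records)          # discovery: records -> rules
for a, b, sup, conf in rules:
    print(f"{a} -> {b}  (support={sup:.2f}, confidence={conf:.2f})")
```

The support threshold is what drops rare items such as `area=web`: they appear in too few records to back a rule.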


RDIET Architecture

[Architecture diagram. Components shown: Specification, Information Retrieval, Pre-Processing, Selection, Refinement, Knowledge Extraction, Background Knowledge, and two databases (DB). Input documents: .pdf, .docx, .txt]


Filtering Process

[Diagram: Documents Browsed from System pass through the Filtering Process to give Screened Documents. File types shown: .pdf, .doc, .docx, .txt, .html, .jpg, .mpeg, .mp4, ….]
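The filtering step can be sketched as a simple extension check. This is a minimal sketch under an assumption: the slide does not say which file types survive screening, so the whitelist below uses the three document types listed on the architecture slide (.pdf, .docx, .txt). The names `TEXT_EXTENSIONS` and `screen` are hypothetical.

```python
from pathlib import Path

# Assumed whitelist, taken from the document types on the architecture slide.
TEXT_EXTENSIONS = {".pdf", ".docx", ".txt"}

def screen(paths, allowed=TEXT_EXTENSIONS):
    """Keep only paths whose extension (case-insensitive) is in the allowed set."""
    return [p for p in paths if Path(p).suffix.lower() in allowed]

# Hypothetical file names browsed from the system.
browsed = ["report.pdf", "notes.TXT", "photo.jpg", "thesis.docx", "clip.mp4", "page.html"]
screened = screen(browsed)
print(screened)
```

Checking the extension alone is cheap but trusts the file name; whether formats such as .doc or .html should also pass is a policy choice that only requires changing the `allowed` set.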


Screenshots


References

[1] Ning Zhong, Yuefeng Li and Sheng-Tang Wu, “Effective Pattern Discovery for Text Mining”, IEEE Transactions on Knowledge and Data Engineering, Vol. 24, No. 1, January 2012.

[2] Sangno Lee, Jeff Baker and Jaeki Song, “An Empirical Comparison of Four Text Mining Methods”, Proceedings of the 43rd Hawaii International Conference on System Sciences, 2010.

[3] Gary King, Patrick Lam and Margaret E. Roberts, “Computer-Assisted Keyword and Document Set Discovery from Unstructured Text”, 2014.

[4] Deepak Agnihotri, Kesari Verma and Priyanka Tripathi, “Pattern and Cluster Mining on Text Data”, Fourth International Conference on Communication Systems and Network Technologies, 2014.

[5] Robert Moro and Maria Bielikova, “Personalized Text Summarization Based on Important Terms Identification”, 23rd International Workshop on Database and Expert Systems Applications, 2012.

[6] M. Sukanya and S. Biruntha, “Techniques on Text Mining”, IEEE Conference on Advanced Communication Control and Computing Technologies, 2012.

[7] Christina Feilmayr, “Text Mining-Supported Information Extraction”, 22nd International Workshop on Database and Expert Systems Applications, 2012.

[8] Intel, “Extract, Transform, and Load Big Data with Apache Hadoop”, http://hadoop.intel.com (see also intel.com/bigdata, intel.com/microservers).

[9] Raymond J. Mooney and Un Yong Nahm, “Text Mining with Information Extraction”, Proceedings of the 4th International MIDP Colloquium, pages 141-160, Van Schaik Pub., South Africa, 2005.

[10] R. Baeza-Yates and B. Ribeiro-Neto, “Modern Information Retrieval”, ACM Press, New York, 1999.
