Yogesh Benawat, Sameer Deshmukh

Data Mining and Data Warehousing

Paper Presentation for Data Mining and Data Warehousing.

Page 1: Data Mining and Data Warehousing

Yogesh Benawat, Sameer Deshmukh

Page 2: Data Mining and Data Warehousing

Outline

Data Mining

Data Warehousing

Q ‘n’ A

Conclusion

Page 3: Data Mining and Data Warehousing

Historical Perspective

1960s: Data collection, database creation, IMS and network DBMS

1970s: Relational data model, relational DBMS implementation

1980s: RDBMS, advanced data models (extended-relational, OO, deductive, etc.) and application-oriented DBMS (spatial, scientific, engineering, etc.)

1990s-2000s: Data mining and data warehousing, multimedia databases, and Web databases

Page 4: Data Mining and Data Warehousing

Data Mining

Page 5: Data Mining and Data Warehousing

Definition

Data mining automates the process of locating and extracting hidden patterns and knowledge from data.

In simple words: searching for new knowledge.

Page 6: Data Mining and Data Warehousing

Why we need data mining

Data explosion problem

Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories

We are drowning in data, but starving for knowledge!

Solution: Data mining

Data warehousing and on-line analytical processing

Extraction of interesting knowledge (rules, regularities, patterns, constraints) from data in large databases

Page 7: Data Mining and Data Warehousing

Data Mining Models

Predictive Model

Descriptive Model

Page 8: Data Mining and Data Warehousing

Predictive Model

Prediction: determining how certain attributes will behave in the future

Regression: mapping a data item to a real-valued prediction variable

Classification: categorizing data based on combinations of attributes

Time series analysis: examining the values of attributes with respect to time
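As a rough illustration of regression used for prediction, the sketch below fits a straight line to a toy series by ordinary least squares and extrapolates one step ahead. The month and sales figures, and the names `fit_line` and `predict`, are hypothetical and not taken from the presentation.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Map a data item x to a real-valued prediction."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical toy data: sales observed over five months.
months = [1, 2, 3, 4, 5]
sales = [12.0, 14.0, 16.0, 18.0, 20.0]

model = fit_line(months, sales)
forecast = predict(model, 6)  # predicted sales for month 6
print(forecast)
```

Because the input is ordered by time, the same fit also serves as a very simple form of time series analysis; real series usually need models that handle seasonality and noise.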

Page 9: Data Mining and Data Warehousing

Descriptive Model

Clustering: grouping the most closely related data items together into clusters

Data summarization: extracting representative information about the database

Association rules: associations defined between data items to form relationships

Sequence discovery: determining sequential patterns in data based on the time sequence of actions
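To make association rules concrete, here is a minimal sketch that computes the two usual measures of a rule, support and confidence, over a handful of market-basket transactions. The transactions and the helper names `support` and `confidence` are assumptions chosen for illustration.

```python
# Hypothetical market-basket data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of the whole rule divided by support of the antecedent."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)


# Rule under test: bread -> milk
sup = support({"bread", "milk"}, transactions)
conf = confidence({"bread"}, {"milk"}, transactions)
print(sup, conf)
```

A rule is normally reported only when both measures exceed thresholds chosen by the analyst.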

Page 10: Data Mining and Data Warehousing

Data mining process

Problem Definition

Creating Database

Exploring database

Preparation for creating a data mining model

Building Data Mining Model

Evaluation Phase

Deploying the Data Mining model

Fig. General Phases of Data Mining Process
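The phases above can be sketched as a small pipeline, one function per phase. Everything here is hypothetical: the toy customer table, the loyalty rule hidden in it, and the single-threshold model are assumptions meant only to show how the phases hand results to one another.

```python
def create_database():
    """Phase 2: assemble a toy table of (monthly_spend, is_loyal) rows."""
    return [(i * 0.5, i * 0.5 > 60) for i in range(200)]


def explore(rows):
    """Phase 3: look at simple per-class statistics."""
    loyal = [spend for spend, label in rows if label]
    others = [spend for spend, label in rows if not label]
    return {"loyal_mean": sum(loyal) / len(loyal),
            "other_mean": sum(others) / len(others)}


def prepare(rows):
    """Phase 4: split rows into training and test sets (even/odd rows)."""
    return rows[0::2], rows[1::2]


def build_model(train):
    """Phase 5: pick the spend threshold with the fewest training errors."""
    def errors(threshold):
        return sum((spend > threshold) != label for spend, label in train)
    return min((spend for spend, _ in train), key=errors)


def evaluate(threshold, test):
    """Phase 6: accuracy of the threshold rule on held-out rows."""
    correct = sum((spend > threshold) == label for spend, label in test)
    return correct / len(test)


def deploy(threshold):
    """Phase 7: wrap the model as a callable for new customers."""
    return lambda spend: spend > threshold


# Phase 1 (problem definition): predict loyalty from monthly spend.
rows = create_database()
stats = explore(rows)
train, test = prepare(rows)
threshold = build_model(train)
accuracy = evaluate(threshold, test)
predict_loyal = deploy(threshold)
print(threshold, accuracy, predict_loyal(75))
```

In practice the phases are iterative: a poor evaluation result usually sends the analyst back to data preparation or even to the problem definition.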

Page 11: Data Mining and Data Warehousing

Who needs data mining?

"Whoever has information fastest and uses it wins."

Don McKeough, former president of Coca-Cola

Businesses are looking for new ways to let end users find the data they need to:

Make decisions

Serve customers

Gain the competitive edge

Page 12: Data Mining and Data Warehousing

Applications

Business analysis and management

Computer security

Customer relationship analysis and management

Telecommunication analysis and management

News and entertainment

Bioinformatics and healthcare analysis

Page 13: Data Mining and Data Warehousing

Summary

Need for data mining

Data mining models

Process of data mining

Some applications

Page 14: Data Mining and Data Warehousing

Data Warehousing

Page 15: Data Mining and Data Warehousing

Data Warehousing: Data Warehouse

What is a data warehouse?

Database and data warehouse: how to distinguish?

Purpose

Database: transactional

Data warehouse: intended for decision-support applications

Functionality

Optimized for data retrieval, not routine transaction processing

Structure

Performance
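The difference in purpose can be sketched with Python's built-in `sqlite3` module: a transactional workload touches one row at a time, while a decision-support workload reads and aggregates many rows. The `sales` table, its columns, and the figures are hypothetical, and a single SQLite table stands in for both systems only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (region, year, amount) VALUES (?, ?, ?)",
    [("North", 2004, 100.0), ("North", 2005, 150.0),
     ("South", 2004, 80.0), ("South", 2005, 120.0)],
)

# Transactional style: a short update that touches a single row.
conn.execute("UPDATE sales SET amount = amount + 10 WHERE id = ?", (1,))
conn.commit()

# Decision-support style: a read-only query that aggregates over all rows.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

A warehouse is typically organized and indexed so that queries of the second kind run fast, at the cost of slower row-by-row updates.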

Page 16: Data Mining and Data Warehousing

Data Warehousing: What does a modern organization need?

Companies are spread worldwide and have:

Many data sources

Different operational systems

Different schemas

They need data for:

Complex analysis

Knowledge discovery

Decision making

Solution?

Page 17: Data Mining and Data Warehousing

Data Warehousing: The solution is a data warehouse

Definition? There is no single definition.

Data warehouse: a collection of information gathered from multiple sources, stored under a unified schema at a single site, and mainly intended for decision-support applications.

"A subject-oriented, integrated, nonvolatile, time-variant collection of data in support of management's decisions." (W. H. Inmon)
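A minimal sketch of "multiple sources, stored under a unified schema": two source systems describe customers with different field names, and a small mapping function per source converts each record into one common layout before loading. The source names, field names, and records are all hypothetical.

```python
# Two hypothetical operational sources with different schemas.
crm_rows = [{"cust_name": "Asha Rao", "city": "Pune"}]
billing_rows = [
    {"customer": "R. Mehta", "location": "Mumbai", "amount_due": 250.0},
]

# The unified schema used at the warehouse (an assumption for this sketch).
UNIFIED_FIELDS = ("name", "city", "source")


def from_crm(row):
    """Map a CRM record onto the unified schema."""
    return {"name": row["cust_name"], "city": row["city"], "source": "crm"}


def from_billing(row):
    """Map a billing record onto the unified schema."""
    return {"name": row["customer"], "city": row["location"],
            "source": "billing"}


# Load step: every record ends up with the same fields at a single site.
warehouse = ([from_crm(r) for r in crm_rows]
             + [from_billing(r) for r in billing_rows])

for record in warehouse:
    print(record)
```

Keeping a `source` field is one simple way to trace each warehouse record back to the system it came from.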

Page 18: Data Mining and Data Warehousing

Warehouses are Very Large Databases

[Figure: bar chart of the percentage of respondents (0% to 35%) by warehouse size (5GB, 5-9GB, 10-19GB, 20-49GB, 50-99GB, 100-249GB, 250-499GB, 500GB-1TB), comparing initial and projected 2Q96 sizes. Source: META Group, Inc.]

Page 19: Data Mining and Data Warehousing

Data Warehousing: Data Warehouse Architecture

[Figure: data warehouse architecture. Data Source 1, Data Source 2, ..., Data Source n pass through data loaders into the data warehouse DBMS, which serves data mining, OLAP and DSS tools.]

Page 20: Data Mining and Data Warehousing

Data Warehousing: Building a data warehouse

When and how to gather data

Source-driven architecture

Destination-driven architecture

What schema to use

Data cleansing: the task of correcting and preprocessing data

How to propagate updates

What data to summarize

And many more...
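Two of these building issues, data cleansing and summarization, can be sketched in a few lines. The raw records below are hypothetical, and the cleansing rules (trim and normalize city names, discard rows whose amount cannot be parsed) are assumptions; a real warehouse would apply far richer rules.

```python
from collections import defaultdict

# Hypothetical raw records as they might arrive from source systems.
raw = [
    {"city": " pune ", "amount": "100"},
    {"city": "Pune", "amount": "50.5"},
    {"city": "MUMBAI", "amount": "n/a"},  # amount cannot be parsed
    {"city": "Mumbai", "amount": "200"},
]


def cleanse(record):
    """Return a corrected record, or None if it cannot be repaired."""
    try:
        amount = float(record["amount"])
    except ValueError:
        return None
    return {"city": record["city"].strip().title(), "amount": amount}


# Cleansing: keep only the records that could be corrected.
clean = [c for c in map(cleanse, raw) if c is not None]

# Summarization: store one total per city instead of every raw row.
summary = defaultdict(float)
for rec in clean:
    summary[rec["city"]] += rec["amount"]

print(dict(summary))
```

Storing summaries like this trades detail for space and query speed, which is why deciding what to summarize is part of warehouse design.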

Page 21: Data Mining and Data Warehousing

Summary

What is data warehousing?

Data warehouse

Data warehouse architecture

Data warehouse vs. data mining

Page 22: Data Mining and Data Warehousing

Conclusion

Your data is full of undiscovered gems; start digging!

Page 24: Data Mining and Data Warehousing

Q ‘n’ A

Page 25: Data Mining and Data Warehousing

Thank You!