Data Warehouse (DW) & On-line Analytic Processing (OLAP) Rev: Feb, 2012 Euiho (David) Suh, Ph.D. POSTECH Strategic Management of Information and Technology Laboratory (POSMIT: http://posmit.postech.ac.kr) Dept. of Industrial & Management Engineering, POSTECH


Page 1: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Rev: Feb, 2012

Euiho (David) Suh, Ph.D.

POSTECH Strategic Management of Information and Technology Laboratory (POSMIT: http://posmit.postech.ac.kr)

Dept. of Industrial & Management Engineering, POSTECH

Page 2: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Contents

※ Discussion Questions

1 Data Warehouse
  1) Introduction of Data Warehouse
  2) Concepts for Data Warehouse
  3) Difficulties and Trends

2 On-line Analytic Processing (OLAP)
  1) Introduction of OLAP
  2) Concepts for OLAP

3 Case Study

Page 3: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Discussion Questions

■ What are the differences among a Database, a Data Warehouse, and a Data Mart?

■ What is the core difference between DBMS and MBMS in their functionalities?

■ What are the benefits and limitations of the relational database model for business applications today?

■ What do you think is the major reason firms use OLAP?

Page 4: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Definition of Data Warehouse
1. Data Warehouse, 1) Introduction of Data Warehouse

■ Data Warehouse
  – Stores static data that has been extracted from other databases in an organization
  – Central source of data that has been cleaned, transformed, and cataloged
  – Data is used for data mining, analytical processing, analysis, research, decision support

A data warehouse is a collection of data in support of management's decisions.
Characteristics: Integrated, Non-volatile, Time variant

[Figure: Scattered Information; Cleaned Data Warehouse; Query & Distribute to End User. Chart labels: Sales, HR, Cost, Finance, Bond, Customer]
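The definition above (data extracted from other databases, cleaned, and kept as an integrated, non-volatile, time-variant collection) can be illustrated with a minimal sketch. All names here (`sales_source`, `finance_source`, `clean_name`, `load_snapshot`) and all rows are hypothetical, made up for illustration; real warehouses use dedicated ETL tooling.

```python
from datetime import date

# Hypothetical extracts from two operational systems (illustrative data only).
sales_source = [{"cust": " kim ", "amount": "50"}, {"cust": "LEE", "amount": "40"}]
finance_source = [{"customer_name": "Kim", "cost": 12.5}]


def clean_name(raw):
    """Cleaning step: trim whitespace and normalize capitalization."""
    return raw.strip().title()


def load_snapshot(warehouse, snapshot_date):
    """Append cleaned rows from both sources in one common layout (integrated).

    Rows are only appended, never changed (non-volatile), and each row
    carries its snapshot date (time variant).
    """
    for row in sales_source:
        warehouse.append({
            "snapshot": snapshot_date,
            "subject": "Sales",
            "customer": clean_name(row["cust"]),
            "value": float(row["amount"]),
        })
    for row in finance_source:
        warehouse.append({
            "snapshot": snapshot_date,
            "subject": "Cost",
            "customer": clean_name(row["customer_name"]),
            "value": float(row["cost"]),
        })
    return warehouse


warehouse = []
load_snapshot(warehouse, date(2012, 1, 31))
load_snapshot(warehouse, date(2012, 2, 29))
print(len(warehouse), "rows;", sorted({r["customer"] for r in warehouse}))
```

Loading twice keeps both monthly snapshots side by side, which is what lets later analysis compare periods.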

Page 5: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Data Warehouse Architecture
1. Data Warehouse, 1) Introduction of Data Warehouse

■ Data Warehouse architecture

[Figure: data warehouse architecture diagram. Recoverable labels:
  – Phases: * Building the Data Warehouse; * Use of Data Warehouse
  – Source Data: External file, OLTP System, Back up file
  – Storage: Data Warehouse, Data Mart; MDB, RDB
  – Servers: Enterprise server, Workgroup server
  – Tools: Query, Reporting tool; OLAP tool; Datamining Application; EIS/DSS Application; Web browser; Slice/Dice
  – Connections labeled SQL
  – Layers: Infra, Data integration and Administration; Application development, Data access & Use]

Page 6: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Data Warehouse Architecture
1. Data Warehouse, 1) Introduction of Data Warehouse

■ Technical architecture for a data warehousing system

[Figure: technical architecture diagram. Recoverable labels:
  – Components: Design Component, Data Acquisition Component, Data Manager Component, Management Component, Information Directory Component, Data Access Component, Middleware Component, Data Delivery Component
  – Data: source data, external data, external metadata, warehouse data, warehouse metadata]

Page 7: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Introduction of Database
1. Data Warehouse, 2) Concepts for Data Warehouse

■ Definition of database
  – Integrated collection of logically related data elements

■ Common Database Structures (Types)
  – Hierarchical
    • Early DBMS structure
    • Records arranged in tree-like structure
    • Relationships are one-to-many
  – Network
    • Used in some mainframe DBMS packages
    • Many-to-many relationships
  – Relational
    • Most widely used structure
    • Data elements are stored in tables
    • Row represents a record; column is a field
    • Can relate data in one file with data in another, if both files share a common data element
  – Multidimensional
    • Variation of relational model
    • Uses multidimensional structures to organize data
    • Data elements are viewed as being in cubes
    • Popular for analytical databases that support Online Analytical Processing (OLAP)
  – Object-Oriented
    • Store data together with the appropriate methods for accessing it, i.e. encapsulation
    • Information is represented in the form of objects as used in object-oriented programming

[Figure: Relational Structure; Object-Oriented Structure]
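The relational bullet "can relate data in one file with data in another, if both files share a common data element" can be sketched with Python's standard `sqlite3` module. The `customer` and `orders` tables and their rows are hypothetical, chosen only to show a join on the shared `customer_id` field.

```python
import sqlite3

# Hypothetical tables and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Kim'), (2, 'Lee');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 25.0), (12, 2, 40.0);
""")

# Each row is a record and each column a field; customer_id is the
# common data element that relates the two tables.
joined = conn.execute("""
    SELECT c.name, o.order_id, o.amount
    FROM customer AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

for row in joined:
    print(row)
```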

Page 8: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Metadata and Data Marts
1. Data Warehouse, 2) Concepts for Data Warehouse

■ Metadata
  – Data about data (similar to a catalog card in a library)
  – Defines the data in the data warehouse
  – Makes data in the data warehouse easier and faster to find

■ Data Marts
  – Collection of databases
  – Compared with a Data Warehouse, data marts are usually smaller and focus on a particular subject or department.
  – Data marts are subsets of a larger Data Warehouse

■ Data Warehouse vs. Data Mart
  – Data in Data Warehouse
    • The data needs to be gathered from all the relevant transactional systems that produce it, cleansed and validated, and made available from a system-of-record that ensures the referential integrity of the data
  – Data in Data Mart
    • The data needs to be presented in a structure that is intuitive to the users and facilitates their ability to query the data that is relevant to their needs
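As a rough illustration of the two ideas on this slide, the sketch below builds a small data mart as a subject-focused subset of a warehouse table, then reads metadata ("data about data") from SQLite's built-in catalog. The table names `warehouse_facts` and `sales_mart` and all rows are assumptions made for the example; `sqlite_master` and `PRAGMA table_info` are standard SQLite features.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical warehouse table covering several departments.
    CREATE TABLE warehouse_facts (department TEXT, year INTEGER, metric TEXT, value REAL);
    INSERT INTO warehouse_facts VALUES
        ('Sales', 2011, 'revenue', 120.0),
        ('Sales', 2012, 'revenue', 150.0),
        ('HR', 2012, 'headcount', 35.0),
        ('Finance', 2012, 'cost', 80.0);
    -- Data mart: a smaller subset focused on one department.
    CREATE TABLE sales_mart AS
        SELECT year, metric, value FROM warehouse_facts WHERE department = 'Sales';
""")

mart_rows = conn.execute(
    "SELECT year, metric, value FROM sales_mart ORDER BY year"
).fetchall()

# Metadata: the catalog lists the tables, and table_info describes
# the columns of one table (column name is the second field).
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
mart_columns = [r[1] for r in conn.execute("PRAGMA table_info(sales_mart)")]

print(mart_rows)
print(tables, mart_columns)
```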

Page 9: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Information Flow
1. Data Warehouse, 2) Concepts for Data Warehouse

■ Data Warehouse built on top of DB

[Figure: information flow diagram. Recoverable labels: Internal / External Database (two sources); Data Warehouse; Metadata Repository; Data Marts: Finance, Management Reporting, Accounting, Sales, Marketing]

Page 10: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Data Warehouse Components
1. Data Warehouse, 2) Concepts for Data Warehouse

■ Data Warehouse Components

Page 11: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Applications and Data Marts
1. Data Warehouse, 2) Concepts for Data Warehouse

■ Applications and Data Marts

Page 12: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Difficulties in implementing DW
1. Data Warehouse, 3) Difficulties and Trends

■ Complete Alignment
  – Make sure you have full involvement and buy-in from those who represent your users, the consumers of your data warehouse.

■ Iterative & Frequent Update
  – Consider all aspects of the process: researching your data sources, capturing and transmitting that data to the data warehouse, transforming and loading it into the data warehouse, and accounting for its lineage.

■ Risk
  – Make sure you develop a proper risk management plan.

Page 13: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Future Trends
1. Data Warehouse, 3) Difficulties and Trends

■ Enterprise Data Warehouse
  – The enterprise data warehouse, whether a single store or integrated data marts across a variety of platforms, yields a view of the operation previously unattainable
    by Don Hatcher, SAS

■ Real-time
  – Organizations move to more real-time data transformation and seek to better leverage common metadata across applications
    by Allan Houpt, CA

■ Capacity
  – The future of data warehousing is all about ever larger data warehouses - in fact I just read about a U.S. Government effort to create petabyte repositories
    by Roman Bukary, SAP Director of Market Strategy

Page 14: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Definition of OLAP
2. OLAP, 1) Introduction of OLAP

■ OLAP (On-Line Analytical Processing)
  – The dynamic enterprise analysis required to create, manipulate, animate and synthesize information from Enterprise Data Models
    * Providing OLAP: An IT Mandate, E.F. Codd (1993)
  – FASMI (Fast Analysis of Shared Multidimensional Information)
    • This definition was first used in early 1995, and has not needed revision since
    * Pendse & Creeth (1995)

[Figure: FASMI: FAST, ANALYSIS, SHARED, MULTIDIMENSIONAL, INFORMATION]

Page 15: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

OLAP Architecture
2. OLAP, 1) Introduction of OLAP

■ OLAP Architecture

Page 16: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

From OLTP to OLAP
2. OLAP, 2) Concepts for OLAP

■ Data used in OLAP
  – Sales data of June? (OLTP)
  – Multi-dimensional data (having many features) (OLAP)

■ Direct Access: EUC (End-User Computing) Environment

■ From What to Why
  – OLTP: Storing primitive data, supporting routine business operations (What)
  – OLAP: Storing cumulative data, supporting business goals (Why)

[Figure: Information Source; Information Broker; Information Consumer]

Page 17: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

OLTP vs. OLAP
2. OLAP, 2) Concepts for OLAP

■ OLTP vs. OLAP

                  OLTP                             OLAP
  Definition      On-Line Transaction Processing   On-Line Analytical Processing
  Objective       Operational                      Analytical
  Focus           Daily repetitious work           Decision support in organization
  Developer       Computer expert                  End-user
  User            Simple operator                  Special analyst
  Storing         Current value                    Summarized and consolidated data
  Use             Repetitive                       Unstructured
  Response        Immediate                        Delayed
  Data            Updated                          Summarized
  Update          Field                            Recomputation
  Amount of Data  Small                            Large
  Data Structure  Complex                          Simple
  Database        RDB                              MDB
  Data period     Past, Current                    Past, Current, Future
  Query type      Regular                          Irregular, Analytical
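A few rows of the OLTP vs. OLAP comparison (update of a single field versus summarized, consolidated data) can be made concrete with a small sketch using Python's standard `sqlite3` module. The `sales` table, its columns, and its rows are hypothetical; a real OLAP server would typically work on a separate, pre-summarized store rather than the operational table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical operational table, illustrative rows only.
    CREATE TABLE sales (
        sale_id INTEGER PRIMARY KEY,
        month TEXT, region TEXT, product TEXT, amount REAL
    );
    INSERT INTO sales (month, region, product, amount) VALUES
        ('2012-05', 'East', 'A', 10.0),
        ('2012-06', 'East', 'A', 20.0),
        ('2012-06', 'West', 'A', 30.0),
        ('2012-06', 'West', 'B', 40.0);
""")

# OLTP style: update one field of one current record, then read it back.
conn.execute("UPDATE sales SET amount = 25.0 WHERE sale_id = 2")
one_sale = conn.execute("SELECT amount FROM sales WHERE sale_id = 2").fetchone()

# OLAP style: summarize many records along several dimensions at once.
summary = conn.execute("""
    SELECT month, region, SUM(amount)
    FROM sales
    GROUP BY month, region
    ORDER BY month, region
""").fetchall()

print(one_sale)
for row in summary:
    print(row)
```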

Page 18: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Enterprise IT Architecture
2. OLAP, 2) Concepts for OLAP

■ OLTP/OLAP Enterprise IT Architecture

Page 19: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Data Warehouse vs. OLAP Server
2. OLAP, 2) Concepts for OLAP

■ Data Warehouse vs. OLAP Server

                      Data Warehouse                   OLAP Server
  Objective           Ready for all kinds of retrieval Specialized retrieval
  Characteristics     Data storage                     Computation engine
  Query Type          Read only                        Read/Write
  Response            Flexible                         Consistent, rapid
  Content             Historical, present              Historical, present, future
  Data Structure      Plain                            Multi-dimensional
  Amount of Data      Huge, much detail                Much, detail
  Development period  A few months to years            A few weeks to months

Page 20: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Two types of OLAP
2. OLAP, 2) Concepts for OLAP

■ MOLAP
  [Figure: Clients send a Query to MD Processing on an MDBMS and receive a Respond]

■ ROLAP
  [Figure: Clients send a Query to MD Processing, which issues SQL to an RDBMS, and receive a Respond]
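The ROLAP arrangement, in which a multidimensional processing layer sits between the clients and an RDBMS and speaks SQL to it, can be sketched as follows. This is a minimal illustration under stated assumptions: the function `rolap_query`, the `sales` table, and the `ALLOWED_DIMENSIONS` whitelist are all hypothetical and do not describe any particular OLAP product.

```python
import sqlite3

# Hypothetical dimension columns of the relational table below.
ALLOWED_DIMENSIONS = {"region", "product", "month"}


def rolap_query(conn, dimensions):
    """Translate a multidimensional request into SQL and run it on the RDBMS.

    `dimensions` is the list of dimension names to group by; the measure
    is always SUM(amount) in this sketch.
    """
    if not dimensions:
        raise ValueError("at least one dimension is required")
    unknown = set(dimensions) - ALLOWED_DIMENSIONS
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    cols = ", ".join(dimensions)
    sql = f"SELECT {cols}, SUM(amount) FROM sales GROUP BY {cols} ORDER BY {cols}"
    return sql, conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, product TEXT, month TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('East', 'A', '2012-05', 10.0),
        ('East', 'A', '2012-06', 20.0),
        ('West', 'A', '2012-06', 30.0),
        ('West', 'B', '2012-06', 40.0);
""")

sql_by_region, by_region = rolap_query(conn, ["region"])
_, by_region_product = rolap_query(conn, ["region", "product"])
print(sql_by_region)
print(by_region)
print(by_region_product)
```

Checking the requested names against a fixed whitelist keeps the generated SQL limited to known columns. In a MOLAP setup the same request would instead be answered from a multidimensional store without generating SQL.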

Page 21: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

From RDB to MDB
2. OLAP, 2) Concepts for OLAP

■ Basic Data Structure of MDB & RDB
  – RDB: OLTP, Data Warehouse
    • Table, Field, Row, Record, Column
  – MDB: OLAP
    • Cube, Dimension, Hierarchy

■ RDB as OLAP Server
  – Cannot handle and represent multi-dimensional relationships well
  – Cannot summarize data well

■ MDB as OLAP Server
  – Gives many managerial viewpoints
  – EUC
  – Supports analysis functionality
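The MDB building blocks named on this slide (cube, dimension, hierarchy) can be illustrated with a small in-memory sketch. The cube contents, the dimension names, and the helper functions `slice_cube` and `roll_up_month_to_quarter` are hypothetical, chosen only to show how a cube supports several viewpoints on the same data.

```python
from collections import defaultdict

# Hypothetical cube: each cell is keyed by one value per dimension.
DIMENSIONS = ("product", "region", "month")
cube = {
    ("A", "East", "Jan"): 10,
    ("A", "East", "Feb"): 5,
    ("A", "East", "Apr"): 20,
    ("A", "West", "Jan"): 30,
    ("B", "West", "Feb"): 40,
}

# Hierarchy on the month dimension: month rolls up to quarter.
MONTH_TO_QUARTER = {"Jan": "Q1", "Feb": "Q1", "Mar": "Q1", "Apr": "Q2"}


def slice_cube(cube, dimension, value):
    """Fix one dimension to a single value and drop it from the cell keys."""
    axis = DIMENSIONS.index(dimension)
    return {
        key[:axis] + key[axis + 1:]: cell
        for key, cell in cube.items()
        if key[axis] == value
    }


def roll_up_month_to_quarter(cube):
    """Aggregate cells one level up the month hierarchy."""
    rolled = defaultdict(int)
    for (product, region, month), cell in cube.items():
        rolled[(product, region, MONTH_TO_QUARTER[month])] += cell
    return dict(rolled)


west_view = slice_cube(cube, "region", "West")
quarterly = roll_up_month_to_quarter(cube)
print(west_view)
print(quarterly)
```

The same cube answers both a regional question (the slice) and a time-summary question (the roll-up) without restructuring the data.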

Page 23: Data Warehouse (DW) & On-line Analytic Processing (OLAP)

Reference

■ Euiho Suh, “EIS_DSS_OLAP_DW (PPT Slide)”, POSMIT Lab. (POSTECH Strategic Management of Information and Technology Laboratory)

■ Euiho Suh, “OLAP (PPT Slide)”, POSMIT Lab. (POSTECH Strategic Management of Information and Technology Laboratory)

■ O’Brien & Marakas, “Introduction to Information Systems, Fifteenth Edition”, McGraw-Hill, Chapter 5, pp. 137-168