Data Warehouse andData Warehouse and OLAP IIOLAP IItwang/595DM/Slides/Week6.pdf · • An...

Preview:

Citation preview

Data Warehouse and OLAP IIData Warehouse and OLAP II

Week 6

1

Team Homework Assignment #8Team Homework Assignment #8

• Using a data warehousing tool and a data set, play four OLAPUsing a data warehousing tool and a data set, play four OLAP operations (Roll‐up (drill‐up), Drill‐down (roll down), Slice and dice, Pivot (rotate)) and show the results.

i 3 3 2 d 3 3• Exercise 3.11, 3,12 and 3.13. • Due date

beginning of the lecture on Friday March11th– beginning of the lecture on Friday March11th. 

T i l OLAP O tiTypical OLAP Operations

• Roll‐up (drill‐up)Roll up (drill up)• Drill‐down (roll down)• Slice and dice• Pivot (rotate)• Drill‐across• Drill‐throughDrill through

R llRoll-up

• Perform aggregation on a data cube by– Climbing up a concept hierarchy for a dimensionClimbing up a concept hierarchy for a dimension– Dimension reduction

Roll-upRoll up

5

Drill-downDrill down

• Drill‐down is the reverse of roll‐up• Navigates from less detailed data to more detailed data by

– Stepping down a concept hierarchy for a dimension– Introducing additional dimensions

Drill-downDrill down

7

Slice and DiceSlice and Dice

• The slice operation performs a selection on one dimension of the given cube, resulting in a sub‐cube

• The dice operation defines a sub‐cube by performing a selection on two or more dimensions

8

SliceSlice

9

DiceDice

10

Pivot (Rotate)Pivot (Rotate)

• Visualization operation that rotate the data axes in view in order to provide an alternative presentation of the data

11

PivotPivot

12

Drill-acrossDrill across

• An additional drilling operation• Executes queries involving (i e across) more than one fact• Executes queries involving (i.e., across) more than one fact 

table

13

Drill-throughDrill through

• An additional drilling operation• Uses relational SQL facilities to drill through the bottom level• Uses relational SQL facilities to drill through the bottom level 

of a data cube down to its back‐end relational tables

14

Figm

ud

a ure 3.10.Eultid

imensio

ta wareho Exam

ples oonal d

ata ousing

of Typical Ocube, com O

LAP oper

mm

only us rations on ed

for

15

Motivation for Building Data WarehouseMotivation for Building Data Warehouse

• Building and using a data warehouse is a complex, difficult, and long‐term task

• The construction of a large and complex information system• The construction of a large and complex information system can be viewed as the construction of  large and complex building

D t W h P j t P (1)Data Warehouse Project Process (1)

• Top‐down, bottom‐up approaches or a combination of both– Top‐down: Starts with overall design and planning (mature)

– Bottom‐up: Starts with experiments and prototypes (rapid)

Data Warehouse Project Process (2)Data Warehouse Project Process (2)

• Typical data warehouse design process– Choose a business process to model, e.g., orders, invoices, etcetc.

– Choose the grain (atomic level of data) of the business process

– Choose the dimensions that will apply to each fact table record

– Choose the measure that will populate each fact tableChoose the measure that will populate each fact table record

Th D t W h M d lThree Data Warehouse Models

• Enterprise warehouse• Enterprise warehouse– Collects all of the information about subjects spanning the entire 

organization

• Data mart– A subset of corporate‐wide data that is of value to a specific groups of 

users Its scope is confined to specific selected groups such asusers.  Its scope is confined to specific, selected groups, such as marketing data mart

• Independent vs. dependent (directly from warehouse) data mart

• Virtual warehouse– A set of views over operational databases– Only some of the possible summary views may be materialized– Only some of the possible summary views may be materialized

Data Warehouse Development: A R d d A hA Recommended Approach

Figure 3.13 A recommended approach for data warehouse development.

Figure 3.12 A three-tier data warehousing architecture.

OLAP S A hit tOLAP Server Architectures

• Relational OLAP (ROLAP) • Multidimensional OLAP (MOLAP)Multidimensional OLAP (MOLAP)• Hybrid OLAP (HOLAP) 

ROLAPROLAP

• AdvantagesC h dl l f d– Can handle large amounts of data

– Can leverage functionalities inherent in the relational databasedatabase

• Disadvantages– Performance can be slow– Limited by SQL functionalities

23

MOLAPMOLAP

• Advantages– Excellent performance – Can perform complex calculations

• Disadvantages– Limited in the amount of data it can handle– Requires additional investment

HOLAPHOLAP

• HOLAP technologies attempt to combine the advantages of MOLAP and ROLAP. 

25

Data Warehouse VendorsData Warehouse Vendors

• IBM – http://www‐306.ibm.com/software/data/informix/redbrick/

• Microsoft– http://www.microsoft.com/sql/solutions/bi/default.mspxp // / q / / / p

• Oracle– http://www.oracle.com/siebel/index.html

• Business Objects• Business Objects– http://www.businessobjects.com/

Data Warehouse Vendors (cont’d)Data Warehouse Vendors (cont d)

• Microstrategy– http://wwwmicrostrategy com/– http://www.microstrategy.com/

• Cognos– http://www.cognos.com/f• Informatica

– http://www.informatica.com/• Actuate

– http://www.actuate.com/home/index.asp

Open Source Data Warehousing ToolsOpen Source Data Warehousing Tools

• MySQL‐based  data warehouse• Open data warehouseOpen data warehouse

D t W h U (1)Data Warehouse Usage (1)

• Information processingsupports querying basic statistical analysis reporting using– supports querying, basic statistical analysis, reporting using cross‐tabs,  tables, charts and graphs

• Analytical processing• Analytical processing– multidimensional analysis of data warehouse datasupports basic OLAP operations slice dice drilling– supports basic OLAP operations, slice‐dice, drilling, pivoting

D t W h U (2)Data Warehouse Usage (2)

• Data mining• Data mining– knowledge discovery from hidden patterns Supports associations constructing analytical models– Supports associations, constructing analytical models, performing classification and prediction, and presenting the mining results using visualization toolspresenting the mining results using visualization tools

From OLAP t OLAMto OLAM

• On‐Line Analytical Miningy g– High quality of data in data warehouses– Available information processing infrastructure surrounding data warehouses

– OLAP‐based exploratory data analysisO li l ti f d t i i f ti– On‐line selection of data mining functions

Figarc gure 3.18

chitecture An integr

e.ra

ted O

LAAM

and

OOLA

P

Recommended