The Data Warehouse
from William H. Inmon, Building the Data Warehouse (4th ed)
12/6/05


Page 1: The Data Warehouse

12/6/05

The Data Warehouse

from William H. Inmon, Building the Data Warehouse (4th ed)

Page 2: The Data Warehouse

Data Warehouse =
  • architecture (not a technology)
  • example of Decision Support System

Page 3: The Data Warehouse

Data Placement
  • DSS - Decision Support Systems (analytical function)
  • OLTP - Online Transaction Processing (operational function)
  • Archival data - cheaper/slower storage

Page 4: The Data Warehouse

OLTP                  DSS
primitive data        derived data
operational           analytical
day-to-day            historical
clerical function     managerial function
non-redundant         redundant
non-integrated        data integrated
run repetitively      run heuristically

Page 5: The Data Warehouse

A Definition:

“A data warehouse is a subject-oriented, integrated, non-volatile, and time-variant collection of data in support of management’s decisions.”

(a sophisticated series of snapshots…)
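
The "series of snapshots" idea can be illustrated with a minimal sketch. It is not taken from Inmon's text: the `SnapshotStore` class, its method names, and the sample customer data are all hypothetical, chosen only to show what non-volatile (rows are added, never updated in place) and time-variant (every row carries a date) might look like in code.

```python
from bisect import bisect_right
from datetime import date


class SnapshotStore:
    """Hypothetical append-only store: each (key, snapshot date) is written once."""

    def __init__(self):
        # key -> list of (snapshot_date, value), kept sorted by date
        self._history = {}

    def add_snapshot(self, key, snapshot_date, value):
        rows = self._history.setdefault(key, [])
        if any(d == snapshot_date for d, _ in rows):
            # Non-volatile: an existing snapshot is never overwritten.
            raise ValueError("snapshot already recorded for this date")
        rows.append((snapshot_date, value))
        rows.sort(key=lambda row: row[0])

    def as_of(self, key, when):
        """Return the value of the latest snapshot taken on or before `when`."""
        rows = self._history.get(key, [])
        dates = [d for d, _ in rows]
        i = bisect_right(dates, when)
        return rows[i - 1][1] if i else None


store = SnapshotStore()
store.add_snapshot("cust-1", date(2005, 1, 31), {"balance": 100})
store.add_snapshot("cust-1", date(2005, 2, 28), {"balance": 250})
print(store.as_of("cust-1", date(2005, 2, 15)))  # {'balance': 100}
```

Because old snapshots stay in place, a question about any past date gets the answer that was true at that time.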

Page 6: The Data Warehouse

Design Decisions
  • Granularity - level of detail or summarization of the units of data in the data warehouse (more detail = lower level of granularity)
  • Partitioning - breakup of data into separate physical units that can be handled independently
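
The two design decisions can be sketched in a few lines. Everything here is an assumption made for illustration: the call records, the field names, and the choice of month and year as boundaries are hypothetical, and in-memory lists stand in for what would be separate physical units in a real warehouse.

```python
from collections import defaultdict
from datetime import date

# Hypothetical detailed records (low level of granularity): one row per call.
calls = [
    {"customer": "A", "day": date(2004, 12, 30), "minutes": 5},
    {"customer": "A", "day": date(2005, 1, 3), "minutes": 12},
    {"customer": "A", "day": date(2005, 1, 17), "minutes": 8},
    {"customer": "B", "day": date(2005, 1, 9), "minutes": 20},
]


def summarize_by_month(rows):
    """Raise the level of granularity: one total per customer per month."""
    totals = defaultdict(int)
    for row in rows:
        key = (row["customer"], row["day"].year, row["day"].month)
        totals[key] += row["minutes"]
    return dict(totals)


def partition_by_year(rows):
    """Split rows into one unit per year that can be handled independently."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["day"].year].append(row)
    return dict(partitions)


monthly = summarize_by_month(calls)
by_year = partition_by_year(calls)
print(monthly[("A", 2005, 1)])  # 20
print(sorted(by_year))          # [2004, 2005]
```

The summary is smaller and faster to query but can no longer answer questions about individual calls, which is the trade-off behind the granularity decision.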

Page 7: The Data Warehouse

Page 8: The Data Warehouse

Page 9: The Data Warehouse

Page 10: The Data Warehouse

Major Components
  • Design of Data Warehouse itself
  • Interface from operational systems
    - role of extract (ETL) software [Extract/Transform/Load]
    - element of time (compound keys)
    - data purging
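
A rough sketch of the interface from operational systems follows. It is an illustration under stated assumptions, not a description of any ETL product: the two source systems, the gender encodings and their mapping, and the function names `transform`, `load`, and `purge` are all hypothetical.

```python
from datetime import date

# Hypothetical extracts from two operational systems with different encodings.
SYSTEM_A = [{"cust_id": 1, "gender": "m"}, {"cust_id": 2, "gender": "f"}]
SYSTEM_B = [{"cust_id": 3, "gender": 1}, {"cust_id": 4, "gender": 0}]

# Assumed mapping onto a single warehouse encoding (illustration only).
GENDER_MAP = {"m": "M", "f": "F", 1: "M", 0: "F"}


def transform(row):
    """Convert one source row to the common warehouse encoding."""
    return {"cust_id": row["cust_id"], "gender": GENDER_MAP[row["gender"]]}


def load(warehouse, rows, snapshot_date):
    """Store each row under a compound key that includes the element of time."""
    for row in rows:
        warehouse[(row["cust_id"], snapshot_date)] = row


def purge(warehouse, cutoff):
    """Drop rows whose snapshot date is older than `cutoff`."""
    for key in [k for k in warehouse if k[1] < cutoff]:
        del warehouse[key]


warehouse = {}
load(warehouse, [transform(r) for r in SYSTEM_A], date(2004, 12, 31))
load(warehouse, [transform(r) for r in SYSTEM_A + SYSTEM_B], date(2005, 1, 31))
purge(warehouse, cutoff=date(2005, 1, 1))
print(len(warehouse))  # 4
```

The compound key `(cust_id, snapshot_date)` is what lets the same customer appear once per load, and it also gives the purge step a date to filter on.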

Page 11: The Data Warehouse

Indirect Use of Data Warehouse Data

An analysis program periodically spins off a file to the operational environment that includes specific summarized data
  • Airline commission example
  • Retail personalization example
  • Credit scoring example
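
The credit scoring case can be sketched as follows. This is a toy illustration, not the example from the book: the payment history, the "3 or more months late" rule, and the function names are assumptions, and an in-memory `io.StringIO` buffer stands in for the file that would be handed to the operational environment.

```python
import csv
import io

# Hypothetical warehouse history: (customer, months_late) per past account.
HISTORY = [("c1", 0), ("c1", 1), ("c2", 4), ("c2", 6), ("c3", 0)]


def build_score_file(history, out):
    """Analysis step: summarize history into one pre-computed rating per customer."""
    worst = {}
    for customer, months_late in history:
        worst[customer] = max(worst.get(customer, 0), months_late)
    writer = csv.writer(out)
    writer.writerow(["customer", "rating"])
    for customer, late in sorted(worst.items()):
        # Assumed rule for illustration only: 3+ months late means "review".
        writer.writerow([customer, "review" if late >= 3 else "approve"])


def load_score_file(source):
    """Operational step: read the small file into a fast lookup table."""
    return {row["customer"]: row["rating"] for row in csv.DictReader(source)}


spun_off = io.StringIO()
build_score_file(HISTORY, spun_off)
spun_off.seek(0)
ratings = load_score_file(spun_off)
print(ratings["c2"])  # review
```

The operational system only reads the small summarized file, so it never has to query the warehouse history directly while a customer is waiting.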

Page 12: The Data Warehouse

Data Warehouse Requirements
  • Manage large amounts of data
  • Manage data on diverse media
  • Easily index and monitor
  • Interface with varying technologies
  • Store and access data in parallel
  • Metadata control (by “user”)
  • Contextual information (vs content)
  • Efficiently use indexes
  • Support compound keys