heather-white
Creating a Data Warehouse
Data Acquisition: Extract, Transform, Load
Extraction
• Process of identifying and retrieving a set of data from the operational systems
Transformation
• Tools that allow data warehouse administrators to apply business rules for integrating data from multiple tables and source systems (e.g., aggregating data from two or more fields to create a summary table or total)
Loading
• Move and load source data to a different storage location, often a ROLAP star schema
• Load performance becomes a critical component of the data acquisition environment as volumes increase
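The three stages above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the `orders` and `sales_summary` tables, their columns, and the revenue rule are all assumptions made up for the example, and in-memory SQLite databases stand in for the operational system and the warehouse.

```python
import sqlite3


def extract(conn):
    """Extraction: retrieve the rows of interest from the operational system."""
    return conn.execute(
        "SELECT order_id, product, quantity, unit_price FROM orders"
    ).fetchall()


def transform(rows):
    """Transformation: apply a business rule (revenue = quantity * unit_price)
    and aggregate per product to build a summary."""
    totals = {}
    for _order_id, product, quantity, unit_price in rows:
        units, revenue = totals.get(product, (0, 0.0))
        totals[product] = (units + quantity, revenue + quantity * unit_price)
    return [(p, u, r) for p, (u, r) in sorted(totals.items())]


def load(conn, summary):
    """Loading: write the summary rows into the warehouse in one batch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_summary ("
        "product TEXT PRIMARY KEY, units INTEGER, revenue REAL)"
    )
    with conn:  # commit the whole batch together
        conn.executemany(
            "INSERT OR REPLACE INTO sales_summary VALUES (?, ?, ?)", summary
        )


def run_etl():
    # Hypothetical operational source with three sample orders.
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER, product TEXT, quantity INTEGER, unit_price REAL)"
    )
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [(1, "widget", 2, 5.0), (2, "gadget", 1, 20.0), (3, "widget", 3, 5.0)],
    )
    warehouse = sqlite3.connect(":memory:")
    load(warehouse, transform(extract(source)))
    return warehouse.execute(
        "SELECT product, units, revenue FROM sales_summary ORDER BY product"
    ).fetchall()
```

Calling `run_etl()` returns one summary row per product. Loading in a single batch inside one transaction is one simple way to keep load time down as volumes grow; real warehouses typically rely on the bulk-load facilities of their DBMS instead.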
Data Transformation
Typical Data Warehouse Architectures
Architecture for a Complex Data Warehouse
Data Warehouse Information Flows
Inflow - Processes associated with the extraction, cleansing, and loading of the data from the source systems into the data warehouse.
Upflow - Processes associated with adding value to the data in the warehouse through summarizing, packaging, and distribution of the data.
Downflow - Processes associated with archiving and backup/recovery of data in the warehouse.
Outflow - Processes associated with making the data available to the end-users.
Metaflow - Processes associated with the management of the metadata.
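As a rough illustration of how these flows relate, the sketch below runs a simplified inflow, upflow, downflow, and outflow against an in-memory SQLite database. The `sales`, `monthly_sales`, and `sales_archive` tables and the cutoff date are hypothetical names chosen for the example; metaflow is left out because it concerns metadata management rather than the data itself.

```python
import sqlite3


def upflow(conn):
    """Upflow: add value by summarizing detail rows into monthly totals."""
    conn.execute("DROP TABLE IF EXISTS monthly_sales")
    conn.execute(
        "CREATE TABLE monthly_sales AS "
        "SELECT substr(sale_date, 1, 7) AS month, SUM(amount) AS total "
        "FROM sales GROUP BY substr(sale_date, 1, 7)"
    )


def downflow(conn, cutoff):
    """Downflow: archive detail rows older than the cutoff, then remove them."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_archive (sale_date TEXT, amount REAL)"
    )
    with conn:  # archive and delete in one transaction
        conn.execute(
            "INSERT INTO sales_archive "
            "SELECT sale_date, amount FROM sales WHERE sale_date < ?",
            (cutoff,),
        )
        conn.execute("DELETE FROM sales WHERE sale_date < ?", (cutoff,))


def outflow(conn):
    """Outflow: make the summarized data available to end users."""
    return conn.execute(
        "SELECT month, total FROM monthly_sales ORDER BY month"
    ).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    # Inflow (simplified): already-cleansed rows are loaded into the warehouse.
    conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("2023-01-05", 10.0), ("2023-01-20", 15.0), ("2023-02-02", 7.5)],
    )
    upflow(conn)
    downflow(conn, "2023-02-01")
    archived = conn.execute("SELECT COUNT(*) FROM sales_archive").fetchone()[0]
    remaining = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
    return outflow(conn), archived, remaining
```

In this sketch the summary is built before the detail rows are archived, so end users still see totals for months whose detail has moved out of the active table.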
DW Tools and Technologies
• Building a data warehouse is a complex task because there is no vendor that provides an 'end-to-end' set of tools.
• Necessitates that a data warehouse is built using multiple products from different vendors.
• Ensuring that these products work well together and are fully integrated is a major challenge.
Crucial Decisions in Designing a Data Warehouse
Step 1 – Choose the subject matter, one subject at a time
Step 2 – Decide what the fact table represents
Step 3 – Identify and conform the dimensions
Step 4 – Choose the facts
Step 5 – Store the pre-calculations in the fact table
Step 6 – Define the dimension tables
Step 7 – Decide the duration of the database and the periodicity of updates
Step 8 – Track slowly changing dimensions
Step 9 – Decide the query priorities and the query modes
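Step 8 is the least self-explanatory of these decisions, so here is a minimal sketch of one common way to track a slowly changing dimension: keep every version of a row and mark its validity period (often called a Type 2 approach). The slides do not say which technique is intended, so this choice is an assumption, as are the `CustomerRow` fields and the helper names.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerRow:
    surrogate_key: int               # warehouse-generated key
    customer_id: str                 # natural key from the source system
    city: str                        # the attribute whose history is tracked
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_change(dimension: List[CustomerRow], customer_id: str,
                 new_city: str, change_date: date) -> None:
    """Close the current version (if any) and append a new one."""
    current = next(
        (r for r in dimension
         if r.customer_id == customer_id and r.valid_to is None),
        None,
    )
    if current is not None:
        if current.city == new_city:
            return  # attribute unchanged, nothing to record
        current.valid_to = change_date
    next_key = max((r.surrogate_key for r in dimension), default=0) + 1
    dimension.append(CustomerRow(next_key, customer_id, new_city, change_date))


def city_as_of(dimension: List[CustomerRow], customer_id: str,
               when: date) -> Optional[str]:
    """Return the city that was valid for the customer on the given date."""
    for r in dimension:
        if (r.customer_id == customer_id and r.valid_from <= when
                and (r.valid_to is None or when < r.valid_to)):
            return r.city
    return None
```

For example, after recording a move from one city to another, `city_as_of` returns the old city for dates before the change and the new city afterwards, so facts loaded earlier keep their historical context.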
Implementing the Data Warehouse
Various Technical Considerations
To design and implement a data warehouse, the following technical issues need to be considered:
1. Hardware platform for the data warehouse
2. DBMS that supports the data warehouse
3. Communication and network infrastructure for the data warehouse
4. System management and operating system considerations
5. Software tools for building, operating, and using the data warehouse