25
1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH [email protected] The data warehouse architecture Query/Reporting Extract Transform Load Serve External sources Data warehouse Data marts Analysis/OLAP Falö aöldf flaöd aklöd falö alksdf Data mining Productt Time1 Value1 Value11 Product2 Time2 Value2 Value21 Product3 Time3 Value3 Value31 Product4 Time4 Value4 Value41 Operational source systems Data access tools (RK) End user applications Business Intelligence tools Data staging area (RK) Back end tools Data presentation area (RK) ”The data warehouse” Presentation (OLAP) servers Operational source systems (RK) Legacy systems OLTP/TP systems The back room The front room

The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH [email protected] The data warehouse architecture Query/Reporting

Embed Size (px)

Citation preview

Page 1: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

1

F4: DW Architecture and Lifecycle

Erik Perjons, DSV, SU/[email protected]

The data warehouse architecture

Query/Reporting

ExtractTransformLoad

Serve

External sources

Data warehouse

Data marts

Analysis/OLAP

Falö aöldfflaöd aklödfalö alksdf

Data mining

Productt Time1 Value1 Value11

Product2 Time2 Value2 Value21

Product3 Time3 Value3 Value31

Product4 Time4 Value4 Value41

Operationalsource systems

Data access tools (RK)End user applicationsBusiness Intelligence tools

Data stagingarea (RK)Back end tools

Data presentationarea (RK)”The data warehouse”Presentation (OLAP) servers

Operational sourcesystems (RK)Legacy systemsOLTP/TP systems

The back room The front room

Page 2: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

2

Operational source systemscharacteristics:

• the source data often in OLTP (Online Transaction Processing)systems, also called TPS (Transaction Processing Systems)

• high level of performance and availability

• often one-record-at-a time queries

• already occupied by the normal operations of the organisation

OLTP vs. DSS (Decision Support Systems) OLTP vs. OLAP (Online analytical processing)

Operational Source Systems

Operationalsource systems

More operational source systemscharacteristics:

• a OLTP system may be reliable and consistent, but there areoften inconsistencies between different OLTP systems

• different types of data format and data structures indifferent OLTP systems AND DIFFERENT SEMANTICS

Operational Source Systems

Operationalsource systems

Page 3: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

3

Operational Source Systems

Kimball et al´s assumptions (p 7):

•Source systems are not queried in the broadand unexpected ways

•Maintain little historical data

•Each source systems is often a natural stovepipe application

Operationalsource systems

DW architecture: Data staging area

Query/Reporting

ExtractTransformLoad

Serve

External sources

Data warehouse

Data marts

Analysis/OLAP

Falö aöldfflaöd aklödfalö alksdf

Data mining

Productt Time1 Value1 Value11

Product2 Time2 Value2 Value21

Product3 Time3 Value3 Value31

Product4 Time4 Value4 Value41

Operationalsource systems

Data access toolsData staging area Data presentation areaOperationalsource systems

Page 4: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

4

The Data Staging Area

Often the most complex part inthe architecture, and involves...

• Extraction (E)• Transformation (T)• Load (L)• indexing

ETL-tools can be usedScripts for extraction, transformation and load are

implemented

ExtractTransformLoad

Extractionmeans reading and understanding the source data andcopying the data needed for the data warehouse intostaging area for further manipulation, i.e.transformation

Data staging area

ExtractTransformLoad

Page 5: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

5

Transformation involves…

• data conversion/transformation(specify transformation rules to convert to a common data formatand common terms/semantics)

• data cleaning/cleansing– data scrubbing (use domain-specific knowledge (e.g postal

adresses) to check the data)– data auditing (discover suspicious pattern, discover violation of

stated rules)• combining data from multiple sources• assigning warehouse (surrogate) keys• data aggregation

Data staging areaExtractTransformLoad

A debate questions:

Should the data in the data staging area be stored in a3NF relational database and loaded into the presentationarea for querying and reporting?

Kimball (p 8-9): a 3NF relational database in data staging arearequires more time and resources for development, periodicloading and updating and more capacity of storing the multiplecopies of the data

Data staging area

ExtractTransformLoad

Page 6: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

6

Flat fileC

DB2table(s)

D’

DB2Connect

Staging area for checking, analysing, cleaning, complementing etc transactiondata

SQL, C++ ??

DB2Preliminarytarget DW

E

Fees(manually adjusted

to individualagreements)

I

Startbalance

H

CustomerdataG

Customerdata

F

Three star/join schemascomprising altogether 8 tablesFact tables:- transactions (10 attributes)- fees (7 attributes)- start balance (4 attributes)Dimensional tables:- time (7 attr)- customer (> 40 attr)- company (> 90 attr)- product (13 attr)- ”Service charged” (2 attr)

Various source files

DB2Final

target DWE’

+aggregation (new program)

E complemented with someaggregated tables

Some cleansingand scrubbingmay be neededhere

A Real World Example

DW architecture: Data presentation area

Query/Reporting

ExtractTransformLoad

Serve

External sources

Data warehouse

Data marts

Analysis/OLAP

Falö aöldfflaöd aklödfalö alksdf

Data mining

Productt Time1 Value1 Value11

Product2 Time2 Value2 Value21

Product3 Time3 Value3 Value31

Product4 Time4 Value4 Value41

Operationalsource systems

Data access toolsData staging area Data presentation areaOperationalsource systems

Page 7: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

7

Data presentation areaData warehouse

Data marts

OLAP servers

• What is OLAP?• Dimensional modelling vs. 3 NF modelling• Data Marts• ROLAP/MOLAP servers

• Acronym for “On-line analytical processing”

• A decision support system (DSS) that support ad-hoc querying, i.e.enables managers and analysts to interactively manipulate data. Theidea is to allow the users to easy and quickly manipulate and visualisethe data through multidimensional views, i.e. different perspectives.

What is OLAP?

qua r

ter

office

product

Service

Facts

Office

Quarter

Kimball: Dimensional modelling

Page 8: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

8

Date/Key Month Quarter Year991011 9910 4 - 99 99991012 9910 4 - 99 99

Key Customer Address RegionIncomegroup

C210 Anna N Stockholm Stockholm BC211 Lars S Malmö Skåne BC212 Erik P Rättvik Dalarna CC213 Danny B Stockholm Stockholm AC214 Åsa S Stockholm Stockholm A

Key ServiceServicegroup

S1 Local call Group AS2 Intern. call Group AS3 SMS Group BS4 WAP Group C

Key Seller OfficeF11 Anders C SundsvallF12 Lisa B SundsvallF13 Janis B Kista

Service Dimension Time Dimension

Sales DimensionCustomer Dimension

Fact table - Transactions

SumNumberof calls

C210 S1 F11 991011 25:00 3C210 S3 F11 991011 05:00 1C212 S2 F13 991011 89:00 1C213 S1 F13 991011 12:00 1C214 S4 F13 991012 08:00 1

Dimensional modelling

11

11

0..*

0..*0..*

0..*

Date/Key Month Quarter Year991011 9910 4 - 99 99991012 9910 4 - 99 99

Key Customer Address RegionIncomegroup

C210 Anna N Stockholm Stockholm BC211 Lars S Malmö Skåne BC212 Erik P Rättvik Dalarna CC213 Danny B Stockholm Stockholm AC214 Åsa S Stockholm Stockholm A

Key ServiceServicegroup

S1 Local call Group AS2 Intern. call Group AS3 SMS Group BS4 WAP Group C

Key Seller OfficeF11 Anders C SundsvallF12 Lisa B SundsvallF13 Janis B Kista

Service Dimension Time Dimension

Sales DimensionCustomer Dimension

Fact table - Transactions

SumNumberof calls

C210 S1 F11 991011 25:00 3C210 S3 F11 991011 05:00 1C212 S2 F13 991011 89:00 1C213 S1 F13 991011 12:00 1C214 S4 F13 991012 08:00 1

Query:For how muchdid customers in Sthlmuse service “Local call”in october 1999?

Σ=37:00

Dimensional modelling

Page 9: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

9

Key difference between 3NF and Dimensional modelling:- the degree of normalisation

3 NF modelling- a logical design technique to eliminate data redundancy to keepconsistency and storage efficiency, and makes transaction simpleand deterministic- ER models for enterprise are usually complex, e.g. they oftenhave hundreds, or even thousands, of entities/tables

Dimensional modelling- a logical design technique that present data in a intuitive, i.e.easier to navigate for the user- allow high performance access/queries (the complexity of 3NFmodels overwhelms the database systems optimizer, which meansbad performance)- aims at model decision support data

3 NF modelling vs. Dimensional modelling

[Kimball et al, p 10-11]

Kimball et al (p.10-12 and 396)

“we refer to the presentation area as a series of integrateddata marts”

“a data mart is a flexible set of data, ideally based on themost atomic (granular) data possible to extract fromoperational source, and presented in a symmetric(dimensional) model that is resilient when faced withunexpected user queries”

“in its most simplistic form a data mart represent data froma single business process” (business process=purchaseorder, store inventory and so on)

Data presentation area – Data marts

Page 10: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

10

Data martsService

Calls

Office

Quarter

Service

Subscription

ordersOffice

Quarter

Service

Calls

Office

Quarter

Subscription

orders

The data warehouse bus architecture

Orders

Production

DimensionsTimeSales RepCustomerPromotionProductPlantDistr. Center

[Kimball et al, p 78-79]

A data martA data mart

Page 11: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

11

• A dimensional model for a large data warehouseconsists of between 10 and 25 similar-looking datamarts. Each data marts will have 5 to 15 dimensionaltables.

Data marts

Kimball et al’s strong opinions (p.10-12)

• all data in the presentation area should be presented,stored and accesses in dimensional models

• the data marts must contain detailed, atomic data (itis unacceptable that the detailed data should belocked up in 3 NF models for drill-down)

• the data marts dimensions should be conformed fordrill-across techniques, which tie the data martstogether in the data warehouse bus architecture

The Data marts

Page 12: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

12

More about data marts:

• far smaller data volumes, fewer data sources

• easier data cleaning process, faster roll-out

• allows a “piecemeal” approach to some of the enormousintegration problems involved in creating an enterprisewide data model, but complex integration in the longterm

The Data marts

Dependent vs. Independent Data marts

Data warehouse

Dependent Data marts

Data warehouse

Independent Data marts

Page 13: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

13

Extended Relational DBMS (ROLAP servers)– data stored in RDB– star-join schemas– support SQL extensions– index structures

Multidimensional DBMS (MOLAP servers)– data stored in arrays (n-dimensional array)– direct access to array data structure– excellent indexing properties– poor storage utilisation, especially when the data is sparse.

The presentation/OLAP servers

Data warehouse

Data marts

OLAP servers

• Index structures (bit map indexes, join indexes)

• SQL extensions (operators like Cube, Crossjoin)

• Materialised views (pre-aggregations)

More about presentation servers

What is characteristics regarding data warehouse,according to Chaudhiri&Dayal :

Page 14: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

14

DW architechture: Metadata repository

Query/ReportingExtractTransformLoadRefreshOperational

source systems

Serve

External sourcesData warehouse

Metadatarepository

Monitoring & Administration

Data marts

OLAP servers

Analysis

Falö aöldfflaöd aklödfalö alksdf

Data mining

Productt Time1 Value1 Value11

Product2 Time2 Value2 Value21

Product3 Time3 Value3 Value31

Product4 Time4 Value4 Value41

Data access toolsData staging area Data presentation areaOperationalsource systems

What is metadata?

Main functions are to give...• data definitions• the origin of data• the structure of data• rules for the selection and transfer of data• qualitative and quantitative data about data

Contained in metadata repository

“Data about data”/”Information about data”

Page 15: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

15

The metadata repository

An integrated complete source of metadata

• is at the heart of the data warehouse architecture• supports the information needs of...

– system developers– data administrators– system administrators– users– applications on the data warehouse

• very complex data structure• must contain full version history• must always be up to date

Metadata life cycle activities

• Collection• identify and capture metadata in a central

repository

• Maintenance• establish processes to synchronise metadata with

the changing data structure

• Deployment• provide metadata to users in the right form and

with the right tools

Page 16: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

16

Different types of metadata

• Administrative metadata(includes all information necessary for setting up and using a DW,

e.g. Information about source databases, dw schemas,dimensions, hierachies, predefined queries, physicalorganisation, rules and script for extraction, transformationand load, back-end and front end tools)

• Business metadata (business terms and definitions, ownership of data)

• Operational metadata (information collected during the operations of the DW, e. g.

usage statistics, error reports)

DW architecture: End user applications

Query/ReportingExtractTransformLoadRefreshOperational DBs

Serve

External sourcesData warehouse

Metadatarepository

Monitoring & Administration

Data marts

OLAP servers

Analysis

Falö aöldfflaöd aklödfalö alksdf

Data mining

Productt Time1 Value1 Value11

Product2 Time2 Value2 Value21

Product3 Time3 Value3 Value31

Product4 Time4 Value4 Value41

Data access toolsData staging area Data presentation areaOperationalsource systems

Page 17: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

17

Query/Reporting

Analysis

Falö aöldfflaöd aklödfalö alksdf

Data mining

Productt Time1 Value1 Value11

Product2 Time2 Value2 Value21

Product3 Time3 Value3 Value31

Product4 Time4 Value4 Value41

• OLAP tools, BI apps, DSS• Query/Reporting tools• Data mining

End user applications

productproduct group

mounthquarter

officeregion

Product Group Region First Quarter - 1997Group A ABC 1245Group A XYZ 34534Group B ABC 45543Group B XYZ 34533

Column headers(join constraints)

Column header(application constraint) Answer set representing

focal event

Row headers

Spreadsheet output of OLAP tool

Page 18: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

18

Graphical output of OLAP tool

• Drill-down - decreasing the level of aggregation• Drill-up/Roll-up/Consolidation - increasing the level of aggregation• Drill-across - move between different star-join schemas using

conformed dimensions and joins• Slicing and dicing – ability to look at the database from different

views, e.g. one slice shows all sales of product type within regions,another slice shows all sales by sales channel within each producttype

• Pivoting - e.g. change columns to rows, rows to columns• Ranking - sorting

“Think of an OLAP data structure as a Rubik´s Cube of data that userscan twist and twirl in different ways to work through what-if anwhat-happend scenarios” [Lee Thé]

Functionalities of OLAP tools

Page 19: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

19

StrategicWho: strategic leadersWhat: formulate strategy and monitor corporate performanceExamples: Balance scorecard, Strategic Planning

OperationalWho: operational managersWhat: execution of strategy againts objectivesExamples: Budgeting, Sales forcasting

AnalyticalWho: analysts, knowledge worker, controllerWhat: ad-hoc analysisExamples: Financial and Sales Analysis, Customer Segmentation,Clickstream analysis

Business Intelligence (BI) apps

• Complexity of integration– Hidden problems with source systems

– Data homogenisation

– Underestimation of resources for data loading

• Required data not captured

• High maintenance

• Long duration projects

• Why not integrating the legacy applications(OLTP systems) instead?

Problems of Data Warehousing

Page 20: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

20

Operational Data Store (ODS)

No singel universal defintion...

ODS definition 1: Implemented to deliver operational reporting,especially when neither the legacy nor the modern OLTP systemsprovide adequate operational reports – fixed queries and for tacticaldecision makingODS definition 2: Built to support real-time interactions, especiallyin Customer Relationsship Management applications – the tradtionaldata warehouse typically is not in a position to support the demandfor near-real-time data

OMG’s standards

Model

Metamodel

Metametamodel

InstancesInvoiceno 34Helen

Nagy

M0 layer

M1 layer

M2 layer

M3 layer

CWM Metamodel

Meta Object Facility (MOF)

UML Metamodel

Page 21: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

21

Common Warehouse Metamodel (CWM)

Data

Source

Data

Source

Data

Source

Operational

Data StoreETL

Data

Warehouse

Data Mart

Analysis

Reporting

Visualization

Data Mining

Data Mart

Data Mart

The collection of metamodels by CWM can beused to model the whole data warehousingenvironment i.e from data sources to end useanalysis, and data warehouse management

Common Warehouse Metamodel

• Common Warehouse Metamodel (CWM) is alanguage specifically design to model datawarehousing and data mining applications, i.e.integrating data warehousing and businessanalysis (business intelligence) tools

• CWM has a lot in common with the UML metamodelbut has a number of special metamodels(metaclasses), e.g modelling relational databases,multidimensional databases, OLAP, schematransformations, XML

[Kleppe et al, p.139-140 (2003)]

Page 22: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

22

X

Check materialon stock

Captureordered items

Check materialon stock

Ordered item[captured]Ordered item

captured

Material on stock[checked]

Captureordered items

Material ison stock

Material isnot on stock

Orderrecieved

State

Precedes

Succedes

Activity

Precedes/Succedes

Succedes

Precedes

State

Event

consists of

Transform-ation

consists of

Function Event

Succedes

Precedes

Whymeta-modelling?

Metametamodel

level orReference

model

Meta-model

level

Modellevel

[Rosemann, Green, 2002]

CWM packages

Management Warehouse Process Warehouse Operation

Analysis Transformation OLAP Data Mining InformationVisualization

BusinessNomenclature

Resource Relational Record Multi-Dimensional XML

Foundation BusinessInformation Data Types Expressions Keys and

IndexesSoftware

Deployment Type Mapping

ObjectModel Core Behavioral Relationships Instance

Packages/Metamodels

Page 23: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

23

CWM packages layers

[Poole et al, p.36-40 (2002)]

• Object layer - base metamodels/packages, which are(re)used by the other metamodels/packages

• Foundation layer - extends the object layer withservices required which are (re)used by the othermetamodels/packages, e.g “unique key” in the KeyIndexes metamodel/package is used by relationaldatabases, OO-databases and record-oriented

• Resource layer - defines metamodels/packages forvarious types of data resouces

• Analysis layer - analysis-oriented metadata

• Management layer - describing the data warehousingprocess as a whole

CWM packages relations

Relational package

ModelElement

Class

Classifier

Core package

Datatype package

ClassifierFeature

Element

Namespace Feature

StructuralFeature

Expression

ProcedureExpression

QueryExpression

Attribute

ColumnSet

NamedColumnSet

Table View

QueryColumnSet

Column

Page 24: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

24

CWM classifyer equality

Object Package Classifier(Klass)

Feature(Attribut)

Schema Table ColumnRelational

Record Recordfile

RecordDef Field

MultiDimensional

XML

Schema Dimenson Dimensioned Objct

Schema ElementType

Attribute

More about CWM

CommonRepresentation

<<metamodels>>CWM Packages

Tool YMetamodel

Tool XMetamodel

Tool ZMetamodel

Page 25: The data warehouse architecture - s upetia/is5/Lectures/F4.pdf · 1 F4: DW Architecture and Lifecycle Erik Perjons, DSV, SU/KTH perjons@dsv.su.se The data warehouse architecture Query/Reporting

25

Business Dimensional Lifecycle

TechnicalArchitecture

Design

TechnicalArchitecture

Design

ProductSelection &Installation

ProductSelection &Installation

End-UserApplication

Specification

End-UserApplication

Specification

End-UserApplication

Development

End-UserApplication

Development

ProjectPlanningProject

Planning

Business

Requirement

Definition

Business

Requirement

Definition

DeploymentDeploymentMaintenance

andGrowth

Maintenanceand

Growth

Project ManagementProject Management

DimensionalModeling

DimensionalModeling

PhysicalDesign

PhysicalDesign

Data StagingDesign &

Development

Data StagingDesign &

Development

The Data Warehouse ArchitectureFrameworkLevel of ARCHITECTURE AREAdetail Data Back room Front room Infrastructure

Businessreqs andaudit

Architecturemodels anddocuments

Detailed models andspecs

Implemen-tation

Info neededfor better decisionsEnterprise models

How get, transform,

make availabledata

Capabilitiesneeded to get and

transform dataMajor data stores

User’s needsMajor classes of

analysesPriorities

Where is data coming from

Calc and storagereqs

HW/SW capabilities

needed vs whatwe have

Major businessissues.

How measureHow analyse

Focal events,facts, dimensions

Dimensional models

Install, test infra-structure. Connect sourcesto targets

to desktop

Standards, prodsto providecapabilities

How hook together

Report layouts, derivation

For whom, when

How interact withcapabilities

System utilties, calls, APIs ...

Write extracts, loads

Automate process

Implement reportand analysis env

Build rptTrain users

Logical and physical models

Domains,derivation rules

DB, indexesbackup ...