
University of Münster

My Favorite Issues in Data Warehouse Modeling

Jens Lechtenbörger

University of Münster & ERCIS, Germany

http://dbms.uni-muenster.de


Context

Data Warehouse (DW) modeling

• ETL design

• DW schema design

– Database design
– Methodical process in several phases

• Focus here: Conceptual schema design

DOLAP 2005, November 5, Jens Lechtenbörger


Outline

• Context

• Conceptual Modeling

• Meaning of Features

• Multidimensional Normal Forms

• Schema Versioning

• Conclusions


Conceptual Modeling (1/5)

• Conceptual representation of multidimensional scenario

– System- and implementation-independent

• No standard data model in sight

– Ad hoc
– E/R variants
– Object-oriented, based upon UML

• Specification of facts’ structure, i.e.,

– Relevant dimensions and their inner structure (→ dimension schema),

– Measures within their multidimensional contexts (→ fact schema)


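Conceptual Modeling (2/5): Fact Schema

[Figure: fact schema Transactions with measure #Transactions; dimensions Account, Branch, and Time; levels AccountID, CustID, CustType, Job, Branch, BranchID, City, Region, Day, Month, Quarter, Year; context annotations "CustType: Person" and "CustType: Company"]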


Conceptual Modeling (3/5): Meaning of Fact Schema

• Universal relation

• Universal relation schema assumption (URSA): Semantics of an attribute is tied to its name

• Defining dimension levels form a key

• Each arc represents a functional dependency (FD)
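
The reading above can be made concrete with a small FD check over a universal relation. The following is a minimal sketch, not part of the original slides: the rows, the choice of key (AccountID, Day), and the helper name `fd_holds` are assumptions made for illustration.

```python
from typing import Iterable, Mapping, Sequence

Row = Mapping[str, object]


def fd_holds(rows: Iterable[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; any later
        # disagreement violates the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical rows of a universal relation for the Transactions fact.
rows = [
    {"AccountID": 1, "Day": "2005-11-04", "Month": "2005-11", "Transactions": 3},
    {"AccountID": 1, "Day": "2005-11-05", "Month": "2005-11", "Transactions": 1},
    {"AccountID": 2, "Day": "2005-11-05", "Month": "2005-11", "Transactions": 7},
]

# Assumed key: the defining levels determine the measure.
key_ok = fd_holds(rows, ["AccountID", "Day"], ["Transactions"])
# An arc Day -> Month in the dimension schema is an FD.
arc_ok = fd_holds(rows, ["Day"], ["Month"])
# The reverse direction is not an FD: one month has many days.
not_fd = fd_holds(rows, ["Month"], ["Day"])
```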


Conceptual Modeling (4/5): Some Features

(Incomplete list)

• Standard Features

– Fact schema represents M:N relationship among dimensions
– Arc in dimension schema represents M:1 relationship, i.e., FD

• Typical Features (some with challenges for summarizability)

– M:N relationships among dimension levels (non-strict hierarchies)

– Alternative and parallel paths, possibly including joining levels
– Optional levels allowing NULL values (heterogeneous, unbalanced, non-onto hierarchies)
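
To see why non-strict hierarchies challenge summarizability, consider a naive roll-up along an M:N arc. This is a minimal sketch with made-up numbers; the mapping and the function name `roll_up` are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical data: a measure per region, and an M:N mapping from
# regions to states (region r1 spans two states).
sales_by_region = {"r1": 100, "r2": 50}
region_to_states = {"r1": ["s1", "s2"], "r2": ["s2"]}


def roll_up(measure_by_child, parents_of):
    """Naive roll-up: add each child's measure to every parent it maps to."""
    totals = defaultdict(int)
    for child, value in measure_by_child.items():
        for parent in parents_of[child]:
            totals[parent] += value
    return dict(totals)


sales_by_state = roll_up(sales_by_region, region_to_states)
total_over_regions = sum(sales_by_region.values())
total_over_states = sum(sales_by_state.values())
# total_over_states exceeds total_over_regions because r1 is counted
# once for each state it belongs to.
```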


Conceptual Modeling (5/5): Guidelines

• A rich set of features is good

• A set of guidelines for their proper use is even better

• Let’s consider the typical features above in turn


Meaning of Features: M:N relationships (1/4)

• M:N relationships are generally implicitly understood

• Consider levels Day and City

– Many cities exist on a given day
– A city exists for many days

• There is no need to model this M:N relationship (if we don’t track history)


Meaning of Features: M:N relationships (2/4)

Consider geographical levels City, Region, State, Country

• One Region per City, i.e., City → Region

• M:N between Region and State, i.e., Region ←→ State

• One Country per State, i.e., State → Country

[Figure: dimension schema Location with levels City, Region, State, Country, All; the arc between Region and State is M:N]

Legal instance:

City  Region  State  Country
ci1   r1      s1     co1
ci1   r1      s2     co2

• City and State are thus in an M:N relationship, and ci1 ends up in two countries.

• Probably not intended. Different dimension schema needed.
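
The legal instance can be checked mechanically. The sketch below uses the two tuples from the slide; the helper `violations` is a hypothetical name introduced here, and it simply reports which left-hand values map to more than one right-hand value.

```python
def violations(rows, lhs, rhs):
    """Return the lhs values that map to more than one rhs value."""
    images = {}
    for row in rows:
        images.setdefault(row[lhs], set()).add(row[rhs])
    return {k: v for k, v in images.items() if len(v) > 1}


# The legal instance from the slide.
instance = [
    {"City": "ci1", "Region": "r1", "State": "s1", "Country": "co1"},
    {"City": "ci1", "Region": "r1", "State": "s2", "Country": "co2"},
]

city_region = violations(instance, "City", "Region")      # City -> Region holds
state_country = violations(instance, "State", "Country")  # State -> Country holds
city_state = violations(instance, "City", "State")        # ci1 is in two states
city_country = violations(instance, "City", "Country")    # ci1 is in two countries
```

Both declared FDs are satisfied, yet the city lies in two states and two countries, which is exactly the unintended consequence noted above.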


Meaning of Features: M:N relationships (3/4)

[Figure: revised dimension schema Location with levels City, Region, State, Country, All; Region and State appear side by side above City]

• Implicit M:N relationship

• No problems with summarizability

• Guideline

– Avoid “M:N arcs” within dimensions
– Joint work with Bodo Hüsemann and Gottfried Vossen, DMDW 2000
  ∗ Synthesize fact schemata
  ∗ Follow FDs to build dimension schemata

– Side remark: Bridge tables of Kimball et al. arise automatically as fact schemata


Meaning of Features: M:N relationships (4/4)

However

• Maybe there was a reason to place State above Region

• Roll-up-like change in granularity

– In general, regions fit into state boundaries
– But not always

• Then, add a new type of “M:N navigational arc”

– This is not Roll-Up!

[Figure: dimension schema Location with levels City, Region, State, Country, All, and an M:N navigational arc between Region and State]


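Meaning of Features: Joining Levels (1/5)

[Figure: three variants of the Location dimension over City, Region, and State. One joins Region and State in a single Country level (the "left schema"); one uses separate levels RCountry and SCountry; one is an object-oriented diagram with multiplicities 1 and 1..* on its associations.]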


Meaning of Features: Joining Levels (2/5)

Semantics of a schema is definable via admissible instances. Consider City c in Region r and State s.

• With universal relations, admissible instances are tables that satisfy FDs

– For left schema, by transitivity of FDs Country of r must be equal to Country of s

• With objects, associations are implemented via references

– Object c has references to r and s
– Objects r and s each have exactly one reference to a country object
– That object for r may be distinct from the one of s

• Thus, left schema on previous slide has different meaning than other two, whose meaning is the same


Meaning of Features: Joining Levels (3/5)

It’s even worse...

• Consider a 3NF implementation of left schema

– Tables for City, Region, State, Country

– Table for City has foreign keys to tables for Region, State

– Tables for Region and State each have a foreign key to table for Country

∗ Those foreign keys need not be “in sync”

• Thus, again a city may wind up in two countries

– Star and snowflake schemata have different semantics!

• What does your favorite OLAP tool do?

• Gap in relational theory. Research in progress.

• Guideline: Use handwritten code to maintain consistency. Be careful!
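
As an illustration of the "handwritten code" guideline, the following sketch builds a tiny snowflake schema in an in-memory SQLite database and runs a consistency query. Table and column names are assumptions chosen for this example; the query is one possible check, not a method prescribed by the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (country TEXT PRIMARY KEY);
CREATE TABLE region  (region  TEXT PRIMARY KEY,
                      country TEXT NOT NULL REFERENCES country(country));
CREATE TABLE state   (state   TEXT PRIMARY KEY,
                      country TEXT NOT NULL REFERENCES country(country));
CREATE TABLE city    (city    TEXT PRIMARY KEY,
                      region  TEXT NOT NULL REFERENCES region(region),
                      state   TEXT NOT NULL REFERENCES state(state));

INSERT INTO country VALUES ('co1'), ('co2');
INSERT INTO region  VALUES ('r1', 'co1');
INSERT INTO state   VALUES ('s1', 'co2');  -- out of sync with r1
INSERT INTO city    VALUES ('c1', 'r1', 's1');
""")

# Every foreign key above is individually valid, yet the two paths from
# c1 reach different countries. This query lists such cities.
inconsistent = conn.execute("""
    SELECT c.city, r.country, s.country
    FROM city c
    JOIN region r ON r.region = c.region
    JOIN state  s ON s.state  = c.state
    WHERE r.country <> s.country
""").fetchall()
```

An empty result would indicate that the Region and State paths agree on the country for every city.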


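Meaning of Features: Joining Levels (4/5)

Reuse of levels is different from joining

[Figure: fact schema Sales with measure Amount and dimensions Product (ProdID), Customer (CustID), and Supplier (SuppID); Customer and Supplier both lead to the same levels City, Region, State, Country]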

Here, customer and supplier must be in the same city


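Meaning of Features: Joining Levels (5/5)

Reuse of levels is different from joining

[Figure: fact schema Sales with measure Amount and dimensions Product (ProdID), Customer (CustID), and Supplier (SuppID); Customer leads to level CCity and Supplier to level SCity, each annotated with [City]]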

Notice: New notation


Meaning of Features: Parallel vs Alternative Paths (1/5)

Parallel paths allow levels from different paths in single Group-By clause, e.g.:

[Figure: two dimension schemata with parallel paths. Location: City, Region, State, Country, All. Time: Day, Week, Month, Quarter, Year, All.]


Meaning of Features: Parallel vs Alternative Paths (2/5)

Observations on parallel paths

• Including levels from more than one path increases level of detail

– E.g., grouping by Week and Month is OK

• Guideline: There are fewer problems than you might have thought
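
A short sketch of the Week and Month case follows. The data is hypothetical (one fact per day over two weeks around a month boundary), and weeks are taken to be ISO weeks, which is an assumption of this example.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical facts: one per day for 14 days, 2005-10-24 to 2005-11-06.
days = [date(2005, 10, 24) + timedelta(days=i) for i in range(14)]


def week(d):
    """ISO week as (ISO year, week number)."""
    iso = d.isocalendar()
    return (iso[0], iso[1])


def month(d):
    return (d.year, d.month)


by_month = Counter(month(d) for d in days)
by_week = Counter(week(d) for d in days)
by_week_and_month = Counter((week(d), month(d)) for d in days)
```

Grouping by both levels yields three groups where each single level yields two: the week that straddles the month boundary is split. The grand total is the same in all three groupings, so the combined grouping is simply finer, not inconsistent.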


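Meaning of Features: Parallel vs Alternative Paths (3/5)

Alternative paths require exclusive choice, e.g.:

[Figure: Customer dimension with levels CustID, CustType, Job, Branch, All, annotated with the context dependencies "CustType: Person" and "CustType: Company"; sample instance with customers P1, P2, P42042, C1, ..., CustType values Person and Company, Job values such as Artist and Zoo director, Branch values such as Airline, and null entries for levels that do not apply]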

Grouping by Job and Branch is inconsistent


Meaning of Features: Parallel vs Alternative Paths (4/5)

Observations on alternative paths

• Alternative paths usually arise from optional levels

• Use context dependencies to explain presence of structural NULLs

• Or more complex dimension constraints

– Hurtado and Mendelzon, PODS 2002

• Guideline: Avoid/explain optional levels.

– Notice: Subclassing in object-oriented models expresses context dependencies
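
Context dependencies can be read as simple rules of the form "level L is present exactly when attribute A has value v". The sketch below is a simplified interpretation made for illustration; the rows, the rule encoding, and both function names are assumptions, and the grouping test is a deliberately coarse approximation.

```python
# Hypothetical customer dimension rows; None marks a structural NULL.
customers = [
    {"CustID": "P1", "CustType": "Person", "Job": "Artist", "Branch": None},
    {"CustID": "P2", "CustType": "Person", "Job": "Zoo director", "Branch": None},
    {"CustID": "C1", "CustType": "Company", "Job": None, "Branch": "Airline"},
]

# Assumed context dependencies: optional level -> (attribute, required value).
context = {"Job": ("CustType", "Person"), "Branch": ("CustType", "Company")}


def satisfies_context(row, context):
    """An optional level must be non-NULL exactly in its context."""
    for level, (attr, value) in context.items():
        if (row[level] is not None) != (row[attr] == value):
            return False
    return True


def groupable_together(levels, context):
    """Coarse test: optional levels with different contexts exclude each other."""
    contexts = {context[level] for level in levels if level in context}
    return len(contexts) <= 1


all_rows_ok = all(satisfies_context(r, context) for r in customers)
job_and_branch = groupable_together(["Job", "Branch"], context)
job_and_type = groupable_together(["Job", "CustType"], context)
```

Under these rules, Job and Branch never apply to the same customer, which is why grouping by both is rejected.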


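Meaning of Features: Parallel vs Alternative Paths (5/5)

[Figure: Customer dimension restructured via subclassing, with class Customer and subclasses Person and Company; levels shown include CustID, CustType, Job, Branch, Legal Form, All, and further levels abbreviated "Capital C.", "Subs. Capital", and "Business P."]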


Multidimensional Normal Forms (1/4)

Joint work with Gottfried Vossen: Multidimensional Normal Forms for Data Warehouse Design, Information Systems, 2003

• Three multidimensional normal forms (MNFs)

• 1MNF based on analysis of FDs

• 2MNF requires context dependencies for optional levels

• 3MNF places restrictions upon context dependencies


Multidimensional Normal Forms (2/4)

Implications of 1MNF

• Faithful representation of the application domain

• Completeness w.r.t. the application domain

• Avoidance of redundancies

• Avoidance of M:N relationships


Multidimensional Normal Forms (3/4)

Implications of 2MNF and 3MNF

• Explanation for structural NULLs allows

– context-sensitive summarizability
– avoidance of contradictory queries

• Relational implementation of class hierarchies within dimensions without structural NULLs possible

• Avoidance of alternative paths


Multidimensional Normal Forms (4/4)

Final remarks concerning 2MNF and 3MNF

• Both rely on purely relational techniques

• For object-oriented models considerable simplifications possible

– Disallow optional levels
– Construction (see paper in Information Systems mentioned above)
  ∗ As long as optional level l exists, introduce further sub-classes
  ∗ One with l, now mandatory
  ∗ The other without l
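
The construction can be sketched as a recursive split. This is an illustrative reading of the three steps above, not the algorithm from the cited paper; the function name and the example levels are assumptions.

```python
def split_optional(mandatory, optional):
    """Split a class on one optional level at a time until none remain.

    Returns one frozenset of (now mandatory) levels per resulting sub-class.
    """
    if not optional:
        return [frozenset(mandatory)]
    level, *rest = sorted(optional)
    # One sub-class with the level, now mandatory ...
    with_level = split_optional(set(mandatory) | {level}, rest)
    # ... and the other without it.
    without_level = split_optional(set(mandatory), rest)
    return with_level + without_level


# Hypothetical customer class with two optional levels.
classes = split_optional({"CustID", "CustType"}, {"Job", "Branch"})
```

With k optional levels the split produces up to 2^k sub-classes. In practice, context dependencies would rule out combinations that never occur (for instance, a customer with both Job and Branch), so fewer classes remain.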


Schema Versioning (1/14)

Joint work with Matteo Golfarelli, Stefano Rizzi, Gottfried Vossen. Schema Versioning in Data Warehouses: Enabling Cross-Version Querying via Schema Augmentation. To appear in Data & Knowledge Engineering.

Challenges

• Storage of historical data under changing business requirements

• Non-volatility, in particular consistent re-execution of old queries

Our proposal

• Maintenance of history of schema versions

• Simple graph model representing core of multidimensional models

• Schema augmentation to represent new schema information on old data

• Schema intersection to answer cross-version queries


Schema Versioning (2/14)

[Figure: initial schema graph for fact Shipment with measures QtyShipped and ShippingCostsDM; attributes Part, Type, Brand, Category, Size, Container, Customer, City, Nation, Region, SaleDistrict, Deal, Incentive, Allowance, Terms, ShipMode, Type, Carrier, Date, Month, Year]


Schema Versioning (3/14)

At t1 = 1/1/2003, the schema undergoes a major revision.

1. The temporal granularity changes from Date to Month.

2. A classification into Subcategories is added to the part hierarchy.

3. A new constraint in the customer hierarchy states that SaleDistricts belong to Nations.

4. The Incentive is independent of shipment Terms.

At t2 = 1/1/2004, another version is created.

1. New measures ShippingCostsEU and ShippingCostsLIT are added.

2. The ShipMode dimension is deleted.

3. A ShipFrom dimension is added.

4. A descriptive attribute PartDescr is added to Part.


Schema Versioning (4/14)

[Figure: resulting schema graph for fact Shipment with measures QtyShipped, ShippingCostsDM, ShippingCostsEU, ShippingCostsLIT; attributes Part, PartDescr, Type, Brand, Subcategory, Category, Size, Container, Customer, City, Nation, Region, SaleDistrict, Deal, Incentive, Allowance, Terms, ShipFrom, Month, Year]

Resulting schema graph


Schema Versioning (5/14)

[Figure: resulting schema graph for fact Shipment, identical to the one on the previous slide]

Three sample query challenges:

• Compute the total quantity of each part category shipped from each warehouse to each customer nation since July 2002.

• Drill down from Category to Subcategory

• Drill down from Nation to SaleDistrict


Schema Versioning (6/14): Schema Modification (1/4)

Four schema modification operations on schema graph

• AddA() to add a new attribute

• DelA() to delete an existing attribute

• AddF() to add an arc involving existing attributes

• DelF() to remove an existing arc
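
The four operations can be sketched on a plain directed graph. This is a minimal illustration, not the formal definition from the paper: the class and method names are hypothetical, and the behaviour of `del_a` (reconnecting predecessors to successors so that dependencies through the deleted attribute survive) is an assumption of this sketch.

```python
class SchemaGraph:
    """Directed graph: nodes are attributes, arcs (a, b) stand for FDs a -> b."""

    def __init__(self, attributes=(), arcs=()):
        self.attributes = set(attributes)
        self.arcs = set(arcs)

    def add_a(self, a):
        self.attributes.add(a)

    def del_a(self, a):
        # Assumption: arcs through a are replaced by direct arcs from
        # its predecessors to its successors.
        preds = {x for (x, y) in self.arcs if y == a}
        succs = {y for (x, y) in self.arcs if x == a}
        self.arcs = {(x, y) for (x, y) in self.arcs if a not in (x, y)}
        self.arcs |= {(p, s) for p in preds for s in succs}
        self.attributes.discard(a)

    def add_f(self, a, b):
        if a not in self.attributes or b not in self.attributes:
            raise ValueError("both attributes must already exist")
        self.arcs.add((a, b))

    def del_f(self, a, b):
        self.arcs.discard((a, b))


# Hypothetical fragment of the Shipment schema graph.
g = SchemaGraph(
    {"Shipment", "Date", "Month", "Year", "Type", "Category"},
    {("Shipment", "Date"), ("Date", "Month"), ("Month", "Year"), ("Type", "Category")},
)
g.del_a("Date")                      # granularity changes from Date to Month
g.add_a("Subcategory")
g.add_f("Type", "Subcategory")
g.add_f("Subcategory", "Category")
```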


Schema Versioning (7/14): Schema Modification (2/4)

Consider again

[Figure: initial schema graph for fact Shipment, identical to the one on slide Schema Versioning (2/14)]

First goal: Delete Date


Schema Versioning (8/14): Schema Modification (3/4)

Result of DelA(Date)

[Figure: schema graph for fact Shipment after deleting Date; the remaining attributes are unchanged, with Month and Year as temporal attributes]

Next goal: Insert Subcategory below Category


Schema Versioning (9/14): Schema Modification (4/4)

Result of AddA(Subcategory)

[Figure: part hierarchy of fact Shipment (Part, Type, Brand, Size, Container, Category) with the new attribute Subcategory]

Result of AddA(Subcategory), AddF(Type → Subcategory)

[Figure: same part hierarchy, with Subcategory attached to Type]

Result of AddA(Subcategory), AddF(Type → Subcategory), AddF(Subcategory → Category)

[Figure: same part hierarchy, with Subcategory placed between Type and Category]


Schema Versioning (10/14): Schema Augmentation (1/2)

Previous schema versions associated with augmented schemata

• Previous schema computable via projection from augmented one

• Designer chooses to add information to augmented schemata based on current schema modification, e.g.,

– old data enriched with new attributes, e.g., Subcategory

– more constraints expressed on old data, e.g., SaleDistrict→ Nation

• Augmented schemata used by querying subsystem


Schema Versioning (11/14): Schema Augmentation (2/2)

Element              Condition                             Augmentation action
A ∈ Diff+_A(S, S′)   (E → A) ∈ F′, A is measure            estimate values for A
                     (E → A) ∈ F′, A is dimension          disaggregate measure values
                     (E → A) ∈ F′, A is derived measure    compute values for A
                     (E → A) ∉ F′, A is property           consistently add values for A
f ∈ Diff+_F(S, S′)   -                                     check if f holds


Schema Versioning (12/14): Cross-version Querying (1/3)

General idea: Formulation context for OLAP query is a schema graph

• Intersection of schema versions is the largest schema for uniform querying

• Query can be answered if formulation context is sub-graph of intersection

• More precisely, the intersection is taken over augmented schemata instead of real versions
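
The idea can be sketched with plain set operations. This is a simplification made for illustration: schema graphs are reduced to (attributes, arcs) pairs, arcs are intersected literally without considering transitively implied FDs, and the schema fragments are hypothetical.

```python
def intersect(versions):
    """Intersect schema graphs given as (attributes, arcs) pairs."""
    attrs = set.intersection(*(set(a) for a, _ in versions))
    arcs = set.intersection(*(set(f) for _, f in versions))
    return attrs, arcs


def answerable(query_attrs, query_arcs, intersection):
    """A query is answerable if its formulation context is a sub-graph."""
    attrs, arcs = intersection
    return set(query_attrs) <= attrs and set(query_arcs) <= arcs


# Hypothetical, simplified fragments of two schema versions.
base_attrs = {"Shipment", "Part", "Category", "Customer", "Nation", "Month", "QtyShipped"}
base_arcs = {
    ("Shipment", "Part"), ("Part", "Category"),
    ("Shipment", "Customer"), ("Customer", "Nation"),
    ("Shipment", "Month"), ("Shipment", "QtyShipped"),
}
old_version = (base_attrs, base_arcs)
new_version = (base_attrs | {"ShipFrom"}, base_arcs | {("Shipment", "ShipFrom")})

# Formulation context of the sample query (category, warehouse, nation, month).
query = (base_attrs | {"ShipFrom"}, base_arcs | {("Shipment", "ShipFrom")})

before = answerable(*query, intersect([old_version, new_version]))

# Augmenting the old version with ShipFrom makes the query answerable.
old_augmented = (old_version[0] | {"ShipFrom"}, old_version[1] | {("Shipment", "ShipFrom")})
after = answerable(*query, intersect([old_augmented, new_version]))
```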


Schema Versioning (13/14): Cross-version Querying (2/3)

[Figure: schema graph for cross-version querying over fact Shipment with measures QtyShipped and ShippingCostsDM; attributes Part, Type, Brand, Subcategory, Category, Size, Container, Customer, City, Nation, Region, SaleDistrict, Deal, Incentive, Allowance, Terms, ShipFrom, Month, Year]

Compute the total quantity of each part category shipped from each warehouse to each customer nation since July 2002.


Schema Versioning (14/14): Cross-version Querying (3/3)

Observations

• Query well-formulated only if ShipFrom augmented

• Drilling down from Category to Subcategory only if subcategories established also for 2002 data

• Drilling down from Nation to SaleDistrict only if FD from sale districts to nations also satisfied before 2003.


Conclusions (1/3)

Summary

• FDs help in data warehouse design

• Meaning and potential of multidimensional features sometimes underspecified

• Sub-classing helps to structure multidimensional schemata

• Versioning with cross-version querying is feasible


Conclusions (2/3)

• Schema versioning offers further potential

– What-if analysis
– Horizontal benchmarking

• Open issue: Generalization to hyper-graphs (cross-dimensional attributes, derived measures)


Conclusions (3/3)

There’s more...

• Taking full advantage of rich models

• Transformations of conceptual to logical models for ETL

– Alkis Simitsis: Mapping Conceptual to Logical Models for ETL Processes. DOLAP 2005

• More generally, model-driven design

– Jose-Norberto Mazón et al.: Applying MDA to the Development of Data Warehouses. DOLAP 2005

• Where do the requirements come from?

– Paolo Giorgini et al.: Goal-oriented requirement analysis for data warehouse design. DOLAP 2005


http://dbms.uni-muenster.de

Thank you for your attention!
