Dimensional Modeling.ppt

  • 7/25/2019 Dimensional Modeling.ppt

    1/182

    1

    Dimensional Design: A Handbook for Data Warehouse Design

  • Course Agenda

    Rationale for dimensional modeling
    Dimensional modeling basics
    Dimensional modeling details
    Fact table details
    Dimension table details
    Design process
    Aggregate schemas
    Multiple fact tables
    Architected data marts

  • Rationale for Dimensional Modeling

  • The Business Value Chain

    [Diagram: Operations, Sales and Marketing, Customer Services, Product Development]

    A series of interrelated business processes which contribute to increased product value for the customer, and to profit for the enterprise (Porter, 1985)

  • Drive to Compete

    Businesses constantly strive to optimize each process in the value chain
    Optimization requires measuring and analyzing the effectiveness of each process as well as the value chain as a whole

  • The Role of Information Technology

    Process optimization
      Supported by online transaction processing systems (OLTP)
    Measuring and analyzing processes
      Supported by 'analytic' systems
      Data warehouse

  • Example OLTP Systems

    Manufacturing and Process Control
    Sales Order Entry and Campaign Management
    Customer Support and Relationship Management
    Shipping and Inventory Management

  • OLTP Systems & Business Events

    Events are the heart of every business
      Book an order
      Print a pick list
      Record a cash withdrawal
      Post a payment
    Event detail is collected by OLTP systems
      Atomic focus
      Transaction consistency

  • OLTP System Reporting

    OLTP systems answer event-oriented questions well
      Run invoices
      Print ledger
      Pull up customer detail
    Operational reporting
      Focused on detail
      Predictable requirements and query patterns
      Does not reveal the overall performance of a process

  • OLTP Design Characteristics

    Focus of OLTP design
      Individual data elements
      Data relationships
    Design goals
      Accurately model business
      Remove redundancy

  • OLTP Design Shortcomings

    Complex
    Unfamiliar to business people
    Incomplete history
    Slow query performance

  • Emergence of Dimensional Model

    Logical modeling technique
      For designing relational database structures
      Addresses OLTP design shortcomings
      For use in analytic systems
    First developed early 1980's
      Packaged goods industry
    Popularized by Ralph Kimball, PhD.
      1996 book: 'The Data Warehouse Toolkit'

  • Q & A

  • Dimensional Modeling Basics

  • Sample Value Chain Analysis

    Process-oriented business questions
      'I need to see overall gross margin by category'
      'How do inventory levels compare with sales by product and warehouse?'
      'What are outstanding receivables by G/L account?'
      'What is the return rate for each supplier?'

  • Measurement Focus

    Process-oriented business measures
      gross margin
      inventory levels, sales
      receivables
      return rate

  • Facts: Process Measurement

    Coffee Maker Fulfillment Report
    Brand: Captain Coffee

    Product                 Units Sold   Units Shipped   % Shipped
    Standard Coffee Maker   5,000        3,800           76%
    Thermal Coffee Maker    2,400        1,632           68%
    Deluxe Coffee Maker     2,073        1,658           80%
    All Products            9,473        7,090           75%

    Measures
      Metrics or indicators by which people evaluate a business process
      Referred to as "Facts"
    Examples
      Margin
      Inventory Amount
      Sales Dollars
      Receivable Dollars
      Return Rate

  • Perspective Focus

    Process-oriented business perspectives
      category
      product, warehouse
      G/L account
      supplier

  • Dimensions: Process Perspectives

    [Figure: Coffee Maker Fulfillment Report for brand Captain Coffee. Columns: Product, Units Sold, Units Shipped, % Shipped]

    Dimensions
      The parameters by which measures are viewed
      Used to break out, filter or roll up measures
      Often found after the word "by" in a business question
      Descriptive business terms
    Examples
      Product
      Warehouse
      Customer
      Supplier

  • Dimensional Model

    Definition
      Logical data model used to represent the measures and dimensions that pertain to one or more business sub

  • Dimensional Model Advantages

    Understandable
    Systematically represents history
    Reliable

  • Schema Simplicity

    [Diagram: Star Schema with a central Facts table joined to Store, Time and Product]

    Fewer tables
      Denormalized
      Consolidated
    Dimensional
      Familiar to users
      Facts go in the fact tables
      Dimensions in dimension tables
    Increases understandability

  • Data Familiarity

    [Diagram: source field ord_date expanded into a Time Dimension: year, quarter, month, date, day of the week, holiday flag]

    Adding business context
      Single source field
      Expanded into parts
      Decoded into business terms
      Add special indicators and flags
      e.g. time dimension
    Increases understandability

  • Representing History

    [Diagram: Facts joined to Store, Product and Time Dimension (year, quarter, month, date, day of the week, holiday flag)]

    Time dimension
      Part of every star schema
      Marks the date when the facts (process measurements) occurred
      Allows the schema to easily add and query data over time
      Especially useful for performing comparison queries
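The expansion of a single source date into time dimension attributes can be sketched as follows. This is a minimal illustration, not the course's own load logic: the `HOLIDAYS` set, the `Y`/`N` flag values and the function names are assumptions introduced here.

```python
from datetime import date, timedelta

# Hypothetical holiday list; a real warehouse would load this from a calendar source.
HOLIDAYS = {date(1997, 1, 1), date(1997, 7, 4)}

def time_dimension_row(time_key, day):
    """Expand one source date into descriptive time dimension attributes."""
    return {
        "time_key": time_key,                   # synthetic key
        "year": day.year,
        "quarter": (day.month - 1) // 3 + 1,
        "month": day.strftime("%B"),            # decoded into a business term
        "date": day.isoformat(),
        "day_of_week": day.strftime("%A"),
        "holiday_flag": "Y" if day in HOLIDAYS else "N",
    }

def build_time_dimension(start, days):
    """One row per calendar day, keyed 1..days."""
    return [time_dimension_row(i + 1, start + timedelta(days=i)) for i in range(days)]

rows = build_time_dimension(date(1997, 1, 1), 365)
```

Because every day gets its own row with year, quarter and month already attached, comparison queries (this month versus the same month last year, for example) become simple filters on the dimension.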

  • Fewer Join Paths

    Star schema

  • High Performance Design

    Fewer

  • Sub

  • Enterprise Models

    Enterprise-scope ER model
    Enterprise-scope dimensional model

  • Exercise 1

    Scenario
      Industry: Automobile manufacturing
      Company: Millennium Motors
      Value chain focus: Sales
    Sample business questions:
      What are the top 10 selling car models this month?
      How do this month's top 10 selling models compare to the top 10 over the last six months?
      Show me dealer sales by region by model by day
      What is the total number of cars sold by month by dealer by state?
    List facts and dimensions

  • Exercise 1 - worksheet

  • Exercise 1 Solution

    Facts
      Sales revenue
      Quantity sold
    Dimensions
      Model name
      Month
      Dealer name
      Region
      State
      Date

  • Q & A

  • Star Schema Dimension Tables

    Dimension tables
      Store dimension values
      Textual content
    Dimension tables usually referred to simply as 'dimensions'
    Spend extra effort to add dimensional attributes

  • Dimension Keys

    Synthetic keys
      Each table assigned a unique primary key, specifically generated for the data warehouse
      Primary keys from source systems may be present in the dimension, but are not used as primary keys in the star schema

  • Dimension Columns

    [Diagram: three dimension tables, each with a key and attribute columns]

    Dimension attributes
      Specify the way in which measures are viewed: rolled up, broken out or summarized
      Often follow the word "by" as in "Show me Sales by Region and Quarter"
      Frequently referred to as 'Dimensions'

  • Star Schema Fact Table

    [Diagram: Fact Table with columns fact1, fact2, fact3]

    Process measures
    Start by assigning one fact table per business sub

  • Fact Table Primary Key

    [Diagram: Fact Table with three key columns followed by fact1, fact2, fact3]

    Every fact table
      Multi-part primary key added
      Made up of foreign keys referencing dimensions

  • Fact Table Sparsity

    Sparsity
      Term used to describe the very common situation where a fact table does not contain a row for every combination of every dimension table row for a given time period
      Because fact tables contain a very small percentage of all possible combinations, they are said to be "sparsely populated" or "sparse"

  • Fact Table Grain

    Grain
      The level of detail represented by a row in the fact table
      Must be identified early
      Cause of greatest confusion during design process
    Example
      Each row in the fact table represents the daily item sales total

  • Sparsity Example

    Assume
      5,000 rows in 'dealer' dimension
      50 rows in 'model' dimension
    If all dealers sold all models every day:
      5,000 × 50 = 250,000 sales every day
      91,250,000 sales every year
      Assuming only one model sold in every dealer!
    Sparsity
      Means that only a small fraction of the total possible 250,000 will be sold on a given day
      Generally, only record sales, not zeroes, in fact table
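The arithmetic in the sparsity example can be checked with a short sketch. The dealer and model counts come from the slide; the figure of 12,000 actual rows per day and the `sparsity` helper are hypothetical values added only for illustration.

```python
dealers = 5_000
models = 50
days_per_year = 365

possible_per_day = dealers * models                    # 250,000
possible_per_year = possible_per_day * days_per_year   # 91,250,000

def sparsity(actual_rows, possible_rows):
    """Fraction of possible dimension combinations that have no fact row."""
    return 1 - actual_rows / possible_rows

# Hypothetical: 12,000 dealer/model combinations record a sale on a given day.
actual_rows_today = 12_000
print(f"{sparsity(actual_rows_today, possible_per_day):.1%} of combinations are empty")
```

Under these assumed numbers, storing only the rows where a sale occurred keeps the fact table at a small fraction of its theoretical maximum size.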

  • Designing a Star Schema

    Five initial design steps
      Based on Kimball's six steps
      Start designing in order
      Revisit and ad

  • Step One

    1. Identify fact table
       Start by naming the fact table with the name of the business sub

  • Step Two

    2. Identify fact table grain
       Describe what a row in the fact table represents in business terms

  • Step Three

    3. Identify dimensions

  • Step Four

    4. Select facts

  • Step Five

    5. Identify dimensional attributes

  • Exercise 2

    Scenario
      Industry: Automobile manufacturing
      Company: Millennium Motors
      Value chain focus: Sales
    Sample business questions:
      What are the top 10 selling car models this month?
      How do this month's top 10 selling models compare to the top 10 over the last six months?
      Show me dealer sales by region by model by day
      What is the total number of cars sold by month by dealer by state?

  • Exercise 2 - continued

    Using these source data elements, design a star schema that answers the proposed business questions
      Sales revenue
      Quantity sold
      Model name
      Dealer name
      Dealer city
      Product line
      Region where sold
      State
      Vehicle category
      Month
      Date of sales

  • Exercise 2 - sample data

  • Exercise 2 - worksheet

  • Exercise 2 - solution

    Step 1 Fact table name: 'Sale facts'
    Step 2 Fact table grain: Every row in the sales facts table is a summary of car model sales for that day at a single dealer
    Step 3 Dimensions: Time, Model, Dealer
    Step 4 Facts: Total revenue, Quantity sold
    Step 5 Dimensional attributes: See next page

  • Exercise 2 - Dimensional Model

    Sales Facts: model_key, dealer_key, time_key, revenue, quantity
    Model: model_key, category, line, model
    Time: time_key, year, quarter, month, date
    Dealer: dealer_key, region, state, city, dealer
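The Exercise 2 model can be sketched as a small in-memory SQLite database. Column names follow the slide; the `_dim` table suffixes, the column types and all sample rows are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE model_dim  (model_key  INTEGER PRIMARY KEY, category TEXT, line TEXT, model TEXT);
CREATE TABLE dealer_dim (dealer_key INTEGER PRIMARY KEY, region TEXT, state TEXT, city TEXT, dealer TEXT);
CREATE TABLE time_dim   (time_key   INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER, month TEXT, date TEXT);
-- Fact table: multi-part primary key made up of foreign keys to the dimensions.
CREATE TABLE sales_facts (
    model_key  INTEGER NOT NULL REFERENCES model_dim(model_key),
    dealer_key INTEGER NOT NULL REFERENCES dealer_dim(dealer_key),
    time_key   INTEGER NOT NULL REFERENCES time_dim(time_key),
    revenue    REAL,
    quantity   INTEGER,
    PRIMARY KEY (model_key, dealer_key, time_key)
);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO model_dim VALUES (?,?,?,?)",
                 [(1, "Sedan", "Family", "M100"), (2, "Truck", "Utility", "T200")])
conn.executemany("INSERT INTO dealer_dim VALUES (?,?,?,?,?)",
                 [(1, "Northeast", "Massachusetts", "Boston", "Honest Ted's"),
                  (2, "Southwest", "Arizona", "Tucson", "Wright Motors")])
conn.execute("INSERT INTO time_dim VALUES (1, 1997, 1, 'January', '1997-01-15')")
conn.executemany("INSERT INTO sales_facts VALUES (?,?,?,?,?)",
                 [(1, 1, 1, 20000.0, 1), (2, 1, 1, 50000.0, 2), (1, 2, 1, 40000.0, 2)])

# "Show me dealer sales by region by model": each "by" names a dimension attribute.
sales_by_region_model = conn.execute("""
    SELECT d.region, m.model, SUM(f.quantity), SUM(f.revenue)
    FROM sales_facts f
    JOIN dealer_dim d ON d.dealer_key = f.dealer_key
    JOIN model_dim  m ON m.model_key  = f.model_key
    GROUP BY d.region, m.model
    ORDER BY d.region, m.model
""").fetchall()
```

Every business question in the exercise reduces to the same shape: join the fact table to the dimensions named after "by", group on their attributes, and sum the facts.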

  • Q & A

  • Fact Table Details

  • Example Fact Table

    Sales Facts: model_key, dealer_key, time_key, revenue, quantity

  • Example Fact Table Records

    [Table: Sales Facts sample records. Primary key columns: time_key, model_key, dealer_key. Fact columns: revenue, quantity]

  • Facts

    Fully additive
      Can be summed across any and all dimensions
      Stored in fact table
      Examples: revenue, quantity

  • Example: Additive Facts

    Sales Facts: model_key, dealer_key, time_key, revenue, quantity
    Model: model_key, brand, category, line, model
    Time: time_key, year, quarter, month, date
    Dealer: dealer_key, region, state, city, dealer

  • Facts

    Semi-additive
      Can be summed across most dimensions but not all
      Examples: Inventory quantities, account balances, or personnel counts
      Anything that measures a "level"
      Must be careful with ad-hoc reporting
      Often aggregated across the "forbidden dimension" by averaging
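A minimal sketch of semi-additive behaviour, assuming time is the "forbidden dimension" for an inventory level. The snapshot rows and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical daily inventory snapshots: (day, dealer, units on hand)
snapshots = [
    ("1997-01-15", "Honest Ted's", 10),
    ("1997-01-15", "Wright Motors", 20),
    ("1997-01-16", "Honest Ted's", 8),
    ("1997-01-16", "Wright Motors", 22),
]

def inventory_by_day(rows):
    """Summing across dealers is valid: total units on hand for each day."""
    totals = defaultdict(int)
    for day, _dealer, units in rows:
        totals[day] += units
    return dict(totals)

def average_inventory(rows):
    """Across time, average the daily totals instead of summing them."""
    daily = inventory_by_day(rows)
    return sum(daily.values()) / len(daily)

daily_totals = inventory_by_day(snapshots)
period_average = average_inventory(snapshots)
naive_sum = sum(units for _, _, units in snapshots)  # 60: not a meaningful inventory level
```

Here each day holds 30 units in total, so the two-day average is also 30, whereas the naive sum of 60 counts the same cars twice.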

  • Example: Semi-additive Facts

    Sales Facts: model_key, dealer_key, time_key, inventory
    Model: model_key, brand, category, line, model
    Time: time_key, year, quarter, month, date
    Dealer: dealer_key, region, state, city, dealer

  • Facts

    Non-additive
      Cannot be summed across any dimension
      All ratios are non-additive
      Break down to fully additive components, store them in fact table

  • Example: Non-Additive Facts

    Marginrate is non-additive
    Marginrate = marginamt / revenue

    Sales Facts: model_key, dealer_key, time_key, revenue, marginamt
    Time: time_key, year, quarter, month, date
    Model: model_key, brand, category, line, model
    Dealer: dealer_key, region, state, city, dealer
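Why the ratio must be rebuilt from its additive components can be shown with a small sketch. The two fact rows and the function name are hypothetical.

```python
# Hypothetical fact rows: (dealer, revenue, margin_amt)
facts = [
    ("Honest Ted's", 100_000.0, 10_000.0),    # margin rate 10%
    ("Wright Motors", 300_000.0, 90_000.0),   # margin rate 30%
]

def margin_rate(rows):
    """Sum the additive components first, then take the ratio."""
    total_revenue = sum(revenue for _, revenue, _ in rows)
    total_margin = sum(margin for _, _, margin in rows)
    return total_margin / total_revenue

correct = margin_rate(facts)                                        # 100,000 / 400,000
naive_average = sum(m / r for _, r, m in facts) / len(facts)        # averages the stored ratios
```

The ratio of sums gives 25%, while averaging the per-row rates gives about 20%, because the second approach ignores that one dealer has three times the revenue of the other.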

  • Unit Amounts

    Unit price, Unit cost, etc.
      Are numeric, but not measures
      Store the extended amounts which are additive
      Unit amounts may be useful as dimensions for "price point analysis"
      May store unit values to save space

  • Factless Fact Table

    A fact table with no measures in it
      Nothing to measure...
      Except the convergence of dimensional attributes
      Sometimes store a "1" for convenience
      Examples: Attendance, Customer Assignments, Coverage
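A factless fact table for the attendance example might look like the sketch below. The table name, key columns and rows are hypothetical; the point is that counting rows plays the role of a measure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Only dimension keys, no measure columns: each row records that an event occurred.
conn.execute("""
    CREATE TABLE attendance_facts (
        student_key INTEGER NOT NULL,
        class_key   INTEGER NOT NULL,
        time_key    INTEGER NOT NULL,
        PRIMARY KEY (student_key, class_key, time_key)
    )
""")
conn.executemany(
    "INSERT INTO attendance_facts VALUES (?,?,?)",
    [(1, 10, 1), (2, 10, 1), (1, 10, 2), (3, 20, 1)],
)

# Attendance per class is simply the number of rows for that class.
attendance_by_class = conn.execute(
    "SELECT class_key, COUNT(*) FROM attendance_facts GROUP BY class_key ORDER BY class_key"
).fetchall()
```

Storing a constant "1" in an extra column, as the slide mentions, would let the same result be written as a SUM rather than a COUNT.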

  • Q & A

  • Dimension Table Details

  • Example Dimension Tables

    Dealer: dealer_key, region, state, city, dealer
    Model: model_key, brand, category, line, model
    Time: time_key, year, quarter, month, date

  • Example Dimension Table Records

    Time Dimension (time_key is the synthetic key; the other columns are attributes)

    time_key   year   quarter   month     date
    1          1997   1         January   1/15/97
    2          1997   1         January   1/16/97
    3          1997   1         January   1/17/97
    150        1997   2         April     4/1/97
    777        1998   4         October   10/13/98

  • Example Dimension Table Records

    Dealer Dimension (dealer_key is the synthetic key; the other columns are attributes)

    dealer_key   region      state           city        dealer
    1            Northeast   Massachusetts   Boston      Honest Ted's
    2            Northeast   Massachusetts   Boston      Stoller Co.
    3            Southwest   Arizona         Tucson      Wright Motors
    1?           Southwest   California      San Diego   American
    45           Central     Illinois        Chicago     Lugwig Motors

  • Dimension Tables

    Characteristics
      Hold the dimensional attributes
      Usually have a large number of attributes ("wide")
      Add flags and indicators that make it easy to perform specific types of reports
      Have small number of rows in comparison to fact tables (most of the time)

  • Don't Normalize Dimensions

    Saves very little space
    Impacts performance
    Can confuse matters when multiple hierarchies exist
    A star schema with normalized dimensions is called a "snowflake schema"
    Usually advocated by software vendors whose products require snowflake for performance

  • Example Snowflake Schema

    Sales Facts: model_key, dealer_key, time_key, revenue, quantity

    Model: model_key, model, line_key
    Line: line_key, line, category_key
    Category: category_key, category, brand_key
    Brand: brand_key, brand

    Day: date_key, date, month_key
    Month: month_key, month, quarter_key
    Quarter: quarter_key, quarter, year_key
    Year: year_key, year

    Dealer: dealer_key, dealer, city_key
    City: city_key, city, state_key
    State: state_key, state, region_key
    Region: region_key, region

  • Slowly Changing Dimensions

    Dimension source data may change over time
    Relative to fact tables, dimension records change slowly
    Allows dimensions to have multiple 'profiles' over time to maintain history
    Each profile is a separate record in a dimension table

  • Slowly Changing Dimension Example

    Example: A woman gets married
      Possible changes to customer dimension
        Last Name
        Marriage Status
        Address
        Household Income
      Existing facts need to remain associated with her single profile
      New facts need to be associated with her married profile

  • Slowly Changing Dimension Types

    Three types of slowly changing dimensions
      Type 1
        Updates existing record with modifications
        Does not maintain history
      Type 2
        Adds new record
        Does maintain history
        Maintains old record
      Type 3
        Keep old and new values in the existing row
        Requires a design change

  • Designing Loads to Handle SCD

    Design and implementation guidelines
      Gather SCD requirements when designing data mapping and loading
      SCD needs to be defined and implemented at the dimensional attribute level
      Each column in a dimension table needs to be identified as a Type 1 or a Type 2 SCD
      If one Type 1 column changes, then all Type 1 columns will be updated
      If one Type 2 column changes, then a new record will be inserted into the dimension table

  • Designing Loads to Handle SCD

    Design and implementation guidelines
      For large dimension tables, change data capture techniques may be used to minimize the data volume
      For smaller dimension tables, compare all OLTP records with dimension table records
      Balance data volume with change data capture logic complexities

  • Designing Loads to Handle SCD: Type 1

    Type 1 example: a woman gets married

    Customer Dimension Table
    Column Name      SCD Type
    Customer Key     NA
    Customer ID      1
    Name             1
    Marital Status   1
    Home Income      1

  • Type 1 Example

    Before

    OLTP
      Customer OLTP
        CustID   Name        Marital Status   Home Income
        123      Sue Jones   S                $30K

    Star Schema
      Customer Dim
        CustKey   CustID   Name        Marital Status   Home Income   Status
        1         123      Sue Jones   S                $30K          0
      Day Dim
        DayKey   Business Date
        1        1/31/01
      Sales Facts
        CustKey   DayKey   Sales
        1         1        $40

    After: Sue Gets Married 2/1/01

    OLTP
      Customer OLTP
        CustID   Name        Marital Status   Home Income
        123      Sue Smith   M                $70K

    Star Schema
      Customer Dim
        CustKey   CustID   Name        Marital Status   Home Income   Status
        1         123      Sue Smith   M                $70K          0
      Day Dim
        DayKey   Business Date
        1        1/31/01
        2        2/01/01
      Sales Facts
        CustKey   DayKey   Sales
        1         1        $40
        1         2        $50

  • Type 1 Example

    Observations
      Customer history is not maintained in the OLTP system
      Customer history is not maintained in the star schema
      Sue only has one customer 'profile' in customer dimension table
      Sue's sales facts across all history are associated with her married profile
      Sales facts that were associated with Sue's single profile have been lost

  • Designing Loads to Handle SCD: Type 2

    Type 2 example: a woman gets married

    Customer Dimension Table
    Column Name      SCD Type
    Customer Key     NA
    Customer ID      2
    Name             2
    Marital Status   2
    Home Income      1

  • Type 2 Example

    Before

    OLTP
      Customer OLTP
        CustID   Name        Marital Status   Home Income
        123      Sue Jones   S                $30K

    Star Schema
      Customer Dim
        CustKey   CustID   Name        Marital Status   Home Income   Status
        1         123      Sue Jones   S                $30K          0
      Day Dim
        DayKey   Business Date
        1        1/31/01
      Sales Facts
        CustKey   DayKey   Sales
        1         1        $40

    After: Sue Gets Married 2/1/01

    OLTP
      Customer OLTP
        CustID   Name        Marital Status   Home Income
        123      Sue Smith   M                $70K

    Star Schema
      Customer Dim
        CustKey   CustID   Name        Marital Status   Home Income   Status
        1         123      Sue Jones   S                $30K          1
        2         123      Sue Smith   M                $70K          0
      Day Dim
        DayKey   Business Date
        1        1/31/01
        2        2/01/01
      Sales Facts
        CustKey   DayKey   Sales
        1         1        $40
        2         2        $50

  • Type 2 Example

    Type 2 Observations
      Customer history is not maintained in the OLTP system
      Customer history is maintained in the star schema
      Sue has two 'profiles' in the customer dimension
      Sue's sales facts may be analyzed for when she was single, when she was married, and across all history by using the customer id field
      Home income was updated in the new profile record
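The column-level Type 1 and Type 2 rules can be sketched as a single load function. This is an illustrative sketch under stated assumptions, not the deck's own load design: the `is_current` flag, the column names and the in-memory list standing in for the dimension table are all hypothetical.

```python
TYPE1_COLUMNS = {"home_income"}               # overwrite in place, no history
TYPE2_COLUMNS = {"name", "marital_status"}    # a change creates a new profile row

def apply_scd(dimension, source):
    """Apply one OLTP customer record to the customer dimension (mutated in place)."""
    current = next(
        (row for row in dimension
         if row["cust_id"] == source["cust_id"] and row["is_current"]),
        None,
    )
    if current is None or any(current[c] != source[c] for c in TYPE2_COLUMNS):
        if current is not None:
            current["is_current"] = False      # old profile is kept for history
        new_key = max((row["cust_key"] for row in dimension), default=0) + 1
        dimension.append({"cust_key": new_key, "is_current": True, **source})
    elif any(current[c] != source[c] for c in TYPE1_COLUMNS):
        for c in TYPE1_COLUMNS:
            current[c] = source[c]             # all Type 1 columns are refreshed together
    return dimension

customer_dim = []
apply_scd(customer_dim, {"cust_id": 123, "name": "Sue Jones",
                         "marital_status": "S", "home_income": "$30K"})
apply_scd(customer_dim, {"cust_id": 123, "name": "Sue Smith",
                         "marital_status": "M", "home_income": "$70K"})
```

After the second call the dimension holds two profiles for customer 123, each with its own synthetic key, so facts loaded before and after the marriage stay attached to the right profile while `cust_id` still ties them together.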

  • Slowly Changing Dimension Advice

    'When in doubt, design type 2'

  • Degenerate Dimensions

    Dimensions with no other place to go
    Stored in the fact table
    Are not facts
    Common examples include invoice numbers or order numbers

  • Drilling

    [Figure: Quarterly Auto Sales Summary. First view: Units Sold and Revenue by Region (Northeast, Southeast, Central, Northwest, Southwest). Second view: Units Sold and Revenue by Region and State (Northeast: Maine, New York, Massachusetts; Southeast: Florida, Georgia, Virginia)]

    Drilling down
      Adding dimensional detail
      Further breaks out a measure in some way
      Has nothing to do with a hierarchy!

  • Drilling

    [Figure: Quarterly Auto Sales Summary. First view: Units Sold and Revenue by Region and State (Northeast: Maine, New York, Massachusetts; Southeast: Florida, Georgia, Virginia). Second view: Units Sold and Revenue by Region (Northeast, Southeast, Central, Northwest, Southwest)]

    Rolling up
      Removing dimensional detail
      Rolls up a measure
      Has nothing to do with how you drilled down
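Drilling down and rolling up amount to adding or removing dimension attributes from the grouping. The sketch below assumes hypothetical unit counts; only the region and state names come from the slides.

```python
from collections import defaultdict

# Hypothetical quarterly sales rows: (region, state, units_sold)
sales = [
    ("Northeast", "Maine", 10),
    ("Northeast", "New York", 40),
    ("Northeast", "Massachusetts", 30),
    ("Southeast", "Florida", 25),
    ("Southeast", "Georgia", 15),
]

COLUMN_INDEX = {"region": 0, "state": 1}

def summarize(rows, dimensions):
    """Total units sold, grouped by whichever dimension attributes are requested."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[COLUMN_INDEX[d]] for d in dimensions)
        totals[key] += row[2]
    return dict(totals)

by_region = summarize(sales, ["region"])                  # rolled up
by_region_state = summarize(sales, ["region", "state"])   # drilled down: state added
by_state = summarize(sales, ["state"])                    # any attribute works, no hierarchy needed
```

Grouping by state alone is just as valid as grouping by region then state, which is the sense in which drilling has nothing to do with a hierarchy.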

  • Drilling

    Drilling across
      A query that involves more than one fact table
      Not necessarily an action that changes how a user is looking at the data
      Best resolved by multiple SQL passes
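The multi-pass approach can be sketched by querying each fact table separately and merging the results on the shared dimension value. The two result sets below are hypothetical stand-ins for the output of two separate SQL passes, and `drill_across` is an illustrative helper.

```python
def drill_across(*passes):
    """Merge per-fact-table results on their shared dimension value."""
    keys = sorted(set().union(*(p.keys() for p in passes)))
    # A value missing from one pass shows up as None rather than dropping the row.
    return [(key, *(p.get(key) for p in passes)) for key in keys]

units_sold = {"M100": 120, "T200": 45}        # pass 1: sales fact table, by model
units_in_stock = {"M100": 30, "X300": 12}     # pass 2: inventory fact table, by model

report = drill_across(units_sold, units_in_stock)
```

Merging after aggregation avoids joining two fact tables directly, which would multiply rows whenever a model has more than one row in either table.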

  • Q & A

  • Dimensional Design Process

  • Data Mart Development

    [Diagram: Design Phase, Development Phase, Deployment Phase]

    Dimensional modeling is a critical part of the data mart development effort

  • Data Mart Development

    Design phase
      Determine requirements and design schema
    Development phase
      Iterative build and feedback
    Deployment phase
      Automate load, document, train users

  • Project Deliverables

    Design ro

  • Project Approach

    [Diagram: Design Phase, Development Phase, Deployment Phase]

    The dimensional model is developed during the design stage
    Scope of the pro

  • Design Stage Activities

    [Diagram: Design Phase, Development Phase, Deployment Phase]

    Gather requirements through requirements workshops
    Develop star schema
    Conduct design review

  • Gather Requirements

    Requirements definition
      User workshops
      Spreadsheets
      Sample reports
    Source systems analysis
      DBA interviews
      Copybooks
      ER diagrams

  • Design Deliverables

    Deliverables
      The star schema itself
      Load mapping document
    How these primary components are delivered will depend on needs and format chosen
      Modeling tools
      Spreadsheets
      Text documents

  • Notation

    No recognized standard
    ER semantics unnecessary
    Clarity is the only characteristic that really matters

  • 7/25/2019 Dimensional Modeling.ppt

    100/182

    100

    Notation Example

    IDEF1X
      Dependent entities: fact tables
      Independent entities: dimension tables

    [Diagram: Sales Facts (time_key, model_key, dealer_key) related to Time (time_key), Model (model_key), and Dealer (dealer_key)]

  • 7/25/2019 Dimensional Modeling.ppt

    101/182

    101

    Notation Example

    Martin IE
      Entities: fact or dimension tables
      Attributes not shown

    [Diagram: Sales Facts related to Time, Dealer, and Model]

  • 7/25/2019 Dimensional Modeling.ppt

    102/182

    102

    Notation Example

    Kimball
      Simple structure
      Cardinality implied

    [Diagram: Sales Facts (time_key, model_key, dealer_key) surrounded by Time (time_key), Model (model_key), and Dealer (dealer_key)]

  • 7/25/2019 Dimensional Modeling.ppt

    103/182

    103

    Design Naming Standards

    Responsibility of data administration
    Extended to the data warehouse
    Important to start early in the pro

  • 7/25/2019 Dimensional Modeling.ppt

    104/182

    104

    Data Element Definitions

    Clear descriptions
      Facts
        Calculated formulae
      Dimensional attributes

    Multiple meanings/synonymous terms
      Aliases

  • 7/25/2019 Dimensional Modeling.ppt

    105/182

    105

    Data Element Instances

    Example of data
      As it will exist in the warehouse
      After decoding

    Adds to model understanding
    Removes ambiguity/uncertainty

  • 7/25/2019 Dimensional Modeling.ppt

    106/182

    106

    Data Element Mapping

    Where is the data coming from?
      Source system
      Table
      Column
      Record
      Field

  • 7/25/2019 Dimensional Modeling.ppt

    107/182

    107

    Data Transformation

    Changing the data
    Serves as spec for ETL process
      Decodes
      Type conversion
      Conditional logic
      Handling of NULLs
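
    The four transformation types listed on this slide can be sketched as one row-level function. This is only an illustration: the column names (status_cd, order_dt, qty), the decode table, and the 100-unit threshold are hypothetical and do not come from the course material.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical decode table: cryptic source codes -> readable warehouse values.
STATUS_DECODES = {"A": "Active", "I": "Inactive"}


def transform_row(source: dict) -> dict:
    """Apply one illustrative rule of each transformation type to a source row."""
    # Decode: replace the source code with descriptive text.
    status = STATUS_DECODES.get(source.get("status_cd"), "Unknown")

    # Type conversion: the assumed source stores dates as YYYYMMDD strings.
    raw_date = source.get("order_dt")
    order_date: Optional[date] = (
        datetime.strptime(raw_date, "%Y%m%d").date() if raw_date else None
    )

    # Handling of NULLs: a missing quantity is loaded as 0 instead of NULL.
    quantity = int(source["qty"]) if source.get("qty") is not None else 0

    # Conditional logic: derive a descriptive band from the quantity.
    order_size = "Bulk" if quantity >= 100 else "Standard"

    return {
        "status": status,
        "order_date": order_date,
        "quantity": quantity,
        "order_size": order_size,
    }


loaded = transform_row({"status_cd": "A", "order_dt": "20190725", "qty": "120"})
missing = transform_row({"status_cd": "Z", "order_dt": None, "qty": None})
```

    Writing each rule next to the target column it feeds mirrors how a load mapping document can double as the ETL specification.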

  • 7/25/2019 Dimensional Modeling.ppt

    108/182

    108

    Q & A

  • 7/25/2019 Dimensional Modeling.ppt

    109/182

    109

    Aggregate Schemas

  • 7/25/2019 Dimensional Modeling.ppt

    110/182

    110

    Aggregate Designs

    Aggregates
      Prestored fact summaries
      Along one or more dimensions
      The most effective tool for improving performance

    Examples
      Summary of sales by region, by product, by category
      Monthly sales

  • 7/25/2019 Dimensional Modeling.ppt

    111/182

    111

    Aggregate Background

    Aggregate rationale
      Improve end user query performance
      Reduce required CPU cycles
      Powerful cost saving tool

    Restrictions
      Additive facts only
      Must use dimensional design

  • 7/25/2019 Dimensional Modeling.ppt

    112/182

    112

    Aggregate Guidelines

    Don't start with aggregates
    Design and build based on usage
    Sooner or later you'll need to build aggregates

  • 7/25/2019 Dimensional Modeling.ppt

    113/182

    113

    Aggregate Types

    Level field
    Separate fact tables

  • 7/25/2019 Dimensional Modeling.ppt

    114/182

    114

    Aggregate Types

    Level field
      Old technique
      Requires "level" attribute in appropriate dimensions
      Aggregates and base-level facts stored in same table
      Same number of total fact records as separate table approach

    Drawbacks
      Every query must constrain on the level field
      Possibility of double counting
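
    The double-counting drawback can be reproduced with a few rows. The sketch below uses Python's built-in sqlite3 module; the table layout, the product names, and the amounts are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_key INTEGER PRIMARY KEY, level TEXT,
                      category TEXT, product TEXT);
CREATE TABLE sales_facts (product_key INTEGER, amount REAL);

-- Base-level dimension rows (level = 'Product').
INSERT INTO product VALUES (1, 'Product', 'Soda', 'Cola');
INSERT INTO product VALUES (2, 'Product', 'Soda', 'Diet Cola');
-- Aggregate dimension row (level = 'Category'); product is not applicable.
INSERT INTO product VALUES (3, 'Category', 'Soda', 'N.A.');

-- Base facts (60 + 40) and the prestored category summary (100) share a table.
INSERT INTO sales_facts VALUES (1, 60.0);
INSERT INTO sales_facts VALUES (2, 40.0);
INSERT INTO sales_facts VALUES (3, 100.0);
""")


def total(where: str) -> float:
    """Sum the amount for the rows selected by a WHERE clause on product."""
    sql = ("SELECT SUM(f.amount) FROM sales_facts f "
           "JOIN product p ON p.product_key = f.product_key WHERE " + where)
    return conn.execute(sql).fetchone()[0]


# Forgetting the level constraint adds base rows and the summary row together.
unconstrained = total("p.category = 'Soda'")
base_only = total("p.category = 'Soda' AND p.level = 'Product'")
aggregate_only = total("p.category = 'Soda' AND p.level = 'Category'")
```

    Here the unconstrained query reports twice the real sales, while either constrained query reports the correct figure.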

  • 7/25/2019 Dimensional Modeling.ppt

    115/182

    115

    Level Field

    [Diagram: Product Sales Facts (time_key, product_key, market_key, Quantity, Amount) joined to Time (time_key, Level, Year, Fiscal Period, Month, Day, Day of Week), Product (product_key, Level, Category, Brand, Product, Diet Indicator), and Market (market_key, Region, District, State, City)]

    Example: Level = Category
    N.A. for appropriate attributes

  • 7/25/2019 Dimensional Modeling.ppt

    116/182

    116

    Aggregate Types

    Separate tables
      Separate fact table for every aggregate
      Separate dimension table for every aggregate dimension
      Same number of fact records as level field tables

    Advantage
      Removes possibility of double counting
      Schema clarity

    Caveat
      Requires software with aggregate navigation capability

  • 7/25/2019 Dimensional Modeling.ppt

    117/182

    117

    Separate Tables

    One-Way Aggregate

    [Diagram: base star Sales Facts (time_key, product_key, market_key, Quantity, Amount) with Time (time_key, Year, Fiscal Period, Month, Day, Day of Week), Product (product_key, Category, Brand, Product, Diet Indicator), and Market (market_key, Region, District, State, City); aggregate star Mthly Sales Facts Agg (month_key, product_key, market_key, Quantity, Amount) with Month (month_key, Year, Fiscal Period, Month), sharing Product and Market]

  • 7/25/2019 Dimensional Modeling.ppt

    118/182

    118

    Separate Tables

    Two-Way Aggregate

    [Diagram: base star Sales Facts (time_key, product_key, market_key, Quantity, Amount) with Time (time_key, Year, Fiscal Period, Month, Day, Day of Week), Product (product_key, Category, Brand, Product, Diet Indicator), and Market (market_key, Region, District, State, City); aggregate star Mnthly Cat Sales Facts Agg (month_key, category_key, market_key, Quantity, Amount) with Month (month_key, Year, Fiscal Period, Month) and Category (category_key, Category), sharing Market]

  • 7/25/2019 Dimensional Modeling.ppt

    119/182

    119

    Aggregate Pitfalls

    Sparsity failure
      Term used to describe the result of building too many aggregate fact tables that do not summarize enough rows.
      When sparsity failure occurs, a relatively small star schema can grow (in terms of disk size) thousands of times.
      Sparsity failure = aggregate explosion

  • 7/25/2019 Dimensional Modeling.ppt

    120/182

    120

    Aggregate Design Guidelines

    Rule of twenty
      To avoid aggregate explosion
      Make sure each aggregate record summarizes 20 or more lower-level records

    Remember
      Total number of possible fact tables in any given dimensional model = Cartesian product of all levels in all the dimensions
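
    Both guidelines reduce to simple arithmetic. The sketch below assumes two hypothetical hierarchies with made-up member counts, counts an extra "all" level per dimension (an assumption), and treats the ratio of member counts as an upper bound on how many lower-level records each aggregate record summarizes.

```python
from math import prod

# Assumed number of distinct members at each level, finest level first.
HIERARCHIES = {
    "time": {"date": 1825, "month": 60, "quarter": 20, "year": 5},
    "product": {"product": 1000, "brand": 50, "category": 5},
}


def possible_fact_tables(hierarchies: dict) -> int:
    """Count level combinations, adding an 'all' level to each dimension."""
    return prod(len(levels) + 1 for levels in hierarchies.values())


def compression(levels: dict, from_level: str, to_level: str) -> float:
    """Upper bound on lower-level members summarized per aggregate member."""
    return levels[from_level] / levels[to_level]


def passes_rule_of_twenty(levels: dict, from_level: str, to_level: str) -> bool:
    """True when rolling up is expected to summarize 20 or more records."""
    return compression(levels, from_level, to_level) >= 20


combinations = possible_fact_tables(HIERARCHIES)  # (4 + 1) * (3 + 1)
date_to_month = passes_rule_of_twenty(HIERARCHIES["time"], "date", "month")
month_to_quarter = passes_rule_of_twenty(HIERARCHIES["time"], "month", "quarter")
```

    With these figures a date-to-month rollup clears the threshold, while a month-to-quarter rollup summarizes only about three records each and would be a candidate to skip.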

  • 7/25/2019 Dimensional Modeling.ppt

    121/182

    121

    Hierarchies & Aggregate Design

    Time
      Year (1)
      Quarter (4)
      Month (12)
      Date (365)

    5 years
    20 quarters
    60 months
    1825 days

    Hierarchy diagram
      Helps visualize options for building aggregates
      Adding cardinalities ensures following the rule of 20
      Not required to build initial star schema

  • 7/25/2019 Dimensional Modeling.ppt

    122/182

    122

    Aggregate Navigation

    Description
      Function provided by software layer: Aggregate Navigator
      Directs user queries to the most favorable available aggregate
      Transparent to the end user
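
    One way to picture an aggregate navigator is as a lookup that picks the smallest table able to answer a request. The sketch below is a simplified assumption, not how any particular product works; the table names, row counts, and attribute sets are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactTable:
    name: str
    row_count: int
    attributes: frozenset  # dimension attributes reachable from this table


# Hypothetical catalog: one base fact table and two aggregates.
TABLES = [
    FactTable("sales_facts", 1_000_000,
              frozenset({"day", "month", "year", "product", "brand", "category"})),
    FactTable("mthly_sales_facts_agg", 50_000,
              frozenset({"month", "year", "product", "brand", "category"})),
    FactTable("mthly_cat_sales_facts_agg", 2_000,
              frozenset({"month", "year", "category"})),
]


def navigate(requested: set, tables: list = TABLES) -> str:
    """Return the smallest table whose attributes cover the request."""
    candidates = [t for t in tables if requested <= t.attributes]
    if not candidates:
        raise ValueError(f"no table can answer {sorted(requested)}")
    return min(candidates, key=lambda t: t.row_count).name


by_month_category = navigate({"month", "category"})
by_month_brand = navigate({"month", "brand"})
by_day_category = navigate({"day", "category"})
```

    The user asks only for attributes; which physical table is read stays hidden, which is what keeps aggregates transparent.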

  • 7/25/2019 Dimensional Modeling.ppt

    123/182

    123

    Aggregate Framework

    [Diagram: Business View and Designer View]

  • 7/25/2019 Dimensional Modeling.ppt

    124/182

    124

    Aggregate Architecture

    [Diagram: three configurations of Client PC, Application Server, and RDBMS, labeled to show where SQL becomes Aggregate Aware SQL]

  • 7/25/2019 Dimensional Modeling.ppt

    125/182

    125

    Aggregate Deployment

    Incremental
    Based on usage
    Transparent to users
    Typically warehouse DBA responsibility

  • 7/25/2019 Dimensional Modeling.ppt

    126/182

    126

    Aggregate Deployment

    Build Sub

  • 7/25/2019 Dimensional Modeling.ppt

    127/182

    127

    Exercise 3

    Scenario
      Given the original star schema and the following hierarchy, design a two-way aggregate table structure that will drastically increase performance
      Make your own assumptions about summary levels

  • 7/25/2019 Dimensional Modeling.ppt

    128/182

    128

    Exercise 3: Dimensional Model

    [Diagram: Sales Facts (model_key, dealer_key, time_key, revenue, quantity) joined to Model (model_key, category, line, model), Time (time_key, year, quarter, month, date), and Dealer (dealer_key, region, state, city, dealer)]

  • 7/25/2019 Dimensional Modeling.ppt

    129/182

    129

    Exercise 3

    Scenario
      Industry: Automobile manufacturing
      Company: Millennium Motors
      Value chain focus: Sales

    Sample business questions:
      What are the top 10 selling car models this month?
      How do this month's top 10 selling models compare to the top 10 over the last six months?
      Show me dealer sales by region by model by day
      What is the total number of cars sold by month by dealer by state?

  • 7/25/2019 Dimensional Modeling.ppt

    130/182

    130

    Exercise 3

    Millennium Motors dimensions

    [Hierarchy diagram with level cardinalities. Model: All, Category, Line, Model name. Time: All, Year, Quarter, Month, Date. Dealer: All, Region, State, City, Dealer name.]

  • 7/25/2019 Dimensional Modeling.ppt

    131/182

    131

    Exercise 3 Worksheet

  • 7/25/2019 Dimensional Modeling.ppt

    132/182

    132

    Exercise 3 Solution

    [Diagram: base star Sales Facts (model_key, dealer_key, time_key, revenue, quantity) with Model (model_key, category, line, model), Time (time_key, year, quarter, month, date), and Dealer (dealer_key, region, state, city, dealer); aggregate star Agg Sales Facts (state_key, month_key, model_key, revenue, quantity) with Month (month_key, year, quarter, month), State (state_key, region, state), and the shared Model dimension]
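
    A two-way aggregate like the one in this solution could be built with plain SQL. The sketch below uses Python's built-in sqlite3 module; table names such as time_dim, month_dim, and state_dim, and all sample rows, are assumptions for illustration, and the Model dimension table is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, year INT, quarter INT,
                       month INT, date TEXT);
CREATE TABLE dealer (dealer_key INTEGER PRIMARY KEY, region TEXT, state TEXT,
                     city TEXT, dealer TEXT);
CREATE TABLE sales_facts (model_key INT, dealer_key INT, time_key INT,
                          revenue REAL, quantity INT);

INSERT INTO time_dim VALUES (1, 2019, 3, 7, '2019-07-24');
INSERT INTO time_dim VALUES (2, 2019, 3, 7, '2019-07-25');
INSERT INTO time_dim VALUES (3, 2019, 3, 8, '2019-08-01');
INSERT INTO dealer VALUES (1, 'West', 'CA', 'Fresno', 'Dealer A');
INSERT INTO dealer VALUES (2, 'West', 'CA', 'Oakland', 'Dealer B');
INSERT INTO dealer VALUES (3, 'West', 'OR', 'Salem', 'Dealer C');
INSERT INTO sales_facts VALUES (1, 1, 1, 20000, 1);
INSERT INTO sales_facts VALUES (1, 2, 2, 21000, 1);
INSERT INTO sales_facts VALUES (1, 3, 2, 19000, 1);
INSERT INTO sales_facts VALUES (1, 1, 3, 20500, 1);

-- Rolled-up dimensions: one row per month and one row per state.
CREATE TABLE month_dim (month_key INTEGER PRIMARY KEY, year INT, quarter INT,
                        month INT);
INSERT INTO month_dim (year, quarter, month)
    SELECT DISTINCT year, quarter, month FROM time_dim;
CREATE TABLE state_dim (state_key INTEGER PRIMARY KEY, region TEXT, state TEXT);
INSERT INTO state_dim (region, state)
    SELECT DISTINCT region, state FROM dealer;

-- Two-way aggregate: summarized along both the time and dealer dimensions.
CREATE TABLE agg_sales_facts AS
SELECT s.state_key, m.month_key, f.model_key,
       SUM(f.revenue) AS revenue, SUM(f.quantity) AS quantity
FROM sales_facts f
JOIN time_dim t ON t.time_key = f.time_key
JOIN month_dim m ON m.year = t.year AND m.month = t.month
JOIN dealer d ON d.dealer_key = f.dealer_key
JOIN state_dim s ON s.region = d.region AND s.state = d.state
GROUP BY s.state_key, m.month_key, f.model_key;
""")

agg_rows = conn.execute("SELECT COUNT(*) FROM agg_sales_facts").fetchone()[0]
base_revenue = conn.execute("SELECT SUM(revenue) FROM sales_facts").fetchone()[0]
agg_revenue = conn.execute("SELECT SUM(revenue) FROM agg_sales_facts").fetchone()[0]
ca_july_quantity = conn.execute("""
    SELECT a.quantity FROM agg_sales_facts a
    JOIN state_dim s ON s.state_key = a.state_key
    JOIN month_dim m ON m.month_key = a.month_key
    WHERE s.state = 'CA' AND m.month = 7
""").fetchone()[0]
```

    Because revenue and quantity are additive, totals from the aggregate match totals from the base table. The toy data is far too small to satisfy the rule of twenty; it only shows the mechanics.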

  • 7/25/2019 Dimensional Modeling.ppt

    133/182

    133

    Q & A

  • 7/25/2019 Dimensional Modeling.ppt

    134/182

    134

    Multiple Fact Tables

  • 7/25/2019 Dimensional Modeling.ppt

    135/182

    135

    Multiple Fact Tables

    Different business processes usually require different fact tables

    There are also several cases where a single business process will require multiple fact tables
      Core and custom
      Snapshot and transaction
      Coverage
      Aggregates

  • 7/25/2019 Dimensional Modeling.ppt

    136/182

    136

    Different Business Processes

    Different business processes usually require different fact tables

    In practice, it may be hard to identify what a "process" is

    Sometimes you can spot different processes because measures are recorded
      With different dimensions
      At differing grains

  • 7/25/2019 Dimensional Modeling.ppt

    137/182

    137

    Different Dimensions or Grain

    [Diagram: Shipment Facts (time_key, product_key, shipper_key, market_key, Quantity, Weight) and Sales Facts (time_key, product_key, market_key, Quantity, Amount), with dimensions Product (product_key, Category, Brand, Product, Diet Indicator), Shipper (shipper_key, name, type, mode, address), Time (time_key, Year, Fiscal Period, Month, Day, Day of Week), and Market (market_key, Region, District, State, City)]

  • 7/25/2019 Dimensional Modeling.ppt

    138/182

    138

    Different Dimensions or Grain

    Don't take shortcuts with grain

    The "not applicable" dimension value
      Using a "not applicable" row in a dimension confuses the grain and can introduce reporting difficulty

  • 7/25/2019 Dimensional Modeling.ppt

    139/182

    139

    Different Points in Time

    Sometimes, it is not easy to identify the discrete business processes

    All measures may have the same dimensionality or grain

    Different measures are recorded at different times
      Quantity sold is not recorded at the same time as quantity shipped

  • 7/25/2019 Dimensional Modeling.ppt

    140/182

    140

    Different Timing

    Building a single fact table would require recording zero or null for measures that are not applicable at a point in time

    Reports would contain a confusing combination of zeros, nulls, and absence of data

  • 7/25/2019 Dimensional Modeling.ppt

    141/182

    141

    Different Timing: One Fact Table

    [Diagram: Sales and Shipment Facts (time_key, product_key, market_key, Quantity sold, Amount sold, Quantity shipped, Amount shipped), annotated "Initially will be null", with Time (time_key, Year, Fiscal Period, Month, Day, Day of Week), Market (market_key, Region, District, State, City), and Product (product_key, Category, Brand, Product, Diet Indicator)]

  • 7/25/2019 Dimensional Modeling.ppt

    142/182

    142

    Different Timing: Two Fact Tables

    [Diagram: Shipment Facts (time_key, product_key, market_key, Quantity, Amount) and Sales Facts (time_key, product_key, market_key, Quantity, Amount), sharing Product (product_key, Category, Brand, Product, Diet Indicator), Market (market_key, Region, District, State, City), and Time (time_key, Year, Fiscal Period, Month, Day, Day of Week)]

  • 7/25/2019 Dimensional Modeling.ppt

    143/182

    143

    Identifying Different Processes

    Look at the measures in question
    Sort them into fact tables based on
      Dimensions
      Grain
      Differing timings of events measured

  • 7/25/2019 Dimensional Modeling.ppt

    144/182

    144

    One Process, Multiple Fact Tables

    Core and custom
    Coverage
    Snapshot and transaction
    Aggregates

  • 7/25/2019 Dimensional Modeling.ppt

    145/182

    145

    Core and Custom Schemas

    There is a set of dimension attributes and measures shared in all cases
    Depending on the value in a dimension, certain extra dimension attributes or measures are recorded
      Heterogeneous products
      Types of customers

  • 7/25/2019 Dimensional Modeling.ppt

    146/182

    146

    Core and Custom

    [Diagram: core star Account Facts (time_key, product_key, branch_key, customer_key, Balance, Transaction count) with Product (product_key, ...), Customer (customer_key, ...), Time (time_key, ...), and Branch (branch_key, ...); custom star Checking Account Facts (time_key, checking_key, branch_key, customer_key, Balance, Transaction count, ...custom checking facts) with Checking Account (checking_key, ...custom checking attributes)]

  • 7/25/2019 Dimensional Modeling.ppt

    147/182

    147

    Core and Custom

    Core fact table and dimensions
      All attributes shared no matter what
      Appropriate for analysis across entire sub

  • 7/25/2019 Dimensional Modeling.ppt

    148/182

    148

    A star schema usually measures events that happen
    Relationships between the dimensions involved are not captured if events do not happen
    A coverage table fills the gap
      What did not sell that was on promotion?
      Who was assigned to that customer?
    Usually "factless"

  • 7/25/2019 Dimensional Modeling.ppt

    149/182

    149

    Measuring What Happened

    [Diagram: Sales Facts (time_key, product_key, customer_key, rep_key, quantity, sales dollars) with Product (product_key, Category, Brand, Product, SKU), Customer (customer_key, Name, Company, Account, Phone num), Time (time_key, Year, Fiscal Period, Month, Day, Day of Week), and Sales_rep (rep_key, rep name, rep phone, Region, District, State, City)]

    Sales facts does not reveal who is assigned to a customer if they do not sell

  • 7/25/2019 Dimensional Modeling.ppt

    150/182

    150

    Coverage Table

    Customer coverage facts shows who is assigned to a customer at a point in time

    [Diagram: Customer Coverage Facts (time_key, customer_key, rep_key) with Customer (customer_key, Name, Company, Account, Phone num), Time (time_key, Year, Fiscal Period, Month, Day, Day of Week), and Sales_rep (rep_key, rep name, rep phone, Region, District, State, City)]
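
    A factless coverage table becomes useful when it is compared with the fact table that records events. The sketch below, using Python's built-in sqlite3 module, finds assignments that produced no sale; the keys and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Factless coverage table: who is assigned to whom, and when.
CREATE TABLE customer_coverage_facts (time_key INT, customer_key INT, rep_key INT);
-- Event-based fact table: rows exist only when a sale happened.
CREATE TABLE sales_facts (time_key INT, product_key INT, customer_key INT,
                          rep_key INT, quantity INT, sales_dollars REAL);

INSERT INTO customer_coverage_facts VALUES (1, 10, 100);
INSERT INTO customer_coverage_facts VALUES (1, 11, 100);
INSERT INTO customer_coverage_facts VALUES (1, 12, 101);
INSERT INTO sales_facts VALUES (1, 500, 10, 100, 2, 50.0);
""")

# Assignments with no matching sale: the "what did not happen" question.
assigned_without_sale = conn.execute("""
    SELECT c.customer_key, c.rep_key
    FROM customer_coverage_facts c
    LEFT JOIN sales_facts s
           ON s.time_key = c.time_key
          AND s.customer_key = c.customer_key
          AND s.rep_key = c.rep_key
    WHERE s.customer_key IS NULL
    ORDER BY c.customer_key
""").fetchall()
```

    The sales fact table alone could never return customers 11 and 12, because no sales rows exist for them.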

  • 7/25/2019 Dimensional Modeling.ppt

    151/182

    151

    Snapshot and Transaction

    Viewing a single process multiple ways
      Transactions: The changes to what is being measured
      Snapshot: The status at a point in time

    Example
      Changes to inventory
      Current status of inventory

  • 7/25/2019 Dimensional Modeling.ppt

    152/182

    152

    Snapshot

    How much is on hand today?
    How much was on hand yesterday?

    [Diagram: Inventory Snapshot (time_key, product_key, location_key, quantity on hand) with Time (time_key, Year, Fiscal Period, Month, Day, Day of Week), Product (product_key, Category, Brand, Product, SKU), and Location (location_key, Warehouse, WH code, City, State)]

  • 7/25/2019 Dimensional Modeling.ppt

    153/182

    153

    Transaction

    How did inventory change today?
    How much product was returned due to failed inspection?

    [Diagram: Inventory Transactions (time_key, product_key, location_key, transaction_type_key, transaction amount) with Product (product_key, Category, Brand, Product, SKU), Location (location_key, Warehouse, WH code, City, State), Time (time_key, Year, Fiscal Period, Month, Day, Day of Week), and Transaction_type (transaction_type_key, transaction type code, transaction type, transaction category)]
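
    The relationship between the two views can be sketched by deriving daily snapshot rows from transaction rows with a running total. The product, warehouse, and amounts below are made up, and the starting balance of zero is an assumption.

```python
from collections import defaultdict

# Transaction view: (day, product, location, signed change in quantity).
transactions = [
    (1, "widget", "WH1", +100),  # receipt
    (1, "widget", "WH1", -30),   # shipment
    (2, "widget", "WH1", -20),   # shipment
    (3, "widget", "WH1", +5),    # customer return
]


def build_snapshots(transactions: list, days: list) -> dict:
    """Derive quantity on hand per (day, product, location) from changes."""
    change = defaultdict(int)
    for day, product, location, amount in transactions:
        change[(day, product, location)] += amount

    items = sorted({(product, location) for _, product, location, _ in transactions})
    snapshots = {}
    for product, location in items:
        on_hand = 0  # assumed opening balance
        for day in days:
            on_hand += change[(day, product, location)]
            snapshots[(day, product, location)] = on_hand
    return snapshots


snapshots = build_snapshots(transactions, days=[1, 2, 3, 4])
```

    Day 4 has no transactions yet still gets a snapshot row, which is why "how much is on hand" is easier to answer from the snapshot table while "how did inventory change" belongs to the transaction table.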

  • 7/25/2019 Dimensional Modeling.ppt

    154/182

    154

    Aggregate Tables

    Aggregate table
      A fact table that summarizes another fact table
      Created for performance reasons
      Covered in previous section

  • 7/25/2019 Dimensional Modeling.ppt

    155/182

    155

    Design Tools for Multiple Tables

    Create a set of matrices
      Facts vs. dimension
      Facts vs. dimensional attributes
    Mark where facts apply to dimensions
    Mark where facts apply to dimensional attributes
    When facts don't apply, assume separate fact table

  • 7/25/2019 Dimensional Modeling.ppt

    156/182

    156

    Example Matrix

    [Matrix: rows Fact 1 through Fact 4, columns Attribute 1 through Attribute 8, with an X where a fact applies to an attribute; the rows are grouped into Fact Table 1 and Fact Table 2]

    Fact vs. dimensional attribute matrix
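
    Reading a fact vs. attribute matrix amounts to grouping facts that share the same set of applicable attributes. In the sketch below the marked cells are invented, because the positions of the marks on the slide are not recoverable from this transcript.

```python
from collections import defaultdict

# Hypothetical matrix: each fact maps to the attributes it applies to.
matrix = {
    "Fact 1": {"Attribute 1", "Attribute 2", "Attribute 3", "Attribute 4"},
    "Fact 2": {"Attribute 1", "Attribute 2", "Attribute 3", "Attribute 4"},
    "Fact 3": {"Attribute 4", "Attribute 5", "Attribute 6", "Attribute 7", "Attribute 8"},
    "Fact 4": {"Attribute 4", "Attribute 5", "Attribute 6", "Attribute 7", "Attribute 8"},
}


def group_into_fact_tables(matrix: dict) -> list:
    """Group facts whose rows in the matrix are marked identically."""
    groups = defaultdict(list)
    for fact, attributes in matrix.items():
        groups[frozenset(attributes)].append(fact)
    return sorted(groups.values())


fact_tables = group_into_fact_tables(matrix)
```

    Each resulting group is a candidate fact table; facts whose marks differ from every other row end up in a table of their own.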

  • 7/25/2019 Dimensional Modeling.ppt

    157/182

    157

    Multiple Fact Table Summary

    Different processes need different tables
    Identified with
      Grain
      Dimensionality
      Timing
    Same process may need multiple fact tables
      Heterogeneous attributes
      Coverage
      Snapshot and transaction
      Aggregates

  • 7/25/2019 Dimensional Modeling.ppt

    158/182

    158

    Exercise 4

    Scenario
      Industry: Automobile manufacturing
      Company: Millennium Motors
      Value chain focus: Sales

    Sample business questions:
      What are the top 10 selling car models this month?
      How do this month's top 10 selling models compare to the top 10 over the last six months?
      Show me dealer sales by region by model by day.
      How many cars have been purchased over the last six months by customers with yearly household incomes greater than V55,555?

  • 7/25/2019 Dimensional Modeling.ppt

    159/182

    159

    Exercise 4 (continued)

    Using these source data elements, design a star schema that answers the proposed business questions
      Daily sales revenue
      Daily quantity sold
      Model
      Dealer
      Dealer city
      Product line
      Region where sold
      State
      Vehicle category
      Date of sales
      Customer name
      Customer zip code
      Customer yearly income
      P.O. Number
      Purchase price
      Discount amount
      Brand of car

  • 7/25/2019 Dimensional Modeling.ppt

    160/182

    160

    Exercise 4 Worksheet

  • 7/25/2019 Dimensional Modeling.ppt

    161/182

    161

    Exercise 4 Solution: Matrix

    [Matrix: facts (daily sales, daily quantity, purchase price, discount amount) against attributes (Customer name, Customer zip code, Model, Customer income, Dealer, P.O. Number, Dealer city, Product line, Brand of car, Region where sold, State, Vehicle category, Date of sales)]

  • 7/25/2019 Dimensional Modeling.ppt

    162/182

    162

    Exercise 4: Star Schema

    [Diagram: Daily Sales Facts (model_key, dealer_key, time_key, revenue, quantity) and Customer Sales Facts (model_key, dealer_key, time_key, customer_key, po_number, purchase price, discount amt), with Customer (customer_key, customer name, customer zip, yearly income), Model (model_key, brand, category, line, model), Time (time_key, year, quarter, month, date), and Dealer (dealer_key, region, state, city, dealer)]

  • 7/25/2019 Dimensional Modeling.ppt

    163/182

    163

    Q & A

  • 7/25/2019 Dimensional Modeling.ppt

    164/182

    164

    Architected Data Marts

  • 7/25/2019 Dimensional Modeling.ppt

    165/182

    165

    Data Mart

    Meaning of the term "data mart" has shifted over the last several years...

  • 7/25/2019 Dimensional Modeling.ppt

    166/182

    166

    Data Mart Architecture 1993

    [Diagram: Operational Systems, E.T.L. Software, Data Warehouse, E.T.L. Software, Data Marts, Query/Reporting Software, Analysis Users]

  • 7/25/2019 Dimensional Modeling.ppt

    167/182

    167

    Data Mart Architecture 199

    [Diagram: Operational Systems, E.T.L. Software, Data Marts, Query/Reporting Software, Analysis Users]

  • 7/25/2019 Dimensional Modeling.ppt

    168/182

    168

    Architected Data Marts

    [Diagram: Operational Systems, E.T.L. Software, Data Warehouse, Data Mart, Query/Reporting Software, Analysis Users]

  • 7/25/2019 Dimensional Modeling.ppt

    169/182

    169

    Data Mart

    Warehouse Sub

  • 7/25/2019 Dimensional Modeling.ppt

    170/182

    170

    [Diagram: separate Shipments Facts and Inventory Facts stars, each with its own copies of dimensions such as Product, Warehouse, Time (Day), and Month]

    "Stovepipe" data marts
      Inconsistent and overlapping data
      Difficult and costly to maintain
      Redundant data load
      Can't drill across
      Integration requires starting over
      Dimensions not conformed

  • 7/25/2019 Dimensional Modeling.ppt

    171/182

    171

    Conformed Dimensions

    Definition
      Dimensions are conformed when they are the same
      or
      When one dimension is a strict rollup of another

  • 7/25/2019 Dimensional Modeling.ppt

    172/182

    172

    Conformed Dimensions

    Same dimensions must:
      1. ... have exactly the same set of primary keys
      and
      2. ... have the same number of records

  • 7/25/2019 Dimensional Modeling.ppt

    173/182

    173

    Conformed Dimensions

    Rolled up dimension
      When one dimension is a strict rollup of another

    Which means
      Two conformed dimensions can be combined into a single logical dimension by creating a union of the attributes
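
    Both forms of conformance can be checked mechanically. The sketch below assumes dimensions are held as lists of row dictionaries, and it interprets "strict rollup" as: the rollup has exactly one row for every combination of shared attribute values found in the detailed dimension. That interpretation, the function names, and the sample rows are assumptions rather than definitions from the course.

```python
def is_same_dimension(rows_a: list, rows_b: list, key: str) -> bool:
    """Same set of primary keys and same number of records."""
    keys_a = [row[key] for row in rows_a]
    keys_b = [row[key] for row in rows_b]
    return set(keys_a) == set(keys_b) and len(keys_a) == len(keys_b)


def is_strict_rollup(detail_rows: list, rollup_rows: list, shared: list) -> bool:
    """Rollup holds each combination of shared attribute values exactly once."""
    def project(rows):
        return [tuple(row[name] for name in shared) for row in rows]

    detail_values = set(project(detail_rows))
    rollup_values = project(rollup_rows)
    return (set(rollup_values) == detail_values
            and len(rollup_values) == len(set(rollup_values)))


day_rows = [
    {"time_key": 1, "year": 2019, "month": 7, "day": 24},
    {"time_key": 2, "year": 2019, "month": 7, "day": 25},
    {"time_key": 3, "year": 2019, "month": 8, "day": 1},
]
month_rows = [
    {"month_key": 1, "year": 2019, "month": 7},
    {"month_key": 2, "year": 2019, "month": 8},
]
wrong_month_rows = [
    {"month_key": 1, "year": 2019, "month": 7},
    {"month_key": 2, "year": 2019, "month": 9},
]

same = is_same_dimension(day_rows, list(day_rows), key="time_key")
rollup_ok = is_strict_rollup(day_rows, month_rows, shared=["year", "month"])
rollup_bad = is_strict_rollup(day_rows, wrong_month_rows, shared=["year", "month"])
```

    A check like this could run as part of the load process so that a data mart's Month dimension never drifts away from the Day dimension it rolls up.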

  • 7/25/2019 Dimensional Modeling.ppt

    174/182

    174

    Conformed Dimensions

    Description
      Shared common dimensions
      Integrates logical design
      Ensures consistency between data marts
      Allows incremental development
      Independent of physical location
      Some rework may be required

  • 7/25/2019 Dimensional Modeling.ppt

    175/182

    175

    Conformed Dimensions

    Advantages
      Enables an incremental development approach
      Easier and cheaper to maintain
      Drastically reduces extraction and loading complexity
      Answers business questions that cross data marts
      Supports both centralized and distributed architectures

  • 7/25/2019 Dimensional Modeling.ppt

    176/182

    176

    Interlocking Star Schemas

    [Diagram: Sales Facts, Shipment Facts, and Inventory Facts interlocked through shared Store, Product, Time, Warehouse, and Month dimensions]

    Conformed Dimensions

  • 7/25/2019 Dimensional Modeling.ppt

    177/182

    177

    Kimball's Data Warehouse Bus

    [Diagram: Sales Facts, Shipment Facts, and Inventory Facts attached to a bus of dimensions: Store, Product, Day, Warehouse, Month]

  • 7/25/2019 Dimensional Modeling.ppt

    178/182

    178

    When to Conform

    Two approaches
      Up front
      As you go
    Both approaches work
    Choose the approach that works for you

  • 7/25/2019 Dimensional Modeling.ppt

    179/182

    179

    Conform Up Front

    Cross-Enterprise Analysis
    Create First-Cut Stars
    All Sub

  • 7/25/2019 Dimensional Modeling.ppt

    180/182

    180

    Design Build Sub

  • 7/25/2019 Dimensional Modeling.ppt

    181/182

    181

    Q & A

  • 7/25/2019 Dimensional Modeling.ppt

    182/182

    Course Review

    Rationale for dimensional modeling
    Dimensional modeling basics
    Dimensional modeling details
    Fact table details
    Dimension table details
    Design process
    Aggregate schemas
    Multiple fact tables