DW Stanford

DESCRIPTION

Data warehouse

Transcript

  • Data Warehousing Overview
    CS 245 Notes 11
    Hector Garcia-Molina
    Stanford University

  • Outline
    - What is a data warehouse?
    - Why a warehouse?
    - Models & operations
    - Implementing a warehouse

  • What is a Warehouse?
    Collection of diverse data
    - subject oriented
    - aimed at executive, decision maker
    - often a copy of operational data
    - with value-added data (e.g., summaries, history)
    - integrated
    - time-varying
    - non-volatile

  • What is a Warehouse?
    Collection of tools
    - gathering data
    - cleansing, integrating, ...
    - querying, reporting, analysis
    - data mining
    - monitoring, administering warehouse

  • Warehouse Architecture
    [Figure: warehouse architecture diagram; surviving label: Metadata]

  • Motivating Examples
    - Forecasting
    - Comparing performance of units
    - Monitoring, detecting fraud
    - Visualization

  • Alternative to Warehousing
    Two Approaches:
    - Query-Driven (Lazy)
    - Warehouse (Eager)

  • Query-Driven Approach

  • Advantages of Warehousing
    - High query performance
    - Queries not visible outside warehouse
    - Local processing at sources unaffected
    - Can operate when sources unavailable
    - Can query data not stored in a DBMS
    - Extra information at warehouse
        - Modify, summarize (store aggregates)
        - Add historical information

  • Advantages of Query-Driven
    - No need to copy data
        - less storage
        - no need to purchase data
    - More up-to-date data
    - Query needs can be unknown
    - Only query interface needed at sources
    - May be less draining on sources
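
The trade-off between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Source`, `QueryDriven`, and `Warehouse` are hypothetical classes invented here, not part of the notes, and the rows are made-up sample data. The lazy mediator contacts every source at query time, while the eager warehouse answers from a copy taken at the last refresh.

```python
class Source:
    """Hypothetical operational source holding (item, qty) rows."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.available = True

    def scan(self):
        if not self.available:
            raise ConnectionError("source unavailable")
        return list(self.rows)


def total(rows, item):
    """Sum the quantities recorded for one item."""
    return sum(qty for name, qty in rows if name == item)


class QueryDriven:
    """Lazy: contact every source each time a query arrives."""

    def __init__(self, sources):
        self.sources = sources

    def query(self, item):
        return sum(total(s.scan(), item) for s in self.sources)


class Warehouse:
    """Eager: copy source data in advance, then answer queries locally."""

    def __init__(self, sources):
        self.sources = sources
        self.copy = []

    def refresh(self):
        self.copy = [row for s in self.sources for row in s.scan()]

    def query(self, item):
        return total(self.copy, item)


a = Source([("bolt", 1), ("nut", 2)])
b = Source([("bolt", 5)])
lazy, eager = QueryDriven([a, b]), Warehouse([a, b])
eager.refresh()

a.rows.append(("bolt", 3))           # new operational data arrives
lazy_answer = lazy.query("bolt")     # sees the new row
eager_answer = eager.query("bolt")   # stale until the next refresh

a.available = False                  # a source goes offline
offline_answer = eager.query("bolt") # the warehouse still answers
try:
    lazy.query("bolt")
    lazy_failed = False
except ConnectionError:
    lazy_failed = True

print(lazy_answer, eager_answer, offline_answer, lazy_failed)
```

In this sketch the query-driven side returns the more up-to-date total, while the warehouse keeps answering after a source becomes unavailable, matching the two advantage lists above.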

  • Warehouse Models & Operators
    - Data Models
        - relational
        - cubes
    - Operators

  • Star

    customer
    id    name    address     city
    53    joe     10 main     sfo
    81    fred    12 main     sfo
    111   sally   80 willow   la

    product
    id    name    price
    p1    bolt    10
    p2    nut     5

    store
    code  city
    c1    nyc
    c2    sfo
    c3    la

    sale
    orderId  date    custId  prodId  storeId  qty  amt
    o100     1/7/97  53      p1      c1
    o102     2/7/97  53      p2      c1
    105      3/8/97  111     p1      c3
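
As a rough illustration of how the star tables fit together, the sketch below loads them into an in-memory SQLite database and joins the sale table to its three surrounding tables. The use of Python's `sqlite3` module and the column types are assumptions made for this example; the qty and amt values do not appear in the transcript, so they are stored as NULL rather than guessed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address TEXT, city TEXT);
CREATE TABLE product  (id TEXT PRIMARY KEY, name TEXT, price REAL);
CREATE TABLE store    (code TEXT PRIMARY KEY, city TEXT);
CREATE TABLE sale (
    orderId TEXT,
    date    TEXT,
    custId  INTEGER REFERENCES customer(id),
    prodId  TEXT REFERENCES product(id),
    storeId TEXT REFERENCES store(code),
    qty     INTEGER,
    amt     REAL
);
""")

conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?)", [
    (53, "joe", "10 main", "sfo"),
    (81, "fred", "12 main", "sfo"),
    (111, "sally", "80 willow", "la"),
])
conn.executemany("INSERT INTO product VALUES (?, ?, ?)", [
    ("p1", "bolt", 10.0),
    ("p2", "nut", 5.0),
])
conn.executemany("INSERT INTO store VALUES (?, ?)", [
    ("c1", "nyc"), ("c2", "sfo"), ("c3", "la"),
])
# qty and amt are not recoverable from the transcript, so they stay NULL.
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?, ?, NULL, NULL)", [
    ("o100", "1/7/97", 53, "p1", "c1"),
    ("o102", "2/7/97", 53, "p2", "c1"),
    ("105", "3/8/97", 111, "p1", "c3"),
])

# Star join: each sale row is resolved against its three related tables.
rows = conn.execute("""
    SELECT s.orderId, c.name, p.name, st.city
    FROM sale s
    JOIN customer c ON s.custId  = c.id
    JOIN product  p ON s.prodId  = p.id
    JOIN store   st ON s.storeId = st.code
    ORDER BY s.rowid
""").fetchall()

for row in rows:
    print(row)
```

Every join goes through the sale table, which is what gives the layout its star shape.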

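  • Star Schema
    sale: orderId, date, custId, prodId, storeId, qty, amt
    customer: id, name, address, city
    product: id, name, price
    store: code, city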

  • Terms
    - Fact table
    - Dimension tables
    - Measures
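
A small sketch may help connect the three terms. Here `sale` plays the role of the fact table, `product` is a dimension table, and `qty` and `amt` are the measures being aggregated. The measure values are hypothetical (the transcript does not preserve them), and the use of Python's `sqlite3` module is likewise an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id TEXT PRIMARY KEY, name TEXT, price REAL);   -- dimension table
CREATE TABLE sale (prodId TEXT REFERENCES product(id),               -- fact table
                   qty INTEGER, amt REAL);                           -- measures
""")
conn.executemany("INSERT INTO product VALUES (?, ?, ?)", [
    ("p1", "bolt", 10.0),
    ("p2", "nut", 5.0),
])
# Hypothetical measure values, chosen so that amt = qty * price.
conn.executemany("INSERT INTO sale VALUES (?, ?, ?)", [
    ("p1", 1, 10.0),
    ("p2", 2, 10.0),
    ("p1", 5, 50.0),
])

# Aggregate the measures, grouped by an attribute of the dimension table.
totals = conn.execute("""
    SELECT p.name, SUM(s.qty), SUM(s.amt)
    FROM sale s
    JOIN product p ON s.prodId = p.id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()

print(totals)
```

The dimension supplies the grouping attribute, and the fact table supplies the numbers being summed.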

  • Dimension Hierarchies
    store → sType
    store → city → region
    - snowflake schema
    - constellations

    store
    storeId  cityId  tId  mgr
    s5       sfo     t1   joe
    s7       sfo     t2   fred
    s9       la      t1   nancy

    sType
    tId  size   location
    t1   small  downtown
    t2   large  suburbs

    city
    cityId  pop  regId
    sfo     1M   north
    la      5M   south

    region
    regId  name
    north  cold region
    south  warm region
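
One way to see what a dimension hierarchy buys is to roll a measure up along it. The sketch below loads the store, sType, city, and region tables shown above into SQLite and sums sales by region and by store size. The `sale` rows and their amounts are hypothetical, added only so that there is something to aggregate, and the use of Python's `sqlite3` module is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE region (regId TEXT PRIMARY KEY, name TEXT);
CREATE TABLE city   (cityId TEXT PRIMARY KEY, pop TEXT,
                     regId TEXT REFERENCES region(regId));
CREATE TABLE sType  (tId TEXT PRIMARY KEY, size TEXT, location TEXT);
CREATE TABLE store  (storeId TEXT PRIMARY KEY,
                     cityId TEXT REFERENCES city(cityId),
                     tId TEXT REFERENCES sType(tId), mgr TEXT);
CREATE TABLE sale   (storeId TEXT REFERENCES store(storeId), amt REAL);
""")
conn.executemany("INSERT INTO region VALUES (?, ?)", [
    ("north", "cold region"), ("south", "warm region"),
])
conn.executemany("INSERT INTO city VALUES (?, ?, ?)", [
    ("sfo", "1M", "north"), ("la", "5M", "south"),
])
conn.executemany("INSERT INTO sType VALUES (?, ?, ?)", [
    ("t1", "small", "downtown"), ("t2", "large", "suburbs"),
])
conn.executemany("INSERT INTO store VALUES (?, ?, ?, ?)", [
    ("s5", "sfo", "t1", "joe"),
    ("s7", "sfo", "t2", "fred"),
    ("s9", "la", "t1", "nancy"),
])
# Hypothetical fact rows: the transcript gives no sales for these stores.
conn.executemany("INSERT INTO sale VALUES (?, ?)", [
    ("s5", 10.0), ("s7", 20.0), ("s9", 30.0), ("s5", 5.0),
])

# Roll up along store -> city -> region.
by_region = conn.execute("""
    SELECT r.regId, SUM(sa.amt)
    FROM sale sa
    JOIN store  st ON sa.storeId = st.storeId
    JOIN city   c  ON st.cityId  = c.cityId
    JOIN region r  ON c.regId    = r.regId
    GROUP BY r.regId
    ORDER BY r.regId
""").fetchall()

# Roll up along the other branch, store -> sType.
by_size = conn.execute("""
    SELECT t.size, SUM(sa.amt)
    FROM sale sa
    JOIN store st ON sa.storeId = st.storeId
    JOIN sType t  ON st.tId     = t.tId
    GROUP BY t.size
    ORDER BY t.size
""").fetchall()

print(by_region)
print(by_size)
```

Because each level of the hierarchy lives in its own table, the roll-up to region needs a chain of joins, which is the cost of the snowflake layout compared with a single flat store table.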

    customer

    id

    name

    address

    city

    product

    id

    name

    price

    store

    code

    city

    sale

    oderId

    date

    custId

    prodId

    storeId

    qty

    amt

    sale

    store

    code

    city

    type

    mgr

    city

    cityId

    pop

    regId

    region

    regId

    name

    sType

    tId

    size

    location

    53.0

    j