Data Warehouse KT

  • 8/9/2019 Data ware House KT

    1/14

    Agenda

    Data Warehousing Concepts

    Hanmath Singuluri 

    2/14

    Data Warehousing - Architecture

    Source Systems

     – Execution Systems: CRM, ERP, Legacy, e-Commerce

     – External Data: Purchased Market Data, Spreadsheets

     – Sample Technologies: PeopleSoft, SAP, Siebel, Oracle Applications, Manugistics, Custom Systems

    Extract, Transformation, and Load (ETL) Layer

     – Cleanse Data, Filter Records, Standardize Values, Decode Values, Apply Business Rules, Householding, Dedupe Records, Merge Records

     – Sample Technologies (ETL Tools): Informatica PowerMart, ETI, Oracle Warehouse Builder, Custom programs, SQL scripts

    Data and Metadata Repository Layer

     – ODS, Enterprise Data Warehouse, Data Marts, Metadata Repository

     – Sample Technologies: Oracle, SQL Server, Teradata, DB2
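
    The transformation tasks named for the ETL layer (cleanse, filter, standardize, decode, dedupe) can be sketched in a few lines of plain Python. This is only an illustration: the record layout, the `GENDER_CODES` lookup, and the `transform` function are hypothetical and do not come from the slides or from any of the listed ETL tools.

```python
# Hypothetical source rows; the field names are assumptions for illustration.
RAW_ROWS = [
    {"cust_id": "101", "name": "  alice SMITH ", "gender": "F", "state": "ca"},
    {"cust_id": "101", "name": "Alice Smith", "gender": "F", "state": "CA"},  # duplicate key
    {"cust_id": "102", "name": "Bob Jones", "gender": "M", "state": "ny"},
    {"cust_id": "", "name": "No Key", "gender": "M", "state": "tx"},          # no business key
]

GENDER_CODES = {"F": "Female", "M": "Male"}  # assumed decode table


def transform(rows):
    """Cleanse, filter, standardize, decode, and dedupe source rows."""
    seen = set()
    out = []
    for row in rows:
        cust_id = row["cust_id"].strip()
        if not cust_id:       # filter records that lack a business key
            continue
        if cust_id in seen:   # dedupe: keep the first row seen per key
            continue
        seen.add(cust_id)
        out.append({
            "cust_id": int(cust_id),
            # cleanse stray whitespace, then standardize the casing
            "name": " ".join(row["name"].split()).title(),
            # decode the source code into a readable value
            "gender": GENDER_CODES.get(row["gender"], "Unknown"),
            # standardize state codes to upper case
            "state": row["state"].strip().upper(),
        })
    return out


clean_rows = transform(RAW_ROWS)
```

    Of the four input rows, the one with no key is filtered out and the second row for customer 101 is dropped as a duplicate, so two cleaned rows remain.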

    3/14

    OLTP vs DW

    OLTP                                  DW

    Data dependencies (E-R) model         Dimensional model
    Microscopic data consistency          Global data consistency
    Millions of transactions per day      One transaction per day
    Mostly does not keep history          Keeping history is necessary
    Gets loaded in the day                Gets loaded in the night

    4/14

    Dimensional Data Modeling

    E-R model

     – Symmetric

     – Divides data into many entities

     – Describes entities and relationships

     – Seeks to eliminate data redundancy

     – Good for high transaction performance

    Dimensional model

     – Asymmetric

     – Divides data into dimensions and facts

     – Describes dimensions and measures

     – Encourages data redundancy

     – Good for high query performance

    5/14

    Facts/Dimensions

    Fact

     – Central, dominant table

     – Multi-part primary key

     – Holds millions or billions of records

     – Links directly to dimensions

     – Stores business measures

     – Constantly varying data
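
    To make the "multi-part primary key" point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `sales_fact` and its columns are assumptions chosen for illustration. Each part of the key would normally point at one dimension, and the numeric columns hold the business measures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_fact (
    store_key   INTEGER NOT NULL,   -- link to a store dimension
    product_key INTEGER NOT NULL,   -- link to a product dimension
    period_key  INTEGER NOT NULL,   -- link to a time dimension
    dollars     REAL,               -- business measure
    units       INTEGER,            -- business measure
    PRIMARY KEY (store_key, product_key, period_key)
);
""")

conn.execute("INSERT INTO sales_fact VALUES (1, 10, 20190801, 25.0, 5)")

# A second row with the same key combination violates the multi-part key.
try:
    conn.execute("INSERT INTO sales_fact VALUES (1, 10, 20190801, 99.0, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

    The combination of dimension keys identifies one fact row, so the second insert is rejected.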

    6/14

    Facts/Dimensions (contd.)

    Dimensions

     – Single join to the fact table (single primary key)

     – Stores business attributes

     – Attributes are textual in nature

     – Organized into hierarchies

     – More or less constant data

     – E.g. Time, Product, Customer, Store, etc.
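
    As a rough illustration of a dimension that is "organized into hierarchies", the sketch below builds one row of a hypothetical Time dimension from a calendar date. The function name, the column names, and the `YYYYMMDD` key format are assumptions, not something the slides prescribe.

```python
from datetime import date


def time_dimension_row(d):
    """Build one Time dimension row: a single key plus hierarchy attributes."""
    return {
        # single primary key used to join to the fact table (assumed YYYYMMDD format)
        "period_key": d.year * 10000 + d.month * 100 + d.day,
        # hierarchy levels, from finest to coarsest: day -> month -> quarter -> year
        "day_name": d.strftime("%A"),
        "month_name": d.strftime("%B"),
        "quarter": f"Q{(d.month - 1) // 3 + 1}",
        "year": d.year,
    }


row = time_dimension_row(date(2019, 8, 9))
```

    A query can then group facts by any level of the hierarchy (month, quarter, year) without touching the fact table's structure.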

    7/14

    Star/Snowflake schema

    Star schema

     – Fact surrounded by 8-19 dimensions

     – Dimensions are de-normalized

    Snowflake schema

     – Star schema with secondary dimensions

     – Don't snowflake for saving space

     – Snowflake if secondary dimensions have many attributes
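
    A star schema query typically needs one join per dimension, because each dimension is de-normalized into a single table. The following sketch uses Python's built-in `sqlite3` module; the table names, columns, and sample rows are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- De-normalized dimensions: region lives directly on the store row.
CREATE TABLE store_dim (store_key INTEGER PRIMARY KEY, store_desc TEXT,
                        city TEXT, region_desc TEXT);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, product_desc TEXT,
                          category TEXT);
CREATE TABLE sales_fact (
    store_key   INTEGER REFERENCES store_dim(store_key),
    product_key INTEGER REFERENCES product_dim(product_key),
    dollars REAL,
    units   INTEGER,
    PRIMARY KEY (store_key, product_key)
);
INSERT INTO store_dim VALUES (1, 'Downtown', 'Austin', 'South'),
                             (2, 'Harbor', 'Boston', 'East');
INSERT INTO product_dim VALUES (10, 'Espresso', 'Coffee'),
                               (11, 'Green Tea', 'Tea');
INSERT INTO sales_fact VALUES (1, 10, 30.0, 10), (1, 11, 8.0, 4), (2, 10, 45.0, 15);
""")

# One join per dimension, then aggregate the measure.
rows = conn.execute("""
    SELECT s.region_desc, p.category, SUM(f.dollars)
    FROM sales_fact f
    JOIN store_dim s   ON s.store_key = f.store_key
    JOIN product_dim p ON p.product_key = f.product_key
    GROUP BY s.region_desc, p.category
    ORDER BY s.region_desc, p.category
""").fetchall()
```

    Repeating `region_desc` on every store row is the data redundancy that a de-normalized dimension accepts in exchange for simpler, faster queries.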

    8/14

    Star schema

    9/14

    Star schema example

    10/14

    Snowflake schema example

    Store Dimension: STORE KEY, Store Description, City, State, District ID, District Desc., Region_ID, Region Desc., Regional Mgr.

    Secondary dimension (District): District_ID, District Desc., Region_ID

    Store Fact Table: STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price
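
    The snowflake example can be sketched with Python's built-in `sqlite3` module. The column names follow the slide. The separate `region` table, the table names, and the sample rows are assumptions added so the sketch can run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Secondary dimensions split out of the store dimension (region table is assumed).
CREATE TABLE region (region_id INTEGER PRIMARY KEY, region_desc TEXT, regional_mgr TEXT);
CREATE TABLE district (district_id INTEGER PRIMARY KEY, district_desc TEXT,
                       region_id INTEGER REFERENCES region(region_id));
CREATE TABLE store_dim (store_key INTEGER PRIMARY KEY, store_desc TEXT, city TEXT,
                        state TEXT,
                        district_id INTEGER REFERENCES district(district_id));
CREATE TABLE store_fact (store_key INTEGER, product_key INTEGER, period_key INTEGER,
                         dollars REAL, units INTEGER, price REAL,
                         PRIMARY KEY (store_key, product_key, period_key));
INSERT INTO region VALUES (1, 'West', 'R. Lee');
INSERT INTO district VALUES (100, 'Bay Area', 1), (101, 'Central Valley', 1);
INSERT INTO store_dim VALUES (1, 'Store A', 'Oakland', 'CA', 100),
                             (2, 'Store B', 'Fresno', 'CA', 101);
INSERT INTO store_fact VALUES (1, 10, 201908, 20.0, 4, 5.0),
                              (2, 10, 201908, 15.0, 3, 5.0);
""")

# Reaching the region level means walking the chain store -> district -> region.
region_totals = conn.execute("""
    SELECT r.region_desc, SUM(f.dollars), SUM(f.units)
    FROM store_fact f
    JOIN store_dim s ON s.store_key = f.store_key
    JOIN district d  ON d.district_id = s.district_id
    JOIN region r    ON r.region_id = d.region_id
    GROUP BY r.region_desc
""").fetchall()
```

    Compared with a star schema, the same regional total costs two extra joins.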

    11/14

    DM, DW and ODS

    DM

     – Organized around a single business process

     – Represents a small part of the organization's business

     – Logical subset of the complete data warehouse

     – Faster roll out, but complex integration in the long run

    12/14

    DM, DW and ODS (contd.)

    DW

     – Union of its constituent data marts

     – Queryable source of data in the organization

     – Requires extensive business modeling (may take years to design and build)

    ODS

     – Point of integration for operational systems

     – Low-level decision support

     – Can store integrated data, but at detailed level

    13/14

    OLAP

    Element of decision support systems (DSS)

    Supports (almost) ad-hoc querying for business analysts

    Helps the knowledge worker (executive, manager, analyst) make better decisions

    ROLAP - extended RDBMS that maps operations on multidimensional data to standard relational operators

    MOLAP - special-purpose server that directly implements multidimensional data and operations
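
    The ROLAP idea of mapping multidimensional operations onto relational operators can be shown with two common operations, roll-up and slice. This is a minimal sketch using Python's built-in `sqlite3` module, with a made-up `sales` table standing in for a small cube. It is not the implementation of any particular OLAP product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Each row is one cell of a (product, region, year) cube.
CREATE TABLE sales (product TEXT, region TEXT, year INTEGER, dollars REAL);
INSERT INTO sales VALUES
    ('Coffee', 'East', 2018, 10.0), ('Coffee', 'West', 2018, 20.0),
    ('Tea',    'East', 2018,  5.0), ('Coffee', 'East', 2019, 12.0),
    ('Tea',    'West', 2019,  7.0);
""")

# Roll-up: aggregate away the region dimension, expressed as GROUP BY.
rollup = conn.execute(
    "SELECT product, year, SUM(dollars) FROM sales "
    "GROUP BY product, year ORDER BY product, year"
).fetchall()

# Slice: fix one dimension to a single value, expressed as WHERE.
slice_2018 = conn.execute(
    "SELECT product, SUM(dollars) FROM sales WHERE year = ? "
    "GROUP BY product ORDER BY product",
    (2018,),
).fetchall()
```

    A MOLAP server would answer the same questions from a purpose-built multidimensional store rather than by generating SQL like this.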
