Presentation on DW

  • 7/23/2019 Presentation on DW

    1/21

    SEMINAR REPORT

    ON

    DATA WAREHOUSE

    SAM HIGGINBOTTOM INSTITUTE OF AGRICULTURE, TECHNOLOGY & SCIENCES (DEEMED TO BE UNIVERSITY)

    Department of Computer Science and I.T

    Submitted By:

    Ayush Barnawa !"#BT

    Submitted To:

    Mudita Shrivastava

    CONTENTS

    HISTORY

    OBJECTIVES

    EVOLUTION IN ORGANIZATIONAL USE

    ARCHITECTURE

    OLTP

    OLAP

    BENEFITS OF DATA WAREHOUSE

    STRATEGIC USE

    LIMITATIONS

    HISTORY

    The concept of data warehousing dates back to the late 1980s, when IBM researchers Barry Devlin and Paul Murphy developed the "business data warehouse".

    Bill Inmon, Father of the Data Warehouse.

    According to Inmon's definition:

    It is a collection of integrated, subject-oriented databases designed to support the DSS function, where each unit of data is non-volatile and relevant to some moment in time.

    DATA WAREHOUSE

    A single, complete and consistent store of data obtained from a variety of different sources, made available to end users in a way they can understand and use in a business context.

    A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes.

    DATA WAREHOUSING IS:

    A relational or multidimensional database management system designed to support management decision making.

    A technique for assembling and managing data from various sources for the purpose of answering business questions.

    A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes.

    Common accessing systems include queries, analysis and reporting.

    The final result, however, is homogeneous data, which can be more easily manipulated.

    SUBJECT ORIENTED DATA WAREHOUSE

    Organized around major subjects, such as customer, product, sales.

    Focuses on the modeling and analysis of data for decision makers, not on daily operations or transaction processing.

    Provides a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process.

    INTEGRATED DATA WAREHOUSE

    Constructed by integrating multiple, heterogeneous data sources: relational databases, flat files, on-line transaction records.

    Data cleaning and data integration techniques are applied.

    Ensures consistency in naming conventions, encoding structures, attribute structures, etc. among different data sources.

    E.g., hotel price: currency, tax, breakfast covered, etc.

    When data is moved to the warehouse, it is converted.
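
The hotel-price example can be sketched as a small conversion step. This is only an illustration: the record layout, the exchange rates and the target convention (USD, tax included, breakfast included) are all assumptions made here, not something the slides specify.

```python
from dataclasses import dataclass

# Hypothetical exchange rates to a single warehouse currency (USD).
RATES_TO_USD = {"USD": 1.0, "EUR": 1.25, "INR": 0.0125}


@dataclass
class SourcePrice:
    """A hotel price as one source system might record it (assumed layout)."""
    amount: float
    currency: str
    tax_included: bool
    tax_rate: float           # e.g. 0.10 for 10 %
    breakfast_included: bool
    breakfast_fee: float      # in the source currency, same tax basis as amount


def to_warehouse_price(p: SourcePrice) -> float:
    """Convert a source price to one convention: USD, with tax and breakfast."""
    amount = p.amount
    if not p.breakfast_included:
        amount += p.breakfast_fee
    if not p.tax_included:
        amount *= 1 + p.tax_rate
    return round(amount * RATES_TO_USD[p.currency], 2)
```

Once every source passes through a step like this, prices from different systems can be compared directly, which is the point of integration.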

    TIME VARIANT DATA WAREHOUSE

    The time horizon for the data warehouse is significantly longer than that of operational systems.

    Operational database: current value data.

    Data warehouse data: provide information from a historical perspective (e.g., past 5-10 years).

    Every key structure in the data warehouse contains an element of time, explicitly or implicitly.

    But the key of operational data may or may not contain a "time element".
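
The difference between the two kinds of key can be shown with two dictionaries. This is a minimal sketch; the names `operational`, `warehouse`, `record_balance` and `balance_history` are hypothetical and chosen only for illustration.

```python
from datetime import date

# Operational store: keyed by customer only, so each update overwrites
# the previous value and only the current state survives.
operational = {}

# Warehouse store: the key carries an explicit time element, so every
# snapshot is kept alongside the earlier ones.
warehouse = {}


def record_balance(customer_id: str, balance: float, as_of: date) -> None:
    """Write the same fact to both stores to contrast their keys."""
    operational[customer_id] = balance
    warehouse[(customer_id, as_of)] = balance


def balance_history(customer_id: str) -> list:
    """Return (date, balance) pairs for one customer, oldest first."""
    rows = [(d, b) for (c, d), b in warehouse.items() if c == customer_id]
    return sorted(rows)
```

After two calls for the same customer, `operational` holds only the latest balance while `balance_history` returns both snapshots.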

    NON-VOLATILE DATA WAREHOUSE

    Physically separate store of data transformed from the operational environment.

    Operational update of data does not occur in the data warehouse environment.

    Does not require transaction processing, recovery, and concurrency control mechanisms.

    Requires only two operations in data accessing:

        Initial loading of data.

        Access of data.
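
A store restricted to those two operations might look like the following. This is a sketch under assumptions: the class name `NonVolatileStore` and its methods are invented here, and a real warehouse enforces this at the database level rather than in application code.

```python
class NonVolatileStore:
    """Supports only loading and reading; there is no update or delete."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        """Append a batch of rows; rows already stored are never modified."""
        self._rows.extend(tuple(r) for r in rows)

    def access(self, predicate=lambda row: True):
        """Return an immutable tuple of the rows matching the predicate."""
        return tuple(r for r in self._rows if predicate(r))
```

Because nothing is updated in place, there are no competing writers to coordinate, which is why the slide says concurrency control and recovery mechanisms are not required.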

    EVOLUTION IN ORGANIZATIONAL USE

    Data, data everywhere, yet:

    I can't find the data I need

        Data is scattered over the network

        Many versions, subtle differences

    I can't understand the data I found

        Available data poorly documented

    I can't use the data I found

        Results are unexpected

        Data needs to be transformed from one form to another

    WHY DATA WAREHOUSING?

    Which are our lowest/highest margin customers?

    Who are my customers and what products are they buying?

    What is the most effective distribution channel?

    What product promotions have the biggest impact on revenue?

    What impact will new products/services have on revenue and margins?

    Which customers are most likely to go to the competition?

    ARCHITECTURE

    [Figure: architecture diagram. Labels: Client (x2), Query & Analysis, Warehouse, Metadata, Integration, Source (x3).]

    OLTP

    OLTP: ONLINE TRANSACTION PROCESSING

    Special data organization, access methods and implementation methods are needed to support data warehouse queries (typically multidimensional queries).
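
A multidimensional query sums a measure over a chosen set of dimensions. The sketch below shows the idea in plain Python; the fact rows, the dimension names and the `aggregate` helper are all hypothetical, and a real warehouse would run this inside the database engine.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales_amount).
FACTS = [
    (2018, "North", "Tea", 120.0),
    (2018, "South", "Tea", 80.0),
    (2019, "North", "Tea", 150.0),
    (2019, "North", "Coffee", 200.0),
]
# Position of each dimension inside a fact row.
DIMENSIONS = {"year": 0, "region": 1, "product": 2}


def aggregate(facts, by):
    """Sum sales over the requested dimensions (a simple roll-up)."""
    idx = [DIMENSIONS[name] for name in by]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[3]
    return dict(totals)
```

Calling `aggregate(FACTS, ["year"])` totals sales per year, while `aggregate(FACTS, ["region", "product"])` slices the same facts along two other dimensions.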

    OLTP vs DW

    OLTP                                                DW
    Application Oriented                                Subject Oriented
    Used to run business                                Used to Analyze Business
    Detailed Data                                       Summarized and Refined
    Current, Up-to-date Data                            Snapshot Data
    Isolated Data                                       Integrated Data
    Few Records Accessed at a Time (tens)               [illegible]
    [illegible]                                         Database Size (100 GB - Few Terabytes)
    Transaction throughput is the performance metric    Query throughput is the performance metric
    Thousands of Users                                  Hundreds of Users

    META DATA

    "Data about data". It is important for designing, constructing, retrieving, and controlling the warehouse data.

    TECHNICAL META DATA

    Includes where the data come from, how the data were changed, how the data are organized, how the data are stored, who owns the data, who is responsible for the data and how to contact them, who can access the data, and the date of last update.

    BUSINESS META DATA

    Includes what data are available, where the data are, what the data mean, how to access the data, predefined reports and queries, and how current the data are.
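
The two kinds of metadata can be pictured as two record types. This is a sketch only: the field names below are invented to mirror the items listed on the slide and do not follow any particular metadata standard.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class TechnicalMetadata:
    source_system: str         # where the data come from
    transformations: list      # how the data were changed
    storage_format: str        # how the data are stored
    owner: str                 # who owns the data
    steward_contact: str       # who is responsible and how to contact them
    allowed_roles: list        # who can access the data
    last_updated: date         # date of last update


@dataclass
class BusinessMetadata:
    description: str           # what the data mean
    location: str              # where the data are
    access_instructions: str   # how to access the data
    predefined_reports: list = field(default_factory=list)


def is_stale(meta: TechnicalMetadata, today: date, max_age_days: int) -> bool:
    """True when the last update is older than the allowed age in days."""
    return (today - meta.last_updated).days > max_age_days
```

A check such as `is_stale` is one way the "date of last update" could answer the business question of how current the data are.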

    OLAP

    Online Analytical Processing: term coined by E. F. Codd in a 1993 paper contracted by Arbor Software.

    Online analytical processing refers to such end-user activities as DSS modelling using spreadsheets and graphics that are done online.

    OLAP IS FASMI

    FAST

    ANALYSIS

    BENEFITS OF DW

    Provides business users with a "customer-centric" view of the company's heterogeneous data by helping to integrate data from sales, service, manufacturing and distribution, and other customer-related business systems.

    Provides added value to the company's customers by allowing them to access better information when data warehousing is coupled with Internet technology.

    Consolidates data about individual customers and provides a repository of all customer contacts for segmentation modeling, customer retention planning, and cross sales analysis.

    Reports on trends across multidivisional, multinational operating units, including trends or relationships in areas such as merchandising, production planning, etc.

    STRATEGIC USE

    LIMITATIONS

    Data warehouses are not the optimal environment for unstructured data.

    Over their life, data warehouses can have high maintenance costs.

    Data warehouses can get outdated relatively quickly. There is a cost of delivering suboptimal information to the organization.

    There is often a fine line between data warehouses and operational systems. Duplicate, expensive functionality may be developed. Or, functionality may be developed in the data warehouse that, in retrospect, should have been developed in the operational systems, and vice versa.

    THANK YOU

    !!!