7/23/2019 Presentation on DW
SEMINAR REPORT
ON
DATA WAREHOUSE
SAM HIGGINBOTTOM INSTITUTE OF AGRICULTURE,
TECHNOLOGY & SCIENCES (DEEMED TO BE UNIVERSITY)
Department of Computer Science and I.T
Submitted By:
Ayush Barnawa
Submitted To:
Mudita Shrivastava
CONTENTS
HISTORY
OBJECTIVES
EVOLUTION IN ORGANIZATIONAL USE
ARCHITECTURE
OLTP
OLAP
BENEFITS OF DATA WAREHOUSE
STRATEGIC USE
LIMITATIONS
HISTORY
The concept of data warehousing dates back to the late 1980s, when IBM
researchers Barry Devlin and Paul Murphy developed the "business data warehouse".
Bill Inmon is known as the father of the data warehouse.
According to Inmon's definition:
It is a collection of integrated, subject-oriented databases designed to
support the DSS function, where each unit of data is non-volatile and
relevant to some moment in time.
DATA WAREHOUSE
A single, complete and consistent store of data obtained from a variety of
different sources, made available to end users in a way they can understand
and use in a business context.
A subject-oriented, integrated, time-variant, non-updatable collection of
data used in support of management decision-making processes.
DATA WAREHOUSING IS:
A relational or multidimensional database management system designed to
support management decision making.
A technique for assembling and managing data from various sources for the
purpose of answering business questions.
A subject-oriented, integrated, time-variant, non-updatable collection of
data used in support of management decision-making processes.
Common accessing systems include queries, analysis and reporting.
The final result, however, is homogeneous data, which can be more easily
manipulated.
SUBJECT ORIENTED DATA WAREHOUSE
Organized around major subjects, such as customer, product, sales.
Focuses on the modeling and analysis of data for decision makers, not on
daily operations or transaction processing.
Provides a simple and concise view around particular subject issues by
excluding data that are not useful in the decision support process.
INTEGRATED DATA WAREHOUSE
Constructed by integrating multiple, heterogeneous data sources:
  relational databases, flat files, on-line transaction records.
Data cleaning and data integration techniques are applied.
Ensures consistency in naming conventions, encoding structures, attribute
structures, etc. among different data sources.
  E.g., hotel price: currency, tax, breakfast covered, etc.
When data is moved to the warehouse, it is converted.
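The conversion step can be sketched in Python for the hotel price example. This is only an illustration under stated assumptions: the two source systems, their field names, the exchange rates and the flat tax rate are all hypothetical and not taken from the slides.

```python
from dataclasses import dataclass

# Illustrative assumptions: fixed exchange rates to USD and a flat tax rate.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.25}
TAX_RATE = 0.10


@dataclass(frozen=True)
class HotelPrice:
    """Warehouse representation: USD, tax included, breakfast flag explicit."""
    hotel: str
    price_usd_incl_tax: float
    breakfast_included: bool


def from_source_a(row: dict) -> HotelPrice:
    # Hypothetical source A: any listed currency, tax included, 'bf' is 'Y'/'N'.
    usd = row["price"] * RATES_TO_USD[row["cur"]]
    return HotelPrice(row["name"], round(usd, 2), row["bf"] == "Y")


def from_source_b(row: dict) -> HotelPrice:
    # Hypothetical source B: price in USD before tax, breakfast as a boolean.
    usd = row["net_price"] * (1 + TAX_RATE)
    return HotelPrice(row["hotel_name"], round(usd, 2), row["breakfast"])


def integrate(rows_a, rows_b):
    """Convert rows from both sources into the single warehouse format."""
    return [from_source_a(r) for r in rows_a] + [from_source_b(r) for r in rows_b]
```

After `integrate`, every record uses the same currency, tax convention and breakfast encoding, regardless of which source it came from.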
TIME VARIANT DATA WAREHOUSE
The time horizon for the data warehouse is significantly longer than that
of operational systems.
  Operational database: current value data.
  Data warehouse data: provide information from a historical perspective
  (e.g., past 5-10 years).
Every key structure in the data warehouse contains an element of time,
explicitly or implicitly.
  But the key of operational data may or may not contain a "time element".
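The difference in key structure can be illustrated with a small Python sketch. The customer balance example and all names in it are hypothetical; the point is only that the warehouse key carries a date while the operational key does not.

```python
from datetime import date

# Operational store: the key has no time element, so only the current value survives.
operational = {}

# Warehouse: every key carries a time element, so history is preserved.
warehouse = {}


def record_balance(customer_id: str, as_of: date, balance: float) -> None:
    operational[customer_id] = balance           # overwrites the previous value
    warehouse[(customer_id, as_of)] = balance    # adds a new dated snapshot


def history(customer_id: str):
    """Return (date, balance) snapshots for one customer, oldest first."""
    rows = [(d, b) for (c, d), b in warehouse.items() if c == customer_id]
    return sorted(rows)
```

Calling `record_balance` twice for the same customer leaves one value in `operational` but two dated snapshots in `warehouse`.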
NON-VOLATILE DATA WAREHOUSE
A physically separate store of data transformed from the operational
environment.
Operational update of data does not occur in the data warehouse environment.
  Does not require transaction processing, recovery, and concurrency
  control mechanisms.
  Requires only two operations in data accessing:
    Initial loading of data.
    Access of data.
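The two permitted operations can be sketched as a tiny append-only table in Python. The class and method names are hypothetical; this is an illustration of the non-volatile idea, not a real warehouse engine.

```python
class WarehouseTable:
    """Append-only table exposing only the two operations: load and access."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        # Loading appends copies; existing rows are never updated or deleted.
        self._rows.extend(dict(r) for r in rows)

    def query(self, predicate=lambda row: True):
        # Access returns copies so callers cannot modify the stored rows.
        return [dict(r) for r in self._rows if predicate(r)]
```

Because there is no update or delete method, the sketch needs no locking or rollback logic, which mirrors the point that transaction processing, recovery and concurrency control are not required.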
EVOLUTION IN ORGANIZATIONAL USE
Data, data everywhere, yet:
I can't find the data I need
  Data is scattered over the network
  Many versions, subtle differences
I can't understand the data I found
  Available data poorly documented
I can't use the data I found
  Results are unexpected
  Data needs to be transformed from one form to another
WHY DATA WAREHOUSING?
Which are our lowest/highest margin customers?
Who are my customers and what products are they buying?
What is the most effective distribution channel?
What product promotions have the biggest impact on revenue?
What impact will new products/services have on revenue and margins?
Which customers are most likely to go to the competition?
ARCHITECTURE
[Figure: architecture diagram with the labels Source (three), Integration,
Warehouse, Metadata, Query & Analysis, and Client (two).]
OLTP
OLTP: ONLINE TRANSACTION PROCESSING
Special data organization, access methods and implementation methods are
needed to support data warehouse queries (typically multidimensional queries).
OLTP vs DW
OLTP                                              DW
Application oriented                              Subject oriented
Used to run business                              Used to analyze business
Detailed data                                     Summarized and refined data
Current, up-to-date data                          Snapshot data
Isolated data                                     Integrated data
Few records accessed at a time (tens)
                                                  Database size (100 GB - few terabytes)
Transaction throughput is the performance metric  Query throughput is the performance metric
Thousands of users                                Hundreds of users
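The contrast in access patterns can be made concrete with Python's standard `sqlite3` module. For brevity this sketch runs both styles of query against a single in-memory table, which is an assumption for illustration only; in practice the operational system and the warehouse are separate stores. The table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (order_id, region, amount) VALUES (?, ?, ?)",
    [(1, "North", 100.0), (2, "South", 50.0), (3, "North", 25.0)],
)

# OLTP-style access: touch a single record by its key.
conn.execute("UPDATE sales SET amount = 120.0 WHERE order_id = 1")
one_order = conn.execute(
    "SELECT region, amount FROM sales WHERE order_id = 1"
).fetchone()

# DW-style access: scan many records and summarize them.
by_region = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
```

The first query is judged by how many such small transactions complete per second; the second by how quickly a large scan and aggregation returns.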
META DATA
"Data about data". It is important for designing, constructing, retrieving,
and controlling the warehouse data.
TECHNICAL META DATA
Includes where the data come from, how the data were changed, how the data
are organized, how the data are stored, who owns the data, who is
responsible for the data and how to contact them, who can access the data,
and the date of last update.
BUSINESS META DATA
Includes what data are available, where the data are, what the data mean,
how to access the data, predefined reports and queries, and how current the
data are.
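One way to picture the two kinds of metadata is as records attached to each warehouse table. The following Python sketch uses hypothetical field names chosen to mirror the items listed above; it is not a standard metadata schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class TechnicalMetadata:
    source_system: str            # where the data come from
    transformations: List[str]    # how the data were changed
    storage_location: str         # how and where the data are stored
    owner: str                    # who owns the data
    steward_contact: str          # who is responsible and how to contact them
    allowed_roles: List[str]      # who can access the data
    last_updated: date            # date of last update


@dataclass
class BusinessMetadata:
    description: str              # what the data mean
    access_path: str              # how to access the data
    predefined_reports: List[str] = field(default_factory=list)
    refreshed_on: Optional[date] = None   # how current the data are


def can_access(meta: TechnicalMetadata, role: str) -> bool:
    """Check a role against the access list recorded in the metadata."""
    return role in meta.allowed_roles
```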
OLAP
Online Analytical Processing: a term coined by E. F. Codd in a 1993 paper
contracted by Arbor Software.
Online analytical processing refers to such end-user activities as DSS
modelling using spreadsheets and graphics that are done online.
OLAP IS FASMI
FAST
ANALYSIS
SHARED
MULTIDIMENSIONAL
INFORMATION
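A multidimensional analysis of the kind OLAP tools perform can be sketched in plain Python. The fact rows, dimension names and helper functions below are hypothetical; real OLAP engines precompute and index such aggregates rather than scanning a list.

```python
from collections import defaultdict

# Fact rows with three dimensions (region, product, year) and one measure (sales).
FACTS = [
    ("North", "Tea", 2018, 10),
    ("North", "Tea", 2019, 15),
    ("North", "Coffee", 2019, 20),
    ("South", "Tea", 2019, 5),
]
DIMS = ("region", "product", "year")


def roll_up(facts, keep):
    """Sum the measure over every dimension not listed in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def slice_(facts, dim, value):
    """Keep only the facts where dimension `dim` equals `value`."""
    i = DIMS.index(dim)
    return [row for row in facts if row[i] == value]
```

For example, `roll_up(FACTS, ["region"])` totals sales per region, and `roll_up(slice_(FACTS, "year", 2019), ["product"])` totals sales per product for 2019 only.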
BENEFITS OF DW
Provides business users with a "customer-centric" view of the company's
heterogeneous data by helping to integrate data from sales, service,
manufacturing and distribution, and other customer-related business systems.
Provides added value to the company's customers by allowing them to access
better information when data warehousing is coupled with Internet technology.
Consolidates data about individual customers and provides a repository of
all customer contacts for segmentation modeling, customer retention
planning, and cross-sales analysis.
Reports on trends across multidivisional, multinational operating units,
including trends or relationships in areas such as merchandising,
production planning, etc.
STRATEGIC USE
LIMITATIONS
Data warehouses are not the optimal environment for unstructured data.
Over their life, data warehouses can have high maintenance costs.
Data warehouses can get outdated relatively quickly. There is a cost of
delivering suboptimal information to the organization.
There is often a fine line between data warehouses and operational systems.
Duplicate, expensive functionality may be developed. Or, functionality may
be developed in the data warehouse that, in retrospect, should have been
developed in the operational systems, and vice versa.
THANK YOU!!!