7/25/2019 Dimensional Modeling.ppt
Dimensional Design
A Handbook for Data Warehouse Design
Course Agenda
Rationale for dimensional modeling
Dimensional modeling basics
Dimensional modeling details
Fact table details
Dimension table details
Design process
Aggregate schemas
Multiple fact tables
Architected data marts
Rationale for Dimensional Modeling
The Business Value Chain
[Diagram: Operations, Sales and Marketing, Customer Services, Product Development]
A series of interrelated business processes which contribute to increased product value for the customer, and to profit for the enterprise. (Porter, 1985)
Drive to Compete
Businesses constantly strive to optimize each process in the value chain
Optimization requires measuring and analyzing the effectiveness of each process as well as the value chain as a whole
The Role of Information Technology
Process optimization
  Supported by online transaction processing systems (OLTP)
Measuring and analyzing processes
  Supported by "analytic" systems
  Data warehouse
Example OLTP Systems
Manufacturing and Process Control
Sales Order Entry and Campaign Management
Customer Support and Relationship Management
Shipping and Inventory Management
OLTP Systems & Business Events
Events are the heart of every business
  Book an order
  Print a pick list
  Record a cash withdrawal
  Post a payment
Event detail is collected by OLTP systems
  Atomic focus
  Transaction consistency
OLTP System Reporting
OLTP systems answer event-oriented questions well
  Run invoices
  Print ledger
  Pull up customer detail
Operational reporting
  Focused on detail
  Predictable requirements and query patterns
Does not reveal the overall performance of a process
OLTP Design Characteristics
Focus of OLTP design
  Individual data elements
  Data relationships
Design goals
  Accurately model business
  Remove redundancy
OLTP Design Shortcomings
Complex
Unfamiliar to business people
Incomplete history
Slow query performance
Emergence of Dimensional Model
Logical modeling technique
  For designing relational database structures
  Addresses OLTP design shortcomings
  For use in analytic systems
First developed early 1980's
  Packaged goods industry
Popularized by Ralph Kimball, PhD.
  1996 book: "The Data Warehouse Toolkit"
Q & A
Dimensional Modeling Basics
Sample Value Chain Analysis
Process-oriented business questions
  "I need to see overall gross margin by category"
  "How do inventory levels compare with sales by product and warehouse?"
  "What are outstanding receivables by G/L account?"
  "What is the return rate for each supplier?"
Measurement Focus
Process-oriented business measures
  gross margin
  inventory levels, sales
  receivables
  return rate
Process Measurement
Facts
Coffee Maker Fulfillment Report
Brand: Captain Coffee
Product                 Units Sold   Units Shipped   % Shipped
Standard Coffee Maker   5,000        3,800           76%
Thermal Coffee Maker    2,400        1,632           68%
Deluxe Coffee Maker     2,073        1,658           80%
All Products            9,473        7,090           75%
Measures
  Metrics or indicators by which people evaluate a business process
  Referred to as "Facts"
Examples
  Margin
  Inventory Amount
  Sales Dollars
  Receivable Dollars
  Return Rate
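As a small illustration of how a derived measure on a report comes from stored facts, the sketch below recomputes the % Shipped column of the fulfillment report from units sold and units shipped. The variable and function names are assumptions made for this example only.

```python
# Facts read from the Coffee Maker Fulfillment Report: (units_sold, units_shipped)
report = {
    "Standard Coffee Maker": (5000, 3800),
    "Thermal Coffee Maker": (2400, 1632),
    "Deluxe Coffee Maker": (2073, 1658),
}


def pct_shipped(sold, shipped):
    """Ratio of two additive facts, as a whole-number percentage."""
    return round(100 * shipped / sold)


by_product = {name: pct_shipped(*facts) for name, facts in report.items()}

# The "All Products" row sums the additive facts first, then takes the ratio.
total_sold = sum(sold for sold, _ in report.values())
total_shipped = sum(shipped for _, shipped in report.values())
all_products = pct_shipped(total_sold, total_shipped)
```

Note that the total row is computed from the summed facts rather than by combining the per-product percentages.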
Perspective Focus
Process-oriented business perspectives
  category
  product, warehouse
  G/L account
  supplier
Process Perspectives
Dimensions
Coffee Maker Fulfillment Report
Brand: Captain Coffee
Product                 Units Sold   Units Shipped   % Shipped
Standard Coffee Maker   5,000        3,800           76%
Thermal Coffee Maker    2,400        1,632           68%
Deluxe Coffee Maker     2,073        1,658           80%
All Products            9,473        7,090           75%
Dimensions
  The parameters by which measures are viewed
  Used to break out, filter or roll up measures
  Often found after the word "by" in a business question
  Descriptive business terms
Examples
  Product
  Warehouse
  Customer
  Supplier
Dimensional Model
Definition
  Logical data model used to represent the measures and dimensions that pertain to one or more business sub
Dimensional Model Advantages
  Understandable
  Systematically represents history
  Reliable
Schema Simplicity
[Diagram: Star Schema with a central Facts table joined to Store, Time and Product dimensions]
Fewer tables
  Denormalized
  Consolidated
Dimensional
  Familiar to users
  Facts go in the fact tables
  Dimensions in dimension tables
Increases understandability
Data Familiarity
[Diagram: the source field ord_date expanded into a Time Dimension with year, quarter, month, date, day of the week, holiday flag]
Adding business context
  Single source field
  Expanded into parts
  Decoded into business terms
  Add special indicators and flags
  e.g. time dimension
Increases understandability
Representing History
[Diagram: Facts table joined to Store, Product and Time dimensions; Time Dimension attributes: year, quarter, month, date, day of the week, holiday flag]
Time dimension
  Part of every star schema
  Marks the date when the facts (process measurements) occurred
  Allows the schema to easily add and query data over time
  Especially useful for performing comparison queries
Fewer Join Paths
Star schema
High Performance Design
Fewer
Sub
Enterprise Models
Enterprise-scope ER model
Enterprise-scope dimensional model
Exercise 1
Scenario
  Industry: Automobile manufacturing
  Company: Millennium Motors
  Value chain focus: Sales
Sample business questions:
  What are the top 10 selling car models this month?
  How do this month's top 10 selling models compare to the top 10 over the last six months?
  Show me dealer sales by region by model by day
  What is the total number of cars sold by month by dealer by state?
List facts and dimensions
Exercise 1 - Worksheet
Exercise 1 Solution
Facts
  Sales revenue
  Quantity sold
Dimensions
  Model name
  Month
  Dealer name
  Region
  State
  Date
Q & A
Star Schema Dimension Tables
[Diagram: three Dimension tables]
Dimension tables
  Store dimension values
  Textual content
Dimension tables usually referred to simply as "dimensions"
Spend extra effort to add dimensional attributes
Dimension Keys
[Diagram: three Dimension tables, each with a key]
Synthetic keys
  Each table assigned a unique primary key, specifically generated for the data warehouse
  Primary keys from source systems may be present in the dimension, but are not used as primary keys in the star schema
Dimension Columns
[Diagram: three Dimension tables, each with a key and attributes]
Dimension attributes
  Specify the way in which measures are viewed: rolled up, broken out or summarized
  Often follow the word "by" as in "Show me Sales by Region and Quarter"
  Frequently referred to as "Dimensions"
Star Schema Fact Table
[Diagram: Fact Table with fact1, fact2, fact3]
Process measures
  Start by assigning one fact table per business sub
Fact Table Primary Key
[Diagram: Fact Table with three keys and fact1, fact2, fact3]
Every fact table
  Multi-part primary key added
  Made up of foreign keys referencing dimensions
Fact Table Sparsity
Sparsity
  Term used to describe the very common situation where a fact table does not contain a row for every combination of every dimension table row for a given time period
  Because fact tables contain a very small percentage of all possible combinations, they are said to be "sparsely populated" or "sparse"
Fact Table Grain
Grain
  The level of detail represented by a row in the fact table
  Must be identified early
  Cause of greatest confusion during design process
Example
  Each row in the fact table represents the daily item sales total
Sparsity Example
Assume
  5,000 rows in "dealer" dimension
  50 rows in "model" dimension
If all dealers sold all models every day:
  5,000 x 50 = 250,000 sales every day
  91,250,000 sales every year
  Assuming only one model sold in every dealer!
Sparsity
  Means that only a small fraction of the total possible 250,000 will be sold on a given day
  Generally, only record sales, not zeroes, in fact table
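The arithmetic behind the sparsity example can be checked with a few lines. This is only a sketch: the dimension sizes come from the slide, while `actual_rows_today` is a made-up figure used to show how a fill ratio might be computed.

```python
dealers = 5_000        # rows in the "dealer" dimension
models = 50            # rows in the "model" dimension
days_per_year = 365

# Upper bound: every dealer sells every model every day.
possible_per_day = dealers * models
possible_per_year = possible_per_day * days_per_year

# Hypothetical figure for illustration only: fact rows actually recorded in one day.
actual_rows_today = 4_000
fill_ratio = actual_rows_today / possible_per_day
```

With these assumed numbers, under 2% of the possible dealer/model combinations would appear in the fact table for that day.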
Designing a Star Schema
Five initial design steps
  Based on Kimball's six steps
  Start designing in order
  Revisit and ad
Step One
1. Identify fact table
   Start by naming the fact table with the name of the business sub
Step Two
2. Identify fact table grain
   Describe what a row in the fact table represents in business terms
Step Three
3. Identify dimensions
Step Four
4. Select facts
Step Five
5. Identify dimensional attributes
Exercise 2
Scenario
  Industry: Automobile manufacturing
  Company: Millennium Motors
  Value chain focus: Sales
Sample business questions:
  What are the top 10 selling car models this month?
  How do this month's top 10 selling models compare to the top 10 over the last six months?
  Show me dealer sales by region by model by day
  What is the total number of cars sold by month by dealer by state?
Exercise 2 - continued
Using these source data elements, design a star schema that answers the proposed business questions
  Sales revenue
  Quantity sold
  Model name
  Dealer name
  Dealer city
  Product line
  Region where sold
  State
  Vehicle category
  Month
  Date of sales
Exercise 2 - sample data
Exercise 2 - worksheet
Exercise 2 - solution
Step 1 Fact table name: "Sales facts"
Step 2 Fact table grain: Every row in the sales facts table is a summary of car model sales for that day at a single dealer
Step 3 Dimensions: Time, Model, Dealer
Step 4 Facts: Total revenue, Quantity sold
Step 5 Dimensional attributes: See next page
Exercise 2 - Dimensional Model
Sales Facts
  model_key
  dealer_key
  time_key
  revenue
  quantity
Model
  model_key
  category
  line
  model
Time
  time_key
  year
  quarter
  month
  date
Dealer
  dealer_key
  region
  state
  city
  dealer
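The solution model can be sketched as tables in an in-memory SQLite database. This is one possible rendering, not the course's own DDL: the column types, the table name `time_dim` (used in place of Time), and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE model    (model_key INTEGER PRIMARY KEY, category TEXT, line TEXT, model TEXT);
CREATE TABLE dealer   (dealer_key INTEGER PRIMARY KEY, region TEXT, state TEXT, city TEXT, dealer TEXT);
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER, month TEXT, date TEXT);
CREATE TABLE sales_facts (
    model_key  INTEGER REFERENCES model(model_key),
    dealer_key INTEGER REFERENCES dealer(dealer_key),
    time_key   INTEGER REFERENCES time_dim(time_key),
    revenue    REAL,
    quantity   INTEGER,
    PRIMARY KEY (model_key, dealer_key, time_key)  -- multi-part key of foreign keys
);
""")

# Invented sample rows, for illustration only.
conn.executemany("INSERT INTO model VALUES (?,?,?,?)",
                 [(1, "Sedan", "Family", "M100"), (2, "Truck", "Utility", "T200")])
conn.executemany("INSERT INTO dealer VALUES (?,?,?,?,?)",
                 [(1, "Northeast", "Massachusetts", "Boston", "Dealer A"),
                  (2, "Southwest", "Arizona", "Tucson", "Dealer B")])
conn.executemany("INSERT INTO time_dim VALUES (?,?,?,?,?)",
                 [(1, 1997, 1, "January", "1997-01-15"), (2, 1997, 1, "January", "1997-01-16")])
conn.executemany("INSERT INTO sales_facts VALUES (?,?,?,?,?)",
                 [(1, 1, 1, 20000.0, 1), (2, 1, 1, 60000.0, 2),
                  (1, 2, 1, 40000.0, 2), (1, 1, 2, 20000.0, 1)])

# "What is the total number of cars sold by month by dealer by state?"
rows = conn.execute("""
    SELECT t.month, d.dealer, d.state, SUM(f.quantity) AS cars_sold
    FROM sales_facts f
    JOIN time_dim t ON t.time_key = f.time_key
    JOIN dealer d   ON d.dealer_key = f.dealer_key
    GROUP BY t.month, d.dealer, d.state
    ORDER BY d.dealer
""").fetchall()
```

Each "by" in the business question becomes a dimension attribute in the GROUP BY clause, and the fact is summed.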
Q & A
Fact Table Details
Example Fact Table
Sales Facts
  model_key
  dealer_key
  time_key
  revenue
  quantity
Example Fact Table Records
Sales Facts
time_key   model_key   dealer_key   revenue     quantity
1          1           1            7840.27     2
1          2           1            12260.37    3
1          3           1            28360.1     1
1          4           1            13267.22    4
1          5           1            43789.4     1
1          1           2            3678.98     1
1          3           2            7864.78     2
1          5           2            92876.67    2
Primary Key: time_key, model_key, dealer_key
Facts: revenue, quantity
Facts
Fully additive
  Can be summed across any and all dimensions
  Stored in fact table
  Examples: revenue, quantity
Example: Additive Facts
Sales Facts
  model_key
  dealer_key
  time_key
  revenue
  quantity
Model
  model_key
  brand
  category
  line
  model
Time
  time_key
  year
  quarter
  month
  date
Dealer
  dealer_key
  region
  state
  city
  dealer
Facts
Semi-additive
  Can be summed across most dimensions but not all
  Examples: Inventory quantities, account balances, or personnel counts
  Anything that measures a "level"
  Must be careful with ad-hoc reporting
  Often aggregated across the "forbidden dimension" by averaging
Example: Semi-additive Facts
Sales Facts
  model_key
  dealer_key
  time_key
  inventory
Model
  model_key
  brand
  category
  line
  model
Time
  time_key
  year
  quarter
  month
  date
Dealer
  dealer_key
  region
  state
  city
  dealer
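To make the semi-additive behavior concrete, the following sketch uses invented inventory snapshots. Summing across dealers on one day is meaningful; summing across days is not, so the time dimension is averaged instead. All data and names here are hypothetical.

```python
from collections import defaultdict

# (date, dealer, inventory): hypothetical end-of-day inventory snapshots
snapshots = [
    ("1997-01-15", "Dealer A", 10),
    ("1997-01-15", "Dealer B", 20),
    ("1997-01-16", "Dealer A", 12),
    ("1997-01-16", "Dealer B", 18),
]

# Valid: sum across the dealer dimension for each day.
per_day = defaultdict(int)
for date, _dealer, inventory in snapshots:
    per_day[date] += inventory

# Misleading: summing the daily totals across time counts the same cars repeatedly.
summed_over_time = sum(per_day.values())

# Common alternative: average across the time dimension instead.
average_inventory = sum(per_day.values()) / len(per_day)
```

Here the combined dealers hold 30 cars on each day, so an average of 30 describes the period, while the sum of 60 describes nothing real.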
Facts
Non-additive
  Cannot be summed across any dimension
  All ratios are non-additive
  Break down to fully additive components, store them in fact table
Example: Non-Additive Facts
Margin_rate is non-additive
Margin_rate = margin_amt / revenue
Sales Facts
  model_key
  dealer_key
  time_key
  revenue
  margin_amt
Time
  time_key
  year
  quarter
  month
  date
Model
  model_key
  brand
  category
  line
  model
Dealer
  dealer_key
  region
  state
  city
  dealer
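A short worked example shows why the ratio should be derived from its additive components at query time. The two fact rows below are invented; only the formula margin rate = margin amount / revenue comes from the slide.

```python
# (revenue, margin_amt) per fact row; hypothetical values
rows = [(100.0, 10.0), (300.0, 90.0)]

# Misleading: combining per-row ratios ignores how much revenue each row carries.
row_rates = [margin / revenue for revenue, margin in rows]
naive_rate = sum(row_rates) / len(row_rates)

# Intended: sum the additive components first, then compute the ratio.
total_revenue = sum(revenue for revenue, _ in rows)
total_margin = sum(margin for _, margin in rows)
margin_rate = total_margin / total_revenue
```

With these numbers the naive average is about 0.20 while the true margin rate is 0.25, because the larger sale carries a higher margin.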
Unit Amounts
Unit price, Unit cost, etc.
  Are numeric, but not measures
  Store the extended amounts which are additive
  Unit amounts may be useful as dimensions for "price point analysis"
  May store unit values to save space
Factless Fact Table
A fact table with no measures in it
  Nothing to measure... except the convergence of dimensional attributes
  Sometimes store a "1" for convenience
  Examples: Attendance, Customer Assignments, Coverage
Q & A
Dimension Table Details
Example Dimension Tables
Dealer
  dealer_key
  region
  state
  city
  dealer
Model
  model_key
  brand
  category
  line
  model
Time
  time_key
  year
  quarter
  month
  date
Example Dimension Table Records
Time Dimension
time_key   year   quarter   month     date
1          1997   1         January   1/15/97
2          1997   1         January   1/16/97
3          1997   1         January   1/17/97
150        1997   2         April     4/1/97
777        1998   4         October   10/13/98
Synthetic Key: time_key
Attributes: year, quarter, month, date
Example Dimension Table Records
Dealer Dimension
dealer_key   region      state           city        dealer
1            Northeast   Massachusetts   Boston      Honest Ted's
2            Northeast   Massachusetts   Boston      Stoller Co.
3            Southwest   Arizona         Tucson      Wright Motors
12           Southwest   California      San Diego   American
45           Central     Illinois        Chicago     Lugwig Motors
Synthetic Key: dealer_key
Attributes: region, state, city, dealer
Dimension Tables
Characteristics
  Hold the dimensional attributes
  Usually have a large number of attributes ("wide")
  Add flags and indicators that make it easy to perform specific types of reports
  Have small number of rows in comparison to fact tables (most of the time)
Don't Normalize Dimensions
Saves very little space
Impacts performance
Can confuse matters when multiple hierarchies exist
A star schema with normalized dimensions is called a "snowflake schema"
Usually advocated by software vendors whose products require snowflake for performance
Example Snowflake Schema
Sales Facts: model_key, dealer_key, time_key, revenue, quantity
Model: model_key, model, line_key
Line: line_key, line, category_key
Category: category_key, category, brand_key
Brand: brand_key, brand
Day: date_key, date, month_key
Month: month_key, month, quarter_key
Quarter: quarter_key, quarter, year_key
Year: year_key, year
Dealer: dealer_key, dealer, city_key
City: city_key, city, state_key
State: state_key, state, region_key
Region: region_key, region
Slowly Changing Dimensions
Dimension source data may change over time
Relative to fact tables, dimension records change slowly
Allows dimensions to have multiple "profiles" over time to maintain history
Each profile is a separate record in a dimension table
Slowly Changing Dimension Example
Example: A woman gets married
Possible changes to customer dimension
  - Last Name
  - Marriage Status
  - Address
  - Household Income
Existing facts need to remain associated with her single profile
New facts need to be associated with her married profile
Slowly Changing Dimension Types
Three types of slowly changing dimensions
Type 1
  - Updates existing record with modifications
  - Does not maintain history
Type 2
  - Adds new record
  - Does maintain history
  - Maintains old record
Type 3
  - Keep old and new values in the existing row
  - Requires a design change
Designing Loads to Handle SCD
Design and implementation guidelines
  Gather SCD requirements when designing data mapping and loading
  SCD needs to be defined and implemented at the dimensional attribute level
  Each column in a dimension table needs to be identified as a Type 1 or a Type 2 SCD
  If one Type 1 column changes, then all Type 1 columns will be updated
  If one Type 2 column changes, then a new record will be inserted into the dimension table
Designing Loads to Handle SCD
Design and implementation guidelines
  For large dimension tables, change data capture techniques may be used to minimize the data volume
  For smaller dimension tables, compare all OLTP records with dimension table records
  Balance data volume with change data capture logic complexities
Designing Loads to Handle SCD: Type 1
Type 1 example: a woman gets married
Customer Dimension Table
Column Name      SCD Type
Customer Key     NA
Customer ID      1
Name             1
Marital Status   1
Home Income      1
Type 1 Example
OLTP and Star Schema before the change
Customer OLTP
CustID   Name        Marital Status   Home Income
123      Sue Jones   S                $30K
Customer Dim
CustKey   CustID   Name        Marital Status   Home Income   Status
1         123      Sue Jones   S                $30K          0
Sales Facts
CustKey   DayKey   Sales
1         1        $40
Day Dim
DayKey   Business Date
1        1/31/01
Sue Gets Married 2/1/01
OLTP and Star Schema after the change
Customer OLTP
CustID   Name        Marital Status   Home Income
123      Sue Smith   M                $60K
Customer Dim
CustKey   CustID   Name        Marital Status   Home Income   Status
1         123      Sue Smith   M                $60K          0
Sales Facts
CustKey   DayKey   Sales
1         1        $40
1         2        $50
Day Dim
DayKey   Business Date
1        1/31/01
2        2/01/01
Type 1 Example
Observations
  Customer history is not maintained in the OLTP system
  Customer history is not maintained in the star schema
  Sue only has one customer "profile" in customer dimension table
  Sue's sales facts across all history are associated with her married profile
  Sales facts that were associated with Sue's single profile have been lost
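A Type 1 load can be sketched as an in-place UPDATE. The example below replays the Sue Jones scenario in an in-memory SQLite database; the table and column names and the helper `apply_type1` are assumptions made for this sketch, and the Status column from the figure is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_dim (cust_key INTEGER PRIMARY KEY, cust_id INTEGER, name TEXT,
                           marital_status TEXT, home_income TEXT);
CREATE TABLE sales_facts (cust_key INTEGER, day_key INTEGER, sales INTEGER);
INSERT INTO customer_dim VALUES (1, 123, 'Sue Jones', 'S', '$30K');
INSERT INTO sales_facts VALUES (1, 1, 40);
""")


def apply_type1(conn, cust_id, name, marital_status, home_income):
    """Type 1: overwrite the existing dimension row; no history is kept."""
    conn.execute(
        "UPDATE customer_dim SET name = ?, marital_status = ?, home_income = ? "
        "WHERE cust_id = ?",
        (name, marital_status, home_income, cust_id),
    )


apply_type1(conn, 123, "Sue Smith", "M", "$60K")
conn.execute("INSERT INTO sales_facts VALUES (1, 2, 50)")  # sale after the wedding

by_status = conn.execute("""
    SELECT c.marital_status, SUM(f.sales)
    FROM sales_facts f JOIN customer_dim c ON c.cust_key = f.cust_key
    GROUP BY c.marital_status
""").fetchall()
```

After the update, both sales roll up under marital status M, which is the loss of the single profile that the observations describe.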
Designing Loads to Handle SCD: Type 2
Type 2 example: a woman gets married
Customer Dimension Table
Column Name      SCD Type
Customer Key     NA
Customer ID      2
Name             2
Marital Status   2
Home Income      1
Type 2 Example
OLTP and Star Schema before the change
Customer OLTP
CustID   Name        Marital Status   Home Income
123      Sue Jones   S                $30K
Customer Dim
CustKey   CustID   Name        Marital Status   Home Income   Status
1         123      Sue Jones   S                $30K          0
Sales Facts
CustKey   DayKey   Sales
1         1        $40
Day Dim
DayKey   Business Date
1        1/31/01
Sue Gets Married 2/1/01
OLTP and Star Schema after the change
Customer OLTP
CustID   Name        Marital Status   Home Income
123      Sue Smith   M                $60K
Customer Dim
CustKey   CustID   Name        Marital Status   Home Income   Status
1         123      Sue Jones   S                $30K          1
2         123      Sue Smith   M                $60K          0
Sales Facts
CustKey   DayKey   Sales
1         1        $40
2         2        $50
Day Dim
DayKey   Business Date
1        1/31/01
2        2/01/01
Type 2 Example
Type 2 Observations
  Customer history is not maintained in the OLTP system
  Customer history is maintained in the star schema
  Sue has two "profiles" in the customer dimension
  Sue's sales facts may be analyzed for when she was single, when she was married, and across all history by using the customer id field
  Home income was updated in the new profile record
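A Type 2 load can be sketched as an INSERT of a new dimension row with a new synthetic key. The example below replays the same scenario; table names, column names and the helper `apply_type2` are assumptions for this sketch, and the Status column shown in the figure is omitted to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_dim (cust_key INTEGER PRIMARY KEY AUTOINCREMENT, cust_id INTEGER,
                           name TEXT, marital_status TEXT, home_income TEXT);
CREATE TABLE sales_facts (cust_key INTEGER, day_key INTEGER, sales INTEGER);
INSERT INTO customer_dim (cust_id, name, marital_status, home_income)
VALUES (123, 'Sue Jones', 'S', '$30K');
INSERT INTO sales_facts VALUES (1, 1, 40);
""")


def apply_type2(conn, cust_id, name, marital_status, home_income):
    """Type 2: add a new profile row and return its synthetic key."""
    cur = conn.execute(
        "INSERT INTO customer_dim (cust_id, name, marital_status, home_income) "
        "VALUES (?, ?, ?, ?)",
        (cust_id, name, marital_status, home_income),
    )
    return cur.lastrowid


new_key = apply_type2(conn, 123, "Sue Smith", "M", "$60K")
conn.execute("INSERT INTO sales_facts VALUES (?, 2, 50)", (new_key,))  # new facts use the new key

by_status = conn.execute("""
    SELECT c.marital_status, SUM(f.sales)
    FROM sales_facts f JOIN customer_dim c ON c.cust_key = f.cust_key
    GROUP BY c.marital_status ORDER BY c.marital_status
""").fetchall()

# Across all history: constrain on the source-system customer id.
all_history = conn.execute("""
    SELECT SUM(f.sales)
    FROM sales_facts f JOIN customer_dim c ON c.cust_key = f.cust_key
    WHERE c.cust_id = 123
""").fetchone()[0]
```

Old facts keep pointing at the single profile, new facts point at the married profile, and the customer id ties the two together.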
Slowly Changing Dimension Advice
"When in doubt, design type 2"
Degenerate Dimensions
Dimensions with no other place to go
Stored in the fact table
Are not facts
Common examples include invoice numbers or order numbers
Drilling
[Figure: Quarterly Auto Sales Summary. Units Sold and Revenue by Region (Northeast, Southeast, Central, Northwest, Southwest), then broken out by State for Northeast (Maine, New York, Massachusetts) and Southeast (Florida, Georgia, Virginia)]
Drilling down
  Adding dimensional detail
  Further breaks out a measure in some way
  Has nothing to do with a hierarchy!
Drilling
[Figure: Quarterly Auto Sales Summary. Units Sold and Revenue by Region and State, then rolled up to Region only]
Rolling up
  Removing dimensional detail
  Rolls up a measure
  Has nothing to do with how you drilled down
Drilling
Drilling across
  A query that involves more than one fact table
  Not necessarily an action that changes how a user is looking at the data
  Best resolved by multiple SQL passes
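One way to picture the multi-pass approach: query each fact table separately at the same dimensional grain, then merge the result sets on the shared dimension attribute. The sketch below is illustrative only; `inventory_facts`, its `on_hand` column, the helper `one_pass` and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE model (model_key INTEGER PRIMARY KEY, model TEXT);
CREATE TABLE sales_facts (model_key INTEGER, quantity INTEGER);
CREATE TABLE inventory_facts (model_key INTEGER, on_hand INTEGER);
INSERT INTO model VALUES (1, 'M100'), (2, 'T200');
INSERT INTO sales_facts VALUES (1, 3), (1, 2), (2, 4);
INSERT INTO inventory_facts VALUES (1, 10), (2, 7), (2, 1);
""")


def one_pass(conn, fact_table, measure):
    """Run one SQL pass against a single fact table, grouped by model.

    fact_table and measure are fixed names from this script, never user input.
    """
    sql = (
        f"SELECT m.model, SUM(f.{measure}) FROM {fact_table} f "
        "JOIN model m ON m.model_key = f.model_key GROUP BY m.model"
    )
    return dict(conn.execute(sql).fetchall())


sold = one_pass(conn, "sales_facts", "quantity")
on_hand = one_pass(conn, "inventory_facts", "on_hand")

# Merge the two passes on the shared dimension attribute.
merged = {
    model: (sold.get(model, 0), on_hand.get(model, 0))
    for model in sorted(set(sold) | set(on_hand))
}
```

Joining both fact tables to the dimension in a single statement would pair every sales row with every inventory row for the same model and inflate the sums, which is why separate passes are the safer choice here.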
Q & A
Dimensional Design Process
Pro
Data Mart Development
[Diagram: Design Phase, Development Phase, Deployment Phase]
Dimensional modeling is a critical part of the data mart development effort
Data Mart Development
Design phase
  Determine requirements and design schema
Development phase
  Iterative build and feedback
Deployment phase
  Automate load, document, train users
Project Deliverables
Design Pro
Project Approach
[Diagram: Design Phase, Development Phase, Deployment Phase]
The dimensional model is developed during the design stage
Scope of the pro
Design Stage Activities
[Diagram: Design Phase, Development Phase, Deployment Phase]
Gather requirements through requirements workshops
Develop star schema
Conduct design review
Gather Requirements
Requirements definition
  User workshops
  Spreadsheets
  Sample reports
Source systems analysis
  DBA interviews
  Copybooks
  ER diagrams
Design Deliverables
Deliverables
  The star schema itself
  Load mapping document
How these primary components are delivered will depend on needs and format chosen
  Modeling tools
  Spreadsheets
  Text documents
Notation
No recognized standard
ER semantics unnecessary
Clarity is the only characteristic that really matters
Notation Example
[Diagram: Sales Facts (time_key, model_key, dealer_key) related to Time (time_key), Model (model_key) and Dealer (dealer_key)]
IDEF1X
  Dependent entities: fact tables
  Independent entities: dimension tables
Notation Example
[Diagram: Sales Facts related to Time, Dealer and Model]
Martin IE
  Entities: fact or dimension tables
  Attributes not shown
Notation Example
[Diagram: Sales Facts (time_key, model_key, dealer_key) joined to Time (time_key), Model (model_key) and Dealer (dealer_key)]
Kimball
  Simple structure
  Cardinality implied
Design Naming Standards
Responsibility of data administration
Extended to the data warehouse
Important to start early in the pro
Data Element Definitions
Clear descriptions
  Facts
  Calculated formulae
  Dimensional attributes
  Multiple meanings/synonymous terms
  Aliases
Data Element Instances
Example of Data
  As it will exist in the warehouse
  After decoding
  Adds to model understanding
  Removes ambiguity/uncertainty
Data Element Mapping
Where is the data coming from
  Source system
  Table
  Column
  Record
  Field
Data Transformation
Changing the data
  Serves as spec for ETL process
  Decodes
  Type conversion
  Conditional logic
  Handling of NULLs
Q & A
Aggregate Schemas
Aggregate Designs
Aggregates
  Pre-stored fact summaries
  Along one or more dimensions
  The most effective tool for improving performance
Examples
  Summary of sales by region, by product, by category
  Monthly sales
Aggregate Background
Aggregate rationale
  Improve end user query performance
  Reduce required CPU cycles
  Powerful cost saving tool
Restrictions
  Additive facts only
  Must use dimensional design
Aggregate Guidelines
Don't start with aggregates
Design and build based on usage
Sooner or later you'll need to build aggregates
Aggregate Types
Level field
Separate fact tables
Aggregate Types
Level field
  Old technique
  Requires "level" attribute in appropriate dimensions
  Aggregates and base-level facts stored in same table
  Same number of total fact records as separate table approach
Drawbacks
  Every query must constrain on the level field
  Possibility of double counting
Level Field
Sales Facts: time_key, product_key, market_key, Quantity, Amount
Time: time_key, Level, Year, Fiscal Period, Month, Day, Day of Week
Product: product_key, Level, Category, Brand, Product, Diet Indicator
Market: market_key, Region, District, State, City
Example: Level = Category; N.A. for appropriate attributes
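The double-counting drawback of the level field approach can be shown with three rows. The product names and amounts are invented for this sketch; the point is only that base-level and aggregate rows share one table.

```python
# (level, name, amount): hypothetical rows stored together in one fact table
facts = [
    ("Product", "Cola", 60),
    ("Product", "Diet Cola", 40),
    ("Category", "Soft Drinks", 100),  # pre-stored summary of the two rows above
]

# A query that forgets to constrain on the level field counts everything twice.
unconstrained_total = sum(amount for _level, _name, amount in facts)

# Constraining on one level gives the intended answer.
product_total = sum(amount for level, _name, amount in facts if level == "Product")
category_total = sum(amount for level, _name, amount in facts if level == "Category")
```

Both constrained totals agree at 100, while the unconstrained query reports 200.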
Aggregate Types
Separate Tables
  Separate fact table for every aggregate
  Separate dimension table for every aggregate dimension
  Same number of fact records as level field tables
Advantage
  Removes possibility of double counting
  Schema clarity
Caveat
  Requires software with aggregate navigation capability
Separate Tables
One-Way Aggregate
Sales Facts: time_key, product_key, market_key, Quantity, Amount
Mthly Sales Facts Agg: month_key, product_key, market_key, Quantity, Amount
Time: time_key, Year, Fiscal Period, Month, Day, Day of Week
Month: month_key, Year, Fiscal Period, Month
Product: product_key, Category, Brand, Product, Diet Indicator
Market: market_key, Region, District, State, City
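A one-way aggregate like the monthly table can be derived from the base fact table with a GROUP BY. The following is a minimal sketch in SQLite, assuming simplified dimension tables (`time_dim`, `month_dim`) and invented rows; a production load would differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim  (time_key INTEGER PRIMARY KEY, year INTEGER, month TEXT, day TEXT);
CREATE TABLE month_dim (month_key INTEGER PRIMARY KEY, year INTEGER, month TEXT);
CREATE TABLE sales_facts (time_key INTEGER, product_key INTEGER, market_key INTEGER,
                          quantity INTEGER, amount REAL);
INSERT INTO time_dim VALUES (1, 1997, 'January', '1997-01-15'),
                            (2, 1997, 'January', '1997-01-16'),
                            (3, 1997, 'February', '1997-02-01');
INSERT INTO month_dim VALUES (1, 1997, 'January'), (2, 1997, 'February');
INSERT INTO sales_facts VALUES (1, 1, 1, 2, 20.0), (2, 1, 1, 3, 30.0), (3, 1, 1, 1, 10.0);

-- Summarize along the time dimension only: day-level rows roll up to months.
CREATE TABLE mthly_sales_facts_agg AS
SELECT md.month_key, f.product_key, f.market_key,
       SUM(f.quantity) AS quantity, SUM(f.amount) AS amount
FROM sales_facts f
JOIN time_dim t   ON t.time_key = f.time_key
JOIN month_dim md ON md.year = t.year AND md.month = t.month
GROUP BY md.month_key, f.product_key, f.market_key;
""")

agg = conn.execute(
    "SELECT month_key, product_key, market_key, quantity, amount "
    "FROM mthly_sales_facts_agg ORDER BY month_key"
).fetchall()
```

Only additive facts (quantity, amount) are summed, consistent with the restriction that aggregates hold additive facts only.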
Separate Tables
Two-Way Aggregate
Sales Facts: time_key, product_key, market_key, Quantity, Amount
Mnthly Cat Sales Facts Agg: month_key, category_key, market_key, Quantity, Amount
Time: time_key, Year, Fiscal Period, Month, Day, Day of Week
Month: month_key, Year, Fiscal Period, Month
Product: product_key, Category, Brand, Product, Diet Indicator
Category: category_key, Category
Market: market_key, Region, District, State, City
Aggregate Pitfalls
Sparsity failure
  Term used to describe the result of building too many aggregate fact tables that do not summarize enough rows.
  When sparsity failure occurs, a relatively small star schema can grow (in terms of disk size) thousands of times.
  Sparsity failure = aggregate explosion
Aggregate Design Guidelines
Rule of twenty
  To avoid aggregate explosion
  Make sure each aggregate record summarizes 20 or more lower-level records
Remember
  Total number of possible fact tables in any given dimensional model = cartesian product of all levels in all the dimensions
7/25/2019 Dimensional Modeling.ppt
121/182
121
ear I5J
&uarter I:J
Month I58J
Date I9G;J
Time
;!ears
8F 0uarters
GF months
5K8; da!s
DesignDesign
Hierarch& diagram Helps visuali'e
options for buildingaggregates
Adding cardinalities
insures following the
rule of 5
Lot re)uired to build
initial star schema
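The cardinalities on the hierarchy diagram can be turned into a quick rule-of-twenty check. This is a rough sketch: it uses dimension cardinalities (1825 days, 60 months, 20 quarters, 5 years) as an estimate of how many base rows each aggregate row would summarize. The real ratio depends on how sparsely the fact table is populated, so treat the result as a first screening only.

```python
RULE_OF_TWENTY = 20

# Level name -> number of members over the 5-year period shown on the slide.
time_levels = {"Date": 1825, "Month": 60, "Quarter": 20, "Year": 5}

def meets_rule(lower_count, higher_count, threshold=RULE_OF_TWENTY):
    """True when rolling up from the lower level to the higher level
    summarizes at least `threshold` lower-level members on average."""
    return lower_count / higher_count >= threshold

base_count = time_levels["Date"]

# Estimated compression when aggregating the daily grain to each higher level.
ratios = {name: base_count / count
          for name, count in time_levels.items() if name != "Date"}
passes = {name: meets_rule(base_count, count)
          for name, count in time_levels.items() if name != "Date"}
```

Aggregating days to months gives roughly 30 to 1 and passes, whereas building a quarterly aggregate on top of a monthly one would only compress 3 to 1, so `meets_rule(60, 20)` is false.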
Aggregate Navigation
Description
  Function provided by software layer:
    Aggregate Navigator
  Directs user queries to the most favorable available aggregate
  Transparent to the end user
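What an aggregate navigator does can be sketched in a few lines. This is an illustrative toy, not any vendor's implementation: the table names, level orderings and row counts are hypothetical, and "most favorable" is interpreted here as "fewest rows among the tables that can still answer the query".

```python
from dataclasses import dataclass

# Levels per dimension, from finest to coarsest (hypothetical).
LEVEL_ORDER = {
    "time": ["day", "month", "year"],
    "product": ["product", "category"],
    "market": ["market", "region"],
}

@dataclass(frozen=True)
class FactTable:
    name: str
    levels: dict      # dimension name -> level stored in this table
    row_count: int

def can_answer(table, query_levels):
    """A table can answer a query if, for every dimension the query uses,
    the table stores that dimension at the requested level or finer."""
    for dim, wanted in query_levels.items():
        stored = table.levels.get(dim)
        if stored is None:
            return False
        order = LEVEL_ORDER[dim]
        if order.index(stored) > order.index(wanted):
            return False
    return True

def choose_table(tables, query_levels):
    """Pick the smallest table that can still answer the query."""
    candidates = [t for t in tables if can_answer(t, query_levels)]
    return min(candidates, key=lambda t: t.row_count)

tables = [
    FactTable("sales_facts",
              {"time": "day", "product": "product", "market": "market"}, 1_000_000),
    FactTable("mthly_sales_facts_agg",
              {"time": "month", "product": "product", "market": "market"}, 40_000),
    FactTable("mthly_cat_sales_facts_agg",
              {"time": "month", "product": "category", "market": "market"}, 5_000),
]

yearly_by_category = choose_table(tables, {"time": "year", "product": "category"})
monthly_by_product = choose_table(tables, {"time": "month", "product": "product"})
daily = choose_table(tables, {"time": "day"})
```

The user always phrases the query against the base schema; the choice of physical table happens behind the scenes, which is what keeps aggregates transparent.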
Aggregate Framework
[Diagram: Business View and Designer View]
Aggregate Architecture
[Diagram: three configurations built from Client PC, Application Server and RDBMS, with flows labeled SQL and Aggregate Aware SQL]
Aggregate Deployment
Incremental
Based on usage
Transparent to users
Typically warehouse DBA responsibility
Aggregate Deployment
Build Sub
Exercise 3
Scenario
  Given the original star schema and the following hierarchy, design a two-way aggregate table structure that will drastically increase performance
  Make your own assumptions about summary levels
Exercise 3: Dimensional Model
Sales Facts: model_key, dealer_key, time_key, revenue, quantity
Model: model_key, category, line, model
Time: time_key, year, quarter, month, date
Dealer: dealer_key, region, state, city, dealer
Exercise 3
Scenario
  Industry: Automobile manufacturing
  Company: Millennium Motors
  Value chain focus: Sales
Sample business questions:
  What are the top 10 selling car models this month?
  How do this month's top 10 selling models compare to the top 10 over the last six months?
  Show me dealer sales by region by model by day
  What is the total number of cars sold by month by dealer by state?
Exercise 3
Millennium Motors dimensions
Model: All > Category > Line > Model name
Time: All > Year > Quarter > Month > Date
Dealer: All > Region > State > City > Dealer name
[Diagram: each hierarchy level is annotated with a cardinality; the figures are not legible in this copy]
Exercise 3 Worksheet
Exercise 3 Solution
Sales Facts (base fact table): model_key, dealer_key, time_key, revenue, quantity
Agg Sales Facts (two-way aggregate): state_key, month_key, model_key, revenue, quantity
Model: model_key, category, line, model
Time: time_key, year, quarter, month, date
Dealer: dealer_key, region, state, city, dealer
Month: month_key, year, quarter, month
State: state_key, region, state
Q & A
Multiple Fact Tables
Multiple Fact Tables
Different business processes usually require different fact tables
There are also several cases where a single business process will require multiple fact tables
  Core and custom
  Snapshot and transaction
  Coverage
  Aggregates
Different Business Processes
Different business processes usually require different fact tables
In practice, it may be hard to identify what a "process" is
Sometimes you can spot different processes because measures are recorded
  With different dimensions
  At differing grains
Different Dimensions or Grain
Shipment Facts: time_key, product_key, shipper_key, market_key, Quantity, Weight
Sales Facts: time_key, product_key, market_key, Quantity, Amount
Product: product_key, Category, Brand, Product, Diet Indicator
Shipper: shipper_key, name, type, mode, address
Time: time_key, Year, Fiscal Period, Month, Day, Day of Week
Market: market_key, Region, District, State, City
Different Dimensions or Grain
Don't take shortcuts with grain
The "not applicable" dimension value
  Using a "not applicable" row in a dimension confuses the grain and can introduce reporting difficulty
Different Points in Time
Sometimes, it is not easy to identify the discrete business processes
All measures may have the same dimensionality or grain
Different measures are recorded at different times
  Quantity sold is not recorded at the same time as quantity shipped
Different Timing
Building a single fact table would require recording zero or null for measures that are not applicable at a point in time
Reports would contain a confusing combination of zeros, nulls, and absence of data
Different Timing: One Fact Table
Sales and Shipment Facts: time_key, product_key, market_key, Quantity_sold, Amount_sold, Quantity_shipped, Amount_shipped
  Callout: Initially will be null
Time: time_key, Year, Fiscal Period, Month, Day, Day of Week
Market: market_key, Region, District, State, City
Product: product_key, Category, Brand, Product, Diet Indicator
Different Timing: Two Fact Tables
Shipment Facts: time_key, product_key, market_key, Quantity, Amount
Sales Facts: time_key, product_key, market_key, Quantity, Amount
Product: product_key, Category, Brand, Product, Diet Indicator
Market: market_key, Region, District, State, City
Time: time_key, Year, Fiscal Period, Month, Day, Day of Week
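With sales and shipments kept in two fact tables, a report that compares them summarizes each table separately to a common grain and then combines the results. The sketch below uses sqlite3 and made-up rows; the table and column names follow the slide, while the UNION ALL approach is one common way to write such a comparison and is not taken from the deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_facts (time_key INTEGER, product_key INTEGER,
                          market_key INTEGER, quantity INTEGER, amount REAL);
CREATE TABLE shipment_facts (time_key INTEGER, product_key INTEGER,
                             market_key INTEGER, quantity INTEGER, amount REAL);
""")

# Hypothetical rows: shipments are recorded on later days than the sales.
conn.executemany("INSERT INTO sales_facts VALUES (?, ?, ?, ?, ?)", [
    (1, 10, 1, 5, 50.0),
    (2, 10, 1, 3, 30.0),
    (2, 11, 1, 4, 40.0),
])
conn.executemany("INSERT INTO shipment_facts VALUES (?, ?, ?, ?, ?)", [
    (2, 10, 1, 6, 60.0),
    (3, 11, 1, 4, 40.0),
])

# Each fact table contributes its own measure; the other measure is padded
# with 0 so that both branches line up before the final GROUP BY.
sold_vs_shipped = conn.execute("""
SELECT product_key, SUM(qty_sold) AS qty_sold, SUM(qty_shipped) AS qty_shipped
FROM (
    SELECT product_key, quantity AS qty_sold, 0 AS qty_shipped FROM sales_facts
    UNION ALL
    SELECT product_key, 0, quantity FROM shipment_facts
)
GROUP BY product_key
ORDER BY product_key
""").fetchall()
```

Neither fact table needs null placeholders for the other process's measures; the padding only exists inside the report query.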
Identifying Different Processes
Look at the measures in question
Sort them into fact tables based on
  Dimensions
  Grain
  Differing timings of events measured
One Process, Multiple Fact Tables
Core and custom
Coverage
Snapshot and transaction
Aggregates
Core and Custom Schemas
There is a set of dimension attributes and measures shared in all cases
Depending on the value in a dimension, certain extra dimension attributes or measures are recorded
  Heterogeneous products
  Types of customers
Core and Custom
Account Facts (core): time_key, product_key, branch_key, customer_key, Balance, Transaction_count
Checking Account Facts (custom): time_key, checking_key, branch_key, customer_key, Balance, Transaction_count, ... custom checking facts
Product: product_key, ...
Checking Account: checking_key, ... custom checking attributes
Customer: customer_key, ...
Time: time_key, ...
Branch: branch_key, ...
Core and Custom
Core fact table and dimensions
  All attributes shared no matter what
  Appropriate for analysis across entire sub
A star schema usually measures events that happen
  Relationships between the dimensions involved are not captured if events do not happen
A coverage table fills the gap
  What did not sell that was on promotion?
  Who was assigned to that customer?
Usually "factless"
Measuring What Happened
Sales Facts: time_key, product_key, customer_key, rep_key, quantity, sales_dollars
Product: product_key, Category, Brand, Product, SKU
Customer: customer_key, Name, Company, Account, Phone_num
Time: time_key, Year, Fiscal Period, Month, Day, Day of Week
Sales_rep: rep_key, rep_name, rep_phone, Region, District, State, City
Sales facts does not reveal who is assigned to a customer if they do not sell
Coverage Table
Customer_coverage_facts shows who is assigned to a customer at a point in time
Customer Coverage Facts: time_key, customer_key, rep_key
Customer: customer_key, Name, Company, Account, Phone_num
Time: time_key, Year, Fiscal Period, Month, Day, Day of Week
Sales_rep: rep_key, rep_name, rep_phone, Region, District, State, City
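A factless coverage table becomes useful when it is compared against the fact table that records events. The sketch below, using sqlite3 with made-up keys, lists rep-to-customer assignments that produced no sale on a given day. The table names follow the slides; the anti-join query itself is an illustration and is not part of the deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_coverage_facts (time_key INTEGER, customer_key INTEGER,
                                      rep_key INTEGER);
CREATE TABLE sales_facts (time_key INTEGER, product_key INTEGER,
                          customer_key INTEGER, rep_key INTEGER,
                          quantity INTEGER, sales_dollars REAL);
""")

# Hypothetical data: three assignments on day 1, but only one sale.
conn.executemany("INSERT INTO customer_coverage_facts VALUES (?, ?, ?)",
                 [(1, 100, 7), (1, 101, 7), (1, 102, 8)])
conn.executemany("INSERT INTO sales_facts VALUES (?, ?, ?, ?, ?, ?)",
                 [(1, 10, 100, 7, 2, 20.0)])

# Coverage rows with no matching sales row: "who was assigned but did not sell?"
assigned_without_sale = conn.execute("""
SELECT c.customer_key, c.rep_key
FROM customer_coverage_facts c
LEFT JOIN sales_facts s
       ON s.time_key = c.time_key
      AND s.customer_key = c.customer_key
      AND s.rep_key = c.rep_key
WHERE c.time_key = ? AND s.customer_key IS NULL
ORDER BY c.customer_key
""", (1,)).fetchall()
```

The sales fact table alone could never return these rows, because it only holds events that actually happened.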
Snapshot and Transaction
Viewing a single process multiple ways
  Transactions
    The changes to what is being measured
  Snapshot
    The status at a point in time
Example
  Changes to inventory
  Current status of inventory
Snapshot
How much is on hand today?
How much was on hand yesterday?
Inventory Snapshot: time_key, product_key, location_key, quantity_on_hand
Product: product_key, Category, Brand, Product, SKU
Location: location_key, Warehouse, WH_code, City, State
Time: time_key, Year, Fiscal Period, Month, Day, Day of Week
Transaction
How did inventory change today?
How much product was returned due to failed inspection?
Inventory Transactions: time_key, product_key, location_key, transaction_type_key, transaction_amount
Product: product_key, Category, Brand, Product, SKU
Location: location_key, Warehouse, WH_code, City, State
Time: time_key, Year, Fiscal Period, Month, Day, Day of Week
Transaction_type: transaction_type_key, transaction_type_code, transaction_type, transaction_category
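The relationship between the two views can be shown by deriving snapshot rows from transaction rows with a running total. This is a minimal sketch with hypothetical data; it assumes every inventory change is captured as a signed transaction amount and that stock starts at zero. Real snapshot tables are often loaded directly from the source system rather than derived this way.

```python
from collections import defaultdict

# (time_key, product_key, location_key, signed transaction_amount) - made-up rows
transactions = [
    (1, "P1", "WH1", +100),
    (1, "P1", "WH1", -30),
    (2, "P1", "WH1", -20),
    (2, "P2", "WH1", +50),
    (3, "P1", "WH1", +10),
]

def build_snapshots(transactions, time_keys):
    """Return one (time_key, product, location, quantity_on_hand) row per
    product/location seen so far, for each period in time_keys."""
    by_period = defaultdict(list)
    for time_key, product, location, amount in transactions:
        by_period[time_key].append((product, location, amount))

    on_hand = defaultdict(int)   # running balance per (product, location)
    snapshots = []
    for time_key in time_keys:
        for product, location, amount in by_period.get(time_key, []):
            on_hand[(product, location)] += amount
        for (product, location), qty in sorted(on_hand.items()):
            snapshots.append((time_key, product, location, qty))
    return snapshots

snapshots = build_snapshots(transactions, [1, 2, 3])
```

The snapshot answers "how much is on hand today?" with a single row lookup, while the transaction table keeps the detail needed to explain why the balance changed.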
Aggregate Tables
Aggregate table
  A fact table that summarizes another fact table
  Created for performance reasons
  Covered in previous section
Design Tools for Multiple Tables
Create a set of matrices
  Facts vs dimension
  Facts vs dimensional attributes
Mark where facts apply to dimensions
Mark where facts apply to dimensional attributes
When facts don't apply, assume separate fact table
Example Matrix
Fact vs dimensional attribute matrix
[Matrix: rows Fact 1 to Fact 4, columns Attribute 1 to Attribute 8; check marks show which attributes apply to each fact, and the rows are bracketed into Fact Table 1 and Fact Table 2]
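The matrix technique amounts to grouping facts that share the same set of applicable dimensions or attributes. A small sketch of that grouping follows; the fact names and dimension sets are hypothetical, loosely modeled on the sales and shipment examples earlier in the deck.

```python
from collections import defaultdict

# Fact -> set of dimensions (or attributes) it applies to; illustrative values only.
fact_matrix = {
    "quantity_sold": {"time", "product", "market"},
    "amount_sold": {"time", "product", "market"},
    "quantity_shipped": {"time", "product", "market", "shipper"},
    "weight_shipped": {"time", "product", "market", "shipper"},
}

def group_into_fact_tables(matrix):
    """Facts with identical marks in the matrix land in the same fact table."""
    groups = defaultdict(list)
    for fact, applies_to in matrix.items():
        groups[frozenset(applies_to)].append(fact)
    return sorted((sorted(dims), sorted(facts)) for dims, facts in groups.items())

fact_tables = group_into_fact_tables(fact_matrix)
```

Here the four facts fall into two candidate fact tables, one with a shipper dimension and one without. Matching dimensionality is only the first test; grain and timing still have to be checked by hand.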
Multiple Fact Table Summary
Different processes need different tables
  Identified with
    Grain
    Dimensionality
    Timing
Same process may need multiple fact tables
  Heterogeneous attributes
  Coverage
  Snapshot and transaction
  Aggregates
Exercise 4
Scenario
  Industry: Automobile manufacturing
  Company: Millennium Motors
  Value chain focus: Sales
Sample business questions:
  What are the top 10 selling car models this month?
  How do this month's top 10 selling models compare to the top 10 over the last six months?
  Show me dealer sales by region by model by day.
  How many cars have been purchased over the last six months by customers with yearly household incomes greater than $200,000?
Exercise 4: continued
Using these source data elements, design a star schema that answers the proposed business questions
  Daily sales revenue
  Daily quantity sold
  Model
  Dealer
  Dealer city
  Product line
  Region where sold
  State
  Vehicle category
  Date of sales
  Customer name
  Customer zip code
  Customer yearly income
  P.O. Number
  Purchase price
  Discount amount
  Brand of car
Exercise 4: worksheet
Exercise 4 Solution: Matrix
Facts (rows): daily sales, daily quantity, purchase price, discount amount
Attributes (columns): Customer name, Customer zip code, Model, Customer income, Dealer, P.O. Number, Dealer city, Product line, Brand of car, Region where sold, State, Vehicle category, Date of sales
Exercise 4: Star schema
Daily Sales Facts: model_key, dealer_key, time_key, revenue, quantity
Customer Sales Facts: model_key, dealer_key, time_key, customer_key, po_number, purchase_price, discount_amt
Customer: customer_key, customer_name, customer_zip, yearly_income
Model: model_key, brand, category, line, model
Time: time_key, year, quarter, month, date
Dealer: dealer_key, region, state, city, dealer
Q & A
Architected Data Marts
Data Mart
Meaning of the term "data mart" has shifted over the last several years...
Data Mart Architecture 1993
[Diagram: Operational Systems > E.T.L. Software > Data Warehouse > E.T.L. Software > Data Marts > Query / Reporting Software > Analysis Users]
Data Mart Architecture 199
[Diagram: Operational Systems > E.T.L. Software > Data Marts > Query / Reporting Software > Analysis Users]
Architected Data Marts
[Diagram labels: Operational Systems, E.T.L. Software, Data Warehouse, Data Mart, Query / Reporting Software, Analysis Users]
Data Mart
Warehouse Sub
"Stovepipe" data marts
  Inconsistent and overlapping data
  Difficult and costly to maintain
  Redundant data load
  Can't drill across
  Integration requires starting over
  Dimensions not conformed
[Diagram labels: Product, Product, Time (Day), Shipments Facts, Warehouse, Warehouse Inventory Facts, Product, Month]
Conformed Dimensions
Definition
  Dimensions are conformed when they are the same
  or
  When one dimension is a strict rollup of another
Conformed Dimensions
Same dimensions must:
  1. ... have exactly the same set of primary keys
  and
  2. ... have the same number of records
Conformed Dimensions
Rolled up dimension
  When one dimension is a strict rollup of another
Which means
  Two conformed dimensions can be combined into a single logical dimension by creating a union of the attributes
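Both conformance tests can be expressed as small checks over dimension rows. The sketch below is one possible reading of the definitions, written for illustration: "same" follows the two conditions on the slide, while the precise meaning given to "strict rollup" (every combination of the shared attributes in the detailed dimension appears exactly once in the rolled-up one, and nothing else does) is an assumption. The sample rows are made up.

```python
def is_same_dimension(rows_a, rows_b, key):
    """Same dimension: identical set of primary keys and same number of records."""
    keys_a = {row[key] for row in rows_a}
    keys_b = {row[key] for row in rows_b}
    return keys_a == keys_b and len(rows_a) == len(rows_b)

def is_strict_rollup(detail_rows, rollup_rows, shared_attrs):
    """Assumed definition: the rollup holds each distinct combination of the
    shared attributes found in the detail dimension exactly once."""
    detail_combos = {tuple(row[a] for a in shared_attrs) for row in detail_rows}
    rollup_combos = [tuple(row[a] for a in shared_attrs) for row in rollup_rows]
    no_duplicates = len(rollup_combos) == len(set(rollup_combos))
    return no_duplicates and set(rollup_combos) == detail_combos

# Hypothetical Time (day grain) and Month dimensions.
time_rows = [
    {"time_key": 1, "year": 2024, "month": "Jan", "day": 1},
    {"time_key": 2, "year": 2024, "month": "Jan", "day": 2},
    {"time_key": 3, "year": 2024, "month": "Feb", "day": 1},
]
month_rows = [
    {"month_key": 1, "year": 2024, "month": "Jan"},
    {"month_key": 2, "year": 2024, "month": "Feb"},
]

same = is_same_dimension(time_rows, list(time_rows), "time_key")
rollup_ok = is_strict_rollup(time_rows, month_rows, ["year", "month"])
rollup_missing_feb = is_strict_rollup(time_rows, month_rows[:1], ["year", "month"])
```

A check like this could run as part of the load, so that a Month dimension which drifts away from the Time dimension is caught before users try to compare facts across the two.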
Conformed Dimensions
Description
  Shared common dimensions
  Integrates logical design
  Ensures consistency between data marts
  Allows incremental development
  Independent of physical location
  Some rework may be required
Conformed Dimensions
Advantages
  Enables an incremental development approach
  Easier and cheaper to maintain
  Drastically reduces extraction and loading complexity
  Answers business questions that cross data marts
  Supports both centralized and distributed architectures
Interlocking Star Schemas
Conformed Dimensions
[Diagram: Sales Facts, Shipment Facts and Inventory Facts linked through shared dimensions; labels: Store Dimension, Product Dimension, Time Dimension, Warehouse Dimension, Month Dimension]
Kimball's Data Warehouse Bus
Dimensions: Store, Product, Day, Warehouse, Month
Fact tables: Sales Facts, Shipment Facts, Inventory Facts
When to Conform
Two approaches
  Up-front
  As-you-go
Both approaches work
  Choose the approach that works for you
Conform Up Front
Cross-Enterprise Analysis
Create First-Cut Stars
All Sub
Design
Build Sub
Q & A
Course Review
Rationale for dimensional modeling
Dimensional modeling basics
Dimensional modeling details
Fact table details
Dimension table details
Design process
Aggregate schemas
Multiple fact tables