
CSE4DWD Assignment, Semester 1 2014 Monika Theilig 17656167

CSE 4DWD – Semester 1 2014

Table of Contents

1. Case Study

2. Defining the business processes

3. Data Warehouse bus matrix

4. Assumptions

5. Data Marts

5.1 Vehicle Rental Income

5.2 Side Product Sales

5.3 Promotion Revenue Analysis

5.4 Revenue Analysis Aggregation

5.5 VIP Card Processing

6. Fact Granularity

7. Dimensions and the Attribute Hierarchies

8. Other design features

9. Questions and Answers from Business Management

10. Bibliography


Case Study

• Vehicle Rental Oz is a large vehicle rental company chain with over 500 stores across all states of Australia. It rents out vehicles such as cars, vans, buses and trucks. Customers can rent vehicles online as well as from the stores. The stores also sell drinks, candies, posters, maps and VIP cards.

• Revenue is generated by rental fees, overdue fees and the sales of side items.

• Every week HQ sends a list of all available vehicles, including price, availability information, ratings and vehicle categories.

• The company has different methods of payment. VIP cards are different to cash / credit transactions. HQ management also analyses the usage of the cards for all customer classifications by store.

• Each store has a local operational database to capture its day-to-day rental and sales figures. The stores send the receipts file and the customer file to HQ every night.

• Management wishes to perform detailed analysis of the company’s performance and has decided to build a Data Warehouse to assist its business analysis and decision making for new vehicle purchases.

• Rental and Sales Analysis – The requirements are to see historical data and optimize human resource utilization at the POS. Management needs to build a monthly/quarterly/yearly top 10 list of vehicle categories per store, per suburb and per class of customer.

• Revenue streams from Rentals, Returns (Overdue Fees), VIP Cards and Side Item sales need to be analyzed. The revenue stream analysis will be performed using only the actual price of individual vehicles as indicated in the transaction records. The actual price includes any promotions.

• Management wants to compare the different revenue streams over various time periods.

• Management wants to do Promotion Revenue Analysis.


Define the Business Processes

• Business Process 1 – Vehicle Rental (Contract)
• Business Process 2 – Side Product Sales
• Promotion Revenue Analysis
• Aggregate Revenue Analysis
• VIP Card Processing


VehicleRentalOZ – Data Warehouse Bus Matrix

Dimensions (columns): Date, Product – Side Item, Vehicle, Vehicle Rental Group Key, Store-Location, Customer, Promotion, VIP Class, Payment Method (Junk Dimension), Media Type, Promotion Provider, Transaction

Business processes (rows):

Vehicle Rental Income: X X X X X X X X
Side Product Sales: X X X X X X X
Promotion Revenue Analysis: X X X X X X X X
Aggregate Revenue Analysis: X X X X X X X
VIP Card Processing: X X X X X X X X


4) Assumptions

I have made the following assumptions, and my data warehouse is built accordingly:

1) The driver and the customer are the same entity. I am assuming that VehicleRentalOz merely rents vehicles out and does not provide a chauffeur service, so all customers are their own drivers.

2) I am assuming that all customers are local. A much bigger data warehouse would have to be built to accommodate different countries and overseas addresses.

3) I am assuming that all transactions are made in Australian Dollars and have not made provision for any other currencies.


(i) Star Schema – Rental Fee Income


Vehicle Rental (Contract) Fact Table
Date Key (FK), Rental Date (FK), Planned Return Date (FK), Actual Return Date (FK), Customer ID (FK), Vehicle Registration no (FK), Vehicle Rental Group (FK), Take away Store ID (FK), Return Store ID (FK), Customer (FK), Promotion (FK), VIP Card (FK), Booking (FK), Payment Key (FK), VIP Class Dimension (FK), Invoice number (DD), Contract (DD), Rental Fee Total Income in Dollars, Rental Fee VIP in Dollars, Rental Fee VIP discount in Dollars, Overdue Fees in Dollars, Cost of Operations Dollar Amount, Promotion Cost in Dollars, Dollar Sale Side Items, Dollar Cost Side Item, Damage Payment, Penalty, Limit Mileage, Distance Travelled, Total Cost, Fuel level on return

DATE Dimension (views for 3 roles)
Date Key (PK), Day, Month, Year, Day of week, Month of year, Time

Vehicle Dimension
Vehicle Reg no (FK), Colour, Made year, Type ID, Model, Mileage, Availability, Condition, Hire cost per hour

Vehicle Rental Group Dimension (Minidimension)
Vehicle Rental Group Key (PK), Vehicle Type, Age group, Price Range, Gender

Store Dimension (views for 2 roles)
Store Key (PK), Store ID (NK), Store address, Store suburb, Store State

Booking Dimension (Factless Fact table)
Booking_no Key (PK), Booking Type description, Registration ID, Customer ID, Start Date, End Date

VIP Card Dimension
VIP card_ID (PK), Card Balance Starting, Card Balance Ending, Average Transaction amount, Average Top up amount, Number top ups in month, Number Rentals during month

Promotion Dimension (causal dimension)
Promotion Key (PK), Promotion Name, Price reduction type, Promotion Media Type, Ad Type, Display Type, Coupon Type, Ad media Name, Display Provider, Promotion Cost, Promotion Begin date, Promotion End date

Customer Dimension
Customer Key (PK), Customer ID (NK), First, Last, Birthdate, Address, City, State, Postal Code, Telephone no, VIP Card Class (FK), Return Protocol, Drivers License number

Payment Method (Junk Dimension)
Payment Key (PK), Payment Type, Payment Date, Discount, Surcharge

VIP Class Dimension
VIP Class Key (PK), Classification, Value Spend

VIP Customer Demographics (Outrigger)
Neighborhoods

Protocol (Factless Fact table)
Protocol ID, Contract (DD)

Return Protocol (Factless Fact table)
Return Protocol ID, Protocol ID
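As a rough illustration of how a star schema like this might be queried, the sketch below builds a heavily cut-down version in SQLite from Python and ranks store / vehicle-type combinations by rental income. The table and column names (rental_fact, store_dim, vehicle_group_dim, rental_fee_income) and the sample rows are simplified stand-ins assumed for the example, not the physical design of the assignment.

```python
import sqlite3

def top_vehicle_types(conn, limit=10):
    """Rank (store suburb, vehicle type) pairs by total rental income."""
    return conn.execute(
        """
        SELECT s.store_suburb, g.vehicle_type, SUM(f.rental_fee_income) AS income
        FROM rental_fact f
        JOIN store_dim s ON s.store_key = f.takeaway_store_key
        JOIN vehicle_group_dim g ON g.group_key = f.vehicle_group_key
        GROUP BY s.store_suburb, g.vehicle_type
        ORDER BY income DESC
        LIMIT ?
        """,
        (limit,),
    ).fetchall()

# Hypothetical, simplified tables: one fact table joined to two dimensions
# through integer surrogate keys.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store_dim (
    store_key INTEGER PRIMARY KEY, store_id TEXT,
    store_suburb TEXT, store_state TEXT);
CREATE TABLE vehicle_group_dim (
    group_key INTEGER PRIMARY KEY, vehicle_type TEXT, price_range TEXT);
CREATE TABLE rental_fact (
    invoice_number TEXT,
    takeaway_store_key INTEGER REFERENCES store_dim(store_key),
    vehicle_group_key INTEGER REFERENCES vehicle_group_dim(group_key),
    rental_fee_income REAL);
INSERT INTO store_dim VALUES
    (1, 'S001', 'Bundoora', 'VIC'), (2, 'S002', 'Parramatta', 'NSW');
INSERT INTO vehicle_group_dim VALUES (1, 'Car', 'Low'), (2, 'Truck', 'High');
INSERT INTO rental_fact VALUES
    ('INV-1', 1, 1, 120.0), ('INV-2', 1, 1, 80.0),
    ('INV-3', 1, 2, 300.0), ('INV-4', 2, 1, 150.0);
""")

top_rows = top_vehicle_types(conn)
```

Adding a date dimension to the join and grouping by month, quarter or year would give the periodic top 10 lists mentioned in the case study.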


(ii) Star Schema – Sales OTHER

Item Sales – Fact Table
Date (FK), Store (FK), Product (FK), Customer (FK), Promotion (FK), VIP Class (FK), Transaction type (FK), Payment method (FK), POS Transaction number (DD), Sales Other Income in Dollars, Cost – Other in Dollars

DATE Dimension
Date Key (PK), Day, Month, Year, Time

Store Dimension
Store Key (PK), Store ID (NK), Store address, Store suburb, Store State

Product Dimension
Product Key (PK), Product Description, Category Description

Customer Dimension
Customer Key (PK), Customer ID (NK), Name, Address, Birthday, Telephone no, VIP Card Class (FK)

VIP Class Dimension
VIP Class Key (PK), Classification, Value Spend

VIP Customer Demographics (Outrigger)
Neighborhoods

Payment Method (Junk Dimension)
Payment Key (PK), Payment Type, Payment Date, Discount, Surcharge

Transaction Type Dimension
Transaction type (PK), Transaction description


(iii) Star Schema - Promotion Revenue Analysis


Promotion Fact Table
Date Key (FK), Vehicle Key (FK), Store Key (FK), Promotion Key (FK), Media Type (FK), Promotion Provider (FK), Transaction Type (FK), Rental Revenue Dollar Amount, Cost Dollar Amount, Gross Profit Dollar Amount

DATE Dimension
Date Key (PK), Day, Month, Year, Time

Store Dimension
Store Key (PK), Store ID (NK), Store address, Store suburb, Store State

Promotion Dimension
Promotion Key (PK), Promotion Name, Price reduction type, Promotion Media Type, Ad Type, Display Type, Coupon Type, Ad media Name, Display Provider, Promotion Cost, Promotion Begin date, Promotion End date

Promotion Provider Dimension
Promotion Provider (PK), Name, Address

Media Type Dimension
Media Type Key (PK), Vehicle Type

Vehicle Dimension
Vehicle Key (PK), Vehicle Registration ID, Colour, Made year, Type ID, Model, Mileage, Availability, Condition

Transaction Dimension
Transaction type (PK)


(iv) Star Schema – Aggregate Revenue Analysis


Aggregate Revenue Snapshot Fact
Month end Date Key (FK), Year end Date Key (FK), Customer ID (FK), Vehicle ID (FK), Vehicle Rental Group Key (FK), Product Dimension (FK), Promotion (FK), Store (FK), VIP Card (FK), Transaction Type (FK), Actual Price, Vehicle Rental Total Income in Dollars, Income Side Items in Dollars, Cost Side Items in Dollars, Rental Fee VIP in Dollars, Rental Fee VIP discount in Dollars, Overdue Fees in Dollars, Cost of Operations Dollar Amount, Promotion Cost in Dollars, Damage Payment, Penalty

DATE Dimension (multiple views)
Date Key (PK), Day, Month, Year, Time, Fiscal Year Month, Fiscal Quarter, Fiscal Year

Vehicle Dimension
Vehicle Key (PK), Vehicle Registration ID, Colour, Made year, Type ID, Model, Mileage, Availability, Condition, Photo

Vehicle Rental Group Dimension (Minidimension)
Vehicle Rental Group Key (PK), Vehicle Type, Age group, Price Range, Gender

Store Dimension
Store Key (PK), Store ID (NK), Store address, Store suburb, Store State

VIP Card Dimension
VIP card_ID (PK), Card Balance Starting, Card Balance Ending, Average Transaction amount, Average Top up amount, Number top ups in month, Number Rentals during month

Customer Dimension
Customer Key (PK), Customer ID (NK), First, Last, Birthdate, Address, City, State, Zip, Telephone no, VIP Card Class (FK), Return Protocol, Drivers License number

Product Dimension
Product Key (PK), Product Description, Category Description

Transaction Dimension
Transaction Type (PK)


(v) Star schema VIP Card Processing

VIP Cards Fact Table
Date Key (FK), Customer ID (FK), Vehicle Dimension (FK), Vehicle Rental Group (FK), VIP Card (FK), VIP Class (FK), Transaction Type Dimension (FK), Dollar Amount opening balance, Dollar Amount added, Dollar Amount subtracted, Dollar Amount Closing Balance

DATE Dimension
Date Key (PK), Day, Month, Year, Time

Vehicle Dimension
Vehicle Key (PK), Vehicle Registration ID, Colour, Made year, Type ID, Model, Mileage, Availability, Condition, Photo

Vehicle Rental Group Dimension (Minidimension)
Vehicle Rental Group Key (PK), Vehicle Type, Age group, Price Range, Gender

Store Dimension
Store Key (PK), Store ID (NK), Store address, Store suburb, Store State

VIP Card Dimension
VIP card_ID (PK), Card Balance Starting, Card Balance Ending, Average Transaction amount, Average Top up amount, Number top ups in month, Number Rentals during month

Customer Dimension
Customer Key (PK), Customer ID (NK), Customer Location (suburb), First, Last, Birthdate, Address, City, State, Zip, Telephone no, VIP Card Class (FK), Return Protocol, Drivers License number

VIP Class Dimension
VIP Class Key (PK), Classification, Value Spend

VIP Customer Demographics (Outrigger)
Neighborhood

Transaction Dimension
Transaction Type (PK)


iii) Granularity

Fact Table: Vehicle Rental Transaction
Fact: The Vehicle Rental Transaction fact table is used to register the contract for the customer ID.
Granularity: Each vehicle per rental
Brief Justification: Analysis per vehicle per month

Fact Table: Side Product – Sales
Fact: One item of sale
Granularity: One item per sale
Brief Justification: Analysis of side product sales per month

Fact Table: Promotion Revenue Analysis
Fact: One promotion transaction in dollars
Granularity: One promotion sale
Brief Justification: Promotion analysis

Fact Table: Aggregate Revenue Analysis
Fact: One row for each account
Granularity: One row for each account at the end of each month
Brief Justification: Revenue analysis

Fact Table: VIP Card Processing
Fact: One VIP card transaction
Granularity: One row per VIP card transaction
Brief Justification: VIP card analysis
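To make the difference between the transaction grain and the month-end snapshot grain concrete, here is a minimal sketch of how transaction-level rows could be rolled up into one row per account per month. The tuple layout (account id, transaction date, dollar amount) and the function name are assumptions made for this illustration only.

```python
from collections import defaultdict
from datetime import date

def monthly_snapshot(transactions):
    """Roll transaction-grain rows up to one row per (account, year, month)."""
    totals = defaultdict(float)
    for account_id, txn_date, amount in transactions:
        totals[(account_id, txn_date.year, txn_date.month)] += amount
    return dict(totals)

# Hypothetical transaction-grain input: one tuple per rental transaction.
sample_transactions = [
    ("C100", date(2014, 3, 2), 120.0),
    ("C100", date(2014, 3, 20), 80.0),
    ("C100", date(2014, 4, 1), 50.0),
    ("C200", date(2014, 3, 5), 300.0),
]
snapshot = monthly_snapshot(sample_transactions)
```

The snapshot has fewer rows than the transaction table, which is why it suits period-over-period revenue comparison, while per-vehicle or per-rental questions still need the finer grain.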


iv) Dimensions

Dimension: Date Dimension
Brief Justification: 3 views of the Date Dimension are necessary to show the rental date, planned return date and actual return date.
Attribute Hierarchies: Day, Month, Year, Time

Dimension: Product – Side Item Dimension
Brief Justification: The side item dimension will list all the products other than vehicles that the company sells to make revenue.
Attribute Hierarchies: Description, Category

Dimension: Vehicle Dimension
Brief Justification: This dimension needs to describe the vehicle and provide all its details, including condition, model, year of manufacture etc.
Attribute Hierarchies: Vehicle Rego, Colour, Made Year, Type, Model, Mileage, Availability, Condition

Dimension: Vehicle Rental Group Dimension (Minidimension)
Brief Justification: This dimension needs to show what group or range of vehicle the vehicle falls into.
Attribute Hierarchies: Vehicle type, Age group, Price range, Gender

Dimension: Store Dimension
Brief Justification: This gives all the information regarding the store.
Attribute Hierarchies: Number, Street, Suburb, City, Code, State

Dimension: Customer Dimension
Brief Justification: This gives all the information regarding the customer/driver of the vehicle.
Attribute Hierarchies: First name, Last name, Birthdate, Address, Tel no, VIP card class, Drivers License no, Insurance

Dimension: Promotion Dimension
Brief Justification: This will give all the details regarding the promotion.
Attribute Hierarchies: Promotion name, Price reduction type, Ad type

Dimension: VIP Class Dimension
Brief Justification: This dimension shows the classification of the customers' VIP cards, e.g. Gold, Platinum, Silver.
Attribute Hierarchies: Classification, Value spend

Dimension: VIP Class Demographics Dimension (Outrigger)

Dimension: Payment Method (Junk Dimension)
Brief Justification: This shows how the customer will be paying for the rental of the vehicle or product: cash, credit card etc.
Attribute Hierarchies: Payment type, Date, Discount, Surcharge

Dimension: Media Type Dimension
Brief Justification: This shows what media type will be used for the promotion of the vehicle.
Attribute Hierarchies: Media type, Vehicle

Dimension: Promotion Provider Dimension
Brief Justification: This will tell us who the promotion provider was.
Attribute Hierarchies: Name, Address, Description


v) Design Features Used

Design Feature: Factless Fact table (Booking)
Brief Description: Use a factless fact table for Booking. A factless fact table has no metrics.
Brief Justification: There are no table rows with zero facts. Putting the bookings in the rental fact table would make that table too large.

Design Feature: Factless Fact table (Protocol and Return Protocol)
Brief Description: Use a factless fact table for Protocol and Return Protocol. A factless fact table has no metrics.
Brief Justification: There are no table rows with zero facts. Putting the protocols in the rental fact table would make that table too large.

Design Feature: Degenerate Dimension
Brief Description: Used for sales of items such as drinks, posters, candies, maps, VIP cards etc., and also for invoice numbers generated for clients. The POS Transaction number key and Invoice number key sit in the fact table, stripped of all the descriptive items that might otherwise fall in a POS transaction dimension.
Brief Justification: Since the dimension is empty we refer to the POS transaction number as a degenerate dimension (DD). The grain is a single transaction or line item because the DD represents the unique identifier of the parent.

Design Feature: Surrogate Keys
Brief Description: Every join between fact tables and dimension tables in the DW should be based on meaningless integer surrogate keys.
Brief Justification: Surrogate keys are like an immunisation (flu shot) for the DW. They buffer the DW environment from operational changes. Surrogate keys can also record conditions that have no operational codes, such as “No promotion in effect”. Surrogate keys are very important for the date dimension, e.g. January 1 = surrogate key 1.

Design Feature: Outrigger for Customer Demographics
Brief Description: The customers are from different neighborhoods.
Brief Justification: Rather than repeating this large block of data for every customer within a neighborhood we opt to model it as an outrigger.

Design Feature: Junk Dimension for Payment Method
Brief Description: We create a separate Junk Dimension for the payment type used by customers. We remove these indicators from the order fact table and place them in a single dimension. (Use a surrogate key for the foreign keys.)
Brief Justification: An abstract dimension with the decodes for a group of low-cardinality flags and indicators, thereby removing the flags from the fact table.

Design Feature: Minidimension for VIP Class
Brief Description: The VIP class is a dimension that would be changing frequently. Customers would be going from silver to gold and back down from gold to silver etc. A separate minidimension is created. Continuously variable attributes such as top-up amounts and rental amounts would be in the minidimension.
Brief Justification: To address both the performance and change tracking challenges we break off frequently changing attributes into a separate dimension called a minidimension. The changes are also captured.

Design Feature: Dimension Role Playing for Date
Brief Description: There are different dates: the date the vehicle was rented out, the planned return date and the actual return date. 3 different views are needed.
Brief Justification: Role playing occurs when a single dimension appears simultaneously several times in the same fact table. The dimension exists as a single physical table, but each of the roles should be presented in a separate labelled view.
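The role-playing idea can be sketched as one physical date table with one labelled view per role. The example below uses SQLite from Python. The table, view and column names are hypothetical, and it assumes date surrogate keys are assigned in chronological order so that comparing two keys compares two dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_dim (
    date_key INTEGER PRIMARY KEY, day INTEGER, month INTEGER, year INTEGER);
CREATE TABLE rental_fact (
    invoice_number TEXT,
    rental_date_key INTEGER,
    planned_return_date_key INTEGER,
    actual_return_date_key INTEGER,
    overdue_fees REAL);
INSERT INTO date_dim VALUES (1, 1, 3, 2014), (2, 5, 3, 2014), (3, 7, 3, 2014);
INSERT INTO rental_fact VALUES
    ('INV-1', 1, 2, 3, 40.0),
    ('INV-2', 1, 2, 2, 0.0);
""")

# One labelled view per role over the single physical date table.
for role in ("rental", "planned_return", "actual_return"):
    conn.execute(
        f"CREATE VIEW {role}_date AS "
        f"SELECT date_key AS {role}_date_key, day AS {role}_day, "
        f"month AS {role}_month, year AS {role}_year FROM date_dim"
    )

def late_returns(conn):
    """Invoices whose actual return date falls after the planned return date."""
    return conn.execute(
        """
        SELECT f.invoice_number, p.planned_return_day, a.actual_return_day
        FROM rental_fact f
        JOIN planned_return_date p
          ON p.planned_return_date_key = f.planned_return_date_key
        JOIN actual_return_date a
          ON a.actual_return_date_key = f.actual_return_date_key
        WHERE f.actual_return_date_key > f.planned_return_date_key
        """
    ).fetchall()

late = late_returns(conn)
```

Because each view renames its columns, a query can join the fact table to several date roles at once without ambiguous column names.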


v) Design Features Used (continued)

Design Feature: Dimension Role Playing for Store
Brief Description: The vehicles will be rented in one store but can be returned in another store. 2 different views are needed.
Brief Justification: Role playing occurs when a single dimension appears simultaneously several times in the same fact table. The dimension exists as a single physical table, but each of the roles should be presented in a separate labelled view.

Design Feature: No Foreign Key Handling (No Promotion In Effect)
Brief Description: No promotion took place.
Brief Justification: A null key will not be stored in the fact table. Instead, a single row will be used in the Promotion dimension with its own unique key to identify “No promotion in effect”, which avoids a null promotion key in the fact table.

Design Feature: No Foreign Key Handling (Non Customer Purchases)
Brief Description: Somebody purchased a product from a store but was not a customer.
Brief Justification: A null key will not be stored in the fact table. Instead, a single row will be used in the Customer dimension with its own unique key to identify “Non Customer” purchases.

Design Feature: Type 2 Slowly Changing Dimension
Brief Description: Customer information will change over time. A Type 2 slowly changing dimension will add a new row to the table.
Brief Justification: A type 2 response is the primary technique for accurately tracking slowly changing dimension attributes. The new dimension row automatically partitions history in the fact table.
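A Type 2 change can be sketched in a few lines: the current customer row is closed off and a new row with a fresh surrogate key is appended. This is only an illustration. The row_start and row_end columns are not part of the Customer dimension listed in this design and are assumed here as one common way of marking the current version. The function and class names are likewise hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

@dataclass
class CustomerRow:
    customer_key: int                # surrogate key, new for every version
    customer_id: str                 # natural key from the store system
    address: str
    row_start: date                  # assumed housekeeping column
    row_end: Optional[date] = None   # None marks the current version

def apply_type2_change(rows: List[CustomerRow], customer_id: str,
                       new_address: str, change_date: date) -> CustomerRow:
    """Expire the current row for customer_id and append a new version."""
    current = next(r for r in rows
                   if r.customer_id == customer_id and r.row_end is None)
    if current.address == new_address:
        return current               # nothing changed, keep the current row
    current.row_end = change_date
    new_row = CustomerRow(
        customer_key=max(r.customer_key for r in rows) + 1,
        customer_id=customer_id,
        address=new_address,
        row_start=change_date,
    )
    rows.append(new_row)
    return new_row

customer_dim = [CustomerRow(1, "C100", "1 Old St", date(2013, 1, 1))]
latest = apply_type2_change(customer_dim, "C100", "2 New Rd", date(2014, 3, 1))
```

Fact rows loaded before the change keep pointing at surrogate key 1 and later rows point at key 2, which is how the new dimension row partitions history in the fact table.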


Identify which fields from your Fact / Dimensions are required to answer each of the following business questions:

• 1) Was the promotion profitable? That is, did the products under promotion experience an increase in rental during the promotion period?
• You would look at your Rental Fee Income in dollars for the period between the Promotion Begin Date and the Promotion End Date and subtract the Promotion Cost as well as the Cost of Operations for that same period.

• 2) Did any stores rent out more vehicles during the promotions? Does this vary across different months or event types?
• You would look at your Take Away Store ID and analyze the highest Rental Revenue Income per month for each event between the Promotion Begin Date and the Promotion End Date.

• 3) What products were on promotion but did not sell?
• You would look at the Product Category and the Dollar Sale Side Items between the Promotion Begin Date and the Promotion End Date. The items with zero or very low sales would answer your question.

• 4) For which customers have we provided the most products? How much do we make a year out of our top 5 customers?
• You would look at the customer, the product, the vehicle and the Total Revenue Income per customer.

• 5) Which categories of vehicles have made the highest profit?
• You would look in the Vehicle Rental Group Dimension under Vehicle Type and then subtract the Cost from the Total Income per vehicle type.

• 6) What is the main location of those people (customers) renting online? What event types are they attending?
• You would look at the Booking Type description in the Booking factless fact table and also their demographics in the customer demographics outrigger. You would also look at the Promotion Name in the Promotion dimension.
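The calculation described for question 1 can be written out as a small worked example: rental income inside the promotion window, less the cost of operations for the same rows and the promotion cost. The dictionary keys mirror the fact and dimension fields named above, but the function, the row layout and the figures are made up for illustration.

```python
from datetime import date

def promotion_profit(rentals, promo_begin, promo_end, promotion_cost):
    """Rental income in the promotion window minus operating and promotion cost."""
    in_window = [r for r in rentals
                 if promo_begin <= r["rental_date"] <= promo_end]
    income = sum(r["rental_fee_income"] for r in in_window)
    operations = sum(r["cost_of_operations"] for r in in_window)
    return income - operations - promotion_cost

# Hypothetical fact rows; the last one falls outside the promotion window.
sample_rentals = [
    {"rental_date": date(2014, 3, 3), "rental_fee_income": 500.0, "cost_of_operations": 100.0},
    {"rental_date": date(2014, 3, 9), "rental_fee_income": 300.0, "cost_of_operations": 50.0},
    {"rental_date": date(2014, 4, 2), "rental_fee_income": 900.0, "cost_of_operations": 200.0},
]
profit = promotion_profit(sample_rentals, date(2014, 3, 1), date(2014, 3, 31), 200.0)
```

A positive result only says the promotion window covered its own costs. Judging whether rentals increased needs the same figure for a comparable period without the promotion.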


Identify which fields from your Fact / Dimensions are required to answer each of the following business questions (continued):

• 8) Did the products under promotion show a drop in rental just prior to or after the promotion, thereby cancelling any gain?
• You would have to look at the Rental Fee Income and analyze the dates around the Promotion Begin Date and the Promotion End Date.

• 9) How much have we made on overdue fees in the last 5 years?
• You would have to analyze the Overdue Fees in Dollars as well as the Aggregate Revenue in Dollars fact per year and add the last 5 years together.

• How much did we make on rental in the last financial year? How does this compare to the past 2 years?
• You would look at the Aggregate Revenue for Rental and the Aggregate Revenue for your side products and add them together for the current year. You would then compare these figures with the historical figures for the last 2 years as per your date dimension.
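For the overdue-fee question, the per-year totals and their sum could be computed along these lines. The sketch assumes that "the last 5 years" means the five calendar years ending with the current year, and that each input row is a (year, overdue fees in dollars) pair taken from the aggregate revenue fact. Both are assumptions made for the example.

```python
def overdue_fees_last_n_years(yearly_rows, current_year, n=5):
    """Return per-year overdue fee totals and their sum for the last n years."""
    per_year = {}
    for year, overdue_fees in yearly_rows:
        if current_year - n < year <= current_year:
            per_year[year] = per_year.get(year, 0.0) + overdue_fees
    return per_year, sum(per_year.values())

# Hypothetical rows; 2009 falls outside the five-year window ending in 2014.
sample_rows = [
    (2009, 10.0), (2010, 20.0), (2011, 30.0), (2012, 40.0),
    (2013, 50.0), (2014, 60.0), (2014, 5.0),
]
fees_by_year, fees_total = overdue_fees_last_n_years(sample_rows, current_year=2014)
```

Using the Fiscal Year attribute of the date dimension in place of the calendar year would give the financial-year comparison asked for in the last question.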


10) Bibliography

• Dr Jinli Cao, Lecture Notes
• Kimball, Ross, The Data Warehouse Toolkit, second edition, 2002
• Various Internet websites