Presented by: Vivek Sharma (09BM8062), Ashutosh Sinha (09BM8067), MBA 2nd year, VGSOM, IIT Kharagpur. BI in Retail Industry

It bi retail

Page 1: It bi retail

Presented by: Vivek Sharma (09BM8062)

Ashutosh Sinha (09BM8067), MBA 2nd year

VGSOM, IIT Kharagpur

BI in Retail Industry

Page 2: It bi retail

Retail Industry: Case study

We are running a retail business with 100 retail stores spread over five states.

Each store has a full complement of departments, including grocery, frozen foods, dairy, meat, produce, bakery, floral, and health/beauty aids.

Each store has roughly 60,000 individual products, called stock keeping units (SKUs), on its shelves.

About 55,000 of the SKUs come from outside manufacturers and have bar codes imprinted on the product package. These bar codes are called universal product codes (UPCs). UPCs are at the same grain as individual SKUs.

The remaining 5,000 SKUs come from departments such as meat, produce, bakery, or floral. While these products don’t have nationally recognized UPCs, the grocery chain assigns SKU numbers to them.

Page 3: It bi retail

Retail Industry: Case study (contd.)

Data is collected at two points:

Point of sale (POS): when customers make purchases.

Data collection point: when vendors deliver materials.

Management concern: maximizing profit.

Charging as much as possible.

Lowering the cost of product acquisition and overheads.

Attracting more and more customers in a highly competitive pricing environment.

Other major management decisions revolve around:

Promotions.

Pricing.

Page 4: It bi retail

Dimensional design process

It consists of four steps:

1. Selecting the business process.

2. Declaring the grain.

3. Choosing the dimensions.

4. Identifying the facts.

Page 5: It bi retail

Step 1: Selecting the business process

In our retail case study, management wants to better understand customer purchases as captured by the POS system.

Thus the business process we’re going to model is POS retail sales.

This data will allow us to analyze what products are selling in which stores on what days under what promotional conditions.

Page 6: It bi retail

Step 2: Declaring the grain

In our case study, the most granular data is an individual line item on a POS transaction. To ensure maximum dimensionality and flexibility, we will proceed with this grain.

While users probably are not interested in analyzing single items associated with a specific POS transaction, we can’t predict all the ways that they’ll want to cull through that data.

For example, business users may want to understand the difference in sales on Monday versus Sunday. Or they may want to assess whether it’s worthwhile to stock so many individual sizes of certain brands, such as cereal. Or they may want to understand how many shoppers took advantage of the 50-cents-off promotion on shampoo.

While none of these queries calls for data from one specific transaction, they are broad questions that require detailed data sliced in very precise ways.
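As a minimal sketch of why the line-item grain pays off, the Monday-versus-Sunday question can be answered by rolling individual line items up to the day of the week. The sample rows, field order, and the `sales_by_weekday` helper are hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical POS line items:
# (transaction_id, sale_date, product, quantity, sales_dollar_amount)
line_items = [
    ("T1", date(2024, 1, 7), "Cereal 500g", 1, 4.50),  # a Sunday
    ("T1", date(2024, 1, 7), "Shampoo", 2, 6.00),      # a Sunday
    ("T2", date(2024, 1, 8), "Cereal 1kg", 1, 8.00),   # a Monday
    ("T3", date(2024, 1, 8), "Shampoo", 1, 3.50),      # a Monday
]

def sales_by_weekday(rows):
    """Sum the sales dollar amount of every line item per day of the week."""
    totals = defaultdict(float)
    for _txn, sale_date, _product, _qty, amount in rows:
        totals[DAY_NAMES[sale_date.weekday()]] += amount
    return dict(totals)

weekday_totals = sales_by_weekday(line_items)
print(weekday_totals)
```

Because nothing was aggregated away at load time, the same rows could just as well be grouped by product size or by promotion.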

Page 7: It bi retail

Step 3: Choosing the dimensions

The primary dimensions for our grain are the date, product, and store.

We assume that the calendar date is the date value delivered to us by the POS system. Later, we will see what to do if we also get a time of day along with the date.

Within the framework of the primary dimensions, we can ask whether other dimensions can be attributed to the data, such as the promotion under which the product is sold.

Page 8: It bi retail

Basic outline: Retail sales schema

TBD: to be decided
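The outline at this stage can be sketched as a star schema in SQLite: four dimension tables and a fact table that so far holds only their keys, since the measured facts are still to be decided. All table and column names here are assumptions made for illustration, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_dim      (date_key      INTEGER PRIMARY KEY, full_date TEXT);
CREATE TABLE product_dim   (product_key   INTEGER PRIMARY KEY, sku_number TEXT);
CREATE TABLE store_dim     (store_key     INTEGER PRIMARY KEY, store_name TEXT);
CREATE TABLE promotion_dim (promotion_key INTEGER PRIMARY KEY, promotion_name TEXT);

-- One row per POS line item. Measured facts are added in step 4.
CREATE TABLE pos_sales_fact (
    date_key      INTEGER NOT NULL REFERENCES date_dim(date_key),
    product_key   INTEGER NOT NULL REFERENCES product_dim(product_key),
    store_key     INTEGER NOT NULL REFERENCES store_dim(store_key),
    promotion_key INTEGER NOT NULL REFERENCES promotion_dim(promotion_key)
);
""")

# List the fact table columns to confirm it currently holds only dimension keys.
fact_columns = [row[1] for row in conn.execute("PRAGMA table_info(pos_sales_fact)")]
print(fact_columns)
```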

Page 9: It bi retail

Step 4: Identifying the facts

The facts collected by the POS system include the sales quantity (e.g., the number of cans of chicken noodle soup), the per-unit sales price, and the sales dollar amount. In some cases they may also include the cost dollar amount.

Three of the facts, sales quantity, sales dollar amount, and cost dollar amount, are beautifully additive across all the dimensions. We can slice and dice the fact table with impunity, and every sum of these three facts is valid and correct.

By contrast, facts such as gross margin (a ratio) and unit price are non-additive and should be calculated at query time from the additive facts. Gross profit, the sales dollar amount minus the cost dollar amount, is itself additive.
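A small worked example may make the distinction concrete. The fact rows below are invented. The point is only that ratio-style measures such as gross margin and unit price have to be derived from the summed additive facts, not averaged row by row.

```python
# Hypothetical fact rows:
# (sales_quantity, sales_dollar_amount, cost_dollar_amount)
fact_rows = [
    (1, 10.0, 6.0),
    (3, 60.0, 30.0),
]

# The additive facts can simply be summed across any slice of the data.
total_quantity = sum(q for q, _s, _c in fact_rows)
total_sales = sum(s for _q, s, _c in fact_rows)
total_cost = sum(c for _q, _s, c in fact_rows)

# Derived measures are computed from the sums at query time.
gross_profit = total_sales - total_cost
gross_margin = gross_profit / total_sales
average_unit_price = total_sales / total_quantity

# Averaging per-row margins gives a different, misleading number.
naive_margin = sum((s - c) / s for _q, s, c in fact_rows) / len(fact_rows)

print(gross_profit, round(gross_margin, 4), average_unit_price, round(naive_margin, 4))
```

Here the margin computed from the totals is 34/70, while the average of the per-row margins is 0.45, which is why ratios are better left out of the stored rows.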

Page 10: It bi retail

Measured facts in the Retail sales schema.

Page 11: It bi retail

The Date dimension

Many date attributes are not supported by SQL functions.

It is better to store them than to calculate them on the fly.
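One way to picture "store rather than calculate" is to pre-build one row per calendar day with its descriptive attributes. The sketch below is an assumption-laden illustration: the attribute names, the `HOLIDAYS` set, and the `build_date_dimension` helper are all hypothetical, and a real date dimension would carry many more columns.

```python
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical holiday list; business calendars like this cannot be
# derived from a date by a standard SQL function.
HOLIDAYS = {date(2024, 1, 1)}

def build_date_dimension(start, end):
    """Return one dictionary per calendar day from start to end inclusive."""
    rows = []
    current = start
    key = 1
    while current <= end:
        rows.append({
            "date_key": key,
            "full_date": current.isoformat(),
            "day_of_week": DAY_NAMES[current.weekday()],
            "calendar_month": current.month,
            "calendar_quarter": (current.month - 1) // 3 + 1,
            "calendar_year": current.year,
            "weekend_indicator": "Weekend" if current.weekday() >= 5 else "Weekday",
            "holiday_indicator": "Holiday" if current in HOLIDAYS else "Non-Holiday",
        })
        current += timedelta(days=1)
        key += 1
    return rows

date_rows = build_date_dimension(date(2024, 1, 1), date(2024, 1, 7))
print(date_rows[0])
```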

Page 12: It bi retail

The Product dimension

Facilitates slicing and dicing.

Page 13: It bi retail

The Store dimension

Strict star schema.

Page 14: It bi retail

The Promotion dimension

All four factors affecting sales are combined in a single row.

May be difficult for users to understand.

Avoid null keys.
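The "avoid null keys" tip can be sketched as follows: instead of leaving the promotion key empty for items sold without a promotion, the dimension carries an explicit "no promotion" row that those fact rows point to. The table names, the key value 0, and the sample data are assumptions chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE promotion_dim (
    promotion_key  INTEGER PRIMARY KEY,
    promotion_name TEXT NOT NULL
);
CREATE TABLE sales_fact (
    promotion_key       INTEGER NOT NULL REFERENCES promotion_dim(promotion_key),
    sales_dollar_amount REAL NOT NULL
);

-- Key 0 is a hypothetical placeholder row for "no promotion".
INSERT INTO promotion_dim VALUES (0, 'No promotion in effect'),
                                 (1, '50 cents off shampoo');
INSERT INTO sales_fact VALUES (1, 3.00), (0, 3.50), (0, 8.00);
""")

# A plain inner join keeps every fact row, because no key is NULL.
sales_by_promotion = conn.execute("""
    SELECT p.promotion_name, SUM(f.sales_dollar_amount)
    FROM sales_fact AS f
    JOIN promotion_dim AS p ON p.promotion_key = f.promotion_key
    GROUP BY p.promotion_name
    ORDER BY p.promotion_name
""").fetchall()
print(sales_by_promotion)
```

With a NULL key, the unpromoted rows would silently drop out of this join and the totals would no longer add up to total sales.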

Page 15: It bi retail

To do (key tips that make a difference)

For step 1 (business process selection): the initial setup cost is very high, so select the business process that answers the most important questions (immediate ROI concerns).

For step 2 (declare the grain): go for the lowest level of granularity. It offers design flexibility and extensibility. But fine granularity comes at the cost of heavy storage and computation needs, so we may need a workaround.

Page 16: It bi retail

To do (contd.)

For step 3 (choose the dimensions): avoid adding too many dimensions just to accommodate new requirements ASAP. This bad practice leads to degenerate dimensions (high query cost, lower user-friendliness). Use the lowest possible granularity.

For step 4 (identify the facts): avoid including ratios and percentages in the fact table.

Use surrogate keys as immunization against operational changes.
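A rough sketch of the surrogate key idea: the warehouse hands out its own meaningless integer keys and maps operational (natural) keys onto them, so a renumbering in the source system touches only the mapping. The `SurrogateKeyMap` class and the sample codes are hypothetical.

```python
class SurrogateKeyMap:
    """Assigns stable integer surrogate keys to operational (natural) keys."""

    def __init__(self):
        self._keys = {}
        self._next_key = 1

    def key_for(self, natural_key):
        """Return the surrogate key for a natural key, creating one if needed."""
        if natural_key not in self._keys:
            self._keys[natural_key] = self._next_key
            self._next_key += 1
        return self._keys[natural_key]

    def rename(self, old_natural_key, new_natural_key):
        """Record that the source system renumbered a product."""
        self._keys[new_natural_key] = self._keys.pop(old_natural_key)


product_keys = SurrogateKeyMap()
soup_key = product_keys.key_for("SKU-0001")    # hypothetical SKU numbers
cereal_key = product_keys.key_for("SKU-0002")

# The operational system changes the code; fact rows keep pointing at soup_key.
product_keys.rename("SKU-0001", "SKU-9001")
print(soup_key, cereal_key, product_keys.key_for("SKU-9001"))
```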

Page 17: It bi retail

Extensibility

New dimension attributes

New dimensions

New measured facts

Dimensions becoming more granular

Addition of a new data source with unexpected dimensions

Page 18: It bi retail

Don'ts!

Dimension normalization: storage vs. computation and user-friendliness.

Too many dimensions (centipede fact table).

Page 19: It bi retail

Thank You !