02 OLAP

  • 8/8/2019 02 OLAP

    1/41

    Revisiting OLAP

    What is OLAP?

    OLAP = Online Analytical Processing

    Supports (almost) ad hoc querying for business analysts

    Think in terms of spreadsheets: view sales data by geography, time, or product

    Extends the spreadsheet analysis model to work with warehouse data

    Large data sets

    Semantically enriched to understand business terms

    Combines interactive queries with reporting functions

    OLAP Models

    Relational OLAP (ROLAP):

    An extended relational DBMS that maps operations on multidimensional data to standard relational operations

    Stores all information, including fact tables, as relations

    Multidimensional OLAP (MOLAP):

    A special-purpose server that directly implements multidimensional data and operations

    Stores multidimensional datasets as arrays

    Hybrid OLAP (HOLAP):

    Gives users/system administrators the freedom to choose how each partition is stored: relationally (ROLAP) or as multidimensional arrays (MOLAP).
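    The ROLAP and MOLAP storage styles can be contrasted with a minimal Python sketch. It reuses the small sale relation (product, client, date, amount) that appears later in these slides; the function names and the nested-list array are illustrative assumptions, not the layout of any real OLAP server.

```python
# Fact rows from the sale relation used in these slides:
# (product, client, date, amount)
FACT_ROWS = [
    ("p1", "c1", 1, 12),
    ("p2", "c1", 1, 11),
    ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8),
    ("p1", "c1", 2, 44),
    ("p1", "c2", 2, 4),
]

# Dimension members, in a fixed order so they can serve as array positions
PRODUCTS = ["p1", "p2"]
CLIENTS = ["c1", "c2", "c3"]
DATES = [1, 2]


def rolap_lookup(rows, product, client, date):
    """ROLAP style: the data stays a relation, so a lookup scans its rows."""
    for p, c, d, amt in rows:
        if (p, c, d) == (product, client, date):
            return amt
    return None  # no sale recorded for this combination


def build_molap_array(rows):
    """MOLAP style: a dense array with one cell per coordinate combination."""
    cube = [[[None for _ in DATES] for _ in CLIENTS] for _ in PRODUCTS]
    for p, c, d, amt in rows:
        cube[PRODUCTS.index(p)][CLIENTS.index(c)][DATES.index(d)] = amt
    return cube


def molap_lookup(cube, product, client, date):
    """Positional access into the array, with no scan over rows."""
    return cube[PRODUCTS.index(product)][CLIENTS.index(client)][DATES.index(date)]


cube = build_molap_array(FACT_ROWS)
print(rolap_lookup(FACT_ROWS, "p1", "c3", 1))  # 50
print(molap_lookup(cube, "p1", "c3", 1))       # 50
```

    The relation holds only the 6 recorded sales, while the array reserves all 2 x 3 x 2 = 12 cells, empty or not; that trade-off between compact storage and direct addressing is what the two models differ on.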

    OLAP & Cube

    Cubes hold data that is summarized, copied, or read directly from the data warehouse.

    Cubes:

    Allow rapid analytical access

    Spare end users from writing language-based queries

    [Figure: The Sales Cube, with dimensions Product (Groceries, Clothing, Appliances), Location (Mumbai, Pune, Delhi, Kolkata, Chennai) and Time (95 to 99). Example cells: (Products.Groceries, Location.Pune, Time.95, Measures.Sales), (Products.Clothing, Location.Mumbai, Time.97, Measures.Sales), (Products.Clothing, Location.Delhi, Time.98, Measures.Sales)]

    OLAP vs. Data Warehouse

    OLAP and data warehouses are complementary.

    A data warehouse stores and manages data.

    OLAP transforms data warehouse data into strategic information.

    OLAP enables decision-making about future actions.

    [Figure: OLAP vs. Data Warehouse. Labels shown: Product, Warehouse, Sales, Budget, Store Sales, OLAP]

    Typical OLAP Operations

    Roll up (drill-up): summarize data by climbing up a hierarchy or by dimension reduction

    Drill down (roll down): reverse of roll-up; go from a higher-level summary to a lower-level summary or detailed data, or introduce new dimensions

    Slice and dice: project and select

    Pivot (rotate): reorient the cube for visualization, e.g. turn a 3D cube into a series of 2D planes

    Other operations

    Drill across: involves (across) more than one fact table

    Drill through: goes through the bottom level of the cube to its back-end relational tables (using SQL)

    OLAP Queries

    Roll up: summarize data along a dimension hierarchy

    If we are given the total sales volume per city, we can aggregate on the Location dimension to obtain sales per state

    Roll down, drill down: go from a higher-level summary to a lower-level summary or detailed data

    For a particular product category, find the detailed sales data for each salesperson by date

    Given total sales by state, we can ask for sales per city, or just sales by city for a selected state

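    Rollup goes from the detailed cube toward the totals; drill-down goes in the opposite direction.

    day 1      c1   c2   c3
    p1         12        50
    p2         11    8

    day 2      c1   c2   c3
    p1         44    4
    p2

    Summed over days:

               c1   c2   c3
    p1         56    4   50
    p2         11    8

    Summed over products:

               c1   c2   c3
    sum        67   12   50

    Summed over clients:

              sum
    p1        110
    p2         19

    Grand total: 129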
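    The rollup chain in this figure (3-dimensional cube, then product by client, then per-client and per-product sums, then the grand total) can be reproduced with a short Python sketch. The dictionary keyed by coordinate tuples and the roll_up helper are illustrative assumptions; only the numbers come from the slide.

```python
from collections import defaultdict

# (product, client, date, amount) rows of the sale relation from the slides
FACT_ROWS = [
    ("p1", "c1", 1, 12),
    ("p2", "c1", 1, 11),
    ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8),
    ("p1", "c1", 2, 44),
    ("p1", "c2", 2, 4),
]


def roll_up(cells, drop_index):
    """Drop one dimension from every key and sum the amounts that collapse together."""
    out = defaultdict(int)
    for key, amt in cells.items():
        reduced_key = key[:drop_index] + key[drop_index + 1:]
        out[reduced_key] += amt
    return dict(out)


# Most detailed level: one cell per (product, client, date)
cube_3d = {(p, c, d): amt for p, c, d, amt in FACT_ROWS}

by_product_client = roll_up(cube_3d, 2)     # drop date
by_client = roll_up(by_product_client, 0)   # drop product
by_product = roll_up(by_product_client, 1)  # drop client
grand_total = roll_up(by_client, 0)         # drop client; the key becomes ()

print(by_product_client[("p1", "c1")])  # 56
print(by_client)                        # {('c1',): 67, ('c3',): 50, ('c2',): 12}
print(by_product)                       # {('p1',): 110, ('p2',): 19}
print(grand_total[()])                  # 129
```

    Drill-down is the reverse navigation: since summed values cannot be split again, it means going back to a more detailed level that is still stored, such as cube_3d here.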

    Slice and dice: select and project

    Example: sales of video in the USA over the last 6 months

    Slicing fixes one dimension to a single value and so reduces the number of dimensions; dicing selects a sub-cube by restricting several dimensions

    Pivot: reorient the cube

    The result of pivoting is called a cross-tabulation

    If we pivot the Sales cube on the Client and Product dimensions, we obtain a table with one cell for each combination of client and product value
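    A minimal Python sketch of slice and dice on the small sale cube used in these slides. It assumes that a slice fixes one dimension to a single value (as in "Slice for Year = 98") and that a dice keeps chosen sets of values on several dimensions; the function names and the dictionary layout are illustrative, not taken from any OLAP product.

```python
# Cube cells keyed by (product, client, date); values are sale amounts
CUBE = {
    ("p1", "c1", 1): 12,
    ("p2", "c1", 1): 11,
    ("p1", "c3", 1): 50,
    ("p2", "c2", 1): 8,
    ("p1", "c1", 2): 44,
    ("p1", "c2", 2): 4,
}


def slice_cube(cells, dim_index, value):
    """Fix one dimension to a single value and drop it from the key."""
    return {
        key[:dim_index] + key[dim_index + 1:]: amt
        for key, amt in cells.items()
        if key[dim_index] == value
    }


def dice_cube(cells, allowed):
    """Keep cells whose coordinates lie in the allowed sets.

    `allowed` maps a dimension index to the set of values to keep;
    the result has the same number of dimensions as the input.
    """
    return {
        key: amt
        for key, amt in cells.items()
        if all(key[i] in values for i, values in allowed.items())
    }


day1 = slice_cube(CUBE, 2, 1)  # a 2-dimensional product-by-client plane
sub_cube = dice_cube(CUBE, {0: {"p1"}, 1: {"c1", "c2"}})  # still 3-dimensional

print(day1)      # {('p1', 'c1'): 12, ('p2', 'c1'): 11, ('p1', 'c3'): 50, ('p2', 'c2'): 8}
print(sub_cube)  # {('p1', 'c1', 1): 12, ('p1', 'c1', 2): 44, ('p1', 'c2', 2): 4}
```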

    Pivoting can be combined with aggregation

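    Fact relation:

    sale    prodId   clientid  date  amt
            p1       c1        1      12
            p2       c1        1      11
            p1       c3        1      50
            p2       c2        1       8
            p1       c1        2      44
            p1       c2        2       4

    3-dimensional cube:

    day 1      c1   c2   c3
    p1         12        50
    p2         11    8

    day 2      c1   c2   c3
    p1         44    4
    p2

    Cross-tabulation of product by client, summed over dates:

               c1   c2   c3  Sum
    p1         56    4   50  110
    p2         11    8        19
    Sum        67   12   50  129

    Cross-tabulation of date by client, summed over products:

               c1   c2   c3  Sum
    1          23    8   50   81
    2          44    4        48
    Sum        67   12   50  129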
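    Pivoting combined with aggregation can be sketched in Python as a cross-tabulation function that sums amt over the dimensions not shown and adds "Sum" margins. The crosstab helper and the row dictionaries are illustrative assumptions; the column names follow the sale relation on this slide.

```python
from collections import defaultdict

# Rows of the sale relation, using the column names from the slide
ROWS = [
    {"prodId": "p1", "clientid": "c1", "date": 1, "amt": 12},
    {"prodId": "p2", "clientid": "c1", "date": 1, "amt": 11},
    {"prodId": "p1", "clientid": "c3", "date": 1, "amt": 50},
    {"prodId": "p2", "clientid": "c2", "date": 1, "amt": 8},
    {"prodId": "p1", "clientid": "c1", "date": 2, "amt": 44},
    {"prodId": "p1", "clientid": "c2", "date": 2, "amt": 4},
]


def crosstab(rows, row_dim, col_dim):
    """Pivot rows into {row value: {column value: summed amt}} with 'Sum' margins."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        r, c, amt = row[row_dim], row[col_dim], row["amt"]
        table[r][c] += amt          # body cell
        table[r]["Sum"] += amt      # row total
        table["Sum"][c] += amt      # column total
        table["Sum"]["Sum"] += amt  # grand total
    return {r: dict(cols) for r, cols in table.items()}


product_by_client = crosstab(ROWS, "prodId", "clientid")
date_by_client = crosstab(ROWS, "date", "clientid")

print(product_by_client["p1"])        # {'c1': 56, 'Sum': 110, 'c3': 50, 'c2': 4}
print(date_by_client[1])              # {'c1': 23, 'Sum': 81, 'c3': 50, 'c2': 8}
print(date_by_client["Sum"]["Sum"])   # 129
```

    Calling crosstab with a different pair of dimensions is the pivot; the summing inside it is the aggregation.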

    OLAP: Slice

    [Figure: the Sales cube (Product by Location by Time) sliced for Year = 98. The result is a Product by Location plane over Groceries, Clothing, Appliances and Mumbai, Pune, Delhi, Kolkata, Chennai]

    OLAP: Dice

    [Figure: the Sales cube diced for Products = Groceries and Clothing, Year = 97 and 98, and Measures = Sales and Cost. The resulting sub-cube is labeled Groceries, Clothing; Mumbai, Pune; 97, 98]

    Ranking: selection of the first n elements (e.g. select the 5 best-purchased products in July)

    Others: stored procedures, selection, etc.

    Time functions, e.g. time average
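    A ranking query can be sketched in Python as "aggregate, then keep the first n". The example below ranks clients of the small sale relation from these slides by total amount, rather than products in July, because that is the only data the slides provide; the top_n helper is an illustrative assumption.

```python
from collections import Counter

# (product, client, date, amount) rows of the sale relation from the slides
FACT_ROWS = [
    ("p1", "c1", 1, 12),
    ("p2", "c1", 1, 11),
    ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8),
    ("p1", "c1", 2, 44),
    ("p1", "c2", 2, 4),
]


def top_n(rows, dim_index, n):
    """Sum the amount per value of one dimension and return the n largest totals."""
    totals = Counter()
    for row in rows:
        totals[row[dim_index]] += row[-1]  # the amount is the last field
    return totals.most_common(n)


print(top_n(FACT_ROWS, 1, 2))  # [('c1', 67), ('c3', 50)]
print(top_n(FACT_ROWS, 0, 1))  # [('p1', 110)]
```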

    Multi-Dimensional View of Data

    Finance: Profit by Division, by Country, by Month, by Actual/Budget

    Sales: Revenue by Product, by Region, by Sales Rep, by Quarter

    Marketing: Revenue by Customer, by Industry, by Channel, by Week

    Operations: Volume by Plant, by Shift, by Product, by Day

    Multidimensional Data Model

    Sales of products may be represented in one dimension (as a fact relation) or in two dimensions, e.g. clients and products

    Fact relation:

    sale    Product  Client  Amt
            p1       c1       12
            p2       c1       11
            p1       c3       50
            p2       c2        8

    Two-dimensional cube:

               c1   c2   c3
    p1         12        50
    p2         11    8

    Fact relation:

    sale    Product  Client  Date  Amt
            p1       c1      1      12
            p2       c1      1      11
            p1       c3      1      50
            p2       c2      1       8
            p1       c1      2      44
            p1       c2      2       4

    3-dimensional cube:

    day 1      c1   c2   c3
    p1         12        50
    p2         11    8

    day 2      c1   c2   c3
    p1         44    4
    p2

    Multidimensional Data Model and Aggregates

    Add up amounts for day 1

    In SQL: SELECT sum(Amt) FROM SALE WHERE Date = 1

    sale    Product  Client  Date  Amt
            p1       c1      1      12
            p2       c1      1      11
            p1       c3      1      50
            p2       c2      1       8
            p1       c1      2      44
            p1       c2      2       4

    Result: 81

    Add up amounts by day

    In SQL: SELECT Date, sum(Amt) FROM SALE GROUP BY Date

    sale    Product  Client  Date  Amt
            p1       c1      1      12
            p2       c1      1      11
            p1       c3      1      50
            p2       c2      1       8
            p1       c1      2      44
            p1       c2      2       4

    Result:

    Date  sum
    1      81
    2      48

    Add up amounts by client, product

    In SQL: SELECT client, product, sum(amt) FROM SALE GROUP BY client, product

    sale    Product  Client  Date  Amt
            p1       c1      1      12
            p2       c1      1      11
            p1       c3      1      50
            p2       c2      1       8
            p1       c1      2      44
            p1       c2      2       4

    Result:

    sale    Product  Client  Sum
            p1       c1       56
            p1       c2        4
            p1       c3       50
            p2       c1       11
            p2       c2        8
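    The three SQL aggregates from these slides can be run as written against an in-memory SQLite database using Python's standard sqlite3 module. This is a minimal sketch: the column types and the ORDER BY clauses (added only to make the output order predictable) are assumptions not present in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (Product TEXT, Client TEXT, Date INTEGER, Amt INTEGER)"
)
conn.executemany(
    "INSERT INTO sale VALUES (?, ?, ?, ?)",
    [
        ("p1", "c1", 1, 12),
        ("p2", "c1", 1, 11),
        ("p1", "c3", 1, 50),
        ("p2", "c2", 1, 8),
        ("p1", "c1", 2, 44),
        ("p1", "c2", 2, 4),
    ],
)

# Add up amounts for day 1
day1_total = conn.execute(
    "SELECT sum(Amt) FROM sale WHERE Date = 1"
).fetchone()[0]

# Add up amounts by day (ORDER BY is an addition for stable output)
by_date = conn.execute(
    "SELECT Date, sum(Amt) FROM sale GROUP BY Date ORDER BY Date"
).fetchall()

# Add up amounts by client, product (ORDER BY is an addition for stable output)
by_client_product = conn.execute(
    "SELECT Client, Product, sum(Amt) FROM sale "
    "GROUP BY Client, Product ORDER BY Product, Client"
).fetchall()

print(day1_total)         # 81
print(by_date)            # [(1, 81), (2, 48)]
print(by_client_product)  # [('c1', 'p1', 56), ('c2', 'p1', 4), ('c3', 'p1', 50), ('c1', 'p2', 11), ('c2', 'p2', 8)]
```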

    In the multidimensional data model, summarizing information (aggregates) is usually stored together with the measure values

               c1   c2   c3  Sum
    p1         56    4   50  110
    p2         11    8        19
    Sum        67   12   50  129

    Aggregates

    Operators: sum, count, max, min, median, avg

    HAVING clause

    Using a dimension hierarchy:

    average by region (within store)

    maximum by month (within date)
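    The aggregate operators and the effect of a HAVING clause can be illustrated with a small Python sketch over the sale relation from these slides, grouped by product. The helper names are hypothetical, and having() only mimics SQL's HAVING by filtering groups after aggregation.

```python
from collections import defaultdict
from statistics import mean, median

# (product, client, date, amount) rows of the sale relation from the slides
FACT_ROWS = [
    ("p1", "c1", 1, 12),
    ("p2", "c1", 1, 11),
    ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8),
    ("p1", "c1", 2, 44),
    ("p1", "c2", 2, 4),
]


def aggregate_by_product(rows):
    """Group amounts by product and apply each aggregate operator to the group."""
    groups = defaultdict(list)
    for product, _client, _date, amt in rows:
        groups[product].append(amt)
    return {
        product: {
            "sum": sum(amounts),
            "count": len(amounts),
            "max": max(amounts),
            "min": min(amounts),
            "median": median(amounts),
            "avg": mean(amounts),
        }
        for product, amounts in groups.items()
    }


def having(groups, predicate):
    """Keep only the groups whose aggregates satisfy the predicate."""
    return {key: aggs for key, aggs in groups.items() if predicate(aggs)}


stats = aggregate_by_product(FACT_ROWS)
big_sellers = having(stats, lambda aggs: aggs["sum"] > 50)

print(stats["p1"])        # sum 110, count 4, max 50, min 4, median 28.0, avg 27.5
print(list(big_sellers))  # ['p1']
```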

    Cube Aggregation

    Example: computing sums

    day 1      c1   c2   c3
    p1         12        50
    p2         11    8

    day 2      c1   c2   c3
    p1         44    4
    p2

    Summed over days:

               c1   c2   c3
    p1         56    4   50
    p2         11    8

    Summed over products:

               c1   c2   c3
    sum        67   12   50

    Summed over clients:

              sum
    p1        110
    p2         19

    Grand total: 129

    . . .

    Cube

    day 1      c1   c2   c3    *
    p1         12        50   62
    p2         11    8        19
    *          23    8   50   81

    day 2      c1   c2   c3    *
    p1         44    4        48
    p2
    *          44    4        48

    All days (*):

               c1   c2   c3    *
    p1         56    4   50  110
    p2         11    8        19
    *          67   12   50  129

    Example: sale(*,p2,*) is the cell that aggregates p2 over all clients and all days (19)
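    The cube with "*" entries can be computed by adding every fact row to each cell obtained by replacing any subset of its coordinates with "*". The Python sketch below does this for the sale data; the coordinate order (client, product, date) is an assumption chosen to match the notation sale(*,p2,*), and build_cube is a hypothetical helper.

```python
from collections import defaultdict
from itertools import product as cartesian

# (client, product, date, amount); the coordinate order is an assumption
ROWS = [
    ("c1", "p1", 1, 12),
    ("c1", "p2", 1, 11),
    ("c3", "p1", 1, 50),
    ("c2", "p2", 1, 8),
    ("c1", "p1", 2, 44),
    ("c2", "p1", 2, 4),
]


def build_cube(rows):
    """Sum each amount into all cells where any coordinates are replaced by '*'."""
    cube = defaultdict(int)
    for *coords, amt in rows:
        # For every dimension choose either the concrete value or '*',
        # which gives 2**3 = 8 cells per fact row.
        for cell in cartesian(*[(value, "*") for value in coords]):
            cube[cell] += amt
    return dict(cube)


sale = build_cube(ROWS)

print(sale[("*", "p2", "*")])   # 19
print(sale[("*", "*", 1)])      # 81
print(sale[("c1", "p1", "*")])  # 56
print(sale[("*", "*", "*")])    # 129
```

    Materializing every "*" cell trades storage for query speed, since each aggregate then becomes a single lookup.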

    Aggregation Using Hierarchies

    Hierarchy: customer -> region -> country

    (customer c1 in Region A; customers c2, c3 in Region B)

    day 1      c1   c2   c3
    p1         12        50
    p2         11    8

    day 2      c1   c2   c3
    p1         44    4
    p2

    Day 1 aggregated by region:

              region A  region B
    p1              12        50
    p2              11         8
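    Aggregating along the customer-to-region level of the hierarchy can be sketched in Python by mapping each customer to its region before summing. The mapping follows the note on this slide (c1 in Region A; c2, c3 in Region B); the function name and data layout are illustrative assumptions.

```python
from collections import defaultdict

# Membership stated on the slide: c1 in Region A; c2 and c3 in Region B
CUSTOMER_TO_REGION = {"c1": "A", "c2": "B", "c3": "B"}

# Cells of the cube per day, keyed by (product, customer)
DAY1 = {("p1", "c1"): 12, ("p1", "c3"): 50, ("p2", "c1"): 11, ("p2", "c2"): 8}
DAY2 = {("p1", "c1"): 44, ("p1", "c2"): 4}


def roll_up_to_region(cells, hierarchy):
    """Replace each customer by its parent region and sum the amounts per region."""
    out = defaultdict(int)
    for (product, customer), amt in cells.items():
        out[(product, hierarchy[customer])] += amt
    return dict(out)


day1_by_region = roll_up_to_region(DAY1, CUSTOMER_TO_REGION)
day2_by_region = roll_up_to_region(DAY2, CUSTOMER_TO_REGION)

print(day1_by_region)  # {('p1', 'A'): 12, ('p1', 'B'): 50, ('p2', 'A'): 11, ('p2', 'B'): 8}
print(day2_by_region)  # {('p1', 'A'): 44, ('p1', 'B'): 4}
```

    Climbing one more level (region to country) would apply the same function with a region-to-country mapping, which the slides do not specify.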

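    Aggregation with respect to city (hierarchy: client -> city -> region)

    [Figure: a cube with axes client (c1 to c4), product (Video, Camera, CD) and date of sale. The client-to-city assignment below is inferred from the sums: c1, c2 in New Orleans (NO); c3, c4 in Poznań (PN)]

    Per client:

               Video  Camera      CD
    c1            10       3      21
    c2            12       5       9
    c3            11       7       7
    c4            12      11      15

    Per city:

               Video  Camera      CD
    NO            22       8      30
    PN            23      18      22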

    A Sample Data Cube

    [Figure: a sample data cube with dimensions Date (1Q, 2Q, 3Q, 4Q), Product (CD, video, camera) and Country (USA, Canada, Mexico), with a sum along each dimension]

    Exercise

    Suppose the AAA Automobile Co. builds a data warehouse to analyze sales of its cars.

    The measure: price of a car

    We would like to answer the following typical queries:

    find total sales by day, week, month and year

    find total sales by week, month, ... for each dealer

    find total sales by week, month, ... for each car model

    find total sales by month for all dealers in a given city, region and state.

    Dimensions:

    time (day, week, month, quarter, year)

    dealer (name, city, state, region, phone)

    cars (serialno, model, color, category, ...)

    Design the conceptual data warehouse schema

    Who Uses OLAP and Why?

    Finance department

    budgeting,

    activity-based costing (allocations),

    financial performance analysis,

    financial modeling.

    Marketing department

    market research analysis,

    sales forecasting,

    promotions analysis,

    customer analysis,

    market/customer segmentation.

    Sales department

    Sales analysis

    Sales forecasting.

    Manufacturing department

    Production planning

    Defect analysis.

    Key features of OLAP

    Multidimensional views of data

    Ability to "slice and dice"

    View financial data by scenario (for example, actual vs. budget), organization, line items, and time

    View sales data by product, geography, channel, and time

    Calculation-intensive capabilities

    Share calculations (percentage of total)

    Moving averages and percentage growth

    Time intelligence

    This year vs. last year

    This month vs. the same month last year

    Aggregation

    Allows pre-aggregated values

    Drill down and drill up

    Pivot
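    The calculation-intensive features listed above (share of total, moving averages, percentage growth, and period-over-period comparison) reduce to short formulas. The Python sketch below shows one possible form; the function names and all sample figures are hypothetical and serve only to illustrate the arithmetic.

```python
def share_of_total(values):
    """Percentage of the total contributed by each value."""
    total = sum(values)
    return [v * 100 / total for v in values]


def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def percentage_growth(current, previous):
    """Growth of `current` over `previous`, in percent."""
    return (current - previous) * 100 / previous


# Hypothetical figures, not taken from the slides
sales_by_product = [50, 30, 20]
monthly_sales = [10, 20, 30, 40]
this_july, last_july = 120, 100

print(share_of_total(sales_by_product))        # [50.0, 30.0, 20.0]
print(moving_average(monthly_sales, 2))        # [15.0, 25.0, 35.0]
print(percentage_growth(this_july, last_july)) # 20.0
```

    The last call is the "this month vs. the same month last year" comparison: the time-intelligence part lies in choosing which two periods to compare, while the arithmetic stays the same.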

    Questions