Flexible Data Cube for Range-Sum Queries in Dynamic OLAP Data Cubes

Authors: C.-I Lee and Y.-C. Li
Speaker: Y.-C. Li
Date: Dec. 19, 2002



Outline

  • Introduction
  • Related works
  • Analysis of the average query and update costs
  • Flexible data cube
  • Performance analysis
  • Conclusions


Introduction

  • Data cubes are frequently adopted to implement OLAP and provide aggregate information.
  • Data cube: also known as a multi-dimensional database (MDDB)
    • Measure attributes: the attributes chosen as metrics of interest
    • Functional attributes (dimensions): the other attributes of the records
    • Cells: store the measure attribute values
  • Range-sum query: adds up all cells in the query region


Car-sales example

  • Measure attribute → Sale_Volume
  • Dimensions → Year and Age of customers
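To make the car-sales setup concrete, here is a minimal Python sketch of a two-dimensional cube and a naive range-sum query over it. The records, the function names `build_cube` and `range_sum`, and the dictionary representation are all assumptions made for illustration; none of them come from the slides.

```python
# Hypothetical sales records: (year, customer_age, sale_volume).
records = [
    (1999, 25, 3),
    (1999, 40, 5),
    (2000, 25, 2),
    (2000, 33, 4),
    (2001, 40, 6),
]


def build_cube(rows):
    """Aggregate Sale_Volume into cells keyed by (year, age)."""
    cube = {}
    for year, age, volume in rows:
        cube[(year, age)] = cube.get((year, age), 0) + volume
    return cube


def range_sum(cube, year_lo, year_hi, age_lo, age_hi):
    """Naive range-sum: visit every cell and add those inside the query region."""
    return sum(
        volume
        for (year, age), volume in cube.items()
        if year_lo <= year <= year_hi and age_lo <= age <= age_hi
    )


cube = build_cube(records)
# Sales in 1999-2000 to customers aged 20-35: 3 + 2 + 4
total = range_sum(cube, 1999, 2000, 20, 35)
print(total)  # 9
```

The cost of this naive query grows with the number of cells in the region, which is the cost that pre-aggregation techniques aim to reduce.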


  • Several previous approaches speed up query response time.
    • But they slow down updates and require extra space overhead.
  • This study considers both query and update costs when constructing data cubes.
    • No extra space overhead
    • Chooses the best cube for any query (or update) ratio
  • We also present the FDC method.
    • No extra space overhead (for dense data cubes)
    • Selects or integrates pre-aggregation techniques for each dimension


Related works

The history of pre-aggregation methods for range-sum queries (1997 to 2001):

  • Prefix Sum (PS) [Ho et al., 1997]
  • Dynamic Data Cube (DDC) [Geffner et al., 1999b]
  • Relative Prefix Sum (RPS) [Geffner et al., 1999a]
  • Hierarchical Cube (HC) [Chan & Ioannidis, 1999]
  • Double RPS [Liang et al., 2000]
  • Space-Efficient Data Cube (SEDC) [Riedewald et al., 2000]
  • Iterative Data Cube (IDC) [Riedewald et al., 2001]


Prefix Sum (PS) (Ho et al., 1997)

3+5+1+2+7+3+2+6+2+4+2+3 = 40
A: 2+3+3+3+1+5+3+5+1+3+3+4 = 36
P: 103-50-35+18 = 36
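The arithmetic above contrasts adding twelve cells of the original array A with combining four cells of the prefix-sum array P. The following Python sketch illustrates that idea in two dimensions. The twelve values are taken from the first sum on the slide; arranging them as a 3 x 4 grid is an assumption, as are the function names `build_prefix_sums` and `range_sum`.

```python
from itertools import accumulate


def build_prefix_sums(a):
    """Return P where P[i][j] is the sum of a[0..i][0..j]."""
    rows = [list(accumulate(row)) for row in a]  # running sums along each row
    for i in range(1, len(rows)):                # then accumulate down the columns
        rows[i] = [x + y for x, y in zip(rows[i], rows[i - 1])]
    return rows


def range_sum(p, r1, c1, r2, c2):
    """Sum of the region [r1..r2] x [c1..c2] using at most four cells of P."""
    def at(i, j):
        # Cells outside the array contribute nothing.
        return p[i][j] if i >= 0 and j >= 0 else 0

    return at(r2, c2) - at(r1 - 1, c2) - at(r2, c1 - 1) + at(r1 - 1, c1 - 1)


# Assumed layout: the twelve slide values placed row by row in a 3 x 4 grid.
A = [
    [3, 5, 1, 2],
    [7, 3, 2, 6],
    [2, 4, 2, 3],
]
P = build_prefix_sums(A)

direct = sum(A[i][j] for i in range(1, 3) for j in range(1, 4))  # six cells read
via_p = range_sum(P, 1, 1, 2, 3)                                 # four cells read
print(P[2][3], direct, via_p)  # 40 20 20
```

The query touches a constant number of cells of P regardless of the region size, but changing one cell of A forces every prefix sum that covers it to be rewritten, which is the update cost the later slides weigh against the query cost.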


Other methods

  • RPS (Geffner et al., 1999a)
    • Two levels (local PS and overlay boxes), but requires extra space overhead
  • HC (Chan & Ioannidis, 1999)
    • Hierarchical method
  • DDC (Geffner et al., 1999b)
    • Hierarchical method, but requires extra space overhead
  • SEDC (Riedewald et al., 2000)
    • Removes the extra space overhead of RPS and DDC (SRPS and SDDC)
  • Double RPS (Liang et al., 2000)
    • Three levels, but requires extra space overhead
  • IDC (Riedewald et al., 2001)
    • No extra space overhead (a different method may be used in each dimension)


Our work focuses mainly on methods that do not require any extra space overhead for dense data cubes.


Analysis of the average query and update costs

  • Assume query ratio + update ratio = 100%
  • Average query cost:
  • Average update cost: $C_u(n) / n$
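As a rough illustration of what such averages look like, the sketch below enumerates every range query and every single-cell update on a one-dimensional array of n cells and averages the number of accessed cells, for the original array A and for the prefix-sum array PS. The uniform weighting of queries and updates, the one-dimensional setting, and the name `average_costs_1d` are assumptions made for this example; the slides' own query-cost formula is not reproduced here.

```python
from fractions import Fraction


def average_costs_1d(n):
    """Average (query, update) cost in accessed cells for A and PS on n cells."""
    ranges = [(lo, hi) for lo in range(n) for hi in range(lo, n)]

    # Array A: a query reads every cell in [lo, hi]; an update writes one cell.
    a_query = Fraction(sum(hi - lo + 1 for lo, hi in ranges), len(ranges))
    a_update = Fraction(1)

    # PS: a query reads P[hi] and, when lo > 0, also P[lo - 1].
    ps_query = Fraction(sum(2 if lo > 0 else 1 for lo, hi in ranges), len(ranges))
    # Updating cell i rewrites P[i..n-1], that is n - i cells; average over all i.
    ps_update = Fraction(sum(n - i for i in range(n)), n)

    return {"A": (a_query, a_update), "PS": (ps_query, ps_update)}


costs = average_costs_1d(16)
print(costs["A"])   # (Fraction(6, 1), Fraction(1, 1))
print(costs["PS"])  # (Fraction(32, 17), Fraction(17, 2))
```

Under these assumptions `ps_update` is a total update cost divided by n, the same shape as the average update cost on the slide, and the two structures sit at opposite ends of the trade-off: A is cheap to update and expensive to query, PS the reverse.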


Flexible Data Cube (FDC)

  • Finding the optimal pre-aggregated data cube requires exponential time.
  • We propose FDC, a heuristic method that selects or integrates any two pre-aggregation techniques for each dimension.


The FDC Method

  • In a given situation (size and query ratio):
    • $\mathrm{FDC}_{opt}$ = the FDC candidate with the minimum average cost
    • $\mathrm{FDC}_{opt} = \min\{q \times C_{aq}^{FDC} + u \times C_{au}^{FDC}\}$
  • Time complexity: $O(9n) = O(n)$

[Figure: FDC candidates for k' = 0 to 7, each part labeled "A, LPS or PS"; highlighted example: k' = 4 with A and PS]
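The selection rule, picking the candidate that minimizes q × (average query cost) + u × (average update cost), can be sketched in a few lines of Python. This is only an illustration of the weighted-minimum step, not the authors' candidate generation: the function names are hypothetical, only the two pure structures A and PS are listed as candidates, and their cost pairs are illustrative values for a one-dimensional array of 16 cells under a uniform-workload assumption rather than figures from the slides.

```python
def average_cost(q, query_cost, update_cost):
    """Weighted average cost for query ratio q and update ratio u = 1 - q."""
    u = 1.0 - q
    return q * query_cost + u * update_cost


def select_min_cost(candidates, q):
    """Return the name of the candidate with the smallest weighted average cost."""
    return min(candidates, key=lambda name: average_cost(q, *candidates[name]))


# Illustrative (average query cost, average update cost) pairs, in accessed cells.
candidates = {
    "A": (6.0, 1.0),
    "PS": (32 / 17, 8.5),
}

for q in (1.0, 0.5, 0.0):
    print(q, select_min_cost(candidates, q))
# 1.0 PS
# 0.5 A
# 0.0 A
```

Because each candidate is scored once, the selection takes time linear in the number of candidates, which is consistent with a candidate set of size 9n giving O(n) overall.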


Performance analysis

Average cost at different query ratios: d = 2, n = 16, 64

[Figure: two plots of average cost (accessed cells) against query ratio (q), from 1 down to 0, for A, LPS, PS and FDC]


Average cost for different dimension sizes: d = 4, q = 1, 0.9

[Figure: two plots of average cost (accessed cells, log scale from 1.E+00 to 1.E+08) against size (n) = 2, 4, 8, 16, 32, 64, 128, 256, for A, LPS, PS and FDC]


Average cost for different dimension sizes: d = 4, q = 0.1, 0

[Figure: two plots of average cost (accessed cells, log scale from 1.E+00 to 1.E+09) against size (n) = 2, 4, 8, 16, 32, 64, 128, 256, for A, LPS, PS and FDC]


Conclusions

  • Both the query and update costs are taken into consideration when selecting a suitable data cube.
  • We propose the FDC method, which:
    • Selects or integrates pre-aggregation techniques for each dimension.
    • Outperforms the other methods at any query (or update) ratio.
    • Determines the best FDC structure in linear time.
  • Future work: develop new techniques to support sparse data sets.


Thank You