Closed and Iceberg Cubes

Closed and Iceberg Cubes. Reduction necessity Data cube produces large outputs –1,015,367 tuples (39MB) –210,343,580 tuples (8GB)(200 times) Two methods

Reduction necessity

• Data cube produces large outputs
  – 1,015,367 tuples (39MB)
  – 210,343,580 tuples (8GB) (about 200 times larger)

• Two methods to reduce outputs
  – Iceberg cube
  – Closed cube

Closed Iceberg cube

Cells and Measures

• Cell
  – In an n-dimensional data cube, a cell c = (a1, a2, …, an : m), where m is a measure, is called a k-dimensional group-by cell if and only if exactly k (k <= n) of the values {a1, a2, …, an} are not * (i.e., all).
  – Further denote M(c) = m and V(c) = (a1, a2, …, an).
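The cell definition can be sketched in a few lines of Python. This is only an illustration: the tuple encoding of a cell, the `ALL` marker, and the function name `group_by_dimensions` are assumptions made here, not part of the slides.

```python
ALL = "*"  # assumed marker for the aggregated value "all"


def group_by_dimensions(values):
    """Return k, the number of entries of V(c) that are not *."""
    return sum(1 for v in values if v != ALL)


# A cell c = (a1, *, c1, * : 2), encoded as the pair (V(c), M(c)).
cell = (("a1", ALL, "c1", ALL), 2)
values, measure = cell

# Two of the four values are not *, so this is a 2-dimensional group-by cell.
k = group_by_dimensions(values)
```

With this encoding, V(c) is the first element of the pair and M(c) is the second.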

Notion of Cover

• Cover
  – Given two cells c = (a1, a2, …, an : m) and c’ = (a1’, a2’, …, an’ : m’), we denote V(c) <= V(c’) if, for each ai (i = 1, …, n) which is not *, ai’ = ai.
  – A cell c is said to be covered by another cell c’ if, for every cell c’’ such that V(c) <= V(c’’) <= V(c’), M(c’’) = M(c’).
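The ordering on cell values and the cover test can be sketched as follows. The sketch assumes the measure is count: because a more specific cell can never have a larger count than a more general one, comparing M(c) with M(c’) is then enough to decide covering. The function names `leq` and `is_covered_by` and the pair encoding (V(c), M(c)) are illustrative assumptions.

```python
ALL = "*"  # assumed marker for the aggregated value "all"


def leq(v, v_prime):
    """V(c) <= V(c'): every non-* value of c appears unchanged in c'."""
    return len(v) == len(v_prime) and all(
        a == ALL or a == b for a, b in zip(v, v_prime)
    )


def is_covered_by(c, c_prime):
    """Cover test for two distinct cells, assuming the measure is count."""
    (v, m), (v_prime, m_prime) = c, c_prime
    return v != v_prime and leq(v, v_prime) and m == m_prime


general = (("a1", ALL, "c1", ALL), 2)
specific = (("a1", "b1", "c1", ALL), 2)
root = (("a1", ALL, ALL, ALL), 3)

covered = is_covered_by(general, specific)   # same count, more specific cell
not_covered = is_covered_by(root, specific)  # counts differ (3 vs 2)
```

For measures that are not monotone in this way, the intermediate cells c’’ would have to be checked explicitly.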

Closed and iceberg cells

• Closed cell
  – A cell is called a closed cell if it is not covered by any other cell.

• Closed iceberg cell
  – A closed cell that satisfies the iceberg constraint.

Closed and iceberg cells (contd.)

• Let the measure be count, and the iceberg constraint be count >= 2.

• cell1 = (a1, b1, c1, * : 2) and cell2 = (a1, *, *, * : 3) are closed iceberg cells.

• cell3 = (a1, *, c1, * : 2) and cell4 = (a1, b2, c2, d2 : 1) are not: the former is covered by cell1, whereas the latter does not satisfy the iceberg constraint.
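The example can be reproduced with a brute-force sketch. The slides do not show the underlying base table, so the three-row table `BASE` below is a hypothetical one chosen to give the counts quoted above. The code enumerates the full cube and filters it; it illustrates the definitions only and is not one of the computation methods named later.

```python
from collections import Counter
from itertools import product

ALL = "*"  # assumed marker for the aggregated value "all"

# Hypothetical base table (not given in the slides), dimensions A, B, C, D.
BASE = [
    ("a1", "b1", "c1", "d1"),
    ("a1", "b1", "c1", "d3"),
    ("a1", "b2", "c2", "d2"),
]


def full_cube(rows):
    """Count every cell of the full cube by generalizing each row in all 2^n ways."""
    counts = Counter()
    for row in rows:
        for mask in product((False, True), repeat=len(row)):
            cell = tuple(ALL if hide else v for v, hide in zip(row, mask))
            counts[cell] += 1
    return counts


def leq(v, w):
    """V(c) <= V(c'): every non-* value of c appears unchanged in c'."""
    return all(a == ALL or a == b for a, b in zip(v, w))


def closed_iceberg_cells(rows, min_count):
    """Cells with count >= min_count that no other cell covers (measure: count)."""
    cube = full_cube(rows)
    result = {}
    for v, m in cube.items():
        if m < min_count:
            continue  # fails the iceberg constraint
        covered = any(w != v and cube[w] == m and leq(v, w) for w in cube)
        if not covered:
            result[v] = m
    return result


closed = closed_iceberg_cells(BASE, min_count=2)
```

On this table `closed` holds exactly the two cells (a1, b1, c1, *) with count 2 and (a1, *, *, *) with count 3, while (a1, *, c1, *) is dropped as covered and (a1, b2, c2, d2) is dropped by the iceberg constraint.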

Methods of computation

• Top-down method

• Bottom-up method

• Cube computation methods such as BUC, multiway array aggregation, and Star-Cubing are explained in detail in the next resource in this module.