
CMSC724: Data Warehouses

Amol Deshpande

University of Maryland, College Park

April 21, 2013



Outline

1 Overview
2 Data Cubes (Slides borrowed from Nick)
3 Variant Indexes



Data Warehouses

A (usually) stand-alone system that integrates data from everywhere

    Read-only, updated at night
    Geared toward business analytics, data mining etc.

Heavily used and heavily optimized

    1 Materialized views (summary tables, data cubes)
        Tradeoff between pre-computation and query latencies
    2 New types of indexes
        Indexes are a form of pre-computation
    3 New join techniques for "star"/"snowflake" schemas
    4 Compressed storage techniques

Key observation: Read-only, so updating is not a critical issue

    However this may not be true in many application domains



Data Warehouses: Overview

OLAP operations include rollup (increasing the level of

aggregation) and drill-down (decreasing the level of

aggregation or increasing detail) along one or more

dimension hierarchies, slice_and_dice (selection and

projection), and pivot (re-orienting the multidimensional view

of data).

Given that operational databases are finely tuned to support

known OLTP workloads, trying to execute complex OLAP

queries against the operational databases would result in

unacceptable performance. Furthermore, decision support

requires data that might be missing from the operational

databases; for instance, understanding trends or making

predictions requires historical data, whereas operational

databases store only current data. Decision support usually

requires consolidating data from many heterogeneous

sources: these might include external sources such as stock

market feeds, in addition to several operational databases.

The different sources might contain data of varying quality, or

use inconsistent representations, codes and formats, which

have to be reconciled. Finally, supporting the

multidimensional data models and operations typical of

OLAP requires special data organization, access methods,

and implementation methods, not generally provided by

commercial DBMSs targeted for OLTP. It is for all these

reasons that data warehouses are implemented separately

from operational databases.

Data warehouses might be implemented on standard or

extended relational DBMSs, called Relational OLAP

(ROLAP) servers. These servers assume that data is stored in

relational databases, and they support extensions to SQL and

special access and implementation methods to efficiently

implement the multidimensional data model and operations.

In contrast, multidimensional OLAP (MOLAP) servers are

servers that directly store multidimensional data in special

data structures (e.g., arrays) and implement the OLAP

operations over these special data structures.

There is more to building and maintaining a data warehouse

than selecting an OLAP server and defining a schema and

some complex queries for the warehouse. Different

architectural alternatives exist. Many organizations want to

implement an integrated enterprise warehouse that collects

information about all subjects (e.g., customers, products,

sales, assets, personnel) spanning the whole organization.

However, building an enterprise warehouse is a long and

complex process, requiring extensive business modeling, and

may take many years to succeed. Some organizations are

settling for data marts instead, which are departmental

subsets focused on selected subjects (e.g., a marketing data

mart may include customer, product, and sales information).

These data marts enable faster roll out, since they do not

require enterprise-wide consensus, but they may lead to

complex integration problems in the long run, if a complete

business model is not developed.

In Section 2, we describe a typical data warehousing

architecture, and the process of designing and operating a

data warehouse. In Sections 3-7, we review relevant

technologies for loading and refreshing data in a data

warehouse, warehouse servers, front end tools, and

warehouse management tools. In each case, we point out

what is different from traditional database technology, and we

mention representative products. In this paper, we do not

intend to provide comprehensive descriptions of all products

in every category. We encourage the interested reader to look

at recent issues of trade magazines such as Databased

Advisor, Database Programming and Design, Datamation,

and DBMS Magazine, and vendors’ Web sites for more

details of commercial products, white papers, and case

studies. The OLAP Council [2] is a good source of information

on standardization efforts across the industry, and a paper by

Codd, et al. [3] defines twelve rules for OLAP products. Finally,

a good source of references on data warehousing and OLAP

is the Data Warehousing Information Center [4].

Research in data warehousing is fairly recent, and has focused

primarily on query processing and view maintenance issues.

There still are many open research problems. We conclude in

Section 8 with a brief mention of these issues.

2. Architecture and End-to-End Process

Figure 1 shows a typical data warehousing architecture.

[Figure 1. Data Warehousing Architecture. Data sources (operational dbs, external sources) feed Extract, Transform, Load, and Refresh steps into the Data Warehouse and Data Marts. These serve OLAP Servers, which in turn serve Analysis, Query/Reporting, and Data Mining tools. A Metadata Repository and Monitoring & Administration tools sit alongside the warehouse.]

It includes tools for extracting data from multiple operational

databases and external sources; for cleaning, transforming

and integrating this data; for loading data into the data

warehouse; and for periodically refreshing the warehouse to

reflect updates at the sources and to purge data from the

warehouse, perhaps onto slower archival storage. In addition

to the main warehouse, there may be several departmental

data marts. Data in the warehouse and data marts is stored

and managed by one or more warehouse servers, which

present multidimensional views of data to a variety of front

end tools: query tools, report writers, analysis tools, and data

mining tools. Finally, there is a repository for storing and

Figure: Overview (from Chaudhuri, Dayal; SIGMOD Record, 1997)


http://portal.acm.org/citation.cfm?id=248603.248616


Data Warehouses

Extract-Transform-Load (ETL)

    Data cleaning, auditing, integrity constraints
    Semantic heterogeneity
        Issues like entity resolution, schema mapping/matching, cleaning etc.

Load/Refresh:

    Typically done periodically
    Batch loading, so can heavily optimize the indexes
        E.g. if using a B+-tree, bulk-loading can result in much better indexes than inserting one at a time

Refresh:

    Usually done incrementally, e.g. at night

Real-time analysis?



"MAD Skills": Experiences from a real DW

From MAD Skills: New Analysis Practices for Big Data

Claim: Business analytics increasingly needs to be much more agile, and accommodating of dirty data

Experiences from a Fox Audience Network data warehouse on Greenplum

    Serves ads across Fox online publishers (e.g. Myspace)
    200 TB of unique production data: growing rapidly

Need to support: sales acct managers to research scientists

    No precomputed set of statistics will be enough
    Query: How many female WWF enthusiasts under the age of 30 visited the Toyota community over the last four days and saw a standard-sized web ad?
    Analysis: How are these similar to those who visited Nissan?
    First can be done in SQL, the second requires R (or Matlab, SAS, SPSS etc.)


http://db.cs.berkeley.edu/papers/vldb09-madskills.pdf



MAD Skills: New Analysis Practices for Big Data

Traditional View: There is no point in bringing data ... into the data warehouse environment without integrating it. If the data arrives at the data warehouse in an unintegrated state, it cannot be used to support a corporate view of data. And a corporate view of data is one of the essences of the architected environment.

Can hold up access to data for months: not acceptable

Advocate a three-layer approach:

    Staging schema: loading raw fact tables or logs
        Access only to Analysts and Engineers

    Production Data Warehouse schema
        Aggregates that serve most users
        Also SQL command-line access to some users

    Reporting schema
        Specialized, static aggregates for supporting reporting tools and casual users
        Data Cubes perhaps fit best here





MAD Skills: New Analysis Practices for Big Data

The paper also discusses how many statistical analysis tasks can be written in SQL

    Worth reading

Greenplum supports a variety of storage formats

    External tables through wrappers
    Specialized append-only stores for data that is not updated
    ...

Also supports Map-Reduce using User Defined Functions




Data Warehouses: Star Schema

Front End Tools

The multidimensional data model grew out of the view of

business data popularized by PC spreadsheet programs that

were extensively used by business analysts. The spreadsheet

is still the most compelling front-end application for OLAP.

The challenge in supporting a query environment for OLAP

can be crudely summarized as that of supporting spreadsheet

operations efficiently over large multi-gigabyte databases.

Indeed, the Essbase product of Arbor Corporation uses

Microsoft Excel as the front-end tool for its multidimensional

engine.

We shall briefly discuss some of the popular operations that

are supported by the multidimensional spreadsheet

applications. One such operation is pivoting. Consider the

multidimensional schema of Figure 2 represented in a

spreadsheet where each row corresponds to a sale . Let there

be one column for each dimension and an extra column that

represents the amount of sale. The simplest view of pivoting

is that it selects two dimensions that are used to aggregate a

measure, e.g., sales in the above example. The aggregated

values are often displayed in a grid where each value in the

(x,y) coordinate corresponds to the aggregated value of the

measure when the first dimension has the value x and the

second dimension has the value y. Thus, in our example, if

the selected dimensions are city and year, then the x-axis may

represent all values of city and the y-axis may represent the

years. The point (x,y) will represent the aggregated sales for

city x in the year y. Thus, what were values in the original

spreadsheets have now become row and column headers in

the pivoted spreadsheet.
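The pivoting operation described above can be sketched in a few lines of Python. This is only an illustration: the rows, the column order (city, product, year, amount), and the helper name `pivot` are assumptions made for the example, not something taken from the paper.

```python
from collections import defaultdict

# Hypothetical sales rows: (city, product, year, amount).
sales = [
    ("Seattle", "Juice", 1995, 10),
    ("Seattle", "Cola", 1995, 15),
    ("Seattle", "Juice", 1996, 12),
    ("Boston", "Juice", 1995, 20),
    ("Boston", "Milk", 1996, 50),
]

def pivot(rows, x_index, y_index, measure_index):
    """Sum the measure for every (x, y) pair of dimension values."""
    grid = defaultdict(int)
    for row in rows:
        grid[(row[x_index], row[y_index])] += row[measure_index]
    return dict(grid)

# Pivot on city (x) and year (y): values of these two columns become
# the row and column headers of the resulting grid.
city_by_year = pivot(sales, x_index=0, y_index=2, measure_index=3)
```

Each key of `city_by_year` is one grid coordinate (x, y), and its value is the aggregated sales for that city in that year.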

Other operators related to pivoting are rollup or drill-down.

Rollup corresponds to taking the current data object and

doing a further group-by on one of the dimensions. Thus, it is

possible to roll-up the sales data, perhaps already aggregated

on city, additionally by product. The drill-down operation is

the converse of rollup. Slice_and_dice corresponds to

reducing the dimensionality of the data, i.e., taking a

projection of the data on a subset of dimensions for selected

values of the other dimensions. For example, we can

slice_and_dice sales data for a specific product to create a

table that consists of the dimensions city and the day of sale.

The other popular operators include ranking (sorting),

selections and defining computed attributes.
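As a rough sketch of these operators, the Python below treats roll-up and drill-down as moves between coarser and finer group-bys, and slice_and_dice as a selection followed by a projection. The data and the helper names (`group_sum`, `slice_and_dice`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (city, product, day, amount).
sales = [
    ("Seattle", "Juice", 1, 10),
    ("Seattle", "Cola", 1, 15),
    ("Seattle", "Juice", 2, 12),
    ("Boston", "Juice", 1, 20),
    ("Boston", "Cola", 2, 50),
]

def group_sum(rows, key_indexes, measure_index=3):
    """Group rows on the given dimension columns and sum the measure."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        totals[key] += row[measure_index]
    return dict(totals)

# Finer level of aggregation (drill-down target): city and product.
by_city_product = group_sum(sales, (0, 1))
# Coarser level of aggregation (roll-up target): city only.
by_city = group_sum(sales, (0,))

def slice_and_dice(rows, fixed, keep):
    """Select rows matching the fixed dimension values, then project onto `keep`."""
    selected = [r for r in rows if all(r[i] == v for i, v in fixed.items())]
    return group_sum(selected, keep)

# Sales for one product, laid out by city and day of sale.
juice_by_city_day = slice_and_dice(sales, fixed={1: "Juice"}, keep=(0, 2))
```

Moving from `by_city_product` to `by_city` increases the level of aggregation, and going the other way adds detail.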

Although the multidimensional spreadsheet has attracted a lot

of interest since it empowers the end user to analyze business

data, this has not replaced traditional analysis by means of a

managed query environment. These environments use stored

procedures and predefined complex queries to provide

packaged analysis tools. Such tools often make it possible for

the end-user to query in terms of domain-specific business

data. These applications often use raw data access tools and

optimize the access patterns depending on the back end

database server. In addition, there are query environments

(e.g., Microsoft Access) that help build ad hoc SQL queries

by “pointing-and-clicking”. Finally, there are a variety of

data mining tools that are often used as front end tools to data

warehouses.

5. Database Design Methodology

The multidimensional data model described above is

implemented directly by MOLAP servers. We will describe

these briefly in the next section. However, when a relational

ROLAP server is used, the multidimensional model and its

operations have to be mapped into relations and SQL queries.

In this section, we describe the design of relational database

schemas that reflect the multidimensional views of data.

Entity Relationship diagrams and normalization techniques

are popularly used for database design in OLTP

environments. However, the database designs recommended

by ER diagrams are inappropriate for decision support

systems where efficiency in querying and in loading data

(including incremental loads) are important.

Most data warehouses use a star schema to represent the

multidimensional data model. The database consists of a

single fact table and a single table for each dimension. Each

tuple in the fact table consists of a pointer (foreign key - often

uses a generated key for efficiency) to each of the dimensions

that provide its multidimensional coordinates, and stores the

numeric measures for those coordinates. Each dimension

table consists of columns that correspond to attributes of the

dimension. Figure 3 shows an example of a star schema.

[Figure 3 shows the following tables and attributes]

Fact table: OrderNo, SalespersonID, CustomerNo, ProdNo, DateKey, CityName, Quantity, TotalPrice

Order: OrderNo, OrderDate

Customer: CustomerNo, CustomerName, CustomerAddress, City

Salesperson: SalespersonID, SalespersonName, City, Quota

Product: ProdNo, ProdName, ProdDescr, Category, CategoryDescr, UnitPrice, QOH

City: CityName, State, Country

Date: DateKey, Date, Month, Year

Figure 3. A Star Schema.

Star schemas do not explicitly provide support for attribute

hierarchies. Snowflake schemas provide a refinement of star

Figure: A Star Schema (from Chaudhuri, Dayal; SIGMOD Record, 1997)




Data Warehouses: Snowflake Schema

schemas where the dimensional hierarchy is explicitly

represented by normalizing the dimension tables, as shown in

Figure 4. This leads to advantages in maintaining the

dimension tables. However, the denormalized structure of the

dimensional tables in star schemas may be more appropriate

for browsing the dimensions.

Fact constellations are examples of more complex structures

in which multiple fact tables share dimensional tables. For

example, projected expense and the actual expense may form

a fact constellation since they share many dimensions.

[Figure 4 shows the following tables and attributes; the dimension tables are normalized along their hierarchies]

Fact table: OrderNo, SalespersonID, CustomerNo, DateKey, CityName, ProdNo, Quantity, TotalPrice

Order: OrderNo, OrderDate

Customer: CustomerNo, CustomerName, CustomerAddress, City

Salesperson: SalespersonID, SalespersonName, City, Quota

Product: ProdNo, ProdName, ProdDescr, Category, UnitPrice, QOH -> Category: CategoryName, CategoryDescr

City: CityName, State -> State

Date: DateKey, Date, Month -> Month: Month, Year -> Year

Figure 4. A Snowflake Schema.

In addition to the fact and dimension tables, data warehouses

store selected summary tables containing pre-aggregated data.

In the simplest cases, the pre-aggregated data corresponds to

aggregating the fact table on one or more selected

dimensions. Such pre-aggregated summary data can be

represented in the database in at least two ways. Let us

consider the example of a summary table that has total sales

by product by year in the context of the star schema of Figure

3. We can represent such a summary table by a separate fact

table which shares the dimension Product and also a separate

shrunken dimension table for time, which consists of only the

attributes of the dimension that make sense for the summary

table (i.e., year). Alternatively, we can represent the summary

table by encoding the aggregated tuples in the same fact table

and the same dimension tables without adding new tables.

This may be accomplished by adding a new level field to each

dimension and using nulls: We can encode a day, a month or

a year in the Date dimension table as follows: (id0, 0, 22, 01,

1960) represents a record for Jan 22, 1960, (id1, 1, NULL,

01, 1960) represents the month Jan 1960 and (id2, 2, NULL,

NULL, 1960) represents the year 1960. The second attribute

represents the new attribute level: 0 for days, 1 for months, 2

for years. In the fact table, a record containing the foreign key

id2 represents the aggregated sales for a Product in the year

1960. The latter method, while reducing the number of tables,

is often a source of operational errors since the level field

needs be carefully interpreted.
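To make the level-field encoding concrete, here is a minimal Python sketch. The Date dimension rows follow the (id, level, day, month, year) tuples given in the text; the fact rows and their sales amounts are made up for the example, and the helper name `sales_at_level` is an assumption.

```python
# Date dimension keyed by id: (level, day, month, year).
# level: 0 = day, 1 = month, 2 = year; None stands in for NULL.
date_dim = {
    "id0": (0, 22, 1, 1960),       # Jan 22, 1960
    "id1": (1, None, 1, 1960),     # the month Jan 1960
    "id2": (2, None, None, 1960),  # the year 1960
}

# Fact rows: (date_id, product, sales). The amounts are hypothetical.
facts = [
    ("id0", "Juice", 5),
    ("id1", "Juice", 40),
    ("id2", "Juice", 300),
]

def sales_at_level(facts, date_dim, level):
    """Return only the fact rows whose date key is at the requested level."""
    return [
        (product, amount)
        for date_id, product, amount in facts
        if date_dim[date_id][0] == level
    ]

yearly = sales_at_level(facts, date_dim, level=2)

# Forgetting the level filter mixes daily, monthly and yearly rows,
# so the same sales are counted more than once.
naive_total = sum(amount for _, _, amount in facts)
```

The gap between `yearly` and `naive_total` is the kind of operational error the text warns about: every query against the shared fact table has to filter on the level.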

6. Warehouse Servers

Data warehouses may contain large volumes of data. To

answer queries efficiently, therefore, requires highly efficient

access methods and query processing techniques. Several

issues arise. First, data warehouses use redundant structures

such as indices and materialized views. Choosing which

indices to build and which views to materialize is an

important physical design problem. The next challenge is to

effectively use the existing indices and materialized views to

answer queries. Optimization of complex queries is another

important problem. Also, while for data-selective queries,

efficient index scans may be very effective, data-intensive

queries need the use of sequential scans. Thus, improving the

efficiency of scans is important. Finally, parallelism needs to

be exploited to reduce query response times. In this short

paper, it is not possible to elaborate on each of these issues.

Therefore, we will only briefly touch upon the highlights.

Index Structures and their Usage

A number of query processing techniques that exploit indices

are useful. For instance, the selectivities of multiple

conditions can be exploited through index intersection. Other

useful index operations are union of indexes. These index

operations can be used to significantly reduce and in many

cases eliminate the need to access the base tables.

Warehouse servers can use bit map indices, which support

efficient index operations (e.g., union, intersection). Consider

a leaf page in an index structure corresponding to a domain

value d. Such a leaf page traditionally contains a list of the

record ids (RIDs) of records that contain the value d.

However, bit map indices use an alternative representation of

the above RID list as a bit vector that has one bit for each

record, which is set when the domain value for that record is

d. In a sense, the bit map index is not a new index structure,

but simply an alternative representation of the RID list. The

popularity of the bit map index is due to the fact that the bit

vector representation of the RID lists can speed up index

intersection, union, join, and aggregation [11]. For example, if

we have a query of the form column1 = d & column2 = d’,

then we can identify the qualifying records by taking the

AND of the two bit vectors. While such representations can

be very useful for low cardinality domains (e.g., gender), they

can also be effective for higher cardinality domains through

compression of bitmaps (e.g., run length encoding). Bitmap

indices were originally used in Model 204, but many products

support them today (e.g., Sybase IQ). An interesting question

is to decide on which attributes to index. In general, this is

really a question that must be answered by the physical

database design process.
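The bit-vector representation can be sketched with Python integers used as bitmaps, where bit i stands for the record with RID i. The column contents and function names here are illustrative assumptions, and real systems would also compress the bitmaps.

```python
def build_bitmap_index(column):
    """Map each distinct value to an int whose bit i is set iff row i holds that value."""
    index = {}
    for rid, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << rid)
    return index

def rids(bitmap):
    """Expand a bitmap back into the sorted list of record ids it contains."""
    out = []
    rid = 0
    while bitmap:
        if bitmap & 1:
            out.append(rid)
        bitmap >>= 1
        rid += 1
    return out

# Two hypothetical low-cardinality columns over the same five records.
gender = ["F", "M", "F", "F", "M"]
state = ["MD", "MD", "VA", "MD", "VA"]

gender_idx = build_bitmap_index(gender)
state_idx = build_bitmap_index(state)

# column1 = d AND column2 = d' becomes a bitwise AND of two bit vectors.
both = rids(gender_idx["F"] & state_idx["MD"])
# Index union is a bitwise OR.
either = rids(gender_idx["F"] | state_idx["MD"])
```

Neither operation touches the base table; only the two bit vectors are read.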

In addition to indices on single tables, the specialized nature

of star schemas makes join indices especially attractive for

decision support. While traditionally indices map the value in

a column to a list of rows with that value, a join index

Figure: A Snowflake Schema (from Chaudhuri, Dayal; SIGMOD Record, 1997)




Star and Snowflake Schemas

The Facts table is HUGE

    Dimension tables relatively small

Strong key-foreign key dependencies

    Each fact table tuple joins with exactly one tuple from each dimension table
    Critical in optimizations

Many queries are of the form:

    Join the Facts table with some of the dimension tables
    Selections on the dimension table attributes (e.g. state = 'MD')
    Possibly selection on the fact table
    Group by on some of the dimension table attributes (e.g. ProdName)
    Aggregate on a main Facts table attribute (e.g. quantity)






Data Warehouses: Multi-dimensional Data

coordinated to ensure consistency of derived data and indices

with the base data.

Refresh

Refreshing a warehouse consists in propagating updates on

source data to correspondingly update the base data and

derived data stored in the warehouse. There are two sets of

issues to consider: when to refresh, and how to refresh.

Usually, the warehouse is refreshed periodically (e.g., daily or

weekly). Only if some OLAP queries need current data (e.g.,

up to the minute stock quotes), is it necessary to propagate

every update. The refresh policy is set by the warehouse

administrator, depending on user needs and traffic, and may

be different for different sources.

Refresh techniques may also depend on the characteristics of

the source and the capabilities of the database servers.

Extracting an entire source file or database is usually too

expensive, but may be the only choice for legacy data

sources. Most contemporary database systems provide

replication servers that support incremental techniques for

propagating updates from a primary database to one or more

replicas. Such replication servers can be used to

incrementally refresh a warehouse when the sources change.

There are two basic replication techniques: data shipping and

transaction shipping.

In data shipping (e.g., used in the Oracle Replication Server,

Praxis OmniReplicator), a table in the warehouse is treated as

a remote snapshot of a table in the source database. After_row

triggers are used to update a snapshot log table whenever the

source table changes; and an automatic refresh schedule (or a

manual refresh procedure) is then set up to propagate the

updated data to the remote snapshot.

In transaction shipping (e.g., used in the Sybase Replication

Server and Microsoft SQL Server), the regular transaction log

is used, instead of triggers and a special snapshot log table.

At the source site, the transaction log is sniffed to detect

updates on replicated tables, and those log records are

transferred to a replication server, which packages up the

corresponding transactions to update the replicas. Transaction

shipping has the advantage that it does not require triggers,

which can increase the workload on the operational source

databases. However, it cannot always be used easily across

DBMSs from different vendors, because there are no standard

APIs for accessing the transaction log.

Such replication servers have been used for refreshing data

warehouses. However, the refresh cycles have to be properly

chosen so that the volume of data does not overwhelm the

incremental load utility.

In addition to propagating changes to the base data in the

warehouse, the derived data also has to be updated

correspondingly. The problem of constructing logically

correct updates for incrementally updating derived data

(materialized views) has been the subject of much research [7, 8, 9, 10].

For data warehousing, the most significant classes of

derived data are summary tables, single-table indices and

join indices.
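As a small illustration of incrementally updating derived data, the sketch below maintains a SUM summary table from a batch of inserted and deleted fact rows instead of recomputing it. It is a simplified assumption-laden example (hypothetical row layout and function names), not one of the algorithms from the cited research. A per-group count is kept so that a group can be dropped once its last row is deleted.

```python
def apply_delta(summary, inserted=(), deleted=()):
    """Update {(product, year): [sum, count]} in place from a batch of changes."""
    for product, year, amount in inserted:
        entry = summary.setdefault((product, year), [0, 0])
        entry[0] += amount
        entry[1] += 1
    for product, year, amount in deleted:
        entry = summary[(product, year)]
        entry[0] -= amount
        entry[1] -= 1
        if entry[1] == 0:
            # No base rows left in this group, so the summary row goes too.
            del summary[(product, year)]
    return summary

def build_summary(facts):
    """Full recomputation, expressed as applying all rows as inserts."""
    return apply_delta({}, inserted=facts)

# Hypothetical base data: (product, year, sales amount).
base = [("Juice", 1996, 10), ("Cola", 1996, 5)]
summary = build_summary(base)

# One refresh cycle: two new sales arrive and one old row is purged.
apply_delta(
    summary,
    inserted=[("Juice", 1996, 7), ("Milk", 1997, 2)],
    deleted=[("Cola", 1996, 5)],
)
```

The incremental result matches what a full recomputation over the updated base data would give, but only the changed rows are read.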

4. Conceptual Model and Front End Tools

A popular conceptual model that influences the front-end

tools, database design, and the query engines for OLAP is the

multidimensional view of data in the warehouse. In a

multidimensional data model, there is a set of numeric

measures that are the objects of analysis. Examples of such

measures are sales, budget, revenue, inventory, ROI (return

on investment). Each of the numeric measures depends on a

set of dimensions, which provide the context for the measure.

For example, the dimensions associated with a sale amount

can be the city, product name, and the date when the sale was

made. The dimensions together are assumed to uniquely

determine the measure. Thus, the multidimensional data

views a measure as a value in the multidimensional space of

dimensions. Each dimension is described by a set of

attributes. For example, the Product dimension may consist of

four attributes: the category and the industry of the product,

year of its introduction, and the average profit margin. For

example, the soda Surge belongs to the category beverage

and the food industry, was introduced in 1996, and may have

an average profit margin of 80%. The attributes of a

dimension may be related via a hierarchy of relationships. In

the above example, the product name is related to its category

and the industry attribute through such a hierarchical

relationship.

[Figure 2. Multidimensional data. A cube of sales values with dimensions Product (Juice, Cola, Milk, Cream, Toothpaste, Soap), City (N, S, W), and Date (1 to 7). Hierarchical summarization paths: Industry -> Category -> Product; Country -> State -> City; Year -> Quarter -> Month -> Date, with Week -> Date as a separate path.]

Figure 2. Multidimensional data

Another distinctive feature of the conceptual model for

OLAP is its stress on aggregation of measures by one or

more dimensions as one of the key operations; e.g.,

computing and ranking the total sales by each county (or by

each year). Other popular operations include comparing two

measures (e.g., sales and budget) aggregated by the same

dimensions. Time is a dimension that is of particular

significance to decision support (e.g., trend analysis). Often,

it is desirable to have built-in knowledge of calendars and

other aspects of the time dimension.

Figure: Multi-dimensional Data (from Chaudhuri, Dayal; SIGMOD Record, 1997)




Data Warehouses: Multi-dimensional Data

OLAP: On-line Analytical Processing

    Contrast with: OLTP (transaction processing)

ROLAP: Relational OLAP

    OLAP built on top of relational databases (standard now)

MOLAP: Multi-dimensional OLAP

    Specialized database that stores data in multi-dimensional arrays
    Makes it easier to support Data Cube type queries
    Updates need to be done at both places

HOLAP: A hybrid between the two, commonly used

    Compute part of the cube at higher levels of aggregation, for rest go down to the relations

See SAS OLAP Server + Teradata for a nice discussion


http://www.teradata.com/tdmo/v08n03/tech2tech/AppliedSolutions/CubesByDesign.aspx


Data Warehouses: Today and Future?

A Recent Article by Stonebraker

    Column stores are likely going to be the underlying storage engine for data warehouses
        But are not optimized for OLTP workloads
    Several specialized systems are likely the only way to go

What about Map-Reduce?

    Most data warehouses will likely evolve toward supporting some form of MapReduce
    See Greenplum's take on it

Parallelism is pretty much required

Increasingly need sophisticated statistical analysis

    Many companies in this space: SAS, SPSS (bought by IBM), Tableau
    Can't always scale to very large volumes
    See recent work at the SystemML project at IBM


http://cacm.acm.org/blogs/blog-cacm/98136-my-top-10-assertions-about-data-warehouses/fulltext

http://www.greenplum.com/resources/mapreduce/

http://www.almaden.ibm.com/cs/projects/systemml/


Outline




Data Cubes

Database Management Systems -2-© N. Roussopoulos 2007

OLAP-The Data Analysis Cycle

• User extracts data fromdatabase with query

• Then visualizes, analyzesdata with desktop tools

[Figure: data flows from a Table into a Spread Sheet. Two storage-hierarchy plots follow: "Size vs Speed" (size in bytes against access time in seconds) and "Price vs Speed" ($/MB against access time in seconds), each covering Cache, Main, Secondary (Disc), Online Tape, Nearline Tape, and Offline Tape.]



Data Cubes


The Data Cube [Gray, Bosworth, Layman, Pirahesh ICDE 96]

• summarize multidimensional data for trend analysis

weather(time, latitude, longitude, altitude, temp, b-pressure)

• groupby with statistical functions (avg, min, max, count, sum) aggregates over table sub-groups

• results in a new table

select location, sum(units)
from inventory
where nation = 'USA'
group by location;

[Figure: a Table whose rows carry attribute values A, B, C, D is partitioned into groups, and SUM() is applied to each group]

select avg(temp) from weather;

select time, altitude
from weather
group by time, altitude;



Data Cubes


Problems with SQL Groupbys

• Histograms (aggregation over computed categories)

F() G() H()

GROUP BY CUBE



Data Cubes


Problems with SQL Groupbys

• drill-down and roll-up

Not relational (null values in the keys)



Data Cubes


More problems with Groupbys

• roll-up is asymmetric (e.g. does not aggregate by year or by color alone)

• cross-tabulation (spreadsheets)

• even if SQL syntax can be devised, a 6D cross-tab requires 64 groupby queries to generate it and 64 scans and sorts of the data

most of these are not relational expressions but are in many report writers



Data Cubes


CUBE: A Relational Aggregate Operator Generalizing Group By

[Figure: "The Data Cube and The Sub-Space Aggregates". Panels: Aggregate (Sum); Group By (with total) by Color (RED, WHITE, BLUE); Cross Tab by Make (Chevy, Ford) and Color; and the 3D data cube over Make (CHEVY, FORD), Year (1990 to 1993), and Color, with sub-space aggregates By Make, By Year, By Color, By Make & Year, By Make & Color, By Color & Year, and Sum.]



Data Cubes


Idea: N-dimensional Cube
Each Attribute is a Dimension

• N-dimensional Aggregate (sum(), max(), ...) fits relational model exactly:

    a1, a2, ...., aN, f(*)

• Super-aggregate over N-1 Dimensional sub-cubes

    ALL, a2, ...., aN, f(*)
    a1, ALL, a3, ...., aN, f(*)
    ...
    a1, a2, ...., ALL, f(*)

this is the N-1 Dimensional cross-tab.

• Super-aggregate over N-2 Dimensional sub-cubes

    ALL, ALL, a3, ...., aN, f(*)
    ...
    a1, a2, ...., ALL, ALL, f(*)
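One naive way to picture the super-aggregates is to let every base row contribute to each cell obtained by replacing some subset of its dimension values with ALL. The Python sketch below does exactly that for SUM. The input rows are hypothetical (not the slide's example table), and this brute-force approach is only meant to show what the CUBE relation contains, not how a system would compute it efficiently.

```python
from collections import defaultdict
from itertools import product

ALL = "ALL"

def cube(rows, n_dims):
    """Return {coordinates: sum} over all 2**n_dims group-bys.

    Each row is (d1, ..., dN, measure).
    """
    result = defaultdict(int)
    for row in rows:
        dims, measure = row[:n_dims], row[n_dims]
        # Every choice of which dimensions to hide behind ALL names one cell.
        for hide_mask in product((False, True), repeat=n_dims):
            key = tuple(ALL if hide else d for d, hide in zip(dims, hide_mask))
            result[key] += measure
    return dict(result)

# Hypothetical (model, year, color, sales) rows.
sales = [
    ("chevy", 1990, "red", 5),
    ("chevy", 1990, "blue", 7),
    ("ford", 1990, "red", 3),
    ("ford", 1991, "red", 4),
]

c = cube(sales, n_dims=3)
```

Cells with no ALL are the ordinary N-dimensional aggregate, cells with one ALL form the N-1 dimensional cross-tabs, and the cell with ALL everywhere is the grand total.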



Data Cubes


Division of labor
Computation vs Visualization

• Relational system builds CUBE relation
    aggregation best done close to data
    filtering of data is possible
    Cube computation may be recursive (e.g., percent of total, quartile, ....)

• Visualization System displays/explores the cube

[Figure: chart of cube values by year (1990, 1991, 1992, ALL) and color (Red, Blue), with value bands 0-50, 50-100, 100-150, 150-200]



Data Cubes


An Example

SALES

Model  Year  Color  Sales
Chevy  1990  red    5
Chevy  1990  white  87
Chevy  1990  blue   62
Chevy  1991  red    54
Chevy  1991  white  95
Chevy  1991  blue   49
Chevy  1992  red    31
Chevy  1992  white  54
Chevy  1992  blue   71
Ford   1990  red    64
Ford   1990  white  62
Ford   1990  blue   63
Ford   1991  red    52
Ford   1991  white  9
Ford   1991  blue   55
Ford   1992  red    27
Ford   1992  white  62
Ford   1992  blue   39

DATA CUBE (produced from SALES by CUBE)

Model  Year  Color  Sales
ALL    ALL   ALL    942
chevy  ALL   ALL    510
ford   ALL   ALL    432
ALL    1990  ALL    343
ALL    1991  ALL    314
ALL    1992  ALL    285
ALL    ALL   red    165
ALL    ALL   white  273
ALL    ALL   blue   339
chevy  1990  ALL    154
chevy  1991  ALL    199
chevy  1992  ALL    157
ford   1990  ALL    189
ford   1991  ALL    116
ford   1992  ALL    128
chevy  ALL   red    91
chevy  ALL   white  236
chevy  ALL   blue   183
ford   ALL   red    144
ford   ALL   white  133
ford   ALL   blue   156
ALL    1990  red    69
ALL    1990  white  149
ALL    1990  blue   125
ALL    1991  red    107
ALL    1991  white  104
ALL    1991  blue   104
ALL    1992  red    59
ALL    1992  white  116
ALL    1992  blue   110



Data Cubes


Summary of the Cube

• CUBE operator generalizes relational aggregates
• Needs ALL value to denote sub-cubes
    ALL values represent aggregation sets
• Needs generalization of user-defined aggregates
• Decorations and abstractions are interesting
• Computation has interesting optimizations
• Relationship to "rest of SQL" not fully worked out.



How to "Query" Cubes?

Over the years, a language has been developed
Called MDX: MultiDimensional eXpressions

More details at: MDX: A Gentle Introduction
http://www.iccube.com/support/documentation/mdx_tutorial/gentle_introduction.html


Materialized Views

CUBE is a user abstraction
How do we store/maintain it?

Option 1: Compute the entire CUBE over all possible attributes
    Not feasible... too large
    Maintenance cost will be formidable

Option 2: Compute from scratch
    Don’t materialize anything... go straight to the source table for every query
    Too much query latency

Option 3: Materialize a portion of the CUBE
    Keep the most commonly required portions around
    Try to answer all queries using that




Data Cubes


Cube = {Materialized Views} [Harinarayan, Rajaraman, Ullman 96]

• each groupby creates a “summary table”, which is a materialized view with some dressing
• storing these summary tables speeds up cube queries
• what to store and what not
• TPC-D example for sale analysis



Data Cubes


The Lattice Organization

• the query "sales groupby part" can be answered at any of:
    p:   cost of scanning 0.2M records
    pc:  cost of scanning 6.0M records
    psc: cost of scanning 6.0M records

• select the views that minimize overall query cost
    need a good query model
    need a good optimization criterion
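A minimal sketch of how a query is routed in the lattice: a view can answer a group-by if it groups by a superset of the query's dimensions, and the cheapest such view is the one with the fewest rows. The bitmask encoding, the names `cheapest_view` and `struct view`, and the reading of p, s, c as part, supplier, customer are assumptions for illustration; the row counts are the ones on the slide.

```c
#include <stddef.h>

enum { P = 1, S = 2, C = 4 };   /* one bit per dimension (assumed: part, supplier, customer) */

struct view { unsigned dims; double rows_millions; };

/* Returns the index of the cheapest materialized view that can answer a
 * group-by on `query_dims`, or -1 if none can.  A view can answer the query
 * when it groups by a superset of the query's dimensions. */
int cheapest_view(const struct view *views, size_t n, unsigned query_dims)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if ((views[i].dims & query_dims) != query_dims)
            continue;                       /* view is too coarse */
        if (best < 0 || views[i].rows_millions < views[best].rows_millions)
            best = (int)i;
    }
    return best;
}
```

With psc, pc and p all materialized, a group-by on part is answered from p (0.2M rows); if only psc and pc are stored, the same query costs a 6.0M-row scan.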



Data Cubes


Views grow exponentially

• in general 2**N subspaces

                  ABCD

        ABC    ABD    ACD    BCD

    AB    AC    AD    BC    BD    CD

           A     B     C     D

                  none







Variant Indexes (O’Neil, Quass; SIGMOD’97)

Specialized indexes for supporting summary aggregate queries
Different types of indexes:
    Traditional Value-List Indexes
    Bitmap Indexes
    Projection Indexes
        Very similar to column-based storage (much research in the last few years)
    Bit-sliced Indexes
    Join Indexes

Key observation: Read-only database, so can build as many indexes as you want
Many of the ideas are present in today’s column-stores



Value-List/Bitmap Index

Key idea: Given a property over a domain, the following two are interchangeable and complementary
    a list of values
    a bitmap over the domain

In our case:
    Domain: The set of all RIDs
    Property: A predicate R.a = 'Sports'

If the number of RIDs that satisfy the property is:
    small: store as a list of RIDs
    large: store as a bitmap over the RIDs


2. Indexing Definitions

In this section we examine traditional Value-List indexes and show how Bitmap representations for RID-lists can easily be used. We then introduce Projection and Bit-Sliced indexes.

2.1 Traditional Value-List Indexes

Database indexes provided today by most database systems use B+-tree¹ indexes to retrieve rows of a table with specified values involving one or more columns (see [COMER79]). The leaf level of the B-tree index consists of a sequence of entries for index keyvalues. Each keyvalue reflects the value of the indexed column or columns in one or more rows in the table, and each keyvalue entry references the set of rows with that value. Since all rows of an indexed relational table are referenced exactly once in the B-tree, the rows are partitioned by keyvalue. However, object-relational databases allow rows to have multi-valued attributes, so that in the future the same row may appear under many keyvalues in the index. We therefore refer to this type of index simply as a Value-List index.

Traditionally, Value-List (B-tree) indexes have referenced each row individually as a RID, a Row IDentifier, specifying the disk position of the row. A sequence of RIDs, known as a RID-list, is held in each distinct keyvalue entry in the B-tree. In indexes with a relatively small number of keyvalues compared to the number of rows, most keyvalues will have a large number of associated RIDs and the potential for compression arises by listing a keyvalue once, at the head of what we call a RID-list Fragment, containing a long list of RIDs for rows with this keyvalue. For example, MVS DB2 provides this kind of compression (see [O'NEI96], Figure 7.19). Keyvalues with RID-lists that cross leaf pages require multiple Fragments. We assume in what follows that RID-lists (and Bitmaps, which follow) are read from disk in multiples of Fragments. With this amortization of the space for the keyvalue over multiple 4-byte RIDs of a Fragment, the length in bytes of the leaf level of the B-tree index can be approximated as 4 times the number of rows in the table, divided by the average fullness of the leaf nodes. In what follows, we assume that we are dealing with data that is updated infrequently, so that B-tree leaf pages can be completely filled, reorganized during batch updates. Thus the length in bytes of the leaf level of a B-tree index with a small number of keyvalues is about 4 times the number of table rows.

2.1.1 Bitmap Indexes

Bitmap indexes were first developed for database use in the Model 204 product from Computer Corporation of America (see [O'NEI87]). A Bitmap is an alternate form for representing RID-lists in a Value-List index. Bitmaps are more space-efficient than RID-lists when the number of keyvalues for the index is low. Furthermore, we will show that Bitmaps are usually more CPU-efficient as well, because of the simplicity of their representation. To create Bitmaps for the n rows of a table T = {r1, r2, . . ., rn}, we start with a 1-1 mapping m from rows of T to Z[M], the first M positive integers. In what follows we avoid frequent reference to the mapping m. When we speak of the row number of a row r of T, we will mean the value m(r).

Note that while there are n rows in T = {r1, r2, . . ., rn}, it is not necessarily true that the maximum row number M is the same as n, since a method is commonly used to associate a fixed number of rows p with each disk page for fast lookup. Thus for a given row r with row number j, the table page number accessed to retrieve row r is j/p and the page slot is (in C terms) j%p. This means that rows will be assigned row numbers in disk clustered sequence, a valuable property. Since the rows might have variable size and we may not always be able to accommodate an equal number of rows on each disk page, the value p must be chosen as a maximum, so some integers in Z[M] might be wasted. They will correspond to non-existent slots on pages that cannot accommodate the full set of p rows. (And we may find that m^-1(j) for some row numbers j in Z[M] is undefined.)

¹ B+-trees are commonly referred to simply as B-trees in database documentation, and we will follow this convention.

A "Bitmap" B is defined on T as a sequence of M bits. If a Bitmap B is meant to list rows in T with a given property P, then for each row r with row number j that has the property P, we set bit j in B to one; all other bits are set to zero. A Bitmap index for a column C with values v1, v2, . . ., vk is a B-tree with entries having these keyvalues and associated data portions that contain Bitmaps for the properties C = v1, . . ., C = vk. Thus Bitmaps in this index are just a new way to specify lists of RIDs for specific column values. See Figure 2.1 for an example. Note that a series of successive Bitmap Fragments make up the entry for "department = 'sports'".
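The definition can be made concrete with a small C sketch that builds one Bitmap per column value. The sizes M and K, the encoding of column values as small integers 0..K-1, zero-based row numbers, and the names `build_bitmap_index` and `test_bit` are all assumptions for illustration; the B-tree that holds the Bitmaps is left out.

```c
#include <stdint.h>
#include <string.h>

#define M      16                      /* number of row slots (hypothetical) */
#define WORDS  ((M + 31) / 32)         /* 32-bit words per Bitmap */
#define K      3                       /* number of distinct column values */

/* bitmaps[v] gets bit j set exactly when column[j] == v.
 * Column values are assumed to be encoded as integers 0..K-1. */
void build_bitmap_index(const int column[M], uint32_t bitmaps[K][WORDS])
{
    memset(bitmaps, 0, sizeof(uint32_t) * K * WORDS);
    for (int j = 0; j < M; j++)
        bitmaps[column[j]][j / 32] |= (uint32_t)1 << (j % 32);
}

/* Returns 1 if row number j is listed in the Bitmap, 0 otherwise. */
int test_bit(const uint32_t *bitmap, int j)
{
    return (int)((bitmap[j / 32] >> (j % 32)) & 1u);
}
```

Each row sets exactly one bit across the K Bitmaps, which is why the average density of a Bitmap is 1/K.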

[Figure 2.1: B-tree root node for the department index, with keyvalues 'clothes', 'china', ..., 'sports', ..., 'tools'; the entry for 'sports' consists of successive Bitmap Fragments (101101..., 01011...).]

Figure 2.1. Example of a Bitmap Index on department, a column of the SALES table

We say that Bitmaps are dense if the proportion of one-bits in the Bitmap is large. A Bitmap index for a column with 32 values will have Bitmaps with average density of 1/32. In this case the disk space to hold a Bitmap column index will be comparable to the disk space needed for a RID-list index (which requires about 32 bits for each RID present). While the uncompressed Bitmap index size is proportional to the number of column values, a RID-list index is about the same size for any number of values (as long as we can continue to amortize the keysize with a long block of RIDs). For a column index with a very small number of values, the Bitmaps will have high densities (such as 50% for predicates such as GENDER = 'M' or GENDER = 'F'), and the disk savings is enormous. On the other hand, when average Bitmap density for a Bitmap index becomes too low, methods exist for compressing a Bitmap. The simplest of these is to translate the Bitmap back to a RID list, and we will assume this in what follows.
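The break-even at 32 column values follows from simple arithmetic, sketched below. The function names are hypothetical, and the sketch assumes every row slot is in use (M = n) and ignores B-tree overhead.

```c
/* Uncompressed Bitmap index: one Bitmap of `rows` bits per column value. */
unsigned long long bitmap_index_bytes(unsigned long long rows, unsigned values)
{
    return (rows + 7) / 8 * values;
}

/* RID-list index: about 4 bytes (32 bits) per row, whatever the number of values. */
unsigned long long ridlist_index_bytes(unsigned long long rows)
{
    return rows * 4;
}
```

For a table of 1,000,000 rows, 32 values give 32 Bitmaps of 125,000 bytes each, the same 4,000,000 bytes as the RID-list index; with only 2 values the Bitmap index needs 250,000 bytes.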

2.1.2 Bitmap Index Performance

An important consideration for database query performance is the fact that Boolean operations, such as AND, OR, and NOT are

Figure: Bitmap Index






Value-List/Bitmap Index: Segmentation

Each disk page can store 48K bits, so we must partition the Facts table into 48K-row partitions
    So each B+-Tree page contains a portion of the bitmap over the RIDs

If the number of 1’s is small, convert to a RID-list
    The tipping point is when the number of 1’s is < 1/32 of the size
    At that point, the RID-list exactly fits in the disk page (48000/32 = 1500, 1500 ∗ 4 = 6K)
    This is always true regardless of the page size

Segmentation also helps with storage space... if an entire segment is all 0’s, don’t store it
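The claim that the RID-list fits "regardless of the page size" is just arithmetic: a page of P bytes holds 8P bits, 1/32 of that is P/4 RIDs, and P/4 RIDs of 4 bytes each take P bytes. A small sketch, with hypothetical function names and assuming 4-byte RIDs as on the slide:

```c
/* Number of one-bits at the 1/32 tipping point for a page of this size. */
unsigned tipping_point_rids(unsigned page_bytes)
{
    unsigned bits_per_page = page_bytes * 8;
    return bits_per_page / 32;
}

/* Bytes needed to store that many 4-byte RIDs. */
unsigned ridlist_bytes_at_tipping_point(unsigned page_bytes)
{
    return tipping_point_rids(page_bytes) * 4;
}
```

For the 6000-byte page of the slide this gives 1500 RIDs and 6000 bytes, exactly one page.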






Value-List/Bitmap Index: Queries

Selections on the table return bitmaps
    AND, OR, NOT very fast on bitmaps
    Result called a Foundset: Bf (the domain is the Facts table)

Next step: Aggregate (recall almost all queries compute aggregates)
    Can perform directly on the bitmap in some cases (COUNT)
    Otherwise use projection indexes
    OR use a bit-sliced index



Value-List/Bitmap Index: COUNT

shcount: count the number of ones in the binary representation
    shcount[3] = 2, shcount[10] = 2, shcount[15] = 4 etc...


extremely fast for Bitmaps. Given Bitmaps B1 and B2, we can calculate a new Bitmap B3, B3 = B1 AND B2, by treating all bitmaps as arrays of long ints and looping through them, using the & operation of C:

    for (i = 0; i < len(B1); i++)   /* Note: len(B1)=len(B2)=len(B3) */
        B3[i] = B1[i] & B2[i];      /* B3 = B1 AND B2 */

We would not normally expect the entire Bitmap to be memory resident, but would perform a loop to operate on Bitmaps by reading them in from disk in long Fragments. We ignore this loop here. Using a similar approach, we can calculate B3 = B1 OR B2. But calculating B3 = NOT(B1) requires an extra step. Since some bit positions can correspond to non-existent rows, we postulate an Existence Bitmap (designated EBM) which has exactly those 1 bits corresponding to existing rows. Now when we perform a NOT on a Bitmap B, we loop through a long int array performing the ~ operation of C, then AND the result with the corresponding long int from EBM.

    for (i = 0; i < len(B1); i++)
        B3[i] = ~B1[i] & EBM[i];    /* B3 = NOT(B1) for rows that exist */
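The three loops can be written as self-contained C functions. This is a sketch for in-memory Bitmaps only; the function names and the explicit `len` parameter (number of long ints per Bitmap) are assumptions, and reading Fragments from disk is ignored as in the text.

```c
#include <stddef.h>

/* B3 = B1 AND B2 */
void bitmap_and(const unsigned long *b1, const unsigned long *b2,
                unsigned long *b3, size_t len)
{
    for (size_t i = 0; i < len; i++)
        b3[i] = b1[i] & b2[i];
}

/* B3 = B1 OR B2 */
void bitmap_or(const unsigned long *b1, const unsigned long *b2,
               unsigned long *b3, size_t len)
{
    for (size_t i = 0; i < len; i++)
        b3[i] = b1[i] | b2[i];
}

/* B3 = NOT(B1), restricted to rows that exist according to the
 * Existence Bitmap EBM. */
void bitmap_not(const unsigned long *b1, const unsigned long *ebm,
                unsigned long *b3, size_t len)
{
    for (size_t i = 0; i < len; i++)
        b3[i] = ~b1[i] & ebm[i];
}
```

Without the AND against EBM, the complement would also set the bits of unused row slots, and a later COUNT would include rows that do not exist.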

Typical Select statements may have a number of predicates in their Where Clause that must be combined in a Boolean manner. The resulting set of rows, which is then retrieved or aggregated in the Select target-list, is called a Foundset in what follows. Sometimes, the rows filtered by the Where Clause must be further grouped, due to a group-by clause, and we refer to the set of rows restricted to a single group as a Groupset.

Finally, we show how the COUNT function for a Bitmap of a Foundset can be efficiently performed. First, a short int array shcount[ ] is declared, with entries initialized to contain the number of bits in the entry subscript. Given this array, we can loop through a Bitmap as an array of short int values, to get the count of the total Bitmap as shown in Algorithm 2.1. Clearly the shcount[ ] array is used to provide parallelism in calculating the COUNT on many bits at once.

Algorithm 2.1. Performing COUNT with a Bitmap

    /* Assume B1[ ] is a short int array overlaying a Foundset Bitmap */
    count = 0;
    for (i = 0; i < SHNUM; i++)
        count += shcount[B1[i]];    /* add count of bits for next short int */
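A runnable version of Algorithm 2.1, including one way to initialize the lookup table. The names `init_shcount` and `bitmap_count` are hypothetical, and the sketch departs from the text in two small ways: the Bitmap is viewed as unsigned 16-bit values so that every value is a valid subscript, and table entries are stored in one byte each since no entry exceeds 16.

```c
#include <stddef.h>
#include <stdint.h>

static unsigned char shcount[65536];   /* shcount[v] = number of one-bits in v */

/* Fill the table: the count for v is the count for v without its lowest
 * bit, plus that lowest bit. */
void init_shcount(void)
{
    shcount[0] = 0;
    for (unsigned long v = 1; v < 65536; v++)
        shcount[v] = (unsigned char)(shcount[v >> 1] + (v & 1));
}

/* COUNT of a Bitmap viewed as `shnum` unsigned 16-bit values. */
unsigned long bitmap_count(const uint16_t *b1, size_t shnum)
{
    unsigned long count = 0;
    for (size_t i = 0; i < shnum; i++)
        count += shcount[b1[i]];       /* add count of bits for next short int */
    return count;
}
```

`init_shcount()` must be called once before the first `bitmap_count()`; after that, each table lookup accounts for 16 bits of the Bitmap at once.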

Loops for Bitmap AND, OR, NOT, or COUNT are extremely fast compared to loop operations on RID lists, where several operations are required for each RID, so long as the Bitmaps involved have reasonably high density (down to about 1%).

Example 2.1. In the Set Query benchmark of [O'NEI91], the results from one of the SQL statements in Query Suite Q5 give a good illustration of Bitmap performance. For a table named BENCH of 1,000,000 rows, two columns named K10 and K25 have cardinalities 10 and 25, respectively, with all rows in the table equally likely to take on any valid value for either column. Thus the Bitmap densities for indexes on these columns are 10% and 4% respectively. One SQL statement from the Q5 Suite is:

    [2.1] SELECT K10, K25, COUNT(*) FROM BENCH
          GROUP BY K10, K25;

A 1995 benchmark on a 66 MHz Power PC of the Praxis Omni Warehouse, a C language version of MODEL 204, demonstrated an elapsed time of 19.25 seconds to perform this query. The query plan was to read Bitmaps from the indexes for all values of K10 and K25, perform a double loop through all 250 pairs of values, AND all pairs of Bitmaps, and COUNT the results. The 250 ANDs and 250 COUNTs of 1,000,000 bit Bitmaps required only 19.25 seconds on a relatively weak processor. By comparison, MVS DB2 Version 2.3, running on an IBM 9221/170, used an algorithm that extracted and wrote out all pairs of (K10, K25) values from the rows, sorted by value pair, and counted the result in groups, taking 248 seconds of elapsed time and 223 seconds of CPU. (See [O'NEI96] for more details.)
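The query plan of Example 2.1 (a double loop over value pairs, one AND and one COUNT per pair) can be sketched as follows. The flat array layout, the names `groupby_count` and `popcount32`, and the bit-clearing popcount used in place of the shcount[ ] table are assumptions for illustration, not the MODEL 204 implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of one-bits in w. */
static unsigned popcount32(uint32_t w)
{
    unsigned c = 0;
    for (; w != 0; w &= w - 1)   /* clear the lowest set bit each round */
        c++;
    return c;
}

/* a holds na Bitmaps and b holds nb Bitmaps, each `words` 32-bit words long
 * and stored back to back.  On return, counts[i * nb + j] is the number of
 * rows with the i-th value of the first column AND the j-th value of the
 * second column. */
void groupby_count(const uint32_t *a, size_t na, const uint32_t *b, size_t nb,
                   size_t words, unsigned long *counts)
{
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++) {
            unsigned long c = 0;
            for (size_t w = 0; w < words; w++)
                c += popcount32(a[i * words + w] & b[j * words + w]);
            counts[i * nb + j] = c;
        }
}
```

For K10 and K25 this loop runs for 10 × 25 = 250 pairs, and no row of the table itself is ever touched.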

2.1.3 Segmentation

To optimize Bitmap index access, Bitmaps can be broken into Fragments of equal sizes to fit on single fixed-size disk pages. Corresponding to these Fragments, the rows of a table are partitioned into Segments, with an equal number of row slots for each segment. In MODEL 204 (see [M204, O'NEI87]), a Bitmap Fragment fits on a 6 KByte page, and contains about 48K bits, so the table is broken into segments of about 48K rows each. This segmentation has two important implications.

The first implication involves RID-lists. When Bitmaps are sufficiently sparse that they need to be converted to RID-lists, the RID-list for a segment is guaranteed to fit on a disk page (1/32 of 48K is about 1.5K; MODEL 204 actually allows sparser Bitmaps than 1/32, so several RID lists might fit on a single disk page). Furthermore, RIDs need only be two bytes in length, because they only specify the row position within the segment (the 48K rows of a segment can be counted in a short int). At the beginning of each RID-list, the segment number will specify the higher order bits of a longer RID (4 byte or more), but the segment-relative RIDs only use two bytes each. This is an important form of prefix RID compression, which greatly speeds up index range search.

The second implication of segmentation involves combining predicates. The B-tree index entry for a particular value in MODEL 204 is made up of a number of pointers by segment to Bitmap or RID-list Fragments, but there are no pointers for segments that have no representative rows. In the case of a clustered index, for example, each particular index value entry will have pointers to only a small set of segments. Now if several predicates involving different column indexes are ANDed, the evaluation takes place segment-by-segment. If one of the predicate indexes has no pointer to a Bitmap Fragment for a segment, then the segment Fragments for the other indexes can be ignored as well. Queries like this can turn out to be very common in a workload, and the I/O saved by ignoring I/O for these index Fragments can significantly improve performance.
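Segment-by-segment evaluation can be sketched as below. A NULL pointer stands for "no Fragment for this segment"; the tiny `SEG_WORDS`, the function name `and_by_segment`, and the in-memory pointer arrays are assumptions for illustration (a real Fragment holds about 48K bits and lives on disk).

```c
#include <stddef.h>
#include <stdint.h>

#define SEG_WORDS 4   /* hypothetical: 32-bit words per Bitmap Fragment */

/* x[s] and y[s] point to the Fragments of two index entries for segment s,
 * or are NULL when the entry has no rows in that segment.  Writes the ANDed
 * Fragment for every segment into out[s] (all zero for skipped segments)
 * and returns the number of Fragments that actually had to be read. */
size_t and_by_segment(const uint32_t *const x[], const uint32_t *const y[],
                      size_t nsegs, uint32_t out[][SEG_WORDS])
{
    size_t fragments_read = 0;
    for (size_t s = 0; s < nsegs; s++) {
        for (size_t w = 0; w < SEG_WORDS; w++)
            out[s][w] = 0;
        if (x[s] == NULL || y[s] == NULL)
            continue;                  /* one side is empty: skip both reads */
        for (size_t w = 0; w < SEG_WORDS; w++)
            out[s][w] = x[s][w] & y[s][w];
        fragments_read += 2;
    }
    return fragments_read;
}
```

The return value makes the saving visible: a segment costs two Fragment reads only when both predicates have rows in it.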

In some sense, Bitmap representations and RID-list representations are interchangeable: both provide a way to list all rows with a given index value or range of values. It is simply the case that, when the Bitmap representations involved are relatively dense, Bitmaps are much more efficient than RID-lists, both in storage use and efficiency of Boolean operations. Indeed a Bitmap index can contain RID-lists for some entry values or even for some Segments within a value entry, whenever the number of rows with a given keyvalue would be too sparse in




Other Indexes

Projection Indexes
    Simply a list of the values of one attribute for all tuples
    Like the leaf level in a B+-tree (except no pointers)
    Each page contains the same number of values (so it is easier to identify the value for a specific tuple)

Bit-sliced Index
    A set of bitmaps, one for each “position” in the binary representation of the values
    Makes more sense for numerical/ordinal attributes
    Can be used for computing aggregates like SUM
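A sketch of how SUM falls out of a bit-sliced index: if slice i is the bitmap of rows whose value has bit i set, then the sum over a Foundset Bf is the sum over i of 2^i × COUNT(slice i AND Bf). The flat layout and the names `bitsliced_sum` and `popcount32` are assumptions for illustration; values are taken to be non-negative integers.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of one-bits in w. */
static unsigned popcount32(uint32_t w)
{
    unsigned c = 0;
    for (; w != 0; w &= w - 1)   /* clear the lowest set bit each round */
        c++;
    return c;
}

/* slices holds nslices bitmaps of `words` 32-bit words each, back to back;
 * slice i marks the rows whose value has bit i set.  Returns the SUM of the
 * values of the rows marked in `found`. */
unsigned long long bitsliced_sum(const uint32_t *slices, size_t nslices,
                                 const uint32_t *found, size_t words)
{
    unsigned long long sum = 0;
    for (size_t i = 0; i < nslices; i++) {
        unsigned long long count = 0;
        for (size_t w = 0; w < words; w++)
            count += popcount32(slices[i * words + w] & found[w]);
        sum += count << i;           /* each counted row contributes 2^i */
    }
    return sum;
}
```

For four rows with values 5, 3, 6, 1 the slices are 1011, 0110, 0101 (row 0 is the lowest bit), and a Foundset of all rows gives 3·1 + 2·2 + 2·4 = 15 without reading any value directly.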


assuming that I/O requires 10K instructions is: ((f·100·10,000 + k·1000·10)/1,000,000)·$12. Since k ≤ f·100, the formula f·100·10,000 + k·1000·10 ≤ f·100·10,000 + f·100·1000·10 = f·2,000,000. Thus, the total CPU cost is bounded above by f·$24, which is still cheap compared to an I/O cost of f·$600. Yet this is the highest cost we assume for CPU due to I/O, which is the dominant CPU term. In Table 3.4, we give the maximum dollar cost for each index approach.

    Method              $Cost for 10K ins per I/O
    Projection index    f·$624
    Value-List index    $642
    Bit-Sliced index    f·$425

Table 3.4. Costs of the four plans in dollars, with kM rows and clustering fraction f

The clustered case clearly affects the plans by making the Projection and Bit-Sliced indexes more efficient compared to the Value-List index.

3.2 Evaluating Other Column Aggregate Functions

We consider aggregate functions of the form in [3.2], where AGG is an aggregate function, such as COUNT, MAX, MIN, etc.

[3.2] SELECT AGG(C) FROM T WHERE condition;

Table 3.5 lists a group of aggregate functions and the index types to evaluate these functions. We enter the value "Best" in a cell if the given index type is the most efficient one to have for this aggregation, "Slow" if the index type works but not very efficiently, etc. Note that Table 3.5 demonstrates how different index types are optimal for different aggregate situations.

    Aggregate          Value-List Index   Projection Index   Bit-Sliced Index
    COUNT              Not needed         Not needed         Not needed
    SUM                Not bad            Good               Best
    AVG (SUM/COUNT)    Not bad            Good               Best
    MAX and MIN        Best               Slow               Slow
    MEDIAN, N-TILE     Usually Best       Not Useful         Sometimes Best²
    Column-Product     Very Slow          Best               Very Slow

Table 3.5. Tabulation of Performance by Index Type for Evaluating Aggregate Functions

The COUNT and SUM aggregates have already been covered. COUNT requires no index, and AVG can be evaluated as SUM/COUNT, with performance determined by SUM.

The MAX and MIN aggregate functions are best evaluated with a Value-List index. To determine MAX for a Foundset Bf, one loops from the largest value in the Value-List index down to the smallest, until finding a row in Bf. To find MAX and MIN using a Projection index, one must loop through all values stored. The algorithm to evaluate MAX or MIN using a Bit-Sliced index is given in our extended paper, [O'NQUA], together with other algorithms not detailed in this Section.
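The Value-List evaluation of MAX can be sketched as a loop over index entries from the largest keyvalue downward, stopping at the first entry whose Bitmap intersects Bf. The `struct vl_entry` layout (an array sorted by ascending keyvalue, one in-memory Bitmap per entry) and the function name are assumptions standing in for the B-tree.

```c
#include <stddef.h>
#include <stdint.h>

/* One Value-List index entry: a keyvalue and the Bitmap of its rows. */
struct vl_entry { int keyvalue; const uint32_t *bitmap; };

/* entries[] is sorted by ascending keyvalue.  Stores the largest keyvalue
 * that occurs in some row of the Foundset `found` into *max_out and
 * returns 1, or returns 0 if no indexed row is in the Foundset. */
int max_in_foundset(const struct vl_entry *entries, size_t n,
                    const uint32_t *found, size_t words, int *max_out)
{
    for (size_t i = n; i-- > 0; )              /* largest value first */
        for (size_t w = 0; w < words; w++)
            if (entries[i].bitmap[w] & found[w]) {
                *max_out = entries[i].keyvalue;
                return 1;
            }
    return 0;
}
```

MIN is the same loop run from the smallest keyvalue upward.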

² Best only if there is a clustering of rows in B in a local region, a fraction f of the pages, f ≤ 0.755.

To calculate MEDIAN(C) with C a keyvalue in a Value-List index, one loops through the non-null values of C in decreasing (or increasing) order, keeping a count of rows encountered, until for the first time with some value v the number of rows encountered so far is greater than COUNT(Bf AND Bnn)/2. Then v is the MEDIAN. Projection indexes are not useful for evaluating MEDIAN, unless the number of rows in the Foundset is very small, since all values have to be extracted and sorted. Surprisingly, a Bit-Sliced index can also be used to determine the MEDIAN, in about the same amount of time as it takes to determine SUM (see [O'NQUA]).

The N-TILE aggregate function finds values v1, v2, . . ., vN-1, which partition the rows in Bf into N sets of (approximately) equal size based on the interval in which their C value falls: C <= v1, v1 < C <= v2, . . ., vN-1 < C. MEDIAN equals 2-TILE.

An example of a COLUMN-PRODUCT aggregate function is one which involves the product of different columns. In the TPC-D benchmark, the LINEITEM table has columns L_EXTENDEDPRICE and L_DISCOUNT. A large number of queries in TPC-D retrieve the aggregate SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)), usually with the column alias "REVENUE". The most efficient method for calculating Column-Product Aggregates uses Projection indexes for the columns involved. It is possible to calculate products of columns using Value-List or Bit-Sliced indexes, with the sort of algorithm that was used for SUM, but in both cases, Foundsets of all possible cross-terms of values must be formed and counted, so the algorithms are terribly inefficient.

4. Evaluating Range Predicates

Consider a Select statement of the following form:

    [4.1] SELECT target-list FROM T
          WHERE C-range AND <condition>;

Here, C is a column of T, and <condition> is a general search-condition resulting in a Foundset Bf. The C-range represents a range predicate, {C > c1, C >= c1, C = c1, C <= c1, C < c1, C between c1 and c2}, where c1 and c2 are constant values. We will demonstrate below how to further restrict the Foundset Bf, creating a new Foundset BF, so that the compound predicate "C-range AND <condition>" holds for exactly those rows contained in BF. We do this with varying assumptions regarding index types on the column C.

Evaluating the Range using a Projection Index. If there is a Projection index on C, we can create BF by accessing each C value in the index corresponding to a row number in Bf and testing whether it lies within the specified range.

Evaluating the Range using a Value-List Index. With a Value-List index, evaluating the C-range restriction of [4.1] uses an algorithm common in most database products, looping through the index entries for the range of values. We vary slightly by accumulating a Bitmap Br as an OR of all row sets in the index for values that lie in the specified range, then AND this result with Bf to get BF. See Algorithm 4.1.
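The two steps described in this paragraph (OR the Bitmaps of all in-range values into Br, then AND with Bf) can be sketched in C as follows. This follows the prose only, not the text of Algorithm 4.1; the `struct vl_entry` array standing in for the B-tree, the closed range [lo, hi], and the function name are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* One Value-List index entry: a keyvalue and the Bitmap of its rows. */
struct vl_entry { int keyvalue; const uint32_t *bitmap; };

/* Computes BF = (OR of the Bitmaps of all keyvalues in [lo, hi]) AND Bf.
 * The output array bF doubles as the accumulator Br. */
void range_foundset(const struct vl_entry *entries, size_t n, int lo, int hi,
                    const uint32_t *bf, uint32_t *bF, size_t words)
{
    for (size_t w = 0; w < words; w++)
        bF[w] = 0;                               /* Br starts empty */
    for (size_t i = 0; i < n; i++) {
        if (entries[i].keyvalue < lo || entries[i].keyvalue > hi)
            continue;                            /* value outside the range */
        for (size_t w = 0; w < words; w++)
            bF[w] |= entries[i].bitmap[w];       /* Br = Br OR Bitmap(v) */
    }
    for (size_t w = 0; w < words; w++)
        bF[w] &= bf[w];                          /* BF = Br AND Bf */
}
```

The accumulator is touched once per in-range value, which is why it matters that Br stays in memory for the duration of the loop.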

Note that for Algorithm 4.1 to be efficiently performed, we must find some way to guarantee that the Bitmap Br remains in memory at all times as we loop through the values v in the range. This requires some forethought in the Query Optimizer if the table T



Other Indexes

Projection IndexesSimply a list of the values for one attribute for all valuesLike the leaf-level in a B+-tree (except no pointers)Each page contains same number of values (so easier toidentify the value for a specific tuple)

Bit-sliced IndexA set of bitmaps, one for each “position” in the binaryrepresentation of the values

Makes more sense for numerical/ordinal attribues

Can be used for computing aggregates like SUM

-7-

assuming that I/O requires 10K instructions is:((f.100.10,000+k.1000.10)/1,000,000).$12. Since k ! f.100, theformula f.100.10,000 + k.1000.10 ! f.100.10,000 + f.100.1000.10= f.2,000,000. Thus, the total CPU cost is bounded above byf.$24, which is still cheap compared to an I/O cost of f.$600. Yetthis is the highest cost we assume for CPU due to I/O, which isthe dominant CPU term. In Table 3.4, we give the maximum dol-lar cost for each index approach.

Method $Cost for 10Kins per I/O

Projection index f.$624

Value-List index $642

Bit-Sliced index f.$425

Table 3.4. Costs of the four plans in dollars, with kM rowsand clustering fraction f

The clustered case clearly affects the plans by making theProjection and Bit-Sliced indexes more efficient compared to theValue-List index.

3.2 Evaluating Other Column Aggregate Functions

We consider aggregate functions of the form in [3.2], whereAGG is an aggregate function, such as COUNT, MAX, MIN, etc.

[3.2] SELECT AGG(C) FROM T WHERE condition;

Table 3.5 lists a group of aggregate functions and the indextypes to evaluate these functions. We enter the value "Best" in acell if the given index type is the most efficient one to have forthis aggregation, "Slow" if the index type works but not very ef-ficiently, etc. Note that Table 3.5 demonstrates how different in-dex types are optimal for different aggregate situations.

Aggregate Value-ListIndex

ProjectionIndex

Bit-SlicedIndex

COUNT Not needed Not needed Not needed

SUM Not bad Good Best

AVG ( SUM/COUNT) Not bad Good Best

MAX and MIN Best Slow Slow

MEDIAN, N-TILE UsuallyBest

Not Useful Sometimes

Best2

Column-Product Very Slow Best Very Slow

Table 3.5. Tabulation of Performance by Index Type forEvaluating Aggregate Functions

The COUNT and SUM aggregates have already been covered.COUNT requires no index, and AVG can be evaluated asSUM/COUNT, with performance determined by SUM.

The MAX and MIN aggregate functions are best evaluated witha Value-List index. To determine MAX for a Foundset Bf, one

loops from the largest value in the Value-List index down to thesmallest, until finding a row in Bf. To find MAX and MIN using

a Projection index, one must loop through all values stored.The algorithm to evaluate MAX or MIN using a Bit-Sliced indexis given in our extended paper, [O'NQUA], together with otheralgorithms not detailed in this Section.

² Best only if there is a clustering of rows in B in a local region, a fraction f of the pages, f ≤ 0.755.

To calculate MEDIAN(C) with C a keyvalue in a Value-List index, one loops through the non-null values of C in decreasing (or increasing) order, keeping a count of rows encountered, until for the first time with some value v the number of rows encountered so far is greater than COUNT(Bf AND Bnn)/2. Then v is the MEDIAN. Projection indexes are not useful for evaluating MEDIAN, unless the number of rows in the Foundset is very small, since all values have to be extracted and sorted. Surprisingly, a Bit-Sliced index can also be used to determine the MEDIAN, in about the same amount of time as it takes to determine SUM (see [O'NQUA]).
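The MEDIAN procedure for a Value-List index can be sketched as follows. The representation is an assumption made for illustration: the index is a dict from value to an int bitmap, the Foundset Bf is an int bitmap, and Bnn (the non-null rows) is taken to be the OR of all per-value bitmaps. Function names are hypothetical, and values are visited in increasing order, which the text allows.

```python
def build_value_list_index(column):
    """Map each non-null value to a bitmap (int) of the rows holding it."""
    index = {}
    for row, value in enumerate(column):
        if value is not None:
            index[value] = index.get(value, 0) | (1 << row)
    return index


def count_bits(bitmap):
    """COUNT of a bitmap: the number of rows it contains."""
    return bin(bitmap).count("1")


def median_via_value_list(index, foundset):
    """Return the first value v at which the running row count exceeds half."""
    non_null = 0
    for bitmap in index.values():
        non_null |= bitmap  # Bnn: rows with a non-null C value
    total = count_bits(foundset & non_null)  # COUNT(Bf AND Bnn)
    seen = 0
    for value in sorted(index):
        seen += count_bits(index[value] & foundset)
        if seen > total / 2:
            return value
    return None  # no non-null rows in Bf


if __name__ == "__main__":
    column = [30, 10, None, 20, 10, 50, 40]
    index = build_value_list_index(column)
    print(median_via_value_list(index, 0b1111111))
```

Only bitmap counts are needed, so no row values are extracted or sorted. That is the contrast with the Projection index drawn in the text.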

The N-TILE aggregate function finds values v1, v2, ..., vN-1, which partition the rows in Bf into N sets of (approximately) equal size based on the interval in which their C value falls: C <= v1, v1 < C <= v2, ..., vN-1 < C. MEDIAN equals 2-TILE.

An example of a COLUMN-PRODUCT aggregate function is one which involves the product of different columns. In the TPC-D benchmark, the LINEITEM table has columns L_EXTENDEDPRICE and L_DISCOUNT. A large number of queries in TPC-D retrieve the aggregate SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)), usually with the column alias "REVENUE". The most efficient method for calculating Column-Product Aggregates uses Projection indexes for the columns involved. It is possible to calculate products of columns using Value-List or Bit-Sliced indexes, with the sort of algorithm that was used for SUM, but in both cases, Foundsets of all possible cross-terms of values must be formed and counted, so the algorithms are terribly inefficient.
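A Column-Product aggregate over Projection indexes amounts to reading the two column values for each row in the Foundset and accumulating the product. The sketch below assumes a Projection index can be modeled as a Python list of column values in row order and the Foundset as an int bitmap. The function names and sample data are hypothetical.

```python
def rows_in(foundset):
    """Yield the row numbers whose bits are set in the Foundset bitmap."""
    row = 0
    while foundset:
        if foundset & 1:
            yield row
        foundset >>= 1
        row += 1


def revenue(price_proj, discount_proj, foundset):
    """SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT)) over the rows in the Foundset."""
    return sum(
        price_proj[row] * (1 - discount_proj[row]) for row in rows_in(foundset)
    )


if __name__ == "__main__":
    # Projection indexes: one value per row, stored in row order (sample data).
    l_extendedprice = [100.0, 200.0, 50.0, 80.0]
    l_discount = [0.0, 0.5, 0.0, 0.25]
    foundset = 0b1010  # rows 1 and 3
    print(revenue(l_extendedprice, l_discount, foundset))
```

Each qualifying row costs one positional lookup per column, with no need to enumerate combinations of values.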

4. Evaluating Range Predicates

Consider a Select statement of the following form:

[4.1] SELECT target-list FROM T WHERE C-range AND <condition>;

Here, C is a column of T, and <condition> is a general search-condition resulting in a Foundset Bf. The C-range represents a range predicate, {C > c1, C >= c1, C = c1, C <= c1, C < c1, C between c1 and c2}, where c1 and c2 are constant values. We will demonstrate below how to further restrict the Foundset Bf, creating a new Foundset BF, so that the compound predicate "C-range AND <condition>" holds for exactly those rows contained in BF. We do this with varying assumptions regarding index types on the column C.

Evaluating the Range using a Projection Index. If there is a Projection index on C, we can create BF by accessing each C value in the index corresponding to a row number in Bf and testing whether it lies within the specified range.

Evaluating the Range using a Value-List Index. With a Value-List index, evaluating the C-range restriction of [4.1] uses an algorithm common in most database products, looping through the index entries for the range of values. We vary slightly by accumulating a Bitmap Br as an OR of all row sets in the index for values that lie in the specified range, then ANDing this result with Bf to get BF. See Algorithm 4.1.
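The two range evaluations described so far can be compared side by side in a short Python sketch. It follows the prose description only and is not a transcription of Algorithm 4.1. The representations are assumptions: a Projection index is a list of values in row order, a Value-List index is a dict from value to an int bitmap, and Foundsets are int bitmaps. The range is taken as "C between c1 and c2", and all function names are hypothetical.

```python
def build_value_list_index(column):
    """Map each non-null value to a bitmap (int) of the rows holding it."""
    index = {}
    for row, value in enumerate(column):
        if value is not None:
            index[value] = index.get(value, 0) | (1 << row)
    return index


def range_via_projection(projection, foundset, c1, c2):
    """Test the C value of every row in Bf; keep the rows inside [c1, c2]."""
    result = 0
    for row, value in enumerate(projection):
        in_foundset = (foundset >> row) & 1
        if in_foundset and value is not None and c1 <= value <= c2:
            result |= 1 << row
    return result


def range_via_value_list(index, foundset, c1, c2):
    """OR the bitmaps of all values in [c1, c2] into Br, then AND with Bf."""
    br = 0
    # A real index would visit only the entries in range; the dict is scanned
    # in full here to keep the sketch short.
    for value, bitmap in index.items():
        if c1 <= value <= c2:
            br |= bitmap
    return br & foundset


if __name__ == "__main__":
    column = [30, 10, 30, 20, 10, 50]
    foundset = 0b101011  # rows 0, 1, 3 and 5
    bf_proj = range_via_projection(column, foundset, 20, 30)
    bf_vl = range_via_value_list(build_value_list_index(column), foundset, 20, 30)
    print(bin(bf_proj), bin(bf_vl))
```

Both functions return the same BF. They differ in the work performed: the Projection version touches one value per row in Bf, while the Value-List version touches one bitmap per distinct value in the range.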

Note that for Algorithm 4.1 to be efficiently performed, we must find some way to guarantee that the Bitmap Br remains in memory at all times as we loop through the values v in the range. This requires some forethought in the Query Optimizer if the table T



Other Indexes

Bitmap Join Index
  A bitmap index on the Facts table on a dimension attribute
  Recall that each Facts table tuple joins with exactly one tuple from any dimension table
  So each Facts table tuple has a unique value for a dimension attribute
  In the example, might build a bitmap index on attribute "STATE" on Fact table
  Using Bitmap Join Index, can evaluate predicates directly on the Fact table
  Grouping also more efficient
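A bitmap join index can be illustrated with a toy star schema. The sketch below is an assumption-laden example, not the schema from the lecture: the dimension table, the fact rows, and the function names are all hypothetical, and bitmaps are held in Python ints with bit r standing for fact row r.

```python
def build_bitmap_join_index(facts, dim_attr):
    """For each dimension attribute value, record the fact rows that join to it."""
    index = {}
    for row, (foreign_key, _amount) in enumerate(facts):
        value = dim_attr[foreign_key]  # the join is done once, at build time
        index[value] = index.get(value, 0) | (1 << row)
    return index


def sum_amount(facts, bitmap):
    """Sum the amount column over the fact rows selected by the bitmap."""
    return sum(
        amount for row, (_fk, amount) in enumerate(facts) if (bitmap >> row) & 1
    )


if __name__ == "__main__":
    store_state = {1: "MD", 2: "VA", 3: "MD"}  # dimension: store_id -> STATE
    facts = [(1, 10.0), (2, 5.0), (3, 7.5), (2, 1.0), (1, 2.5)]  # (store_id, amount)
    index = build_bitmap_join_index(facts, store_state)
    # Predicate STATE = 'MD' is answered from the fact-table bitmap, no join.
    print(sum_amount(facts, index["MD"]))
    # Grouping by STATE: one bitmap per group.
    print({state: bin(bm).count("1") for state, bm in index.items()})
```

Because every fact row joins with exactly one dimension row, each fact row sets exactly one bit across the index, so the per-value bitmaps partition the fact table. That is what makes both the predicate and the grouping cheap.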



Other Indexes

Clustering
  Can get better performance by appropriately clustering the Fact table
Groupset Indexes
  For better grouping performance
Many other optimizations commonly found in Data Warehouses





Other issues

Using monetary cost as the metric
  Useful in the real world (see the 5-minute rule paper by Gray and Graefe in the Redbook)
Bloom Filters
  Can’t use them here... there is no “check” at the end to make sure the results are accurate
Data Mining
  Typically done outside the database, so the indexes don’t really help
Applicability to operational databases
  Not much... these techniques require too many indexes
  An update would require changing all of those... prohibitive


http://www.informatik.uni-trier.de/~ley/db/journals/sigmod/GrayG97.html
