Chapter 5 Multidimensional Indexes


A one-dimensional index can be used to support multidimensional queries.

Example: the fields F1 = ‘abcd’ and F2 = 123 can be combined into the single search key ‘abcd#123’.


Applications Needing Multiple Dimensions

• Geographic Information Systems

• Data Cubes


Geographic Information Systems

In GIS, data are stored in a two-dimensional space such as a map.

[Figure: a map showing a school, road1, road2, house1, house2, and a pipeline]


Typical Queries of GIS

• Partial match queries

• Range queries

• Near-neighbor queries

• Where-am-I queries


Data Cubes

Data with multiple properties can be seen as existing in a high-dimensional space.

Multidimensional data is gathered by many corporations for decision-support applications


An Example of Data Cube

A chain store may record each sale made, including:

• The day and time

• The store at which the sale was made

• The item purchased

• The color of the item

• The size of the item

• The other properties

Give the sales of pink shirts for each store and each month of 1998


Multidimensional Queries in SQL

Multidimensional data can be stored in a conventional relational database and queried in SQL.


Finding the nearest points to (10.0, 20.0)

Store points in the relation Points(x, y), with x and y representing the x- and y-coordinates.

SELECT *
FROM Points p
WHERE NOT EXISTS (
    SELECT *
    FROM Points q
    WHERE (q.x-10.0)*(q.x-10.0) + (q.y-20.0)*(q.y-20.0) <
          (p.x-10.0)*(p.x-10.0) + (p.y-20.0)*(p.y-20.0)
);


Finding the rectangles that contain (10.0, 20.0)

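Rectangles(id, xll, yll, xur, yur), where (xll, yll) is the lower-left corner and (xur, yur) is the upper-right corner

SELECT id
FROM Rectangles
WHERE xll <= 10.0 AND yll <= 20.0 AND
      xur >= 10.0 AND yur >= 20.0;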


Summarizing the sales of pink shirts

Sales(day, store, item, color, size)

SELECT day, store, COUNT(*) AS totalSales
FROM Sales
WHERE item = 'shirt' AND color = 'pink'
GROUP BY day, store;


Executing Range Queries Using Conventional Indexes

• Given ranges in all dimensions, suppose we build a secondary B+ tree index for each dimension.

• Using the B+ tree for each dimension, we can get pointers to all of the records in the range for that dimension.

• We intersect these sets of pointers to get the final range query result.


The disk I/O for a range query includes:

• finding the way down each B-tree

• examining the leaf nodes of each B-tree

• retrieving all the matching records


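Range query asking for the points in the square of side 100 surrounding the center of the space

[Figure: a 1000 × 1000 space with the query square of side 100 at its center; the labels 100,000 and 10,000 mark the points selected by one dimension alone and by both dimensions]

Suppose a leaf node holds 200 key-pointer pairs and a block holds 100 records.

• Access the 100,000 pointers in each dimension.

• Disk I/O: 2 × (100,000/200 + 1) + the number of data blocks containing the desired points (at worst 10,000)

• At worst we look at every block of the data file, so the indexes are of little help.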
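The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch of the slide's formula: the constant and function names are made up for illustration, and the "+1" is taken from the formula as written rather than derived.

```python
import math

# Figures taken from the slide.
POINTERS_PER_DIMENSION = 100_000   # pointers selected by each one-dimensional range
PAIRS_PER_LEAF = 200               # key-pointer pairs per B-tree leaf
WORST_CASE_DATA_BLOCKS = 10_000    # every desired point sits in a different block


def index_io(pointers: int, pairs_per_leaf: int) -> int:
    """Blocks read in one B-tree: the leaves holding the pointers, plus the
    one extra block per tree that the slide's formula adds."""
    return math.ceil(pointers / pairs_per_leaf) + 1


index_total = 2 * index_io(POINTERS_PER_DIMENSION, PAIRS_PER_LEAF)
worst_case_total = index_total + WORST_CASE_DATA_BLOCKS

print(index_total)        # 1002
print(worst_case_total)   # 11002
```

The two B-trees alone cost about a thousand block reads, and the data blocks can add up to ten times that, which is why the slide calls the indexes of little help.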


Executing Nearest-Neighbor Queries Using Conventional Indexes

1. Pick a range in each dimension.

2. Ask the range query.

3. Select the point closest to the target within that range.


Two things that could go wrong:

• There are no points within distance d of the given point: repeat the entire process with a higher value of d.

• The distance d’ from the target to the closest point found is greater than d, so a closer point may lie outside the range: repeat the search with d’ in place of d.

[Figure: the closest point in the range and a possible closer point outside it]


Disk I/O to find the nearest neighbor to (10.0, 20.0)

• Pick d = 1

• Examine the B-tree for the x-coordinate with the range query 9.0 <= x <= 11.0 (that is, 10.0-d <= x <= 10.0+d)

• Get about 2,000 points

• Traverse at least 10 leaves, most likely 11

• One disk I/O for an intermediate node

• Another 12 disk I/O’s for the y-coordinate

• One more disk I/O to retrieve the desired record

• A total of 25 disk I/O’s

Significantly more disk I/O’s


Multidimensional Index Structures

1. Hash-table-like approaches

   (1) Grid Files

   (2) Partitioned Hash Functions

2. Tree-like approaches

   (1) Multiple-Key Indexes

   (2) kd-Trees

   (3) Quad Trees

   (4) R-Trees

3. Bitmap Indexes


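Grid Index

[Figure: a grid with the Key 1 values V1, V2, …, Vn along one axis and the Key 2 values X1, X2, …, Xm along the other; one cell points to the records with key1=V3, key2=X2]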


Customers who bought gold jewelry:

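[Figure: scatter plot of the customers, Age (0 to 100) against Salary (0 to 500K), with grid lines at ages 40 and 55 and at salaries 90K and 225K]

Points (age, salary in K): (25,60) (45,60) (50,75) (50,100) (50,120) (70,110) (85,140) (30,260) (25,400) (45,350) (50,275) (60,260)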


• How is Grid Index stored on disk?

Like an array: the entries X1, X2, X3, X4 for V1, then the entries X1, X2, X3, X4 for V2, then the entries X1, X2, X3, X4 for V3, …

Problem:

• Need regularity so we can compute the position of the <Vi,Xj> entry


Solution: Use Indirection

[Figure: a grid with rows V1 to V4 and columns X1 to X3, whose cells point to separate buckets]

* The grid contains only pointers to buckets


The grid file representing database of customers

Points (age, salary in K) in each grid cell:

Salary       Age 0-40          Age 40-55         Age 55+
225+         30,260  25,400    45,350  50,275    60,260
90-225       (none)            50,100  50,120    70,110  85,140
0-90         25,60             45,60  50,75      (none)


Lookup in a Grid File

The positions of the point in each of the dimensions together determine the bucket.
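A minimal sketch of this lookup, using the grid lines of the customer example (ages 40 and 55, salaries 90K and 225K). The function names, the in-memory dictionary standing in for the grid of bucket pointers, and the convention that a point lying on a grid line belongs to the higher stripe are all assumptions made for illustration.

```python
from bisect import bisect_right

AGE_LINES = [40, 55]       # grid lines in the age dimension
SALARY_LINES = [90, 225]   # grid lines in the salary dimension (in K)

# Hypothetical stand-in for the grid: cell -> bucket (a list of points).
buckets = {}


def cell_of(age, salary):
    """Find the stripe of the point in each dimension; the pair names the cell."""
    return bisect_right(AGE_LINES, age), bisect_right(SALARY_LINES, salary)


def insert(age, salary):
    buckets.setdefault(cell_of(age, salary), []).append((age, salary))


def lookup(age, salary):
    """Only the one bucket selected by the cell has to be examined."""
    return (age, salary) in buckets.get(cell_of(age, salary), [])


for point in [(25, 60), (45, 60), (50, 75), (50, 100), (50, 120), (70, 110),
              (85, 140), (30, 260), (25, 400), (45, 350), (50, 275), (60, 260)]:
    insert(*point)

print(cell_of(50, 120))   # (1, 1): age stripe 40-55, salary stripe 90-225
print(lookup(50, 120))    # True
```

Because each dimension is searched independently, the cost of locating the cell does not depend on how many points the file holds.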


Insertion Into Grid Files

Look up the bucket for the new record and place the record in that bucket. If there is no room, there are two general approaches:

(1) Add overflow blocks to the bucket.

(2) Reorganize the structure by adding or moving the grid lines.


Insertion of the point (52,200) followed by splitting of buckets

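[Figure: the scatter plot after inserting (52,200), Age (0 to 100) against Salary (0 to 500K), with grid lines at ages 40 and 55 and at salaries 90K, 130K, and 225K; the line at 130K is new]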


Performance of Grid Files

• Lookup of Specific Points: 1 disk I/O to read; insertion/deletion takes 2 disk I/O’s (+1 if an overflow block must be created)

• Partial-Match Queries: look at all the buckets in a row or column of the bucket matrix

• Range Queries: look at all the buckets that cover the range defined by the query

• Nearest-Neighbor Queries: not easy to put an upper bound on how costly the search is


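Idea: partitioned hash function

[Figure: Key1 is hashed by h1 and Key2 by h2; the two results (for example 010110 and 1110010) are concatenated to form the bucket number]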


h1(toy) = 0, h1(sales) = 1, h1(art) = 1, …

h2(10k) = 01, h2(20k) = 11, h2(30k) = 01, h2(40k) = 00, …

Buckets: 000, 001, 010, 011, 100, 101, 110, 111

EX: Insert <Fred,toy,10k>, <Joe,sales,10k>, <Sally,art,30k>

[Figure: <Fred> goes to bucket 001; <Joe> and <Sally> go to bucket 101]


• Find Emp. with Dept. = Sales and Sal = 40k

With the same h1 and h2 as before, h1(sales) = 1 and h2(40k) = 00, so look only in bucket 100.

[Figure: the buckets, now holding <Fred>, <Joe>, <Jan>, <Mary>, <Sally>, <Tom>, <Bill>, and <Andy>]


• Find Emp. with Sal = 30k

With the same h1 and h2 as before, h2(30k) = 01 and the first bit is unknown, so look in buckets 001 and 101.

[Figure: the buckets holding <Fred>, <Joe>, <Jan>, <Mary>, <Sally>, <Tom>, <Bill>, and <Andy>, with the buckets to examine marked “look here”]


• Find Emp. with Dept. = Sales

With the same h1 and h2 as before, h1(sales) = 1 and the last two bits are unknown, so look in buckets 100, 101, 110, and 111.

[Figure: the buckets holding <Fred>, <Joe>, <Jan>, <Mary>, <Sally>, <Tom>, <Bill>, and <Andy>, with the buckets to examine marked “look here”]
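The three queries above can be reproduced with a small sketch. The tables H1 and H2 copy the hash values given on the slides; the function names and the choice to build the bucket number as h1 bits followed by h2 bits are assumptions made for illustration.

```python
# Hash values copied from the slides: h1 yields 1 bit, h2 yields 2 bits.
H1 = {"toy": "0", "sales": "1", "art": "1"}
H2 = {"10k": "01", "20k": "11", "30k": "01", "40k": "00"}


def bucket_of(dept, sal):
    """Bucket number of a record: the h1 bit followed by the two h2 bits."""
    return H1[dept] + H2[sal]


def candidate_buckets(dept=None, sal=None):
    """Buckets a partial-match query must examine. An unspecified attribute
    leaves its bits free, so every combination of those bits is a candidate."""
    prefixes = [H1[dept]] if dept is not None else ["0", "1"]
    suffixes = [H2[sal]] if sal is not None else ["00", "01", "10", "11"]
    return [p + s for p in prefixes for s in suffixes]


print(bucket_of("toy", "10k"))                       # 001 (Fred)
print(bucket_of("sales", "10k"))                     # 101 (Joe)
print(candidate_buckets(dept="sales", sal="40k"))    # ['100']
print(candidate_buckets(sal="30k"))                  # ['001', '101']
print(candidate_buckets(dept="sales"))               # ['100', '101', '110', '111']
```

Each specified attribute fixes its share of the bits, so a query on one attribute still narrows the search to a fraction of the buckets.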


Comparison of Grid Files and Partitioned Hashing

• Grid files are good at nearest-neighbor queries or range queries.

• Partitioned hashing is good at partial match queries.


Tree-Like Structure for Multidimensional Data

• Multiple-key indexes

• Kd-trees

• Quad trees

• R-trees


Multi-key Index

Motivation: Find records where DEPT = “Toy” AND SAL > 50k


Strategy I:

• Use one index, say Dept.

• Get all Dept = “Toy” records and check their salary

[Figure: a single index I1]


Strategy II:

• Use 2 indexes; manipulate pointers

[Figure: the pointers found for Toy and the pointers found for Sal > 50k are combined]


Strategy III:

• Multiple Key Index

One idea:

[Figure: an index I1 on the first attribute, whose entries point to indexes I2, I3, … on the second attribute]


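Example

[Figure: a Dept index with the entries Art, Sales, and Toy; its entries point to Salary indexes (one with the entries 10k, 15k, 17k, 21k and one with the entries 12k, 15k, 15k, 19k), which point to the records. Example record: Name = Joe, DEPT = Sales, SAL = 15k]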


Performance of Multiple-Key Indexes

• Partial-Match Queries

quite efficient if the first attribute is specified

• Range Queries

work quite well

• Nearest-Neighbor Queries

use the same strategy as with the other index structures


Partial-Match Queries

• If the first attribute is specified, the access is quite efficient.

• If the second attribute is specified, the access is time-consuming.


Range Queries

• Range query on the first attribute to find all of the subindexes

• Search each of these subindexes, using the range specified for the second attribute

• ……
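The behaviour described for partial-match and range queries can be sketched with a dictionary of sorted lists. This is an illustrative in-memory stand-in, not a disk-based index: the class name, its methods, and every record other than Joe are hypothetical.

```python
from bisect import bisect_left, bisect_right


class MultiKeyIndex:
    """Index on dept (first attribute); each dept entry leads to a subindex
    on salary (second attribute), kept here as parallel sorted lists."""

    def __init__(self):
        self._by_dept = {}   # dept -> (sorted salaries, names in the same order)

    def insert(self, name, dept, salary):
        salaries, names = self._by_dept.setdefault(dept, ([], []))
        pos = bisect_right(salaries, salary)
        salaries.insert(pos, salary)
        names.insert(pos, name)

    def dept_and_salary_above(self, dept, min_salary):
        """First attribute specified: go to one subindex, then search it."""
        salaries, names = self._by_dept.get(dept, ([], []))
        return names[bisect_right(salaries, min_salary):]

    def salary_equals(self, salary):
        """Only the second attribute specified: every subindex is visited."""
        result = []
        for salaries, names in self._by_dept.values():
            result += names[bisect_left(salaries, salary):bisect_right(salaries, salary)]
        return result


index = MultiKeyIndex()
index.insert("Joe", "Sales", 15)   # the example record from the slides
index.insert("Ann", "Toy", 60)     # hypothetical records below
index.insert("Bob", "Toy", 40)
index.insert("Cid", "Art", 15)

print(index.dept_and_salary_above("Toy", 50))   # ['Ann']
print(sorted(index.salary_equals(15)))          # ['Cid', 'Joe']
```

The second query shows why access by the second attribute alone is time-consuming: nothing narrows down which subindexes to search.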


Nearest-Neighbor Queries

• Pick a distance d.

• Ask the range query x0-d <= x <= x0+d and y0-d <= y <= y0+d.

• Find the closest point within this range.

• If there are no points within the range, or the distance from (x0,y0) to the closest point is greater than d, increase the range and search again.
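The steps above can be sketched as a loop around a range query. In this sketch the range query is a plain linear scan standing in for whatever index is used, and the policy for growing d (double it when the range is empty, otherwise use the distance of the best point found) is an assumption; the function names are illustrative.

```python
import math


def range_query(points, x_lo, x_hi, y_lo, y_hi):
    """Stand-in for an index-supported range query (here: a linear scan)."""
    return [(x, y) for (x, y) in points if x_lo <= x <= x_hi and y_lo <= y <= y_hi]


def nearest_neighbor(points, x0, y0, d=1.0):
    """Nearest point to (x0, y0), found by repeated range queries.
    Assumes d > 0."""
    if not points:
        return None
    while True:
        candidates = range_query(points, x0 - d, x0 + d, y0 - d, y0 + d)
        if not candidates:
            d *= 2                      # nothing in range: widen and retry
            continue
        best = min(candidates, key=lambda p: math.dist(p, (x0, y0)))
        best_distance = math.dist(best, (x0, y0))
        if best_distance <= d:
            return best                 # no closer point can lie outside the square
        d = best_distance               # a closer point may lie outside: retry with d'


# The closest point inside the first square is not the true nearest neighbor.
points = [(0.9, 0.9), (1.2, 0.0)]
print(nearest_neighbor(points, 0.0, 0.0, d=1.0))   # (1.2, 0.0)
```

In the example, (0.9, 0.9) lies in the first square but at distance about 1.27, so the search is repeated with that distance and finds (1.2, 0.0).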


kd-Trees (k-dimensional search trees)

• Generalization of the binary search tree to multidimensional data.

• Interior nodes with an associated attribute A and its dividing value V.

• The attributes rotating at different levels of the tree.

• Leaves with blocks holding data records.


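A kd-tree example (points are age, salary in K)

Salary 150
    salary < 150: Age 60
        age < 60: Salary 80
            salary < 80: Age 38
                age < 38: (25,60)
                age >= 38: (45,60) (50,75)
            salary >= 80: (50,100) (50,120)
        age >= 60: (70,110) (85,140)
    salary >= 150: Age 47
        age < 47: Salary 300
            salary < 300: (30,260)
            salary >= 300: (25,400) (45,350)
        age >= 47: (50,275) (60,260)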


Tree after insertion of (35,500)

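Salary 150
    salary < 150: Age 60
        age < 60: Salary 80
            salary < 80: Age 38
                age < 38: (25,60)
                age >= 38: (45,60) (50,75)
            salary >= 80: (50,100) (50,120)
        age >= 60: (70,110) (85,140)
    salary >= 150: Age 47
        age < 47: Salary 300
            salary < 300: (30,260)
            salary >= 300: Age 35 (new node)
                age < 35: (25,400)
                age >= 35: (35,500) (45,350)
        age >= 47: (50,275) (60,260)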


Complex Queries on kd-tree

• Partial-Match Queries

ask for all points with age = 50

• Range Queries

ask for all points with ages 35 to 55 and salaries $100K to $200K

• Nearest-Neighbor Queries

use the same approach as discussed before


Partial-Match Queries ( ask for all points with age = 50)

• Explore both ways at the level with the unknown attribute.

• Go one way at the level with the specified attribute.


Range Queries (ask for all points with ages 35 to 55 and salaries $100K to $200K)

• If the range straddles the splitting value, explore the two children

• Otherwise, move to only one child.
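The rule above can be sketched on the example kd-tree (before the insertion of (35,500)). The class names and the convention that values below the dividing value go left and all others go right are assumptions of this sketch. A partial-match query is treated as a range query whose unspecified attribute ranges over everything.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

AGE, SALARY = 0, 1
Point = Tuple[int, int]   # (age, salary in K)


@dataclass
class Leaf:
    points: List[Point]


@dataclass
class Interior:
    attr: int        # AGE or SALARY
    value: int       # dividing value
    left: "Node"     # points with attr < value
    right: "Node"    # points with attr >= value


Node = Union[Leaf, Interior]


def range_query(node, lo, hi):
    """All points p with lo[i] <= p[i] <= hi[i] in both dimensions."""
    if isinstance(node, Leaf):
        return [p for p in node.points
                if all(lo[i] <= p[i] <= hi[i] for i in (AGE, SALARY))]
    result = []
    if lo[node.attr] < node.value:     # range reaches below the dividing value
        result += range_query(node.left, lo, hi)
    if hi[node.attr] >= node.value:    # range reaches the dividing value or above
        result += range_query(node.right, lo, hi)
    return result


tree = Interior(SALARY, 150,
    Interior(AGE, 60,
        Interior(SALARY, 80,
            Interior(AGE, 38, Leaf([(25, 60)]), Leaf([(45, 60), (50, 75)])),
            Leaf([(50, 100), (50, 120)])),
        Leaf([(70, 110), (85, 140)])),
    Interior(AGE, 47,
        Interior(SALARY, 300, Leaf([(30, 260)]), Leaf([(25, 400), (45, 350)])),
        Leaf([(50, 275), (60, 260)])))

# Ages 35 to 55 and salaries 100K to 200K.
print(range_query(tree, (35, 100), (55, 200)))   # [(50, 100), (50, 120)]

# Partial match age = 50: the salary range is unbounded.
inf = float("inf")
print(range_query(tree, (50, -inf), (50, inf)))
```

When the range straddles the dividing value both tests succeed and both children are explored; otherwise only one subtree is visited.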


Nearest-Neighbor Queries

• Treat them as range queries

• Repeat with a larger range if necessary


Problems:

(1) Long paths: a kd-tree with n leaves has paths of length about log2 n.

(2) Unused space: interior nodes hold little information.

Two approaches to improve:

• Multiway Branches at Interior Nodes

• Group Interior Nodes Into Blocks


Multiway Branches at Interior Nodes

• Interior nodes with many key-pointer pairs

• Keeping distribution and balance as we do for B-tree


Group Interior Nodes Into Blocks

• Packing many interior nodes into a single block.

• Including in one block a node and its descendants for some number of levels


Quad Trees

• Data points are contained in a square region.

• If data points in a square can fit in a block, the square will be a leaf of the tree.

• Otherwise, the square will be an interior node, with children corresponding to its four quadrants.
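A minimal sketch of this rule, assuming a block holds two points, that all points are distinct, and that a point on a dividing line belongs to the higher quadrant. The class and method names are illustrative only.

```python
BLOCK_CAPACITY = 2   # assumption: a block holds two points


class QuadNode:
    def __init__(self, x_lo, y_lo, x_hi, y_hi):
        self.x_lo, self.y_lo, self.x_hi, self.y_hi = x_lo, y_lo, x_hi, y_hi
        self.points = []       # used while the node is a leaf
        self.children = None   # [SW, SE, NW, NE] once the node is interior

    def _mid(self):
        return (self.x_lo + self.x_hi) / 2, (self.y_lo + self.y_hi) / 2

    def _child_for(self, point):
        x_mid, y_mid = self._mid()
        index = (1 if point[0] >= x_mid else 0) + (2 if point[1] >= y_mid else 0)
        return self.children[index]

    def insert(self, point):
        if self.children is not None:
            self._child_for(point).insert(point)
            return
        self.points.append(point)
        if len(self.points) > BLOCK_CAPACITY:
            self._split()      # the square no longer fits in one block

    def _split(self):
        x_mid, y_mid = self._mid()
        self.children = [
            QuadNode(self.x_lo, self.y_lo, x_mid, y_mid),   # SW
            QuadNode(x_mid, self.y_lo, self.x_hi, y_mid),   # SE
            QuadNode(self.x_lo, y_mid, x_mid, self.y_hi),   # NW
            QuadNode(x_mid, y_mid, self.x_hi, self.y_hi),   # NE
        ]
        points, self.points = self.points, []
        for p in points:
            self._child_for(p).insert(p)

    def leaves(self):
        if self.children is None:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


# Age 0 to 100 against salary 0 to 400 (in K), as in the customer example.
root = QuadNode(0, 0, 100, 400)
for point in [(25, 60), (45, 60), (50, 75), (50, 100), (50, 120), (70, 110),
              (85, 140), (30, 260), (25, 400), (45, 350), (50, 275), (60, 260)]:
    root.insert(point)

print(max(len(leaf.points) for leaf in root.leaves()))   # 2
```

Unlike a kd-tree, the dividing lines are always the midpoints of the square, so crowded regions are split more deeply than sparse ones.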


Data organized in a quad tree

[Figure: the customer points plotted with Age (0 to 100) against Salary (0 to 400K), with the space divided recursively into quadrants]


A quad tree


R-Trees (Region Trees)

• The R-tree node represents a data region which has subregions as its children.

• The data region can be of any shape.

• The subregions do not cover the entire region.

• The subregions are allowed to overlap.


The region of an R-tree node and subregions of its children


“Where-am-I” Query

• Start at the root.

• Examine the subregions at the root to see whether they contain point P.

• If no subregion contains P, P is not in any data region.

• If at least one subregion contains P, recursively search each such subregion for P until reaching the leaves.
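A sketch of the “where-am-I” search, assuming rectangular regions stored as (x_lo, y_lo, x_hi, y_hi). The two root rectangles are the ones shown later for the map example; the data regions "a" to "d" inside the leaves and all names in the code are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Rect = Tuple[float, float, float, float]   # (x_lo, y_lo, x_hi, y_hi)


def contains(rect, x, y):
    return rect[0] <= x <= rect[2] and rect[1] <= y <= rect[3]


@dataclass
class RNode:
    is_leaf: bool
    entries: List[tuple]   # leaf: (rect, data region name); interior: (rect, child RNode)


def where_am_i(node, x, y):
    """Names of all data regions containing the point (x, y)."""
    hits = []
    for rect, item in node.entries:
        if contains(rect, x, y):
            if node.is_leaf:
                hits.append(item)
            else:
                # Subregions may overlap, so every matching child is searched.
                hits.extend(where_am_i(item, x, y))
    return hits


# Hypothetical data regions inside the two root subregions.
leaf1 = RNode(True, [((10, 10, 30, 30), "a"), ((40, 5, 55, 45), "b")])
leaf2 = RNode(True, [((25, 25, 50, 60), "c"), ((70, 30, 90, 70), "d")])
root = RNode(False, [((0, 0, 60, 50), leaf1), ((20, 20, 100, 80), leaf2)])

print(where_am_i(root, 28, 28))   # ['a', 'c']
print(where_am_i(root, 95, 5))    # []
```

The first query follows both children of the root because the point lies in the overlap of the two subregions; the second stops at the root because no subregion contains the point.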


Insert a new region

[Figure: the map with school, road1, road2, house1, house2, and pipeline, plus the new region pop]

Suppose that leaves have room for six regions.


Root: ((0,0),(60,50)) and ((20,20),(100,80))

Leaf under ((0,0),(60,50)): road1, road2, house1

Leaf under ((20,20),(100,80)): school, house2, pipeline, pop


Expand a region

[Figure: the map with school, road1, road2, house1, house2, pipeline, and pop, plus the new region house3]

• Expanding the lower subregion increases it by 1000 units.

• Expanding the upper subregion increases it by 1200 units.


Bitmap Indexes

1. A bitmap index for a field F is a collection of bit-vectors of length n (n: number of records).

2. One bit-vector corresponds to each possible value that may appear in the field F.

3. The vector for value v has 1 in position i if the ith record has v in field F, and it has 0 there if not.


An Example of a Bitmap Index

Suppose a file has six records with two fields f and g: (30,foo), (30,bar), (40,baz), (50,foo), (40,bar), (30,baz)

f:  30: 110001        g:  foo: 100100
    40: 001010            bar: 010010
    50: 000100            baz: 001001
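The bit-vectors of this example can be rebuilt directly from the definition. This is a small sketch that keeps the vectors as strings for readability; the function name is made up.

```python
def build_bitmap_index(values):
    """One bit-vector per distinct value; position i is 1 exactly when
    record i holds that value."""
    n = len(values)
    index = {}
    for i, v in enumerate(values):
        index.setdefault(v, ["0"] * n)[i] = "1"
    return {v: "".join(bits) for v, bits in index.items()}


records = [(30, "foo"), (30, "bar"), (40, "baz"), (50, "foo"), (40, "bar"), (30, "baz")]
f_index = build_bitmap_index([f for f, g in records])
g_index = build_bitmap_index([g for f, g in records])

print(f_index)   # {30: '110001', 40: '001010', 50: '000100'}
print(g_index)   # {'foo': '100100', 'bar': '010010', 'baz': '001001'}
```

Every position is 1 in exactly one vector of each index, because each record has exactly one value in the field.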


Partial-match queries by bitmap indexes

Movie(title, year, length, studioName)

SELECT title
FROM Movie
WHERE studioName = 'Disney' AND year = 1995;

Take the bitwise AND of the bit-vector for year = 1995 and the bit-vector for studioName = ‘Disney’.


Range queries by bitmap indexes

Records: 1:(25,60) 2:(45,60) 3:(50,75) 4:(50,100) 5:(50,120) 6:(70,140) 7:(85,140) 8:(45,350)

Find all records with an age in the range 45 - 55 and a salary in the range 100 - 200, using bitmap indexes as follows.

Age:     25: 10000000    45: 01000001    50: 00111000    70: 00000100    85: 00000010

Salary:  60: 11000000    75: 00100000    100: 00010000    120: 00001000    140: 00000110    350: 00000001

Ages in range (45 and 50): 01000001 OR 00111000 = 01111001

Salaries in range (100, 120, and 140): 00010000 OR 00001000 OR 00000110 = 00011110

01111001 AND 00011110 = 00011000, so the answer is records 4 and 5: (50,100) and (50,120).
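The same computation can be sketched with Python integers as bit-vectors, the leftmost bit standing for record 1. The helper names are assumptions of this sketch.

```python
from functools import reduce

records = [(25, 60), (45, 60), (50, 75), (50, 100),
           (50, 120), (70, 140), (85, 140), (45, 350)]
AGE, SALARY = 0, 1


def bitmap_index(records, field):
    """value -> bit-vector as an int; record i sets bit n-1-i, so the
    vector prints with record 1 on the left."""
    n = len(records)
    index = {}
    for i, record in enumerate(records):
        index[record[field]] = index.get(record[field], 0) | (1 << (n - 1 - i))
    return index


def range_vector(index, lo, hi):
    """OR together the bit-vectors of all values in [lo, hi]."""
    return reduce(lambda a, b: a | b,
                  (vec for value, vec in index.items() if lo <= value <= hi), 0)


age_index = bitmap_index(records, AGE)
salary_index = bitmap_index(records, SALARY)

ages = range_vector(age_index, 45, 55)
salaries = range_vector(salary_index, 100, 200)
answer = format(ages & salaries, "08b")

print(format(ages, "08b"))       # 01111001
print(format(salaries, "08b"))   # 00011110
print(answer)                    # 00011000
print([records[i] for i, bit in enumerate(answer) if bit == "1"])
```

The last line maps the surviving 1 bits back to records and yields (50,100) and (50,120), the two records named on the slide.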


Compressed Bitmaps

• Run-length encoding (run: a sequence of i 0’s followed by a 1)

• Let j be the number of bits in the binary representation of i (about log2 i). Write j-1 1’s and a single 0, followed by i in binary.

• Concatenate the codes for each run together.

Examples: i = 0 is coded 00; i = 1 is coded 01; i = 13 is coded 1110 1101.


Encode and Decode

• Encode

age 25: 100000001000

runs (0, 7), coded 00 and 110111, giving 00110111

• Decode

11101101001011

1110 1101, 00, 10 11 are the runs 13, 0, 3, giving 0000000000000110001
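Both directions can be sketched in a few lines. The function names are made up, and the sketch assumes, as in the encoding example above, that 0's after the last 1 are simply not encoded.

```python
def encode(bits):
    """Run-length encode a bit-vector given as a string of '0' and '1'.
    Each run of i 0's followed by a 1 becomes j-1 1's, a 0, then i in
    binary, where j is the length of i in binary. Trailing 0's are dropped."""
    out = []
    run = 0
    for bit in bits:
        if bit == "0":
            run += 1
        else:
            binary = format(run, "b")            # "0" when the run is empty
            out.append("1" * (len(binary) - 1) + "0" + binary)
            run = 0
    return "".join(out)


def decode_runs(code):
    """Recover the run lengths from an encoded bit-vector."""
    runs = []
    pos = 0
    while pos < len(code):
        j = 1
        while code[pos] == "1":                  # unary part: j-1 1's
            j += 1
            pos += 1
        pos += 1                                 # skip the terminating 0
        runs.append(int(code[pos:pos + j], 2))   # i in binary, j bits
        pos += j
    return runs


def decode(code):
    """Rebuild the bit-vector up to and including its last 1."""
    return "".join("0" * i + "1" for i in decode_runs(code))


print(encode("100000001000"))          # 00110111
print(decode_runs("11101101001011"))   # [13, 0, 3]
print(decode("11101101001011"))        # 0000000000000110001
```

The unary prefix tells the decoder how many bits of binary follow, which is what makes the concatenated codes uniquely decodable.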


To perform bitwise AND or OR on encoded bit-vectors

• Decode one run at a time.

• Determine where the next 1 is in each operand bit-vector.

• If OR, produce a 1 at that position of the output.

• If AND, produce a 1 if and only if both operands have their next 1 at the same position.


OR of the encoded vectors 25: 00110111 and 30: 110111

                25                       30
First run       0 (1 in position 1)      7 (1 in position 8)
Second run      7 (1 in position 9)      (none)

Result: 100000011
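The example can be reproduced by decoding each operand one run at a time and merging the positions of the 1's. This is a sketch under the same encoding as above; the function names are illustrative, and `heapq.merge` is used only as a convenient way to walk both streams in position order.

```python
import heapq


def one_positions(code):
    """Yield the positions (counting from 1) of the 1's in an encoded
    bit-vector, decoding one run at a time."""
    pos = 0
    bit_position = 0
    while pos < len(code):
        j = 1
        while code[pos] == "1":
            j += 1
            pos += 1
        pos += 1                            # skip the terminating 0
        run = int(code[pos:pos + j], 2)
        pos += j
        bit_position += run + 1             # run 0's, then the 1
        yield bit_position


def or_positions(a, b):
    """Positions that hold a 1 in either operand."""
    last = None
    for p in heapq.merge(one_positions(a), one_positions(b)):
        if p != last:
            yield p
        last = p


def and_positions(a, b):
    """Positions that hold a 1 in both operands."""
    last = None
    for p in heapq.merge(one_positions(a), one_positions(b)):
        if p == last:
            yield p
        last = p


def to_bits(positions):
    positions = list(positions)
    bits = ["0"] * (positions[-1] if positions else 0)
    for p in positions:
        bits[p - 1] = "1"
    return "".join(bits)


print(list(or_positions("00110111", "110111")))      # [1, 8, 9]
print(to_bits(or_positions("00110111", "110111")))   # 100000011
print(list(and_positions("00110111", "110111")))     # []
```

Neither operand is ever expanded into its full bit-vector; only the stream of 1 positions is produced.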


Managing Bitmap Indexes

• Finding Bit-Vectors

• Finding Records

• Handling Modifications to the Data File


Finding Bit-Vectors

Use any secondary index with the field value as search key, such as B-tree, hash table or indexed-sequential files.


Finding Records

Use a secondary index on the data file, whose search key is the number of the record.


Handling Modifications to the Data file

• Record numbers must remain fixed once assigned

• Changes to the data file require the bitmap index to change as well


Deleting Record i

• Leave a “tombstone” in the data file.

• Change position i of the bit-vector for the deleted record’s value from 1 to 0.


Insert New Record

• Assign the next available record number to the new record.

• Modify the bit-vector for the value of the new record by appending a 1 at the end.

• If the value did not appear before, add a new bit-vector for it.

• Insert the new bit-vector and its corresponding value to the secondary index.


Modifying the value of record i from v to w

• Change bit-vector for v in position i from 1 to 0

• Change bit-vector for w in position i from 0 to 1, or create a bit-vector for w if w is a new value.


Conclusion

• Multidimensional Data

• Grid Files

• Partitioned Hash Tables

• Multiple-Key Indexes

• kd-Trees

• Quad Trees

• R-Trees

• Bitmap Indexes


Exercises

• Ex 4.1.2, Ex 4.2.6, Ex 4.3.1, Ex 4.4.6

• Ex 5.1.3, Ex 5.2.7, Ex 5.3.2, Ex 5.4.2