File Structures (SNU-OOPSLA Lab.)

Chap. 10 Indexed Sequential File Access and Prefix B+ Trees

Seoul National University, School of Computer Science and Engineering
Object-Oriented Systems Laboratory (SNU-OOPSLA-LAB)

Professor Kim Hyoung-Joo

File Structures by Folk, Zoellick, and Riccardi

Chapter Objectives

- Introduce indexed sequential files
- Describe operations on a sequence set of blocks that maintains records in order by key
- Show how an index set can be built on top of the sequence set to produce an indexed sequential file structure
- Introduce the use of a B-tree to maintain the index set, thereby introducing B+ trees and simple prefix B+ trees
- Illustrate how the B-tree index in a simple prefix B+ tree can be of variable order, holding a variable number of separators
- Compare the strengths and weaknesses of B+ trees, simple prefix B+ trees, and B-trees

Contents

10.1 Indexed Sequential Access
10.2 Maintaining a Sequence Set
10.3 Adding a Simple Index to the Sequence Set
10.4 The Content of the Index: Separators Instead of Keys
10.5 The Simple Prefix B+ Tree
10.6 Simple Prefix B+ Tree Maintenance
10.7 Index Set Block Size
10.8 Internal Structure of Index Set Blocks: A Variable-Order B-Tree
10.9 Loading a Simple Prefix B+ Tree
10.10 B+ Trees
10.11 B-Trees, B+ Trees, and Simple Prefix B+ Trees in Perspective

10.1 Indexed Sequential Access

Two alternative views
- indexed: records are indexed by keys
    - not good for sequential processing
- sequential: records can be accessed sequentially
    - not good for accessing, inserting, or deleting records in random order

In Chap. 9 we saw the B-tree; now we want to derive Indexed + Sequential ==> B+ tree with the help of the idea of the sequence set

- Sequential file ==> Indexed Sequential file ==> B+ tree
- Indexed-Sequential file = Indexed Sequential Access Method (ISAM)

Overview: ISAM File

[Figure: Example of an indexed sequential structure (when using overflow chains). Index nodes R, a, b, c and data nodes d through i; labels mark main memory and secondary memory. Each part description record holds PART# (the primary key) and PART-Type. Records in primary blocks: 1 A, 3 B, 10 A; 11 C, 20 D; 30 D, 40 C, 45 A; 51 A, 55 D, 57 B; 65 E, 70 B, 101 C; 120 A, 150 D. Records on overflow chains: 50 D, 60 B, 61 A.]

Overview: ISAM File (2)

Compared with an ordered relative file
- Ordered on a key, like an ordered relative file
- Can be accessed by an index, a structure that contains information on where a record with a given key is located (usually intermingled with blocks of records)
- Tree search of an index replaces binary search of ordered relative files

Indexed Sequential Files

Block types
- Index block
- Primary data block
- Overflow data block

[Figure: an index block above a row of data blocks, with overflow data blocks attached to some of the data blocks.]

Indexed Sequential Files: Retrieval

Retrieve parts_file where part# = 60
- Primary key search: nodes R, a, b, g accessed
- 3 primary block accesses, 1 overflow block access

Retrieve parts_file where part# = 101 and part_type = C (overqualified)
- Primary key search: nodes R, a, c, h accessed
- 3 primary block accesses

Block "accesses" are really block fetches. The blocks may be in main-memory buffers, so that actual block accesses are not performed.

Indexed Sequential Files: Retrieval (2)

Retrieve parts_file where part# = 101 or part_type = C
- Scan: nodes R, d, e, f, g, h, i accessed
- 6 primary block "accesses", plus overflow block "accesses"

[Figure: Retrieval of indexed sequential structure. The same ISAM file as in the overview figure (root R, index nodes a, b, c, data nodes d through i), with access steps labeled 1, 2, 3.]

Indexed Sequential Files: Insertion

(Step 1) Locate, via key search, the data-level node in which to insert the record.

(Step 2) Determine whether the record is to be inserted into the primary block or into overflow, so that the primary key order of the records is maintained.

(Step 3a) If the record is to be placed in the primary block and the block is not full, shift all records with higher-valued primary keys to the right and place the new record into the vacated slot. STOP.

Indexed Sequential Files: Insertion (2)

(Step 3b) If the record is to be placed in the primary block and the block is full, move the record with the highest-valued primary key in the block to the front of the overflow chain (that is, move one record to the overflow chain). The primary block is now not full. Go to Step 3a.

(Step 4) If the record is to be placed in the overflow chain, place it in the appropriate position on the chain so that primary key sequencing is maintained. STOP.
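The insertion steps can be sketched for a single data-level node as follows. This is a minimal illustration, not the slides' own implementation: the `DataNode` class, the `insert` function, and the exact rule used for Step 2 (a record goes to the primary block if its key is below the block's highest key, or if the block has room and the key precedes the whole overflow chain) are assumptions made for the example. Primary keys are assumed unique.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Hypothetical data-level node: one primary block plus its overflow chain."""
    capacity: int                                  # slots in the primary block
    primary: list = field(default_factory=list)    # sorted (key, value) records
    overflow: list = field(default_factory=list)   # sorted overflow chain


def insert(node, key, value):
    """Insert a record and return the label of the step that placed it."""
    record = (key, value)
    # Step 2 (assumed rule): decide between primary block and overflow chain.
    if node.primary and key < node.primary[-1][0]:
        in_primary = True
    else:
        has_room = len(node.primary) < node.capacity
        before_chain = not node.overflow or key < node.overflow[0][0]
        in_primary = has_room and before_chain
    if not in_primary:
        bisect.insort(node.overflow, record)       # Step 4: keep chain ordered
        return "4"
    step = "3a"
    if len(node.primary) >= node.capacity:
        # Step 3b: highest-keyed primary record becomes first on the chain.
        node.overflow.insert(0, node.primary.pop())
        step = "3b"
    bisect.insort(node.primary, record)            # Step 3a: ordered insert
    return step


node = DataNode(capacity=3, primary=[(120, "A"), (150, "D")])
for key, part_type in [(130, "E"), (180, "C"), (110, "F"), (170, "G")]:
    print(key, "-> step", insert(node, key, part_type))
print(node.primary, node.overflow)
```

The four insertions follow the sequence shown in the insertion example (130, 180, 110, 170), so the step labels can be compared against it.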

Example: Insertion

[Figure: successive insertions into data node i, whose primary block holds three records.]
- Start: primary block (120 A, 150 D), overflow chain empty
- insert 130 E yields (step 3a): primary block (120 A, 130 E, 150 D)
- insert 180 C yields (step 4): overflow chain 180 C
- insert 110 F yields (step 3b): primary block (110 F, 120 A, 130 E), overflow chain 150 D, 180 C
- insert 170 G yields (step 4): overflow chain 150 D, 170 G, 180 C

Indexed Sequential Files: Deletion

(Step 1) Locate the record to delete by primary key search.

(Step 2) If the record is in the primary block, free its slot and shift all records in the block with higher-valued primary keys to the left. STOP.

(Step 3) If the record is in overflow, remove it from the overflow chain. STOP.
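A minimal sketch of the deletion steps, assuming a data node is represented by two Python lists of `(key, value)` records (a hypothetical representation chosen for illustration). Note that, as in the steps above, nothing is moved back from the overflow chain into the primary block.

```python
def delete(primary, overflow, key):
    """Delete the record with the given key; return the step label or None."""
    # Steps 1 and 2: search the primary block; deleting from a Python list
    # shifts the higher-keyed records left, which frees the last slot.
    for i, (k, _) in enumerate(primary):
        if k == key:
            del primary[i]
            return "2"
    # Step 3: otherwise unlink the record from the overflow chain.
    for i, (k, _) in enumerate(overflow):
        if k == key:
            del overflow[i]
            return "3"
    return None  # key not present in this node


primary = [(110, "F"), (120, "A"), (130, "E")]
overflow = [(150, "D"), (170, "G"), (180, "C")]
delete(primary, overflow, 120)
delete(primary, overflow, 150)
print(primary, overflow)
```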

Example: Deletion

[Figure: successive deletions from a data node.]
- Start: primary block (110 F, 120 A, 130 E), overflow chain 150 D, 170 G, 180 C
- remove 120 A yields: primary block (110 F, 130 E), overflow chain 150 D, 170 G, 180 C
- remove 150 D yields: primary block (110 F, 130 E), overflow chain 170 G, 180 C

Indexed Sequential Files: Update

(Step 1) Locate the record to update by primary key search.

(Step 2) If the primary key was not altered, simply replace the stored copy of the record with the updated copy. STOP.

(Step 3) If the primary key was altered, delete (remove) the located record and insert the updated record just as if it were a new record. STOP.

Indexed Sequential Files: Reorganization

- Read the records out of the old file in primary key order
- Build a new indexed sequential structure with no records in overflow (file creation)

Reorganization is really hectic!

Definitions
- Loading Factor = average number of records per node
- Initial Loading Factor = Loading Factor when the file is created

Example: Reorganization

[Figure: the ISAM file after reorganization, with labels for main memory and secondary memory. The root holds index keys 45 and 70; the data nodes hold at most two records each (for example 1 A, 3 B; 10 A, 11 C; 20 D, 30 D; 40 C, 45 A; 50 D, 51 A; 55 D, 57 B; 60 B, 61 A; 65 E, 70 B), and no records remain in overflow.]

Indexed Sequential Files: Creation

(Step 1) Using a specified initial loading factor LF, pack LF records per node and create the data level of the new indexed sequential file structure. (The last node on the data level will have from 1 to LF records in it.)

(Step 2) Build consecutive levels of index nodes until a level is reached where there is only a single node. The root node is created and is placed on the next higher level. Blocks of the index are to be packed as full as possible. STOP.
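The two creation steps can be sketched as below. The function name `build_isam`, the `index_fanout` parameter, and the choice of storing each child's highest key as its index entry are assumptions for illustration; the sketch also treats the first single-node index level as the root.

```python
def build_isam(sorted_keys, loading_factor, index_fanout):
    """Return a list of levels, data level first and root level last."""
    if not sorted_keys:
        raise ValueError("need at least one record")
    if loading_factor < 1 or index_fanout < 2:
        raise ValueError("loading_factor >= 1 and index_fanout >= 2 required")
    # Step 1: pack LF records per data node; the last node may hold fewer.
    data_level = [sorted_keys[i:i + loading_factor]
                  for i in range(0, len(sorted_keys), loading_factor)]
    levels = [data_level]
    # Step 2: build index levels, packed as full as possible, until one
    # level consists of a single node (taken here to be the root).
    entries = [node[-1] for node in data_level]    # highest key per child
    while True:
        index_level = [entries[i:i + index_fanout]
                       for i in range(0, len(entries), index_fanout)]
        levels.append(index_level)
        if len(index_level) == 1:
            return levels
        entries = [node[-1] for node in index_level]


keys = [1, 3, 10, 11, 20, 30, 40, 45, 50, 51, 55, 57,
        60, 61, 65, 70, 101, 120, 150]
for level in build_isam(keys, loading_factor=2, index_fanout=4):
    print(level)
```

A lower loading factor leaves free slots in each primary block, which postpones the use of overflow chains at the cost of more data nodes.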

10.2 Maintaining a Sequence Set

A sequence set (similar terms: ordered file, sequential set)
- a set of records in physical order by key

Sequence set + Simple Index ===> Simple Prefix B+ Tree

The Use of Blocks
- We want to rule out sorting and resorting of the sequence set
    - insertion of records into a block: overflow -> split
    - deletion of records: underflow -> redistribution, concatenation
- Costs of avoiding sorting
    - more space overhead (internal fragmentation in a block) -> redistribution in place of splitting, two-to-three splitting
    - the maximum guaranteed extent of physical sequentiality is within a block -> choice of block size

Block splitting & concatenation (1)

(a) Initial blocked sequence set
    Block 1: ADAMS... BAIRD... BIXBY... BOONE...
    Block 2: BYNUM... CARSON... COLE... DAVIS...
    Block 3: DENVER... ELLIS...

(b) Sequence set after insertion of the CARTER record: block 2 splits, and the contents are divided between blocks 2 and 4
    Block 1: ADAMS... BAIRD... BIXBY... BOONE...
    Block 2: BYNUM... CARSON... CARTER...
    Block 3: DENVER... ELLIS...
    Block 4: COLE... DAVIS...

(continued)

Block splitting & concatenation (2)

(c) Sequence set after deletion of the DAVIS record: block 4 is less than half full, so it is concatenated with block 3
    Block 1: ADAMS... BAIRD... BIXBY... BOONE...
    Block 2: BYNUM... CARSON... CARTER...
    Block 3: available for use
    Block 4: COLE... DENVER... ELLIS...
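The split and concatenation behaviour in this example can be sketched as follows. This is a simplified model under stated assumptions: the sequence set is a Python list of blocks kept in logical key order (physical block numbers and the list of free blocks are not modelled), `CAPACITY` is a hypothetical block size of four records, and the function names are illustrative.

```python
import bisect

CAPACITY = 4  # records per block (assumed for this example)


def find_block(blocks, key):
    """Position of the first block whose last key is >= key (else the last)."""
    for i, block in enumerate(blocks):
        if block and key <= block[-1]:
            return i
    return len(blocks) - 1


def insert(blocks, key):
    i = find_block(blocks, key)
    bisect.insort(blocks[i], key)
    if len(blocks[i]) > CAPACITY:                  # overflow -> split
        mid = (len(blocks[i]) + 1) // 2
        blocks[i:i + 1] = [blocks[i][:mid], blocks[i][mid:]]


def delete(blocks, key):
    i = find_block(blocks, key)
    blocks[i].remove(key)                          # ValueError if key is absent
    if len(blocks[i]) * 2 < CAPACITY and len(blocks) > 1:   # underflow
        j = i + 1 if i + 1 < len(blocks) else i - 1         # pick a neighbour
        lo, hi = min(i, j), max(i, j)
        merged = blocks[lo] + blocks[hi]
        if len(merged) <= CAPACITY:                # concatenation
            blocks[lo:hi + 1] = [merged]
        else:                                      # redistribution
            mid = len(merged) // 2
            blocks[lo], blocks[hi] = merged[:mid], merged[mid:]


blocks = [["ADAMS", "BAIRD", "BIXBY", "BOONE"],
          ["BYNUM", "CARSON", "COLE", "DAVIS"],
          ["DENVER", "ELLIS"]]
insert(blocks, "CARTER")
delete(blocks, "DAVIS")
print(blocks)
```

Because neither operation touches more than two blocks, the sequence set stays ordered without ever sorting the whole file.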

Issue: Choice of Block Size

- Block: the basic unit for I/O
- The maximum guaranteed extent of physical sequentiality
- Two considerations
    - several blocks should fit in RAM at once, e.g. for a split or concatenation, at least two blocks in RAM
    - reading or writing a block should not take very long
- Cluster
    - the minimum number of sectors allocated at a time
    - the minimum size of a file
- Reasonable suggestion: block size == cluster size
    - can access a block without seeking within a cluster

10.3 Adding a Simple Index to the Sequence Set (1)

- An efficient way to locate the specific block containing a particular record, given the record's key
    - build index records containing the key for the last record in each block
- Possible index structures
    - simple index
        - binary search of the index works well while the entire index is in RAM
    - B+ tree
        - B-tree index + a sequence set with actual records

Sequence of blocks
    Block 1: ADAMS-BERNE
    Block 2: BOLEN-CAGE
    Block 3: CAMP-DUTTON
    Block 4: EMBRY-EVANS
    Block 5: FABER-FOLK
    Block 6: FOLKS-GADDIS

Simple index
    Key       Block number
    BERNE     1
    CAGE      2
    DUTTON    3
    EVANS     4
    FOLK      5
    GADDIS    6
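A lookup in this simple index can be sketched with a binary search over the last keys of the blocks. The list names and the `find_block` function are illustrative; the sketch assumes the whole index fits in RAM, as noted for the simple index.

```python
import bisect

# Simple index from the example: last key of each block and its block number.
index_keys = ["BERNE", "CAGE", "DUTTON", "EVANS", "FOLK", "GADDIS"]
block_numbers = [1, 2, 3, 4, 5, 6]


def find_block(key):
    """Block that would hold `key`, or None if key is past the last block."""
    # First index entry whose (last-in-block) key is >= the search key.
    i = bisect.bisect_left(index_keys, key)
    if i == len(index_keys):
        return None
    return block_numbers[i]


print(find_block("BOLEN"))   # 2
print(find_block("FOLKS"))   # 6
```

The result only names the block to read; whether the record actually exists is decided by searching inside that block.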

10.4 The Content of the Index: Separators Instead of Keys

- The index set does not need to contain actual keys
- What we really need are separators
- Separator: distinguishes between two blocks
    - among the many candidates, the shortest separator is preferable
    - there is not always a unique shortest separator

Separators between blocks in the sequence set
    Block 1: ADAMS-BERNE
        separator: BO
    Block 2: BOLEN-CAGE
        separator: CAM
    Block 3: CAMP-DUTTON
        separator: E
    Block 4: EMBRY-EVANS
        separator: F
    Block 5: FABER-FOLK
        separator: FOLKS
    Block 6: FOLKS-GADDIS

A list of potential separators between CAMP-DUTTON and EMBRY-EVANS
    DUTU, DVXGHSJF, DZ, E, EBQX, ELEEMOSYNARY
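One way to compute a shortest separator is to take the shortest prefix of the first key in the right-hand block that still compares greater than the last key in the left-hand block. The sketch below assumes that convention (a separator s with left_last < s <= right_first) and plain string comparison; the function name is illustrative.

```python
def shortest_separator(left_last, right_first):
    """Shortest prefix of right_first that is greater than left_last."""
    for n in range(1, len(right_first) + 1):
        prefix = right_first[:n]
        if prefix > left_last:
            return prefix
    raise ValueError("keys must be distinct and in ascending order")


boundaries = [("BERNE", "BOLEN"), ("CAGE", "CAMP"), ("DUTTON", "EMBRY"),
              ("EVANS", "FABER"), ("FOLK", "FOLKS")]
print([shortest_separator(a, b) for a, b in boundaries])
# ['BO', 'CAM', 'E', 'F', 'FOLKS']
```

When two adjacent keys share a long prefix, as with FOLK and FOLKS, the shortest separator can be as long as the key itself.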

10.5 The Simple Prefix B+ Tree

- Index like a B-tree + blocks of the sequence set
- The use of simple prefixes
    - prefixes of the keys rather than actual keys
    - contains shortest separators
    - N separators -> N+1 children
- Properties of the B+ tree
    - B-tree-like index
    - Sequential data set
    - Indexed-sequential file

[Figure: A B-tree index set for the sequence set, forming a simple prefix B+ tree]
    Index set, root:          (E)
    Index set, second level:  (BO, CAM)   (F, FOLKS)
    Sequence set:
        Block 1: ADAMS-BERNE
        Block 2: BOLEN-CAGE
        Block 3: CAMP-DUTTON
        Block 4: EMBRY-EVANS
        Block 5: FABER-FOLK
        Block 6: FOLKS-GADDIS
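Searching this tree can be sketched as below. The nested-dictionary representation and the function name are assumptions for illustration; the search rule is that a key equal to or greater than a separator goes to the right of it, so a node with N separators selects among N+1 children.

```python
import bisect

# Index set from the figure; leaves are sequence set block numbers.
tree = {
    "separators": ["E"],
    "children": [
        {"separators": ["BO", "CAM"], "children": [1, 2, 3]},
        {"separators": ["F", "FOLKS"], "children": [4, 5, 6]},
    ],
}


def find_block(node, key):
    """Descend the index set and return the sequence set block number."""
    while isinstance(node, dict):
        # Number of separators <= key is the index of the child to follow.
        child = bisect.bisect_right(node["separators"], key)
        node = node["children"][child]
    return node


for key in ["BERNE", "CAGE", "CAMP", "EMBRY", "FOLK", "FOLKS"]:
    print(key, "-> block", find_block(tree, key))
```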

10.6 Simple Prefix B+ Tree Maintenance (1)

Changes localized to single blocks in the sequence set
- deletion without concatenation or redistribution
    - e.g. delete EMBRY, FOLKS
- insertion without splitting
    - e.g. insert EATON

[Figure: Deletion of EMBRY and FOLKS from the sequence set; the index set is unchanged]
    Index set, root:          (E)
    Index set, second level:  (BO, CAM)   (F, FOLKS)
    Sequence set:
        Block 1: ADAMS-BERNE
        Block 2: BOLEN-CAGE
        Block 3: CAMP-DUTTON
        Block 4: ERVIN-EVANS
        Block 5: FABER-FOLK
        Block 6: FROST-GADDIS

10.6 Simple Prefix B+ Tree Maintenance (2)

Changes involving multiple blocks in the sequence set
- split, concatenation: propagate to the index set
    - change the number of blocks in the sequence set
    - change the number of separators
    - change the index set
- insertion with splitting
    - e.g. overflow in block 1: split into block 1 and block 7, with separator AY
- deletion with concatenation/redistribution
    - e.g. underflow in block 2: concatenation of block 2 and block 3

[Figure: An insertion into block 1 causes a split and the consequent addition of block 7]
    Index set, root:          (BO, E)
    Index set, second level:  (AY)   (CAM)   (F, FOLKS)
    Sequence set:
        Block 1: ADAMS-AVERY
        Block 7: AYERS-BERNE
        Block 2: BOLEN-CAGE
        Block 3: CAMP-DUTTON
        Block 4: ERVIN-EVANS
        Block 5: FABER-FOLK
        Block 6: FROST-GADDIS

[Figure: A deletion from block 2 causes underflow and the consequent concatenation of blocks 2 and 3]
    Index set, root:          (E)
    Index set, second level:  (AY, BO)   (F, FOLKS)
    Sequence set:
        Block 1: ADAMS-AVERY
        Block 7: AYERS-BERNE
        Block 2: BOLEN-DUTTON
        Block 4: ERVIN-EVANS
        Block 5: FABER-FOLK
        Block 6: FROST-GADDIS

Bottom-up procedure to handle changes

Insert/delete in the sequence set as if there were no B-tree index set
- if blocks are split, a new separator must be inserted into the index set
- if blocks are concatenated, a separator must be removed from the index set
- if records are redistributed between blocks, the value of a separator in the index set must be changed
- otherwise, no propagation to the index set
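The three propagation rules can be sketched against a flattened view of the index set: one ordered list of separators and one ordered list of block numbers, with one more block than separators. This flat representation and the function names are simplifying assumptions; in the actual tree the same insertions and removals happen inside index set blocks, which may themselves split or concatenate.

```python
def on_split(separators, blocks, pos, new_block, separator):
    """The block at position pos split; new_block holds its upper half."""
    separators.insert(pos, separator)   # a new separator enters the index set
    blocks.insert(pos + 1, new_block)


def on_concatenate(separators, blocks, pos):
    """The block at position pos + 1 was merged into the block at pos."""
    del separators[pos]                 # one separator leaves the index set
    del blocks[pos + 1]


def on_redistribute(separators, pos, separator):
    """Records moved between the blocks at pos and pos + 1."""
    separators[pos] = separator         # only the separator value changes


separators = ["BO", "CAM", "E", "F", "FOLKS"]
blocks = [1, 2, 3, 4, 5, 6]
on_split(separators, blocks, 0, new_block=7, separator="AY")
on_concatenate(separators, blocks, 2)   # blocks 2 and 3 are concatenated
print(separators, blocks)
```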

10.7 Index Set Block Size

- size of an index node for the index set == size of a data block in the sequence set
- Reasons for using a common block size
    - the best size for the sequence set is usually the best for the index set
    - a common block size makes it easier to implement a buffering scheme
    - the index set blocks and sequence set blocks are often mingled within the same file, to avoid seeking between separate files while accessing the simple prefix B+ tree

10.8 Internal Structure of Index Set Blocks: A Variable-Order B-Tree

- Variable-length shortest separators
    - possibility of packing them into a node
    - separator index (fixed length): a means of performing binary searches on a list of variable-length entities
- A simple prefix B+ tree with a variable order
    - not maximum order -> not minimum depth
    - decisions about when to split, concatenate, or redistribute become more complicated

Separators: As, Ba, Bro, C, Ch, Cra, Dele, Edi, Err, Fa, Fle

Variable-length separators and corresponding index
    Concatenated separators:  AsBaBroCChCraDeleEdiErrFaFle
    Index to separators:      00 02 04 07 08 10 13 17 20 23 25

Structure of an index set block
    Separator count:             11
    Total length of separators:  28
    Separators:                  AsBaBroCChCraDeleEdiErrFaFle
    Index to separators:         00 02 04 07 08 10 13 17 20 23 25
    Relative block numbers:      B00 B01 ..... B10 B11
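The role of the fixed-length separator index can be sketched as follows: the separators are stored as one concatenated string, and a table of offsets makes binary search over the variable-length entries possible. The `IndexBlock` class and its method names are hypothetical, and the example uses a three-letter last separator (Fle), the length that is consistent with the offset 25 and the total length 28 given for the block.

```python
class IndexBlock:
    """In-memory model of an index set block (illustrative only)."""

    def __init__(self, separators, children):
        assert len(children) == len(separators) + 1
        self.text = "".join(separators)   # concatenated separators
        self.offsets = []                 # index to separators
        pos = 0
        for s in separators:
            self.offsets.append(pos)
            pos += len(s)
        self.total_length = pos           # total length of separators
        self.children = children          # relative block numbers

    def separator(self, i):
        """Slice the i-th separator out of the concatenated text."""
        end = (self.offsets[i + 1] if i + 1 < len(self.offsets)
               else self.total_length)
        return self.text[self.offsets[i]:end]

    def child_for(self, key):
        """Binary search: count separators <= key, follow that child."""
        lo, hi = 0, len(self.offsets)
        while lo < hi:
            mid = (lo + hi) // 2
            if key < self.separator(mid):
                hi = mid
            else:
                lo = mid + 1
        return self.children[lo]


seps = ["As", "Ba", "Bro", "C", "Ch", "Cra", "Dele", "Edi", "Err", "Fa", "Fle"]
block = IndexBlock(seps, [f"B{i:02d}" for i in range(12)])
print(block.offsets, block.total_length)
print(block.child_for("Cat"))
```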

10.9 Loading a Simple Prefix B+ Tree (1)

- One way is successive insertions and splits
- The other way is to use a separate loading process
    - work from a sorted file and place the records into a sequence set block
    - when a block is full
        - determine the separator and insert it into the index set block
        - place the following records into a new sequence set block

10.9 Loading a Simple Prefix B+ Tree (2)

Advantages of using a separate loading process
- the output can be written sequentially
- simpler than successive insert & split
- performance during loading
- can load to 100% utilization (cf. insert & split, which produces blocks between 67% and 80% full)
- creating a degree of spatial locality
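A separate loading process can be sketched as a single sequential pass over sorted keys. This is a simplified illustration: `bulk_load` and `shortest_separator` are hypothetical names, blocks are filled to 100%, and the separators are collected in one flat list rather than being packed into index set blocks level by level.

```python
def shortest_separator(left_last, right_first):
    """Shortest prefix of right_first that is greater than left_last."""
    for n in range(1, len(right_first) + 1):
        if right_first[:n] > left_last:
            return right_first[:n]
    raise ValueError("keys must be distinct and in ascending order")


def bulk_load(sorted_keys, block_capacity):
    """Return (sequence set blocks, separators between consecutive blocks)."""
    blocks, separators, current = [], [], []
    for key in sorted_keys:
        if len(current) == block_capacity:
            # The block is full: write it out, derive the separator between
            # it and the next block, and start a new sequence set block.
            blocks.append(current)
            separators.append(shortest_separator(current[-1], key))
            current = []
        current.append(key)
    if current:
        blocks.append(current)
    return blocks, separators


keys = ["ADAMS", "BERNE", "BOLEN", "CAGE", "CAMP", "DUTTON", "EMBRY", "EVANS"]
blocks, separators = bulk_load(keys, block_capacity=2)
print(blocks)
print(separators)
```

Each block is produced exactly once and in key order, which is why the output can be written sequentially.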

10.10 B+ Trees

- Contains copies of actual keys
    - cf. simple prefix B+ tree: separators

[Figure: an index set block of a B+ tree holding copies of actual keys, ALWAYS/ASPECT/BETTER, with index 00 06 12. Sequence set blocks: ACCESS-ALSO, ALWAYS-ASK, ASPECT-BEST, BETTER-CAST. Next separator: CATCH. Next sequence set block: CATCH-CHECK.]

10.11 B-Trees, B+ Trees, and Simple Prefix B+ Trees in Perspective

Shared characteristics
- Paged index structures: broad and shallow
- Height-balanced
- Growing from the bottom up
- Possible to obtain greater storage efficiency through two-to-three block splitting, concatenation, and redistribution
- Can be implemented as virtual tree structures
- Can be adapted for variable-length records

B-Trees

General characteristics
- Information can be found at any level of the B-tree
- B-trees take up less space than B+ trees (a B+ tree uses additional space)

Ordered sequential access
- Through in-order traversal of the tree (a virtual tree is necessary)
- Keeping the records in a separate file (with the B-tree holding only pointers) is not workable

B+ Trees

General characteristics
- Separation of index set and sequence set
- Separators: copies of keys
- Shallower tree than a B-tree

Ordered sequential access
- The sequence set is truly linear
- Efficient access to records in order by key

Simple Prefix B+ Trees

General characteristics
- Separators: smaller than actual keys
- Shallower than B+ trees
- Separator compression and variable-length field management add overhead

Ordered sequential access
- The sequence set is truly linear (same as the B+ tree)

Let's Review!

10.1 Indexed Sequential Access
10.2 Maintaining a Sequence Set
10.3 Adding a Simple Index to the Sequence Set
10.4 The Content of the Index: Separators Instead of Keys
10.5 The Simple Prefix B+ Tree
10.6 Simple Prefix B+ Tree Maintenance
10.7 Index Set Block Size
10.8 Internal Structure of Index Set Blocks: A Variable-Order B-Tree
10.9 Loading a Simple Prefix B+ Tree
10.10 B+ Trees
10.11 B-Trees, B+ Trees, and Simple Prefix B+ Trees in Perspective