70
Wolf-Tilo Balke Jan-Christoph Kalo Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de Relational Database Systems 2 3. Indexing and Access Paths

Relational Database Systems 2 - TU Braunschweig · 3.4 Physical Tuning Relational Database Systems 2 –Wolf-Tilo Balke–Institut für ... or rollback segments ... –Storing related

  • Upload
    haanh

  • View
    219

  • Download
    0

Embed Size (px)

Citation preview

Wolf-Tilo Balke

Jan-Christoph Kalo

Institut für Informationssysteme

Technische Universität Braunschweig

http://www.ifis.cs.tu-bs.de

Relational Database Systems 23. Indexing and Access Paths

3.1 Introduction to access paths

3.2 Files and blocks

3.3 Indexing

– Single-level indexes

– Multi-level indexes

– Hash indexes

3.4 Physical Tuning

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 2

3 Indexing and Access Paths

• Databases persistently store data

• Major problem: efficiency of data access

– Obviously depends on the hardware used

• But also depends to a large degree

– On the allocation of data on the disks

– On intelligent buffer management

– On creating access paths and indexing data

• Physical tuning of the database is a main task of

database administrators

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 3

3.1 Introduction to Access Paths

• For persistent storage databases are mapped into a

number of files

– Located in specially protected parts of the file system

(tablespaces, etc.)

– Actually maintained by the operating system

• Each file is partitioned into fixed length blocks (pages)

– Smallest unit transferred from/to storage

– Block size is specified on DB creation

– Is a multiple of the OS block size

– A block usually contains several data records

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 4SKS 10.5

3.1 Introduction to Access Paths

• Harddisk Sectors are abstracted by the file system to blocks

• DBMS abstracts FS blocks to DBMS blocks

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 5

3.1 Sectors and Blocks

DISK

FS Block 1 DBMS Block 1

DBMS Block 2

DBMS Block 5

DBMS Block 3

DBMS Block 4

DBMS Block 6

FS Block 2

FS Block 3

FS Block 4

FS Block 5

FS Block 6

• For data access always complete blocks (pages) are transferred from disk into the main memory

– The part used for holding copies of blocks is referred to as DB buffer and managed by the buffer manager

– Optimized (pre-)fetching of blocks greatly improves performance

• If a data record is requested by the DB

– If the block is buffered, return main memory address

– Else allocate space in the buffer, fetch the block from disk, and return main memory address

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 6SKS 10.5

3.1 Buffer Management

• Once new blocks are fetched, generally currently buffered blocks have to be evicted

– If blocks have been modified, they must be written back to disk (which is not always possible)

– Writing of blocks depends on the recovery strategy:

• Blocks that cannot yet be written are called pinned blocks

• Before checkpoints there might be a forced output of blocks

• Several buffer replacement strategies can be applied

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 7SKS 10.5

3.1 Buffer Management

• Least recently used (LRU): the ‘oldest’ block is

replaced

– Usually used in OS buffer management

– Simple, yet effective strategy

– Does not use semantics of the data

• Toss immediate: After the DB has finished to process

a block, the block is immediately replaced

– Example: block-nested loop join between tables R, S

• Take one block from R and compare against all blocks in S

• The block from R can be thrown out, but no block from S

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 8SKS 10.5

3.1 Replacement Strategies

• Expected Reuse: statistics about query frequencies

assess how useful a buffered block is

– The block with least expected usefulness is evicted

– Example: index blocks are more often

addressed than data blocks

• In any case, whatever the strategy:

Once a block has been modified and has not yet been

written to disk, it cannot be evicted!

– Note: recovery subsystem has to agree before writing a block

to disk

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 9SKS 10.5

3.1 Replacement Strategies

• Overhead & Payload (e.g., ORACLE data

blocks)

– Header contains general block information like

block address, type of segment (data, index,…)

– Table directory contains

tables that have rows

in the block

– Row directory contains

information about the

actual rows in the block

(IDs, addresses,…)

– Row data is actual data

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 10

3.2 Blocks (Pages)

• An Extent is a logical collection

of blocks (usually adjacent)

– Fixed size, once more data space is

needed for rows a new extent is

allocated

– Remember: reading adjacent

blocks improves access time

• Segments are collections of

logically connected extents

– For instance tables, index segments,

or rollback segments

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 11

3.2 Extents and Segments

• A tablespace is the logical storage space needed

for the data in a table

– A grouping of multiple files allocated on disk

• Example: Data1.ora, system.ora, test.dbf

– Good practice to have one tablespace for tables, and a

different one for indexes

CREATE TABLESPACE user_data

DATAFILE ‘udata.ora’ SIZE 10M

EXTENT MANAGEMENT LOCAL

SEGMENT SPACE MANAGEMENT AUTO

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 12

3.2 Tablespaces

• Records of different relations should be stored in

individual files

– Storing related records in the same block minimizes

disk accesses

• Multitable clustering file organization may

store records of different tables in the same block

– Good e.g. for some often occurring joins

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 13

3.2 File Organization

• Files reserve space in the file system

– File size can be specified or change dynamically

– Default values strongly differ (e.g., DB2 allocates 100 MB)

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 14

3.2 File Organization

Schema file

Index file with index blocks

Files related to table “content_type_lecture”

Data file with data blocks

• Organization of records in the file

– Heap – a record can be placed anywhere in the file

where there is space

– Sequential – store records in sequential order, based

on the value of the search key of each record

– Hashing – a hash function is computed on some

attribute of each record. The result specifies in which

block of the file the record should be placed

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 15

3.2 File Organization

• Data records have to be written in a file such that the

entire record can be accessed with minimal disk accesses

– Fixed length records (easy to implement)

– Variable length records (storage space efficient)

TYPE deposit = RECORD

name : CHAR(22);

accNo : CHAR(10);

balance : REAL;

END

– If each char is 1 byte and a real

8 bytes, a record takes 40 bytes

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 16SKS 10.6

3.2 File Organization

record Name AccNo Balance

0 Burg 864442 654,55

1 Myers 967531 56,45

2 Smith 145288 457,75

• Fixed length records

– 40 bytes for the first record, 40 bytes for the second,…

– Problem: if the block size is not a multiple of 40, records may

cross boundaries

– Problem: deleting a record can be either done by marking it

as deleted, or by replacing it with some other record of the file

• Keeping deleted items? Reading slow!

• Move up all other items? Efficiency!

• Fill space with next inserted item?

Changes sequence!

• Pointer lists? Wastes space in each

record!

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 17SKS 10.6

3.2 File Organization

record Name AccNo Balance

0 Burg 864442 654,55

1 Myers 967531 56,45

2 Smith 145288 457,75

• Variable length records

– Necessary for multi-table files or records that allow variable

length attributes

– Problem: How to know when a record ends?

• Introduce ‘end-of-record’ symbols or store record length at

beginning of the record. But what if a record is updated?

• Slotted page structure: header contains number of entries, end of

free space and pointers to location/size of entries. Records are

moved to use

up space: no

fragmentation!

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 18SKS 10.6

3.2 File Organization

• Typical database records like in our banking example

easily fit into one block

– Name, account number, balance,…

– Usually multiple records per block

• Unspanned record organization

fills blocks only with

complete records

• Spanned organization uses

pointers to divide records

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 19EN 4.4

3.2 File Organization

i

i+1

record 1 record 2 record 3

record 4 record 5

i

i+1

record 1 record 2 record 3 4

record 54

p

• Necessary for large objects: binary large objects

(blobs) and character large objects (clobs)

• Large objects are not interpreted in databases

– Text documents, images, audio and video data

– Need to be stored in a contiguous sequence of bytes

– If an object is bigger than a block, contiguous pages

of the buffer pool have to be allocated for storage

– Sometimes preferable to

disallow direct access, but

only allow access through

file-system-like API to allow

for fragmentation

20

3.2 Non-Standard Blocks

• Indexes help to locate records in a DB file

– Creation of indexes is part of the physical tuning task of

database administrators

– Indexes often influence the actual

location of storage for a record

• Example: sequential storage,

storage via a hash function

• If the location is determined by

the index

– Not all attributes can

be directly indexed (but secondary

access paths may be used)

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 21

3.3 Indexing

• When items have to be found quickly, indexing

mechanisms are used to

speed up access

– Alphabetical author

catalog in libraries

– Lexicographic ordering

– …

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 22

3.3 Basic Concepts

• Ordered by indexing field (search key)

– Attribute(s) used to look up records in a file

• An index file consists of index entries

– Records of the form

• Two basic kinds of queries

– Exact matches

• Locating a record(s) with a certain value

– Range queries

• Locating all records in a certain value range

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 23

3.3 Basic Concepts

search key record location(s)

• Two basic kinds of indexes

– Ordered indexes: search keys are stored in sorted

order

• Single-level ordered indexes

• Multi-level ordered indexes

– Hash indexes: search keys are distributed uniformly

across blocks using some hash function

• Hash function distributes records uniformly over blocks

• Hashed value of search key decides for storage block

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 24

3.3 Basic Concepts

• Index links indexing fields to logical file/block locations

– Dense indexes index all items, sparse indexes only selected ones

– Much smaller file than the data file

• Hence, a binary search on the index file requires fewer block accesses than a binary search on the data file

– Works exactly like a book’s index

• Only few pages at beginning/end, alphabetically ordered, references to respective page(s)

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 25

3.3 Single-Level Ordered Indexes

• Primary indexes

– Order data by some usually unique attribute as indexing field

(primary key), store database records in this order

– Index record contains pointer to the respective storage place

(block address)

• To save entries usually there

is only a single index entry

for each block (block anchor)

– But sparse indexes do not directly

show, whether some record is in the DB

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 26EN 14.1

3.3 Single-Level Ordered Indexes

AccNo 1 1, Adams, $ 887,00

2, Bertram, $19,99AccNo 4

AccNo 73, Behaim, $ 167,00

4, Cesar, $ 1866,00

5, Miller, $179,99

6, Naders, $ 682,56

7, Ruth, $ 8642,78

8, Smith, $675,99

9, Tarrens, $ 99,00

– Advantages

• Number of blocks needed for storing the index is small

compared to data

– Not all records are indexed (non-dense)

– Index entries are smaller than data records

• Can often be kept in buffer

– Disadvantages

• Insertions and Deletions need to move data

in storage and to update index entries

affected by the shifts

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 27

3.3 Single-Level Ordered Indexes

• Clustering indexes store data records in the

order of a non-unique indexing field

– One entry in the index per distinct value of the

indexing field

– Search keys are linked to the block address of the

containing the search key

– Is a sparse index

• But existence of records with a certain key can be assessed

by index look-up

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 28

3.3 Single-Level Ordered Indexes

• Still problems with insertion/deletion

– Often one or more complete

blocks are reserved for each

single search key to allow for

block anchors

– Blocks do not have to be

adjacent, but can use block

pointers, if more space is needed

• Structurally very similar to a hash index

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 29

3.3 Single-Level Ordered Indexes

Adams 1, Adams, $ 887,00

2, Adams, $19,99Miller

Tarrans

4, Miller, $ 1866,00

5, Miller, $179,99

6, Miller, $ 682,56

7, Miller, $ 8642,78

8, Tarrens, $ 856,78

• Secondary indexes point to locations of

records regarding a non-ordering attribute

– Indexing does not affect the storage order

– There can be multiple secondary indexes for the same

DB file

• Secondary indexes are usually dense

– Objects with same or adjacent values are usually not

adjacent on disk

– If the indexing field has unique values (secondary

key) definitely all records have to be indexed

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 30

3.3 Single-Level Ordered Indexes

• If the indexing field is not unique there are several possibilities to create a secondary index

– Create a dense index by including duplicate search keys (one for each record)

– Use variable-length index entries, where each search key is assigned a list of pointers

– Keep fixed-length index entries, but point to a blockcontaining (multiple) pointers to the actual records

• Introduces a level of indirection, but allows for a sparse index

• Usually used in practice

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 31

3.3 Single-Level Ordered Indexes

• Dense secondary index or sparse with indirection

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 32

3.3 Single-Level Ordered Indexes

Adams

1, Cesar, $ 887,00

2, Tarrens, $19,99Cesar

Miller

4, Orwell, $ 1866,00

5, Miller, $179,99

6, Tarrens, $ 682,56

7, Post, $ 8642,78

8, Adams, $ 856,78

Miller

3, Miller, $19,99

9, Snyder, $ 856,78

Orwell

Post

Tarrens

Tarrens

Snyder

Adams

1, Cesar, $ 887,00

2, Tarrens, $19,99Cesar

Miller

4, Orwell, $ 1866,00

5, Miller, $ 179,99

6, Tarrens, $ 682,56

7, Post, $ 8642,78

8, Adams, $ 856,78

3, Miller, $19,99

9, Snyder, $ 856,78

Orwell

Post

Tarrens

Snyder

p

p

p

p

p

p

p

p

p

• Characteristics of secondary indexes

– Speeds up retrieval, because if it does not exist, the

entire file would have to be scanned linearly

• For non-existent primary indexes files can still be scanned in

a binary search fashion

– Uses more search time and space, because it is dense

– Secondary indexes provides a logical ordering

• Accessing records in that order might not be the most

efficient way regarding block accesses

• Each record access may fetch a new block into the buffer

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 33

3.3 Single-Level Ordered Indexes

• Improving secondary indexes for range queries

– Same block may be (unnecesarrily) accessed multiple times

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 34

3.3 Single-Level Ordered Indexes

Adams

1, Cesar, $ 887,00

2, Tarrens, $19,99Cesar

Miller

4, Orwell, $ 1866,00

5, Miller, $179,99

6, Tarrens, $ 682,56

7, Post, $ 8642,78

8, Adams, $ 856,78

Miller

3, Miller, $19,99

9, Snyder, $ 856,78

Orwell

Post

Tarrens

Tarrens

Snyder

• Simple Example: Cesar-Orwell(buffer holds only 1 block) Cesar: fetch first block Miller: evict first block,

fetch second block Miller: evict second

block, fetch first block Orwell: evict first block,

fetch second block

• Improving secondary indexes for range queries

– Remember: working on blocks in main memory is

cheap, fetching blocks from disk is expensive!

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 35

3.3 Single-Level Ordered Indexes

Adams 1, Cesar, $ 887,00

2, Tarrens, $19,99Cesar

Miller

4, Orwell, $ 1866,00

5, Miller, $179,99

6, Tarrens, $ 682,56

7, Post, $ 8642,78

8, Adams, $ 856,78

Miller

3, Miller, $19,99

9, Snyder, $ 856,78

Orwell

Post

Tarrens

Tarrens

Snyder

• Improved Example: Cesar-Orwell• Fetch ‘Cesar’ and scan whole block

• Output ‘1’ & ‘3’• Mark block as completed and

evict• Fetch ‘Miller’ and scan whole block

• Output ‘4’ & ‘5’• Mark block as completed and

evict• Skip second ‘Miller’• Skip ‘Orwell’

Done

• Short Summary

– Primary and cluster indexes affect storage order, secondary indexes don’t

– Primary and clustering indexes may be sparse, secondary indexes are usually dense

• Primary and clustering indexes can use block anchors

– Index sizes

• Primary index – number of blocks

• Clustering index – number of distinct search keys

• Secondary index – up to number of records in table

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 36

3.3 Single-Level Ordered Indexes

• Example: What if a primary index does not fit in

main memory?

– Bad look-up efficiency!

• Simple multi-level solution

– Treat primary index on disk as a sequential file and

construct a sparse index on it

• outer index – sparse index of primary index

• inner index – primary index file

– If even outer index is too large to fit in main memory,

yet another level of index can be created, and so on

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 37

3.3 Multi-Level Ordered Indexes

• Multi-level indexes can further improve search

speed

– In single level indexes a binary search can be applied

to locate pointers to blocks

• Efficiency of log2N

– Multi-level indexes allow for higher search efficiency

• Fan-out of the index should be higher than 2

• Efficiency of logfan-outN

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 38EN 14.2

3.3 Multi-Level Ordered Indexes

• Concept of Multi-Level Indexes can be applied to primary,

clustering and secondary indexes as long as values are distinct

• Example: 2-Level Primary

Index

– Level 1 is Primary Index

– Level 2 is Primary Index of

level 1

– Efficiency of logfan-outN

• fan-out: Number of 1st level

blocks per 2nd level block

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 39EN 14.2

3.3 Multi-Level Ordered Indexes

1

1, Adams, $ 887,00

2, Bertram, $19,99

3, Behaim, $ 167,00

4, Cesar, $ 1866,00

5, Miller, $179,99

6, Naders, $ 682,56

7, Ruth, $ 8642,78

8, Smith, $675,99

9, Tarrens, $ 99,00

10, Arnold, $ 3442,76

11, Marks, $435,19

12, Black, $ 319,44

3

5

7

9

11

1

7

11

Level 1

Level 2

• There is a huge amount of different tree

structures for multi-level indexing

– Classical structures B-tree, B*-tree, etc.

– Multidimensional structures R-trees, Quad-trees, etc.

– Lots of literature

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 40

3.3 Multi-Level Ordered Indexes

• Index Structure based on Hashing– Idealized Access Speed: O(1)

• Basic Idea: – Fixed-Size directly addressable Index Space [0..M] containing

M+1 buckets• Buckets contain links to data

• Single link in internal hashing (i.e. in memory)

• Multiple links for external hashing (i.e. on harddisk)

– Hash Function h: ℤ → [0..M]• Maps any number to [0..M]

• Should be uniform (i.e.: same probability for any x ∈ [0..M])– Especially, should also be surjective (i.e. use the full range of [0..M])

• Needs to be deterministic (i.e.: same input → same output)

• Should be simple and fast

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 41EN 13.8.1

3.3 Hash Indexes

• Basic Idea– Convert value which is to be indexed to numeric representation

– Hash the value

– Store block anchor of value in the bucket with computed hash index

• Example: M=8

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 42EN 13.8.1

3.3 Hash Indexes

1, Cesar, $ 887.00

5, Miller, $179.99

9, Snyder, $ 856.78

Miller

Cesar

Snyder

h(Miller) = 4

h(Cesar) = 7

H(Snyder) = 1

0

1

2

3

4

5

6

7

8

• Problem: Collision– Hash functions are not injective – collision may happen

• Block pointers for full blocks (overflow list)

– Probability of collision increases with load factor• Deteriorates to linear behavior in worst case

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 43EN 13.8.1

3.3 Hash Indexes

1, Cesar, $ 887.00

5, Miller, $179.99

9, Snyder, $ 856.78

Miller

Cesar

Snyder

h(Miller) = 4

h(Cesar) = 1

h(Snyder) = 1

0

1

2

3

4

5

6

7

8

Smith h(Smith) = 1

Rogers h(Rogers) = 1

3, Rogers, $ 1866.58

7, Smith, $ 0.29

• Problem: Static Address Range– Addressable range is fixed to [0..M]

– What happens if address range is exhausted?• E.g., load factor too high, too many collisions occur

– Possible solutions• Rehashing :

– Create a new larger hashmap and rehash all values

– For example, java hashmap follow that policy

• Extendible Hashing:– Uses dynamic directory to double and half number of buckets

• Linear Hashing:– Buckets are split into two one after another and are individually

rehashed with an additional hash-function

– Not in this lecture

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 44EN 13.8.3

3.3 Hash Indexes

• Extendible Hashing

– Hash values are positive integers between [0..2n]

• Numbers can be represented in binary

• Uses a so-called trie structure

• Choose large n

– Create a directory of depth d with 2d entries for the first d bits of values

• d is global depth of directory

• Directory cells link to buckets containing links to data with hash value starting with given bit pattern– Adjacent cells may link to the same bucket depending on local

depth of bucket

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 45EN 13.8.3

3.3 Hash Indexes

• Extendible Hashing (Bucket size 3)

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 46EN 13.8.3

3.3 Hash Indexes

000

001

010

011

100

101

110

111

Global Depth = 3

Local depth = 1

Local depth = 2

Local depth = 3

Local depth = 3

All hash values starting with 0

All hash values starting with 10

All hash values starting with 110

All hash values starting with 111

0_0010

0_0100

0_1010

10_000

10_101

10_110

110_000

110_010

110_100

111_001

111_010

111_111

• Insert 010000

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 47EN 13.8.3

3.3 Hash Indexes

000

001

010

011

100

101

110

111

Global Depth = 3

Local depth = 1

Local depth = 2

Local depth = 3

Local depth = 3

All hash values starting with 0

All hash values starting with 10

All hash values starting with 110

All hash values starting with 111

0_0010

0_0100

0_1010

10_000

10_101

10_110

110_000

110_010

110_100

111_001

111_010

111_111

Split this bucket, add 010000

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 48EN 13.8.3

3.3 Hash Indexes

000

001

010

011

100

101

110

111

Global Depth = 3

Local depth = 2

Local depth = 2

Local depth = 3

Local depth = 3

All hash values starting with 00

All hash values starting with 10

All hash values starting with 110

All hash values starting with 111

00_010

00_100

01_010

10_000

10_101

10_110

110_000

110_010

110_100

111_001

111_010

111_111

01_000All hash values starting with 01

Local depth = 2

• Extendible Hashing

– Allows to dynamically split and coalesce buckets as database grows and shrinks

• If bucket is full and local depth is lower than global depth: – More than one cell links to this bucket

– Bucket is split into two with increased local depth and values are distributed accordingly

– Links in cells are adjusted accordingly

• If local depth already equals global depth: – Global depth is increased by one and thus the size of the directory

is doubled

– Each cell is replaced by two cells, both of which contain the same pointer as the original cell

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 49EN 13.8.3

3.3 Hash Indexes

• Summary Hash Indexes: May perform in O(1)

– But: Management overhead can be quite large for

growing data collections

• Overflow lists have small management overhead, but may

decrease access performance to O(n)

• Rehashing is very expensive for each growth stage

• Extendible hashing has a smaller overhead and is especially

suitable for external storage hashing

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 50EN 13.8

3.3 Hash Indexes

• Why not simply index every attribute?

– Physical index on primary key, logical indexes on every

other attribute

• Results in good read efficiency

• But… whenever a DB file is modified, every

index on the file has to be updated

– Updating indices imposes overhead on database

modification (insert, delete, update)

• Especially for multi-level indexes all levels may have to be

updated

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 51

3.3 Indexing

• One of the tasks of the database administrator

• Black Magic…

– Get DB performance statistics

– Try something that seems sensible

– Get new performance statistics

– If it did not improve, reset it to previous configuration

– Try something different until satisfied...

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 52

3.4 Physical Tuning

• Black Magic today needs advanced

tools to operate correctly

• For database tuning, you need

– Snapshot Monitors

• Continuously collect cumulative statistics for

the DB within a given time span

– Event Monitors

• Hook into certain events and analyze them

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 53

3.4 Physical Tuning

• Snapshot Monitors

– Monitor collects cumulative statistics

– Internal counters set at global or session levels

– Collected statistics can be evaluated afterwards

for system tuning on general level

– Data collection is explicitly enabled or disabled

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 54

3.4 Diagnostic Database Tools

• Useful for collecting point-in-time information regarding overall DB/DBMS behavior – Locking –lists all locks & types held by all applications

– Bufferpools – cumulative stats for memory, physical & logical I/O’s, synch/asynch

– Sorts – provides detailed statistics regard sort-heap usage, overflows, # active etc.

– Tablespaces – detailed I/O and activity statistics for a given tablespace

– UOW – displays status of application Unit of Work at point-in-time

– Dynamic SQL – shows contents of package cache and related stats at point-in-time

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 55

3.4 Diagnostic Database Tools

• Snapshot Monitors

– Overhead continuously is in the 3% - 5% range

• But no I/O since counters are usually RAM resident, not written to disk

– Quick list of useful metrics to identify particular problems include:

• bufferpool hit ratios

• sort overflows/problems

• effectiveness of prefetch

• need for indexing

• transaction logging issues

• single query/application resource domination, etc.

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 56

3.4 Diagnostic Database Tools

• Snapshot Monitor: IBM DB2 Performance Expert

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 57

3.4 Diagnostic Database Tools

http://www.redbooks.ibm.com/abstracts/sg246470.html

• Event Monitors

– Collects information regarding events or chains of

events

• No continuously collected data as with snapshots

– Must be explicitly created and activated via DBMS

commands or API’s

– Are the best way to effectively diagnose and resolve

transaction problems (e.g., deadlock issues)

– Output may be directed to a database table and

results then analyzed using SQL

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 58

3.4 Diagnostic Database Tools

• Event Monitor can be used to

– Identify and rank most frequently executed or most

time consuming SQL

– Track the sequence of statements leading up to a lock

timeout or deadlock

– Track connections to the database

– Breakdown overall SQL statement execution time

into sub-operations such as sort time and use or

system CPU time

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 59

3.4 Diagnostic Database Tools

• Event Monitor: DBI Brother Panther

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 60

3.4 Diagnostic Database Tools

http://www.dbisoftware.com

• Physical DB Tuning is a complicated and intransparenttask

• Usually trial and error– Measure some hopefully meaningful metrics of your DB

usage statistics

– Adjust around on something• Mostly indexes and tablespace properties

– Measure again• If results better : Great! Continue tuning!

• If result worse : Bad! Undo everything you did and try something else.

• But what to measure and what to do?– Use best practices or try yourself!

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 61

3.4 The Black Art of Physical Tuning

• What to measure?

• First, what kind of database you have?

– OLTP (Online Transaction Processing) : Database with many small, concurrent read / write queries

• Optimize for fast read write accesses

• Keep an eye on “real-time” response times

• Optimize for concurrency

– Warehouse (OLAP – Online Analytical Processing): Using larger and more complex queries for primary retrieving data

• Optimize for read accesses

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 62

3.4 Physical Tuning

• There is a huge amount of knobs and screws to

turn

– DB planning and tuning is for professionals

– Special certificates by each database vendor

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 63

3.4 Tuning Metrics

http://www.microsoft.com/learning/mcp/mcdba/

http://www-03.ibm.com/certify/certs/dm_index.shtml

IBM DB2 IBM Certified Database AssociateIBM Certified Database Administrator IBM Certified Application Developer DB2IBM Certified Advanced Database Administrator

http://education.oracle.com

• Scott Hayes

– President & CEO of Database-Brothers Inc (DBI)

• If I only had five minutes to assess the status of a

database, I would look at…

– Average Result Set Size

– Index Read Efficiency

– Synchronous Read Percentage

– Average Rows Read per DB Transaction per Table

– Average Read Time

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 64

3.4 Tuning Metrics

• ARSS - Average Result Set Size

– ARSS = RowsSelected / NumSelectQueries

– Rule of Thumb:

• ARSS ≤ 10 indicates a OLTP database

– If you think you have OLTP and ARSS>>10, there is something

wrong…

• ARSS > 10 indicates a warehouse database (OLAP)

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 65

3.4 Tuning Metrics

• IREF - Index Read Efficiency

– “How many rows have to be read to retrieve one row?”

• i.e. How much overhead is within the where predicates?

– IREF = RowsRead / RowsReturned

• Well-designed indexes may evaluate a where-clause without reading additional rows!

– Rule of Thumb:

• IREF should be less than 10 for OLTP Databases

• May be higher for Warehouses (>100)

– What to do?

• Introduce more efficient indexes!

• User simpler queries!

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 66

3.4 Tuning Metrics

• SRP – Synchronous Read Percentage

– Synchronous Read means the DBMS reads just the index file and the required data page (good)

– Asynchronous Read means the DBMS employs prefetching to scan for an index or data (BAD)

– Rule of Thumb:

• >90% for OLTP (>50% for warehouses) : Good

• 80%-90% (25%-50%) : Ok

• 50%-80% (<25%) Bad

– What to do?

• Improve index structures!

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 67

3.4 Tuning Metrics

• TBRRTX – Average Rows Read per DB Transaction per Table

– TBRRTXtable = rowsRead / (attemptedCommits + attemptedRollBacks)

– Rule of Thumb:

• <10 for OLTP : Good

• 10-100 : Ok

• >100 : Bad

• >1000 : Horrific

– What to do?

• Improve index structures!

• Improve Queries

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 68

3.4 Tuning Metrics

• ART – Average Read Time

– The average time per read

– RT = ReadTime / (NumDataRead + NumIndexRead)

– If you think you did everything within your application

and indices, optimize read time

• Adjust size of tablespace containers

• Distribute containers across disks

– Avoid OS paging device

• Move highly access tablespaces to faster devices

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 69

3.4 Tuning Metrics

• Tree-Index Structures

– Binary Tree

• Naïve Binary Tree

• Balancing Tree

– B-Trees

• B

• B+

• B*

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 70

Next Lecture