Database Management 6. course

Page 1: Database Management 6. course. OS and DBMS DMBS DB OS DBMS DBA USER DDL DML WHISHESWHISHES RULESRULES

Database Management

6. course

OS and DBMS

[Figure: diagram connecting the DB, OS, DBMS, DBA and USER; labels: DDL, DML, wishes, rules]

Steps of a query

1. SQL query
2. Permission in the schema?
3. Permission in the subschema?
4. I/O operation
5. Search
6. Import
7. Notification
8. User workspace
9. User notification

Data storage: disks and files

Data storage: disks and files

• Mass storage device (disk, drive)
• I/O
  – READ: disk → memory (RAM)
  – WRITE: memory → disk
  – Time consuming

Why not store everything in RAM?

• Cost of 1 GB: RAM 10 € ↔ HDD 0.5 €
• RAM is volatile
• Typical storage hierarchy:
  – Data currently in use is in memory
  – Secondary storage is on HDD (local server, cloud)
  – Tertiary storage

Storage on disks

• Unit: disk block
• Speed depends on location!

Components

Reading a block

• Access time of a block:
  – Seek time
  – Rotational delay
  – Transfer time: 1 ms / 4 KB
• I/O optimization: reducing seek time and rotational delay
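
The access-time breakdown above can be tried out with a small calculation. This is only a sketch: the 8 ms seek time and 4 ms rotational delay are assumed figures, not values from the slides; only the 1 ms / 4 KB transfer rate comes from the slide.

```python
def block_access_time_ms(seek_ms, rotational_delay_ms, size_kb,
                         transfer_ms_per_4kb=1.0):
    """Access time = seek time + rotational delay + transfer time."""
    transfer_ms = (size_kb / 4.0) * transfer_ms_per_4kb
    return seek_ms + rotational_delay_ms + transfer_ms


# Assumed figures: 8 ms seek, 4 ms rotational delay, 4 KB blocks.
# Ten blocks scattered over the disk pay seek and rotation ten times.
random_reads = 10 * block_access_time_ms(8.0, 4.0, 4)

# Ten adjacent blocks (40 KB) read in one go pay them only once.
sequential_read = block_access_time_ms(8.0, 4.0, 40)
```

With these assumed numbers almost all of the time for scattered reads goes to seeking and rotating, which is why I/O optimization targets those two terms.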

Order of data

• Keep frequently used blocks close to each other:
  – Same block
  – Same track, same cylinder
  – Adjacent cylinder
• Reading is sequential
• Reading multiple blocks at once saves time

Way of storage - RAID

• Redundant Array of Inexpensive/Independent Disks

• Connecting disks logically, storing data redundantly

• Aims:
  – Minimize data loss, increase reliability
  – Increase capacity by using more smaller/cheaper disks
  – Increase data access performance
  – Increase flexibility (disks can be replaced during operation)

Two main techniques

• Data striping
  – Data is partitioned (striping unit)
  – Partitions are distributed over several disks
• Redundancy
  – Reconstruction of data
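
A minimal sketch of how striping units could be distributed. Round-robin placement is an assumption made here for illustration; the function name `locate_striping_unit` is hypothetical.

```python
def locate_striping_unit(unit_index, num_disks):
    """Round-robin striping: return (disk number, position on that disk)."""
    return unit_index % num_disks, unit_index // num_disks


# Six consecutive striping units spread over four disks.
layout = [locate_striping_unit(i, 4) for i in range(6)]
```

Consecutive units land on different disks, so a large read can be served by several disks in parallel.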

Level 0

• Non-redundant
• If one of the disks fails, data is lost
• Parallel reading/writing
• Performance depends on the slowest disk

Level 1

• Mirrored
• Data can be reconstructed
• Parallel reading, increased speed
• Parallel writing, normal speed
• Performance depends on the slowest disk
• Does not use data striping

Level 2

• Data striping (unit=1 bit), error-correcting codes

• ECC: redundant bits calculated from data bits (compress)

• No longer used

Level 3

• Bit-Interleaved Parity
• Cannot identify the failed disk
• One check disk with parity information
• The failed disk's data can be recovered
• Can process only one I/O at a time
• Striping unit = 1 bit
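
Recovery from a single check disk can be illustrated with XOR parity. This is a sketch under assumptions: it works on whole bytes rather than single bits for readability, the sample data is made up, and, as the slide notes, the parity does not tell us which disk failed, so the failed disk is assumed to be known.

```python
from functools import reduce


def parity(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks (made-up content) and one check disk.
data = [b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"]
check = parity(data)

# Disk 1 fails: XOR of the surviving disks and the check disk rebuilds it.
rebuilt = parity([data[0], data[2], check])
```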

Level 4

• Block-Interleaved Parity
• Like RAID 3, but striping unit = disk block
• Supports multiple users
• Updating the parity disk can be a bottleneck
• In case of a disk failure, read speed drops

Level 5

• Block-Interleaved Distributed Parity
• Rotating parity
• Parallel read and write
• Similar to RAID 3 and 4, depending on the size of the striping units
• If a disk fails, it has to be replaced immediately

RAID 5

• Capacity = min_capacity × (number of disks − 1)
• Read speed = min_speed × (number of disks − 1)
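
The capacity formula, together with one possible way to rotate the parity block, can be sketched as follows. The rotation scheme and both function names are assumptions for illustration; real controllers may lay out parity differently.

```python
def raid5_capacity(disk_capacities):
    """Usable capacity = smallest disk capacity * (number of disks - 1)."""
    n = len(disk_capacities)
    if n < 3:
        raise ValueError("this sketch assumes at least 3 disks")
    return min(disk_capacities) * (n - 1)


def raid5_parity_disk(stripe, num_disks):
    """Assumed rotation: parity moves one disk to the left on each stripe."""
    return (num_disks - 1 - stripe) % num_disks


# Made-up array: three 500 GB disks and one 1000 GB disk.
usable_gb = raid5_capacity([500, 500, 1000, 500])
parity_disks = [raid5_parity_disk(stripe, 4) for stripe in range(5)]
```

Because the parity block moves from stripe to stripe, no single disk receives every parity update.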

Level 6

• Motivation: a further disk failure during recovery is quite possible
• 2 check disks
• Can recover from up to two disk failures
• Read and write speed is equal to RAID 5

RAID 0+1 and RAID 10

• RAID 0+1: striped sets that are mirrored
• RAID 10: mirrored pairs that are striped

Disk space and buffering

Disk space management

• The lowest level of the DBMS manages the space
• Unit of data: page
• Size of page = size of disk block
• Higher levels can
  – Allocate and deallocate pages
  – Write / read pages
• Allows higher levels of the DBMS to think of the data as a collection of pages

Keeping track of free blocks

• Maintain a list of free blocks with pointer to the first free block

OR

• Maintain a bitmap with one bit for each block: block is used or not
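
A minimal sketch of the bitmap alternative. The class name `FreeBlockBitmap` is hypothetical, and a Python list of booleans stands in for the actual bit array.

```python
class FreeBlockBitmap:
    """One flag per disk block: True = used, False = free."""

    def __init__(self, num_blocks):
        self.used = [False] * num_blocks

    def allocate(self):
        """Mark the first free block as used and return its number."""
        for block_no, in_use in enumerate(self.used):
            if not in_use:
                self.used[block_no] = True
                return block_no
        raise RuntimeError("no free block")

    def free(self, block_no):
        self.used[block_no] = False


bitmap = FreeBlockBitmap(4)
first, second = bitmap.allocate(), bitmap.allocate()
bitmap.free(first)
reused = bitmap.allocate()   # the freed block is handed out again
```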

Using OS to manage disk space

• Possible, not common
• Not portable: different file systems
• On 32-bit systems the largest file size is 4 GB, and OS files cannot span disk devices

Buffer manager

• Data has to be brought into memory (RAM) before it can be used
• <frame#, pageid> pairs are stored in a table

[Figure: page requests arrive at the buffer pool in memory; its frames either hold pages read from the DB on disk or are free frames]

If a requested page is not in the pool and the pool is full, the buffer manager's replacement policy controls which existing page is replaced.

When a request comes…

• If the page is not in the buffer:
  – Choose a frame for replacement and increase its pin count
  – If the dirty bit of the replacement frame is on, write its content to the disk
  – Read the requested page into the replacement frame
• Return the address of the frame to the requestor
• If it can be predicted which page will be requested next, multiple pages can be read at once (pre-fetching)

Buffer management

• The requestor has to unpin the page when it is done with it
• The requestor marks whether the content of the page was modified
  – With the dirty bit
• A page in the buffer can be requested multiple times by processes/transactions
  – pin_count: a page can be replaced if and only if pin_count = 0
• Concurrency handling and rollback handling can influence the replacement policy
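
The pin/unpin protocol with pin counts and dirty bits can be sketched as below. This is an illustration under assumptions, not a real DBMS interface: the class and method names are hypothetical, a dict stands in for the disk, and the victim choice is a placeholder (empty frame first, then any unpinned frame) rather than a real replacement policy.

```python
class Frame:
    def __init__(self):
        self.page_id = None
        self.data = None
        self.pin_count = 0
        self.dirty = False


class BufferPool:
    def __init__(self, num_frames, disk):
        self.frames = [Frame() for _ in range(num_frames)]
        self.disk = disk          # dict page_id -> content, stands in for the disk
        self.page_table = {}      # page_id -> frame number

    def pin(self, page_id):
        """Return the frame holding page_id, reading it from disk if needed."""
        if page_id not in self.page_table:
            frame_no = self._choose_victim()
            frame = self.frames[frame_no]
            if frame.page_id is not None:
                if frame.dirty:
                    self.disk[frame.page_id] = frame.data   # write back first
                del self.page_table[frame.page_id]
            frame.page_id, frame.data = page_id, self.disk[page_id]
            frame.dirty = False
            self.page_table[page_id] = frame_no
        frame = self.frames[self.page_table[page_id]]
        frame.pin_count += 1
        return frame

    def unpin(self, page_id, dirty=False):
        """The requestor releases the page and reports whether it changed it."""
        frame = self.frames[self.page_table[page_id]]
        frame.pin_count -= 1
        frame.dirty = frame.dirty or dirty

    def _choose_victim(self):
        # Placeholder policy: an empty frame first, then any frame with pin_count 0.
        for frame_no, frame in enumerate(self.frames):
            if frame.page_id is None:
                return frame_no
        for frame_no, frame in enumerate(self.frames):
            if frame.pin_count == 0:
                return frame_no
        raise RuntimeError("all frames are pinned")


disk = {"A": "a0", "B": "b0", "C": "c0"}
pool = BufferPool(2, disk)
frame_a = pool.pin("A")
frame_a.data = "a1"             # modify the page in memory
pool.unpin("A", dirty=True)
pool.pin("B")                   # B stays pinned
pool.pin("C")                   # pool is full: unpinned, dirty A is written back and replaced
```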

Buffer replacement policies

• Least-recently-used (LRU): tracks which page was used and when (costly)
• Clock replacement
  – A pointer to the current frame is stored
  – It advances to the next frame until it finds one with pin count = 0 and the referenced bit off (not used recently)
  – After the last frame, it jumps back to the first (like a circle)
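
A sketch of the clock policy. It assumes the usual variant in which the hand turns off the referenced bit of an unpinned frame as it passes, so the frame gets one more round before replacement; the names `ClockFrame` and `clock_choose_victim` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClockFrame:
    pin_count: int = 0
    referenced: bool = False


def clock_choose_victim(frames, hand):
    """Return (victim frame number, new hand position)."""
    n = len(frames)
    # Two sweeps are enough: the first can clear referenced bits,
    # the second then finds an unpinned, unreferenced frame if one exists.
    for _ in range(2 * n):
        frame = frames[hand]
        if frame.pin_count == 0:
            if frame.referenced:
                frame.referenced = False       # give it one more round
            else:
                return hand, (hand + 1) % n    # after the last frame, wrap to the first
        hand = (hand + 1) % n
    raise RuntimeError("all frames are pinned")


frames = [ClockFrame(1, False), ClockFrame(0, True),
          ClockFrame(0, False), ClockFrame(0, True)]
victim, hand = clock_choose_victim(frames, 0)
```

Frame 0 is skipped because it is pinned, frame 1 loses its referenced bit, and frame 2 is chosen.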

Files and indexes

Records in files

• DBMS handles records and files
• Files: collections of pages containing records
• They must support
  – DML (insert, update, delete)
  – Reading a record (identified by record id, rid)
  – Reading all the records (that satisfy some conditions)

Unordered (heap) files

• Simplest file structure
• DBMS must keep track of
  – pages in the file
  – free space in the page
  – records in the page

Heap file as a linked list

• Every page contains two pointers

[Figure: a header page heads two linked lists of data pages: full pages and pages with free space]

• Disadvantages
  – If records have variable length, practically every page ends up in the list of pages with free space
  – To insert a record, we may have to examine several pages before finding one with enough space

Directory-based heap file

• Maintain a directory of pages
• DBMS stores the address of the first page of each heap file
• Directory = collection of pages
• Each directory entry holds a counter: the amount of free space on its page

[Figure: a directory, starting at the header page, whose entries point to data pages 1 to N]
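
A minimal sketch of insertion into a directory-based heap file. The class name `DirectoryHeapFile` is hypothetical, and the directory is simplified to one in-memory list of free-space counters instead of a collection of directory pages.

```python
class DirectoryHeapFile:
    def __init__(self, page_size):
        self.page_size = page_size
        self.pages = []        # each data page: list of records (bytes)
        self.free_space = []   # directory: free bytes per data page

    def insert(self, record):
        """Insert a record and return its rid as (page number, slot number)."""
        if len(record) > self.page_size:
            raise ValueError("record larger than a page")
        # Only the directory is consulted; no data page is read while searching.
        for page_no, free in enumerate(self.free_space):
            if free >= len(record):
                break
        else:
            self.pages.append([])
            self.free_space.append(self.page_size)
            page_no = len(self.pages) - 1
        self.pages[page_no].append(record)
        self.free_space[page_no] -= len(record)
        return page_no, len(self.pages[page_no]) - 1


heap = DirectoryHeapFile(page_size=10)
rid_a = heap.insert(b"aaaaaaaa")   # 8 bytes, goes to page 0
rid_b = heap.insert(b"bbbb")       # does not fit on page 0, new page 1
rid_c = heap.insert(b"cc")         # fits into the 2 bytes left on page 0
```

Compared with the linked-list organization, the search for free space touches only the directory, not the data pages themselves.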

Index

• Read the records sequentially
• Search for a specific rid
• Records with specific conditions on their attributes (e.g. all CLERKs)
• Value-based queries

Example, library

1. Locate the books by Asimov
2. Search for Foundation

• Indexed file: choose a search key for the entries (records in files), calculate the index of this key, and look it up
• Goal: speed up search
• E.g. if I am looking for employees of a given age, I can build an index containing <age, rid> pairs
• The pages of the index file are organized based on the index entries so that results are found quickly (access methods)
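
The <age, rid> example can be sketched with a sorted list and binary search. This stands in for a real access method only as an illustration; the employee data, the rid values and the function name `rids_with_age` are made up, and ages are assumed to be integers.

```python
import bisect

# Hypothetical heap file: rid = (page id, slot number) -> record.
employees = {
    (1, 0): {"name": "Ann", "age": 31},
    (1, 1): {"name": "Bob", "age": 45},
    (2, 0): {"name": "Cid", "age": 31},
    (2, 1): {"name": "Dee", "age": 28},
}

# The index: <age, rid> pairs kept in search-key order.
age_index = sorted((record["age"], rid) for rid, record in employees.items())


def rids_with_age(index, age):
    """Binary search for all rids whose search key equals age."""
    low = bisect.bisect_left(index, (age,))
    high = bisect.bisect_left(index, (age + 1,))
    return [rid for _, rid in index[low:high]]


matches = rids_with_age(age_index, 31)
```

The lookup never scans the employee records; it only touches the index entries and returns the rids to fetch.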

Access methods

• B trees
• B+ trees
• Hash-based structures
• Discussed in detail later

Page formats

• Data as a collection of records
• Page ~ collection of slots, each slot contains a record
• Record identification:
  – <page id, slot number> = rid
  – Number every record and store its location in a table

Fixed-length records

• All records have the same length
• Insertion: locate an empty slot and place the record there
• Main issues:
  – Keeping track of empty slots
  – Locating all records on a page

Deletion alternatives – first option

• Store records in the first N slots without gaps
• If a record is deleted, the last record is moved into the gap
• Advantage: locating a record is easy (just an offset calculation)
• The empty slots stay together at the end of the page
• Disadvantage: if the moved record is referenced externally, its rid changes
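
The first option and its drawback can be seen in a short sketch. A Python list stands in for the packed slots of a page, and `delete_packed` is a hypothetical helper name.

```python
def delete_packed(slots, slot_no):
    """Delete a record by moving the last record into the gap.

    Returns the old slot number of the moved record, or None if nothing moved.
    """
    last = len(slots) - 1
    moved_from = None
    if slot_no != last:
        slots[slot_no] = slots[last]
        moved_from = last
    slots.pop()
    return moved_from


page = ["r0", "r1", "r2", "r3"]
moved_from = delete_packed(page, 1)   # r3 now lives in slot 1
```

Any external reference that still points to slot 3 is now stale, which is exactly the disadvantage named above.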

Second option

• Use an array of bits, one bit per slot
• If a record is deleted, its bit is turned off
• Summary: every page contains additional file-level information

Variable-length records

• When a new record is inserted, a space that is large enough but not too large is needed (to avoid waste)
• When a record is deleted, the others are moved to fill the hole
• Most flexible organization: a directory of slots for each page

Directory of slots

• The offset (pointer) and length of each record are stored
• Deletion: set the offset to -1
• Records can be moved, since rid = (page number, slot number [position in the directory]) does not change
• Only the record offset changes

• The offset of the free space is stored
• When a new record is inserted and there is not enough space, records are moved
• If a record is deleted, the slot numbers of the remaining records cannot change because of external references
• If a record is inserted, it should be given an unused slot number
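
A sketch of a page with a directory of slots. The class name `SlottedPage` and its methods are hypothetical, and the space taken by the directory itself is ignored to keep the example short.

```python
class SlottedPage:
    def __init__(self, size):
        self.data = bytearray(size)
        self.free_offset = 0   # offset where the free space starts
        self.slots = []        # directory: [offset, length]; offset -1 = deleted

    def insert(self, record):
        """Store a record and return its slot number."""
        if self.free_offset + len(record) > len(self.data):
            self.compact()     # not enough space at the end: move records
        if self.free_offset + len(record) > len(self.data):
            raise ValueError("page full")
        offset = self.free_offset
        self.data[offset:offset + len(record)] = record
        self.free_offset += len(record)
        entry = [offset, len(record)]
        for slot_no, slot in enumerate(self.slots):
            if slot[0] == -1:              # reuse an unused slot number
                self.slots[slot_no] = entry
                return slot_no
        self.slots.append(entry)
        return len(self.slots) - 1

    def delete(self, slot_no):
        self.slots[slot_no][0] = -1        # the slot number itself stays reserved

    def read(self, slot_no):
        offset, length = self.slots[slot_no]
        if offset == -1:
            raise KeyError(slot_no)
        return bytes(self.data[offset:offset + length])

    def compact(self):
        """Move live records together; only their offsets change."""
        live = sorted((s for s in self.slots if s[0] != -1), key=lambda s: s[0])
        position = 0
        for slot in live:
            offset, length = slot
            self.data[position:position + length] = self.data[offset:offset + length]
            slot[0] = position
            position += length
        self.free_offset = position


page = SlottedPage(12)
a = page.insert(b"aaaa")
b = page.insert(b"bbbb")
c = page.insert(b"cccc")
page.delete(b)
d = page.insert(b"dddd")   # forces compaction and reuses slot number 1
```

After the last insert, record `c` has moved inside the page, but its slot number, and therefore its rid, is unchanged.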

Record formats

• Number of fields and field types are stored in the system catalog

Fixed-length records

• Each field has a fixed length (the same for every record)
• From the offset of the record, the offset of each field is easy to calculate:

[Figure: fields F1 to F4 with lengths L1 to L4 stored one after another from base address B; the address of F3 is B + L1 + L2]
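
The address calculation can be written out as a short sketch. The base address and field lengths are made-up values, and `field_address` is a hypothetical helper.

```python
def field_address(base, field_lengths, field_no):
    """Address of a field = base address + lengths of all preceding fields.

    field_no is 0-based, so field_no=2 is the third field (F3).
    """
    return base + sum(field_lengths[:field_no])


lengths = [4, 20, 8, 2]                        # L1..L4, assumed values
f3_address = field_address(1000, lengths, 2)   # B + L1 + L2
```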

Variable-length records

• Variable-length fields (e.g. varchar2)
• Two formats:
  – Separators are used: the record has to be scanned to read the fields
  – An array of integer offsets at the beginning of the record stores the relative position of each field and the end of the record

• The offset of the end of the record is stored
• Disadvantage
  – Storage overhead
• Advantages
  – Direct access to the fields
  – NULL: start of the field = end of the field
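
The offset-array format can be sketched with Python's `struct` module. The concrete layout (little-endian unsigned 16-bit offsets) and the function names are assumptions for illustration. Note that in this simple sketch an empty value cannot be told apart from NULL.

```python
import struct


def encode_record(fields):
    """fields: list of bytes values, None for NULL.

    Layout: n + 1 unsigned 16-bit offsets (one per field, plus the end of
    the record), followed by the field values.
    """
    header_size = 2 * (len(fields) + 1)
    offsets, body, position = [], b"", header_size
    for value in fields:
        offsets.append(position)
        if value is not None:
            body += value
            position += len(value)
    offsets.append(position)   # offset of the end of the record
    return struct.pack(f"<{len(offsets)}H", *offsets) + body


def read_field(record, field_no):
    """Direct access: read two neighbouring offsets, then slice."""
    start, end = struct.unpack_from("<2H", record, 2 * field_no)
    return None if start == end else record[start:end]


record = encode_record([b"Smith", None, b"CLERK"])
job = read_field(record, 2)    # no scan over the earlier fields needed
```

The offset array is the storage overhead mentioned above: here 8 bytes for a record with 10 bytes of data.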

Issues

• On insert: the other fields have to be moved
• On modify: the other fields have to be moved
  – The modified record may no longer fit on its page
  – A forwarding address is left on the page
• When a record is too big for one page
  – Break the record into smaller records
  – Chain them

Thank you for your attention!