DB Course 9: Data Storage and Index

DataBase part 08 - Data Storage and Index

Storing data: disks and files


Memory Hierarchy

- primary: cache and main memory
  - provides very fast access to data
- secondary: slower devices such as magnetic disks
- tertiary: slowest, nonvolatile; optical disks, tapes
  - amount of data is typically very large
  - relatively inexpensive and can store very large amounts of data
  - good choice for archival storage, when we need to maintain data for a long period but do not expect to access it very often


Tertiary storage

- main drawback is that tertiary devices are sequential access devices
- unsuitable for storing operational data, or data that is frequently accessed
- used to back up operational data periodically


Magnetic Disks

- data is stored in units called disk blocks
  - a block is a contiguous sequence of bytes
  - it is the unit in which data is written and read
- blocks are arranged in concentric rings called tracks, on one or more platters
- cylinder: set of all tracks with the same diameter
- a track is divided into arcs called sectors, whose size is a characteristic of the disk and cannot be changed
- disk block size can be set when the disk is initialized, as a multiple of the sector size


Disk controller

- interfaces a disk drive to the computer
- implements commands to read or write a sector by moving the arm assembly and transferring data to and from the disk surfaces


Checksum

- computed when data is written to a sector and stored with the sector
- computed again when data is read back
- if the sector is corrupted or the read is faulty for some reason, it is very unlikely that the checksum read matches the checksum written
- if the controller detects an error, it tries to read the sector again
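
The checksum mechanism can be sketched in a few lines of Python. This is only an illustration: CRC-32 (zlib.crc32) is an assumed stand-in for whatever code a real disk controller uses in hardware, and the names write_sector and read_sector are hypothetical.

```python
import zlib

def write_sector(data):
    """Return the data together with the checksum computed at write time."""
    return data, zlib.crc32(data)

def read_sector(stored):
    """Recompute the checksum on read and compare it with the stored one."""
    data, checksum = stored
    if zlib.crc32(data) != checksum:
        # A real controller would retry the read before reporting failure.
        raise IOError("checksum mismatch: sector corrupted or read faulty")
    return data

sector = write_sector(b"hello, disk")
# Simulate corruption: one byte differs, but the stored checksum is unchanged.
corrupted = (b"hellp, disk", sector[1])
```

Reading sector succeeds, while reading corrupted raises an error, which is the point at which the controller would try the read again.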


Time to access disk block

- seek time is the time taken to move the disk heads to the track
- rotational delay is the waiting time for the desired block to rotate under the disk head
  - usually less than seek time
- transfer time is the time to actually read or write the data in the block once the head is positioned


Performance Implications of Disk Structure

- data must be in memory for DBMS to operate on it
- unit for data transfer between disk and main memory is a block
- if a single item on a block is needed, entire block is transferred
- reading or writing a disk block is called an I/O operation
- time to read or write a block varies, depending on location of the data:

access time = seek time + rotational delay + transfer time
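
A small worked example shows why the location of the data matters. The timings below (9 ms seek, 4 ms rotational delay, 0.1 ms transfer) are assumed values chosen for illustration, not figures from the slides.

```python
def access_time_ms(seek_ms, rotational_delay_ms, transfer_ms):
    """access time = seek time + rotational delay + transfer time"""
    return seek_ms + rotational_delay_ms + transfer_ms

# Block at a random location: pay for seek and rotation (assumed timings).
random_io = access_time_ms(9.0, 4.0, 0.1)

# Next block on the same track, idealized: no seek and no rotational delay.
sequential_io = access_time_ms(0.0, 0.0, 0.1)
```

Under these assumptions a random access costs about 13.1 ms and is dominated by seek time and rotational delay, while the idealized sequential access costs only the transfer time.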


RAID

- disks are potential bottlenecks for system performance and storage system reliability
- RAID: Redundant Arrays of Independent Disks
- several RAID organizations, referred to as RAID levels, have been proposed
- a disk array is an arrangement of several disks, organized so as to increase performance and improve reliability of the resulting storage system
- data striping increases performance by distributing the data over several disks, to give the impression of having a single large, very fast disk
- redundancy improves reliability: instead of having a single copy of data, redundant information is carefully organized so that in case of a disk failure, it can be used to reconstruct the contents of the failed disk


Redundancy schemes

- Hamming codes
  - recover from single disk failures
  - can identify which disk has failed
- Reed-Solomon codes
  - recover from up to two simultaneous disk failures


Data striping

- data striping: data is segmented into equal-size partitions that are distributed over multiple disks
- striping unit: size of a partition
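
As a rough sketch of how partitions might be distributed, the code below places striping units round-robin across the disks. Round-robin placement is an assumption made here for illustration (the slides only say the partitions are distributed), and the function name locate is hypothetical.

```python
def locate(unit_index, num_disks):
    """Map a striping unit to (disk number, position on that disk).

    Assumes round-robin placement: unit i goes to disk i mod num_disks.
    """
    return unit_index % num_disks, unit_index // num_disks

# Placement of the first eight striping units on an array of four disks.
placement = [locate(i, 4) for i in range(8)]
```

With this layout, consecutive units land on different disks, so a large request can be served by several disks at the same time.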


RAID Level 0: Nonredundant

- no redundant information is maintained
- best write performance of all levels
  - absence of redundant information implies that no redundant information needs to be updated
- does not have the best read performance
  - systems with redundancy have a choice of scheduling disk accesses


RAID Level 1: Mirrored

- most expensive solution
- two identical copies of data on two different disks are maintained
- sequential write
- parallel read
- effective space utilization is 50%, independent of number of data disks


RAID Level 2: Error-Correcting Codes

- striping unit is a single bit; redundancy scheme used is Hamming code
- number of check disks grows logarithmically with number of data disks
- read-modify-write cycle
- keeps more redundant information than is necessary


RAID Level 3: Bit-Interleaved Parity

- single check disk with parity information
- lowest overhead possible
- disk controllers can easily detect which disk has failed without need of Hamming code


RAID Level 4: Block-Interleaved Parity

- striping unit of a disk block
- effective space utilization increases with number of data disks, since always only one check disk is necessary
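
The parity-based levels can be illustrated with a short sketch: the check disk holds the bytewise XOR of the data blocks, so the contents of any single failed disk can be recomputed from the survivors. The block contents and the helper names xor_blocks and space_utilization are made up for this example.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data disks, each holding one (tiny, made-up) block.
data_disks = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_blocks(data_disks)  # stored on the single check disk

# Disk 1 fails: rebuild its block from the surviving data blocks and parity.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)

def space_utilization(num_data_disks, num_check_disks=1):
    """Fraction of total disk space that holds user data."""
    return num_data_disks / (num_data_disks + num_check_disks)
```

Here rebuilt equals the lost block, and space_utilization(4) gives 0.8 while space_utilization(9) gives 0.9, which matches the observation that utilization grows with the number of data disks when only one check disk is needed.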


RAID Level 5: Block-Interleaved Distributed Parity

- improves upon Level 4 by distributing the parity blocks uniformly over all disks
- several write requests can potentially be processed in parallel, since the bottleneck of a unique check disk has been eliminated
- read requests have a higher level of parallelism
- a RAID Level 5 system has the best performance of all RAID levels with redundancy for small and large read requests and large write requests


Disk space management


Disk space manager

- page: unit of data
- allocate or deallocate a page
- read or write a page
- often useful to allocate a sequence of pages
  - contiguous sequence of blocks to hold data that is frequently accessed in sequential order
  - this capability, if desired, must be provided by disk space manager to higher-level layers of DBMS


Keeping Track of Free Blocks

- database grows and shrinks as records are inserted and deleted over time
- disk space manager keeps track of
  - which disk blocks are in use
  - which pages are on which disk blocks
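
One simple way to record which blocks are in use is a bitmap with one flag per block. The slides do not name a particular technique, so the bitmap and the class name FreeBlockBitmap are assumptions made for this sketch.

```python
class FreeBlockBitmap:
    """Tracks block usage with one boolean flag per disk block."""

    def __init__(self, num_blocks):
        self.in_use = [False] * num_blocks

    def allocate(self):
        """Mark the first free block as used and return its number."""
        for block_no, used in enumerate(self.in_use):
            if not used:
                self.in_use[block_no] = True
                return block_no
        raise RuntimeError("no free blocks")

    def deallocate(self, block_no):
        """Mark a block as free again."""
        self.in_use[block_no] = False

bitmap = FreeBlockBitmap(4)
a = bitmap.allocate()   # first free block
b = bitmap.allocate()   # next free block
bitmap.deallocate(a)    # block a becomes reusable
c = bitmap.allocate()   # reuses the block that was just freed
```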


Using OS File Systems to Manage Disk Space

- operating systems also manage space on disk
- entire database could reside in one or more OS files for which a number of blocks are allocated (by the OS) and initialized
- disk space manager is then responsible for managing space in these OS files
- many database systems do not rely on the OS file system and instead do their own disk management, either from scratch or by extending OS facilities
- reasons are practical as well as technical
  - practical reason: a DBMS vendor who wishes to support several OS platforms cannot assume features specific to any OS, for portability
  - technical reason: on a 32-bit system, largest file size is 4 GB, whereas a DBMS may want to access a single file larger than that


Buffer manager

- to understand the role of the buffer manager, consider that the database contains many pages, but only a few pages of main memory are available for holding data
- consider a query that requires a scan of the entire file
- DBMS brings pages into main memory as they are needed and, in the process, decides which existing page in main memory to replace to make space for the new page
- the policy used to make this decision is the replacement policy


Buffer Manager

- bookkeeping information, and two variables for each frame in pool
- number of times that the page currently in a given frame has been requested but not released (number of current users of page) is recorded in the pin count variable for that frame
- boolean variable dirty indicates whether the page has been modified since it was brought into the buffer pool from disk
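
The two per-frame variables can be pictured with a minimal sketch. The class Frame and its methods are hypothetical names, and the rule that a frame is replaceable only when its pin count is zero is the usual convention, assumed here rather than stated on the slide.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Frame:
    page_id: Optional[int] = None  # page currently held, if any
    pin_count: int = 0             # requests not yet released
    dirty: bool = False            # modified since it was read from disk

    def pin(self):
        """A requestor starts using the page."""
        self.pin_count += 1

    def unpin(self, modified=False):
        """A requestor releases the page, reporting whether it changed it."""
        if self.pin_count == 0:
            raise ValueError("page is not pinned")
        self.pin_count -= 1
        self.dirty = self.dirty or modified

    def replaceable(self):
        """Assumed convention: only unpinned frames may be replaced."""
        return self.pin_count == 0

frame = Frame(page_id=42)
frame.pin()
frame.pin()
frame.unpin(modified=True)  # one user is done and has changed the page
```

At this point the frame still has one user, so it is not replaceable, and because it is dirty it would have to be written back to disk before its page is replaced.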


Buffer Replacement Policies

- best-known replacement policy is least recently used (LRU)
- can be implemented using a queue of pointers to frames
- a frame is added to the end of the queue when it becomes a candidate for replacement
- page chosen for replacement is the one in the frame at the head of the queue
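
The queue-based scheme described above can be sketched as follows. Frame numbers stand in for pointers to frames, and the class and method names are hypothetical.

```python
from collections import deque

class LRUQueue:
    """Queue of replacement candidates, least recently used at the head."""

    def __init__(self):
        self.queue = deque()

    def add_candidate(self, frame_no):
        """Frame became a candidate for replacement: append at the end."""
        self.queue.append(frame_no)

    def remove_candidate(self, frame_no):
        """Frame was requested again before being replaced."""
        self.queue.remove(frame_no)

    def choose_victim(self):
        """Replace the page in the frame at the head of the queue."""
        return self.queue.popleft()

lru = LRUQueue()
for frame_no in (3, 1, 2):
    lru.add_candidate(frame_no)
lru.remove_candidate(1)            # frame 1 is in use again
first_victim = lru.choose_victim()
second_victim = lru.choose_victim()
```

Frame 3 became a candidate first, so it is chosen first, followed by frame 2.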


- LRU policies are not always the best replacement strategies, particularly if many user requests require sequential scans of data
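
A small simulation makes the problem with repeated sequential scans concrete. It counts page misses when a file slightly larger than the buffer pool is scanned several times. The comparison policy, most recently used (MRU), is not mentioned on the slides and is included only as a contrast; the function name and parameters are hypothetical.

```python
from collections import OrderedDict

def count_misses(num_frames, num_pages, num_scans, policy="LRU"):
    """Count buffer misses for repeated sequential scans of one file."""
    pool = OrderedDict()  # pages in the pool, least recently used first
    misses = 0
    for _ in range(num_scans):
        for page in range(num_pages):
            if page in pool:
                pool.move_to_end(page)  # page is now the most recently used
                continue
            misses += 1
            if len(pool) == num_frames:
                # LRU evicts the first entry, MRU evicts the last one.
                pool.popitem(last=(policy == "MRU"))
            pool[page] = None
    return misses

lru_misses = count_misses(num_frames=3, num_pages=4, num_scans=3, policy="LRU")
mru_misses = count_misses(num_frames=3, num_pages=4, num_scans=3, policy="MRU")
```

With 3 frames and a 4-page file, LRU misses on all 12 page requests, because each page is evicted just before it is needed again, while MRU misses only 6 times in this setup.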


Buffer Management DBMS/OS

- DBMS can often predict reference patterns because most page references are generated by higher-level operations (such as sequential scans or particular implementations of various relational algebra operators) with a known pattern of page accesses
- ability to predict reference patterns allows for a better choice of pages to replace and makes the idea of specialized buffer replacement policies more attractive in the DBMS environment
- being able to predict reference patterns enables use of a simple and very effective strategy called prefetching of pages
  - buffer manager can anticipate the next several page requests and fetch the corresponding pages into memory before the pages are requested


Files and Indexes


- pages are used to store records organized into logical collections, or files of records
  - a file is a collection of records that may reside on several pages
- each record has a unique identifier called a record id, or rid
  - we can identify the page containing a record by using the record's rid
- basic file structure that we consider, called a heap file, stores records in random order and supports retrieval of all records or retrieval of a particular record specified by its rid
- sometimes we want to retrieve records by specifying some condition on fields of the desired records
  - to speed up such selections, we can build auxiliary data structures that allow us to quickly find rids of records that satisfy the given selection condition
  - such an auxiliary structure is called an index


Heap Files

- simplest file structure, an unordered file
- data in pages is not ordered in any way
- only guarantee is that one can retrieve all records in the file by repeated requests for the next record
- every record has a unique rid, and every page is of the same size


Heap Files: supported operations

- create and destroy files
- insert record
- delete record with given rid
- get record with given rid
- scan all records
- to get or delete a record with a given rid, we must be able to find the id of the page containing the record, given the id of the record
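
The operations listed above can be sketched with an in-memory stand-in for a heap file. The rid layout (page number, slot number) is an assumption made for illustration, since the slides only require that the page can be found from the rid; the class HeapFile and the fixed number of slots per page are likewise hypothetical.

```python
class HeapFile:
    """Unordered file of records; a rid is assumed to be (page_no, slot_no)."""

    def __init__(self, slots_per_page=2):
        self.slots_per_page = slots_per_page
        self.pages = []  # each page is a list of slots; None marks a free slot

    def insert(self, record):
        """Store the record in the first free slot and return its rid."""
        for page_no, page in enumerate(self.pages):
            for slot_no, slot in enumerate(page):
                if slot is None:
                    page[slot_no] = record
                    return (page_no, slot_no)
        # No free slot anywhere: allocate a new page.
        self.pages.append([record] + [None] * (self.slots_per_page - 1))
        return (len(self.pages) - 1, 0)

    def get(self, rid):
        page_no, slot_no = rid  # the rid directly yields the page id
        return self.pages[page_no][slot_no]

    def delete(self, rid):
        page_no, slot_no = rid
        self.pages[page_no][slot_no] = None

    def scan(self):
        """Yield all records, in no particular logical order."""
        for page in self.pages:
            for record in page:
                if record is not None:
                    yield record

hf = HeapFile()
r1, r2, r3 = hf.insert("a"), hf.insert("b"), hf.insert("c")
hf.delete(r2)
r4 = hf.insert("d")  # reuses the slot freed by deleting "b"
```

Note that scanning returns the records in storage order ("a", "d", "c" here), not in insertion order, which is what "unordered file" means in practice.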


- we must keep track of the pages in each heap file in order to support scans, and we must keep track of the pages that contain free space in order to implement insertion efficiently


Linked List of Pages

- heap file is maintained as a doubly linked list of pages
- DBMS can remember where the first page is located by maintaining a table containing pairs of {heap file name, page_1_addr} in a known location on disk
- we call the first page of the heap file the header page
- disadvantage of this scheme is that virtually all pages in a file will be on the free list if records are of variable length
  - because it is likely that every page has at least a few free bytes
- to insert a typical record, we must retrieve and examine several pages on the free list before we find one with enough free space
- directory-based heap file organization addresses this problem


Directory of Pages

- DBMS must remember where the first directory page of each heap file is located
- directory is itself a collection of pages, for example organized as a linked list
- free space can be managed by maintaining a count per entry, indicating the amount of free space on the page
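
The benefit of per-entry free-space counts can be sketched as follows: to place a record, only the directory is examined, not the data pages themselves. The class PageDirectory, the first-fit choice of page, and the byte counts are assumptions made for this example.

```python
class PageDirectory:
    """One entry per data page, holding that page's free-space count."""

    def __init__(self, page_size):
        self.page_size = page_size
        self.free_bytes = []  # entry i = free bytes on data page i

    def insert(self, record_len):
        """Return the number of the page chosen for a record of this length."""
        if record_len > self.page_size:
            raise ValueError("record does not fit on a page")
        for page_no, free in enumerate(self.free_bytes):
            if free >= record_len:  # first fit, using the directory only
                self.free_bytes[page_no] -= record_len
                return page_no
        # No existing page has room: allocate a new page.
        self.free_bytes.append(self.page_size - record_len)
        return len(self.free_bytes) - 1

directory = PageDirectory(page_size=100)
pages_used = [directory.insert(n) for n in (60, 50, 30, 45)]
```

The 50-byte record does not fit on page 0 (40 bytes free), so a second page is allocated, while the 30-byte record still fits on page 0. None of these decisions requires reading a data page.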


Indexes

- sometimes we want to find all records that have a given value in a particular field
- if we can find the rids of all such records, we can locate the page containing each record from the record's rid
- heap file organization does not help us to find these rids
- an index is an auxiliary data structure that is intended to help us find rids of records that meet a selection condition
- an index is just another kind of file, containing records that direct traffic on requests for data records
- an index has an associated search key, which is a collection of one or more fields of the file of records for which we are building the index
- we sometimes refer to the file of records as the indexed file


- records stored in the index file, which we refer to as entries to avoid confusion with data records, allow us to find data records with a given search key value
- an index might contain {field value, rid} pairs, where rid identifies a data record
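
The idea of {field value, rid} entries can be illustrated with a small in-memory sketch. A Python dictionary stands in for the index file, and the sample records, the (page, slot) form of the rids, and the choice of age as search key are all made up for the example.

```python
from collections import defaultdict

# Data records, addressed by rid (assumed here to be (page_no, slot_no)).
records = {
    (0, 0): {"name": "Ann", "age": 30},
    (0, 1): {"name": "Bob", "age": 25},
    (1, 0): {"name": "Cid", "age": 30},
}

# Index on the search key "age": each entry pairs a field value with rids.
age_index = defaultdict(list)
for rid, record in records.items():
    age_index[record["age"]].append(rid)

# Selection "age = 30": look up rids in the index, then fetch the records.
matching_rids = age_index.get(30, [])
matches = [records[rid]["name"] for rid in matching_rids]
```

The lookup touches only the entries for the requested key value and then follows the rids, instead of scanning every data record.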


Index access methods

- pages in the index file are organized in some way that allows us to quickly locate those entries in the index that have a given search key value, and then follow the rids in the retrieved entries
- organization techniques, or data structures, for index files are called access methods, and several are known, including B+ trees and hash-based structures
- B+ tree index files and hash-based index files are built using the page allocation and manipulation facilities provided by the disk space manager, just like heap files



Thank you for your kind attention!

