CompSci 516: Data Intensive Computing Systems
Lecture 5, 6, 7: Storage and Indexing
Instructor: Sudeepa Roy
Duke CS, Fall 2016 — CompSci 516: Data Intensive Computing Systems
Announcements

• Homework 1 – due on September 16 (Friday), 11:55 pm
• Homework 2: AWS account setup instructions
  – carefully use the credit
  – remember to turn off instances (to avoid additional charges)
• Conflict with the CS graduate students retreat for the September 30 (Friday) class
  – Midterm moved to October 12 (Wednesday)
  – A Piazza poll will be posted today for a make-up class for those who are participating; for others, there will be a class on Sept 30
  – Please fill it out by tomorrow
Where are we now?

We learnt
✓ Relational Model and Query Languages
✓ SQL, RA, RC
✓ Postgres (DBMS) – HW1
✓ Map-reduce and Spark – HW2

Next
• DBMS Internals
  – Storage
  – Indexing
  – Query Evaluation
  – Operator Algorithms
  – External sort
  – Query Optimization
Reading Material

• [RG]
  – Storage: Chapters 8.1, 8.2, 8.4, 9.4–9.7
  – Index: 8.3, 8.5
  – Tree-based index: Chapter 10.1–10.7
  – Hash-based index: Chapter 11

Additional reading
• [GUW]
  – Chapters 8.3, 14.1–14.4
Acknowledgement: The following slides have been created by adapting the instructor material of the [RG] book provided by the authors Dr. Ramakrishnan and Dr. Gehrke.
What will we learn?

• How does a DBMS organize files?
  – Record format, page format
• What is an index?
• What are different types of indexes?
  – Tree-based indexing:
    • B+ tree
    • insert, delete
  – Hash-based indexing
    • Static and dynamic (extendible hashing, linear hashing)
• How do we use an index to optimize performance?
Storage
DBMS Architecture

• A typical DBMS has a layered architecture
• The figure does not show the concurrency control and recovery components
  – to be done in “transactions”
• This is one of several possible architectures
  – each system has its own variations

[Figure: the layers, top to bottom –
  Query Parsing, Optimization, and Execution
  Relational Operators
  Files and Access Methods
  Buffer Management
  Disk Space Management
  DB (on disk)]

These layers must consider concurrency control and recovery
Data on External Storage

• Data must persist on disk across program executions in a DBMS
  – Data is huge
  – Must persist across executions
  – But has to be fetched into main memory when the DBMS processes the data
• The unit of information for reading data from disk, or writing data to disk, is a page
• Disks: can retrieve a random page at fixed cost
  – But reading several consecutive pages is much cheaper than reading them in random order
Disk Space Management

• The lowest layer of DBMS software manages space on disk
• Higher levels call upon this layer to:
  – allocate/de-allocate a page
  – read/write a page
• Size of a page = size of a disk block = data unit
• A request for a sequence of pages is often satisfied by allocating contiguous blocks on disk
• Space on disk is managed by the Disk-Space Manager
  – Higher levels don’t need to know how this is done, or how free space is managed
Buffer Management

Suppose
• 1 million pages in the db, but only space for 1000 in memory
• A query needs to scan the entire file
• The DBMS has to
  – bring pages into main memory
  – decide which existing pages to replace to make room for a new page – called the Replacement Policy
• Managed by the Buffer Manager
  – Files and access methods ask the buffer manager to access a page, mentioning the “record id” (soon)
  – The buffer manager loads the page if it is not already there
Buffer Management

• Data must be in RAM for the DBMS to operate on it
• A table of <frame#, page id> pairs is maintained
• Buffer pool = main memory partitioned into frames; each frame either contains a page from disk or is a free frame

[Figure: page requests from higher levels go to the BUFFER POOL in main memory; disk pages move between the DB on disk and free frames; the choice of frame is dictated by the replacement policy]
When a Page is Requested...

For every frame, store
• a dirty bit:
  – whether the page has been modified since it was brought into memory
  – initially 0 (off)
• a pin-count:
  – the number of times the page has been requested but not released (i.e., the number of current users)
  – initially 0
  – when the page is requested, the count is incremented
  – when the requestor releases the page, the count is decremented
  – the buffer manager only reads a new page into a frame whose pin-count is 0
  – if there is no page with pin-count 0, the buffer manager has to wait (or a transaction is aborted – later)
When a Page is Requested...

• Check if the page is already in the buffer pool
• If yes, increment the pin-count of that frame
• If no,
  – Choose a frame for replacement using the replacement policy
  – If the chosen frame is dirty (has been modified), write it to disk
  – Read the requested page into the chosen frame
• Pin the page (increase its pin-count) and return its address to the requestor
• If requests can be predicted (e.g., sequential scans), pages can be pre-fetched several pages at a time
• Concurrency control & recovery may entail additional I/O when a frame is chosen for replacement
  – e.g., the Write-Ahead Log protocol: when we do Transactions
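The request path above can be sketched in Python. This is an illustrative toy, not any real DBMS's API (the class and method names are invented): a page table of frames carrying pin-counts and dirty bits, with a trivial "first unpinned frame" rule standing in for a real replacement policy.

```python
class FakeDisk:
    """Stand-in for the disk space management layer (illustrative only)."""
    def __init__(self):
        self.store = {}
        self.writes = 0
    def read(self, page_id):
        return self.store.get(page_id, b"")
    def write(self, page_id, data):
        self.store[page_id] = data
        self.writes += 1

class BufferManager:
    def __init__(self, num_frames):
        self.frames = {}              # page_id -> {data, pin_count, dirty}
        self.capacity = num_frames

    def request_page(self, page_id, disk):
        if page_id in self.frames:                # already buffered: just pin
            self.frames[page_id]["pin_count"] += 1
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:     # must evict a frame
            victim = next((pid for pid, f in self.frames.items()
                           if f["pin_count"] == 0), None)
            if victim is None:
                raise RuntimeError("all frames pinned")
            vf = self.frames.pop(victim)
            if vf["dirty"]:                       # write back before reuse
                disk.write(victim, vf["data"])
        frame = {"data": disk.read(page_id), "pin_count": 1, "dirty": False}
        self.frames[page_id] = frame
        return frame

    def release_page(self, page_id, modified=False):
        frame = self.frames[page_id]
        frame["pin_count"] -= 1                   # unpin
        frame["dirty"] = frame["dirty"] or modified
```

With a 2-frame pool, requesting a third page evicts an unpinned page, and the write-back happens only if that page was released as modified.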
Buffer Replacement Policy

• A frame is chosen for replacement by a replacement policy
• Least-Recently-Used (LRU)
  – add frames with pin-count 0 to the end of a queue
  – choose from the head
• Clock – an efficient approximation of LRU
  – number the frames 1 to N (N = #frames)
  – choose the next frame with pin-count 0 as a “hand” sweeps them circularly
• First-In-First-Out (FIFO)
• Most-Recently-Used (MRU), etc.
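One common clock variant can be sketched as follows (the function name is invented for illustration): each frame carries a reference bit that is set when the page is used; the hand sweeps the frames circularly, giving set bits a "second chance" by clearing them, and evicts the first unpinned frame whose bit is already clear.

```python
def clock_choose_victim(frames, hand):
    """frames: list of dicts with 'pin_count' and 'ref_bit'.
    Returns (victim_index, new_hand); raises if every frame is pinned."""
    for _ in range(2 * len(frames)):      # at most two full sweeps needed
        f = frames[hand]
        if f["pin_count"] == 0:
            if f["ref_bit"] == 0:
                return hand, (hand + 1) % len(frames)
            f["ref_bit"] = 0              # second chance: clear and move on
        hand = (hand + 1) % len(frames)
    raise RuntimeError("all frames pinned")
```

Unlike true LRU, clock needs no queue maintenance on every page access — only a bit set — which is why it is the cheap approximation used in practice.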
Buffer Replacement Policy

• The policy can have a big impact on the number of I/Os
• Depends on the access pattern
• Sequential flooding: a nasty situation caused by LRU + repeated sequential scans
  – What happens with 10 frames and 9 pages?
  – What happens with 10 frames and 11 pages?
  – #buffer frames < #pages in file means each page request in each scan causes an I/O
  – MRU is much better in this situation (but not in all situations, of course)
DBMS vs. OS File System

• Operating systems do disk space and buffer management too: why not let the OS manage these tasks?
• The DBMS can predict its page reference patterns much more accurately
  – can optimize
  – adjust the replacement policy
  – pre-fetch pages – already in buffer + contiguous allocation
  – pin a page in the buffer pool, force a page to disk (important for implementing concurrency control & recovery for Transactions)
• Differences in OS support: portability issues
• Some limitations, e.g., files can’t span disks
Files of Records

• A page or block is OK when doing I/O, but higher levels of a DBMS operate on records, and files of records
• FILE: a collection of pages, each containing a collection of records
• Must support:
  – insert/delete/modify record
  – read a particular record (specified using record id)
  – scan all records (possibly with some conditions on the records to be retrieved)
File Organization

• File organization: method of arranging a file of records on external storage
  – One file can have multiple pages
  – The record id (rid) is sufficient to physically locate the page containing the record on disk
  – Indexes are data structures that allow us to find the record ids of records with given values in index search key fields
• NOTE: several uses of “keys” in a database
  – Primary/foreign/candidate/super keys
  – Index search keys
Alternative File Organizations

Many alternatives exist, each ideal for some situations, and not so good in others:
• Heap (random order) files: suitable when typical access is a file scan retrieving all records
• Sorted files: best if records must be retrieved in some order, or only a “range” of records is needed
• Indexes: data structures to organize records via trees or hashing
  – Like sorted files, they speed up searches for a subset of records, based on values in certain (“search key”) fields
  – Updates are much faster than in sorted files
Unordered (Heap) Files

• Simplest file structure: contains records in no particular order
• As the file grows and shrinks, disk pages are allocated and de-allocated
• To support record-level operations, we must:
  – keep track of the pages in a file
  – keep track of free space on pages
  – keep track of the records on a page
• There are many alternatives for keeping track of this
Heap File Implemented as a List

• The header page id and Heap file name must be stored someplace
• Each page contains 2 ‘pointers’ plus data
• Problem:
  – to insert a new record, we may need to scan several pages on the free list to find one with sufficient space

[Figure: a header page links to two doubly linked lists of data pages – pages with free space, and full pages]
Heap File Using a Page Directory

• The entry for a page can include the number of free bytes on the page
• The directory is a collection of pages
  – a linked list implementation of the directory is just one alternative
  – much smaller than a linked list of all heap file pages!

[Figure: a header page starts the DIRECTORY, whose entries point to data pages 1..N]
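The point of the directory is that an insert scans directory entries, not data pages, to find room. A minimal sketch (names illustrative):

```python
def find_page_for_insert(directory, record_size):
    """directory: list of (page_id, free_bytes) entries.
    Return the id of a page with enough free space, or None."""
    for page_id, free_bytes in directory:
        if free_bytes >= record_size:
            return page_id
    return None   # caller allocates a fresh page and adds a directory entry
```

Because each directory entry is a few bytes, thousands of entries fit on one directory page, so this scan touches far fewer pages than walking a free list of data pages.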
How do we arrange a collection of records on a page?

• Each page contains several slots – one for each record
• A record is identified by <page-id, slot-number>
• Fixed-length records
• Variable-length records
• For both, there are options for
  – Record formats (how to organize the fields within a record)
  – Page formats (how to organize the records within a page)
Page Formats: Fixed-Length Records

• Record id = <page id, slot #>
• Packed: moving records for free space management changes the rid; may not be acceptable
• Unpacked: use a bitmap – scan the bit array to find an empty slot
• Each page may also contain additional info, like the id of the next page (not shown)

[Figure: PACKED – slots 1..N stored contiguously, followed by free space, with the number of records N at the end of the page; UNPACKED, BITMAP – slots 1..M with a bitmap marking which slots are occupied, and the number of slots M]
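The unpacked/bitmap layout can be sketched in a few lines (toy code, names invented): a bit per slot records occupancy, so a delete just clears a bit and never moves records — rids of the form (page id, slot number) stay stable.

```python
def insert_into_page(bitmap, record, slots):
    """bitmap: list of 0/1 flags; slots: record storage of the same length.
    Scan the bit array for an empty slot; return its number, or None if full."""
    for i, used in enumerate(bitmap):
        if not used:
            bitmap[i] = 1
            slots[i] = record
            return i
    return None

def delete_from_page(bitmap, slot_no):
    bitmap[slot_no] = 0   # record bytes may stay; the slot becomes reusable
```

Contrast with the packed layout, where deleting slot i would shift later records down and silently change their rids.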
Page Formats: Variable-Length Records

• Need to find a page with the right amount of space
  – Too small – cannot insert
  – Too large – waste of space
• If a record is deleted, need to move the records so that all free space is contiguous
  – need the ability to move records within a page
• Can maintain a directory of slots (next slide)
  – <record-offset, record-length>
  – deletion = set record-offset to -1
• The record id rid = <page, slot-in-directory> remains unchanged
Page Formats: Variable-Length Records

• Can move records on the page without changing the rid
  – so, attractive for fixed-length records too
• Store (record-offset, record-length) in each slot
• rids are unaffected by rearranging records within a page

[Figure: page i with records addressed by rids (i, 1), (i, 2), ..., (i, N); the SLOT DIRECTORY at the end of the page holds one (offset, length) entry per slot, the number of slots N, and a pointer to the start of free space]
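The slot directory idea can be sketched as a toy class (illustrative names; a real page would pack the directory into the page's own bytes): record bytes grow from one end, (offset, length) slots from the other, and compaction only rewrites offsets, never rids.

```python
class SlottedPage:
    def __init__(self):
        self.data = b""
        self.slots = []                      # slot directory: [offset, length]

    def insert(self, record: bytes) -> int:
        self.slots.append([len(self.data), len(record)])
        self.data += record
        return len(self.slots) - 1           # slot number = rid within page

    def delete(self, slot_no):
        self.slots[slot_no] = [-1, 0]        # offset -1 marks deletion

    def get(self, slot_no) -> bytes:
        off, length = self.slots[slot_no]
        if off == -1:
            raise KeyError("deleted record")
        return self.data[off:off + length]
```

Because callers hold only (page, slot) pairs, the page is free to shuffle its bytes internally to keep free space contiguous.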
Record Formats: Fixed Length

• Each field has a fixed length
  – the same for all records
  – the number of fields is also fixed
  – fields can be stored consecutively
• Information about field types is the same for all records in a file
  – stored in the system catalogs
• Finding the i-th field does not require a scan of the record
  – given the address of the record, the address of a field can be obtained easily

[Figure: fields F1..F4 of lengths L1..L4 stored consecutively from base address B; the address of F3 is B + L1 + L2]
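The address arithmetic in the figure is just a prefix sum of field lengths — a one-line sketch:

```python
def field_address(base, lengths, i):
    """Start address of field i (0-based), given fixed field lengths.
    E.g. field 2 starts at B + L1 + L2, as in the figure above."""
    return base + sum(lengths[:i])
```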
Record Formats: Variable Length

• Cannot use fixed-length slots for records
• Two alternative formats (# fields is fixed):
  1. use delimiters: fields delimited by special symbols ($), preceded by a field count
  2. use an array of field offsets at the start of each record
• The second offers direct access to the i-th field, efficient storage of NULLs (the special don’t know value), and small directory overhead
• Modification may be costly (it may grow a field so the record no longer fits in the page)
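The second format can be sketched as follows (toy encoding, names invented): the offset array stores where each field ends, so field i is always `payload[offsets[i]:offsets[i+1]]`, and a NULL/empty field costs only a repeated offset.

```python
def encode_record(fields):
    """fields: list of bytes values. Returns (offsets, payload)."""
    offsets, payload = [0], b""
    for f in fields:
        payload += f
        offsets.append(len(payload))     # offset of the *end* of each field
    return offsets, payload

def get_field(offsets, payload, i):
    """Direct access to field i -- no scan past delimiters needed."""
    return payload[offsets[i]:offsets[i + 1]]
```

With the delimiter format, by contrast, reading field i requires scanning past i delimiter symbols.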
Lecture 5 ends here
Indexes
Lecture 6 starts here

Announcements

• Homework 1 – due TODAY: September 16 (Friday), 11:55 pm
• Conflict with the CS graduate students retreat for the September 30 (Friday) class
  – Midterm moved to October 12 (Wednesday)
  – Regular class on September 30
  – Makeup lecture on September 29 (Thursday), 4:40–5:55 pm, LSRC D309: only for students going to the CS grad retreat (the room accommodates ~10 people)
Indexes

• An index on a file speeds up selections on the search key fields for the index
  – Any subset of the fields of a relation can be the search key for an index on the relation
  – “Search key” is not the same as “key” (key = minimal set of fields that uniquely identify a tuple)
• An index contains a collection of data entries, and supports efficient retrieval of all data entries k* with a given key value k
Alternatives for Data Entry k* in Index

• In a data entry k* we can store:
  1. (Alternative 1) the actual data record with key value k, or
  2. (Alternative 2) <k, rid>
     • rid = record id of the data record with search key value k, or
  3. (Alternative 3) <k, rid-list>
     • list of record ids of data records with search key k
• The choice of alternative for data entries is orthogonal to the indexing technique used to locate data entries with a given key value k
Alternatives for Data Entries: Alternative 1

• The index structure is a file organization for data records
  – instead of a Heap file or sorted file
• How many different indexes can use Alternative 1?
  – At most one index can use Alternative 1
  – Otherwise, data records are duplicated, leading to redundant storage and potential inconsistency
• If data records are very large, the number of pages with data entries is high
  – Implies the size of auxiliary information in the index is also large

Advantages/Disadvantages?
Alternatives for Data Entries: Alternatives 2, 3

• Data entries are typically much smaller than data records
  – So, better than Alternative 1 with large data records
  – Especially if search keys are small
• Alternative 3 is more compact than Alternative 2
  – but leads to variable-size data entries, even if search keys have fixed length

Advantages/Disadvantages?
Index Classification

• Primary vs. secondary
• Clustered vs. unclustered
• Tree-based vs. hash-based
Primary vs. Secondary Index

• If the search key contains the primary key, then it is called a primary index, otherwise secondary
  – Unique index: the search key contains a candidate key
• Duplicate data entries:
  – if they have the same value of the search key field k
  – a primary/unique index never has duplicates
  – other secondary indexes can have duplicates
Clustered vs. Unclustered Index

• If the order of data records in a file is the same as, or ‘close to’, the order of data entries in an index, then clustered, otherwise unclustered
  – Alternative 1 implies clustered
  – 2, 3 are typically unclustered
    • unless the file is sorted according to the search key
  – In practice, clustered also implies Alternative 1 (since sorted files are rare)
  – A file can be clustered on at most one search key
  – The cost of retrieving data records (range queries) through an index varies greatly based on whether the index is clustered or not
Clustered vs. Unclustered Index

• Suppose that Alternative (2) is used for data entries, and that the data records are stored in a Heap file
• To build a clustered index, first sort the Heap file
  – with some free space on each page for future inserts
  – Overflow pages may be needed for inserts
  – Thus, data records are ‘close to’, but not identical to, sorted

[Figure: CLUSTERED – index entries (index file) direct search to data entries whose order matches the order of the data records (data file); UNCLUSTERED – data entries point to data records in unrelated order]
Methods for Indexing

• Tree-based
• Hash-based
• (in detail later)
System Catalogs

• For each index:
  – structure (e.g., B+ tree) and search key fields
• For each relation:
  – name, file name, file structure (e.g., Heap file)
  – attribute name and type, for each attribute
  – index name, for each index
  – integrity constraints
• For each view:
  – view name and definition
• Plus statistics, authorization, buffer pool size, etc.
• (described in [RG] 12.1)

Catalogs are themselves stored as relations!
Remember Terminology

• Index search key (key): k
  – Used to search for a record
• Data entry: k*
  – Pointed to by k – this step is what the INDEX does
  – Contains record id(s) or the record itself
• Records or data
  – Actual tuples
  – Pointed to by record ids
Tree-based Index and B+-Tree
Range Searches

• “Find all students with gpa > 3.0”
  – If the data is in a sorted file, do a binary search to find the first such student, then scan to find the others
  – The cost of binary search can be quite high
Index file format

• Simple idea: create an “index file”
  – <first-key-on-page, pointer-to-page>, sorted on keys
• Can do binary search on the (smaller) index file
  – but it may still be expensive: apply this idea repeatedly

[Figure: an index file of keys k1, k2, ..., kN over data file pages 1..N; each index entry has the form P0, K1, P1, K2, P2, ..., Km, Pm]
Indexed Sequential Access Method (ISAM)

• Leaf pages contain data entries – also some overflow pages
• The DBMS organizes the layout of the index – a static structure
• If there are a number of inserts to the same leaf, a long overflow chain can be created
  – affects the performance

[Figure: non-leaf pages above primary leaf pages, which contain the data entries; overflow pages chain off the leaves]
B+ Tree

• The most widely used index
  – a dynamic structure
• Insert/delete at log_F N cost (= height of the tree)
  – F = fanout, N = no. of leaf pages
  – the tree is maintained height-balanced
• Minimum 50% occupancy
  – Each node contains d <= m <= 2d entries
  – The root contains 1 <= m <= 2d entries
  – The parameter d is called the order of the tree
• Supports equality and range-searches efficiently

[Figure: index entries (the index file, for direct search) above the data entries (the “sequence set”)]
B+ Tree Indexes

• Leaf pages contain data entries, and are chained (prev & next), sorted by search key
• Non-leaf pages have index entries, only used to direct searches:
  – each index entry has the form P0, K1, P1, K2, P2, ..., Km, Pm
Example B+ Tree

• Search begins at the root, and key comparisons direct it to a leaf
• Search for 5*, 15*, all data entries >= 24*, ...
  – Based on the search for 15*, we know it is not in the tree!

[Figure: root with keys 13, 17, 24, 30; leaves 2* 3* 5* 7* | 14* 16* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*]
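The root-level key comparisons can be sketched for the example tree above, flattened to a single root over a list of sorted leaves (toy code; it uses the convention that an entry equal to a separator key goes to the right child):

```python
import bisect

def bplus_search(root_keys, leaves, k):
    """Return the leaf that must contain k, if k is anywhere in the tree."""
    child = bisect.bisect_right(root_keys, k)   # key comparisons at the root
    return leaves[child]

# The example tree from the slide:
root = [13, 17, 24, 30]
leaves = [[2, 3, 5, 7], [14, 16], [19, 20, 22], [24, 27, 29], [33, 34, 38, 39]]
```

Searching for 15* lands in the leaf [14, 16]; since 15 is not there, it is nowhere in the tree — one root-to-leaf path answers both presence and absence.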
Example B+ Tree

• Find
  – 28*?
  – 29*?
  – All > 15* and < 30*

[Figure: root with key 17 (entries <= 17 go to the left subtree, entries > 17 to the right); left child with keys 5, 13; right child with keys 27, 30; leaves 2* 3* | 5* 7* 8* | 14* 16* | 22* 24* | 27* 29* | 33* 34* 38* 39*]

• Note how the data entries in the leaf level are sorted
B+ Trees in Practice

• Typical order: d = 100. Typical fill-factor: 67%
  – average fanout F = 133
• Typical capacities:
  – Height 4: 133^4 = 312,900,721 records
  – Height 3: 133^3 = 2,352,637 records
• Can often hold the top levels in the buffer pool:
  – Level 1 = 1 page = 8 KB
  – Level 2 = 133 pages = 1 MB
  – Level 3 = 17,689 pages = 133 MB
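The capacity figures follow directly from the fanout; a quick arithmetic check (note 133^4 is exactly 312,900,721):

```python
F = 133                         # average fanout at 67% fill of order d = 100

assert F ** 2 == 17_689         # level 3: pages that fit in ~133 MB
assert F ** 3 == 2_352_637      # records reachable at height 3
assert F ** 4 == 312_900_721    # records reachable at height 4
```

This is why a B+ tree over hundreds of millions of records is only 4 levels deep, and why caching the top 2–3 levels leaves roughly one disk I/O per lookup.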
Inserting a Data Entry into a B+ Tree

• Find the correct leaf L
• Put the data entry onto L
  – If L has enough space, done
  – Else, must split L
    • into L and a new node L2
    • Redistribute entries evenly, copy up the middle key
    • Insert an index entry pointing to L2 into the parent of L
• This can happen recursively
  – To split an index node, redistribute entries evenly, but push up the middle key
    • Contrast with leaf splits
• Splits “grow” the tree; a root split increases the height
  – Tree growth: gets wider, or one level taller at the top

See this slide later; first, see the examples on the next few slides
Inserting 8* into the Example B+ Tree

STEP 1

[Figure: the original tree – root with keys 13, 17, 24, 30; leaves 2* 3* 5* 7* | 14* 16* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*]

• 8* belongs in the leftmost leaf, which overflows and splits into 2* 3* and 5* 7* 8*
• The entry 5 is to be inserted in the parent node (note that 5 is copied up and continues to appear in the leaf)
• Copy-up: 5 appears in the leaf and in the level above
• Observe how minimum occupancy is guaranteed
Inserting 8* into the Example B+ Tree

STEP 2

• Need to split the parent: the index node now holds keys 5, 13, 17, 24, 30 – too many
• It splits into nodes 5, 13 and 24, 30, and the entry 17 is to be inserted in the parent (new root) node
  – Note that 17 is pushed up and appears only once in the index. Contrast this with a leaf split.
• Note the difference between copy-up and push-up. What is the reason for this difference?
  – All data entries must appear as leaves (for easy range search)
  – There is no such requirement for index entries (so avoid the redundancy)
Example B+ Tree After Inserting 8*

[Figure: new root with key 17; left child with keys 5, 13; right child with keys 24, 30; leaves 2* 3* | 5* 7* 8* | 14* 16* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*]

• Notice that the root was split, leading to an increase in height
• In this example, we could avoid the split by redistributing entries (insert 8* into the 2nd leaf node from the left and copy its new low key up instead of 13)
• However, this is usually not done in practice – it requires accessing 1–2 extra pages every time (for the siblings), and average occupancy may remain unaffected as the file grows
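The leaf-split/copy-up step of this example can be replayed with a toy two-level tree of order d = 2 (each leaf holds at most 2d = 4 entries). This is a sketch only: index-node splits (push-up) and root growth are deliberately omitted, so after the insert the parent is temporarily over-full, exactly the situation STEP 2 then resolves.

```python
import bisect

def insert_entry(parent_keys, leaves, k):
    """Insert k into a two-level B+ tree sketch, splitting the leaf on
    overflow and copying the middle key up into the parent's key list."""
    i = bisect.bisect_right(parent_keys, k)     # find the target leaf
    leaf = leaves[i]
    bisect.insort(leaf, k)
    if len(leaf) > 4:                           # overflow: split the leaf
        mid = len(leaf) // 2
        left, right = leaf[:mid], leaf[mid:]
        leaves[i:i + 1] = [left, right]
        bisect.insort(parent_keys, right[0])    # copy up: key stays in leaf
```

Running it on the slide's tree reproduces STEP 1: the leftmost leaf splits into [2, 3] and [5, 7, 8], and 5 appears both in its leaf and in the parent.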
Deleting a Data Entry from a B+ Tree

• Start at the root, find the leaf L where the entry belongs
• Remove the entry
  – If L is at least half-full, done!
  – If L has only d-1 entries,
    • Try to redistribute, borrowing from a sibling (an adjacent node with the same parent as L)
    • If redistribution fails, merge L and the sibling
• If a merge occurred, must delete the entry (pointing to L or the sibling) from the parent of L
• Merges could propagate to the root, decreasing the height
• (Recall: each non-root node contains d <= m <= 2d entries)

See this slide later; first, see the examples on the next few slides
Example Tree: Delete 19*

• We had inserted 8*
• Now delete 19*
• Easy

[Figure – before deleting 19*: root 17; left child with keys 5, 13; right child with keys 24, 30; leaves 2* 3* | 5* 7* 8* | 14* 16* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*]
Example Tree: Delete 19*

[Figure – after deleting 19*: the same tree, with the leaf 20* 22* in place of 19* 20* 22*]
Example Tree: Delete 20*

[Figure – before deleting 20*: root 17; left child with keys 5, 13; right child with keys 24, 30; leaves 2* 3* | 5* 7* 8* | 14* 16* | 20* 22* | 24* 27* 29* | 33* 34* 38* 39*]
Example Tree: Delete 20*

• < 2 entries in the leaf node
• Redistribute with the sibling

[Figure – after deleting 20*, step 1: the leaf 22* borrows 24* from its sibling; the leaves become 22* 24* | 27* 29*]
Example Tree: Delete 20*

• Notice how the middle key (27) is copied up

[Figure – after deleting 20*, step 2: root 17; left child with keys 5, 13; right child with keys 27, 30; leaves 2* 3* | 5* 7* 8* | 14* 16* | 22* 24* | 27* 29* | 33* 34* 38* 39*]
Example Tree: ... And Then Delete 24*

[Figure – before deleting 24*: root 17; left child with keys 5, 13; right child with keys 27, 30; leaves 2* 3* | 5* 7* 8* | 14* 16* | 22* 24* | 27* 29* | 33* 34* 38* 39*]
Example Tree: ... And Then Delete 24*

• Once again, imbalance at a leaf
• Can we borrow from the sibling(s)?
• No – d-1 and d entries (d = 2)
• Need to merge

[Figure – after deleting 24*, step 1: the leaves 22* and 27* 29*, under the index node with keys 27, 30, are about to be merged]
Example Tree: ... And Then Delete 24*

• Imbalance at the parent
• Merge again
• But need to “pull down” the root index entry
• Observe the ‘toss’ of the old index entry 27

[Figure – after deleting 24*, step 2: root 17; left child with keys 5, 13; right child with key 30 only; leaves 2* 3* | 5* 7* 8* | 14* 16* | 22* 27* 29* | 33* 34* 38* 39*]
Final Example Tree

• The root entry 17 must be pulled down into the merged node: the two index nodes contribute three index keys (5, 13, 30) but five pointers to leaves, so a fourth key is needed

[Figure – final tree: root with keys 5, 13, 17, 30; leaves 2* 3* | 5* 7* 8* | 14* 16* | 22* 27* 29* | 33* 34* 38* 39*]
Example of Non-leaf Re-distribution

• An intermediate tree is shown
• In contrast to the previous example, we can re-distribute an entry from the left child of the root to the right child

[Figure: root with key 22; left child with keys 5, 13, 17, 20; right child with key 30; leaves 2* 3* | 5* 7* 8* | 14* 16* | 17* 18* | 20* 21* | 22* 27* 29* | 33* 34* 38* 39*]
After Re-distribution

• Intuitively, entries are re-distributed by ‘pushing through’ the splitting entry in the parent node
  – It suffices to re-distribute the index entry with key 20; we’ve re-distributed 17 as well for illustration

[Figure: root with key 17; left child with keys 5, 13; right child with keys 20, 22, 30; leaves 2* 3* | 5* 7* 8* | 14* 16* | 17* 18* | 20* 21* | 22* 27* 29* | 33* 34* 38* 39*]
Duplicates

• First option:
  – The basic search algorithm assumes that all entries with the same key value reside on the same leaf page
  – If they do not fit, use overflow pages (like ISAM)
• Second option:
  – Several leaf pages can contain entries with a given key value
  – Search for the leftmost entry with the key value, and follow the leaf-sequence pointers
  – Needs a modification of the search algorithm
    • if k* = <k, rid>, several entries may have to be searched
    • or include the rid in k – this becomes a unique index, with no duplicates
    • if k* = <k, rid-list>, this is some solution, but if the list is long, again a single entry can span multiple pages
A Note on ‘Order’

• Order (d) – denotes minimum occupancy
  – replaced by a physical space criterion in practice (‘at least half-full’)
  – Index pages can typically hold many more entries than leaf pages
  – Variable-sized records and search keys mean different nodes will contain different numbers of entries
  – Even with fixed-length fields, multiple records with the same search key value (duplicates) can lead to variable-sized data entries (if we use Alternative (3))
Summary

• Tree-structured indexes are ideal for range-searches, and also good for equality searches
• ISAM is a static structure
  – Only leaf pages are modified; overflow pages are needed
  – Overflow chains can degrade performance unless the size of the data set and the data distribution stay constant
• The B+ tree is a dynamic structure
  – Inserts/deletes leave the tree height-balanced; log_F N cost
  – High fanout (F) means the depth is rarely more than 3 or 4
  – Almost always better than maintaining a sorted file
  – The most widely used index in database management systems because of its versatility
  – One of the most optimized components of a DBMS
Hash-based Index
Hash-Based Indexes

• Records are grouped into buckets
  – Bucket = primary page plus zero or more overflow pages
• Hashing function h:
  – h(r) = bucket in which (the data entry for) record r belongs
  – h looks at the search key fields of r
  – No need for “index entries” in this scheme
Example: Hash-based Index

[Figure: two hash-based indexes over an Employee file.
Left (Alternative 1): the Employee file itself is hashed on AGE with h1 – h1(AGE) = 00 holds (Smith, 44, 3000), (Jones, 40, 6003), (Tracy, 44, 5004); h1(AGE) = 01 holds (Ashby, 25, 3000), (Basu, 33, 4003), (Bristow, 29, 2007); h1(AGE) = 10 holds (Cass, 50, 5004), (Daniels, 22, 6003).
Right (Alternative 2): an auxiliary index – a file of <SAL, rid> pairs hashed on SAL with h2, with one bucket holding entries for salaries 3000, 3000, 5004, 5004 and another holding 4003, 2007, 6003, 6003.]
Introduction

• Hash-based indexes are best for equality selections
  – Find all records with name = “Joe”
  – Cannot support range searches
  – But useful in implementing relational operators like join (later)
• Static and dynamic hashing techniques exist
  – trade-offs similar to ISAM vs. B+ trees
Static Hashing

• Pages containing data = a collection of buckets
  – each bucket has one primary page, and possibly overflow pages
  – buckets contain data entries k*

[Figure: key → h → h(key) mod N selects one of the primary bucket pages 0..N-1; overflow pages chain off the primary pages]
Static Hashing

• # primary pages fixed
  – allocated sequentially, never de-allocated; overflow pages if needed
• h(k) mod N = bucket to which the data entry with key k belongs
  – N = # of buckets
Static Hashing

• The hash function works on the search key field of record r
  – Must distribute values over the range 0 ... N-1
  – h(key) = (a*key + b) usually works well
    • bucket = h(key) mod N
    • a and b are constants, chosen to tune h
• Advantages:
  – # buckets known – pages can be allocated sequentially
  – search needs 1 I/O (if there is no overflow page)
  – insert/delete needs 2 I/O (if there is no overflow page) (why 2?)
• Disadvantages:
  – Long overflow chains can develop if the file grows, and degrade performance
  – Or waste of space if the file shrinks
• Solutions:
  – keep pages only, say, 80% full initially
  – periodically rehash if there are overflow pages (can be expensive)
  – or use Dynamic Hashing
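A minimal sketch of static hashing with the h(key) = (a*key + b) family (the constants a, b, N below are arbitrary illustrative choices; each bucket list stands for a primary page plus its overflow chain):

```python
N, a, b = 4, 7, 3                   # illustrative constants, not tuned

def bucket_of(key):
    return (a * key + b) % N        # h(key) mod N

buckets = [[] for _ in range(N)]    # each list = primary page + overflow

def insert(key):
    buckets[bucket_of(key)].append(key)

def search(key):
    # one "I/O" if the bucket has no overflow pages
    return key in buckets[bucket_of(key)]
```

With these constants, 10, 14, and 22 all land in the same bucket — a miniature of the overflow-chain problem that motivates dynamic hashing.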
Dynamic Hashing Techniques

• Extendible Hashing
• Linear Hashing
Extendible Hashing

• Consider static hashing: a bucket (primary page) becomes full
• Why not re-organize the file by doubling the # of buckets?
  – Reading and writing all pages (double # pages) is expensive
• Idea: use a directory of pointers to buckets
  – double the # of buckets by doubling the directory, splitting just the bucket that overflowed
  – The directory is much smaller than the file, so doubling it is much cheaper
  – Only one page of data entries is split
  – No overflow page (a new bucket, not a new overflow page)
  – The trick lies in how the hash function is adjusted
Example

• The directory is an array of size 4
  – each element points to a bucket
  – # bits to represent 4 = log 4 = 2 = global depth
• To find the bucket for search key r, take the last global-depth # bits of h(r)
  – assume h(r) = r
  – if h(r) = 5 = binary 101, it is in the bucket pointed to by 01

[Figure: directory elements 00, 01, 10, 11 (global depth 2) pointing to data pages, each with local depth 2 – Bucket A: 4* 12* 32* 16*; Bucket B: 1* 5* 21*; Bucket C: 10*; Bucket D: 15* 7* 19*]
Example: Insert

• If the bucket is full, split it
  – allocate a new page
  – re-distribute

• Suppose we insert 13*
  – binary = 1101
  – bucket 01
  – it has space, so insert

[Figure: the same directory as before; Bucket B becomes 1* 5* 21* 13*]
Example: Insert

• Suppose we insert 20*
  – binary = 10100
  – bucket 00
  – already full (4* 12* 32* 16*)
• To split, consider the last three bits of 10100
  – the last two bits are the same (00) – the data entry will belong to one of these two buckets
  – the third bit distinguishes them
[Figure – after inserting 20*: the directory doubles to eight elements 000–111 (global depth 3). Bucket A now holds 32* 16*, and its new ‘split image’ A2 holds 4* 12* 20*; both have local depth 3. Bucket B (1* 5* 21* 13*), Bucket C (10*), and Bucket D (15* 7* 19*) keep local depth 2, with two directory elements pointing to each.]
Example

• Global depth: max # of bits needed to tell which bucket an entry belongs to
• Local depth: # of bits used to determine if an entry belongs to this bucket
  – also tells whether a directory doubling is needed while splitting
  – no directory doubling is needed when 9* = 1001 is inserted (LD < GD for its bucket)
When does a bucket split cause directory doubling?

• Before the insert, the local depth of the bucket = global depth
• The insert causes the local depth to become > the global depth
• The directory is doubled by copying it over and ‘fixing’ the pointers to the split image page
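The whole 20* example above can be replayed with a toy directory (a sketch with invented names, assuming h(r) = r and bucket capacity 4; overflow-after-split, skewed keys, and deletes are not handled):

```python
class ExtendibleDir:
    def __init__(self, global_depth=2, bucket_cap=4):
        self.gd = global_depth
        self.cap = bucket_cap
        # one bucket per directory slot initially; slots share bucket objects
        self.dir = [{"ld": global_depth, "keys": []}
                    for _ in range(2 ** global_depth)]

    def slot(self, key):
        return key % (2 ** self.gd)         # last gd bits of h(key) = key

    def insert(self, key):
        b = self.dir[self.slot(key)]
        if len(b["keys"]) < self.cap:
            b["keys"].append(key)
            return
        if b["ld"] == self.gd:              # full and LD == GD: double
            self.dir = self.dir + self.dir  # copy the directory over
            self.gd += 1
        b["ld"] += 1                        # split on one more bit
        img = {"ld": b["ld"], "keys": []}
        for i in range(len(self.dir)):      # fix pointers to the split image
            if self.dir[i] is b and (i >> (b["ld"] - 1)) & 1:
                self.dir[i] = img
        old = b["keys"] + [key]             # re-distribute the entries
        b["keys"] = [k for k in old if self.dir[self.slot(k)] is b]
        img["keys"] = [k for k in old if self.dir[self.slot(k)] is img]
```

Loading the slide's buckets and inserting 13* then 20* reproduces the figure: the directory doubles to global depth 3, Bucket A keeps 32* 16*, and the split image gets 4* 12* 20*.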
Comments on Extendible Hashing

• If the directory fits in memory, an equality search is answered with one disk access (to access the bucket); else two
  – A 100 MB file with 100 bytes/record and 4 KB page size contains 10^6 records (as data entries) and 25,000 directory elements; chances are high that the directory will fit in memory
  – The directory grows in spurts, and, if the distribution of hash values is skewed, the directory can grow large
  – Multiple entries with the same hash value cause problems
• Delete:
  – If removal of a data entry makes a bucket empty, it can be merged with its ‘split image’
  – If each directory element points to the same bucket as its split image, we can halve the directory
Lecture 6 ends here
Linear Hashing

• This is another dynamic hashing scheme – an alternative to Extendible Hashing
• LH handles the problem of long overflow chains
  – without using a directory
  – handles duplicates and collisions
  – very flexible w.r.t. the timing of bucket splits
Lecture 7 starts here
Linear Hashing: Basic Idea
• Use a family of hash functions h0, h1, h2, ...
– hi(key) = h(key) mod (2^i N)
– N = initial # buckets
– h is some hash function (range is not 0 to N-1)
– If N = 2^d0, for some d0, hi consists of applying h and looking at the last di bits, where di = d0 + i
• Note: hi(key) = h(key) mod (2^(d0+i))
– hi+1 doubles the range of hi
• if hi maps to M buckets, hi+1 maps to 2M buckets
• similar to directory doubling
– Suppose N = 32, d0 = 5
• h0 = h mod 32 (last 5 bits)
• h1 = h mod 64 (last 6 bits)
• h2 = h mod 128 (last 7 bits), etc.
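The N = 32, d0 = 5 example above can be checked numerically. The stand-in hash function `h` below is an arbitrary assumption; the slides only require that its range not be limited to 0..N-1:

```python
N = 32   # initial number of buckets, N = 2^d0 with d0 = 5

def h(key):
    # arbitrary stand-in hash with a wide range (not 0..N-1)
    return key * 2654435761 % (1 << 32)

def h_i(i, key):
    # h_i(key) = h(key) mod (2^i * N), i.e., the last (d0 + i) bits of h(key)
    return h(key) % ((1 << i) * N)

for key in (7, 123, 99991):
    assert h_i(0, key) == h(key) % 32    # h0: last 5 bits
    assert h_i(1, key) == h(key) % 64    # h1: last 6 bits
    assert h_i(2, key) == h(key) % 128   # h2: last 7 bits
    # h_{i+1} refines h_i: a bucket b under h_i maps to b or b + 2^i * N
    assert h_i(1, key) % 32 == h_i(0, key)
```

The last assertion is the "doubling the range" property: each h_i bucket splits into exactly two h_{i+1} buckets.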
Linear Hashing: Rounds
• Directory avoided in LH by using overflow pages, and choosing the bucket to split round-robin
• During round Level, only hLevel and hLevel+1 are in use
• The buckets from start to last are split sequentially
– this doubles the no. of buckets
• Therefore, at any point in a round, we have
– buckets that have been split
– buckets that are yet to be split
– buckets created by splits in this round
Overview of LH File
• In the middle of a round Level
– originally buckets 0 to NLevel
[Figure (on blackboard): buckets 0 to NLevel that existed at the beginning of this round – this is the range of hLevel; Next = bucket to be split next; if hLevel(r) is in the range Next to NLevel, no need to go further; if hLevel(r) is in the range 0 to Next-1 (buckets split in this round), must use hLevel+1(r) to decide whether the entry is in the `split image' bucket; `split image' buckets are created (through splitting of other buckets) in this round]
• Buckets 0 to Next-1 have been split
• Next to NLevel are yet to be split
• Round ends when all NR initial (for round R) buckets are split
• Search: To find the bucket for data entry r, find hLevel(r):
– If hLevel(r) is in the range `Next to NLevel', r belongs here
– Else, r could belong to bucket hLevel(r) or bucket hLevel(r) + NLevel
• apply hLevel+1(r) to find out
Linear Hashing: Insert
• Insert: Find bucket by applying hLevel / hLevel+1:
– If the bucket to insert into is full:
1. Add an overflow page and insert the data entry
2. Split the Next bucket and increment Next
• Note: We are going to assume that a split is `triggered' whenever an insert causes the creation of an overflow page, but in general, we could impose additional conditions for better space utilization ([RG], p. 380)
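Putting the search and insert rules together, here is a hedged Python sketch of a linear-hashed file. Primary pages and overflow chains are flattened into plain lists, and a split is triggered whenever an insert overflows a bucket, as assumed above; all names are my own, not the [RG] code:

```python
class LinearHash:
    """Toy linear-hashed file with round-robin splitting."""

    def __init__(self, n0=4, capacity=4):
        self.n0 = n0              # initial # buckets, a power of 2 (N0)
        self.capacity = capacity  # data entries per primary page
        self.level = 0            # current round
        self.next = 0             # next bucket to split, round-robin
        self.buckets = [[] for _ in range(n0)]

    def _h(self, i, key):
        # h_i(key) = h(key) mod (2^i * N0)
        return (hash(key) & 0x7FFFFFFF) % ((1 << i) * self.n0)

    def _bucket_of(self, key):
        b = self._h(self.level, key)
        if b < self.next:                    # already split this round:
            b = self._h(self.level + 1, key) # h_{Level+1} decides
        return b

    def lookup(self, key):
        return key in self.buckets[self._bucket_of(key)]

    def insert(self, key):
        b = self._bucket_of(key)
        overflow = len(self.buckets[b]) >= self.capacity
        self.buckets[b].append(key)   # list tail stands in for overflow pages
        if overflow:
            self._split()             # overflow triggers a split at Next

    def _split(self):
        # split bucket `next` (which need not be the bucket that overflowed)
        n_level = (1 << self.level) * self.n0
        old = self.buckets[self.next]
        self.buckets[self.next] = []
        self.buckets.append([])       # split image: bucket next + n_level
        for k in old:
            self.buckets[self._h(self.level + 1, k)].append(k)
        self.next += 1
        if self.next == n_level:      # round over: start a new one
            self.level += 1
            self.next = 0
```

Note that the bucket that overflows and the bucket that gets split are usually different; that is exactly the point of round-robin splitting.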
Example of Linear Hashing
Level = 0, N0 = 4 = 2^d0, d0 = 2, Next = 0
[Figure: h1/h0 bucket labels (000/00, 001/01, 010/10, 011/11) and primary pages – the actual contents of the linear hashed file: 32* 44* 36* | 9* 25* 5* | 14* 18* 10* 30* | 31* 35* 7* 11*; a data entry r with h(r) = 5 goes in the primary bucket page for 01. The h1 labels are for illustration only!]
• Insert 43* = 101011
• h0(43) = 11: bucket full
• Insert into an overflow page
• Need a split at Next (= 0)
• Entries in 00 are distributed to 000 and 100
Example of Linear Hashing
[Figure: before and after inserting 43* = 101011; Level = 0, N0 = 4 = 2^d0, d0 = 2 in both states. After the split, Next = 1; bucket 000 holds 32*, the split image 100 holds 44* 36*, and bucket 11 (31* 35* 7* 11*) gets an overflow page containing 43*]
• Next is incremented after the split
• Note the difference between the overflow page of 11 and the split image of 00 (000 and 100)
Example of Linear Hashing
[Figure: same state – Level = 0, N0 = 4 = 2^d0, d0 = 2, Next = 1, overflow page 43* on bucket 11]
• Search for 18* = 10010
– 10 is between Next (= 1) and 4: this bucket has not been split
– 18 should be here
• Search for 32* = 100000 or 44* = 101100
– between 0 and Next - 1
– need h1
• Not all insertions trigger a split
– insert 37* = 100101: has space
• Splitting at Next?
– no overflow bucket needed
– just copy at the image/original
• Next = NLevel - 1 and a split?
– start a new round
– increment Level
– Next reset to 0
Example of Linear Hashing
[Figure: before and after inserting 37* = 100101; Level = 0, N0 = 4 = 2^d0, d0 = 2, Next = 1; 37* goes into bucket 01 (25* 9* 5* 37*), which has space; the overflow page 43* on bucket 11 is unchanged]
• Not all insertions trigger a split
• Insert 37* = 100101
– has space
Example of Linear Hashing
[Figure: insert 29* = 11101 into full bucket 01 at Next = 1; the bucket is split into 001 (25* 9*) and its image 101 (5* 37* 29*) with no overflow page needed; Next becomes 2; Level = 0, N0 = 4 = 2^d0, d0 = 2]
• Splitting at Next?
– no overflow bucket needed
– just copy at the image/original
Example: End of a Round
[Figure: two states. Left – Level = 0, N0 = 4 = 2^d0, d0 = 2, Next = 3 (after inserting 22*, 66*, 34* – check yourself): buckets 000 (32*), 001 (9* 25*), 010 (66* 18* 10* 34*), 011 (31* 35* 7* 11* with overflow 43*), 100 (44* 36*), 101 (5* 37* 29*), 110 (14* 30* 22*). Right – insert 50* = 110010: bucket 010 overflows (50* goes to its overflow page), which splits the last bucket 11 and ends the round; Level = 1, N1 = 8 = 2^d1, d1 = 3, Next = 0; bucket 011 now holds 35* 11* with overflow 43*, and the new bucket 111 holds 31* 7*]
LH vs. EH
• They are very similar
– going from hi to hi+1 is like doubling the directory
– LH: avoids the explicit directory, clever choice of split
– EH: always splits – higher bucket occupancy
• Uniform distribution: LH has lower average cost
– no directory level
• Skewed distribution
– many empty/nearly empty buckets in LH
– EH may be better
Summary
• Hash-based indexes: best for equality searches, cannot support range searches
• Static Hashing can lead to long overflow chains
• Extendible Hashing avoids overflow pages by splitting a full bucket when a new data entry is to be added to it
– duplicates may still require overflow pages
– directory to keep track of buckets, doubles periodically
– can get large with skewed data; additional I/O if this does not fit in main memory
Summary
• Linear Hashing avoids a directory by splitting buckets round-robin, and using overflow pages
– overflow pages not likely to be long
– duplicates handled easily
• For hash-based indexes, a skewed data distribution is one in which the hash values of data entries are not uniformly distributed – bad
Selection of Indexes
Different File Organizations
Search key = <age, sal>
• Heap files
– random order; insert at end-of-file
• Sorted files
– sorted on <age, sal>
• Clustered B+ tree file
– search key <age, sal>
• Heap file with unclustered B+-tree index
– on search key <age, sal>
• Heap file with unclustered hash index
– on search key <age, sal>
We need to understand the importance of appropriate file organization and index
Possible Operations
• Scan
– fetch all records from disk to buffer pool
• Equality search
– find all employees with age = 23 and sal = 50
– fetch page from disk, then locate qualifying record in page
• Range selection
– find all employees with age > 35
• Insert a record
– identify the page, fetch that page from disk, insert record, write back to disk (possibly other pages as well)
• Delete a record
– similar to insert
Understanding the Workload
• A workload is a mix of queries and updates
• For each query in the workload:
– Which relations does it access?
– Which attributes are retrieved?
– Which attributes are involved in selection/join conditions? How selective are these conditions likely to be?
• For each update in the workload:
– Which attributes are involved in selection/join conditions? How selective are these conditions likely to be?
– The type of update (INSERT/DELETE/UPDATE), and the attributes that are affected
Choice of Indexes
• What indexes should we create?
– Which relations should have indexes? What field(s) should be the search key? Should we build several indexes?
• For each index, what kind of an index should it be?
– Clustered? Hash/tree?
More on Choice of Indexes
• One approach:
– Consider the most important queries
– Consider the best plan using the current indexes
– See if a better plan is possible with an additional index
– If so, create it
– Obviously, this implies that we must understand how a DBMS evaluates queries and creates query evaluation plans
– We will learn query execution and optimization later; for now, we discuss simple 1-table queries
• Before creating an index, must also consider the impact on updates in the workload!
• Trade-off: indexes can make queries go faster, updates slower. Require disk space, too.
Trade-offs for Indexes
• Indexes can make
– queries go faster
– updates slower
• Require disk space, too
Index Selection Guidelines
• Attributes in the WHERE clause are candidates for index keys
– exact match condition suggests hash index
– range query suggests tree index
– clustering is especially useful for range queries
• can also help on equality queries if there are many duplicates
• Try to choose indexes that benefit as many queries as possible
– since only one index can be clustered per relation, choose it based on important queries that would benefit the most from clustering
• Multi-attribute search keys should be considered when a WHERE clause contains several conditions
– order of attributes is important for range queries
• Note: clustered index should be used judiciously
– expensive updates, although cheaper than sorted files
Examples of Clustered Indexes
What is a good indexing strategy?

SELECT E.dno
FROM Emp E
WHERE E.age > 40

• B+ tree index on E.age can be used to get qualifying tuples
• How selective is the condition?
– if everyone is > 40, the index is not of much help; a scan is as good
– suppose 10% are > 40. Then?
• Depends on whether the index is clustered
– otherwise it can be more expensive than a linear scan
– if clustered, 10% I/O (+ index pages)
Examples of Clustered Indexes
What is a good indexing strategy? Group-By query:

SELECT E.dno, COUNT(*)
FROM Emp E
WHERE E.age > 10
GROUP BY E.dno

• Use E.age as search key?
– bad if many tuples have E.age > 10 or if the index is not clustered...
– ...using the E.age index and sorting the retrieved tuples by E.dno may be costly
• Clustered E.dno index may be better
– first group by, then count tuples with age > 10
– good when age > 10 is not too selective
• Note: the first option is good when the WHERE condition is highly selective (few tuples have age > 10); the second is good when it is not highly selective
Examples of Clustered Indexes
What is a good indexing strategy? Equality queries and duplicates:

SELECT E.dno
FROM Emp E
WHERE E.hobby='Stamps'

• Clustering on E.hobby helps
– hobby is not a candidate key; several matching tuples possible

SELECT E.dno
FROM Emp E
WHERE E.eid=50

• Does clustering help now? (eid = key)
– not much: at most one tuple satisfies the condition
Indexes with Composite Search Keys
• Composite search keys: search on a combination of fields
• Equality query: every field value is equal to a constant value, e.g. w.r.t. a <sal, age> index:
– age = 20 and sal = 75
• Range query: some field value is not a constant, e.g.:
– sal > 10
– <age, sal> does not help: the condition has to be on a prefix
[Figure: examples of composite key indexes using lexicographic order. Data records <name, age, sal> sorted by name: (bob, 12, 10), (cal, 11, 80), (joe, 12, 20), (sue, 13, 75). Data entries in the index sorted by <age, sal>: 11,80 12,10 12,20 13,75; sorted by <sal, age>: 10,12 20,12 75,13 80,11; sorted by <age>: 11 12 12 13; sorted by <sal>: 10 20 75 80]
Composite Search Keys
• To retrieve Emp records with age = 30 AND sal = 4000, an index on <age, sal> would be better than an index on age or an index on sal
– first find age = 30; among them search sal = 4000
• If the condition is: 20 < age < 30 AND 3000 < sal < 5000:
– clustered tree index on <age, sal> or <sal, age> is best
• If the condition is: age = 30 AND 3000 < sal < 5000:
– clustered <age, sal> index much better than <sal, age> index
– more index entries are retrieved for the latter
• Composite indexes are larger, updated more often
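The prefix rule can be checked with the data entries from the figure above; a small Python illustration where tuples stand in for data entries:

```python
# <age, sal> data entries from the figure, in lexicographic order
age_sal = [(11, 80), (12, 10), (12, 20), (13, 75)]
assert age_sal == sorted(age_sal)

# age = 12 AND sal > 5: the condition constrains a prefix (age), so all
# matches are contiguous in the index order -- the index helps
hits = [e for e in age_sal if e[0] == 12 and e[1] > 5]
assert hits == [(12, 10), (12, 20)]

# sal > 10 alone: matches are scattered in <age, sal> order (the index
# does not help), but contiguous in a <sal, age> index
sal_age = sorted((s, a) for a, s in age_sal)
assert sal_age == [(10, 12), (20, 12), (75, 13), (80, 11)]
assert [e for e in sal_age if e[0] > 10] == sal_age[1:]
```

The last two assertions show why attribute order matters for range queries: under <sal, age>, the entries with sal > 10 form one contiguous run.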
Index-Only Plans
• A number of queries can be answered without retrieving any tuples from one or more of the relations involved, if a suitable index is available

SELECT E.dno, COUNT(*)
FROM Emp E
GROUP BY E.dno
– index on <E.dno>

SELECT E.dno, MIN(E.sal)
FROM Emp E
GROUP BY E.dno
– tree index on <E.dno, E.sal>

SELECT AVG(E.sal)
FROM Emp E
WHERE E.age=25 AND E.sal BETWEEN 3000 AND 5000
– tree index on <E.age, E.sal>

• For index-only strategies, clustering is not important