CSC212 Data Structure - Section FG
Lecture 17: B-Trees and the Set Class
Instructor: Feng HU
Department of Computer Science, City College of New York
@ Feng HU, 2016


Topics

• Why B-Tree
  • The problem of an unbalanced tree
• The B-Tree Rules
• The Set Class ADT with B-Trees
• Search for an Item in a B-Tree
• Insert an Item in a B-Tree (*)
• Remove an Item from a B-Tree (*)

The problem of an unbalanced BST

• Maximum depth of a BST with n entries: n-1
  ❐ An example: insert 1, 2, 3, 4, 5 in that order into a bag using a BST
  ❐ Run BagTest!

[Figure: the resulting BST is a chain 1 → 2 → 3 → 4 → 5, each node being the right child of the previous one]

Worst-Case Times for BSTs

• Adding, deleting or searching for an entry in a BST with n entries is O(d) in the worst case, where d is the depth of the BST
• Since d is no more than n-1, the number of operations in the worst case is proportional to n-1.
• Conclusion: the worst-case time for the add, delete or search operation of a BST is O(n)

Solutions to the problem

• Solution 1
  • Periodically balance the search tree
  • Project 10.9, page 516
• Solution 2
  • A particular kind of tree: B-Tree
  • proposed by Bayer & McCreight in 1972

The B-Tree Basics

• Similar to a binary search tree (BST)
  • where the implementation requires the ability to compare two entries via a less-than operator (<)
• But a B-tree is NOT a BST; in fact it is not even a binary tree
  • B-tree nodes have many (more than two) children
• Another important property
  • each node contains more than just a single entry
• Advantages:
  • Easy to search, and not too deep

Applications: bag and set

• The Difference
  • an entry can occur two or more times in a bag, but at most once in a set
  • C++ STL: set and multiset (= bag)
• The B-Tree Rules for a Set
  • We will look at a "set formulation" of the B-Tree rules, but keep in mind that a "bag formulation" is also possible

The B-Tree Rules

• The entries in a B-tree node
  • B-tree Rule 1: The root may have as few as one entry (or 0 entries if it has no children); every other node has at least MINIMUM entries
  • B-tree Rule 2: The maximum number of entries in a node is 2*MINIMUM.
  • B-tree Rule 3: The entries of each B-tree node are stored in a partially filled array, sorted from the smallest to the largest.

The B-Tree Rules (cont.)

• The subtrees below a B-tree node
  • B-tree Rule 4: The number of subtrees below a non-leaf node with n entries is always n+1
  • B-tree Rule 5: For any non-leaf node:
    • (a) An entry at index i is greater than all the entries in subtree number i of the node
    • (b) An entry at index i is less than all the entries in subtree number i+1 of the node

An Example of B-Tree

[Figure: a B-tree node holding the entries 93 (index [0]) and 107 (index [1]), with three subtrees below it]
• subtree number 0: each entry < 93
• subtree number 1: each entry ∈ (93, 107)
• subtree number 2: each entry > 107

What kind of traversal can print a sorted list?

The B-Tree Rules (cont.)

• A B-tree is balanced
  • B-tree Rule 6: Every leaf in a B-tree has the same depth
• This rule ensures that a B-tree is balanced

Another Example, MINIMUM = 1

Can you verify that all 6 rules are satisfied?

[Figure: a B-tree of depth 2]
• root: 6
  • subtree 0: 2 and 4, with leaf children [1], [3], [5]
  • subtree 1: 9, with leaf children [7 and 8], [10]

The set ADT with a B-Tree

• Combine fixed-size arrays with linked nodes
  • data[]
  • *subset[]
• The number of entries varies
  • data_count
  • up to 200!
• The number of children varies
  • child_count
  • = data_count + 1?

set.h (p 528-529)

    template <class Item>
    class set
    {
    public:
        ... ...
        bool insert(const Item& entry);
        std::size_t erase(const Item& target);
        std::size_t count(const Item& target) const;
    private:
        // MEMBER CONSTANTS
        static const std::size_t MINIMUM = 200;
        static const std::size_t MAXIMUM = 2 * MINIMUM;
        // MEMBER VARIABLES
        std::size_t data_count;
        Item data[MAXIMUM+1];     // why +1? - for insert/erase
        std::size_t child_count;
        set *subset[MAXIMUM+2];   // why +2? - one more
    };

Invariant for the set Class

• The entries of a set are stored in a B-tree, satisfying the six B-tree rules.
• The number of entries in a node is stored in data_count, and the entries are stored in data[0] through data[data_count-1]
• The number of subtrees of a node is stored in child_count, and the subtrees are pointed to by the set pointers subset[0] through subset[child_count-1]

Search for an Item in a B-Tree

• Prototype:
  • std::size_t count(const Item& target) const;
• Post-condition:
  • Returns the number of items equal to the target
  • (either 0 or 1 for a set).

Searching for an Item: count

Start at the root.
1) Locate the first i so that !(data[i] < target) (i = data_count if there is no such i)
2) If (data[i] is target)
       return 1;
   else if (no children)
       return 0;
   else
       return subset[i]->count(target);

[Figure: the example B-tree]
• root: 6 and 17
  • subset[0]: 4, with leaf children [2 and 3], [5]
  • subset[1]: 12, with leaf children [10], [16]
  • subset[2]: 19 and 22, with leaf children [18], [20], [25]

search for 10: cout << count(10);
• at the root (6 and 17): i = 1, so continue in subset[1]
• at the node 12: i = 0, so continue in subset[0]
• at the leaf 10: i = 0, data[i] is target!

Insert an Item into a B-Tree

• Prototype:
  • bool insert(const Item& entry);
• Post-condition:
  • If an equal entry was already in the set, the set is unchanged and the return value is false.
  • Otherwise, entry was added to the set and the return value is true.

Insert an Item in a B-Tree

Start at the root.
1) Locate the first i so that !(data[i] < entry) (i = data_count if there is no such i)
2) If (data[i] is entry)
       return false; // no work!
   else if (no children)
       insert entry at i;
       return true;
   else
       return subset[i]->insert(entry);

[Figure: the example B-tree before the insertions]
• root: 6 and 17
  • subset[0]: 4, with leaf children [2 and 3], [5]
  • subset[1]: 12, with leaf children [10], [16]
  • subset[2]: 19 and 22, with leaf children [18], [20], [25]

insert(11); // MIN = 1 -> MAX = 2
• at the root (6 and 17): i = 1
• at the node 12: i = 0
• at the leaf 10: i = 1, since data[0] < entry, so put entry in data[1]; the leaf becomes 10 & 11

insert(1); // MIN = 1 -> MAX = 2
• at the root (6 and 17): i = 0
• at the node 4: i = 0
• at the leaf 2 and 3: i = 0, so put entry in data[0]; the leaf becomes 1, 2 and 3
• a node has MAX+1 = 3 entries!

Fix the node with MAX+1 entries
❶ split the node into two from the middle
❷ move the middle entry up

Note: This shall be done recursively... the recursive function returns the middle entry to the root of the subset.

[Figure: the B-tree after insert(11) and insert(1)]
• root: 6 and 17
  • subset[0]: 2 and 4, with leaf children [1], [3], [5]
  • subset[1]: 12, with leaf children [10 & 11], [16]
  • subset[2]: 19 and 22, with leaf children [18], [20], [25]

Inserting an Item into a B-Tree

• What if the node already has the MAXIMUM number of items?
• Solution: loose insertion (p 551-557)
  • A loose insert may result in MAX+1 entries in the root of a subset
  • Two steps to fix the problem:
    • fix it, but the problem may move to the root of the set
    • fix the root of the set

Erasing an Item from a B-Tree

• Prototype:
  • std::size_t erase(const Item& target);
• Post-condition:
  • If target was in the set, then it has been removed from the set and the return value is 1.
  • Otherwise the set is unchanged and the return value is zero.

Erasing an Item from a B-Tree

• Similarly, after a "loose erase", the root of a subset may have just MINIMUM-1 entries
• Solution: (p 557-562)
  • Fix the shortage of the subset root, but this may move the problem to the root of the entire set
  • Fix the root of the entire set (tree)

Summary

• A B-tree is a tree for storing entries following the six rules
• A B-tree is balanced: every leaf in a B-tree has the same depth
• Adding, erasing and searching for an item in a B-tree have worst-case time O(log n), where n is the number of entries
• However, the implementation of adding and erasing an item in a B-tree is not a trivial task.
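
The O(log n) bound can be motivated by a short counting argument, sketched here under the six rules. The symbols m (for MINIMUM, with m ≥ 1), d (depth of the tree, root at depth 0) and n ≥ 1 (number of entries) are notation introduced only for this sketch. The root has at least 1 entry and therefore at least 2 subtrees when it is not a leaf; every other node has at least m entries and, if it is not a leaf, at least m+1 subtrees; so level k ≥ 1 has at least 2(m+1)^{k-1} nodes.

```latex
% m = MINIMUM, d = depth of the B-tree, n = number of entries (n >= 1)
\begin{align*}
n &\ge 1 + m \sum_{k=1}^{d} 2\,(m+1)^{k-1} = 2\,(m+1)^{d} - 1 \\
d &\le \log_{m+1}\frac{n+1}{2} = O(\log n)
\end{align*}
```

Each operation visits at most d+1 nodes and does at most a constant amount of work (bounded by MAXIMUM) in each node, which gives the worst-case time stated above.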