DATA STRUCTURES: B-TREE
Jibrael Jos : Sep 2009
Avoid Taking Printout : Use RTF Outline in case needed 2
Agenda
Introduction
Multiway Trees
B Tree
Application
Structure
Algo : Insert / Delete
Data Structures
AVL Trees Red Black B-tree Hashing / Indexing Techniques Graphs
Path Has to be enjoyed
Walking Walking in Rain !! Certification
Effort ~ Satisfaction
Research
Shoulders of Giants
Research on an area to reach a level of expertise
Mindmap and Research Path
B Tree
Critic
Maths
Summation
Series
Variations
B*, B+
Application
Industry
Methodology
One Book to Another One Link to Another
Binary Search Tree
What happens if data is loaded in a binary search tree in this order
23, 32, 45, 11, 43 , 41
1,2,3,4,5,6,7,8
What is AVL tree
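A minimal sketch for trying the two load orders above. It assumes a plain, unbalanced binary search tree; the helper names `insert`, `height` and `build` are illustrative and not from the slides. The sorted sequence degenerates into a chain, which is the problem balanced trees such as AVL address.

```python
def insert(node, key):
    """Insert key into a plain (unbalanced) BST stored as nested dicts."""
    if node is None:
        return {"key": key, "left": None, "right": None}
    side = "left" if key < node["key"] else "right"
    node[side] = insert(node[side], key)
    return node


def height(node):
    """Number of levels on the longest root-to-leaf path."""
    if node is None:
        return 0
    return 1 + max(height(node["left"]), height(node["right"]))


def build(keys):
    """Load keys into an empty BST in the given order."""
    root = None
    for k in keys:
        root = insert(root, k)
    return root


mixed = build([23, 32, 45, 11, 43, 41])
ascending = build([1, 2, 3, 4, 5, 6, 7, 8])
print(height(mixed))      # 5
print(height(ascending))  # 8: every node has only a right child
```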
Multiway Trees
[Figure: node with keys K1 and K2; one subtree holds keys < K1, one holds keys >= K1 and < K2, one holds keys >= K2]
m-way trees
Reduce the depth of the tree to O(log_m n) with m-way trees
m children, m-1 keys per node
m = 10 : 10^6 keys in 6 levels vs 20 for a binary tree
but ........
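The level counts above can be checked with a short sketch. It assumes a simplified capacity model in which a tree with branching factor b and h levels covers about b^h keys; `levels_needed` is an illustrative name, not from the slides.

```python
def levels_needed(n_keys, branching):
    """Smallest number of levels h such that branching**h >= n_keys."""
    levels, capacity = 0, 1
    while capacity < n_keys:
        capacity *= branching
        levels += 1
    return levels


print(levels_needed(10**6, 10))  # 6
print(levels_needed(10**6, 2))   # 20
```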
[Figure: m-way tree in which every node holds the keys K1, K2, K3]
m-way trees
But you have to search through the m keys in each node!
Reduces your gain from having fewer levels!
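One common way to limit the per-node cost is to keep the keys of a node sorted and use binary search instead of a linear scan. The slides do not prescribe this; the sketch below is an assumption for illustration, and `search_node` is a hypothetical helper built on the standard `bisect` module.

```python
from bisect import bisect_left


def search_node(keys, target):
    """Search a node's sorted key list.

    Returns (found, index). When the key is not found, index is the
    position of the child subtree to descend into.
    """
    i = bisect_left(keys, target)
    found = i < len(keys) and keys[i] == target
    return found, i


print(search_node([50, 100, 150], 100))  # (True, 1)
print(search_node([50, 100, 150], 120))  # (False, 2)
```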
m-way trees
[Figure: example m-way tree with keys 50, 100, 150, 35, 45, 110, 120, 60, 70, 125, 135, 85, 95, 90, 75, 175]
Anand B
B-trees
All leaves are on the same level
All nodes except for the root and the leaves have
  at least m/2 children
  at most m children
Each node is at least half full of keys
BTREE
[Figure: B-tree with root (21, 102) and leaves (11, 14), (74, 78, 85, 97), (125, 135)]
Disk
1 track = 5000 chars
1 cylinder = 20 tracks
1 disk unit = 200 cylinders
Time Taken
Seek Time Latency Time Transmission Time
Overcoming Latency Time ??
72.5 + 0.05n milliseconds to read n chars
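A small sketch of the disk figures above, assuming the slide's numbers are taken at face value. The constant and function names are illustrative. It shows that the fixed 72.5 ms dominates small reads, which is why reading a whole block (node) per access pays off.

```python
CHARS_PER_TRACK = 5000
TRACKS_PER_CYLINDER = 20
CYLINDERS_PER_DISK = 200


def disk_capacity_chars():
    """Total characters on one disk unit."""
    return CHARS_PER_TRACK * TRACKS_PER_CYLINDER * CYLINDERS_PER_DISK


def read_time_ms(n_chars):
    """Slide's cost model: 72.5 ms fixed cost plus 0.05 ms per character."""
    return 72.5 + n_chars / 20  # n / 20 is the same as 0.05 * n


print(disk_capacity_chars())  # 20000000
print(read_time_ms(5000))     # 322.5 (one full track)
print(read_time_ms(100))      # 77.5
```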
3 level
Multiway Tree
M – ary tree
3 levels :
Cylinder , Track , Record : Index Seq (RDBMS)
Tables with less change
BTree
If level is 3 and m = 199, then what is N?
How many splits per insertion?
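A hedged sketch for the first question. It assumes "level" counts levels of nodes with the root at level 1, that a node of order m holds at most m - 1 keys, and that non-root nodes have at least ceil(m/2) subtrees while the root has at least 2. Other texts count levels differently, so treat the numbers as one possible reading. The function names are illustrative.

```python
def max_keys(order, levels):
    """Every node full: (order**levels - 1) / (order - 1) nodes, order - 1 keys each."""
    return order ** levels - 1


def min_keys(order, levels):
    """Root has 1 key and 2 subtrees; all other nodes are at minimum occupancy."""
    half = (order + 1) // 2  # ceil(order / 2) without floating point
    return 2 * half ** (levels - 1) - 1


print(max_keys(199, 3))  # 7880598
print(min_keys(199, 3))  # 19999
```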
Multiway Trees : Application
NDPL, Delhi : Electricity Billing
  3 lakh consumers
  Table indexed as BTREE
UCO Bank, Jaipur
  One DD takes 10 minutes to print
  Saviour : BTREE
B-trees - Insertion
B-tree property : block is at least half-full of keys
Insertion into block with m keys
  block overflows
  split block
  promote one key
  split parent if necessary
  if root is split, tree becomes one level deeper
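The insertion steps above can be sketched as follows. This is a minimal illustration, not the course's implementation: it assumes keys are unique integers, that a node of the given order holds at most order - 1 keys, and it uses a simplified `Node` with plain `keys` and `children` lists rather than the entry/rightPtr layout of the later slides.

```python
from bisect import bisect_left


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # empty list means leaf


def _insert(node, key, order):
    """Insert key below node; return (median, right_node) if node split, else None."""
    i = bisect_left(node.keys, key)
    if not node.children:
        node.keys.insert(i, key)            # leaf: place the key directly
    else:
        promoted = _insert(node.children[i], key, order)
        if promoted is None:
            return None
        median, right = promoted            # child split: absorb promoted key
        node.keys.insert(i, median)
        node.children.insert(i + 1, right)
    if len(node.keys) < order:              # at most order - 1 keys: no overflow
        return None
    mid = len(node.keys) // 2               # overflow: split around the median
    median = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return median, right


def insert(root, key, order):
    """Insert key and return the (possibly new) root."""
    promoted = _insert(root, key, order)
    if promoted is None:
        return root
    median, right = promoted
    return Node([median], [root, right])    # root split: tree is one level deeper


# Order-5 example: root (21, 102), leaves (11, 14), (74, 78, 85, 97), (125, 135)
root = Node([21, 102], [Node([11, 14]), Node([74, 78, 85, 97]), Node([125, 135])])
root = insert(root, 63, 5)
print(root.keys)                        # [21, 78, 102]
print([c.keys for c in root.children])  # [[11, 14], [63, 74], [85, 97], [125, 135]]
```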
Insert Node
[Figure: inserting 63 into the B-tree with root (21, 102) and leaves (11, 14), (74, 78, 85, 97), (125, 135)]
After Insert 63
[Figure: root (21, 78, 102) with leaves (11, 14), (63, 74), (85, 97), (125, 135)]
Insert Node
[Figure: inserting 99 into the B-tree with root (21, 102) and leaves (11, 14), (74, 78, 85, 97), (125, 135)]
After Insert 99
[Figure: root (21, 85, 102) with leaves (11, 14), (74, 78), (97, 99), (125, 135)]
Split Node
[Figure: full node with 4 entries (74, 78, 85, 97) and the new entry 63 to be inserted]
Structure of Btree
node
  firstPtr
  numEntries
  Entries[1.. M-1]
End
Entry
  key
  rightPtr
End Entry
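The node layout above could be mirrored in Python roughly as below. This is a sketch under assumptions: field names are the slide's names in snake_case, keys are taken to be integers, and `num_entries` is derived from the list length instead of being stored, which the slide's structure does not do.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    key: int
    right_ptr: Optional["Node"] = None  # subtree with keys greater than key


@dataclass
class Node:
    first_ptr: Optional["Node"] = None  # subtree with keys below the first entry
    entries: List[Entry] = field(default_factory=list)  # at most M - 1 entries

    @property
    def num_entries(self) -> int:
        return len(self.entries)


leaf = Node(entries=[Entry(74), Entry(78)])
print(leaf.num_entries)  # 2
```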
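Split Node : Final
[Figure: split after inserting 63: the median entry 78 is promoted, node keeps 63, 74, and the new node reached through rightPtr holds 85, 97 (2 entries); index labels: median, entry, toNdx, fromNdx]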
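Split Node : Final
[Figure: split after inserting 99: the median entry 85 is promoted, node keeps 74, 78, and the new node reached through rightPtr holds 97, 99 (2 entries); index labels: median, entry, toNdx, fromNdx]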
Traversal
[Figure: B-tree with root (21, 78) and leaves (11, 14), (42, 45, 63, 74), (85, 95)]
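An in-order traversal of a B-tree visits each subtree and then the key that follows it. The sketch below assumes a simplified node with `keys` and `children` lists, and the example tree is a reading of the slide's figure (root 21, 78), so its exact shape is an assumption.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []  # empty list means leaf


def traverse(node):
    """Yield every key of the B-tree in ascending order."""
    if not node.children:
        yield from node.keys
        return
    for i, key in enumerate(node.keys):
        yield from traverse(node.children[i])  # subtree left of this key
        yield key
    yield from traverse(node.children[-1])     # rightmost subtree


root = Node([21, 78], [Node([11, 14]), Node([42, 45, 63, 74]), Node([85, 95])])
print(list(traverse(root)))
# [11, 14, 21, 42, 45, 63, 74, 78, 85, 95]
```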
Agenda
Delete
Delete Walk Through
Reflow
Borrow Left
Borrow Right
Combine
Delete Mid
Delete : For 78
[Figure: B-tree with entry counts: root (42) 1; second level (16, 21) 2 and (57, 78) 2; leaves (45, 52) 2, (63, 74) 2, (85, 97) 2]
Call sequence: Btree Delete, Delete(), Delete(), Delete Mid(), Reflow(), Reflow(), If shorter delete root
Btree Delete
  If (root null)
    print (“Attempt to delete from null tree”)
  Else
    shorter = delete (root, target)
    if shorter
      delete root
  Return root
[Figure: tree before deletion: root (42) 1; (16, 21) 2 and (57, 78) 2; leaves (45, 52) 2, (63, 74) 2, (85, 97) 2. Target = 78. Call markers: B]
Delete (root, deleteKey)
  If (root null)
    data does not exist
  Else
    entryNdx = searchNode (root, deleteKey)
    if found entry to be deleted
      if leaf node
        underflow = deleteEntry()
      else
        underflow = deleteMid (left)
        if underflow
          underflow = reflow()
[Figure: tree before deletion: root (42) 1; (16, 21) 2 and (57, 78) 2; leaves (45, 52) 2, (63, 74) 2, (85, 97) 2. Target = 78. Call markers: B, D]
Delete : Else Part
    Else
      if deleteKey less than first entry
        subtree = firstPtr
      else
        subtree = rightPtr
      underflow = delete (subtree, deleteKey)
      if underflow
        underflow = reflow()
  Return underflow
[Figure: tree before deletion: root (42) 1; (16, 21) 2 and (57, 78) 2; leaves (45, 52) 2, (63, 74) 2, (85, 97) 2. Target = 78. Call markers: B, D]
Delete (root, deleteKey)
  If (root null)
    data does not exist
  Else
    entryNdx = searchNode (root, deleteKey)
    if found entry to be deleted
      if leaf node
        underflow = deleteEntry()
      else
        underflow = deleteMid (root, entryNdx, left)
        if underflow
          underflow = reflow (root, entryNdx)
[Figure: tree before deletion: root (42) 1; (16, 21) 2 and (57, 78) 2; leaves (45, 52) 2, (63, 74) 2, (85, 97) 2. Target = 78. Call markers: B, D, D, DM]
[Figure: 74 replaces 78: root (42) 1; (16, 21) 2 and (57, 74) 2; leaves (45, 52) 2, (63) 1, (85, 97) 2. Call markers: B, D, D]
[Figure: After Reflow: root (42) 1; (16, 21) 2 and (57) 1; leaves (45, 52) 2 and (63, 74, 85, 97) 4. Call markers: B, D, D]
Delete : Else Part
    Else
      if deleteKey less than first entry
        subtree = firstPtr
      else
        subtree = rightPtr
      underflow = delete (subtree, deleteKey)
      if underflow
        underflow = reflow (root, entryNdx)
  Return underflow
[Figure: Before Reflow: root (42) 1; (16, 21) 2 and (57) 1; leaves (45, 52) 2 and (63, 74, 85, 97) 4. Call markers: B, D]
[Figure: After Reflow: root has 0 entries; combined node (16, 21, 42, 57) 4; leaves (45, 52) 2 and (63, 74, 85, 97) 4. Call markers: B, D]
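[Figure: back in Btree Delete, shorter is true: root has 0 entries; node (16, 21, 42, 57) 4; leaves (45, 52) 2 and (63, 74, 85, 97) 4. Call markers: B]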
[Figure: after the empty root is deleted: new root (16, 21, 42, 57) 4; leaves (45, 52) 2 and (63, 74, 85, 97) 4. Call markers: B]
Delete : Reflow
1: Try to borrow right.
2: If 1 failed try to borrow from left
3: Cannot Borrow (1,2 failed) Combine
Delete Reflow
  Underflow = false
  If RT->no > min Entries
    BorrowRight (root, entryNdx, LT, RT)
  Else If LT->no > min Entries
    BorrowLeft (root, entryNdx, LT, RT)
  Else
    combine (root, entryNdx, LT, RT)
    if root->no < min entries
      underflow = True
  Return underflow
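The reflow logic above can be sketched in Python as follows. This is an illustration under assumptions: nodes use plain `keys` and `children` lists instead of the entry/rightPtr layout, `i` is the index of the parent key that separates the left tree (LT) from the right tree (RT), and the helper bodies are one plausible reading of BorrowRight, BorrowLeft and combine.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []  # empty list means leaf


def borrow_right(parent, i, lt, rt):
    """Separator drops into lt; rt's smallest key moves up to the parent."""
    lt.keys.append(parent.keys[i])
    parent.keys[i] = rt.keys.pop(0)
    if rt.children:
        lt.children.append(rt.children.pop(0))


def borrow_left(parent, i, lt, rt):
    """Separator drops into rt; lt's largest key moves up to the parent."""
    rt.keys.insert(0, parent.keys[i])
    parent.keys[i] = lt.keys.pop()
    if lt.children:
        rt.children.insert(0, lt.children.pop())


def combine(parent, i, lt, rt):
    """Merge lt + separator + rt into lt and remove rt from the parent."""
    lt.keys += [parent.keys.pop(i)] + rt.keys
    lt.children += rt.children
    parent.children.pop(i + 1)


def reflow(parent, i, min_entries):
    """Repair an underflow next to parent.keys[i]; True if parent now underflows."""
    lt, rt = parent.children[i], parent.children[i + 1]
    underflow = False
    if len(rt.keys) > min_entries:
        borrow_right(parent, i, lt, rt)
    elif len(lt.keys) > min_entries:
        borrow_left(parent, i, lt, rt)
    else:
        combine(parent, i, lt, rt)
        underflow = len(parent.keys) < min_entries
    return underflow


# Walk-through state after 74 replaced 78: node (57, 74) over (45, 52), (63), (85, 97)
parent = Node([57, 74], [Node([45, 52]), Node([63]), Node([85, 97])])
print(reflow(parent, 1, 2))               # True: parent is left with one key
print(parent.keys)                        # [57]
print([c.keys for c in parent.children])  # [[45, 52], [63, 74, 85, 97]]
```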
Borrow Left
[Figure: Borrow Left example around the parent entry 78, with subtrees labelled "Node >= 74 < 78" and "Node >= 78 < 85"]
Combine
[Figure: Combine, step 1. Nodes and entry counts: (21, 57, 78) 3; (42, 45) 2; (63) 1; (59, 61) 2; (65, 71) 2]
Combine
[Figure: Combine, step 2. Parent entry 57 is copied into the left node: (21, 57, 78) 3; (42, 45, 57) 3; (63) 1; (59, 61) 2; (65, 71) 2]
Combine
[Figure: Combine, step 3. Entry 63 is moved into the left node: (21, 57, 78) 3; (42, 45, 57, 63) 4; (59, 61) 2; (65, 71) 2]
Combine
[Figure: Combine, step 4. Entry 57 is removed from the parent: (21, 78) 2; (42, 45, 57, 63) 4; (59, 61) 2; (65, 71) 2]
Delete Mid
  If leaf
    exchange data and delete leaf entry
  Else
    traverse right to locate predecessor
    deleteMid (right)
    if underflow
      reflow
Delete Mid
[Figure: root (42) 1; (16, 21) 2 and (57, 78) 2; leaves (45, 52) 2, (63, 74) 2, (85, 97) 2]
Case 1: To delete 78 we replace it with 74
Delete Mid
[Figure: root (42) 1; (16, 21) 2 and (57, 78) 2; below (57, 78): (45, 52) 2, (63, 74) 2, (85, 97) 2; below (63, 74): leaf (75, 76) 2]
Case 2: To delete 78 we replace it with 76
Hence recursive call of Delete Mid to locate predecessor
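Locating the predecessor in both cases amounts to following the rightmost pointer from the left subtree down to a leaf. The sketch below shows only that search, not the exchange or the reflow. It assumes a simplified node with `keys` and `children` lists, and the sibling leaves (58, 60) and (65, 70) in Case 2 are hypothetical fillers that do not appear on the slide.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []  # empty list means leaf


def predecessor(left_subtree):
    """Largest key in left_subtree: keep taking the rightmost child to a leaf."""
    node = left_subtree
    while node.children:
        node = node.children[-1]
    return node.keys[-1]


# Case 1: the subtree left of 78 is the leaf (63, 74)
case1 = Node([63, 74])
print(predecessor(case1))  # 74

# Case 2: (63, 74) is internal and its rightmost child is the leaf (75, 76)
case2 = Node([63, 74], [Node([58, 60]), Node([65, 70]), Node([75, 76])])
print(predecessor(case2))  # 76
```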
Order
Order   Min                 Max
3       2                   3
4       2                   4
5       3                   5
6       3                   6
…       …                   …
m       m/2 (rounded up)    m
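The table can be generated with a short sketch, assuming Min and Max count subtrees of a non-root internal node and that a node has one key fewer than it has subtrees. The function names are illustrative.

```python
def subtree_bounds(order):
    """(min, max) subtrees of a non-root internal node for a given order."""
    return (order + 1) // 2, order  # ceil(order / 2), order


def key_bounds(order):
    """(min, max) keys: always one less than the number of subtrees."""
    lo, hi = subtree_bounds(order)
    return lo - 1, hi - 1


for m in (3, 4, 5, 6):
    print(m, subtree_bounds(m), key_bounds(m))
# 3 (2, 3) (1, 2)
# 4 (2, 4) (1, 3)
# 5 (3, 5) (2, 4)
# 6 (3, 6) (2, 5)
```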
Get the Order Right
Keys are 4
Subtrees Max is 5 = Order is 5
Minimum = 3 (which is subtrees)
Min Keys is 2
[Figure: nodes and entry counts: (16, 21, 42, 57) 4; (45, 52) 2; (63, 74, 85, 97) 4]
2-3 Tree
Order 3 ….. So how many keys in a node
This rule is valid for non root leaf
Root can have 0, 2, 3 subtrees
2 -3 Tree
[Figure: 2-3 tree with nodes (42), (16), (57, 78), (45, 52), (63), (85, 97)]
2-3-4 Tree
Order 4 ….. So how many keys in a node
This rule is valid for non root leaf
Root can have 0, 2, 3 or 4 subtrees
Structure of B + tree
Non leaf node
  firstPtr
  numEntries
  Entries[1.. M-1]
End
Entry
  key
  rightPtr
End Entry
Leaf node
  firstPtr
  numEntries
  Entries[1.. M-1]
  Next Leaf Node
End
B + Tree
[Figure: B+ tree with entry counts: root (42) 1; (57, 78) 2; leaves (45, 52) 2, (63, 74) 2, (85, 97) 2]
Implies there are more nodes
B * Tree
Space Usage
BTREE nodes can be 50% Empty (1/2)
So rule modified to two third (2/3)
Also, when a node overflows, instead of being split immediately its entries are redistributed with siblings
And even when a split happens, the entries are distributed equally among all siblings (pg 462)
B+-trees
All the keys in the nodes are dummies
Only the keys in the leaves point to “real” data
Linking the leaves
  Ability to scan the collection in order without passing through the higher nodes
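The benefit of linked leaves can be sketched as below. It assumes each leaf keeps a reference to the next leaf (the "Next Leaf Node" field of the B+ structure); the `Leaf` class and `scan` function are illustrative names, and the index levels above the leaves are left out on purpose.

```python
class Leaf:
    def __init__(self, keys, next_leaf=None):
        self.keys = keys
        self.next_leaf = next_leaf  # link to the leaf holding the next larger keys


def scan(first_leaf):
    """Yield all keys in order by walking the leaf chain only."""
    leaf = first_leaf
    while leaf is not None:
        yield from leaf.keys
        leaf = leaf.next_leaf


third = Leaf([85, 97])
second = Leaf([63, 74], third)
first = Leaf([45, 52], second)
print(list(scan(first)))  # [45, 52, 63, 74, 85, 97]
```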
Reference
My Course
Forouzan : Chapter 10
Knuth, Volume 3 : 5.4.9 (Disks), 6.2.4 (Multiway)
Action Item
Do research on BTREE, AVL, Red Black