Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
CSE 5311 Notes 2: Binary Search Trees
(Last updated 5/27/13 8:37 AM) ROTATIONS
A
B
C
a b c d
Single right rotation at B(AKA rotating edge BA)Single left rotation at B
(AKA rotating edge BC)
B
C
c d
A
a
b
B
A
ba
C
c
d
A
B
C
D
E
F
G
A
B
C
D
E
F
G
Double right rotation at F
What two single rotations are equivalent?
2 (BOTTOM-UP) RED-BLACK TREES A red-black tree is a binary search tree whose height is logarithmic in the number of keys stored. 1. Every node is colored red or black. (Colors are only examined during insertion and deletion) 2. Every “leaf” (the sentinel) is colored black. 3. Both children of a red node are black. 4. Every simple path from a child of node X to a leaf has the same number of black nodes. This number is known as the black-height of X (bh(X)). Example:
= red = black
10
20
30
40
50
60
70
80
90
100
110
120
130
140
150
160
170
0 0 0 0
0 0
0 0 0 0
0 0
0 0
0 0 0 0
1 1
1
1 1
1
1
1 1
1
2
2
2
2
32
3
Observations: 1. A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If a node X is not a leaf and its sibling is a leaf, then X must be red. 3. There may be many ways to color a binary search tree to make it a red-black tree. 4. If the root is colored red, then it may be switched to black without violating structural properties.
3 Insertion 1. Start with unbalanced insert of a “data leaf” (both children are the sentinel). 2. Color of new node is _________. 3. May violate structural property 3. Leads to three cases, along with symmetric versions. The x pointer points at a red node whose parent might also be red. Case 1: (in 2320 Notes 12, using Sedgewick’s book, this is done top-down before attaching new leaf)
= red = black
A
A’
C
B
x
a b
c dk
k
k
k
k+1
A
A’
C
B
x
a b
c dk
k+1
k k
k+1
x
Case 2:
= red = black
A
B
D
C
x
b c
d ek
k
k
k-1
k+1
a
D
C
d e
k
k-1
k+1
B
Ax
ba
k
k
c
4 Case 3:
= red = black
D
C
d e
k
k-1
k+1
B
Ax
ba
k
k
c
A
B
ba
k
k
k+1
C
D
d e
k-1
k
c
Example:
= red = black
40
20 60
10 30 50 70 Insert 15
40
20 60
10 30 50 70
15x
40
20 60
10 30 50 70
15
x1
Insert 13
40
20 60
10 30 50 70
15
13
2
x
40
20 60
10 30 50 70
13
15x
3
40
20 60
13 30 50 70
1510
5 Insert 75
1
40
20 60
13 30 50 70
1510 75x
40
20 60
13 30 50 70
1510 75
x
Insert 14
1
40
20 60
13 30 50 70
1510 75
14x
40
20 60
13 30 50 70
1510 75
14
x
1
40
20 60
13 30 50 70
1510 75
14
x Reset to black
Example:
140
40
20
10 30
100
60 120
50 110 13080
70 90
150
160
170
Insert 75
140
40
20
10 30
100
60 120
50 110 13080
70 90
150
160
170
75x
140
40
20
10 30
100
60 120
50 110 13080
70 90
150
160
170
75
x
1
6 140
40
20
10 30
100
60 120
50 110 13080
70 90
150
160
170
75
x
12
140
40
20
10 30
100
60
120
50
110 130
80
70 90
150
160
170
75
x
3
40
20
10 30
100
60
50 80
70 90
75
140
150
160
170
120
110 130
Deletion Start with one of the unbalanced deletion cases:
1. Deleted node is a “data leaf”.
a. Splice around to sentinel. b. Color of deleted node? Red ⇒ Done Black ⇒ Set “double black” pointer at sentinel. Determine which of four rebalancing cases applies.
2. Deleted node is parent of one “data leaf”.
a. Splice around to “data leaf” b. Color of deleted node? Red ⇒ Not possible Black ⇒ “data leaf” must be red. Change its color to black.
3. Node with key-to-delete is parent of two “data nodes”.
a. “Steal” key and data from successor (but not the color). b. “Delete” successor using the appropriate one of the previous two cases.
7 Case 1:
= red = black
A
B
Dx
b
c d e
k-2
k
k
k+1
a
A
C E
f
k-1 k-1
D
B
x
b c d
e
k-1
k
k
k+1
E
CA
fk-1k-2
a Case 2:
= red = 0 = black = 1
A
B
Dx
b
c d e
k-2
k
k-1
k+color(B)
a
A
C E
f
k-2 k-2
= either
A
B
D
x
b
c d e
k-2
k-1
k-1
k+color(B)
a
A
C E
f
k-2 k-2
Case 3:
= red = 0 = black = 1
A
B
Dx
b
c d e
k-2
k
k-1
k+color(B)
a
A
C E
f
k-1 k-2
= either
A
B
D
x
b c
d
e
k-2
k
k-1
k+color(B)
a
A C
E
f
k-1
k-2
8 Case 4:
= red = 0 = black = 1
A
B
Dx
b
c d e
k-2+color(C)
k+color(C)
k-1+color(C)
k+color(B)+color(C)
a
A
C E
f
k-1 k-1+color(C)
= either
A
B
D
b
e
k+color(D)+color(C)
a
A
E
f
c d
Ck-1k-2
+color(C)
k-1+color(C)
k+color(C)
k-1+color(C)
(At most three rotations occur while processing the deletion of one key) Example:
40
20 60
10 30 50 70
120
100 140
90 110 130 150
80
Delete 50
40
20 60
10 30 70
120
100 140
90 110 130 150
80
x
40
20 60
10 30 70
120
100 140
90 110 130 150
80
x
2
9
2 40
20 60
10 30 70
120
100 140
90 110 130 150
80
x
2 40
20 60
10 30 70
120
100 140
90 110 130 150
80x
If x reaches the root, then done. Only place in tree where this happens. Delete 60
40
20
10 30
70
120
100 140
90 110 130 150
80
Delete 70
40
20
10 30
120
100 140
90 110 130 150
80
x
1
40
20
10
30
120
100 140
90 110 130 150
80
x
2
40
20
10
30
120
100 140
90 110 130 150
80
x
If x reaches a red node, then change color to black and done.
10 Delete 10
40
20
30
120
100 140
90 110 130 150
80
x
3
40
20
30
120
100 140
90 110 130 150
80
x
4
40
30 120
100 140
90 110 130 150
80
20
Delete 40
30 120
100 140
90 110 130 150
80
20x
2 30 120
100 140
90 110 130 150
80
20
x
1
30
120
100
140
90 110
130 150
80
20
x
2
30
120
100
140
90 110
130 150
80
20
x
Delete 120
30 100
140
90 110
130
150
80
20
x2
30 100
140
90 110
130
150
80
20
x
3
30
100 140
90
110
130
15080
20
x 4
140110
130
150
30
100
90
80
20
Delete 100
140
130
150
30 90
80
20
4
110
x
140
15030 90
80
20
110
130
11 AVL TREES An AVL tree is a binary search tree whose height is logarithmic in the number of keys stored. 1. Each node stores the difference of the heights (known as the balance factor) of the right and left
subtrees rooted by the children:
heightright - heightleft
50+1
20+1
80+1
100
40-1
60+1
100+1
300
700
900
120-1
1100
2. A balance factor must be +1, 0, -1 (leans right, “balanced”, leans left). 3. An insertion is implemented by:
a. Attaching a leaf b. Rippling changes to balance factor:
1. Right child ripple Parent.Bal = 0 ⇒ +1 and ripple to parent Parent.Bal = -1 ⇒ 0 to complete insertion Parent.Bal = +1 ⇒ +2 and ROTATION to complete insertion 2. Left child ripple Parent.Bal = 0 ⇒ -1 and ripple to parent Parent.Bal = +1 ⇒ 0 to complete insertion Parent.Bal = -1 ⇒ -2 and ROTATION to complete insertion
12 4. Rotations
a. Single (LL) - right rotation at D
D-2
B-1
Ah C
h
Eh
B0
Ah
D0
Eh
Ch
Rest ofTree
Rest ofTree
Restores height of subtree to pre-insertion number of levels RR case is symmetric
b. Double (LR)
B+1
F-2
D-1
Rest ofTree
Ch-1
Gh
Ah
Eh-1
B0
F+1
D0
Rest ofTree
Gh
Ah
Eh-1
Ch-1
Insert on eithersubtree
Restores height of subtree to pre-insertion number of levels RL case is symmetric
13 Deletion - Still have RR, RL, LL, and LR, but two addditional (symmetric) cases arise. Suppose 70 is deleted from this tree. Either LL or LR may be applied.
60
30 80
7010
20 40
50
0 0
0
0
+1 -1
-1
-1
Fibonacci Trees - special case of AVL trees exhibiting two worst-case behaviors - 1. Maximally skewed. (max height is roughly log1.618 n =1.44 lg n, expected height is lg n +.25) 2. θ(log n) rotations for a single deletion.
0
(empty)
1
(empty)
2 3 4
5 6
7
14 TREAPS (CLRS, p. 333) Hybrid of BST and min-heap ideas
Gives code that is clearer than RB or AVL (but comparable to skip lists) Expected height of tree is logarithmic (2.5 lg n)
Keys are used as in BST Tree also has min-heap property based on each node having a priority: Randomized priority - generated when a new key is inserted Virtual priority - computed (when needed) using a function similar to a hash function
104
193
1213
1814
1522
2315
275
3116
341
3817
416
5321
5750
622
6512
6720
717
7518
778
829
8811
Asides: the first published such hybrid were the cartesian trees of J. Vuillemin, “A Unifying Look
at Data Structures”, C. ACM 23 (4), April 1980, 229-239. A more complete explanation appears in E.M. McCreight, “Priority Search Trees”, SIAM J. Computing 14 (2), May 1985, 257-276 and chapter 10 of M. de Berg et.al. These are also used in the elegant implementation in M.A. Babenko and T.A. Starikovskaya, “Computing Longest Common Substrings” in E.A. Hirsch, Computer Science - Theory and Applications, LNCS 5010, 2008, 64-75.
Insertion Insert as leaf Generate random priority (large range to minimize duplicates) Single rotations to fix min-heap property
15 Example: Insert 16 with a priority of 2
104
193
1213
1814
1522
2315
275
3116
341
3817
416
5321
5750
622
6512
6720
717
7518
778
829
8811
162
After rotations:
104
193
1213
1814
1522
2315
275
3116
341
3817
416
5321
5750
622
6512
6720
717
7518
778
829
8811
162
Deletion Find node and change priority to ∞ Rotate to bring up child with lower priority. Continue until min-heap property holds. Remove leaf.
16 Delete key 2:
21
12
43
34
55
2∞
12
43
34
55
12
2∞
43
34
55
12
43
2∞
34
55
12
43
2∞
34
55
12
43
34
55
AUGMENTING DATA STRUCTURES Read CLRS, section 14.1 on using RB tree with ranking information for order statistics. Retrieving an element with a given rank Determine the rank of an element Problem: Maintain summary information to support an aggregate operation on the k smallest (or largest)
keys in O(log n) time. Example: Prefix Sum Given a key, determine the sum of all keys ≤ given key (prefix sum). Solution: Store sum of all keys in a subtree at the root of the subtree.
1 20
15 172
10 19
3 9
2 2 4 4
26 137
30 3020 81
16 16 24 45
21 21
Key Prefix Sum 1 1 2 3 3 6 4 10 10 20 15 35 16 51 20 71 21 92 24 116 26 142 30 172
To compute prefix sum for a key: Initialize sum to 0 Search for key, modifying total as search progresses: Search goes left - leave total alone Search goes right or key has been found - add present node’s key and left child’s sum to
total
Key is 24: (15 + 20) + (20 + 16) + (24 + 21) = 116 Key is 10: (1 + 0) + (10 + 9) = 20 Key is 16: (15 + 20) + (16 + 0) = 51 Variation: Determine the smallest key that has a prefix sum ≥ a specified value.
Updates to tree:
Non-structural (attach/remove node) - modify node and every ancestor Single rotation (for prefix sum)
D ΣD
B ΣB E ΣE
A ΣA C ΣC
B ΣD
A ΣA D ΣC+ΣE+D
C ΣC E ΣE
(Similar for double rotation)
General case - see CLRS 14.2, especially “Theorem” 14.1 Interval trees (CLRS 14.3) - a more significant application Set of (closed) intervals [low, high] - low is the key, but duplicates are allowed Each subtree root contains the max value appearing in any interval in that subtree Aggregate operation to support - find any interval that overlaps a given interval [low’, high’] Modify BST search . . .
18 if ptr == nil no interval in tree overlaps [low’, high’] if high’ ≥ ptr->low and ptr->high ≥ low’ return ptr as an answer if ptr->left != nil and ptr->left->max ≥ low’ ptr := ptr->left else ptr := ptr->right Updates to tree - similar to prefix sum, but replace additions with maximums