Upload
others
View
5
Download
0
Embed Size (px)
Citation preview
Algorithms in Java, 4th Edition · Robert Sedgewick and Kevin Wayne · Copyright © 2009 · January 22, 2010 10:21:42 PM
3.2 Binary Search Trees
‣ BSTs ‣ ordered operations‣ deletion
Definition. A BST is a binary tree in symmetric order.
A binary tree is either:
• Empty.
• Two disjoint binary trees (left and right).
Symmetric order. Each node has a key, and every node’s key is:
• Larger than all keys in its left subtree.
• Smaller than all keys in its right subtree.
2
Binary search trees
right childof root
a left link
a subtree
root
null links
Anatomy of a binary tree
valueassociated
with R
parent of A and R
left linkof E
keys smaller than E keys larger than E
key
AC
E
HR
SX
9
Anatomy of a binary search tree
Java definition. A BST is a reference to a root Node.
A Node is comprised of four fields:
• A Key and a Value.
• A reference to the left and right subtree.
3
BST representation in Java
smaller keys larger keys
private class Node{ private Key key; private Value val; private Node left, right; public Node(Key key, Value val) { this.key = key; this.val = val; }}
Key and Value are generic types; Key is Comparable
Binary search tree
BST with smaller keys BST with larger keys
key
left right
val
BST
Node
public class BST<Key extends Comparable<Key>, Value>{ private Node root;
private class Node { /* see previous slide */ } public void put(Key key, Value val) { /* see next slides */ }
public Value get(Key key) { /* see next slides */ }
public void delete(Key key) { /* see next slides */ }
public Iterable<Key> iterator() { /* see next slides */ }
}
4
BST implementation (skeleton)
root of BST
Get. Return value corresponding to given key, or null if no such key.
5
BST search
R is less than Sso look to the left
black nodes couldmatch the search key
gray nodes cannotmatch the search key
found R(search hit)
so return value
R is greater than Eso look to the right
AC
E
HM
R
SX
AC
E
HM
R
SX
AC
E
HM
R
SX
T is less than Xso look to the left
link is nullso T is not in tree
(search miss)
T is greater than Sso look to the right
AC
E
HM
R
SX
AC
E
HM
R
SX
Successful (left) and unsuccessful (right ) search in a BST
successful search for R unsuccessful search for T
Get. Return value corresponding to given key, or null if no such key.
Running time. Proportional to depth of node.
6
BST search: Java implementation
public Value get(Key key) { Node x = root; while (x != null) { int cmp = key.compareTo(x.key); if (cmp < 0) x = x.left; else if (cmp > 0) x = x.right; else if (cmp == 0) return x.val; } return null; }
Put. Associate value with key.
Search for key, then two cases:
• Key in tree ⇒ reset value.
• Key not in tree ⇒ add new node.
7
BST insert
search for L endsat this null link
reset linkson the way up
create new node
AC
E
HM
P
R
SX
AC
E
H
L
MP
R
SX
AC
E
H
LM
P
R
SX
Insertion into a BST
inserting L
Put. Associate value with key.
Running time. Proportional to depth of node.8
BST insert: Java implementation
public void put(Key key, Value val) { root = put(root, key, val); }
private Node put(Node x, Key key, Value val) { if (x == null) return new Node(key, val); int cmp = key.compareTo(x.key); if (cmp < 0) x.left = put(x.left, key, val); else if (cmp > 0) x.right = put(x.right, key, val); else if (cmp == 0) x.val = val; return x; }
concise, but tricky, recursive code;read carefully!
9
BST trace: standard indexing client
S
AC
E
HR
SX
AC
E
HR
S
AC
E
HR
S
AC
ER
S
AE
R
AE
S
S
ES
S
6
S 0
E 1
A 2
R 3
C 4
H 5
E 6
X 7
red nodesare new
black nodesare accessed
in search
changedvalue
changedvalue
changedvalue
gray nodesare untouched
AC
E
HM
P
R
SX
AC
E
HM
R
SX
AC
E
HR
SX
AC
E
H
LM
P
R
SX
AC
E
H
LM
P
R
SX12
8
A 8
M 9
P 10
L 11
E 12
BST trace for standard indexing client
key value key value
• Many BSTs correspond to same set of keys.
• Cost of search/insert is proportional to depth of node.
Remark. Tree shape depends on order of insertion.
10
Tree shape
A
H
SR
X
CE
XS
RC
E
H
A
AC
E
HR
SX
BST possibilities
best case
typical case
worst case
A
H
SR
X
CE
XS
RC
E
H
A
AC
E
HR
SX
BST possibilities
best case
typical case
worst case
A
H
SR
X
CE
XS
RC
E
H
A
AC
E
HR
SX
BST possibilities
best case
typical case
worst case
Observation. If keys inserted in random order, tree stays relatively flat.
11
BST insertion: random order
12
BST insertion: random order visualization
Ex. Insert keys in random order.
13
Correspondence between BSTs and quicksort partitioning
Remark. Correspondence is 1-1 if no duplicate keys.
A
C
E
I
K
L
M
O
P
Q
R
S
T
U
UE
14
BSTs: mathematical analysis
Proposition. If keys are inserted in random order, the expected number of compares for a search/insert is ~ 2 ln N.
Pf. 1-1 correspondence with quicksort partitioning.
Proposition. [Reed, 2003] If keys are inserted in random order,expected height of tree is ~ 4.311 ln N.
But… Worst-case for search/insert/height is N.(exponentially small chance when keys are inserted in random order)
15
ST implementations: summary
implementationguaranteeguarantee average caseaverage case ordered
ops?operations
on keysimplementation
search insert search hit insert
orderedops?
operationson keys
sequential search(unordered list) N N N/2 N no equals()
binary search(ordered array) lg N N lg N N/2 yes compareTo()
BST N N 1.39 lg N 1.39 lg N ? compareTo()
Costs for java FrequencyCounter 8 < tale.txt using BST
20
13
0
16
‣ BSTs‣ ordered operations‣ deletion
Minimum. Smallest key in table.Maximum. Largest key in table.
Q. How to find the min / max.
Minimum and maximum
17
Examples of BST order queries
AC
E
HM
R
SX
min()max()max
min
Floor. Largest key ≤ to a given key.Ceiling. Smallest key ≥ to a given key.
Q. How to find the floor /ceiling.
Floor and ceiling
18
Examples of BST order queries
AC
E
HM
R
SX
min()max()
floor(D)
ceiling(Q)
floor(G)
Case 1. [k equals the key at root]The floor of k is k.
Case 2. [k is less than the key at root]The floor of k is in the left subtree.
Case 3. [k is greater than the key at root]The floor of k is in the right subtree(if there is any key ≤ k in right subtree);otherwise it is the key in the root.
Computing the floor
19
floor(G)in leftsubtree is null
result
!nding floor(G)
G is greater than E so floor(G) could be
on the right
G is less than S so floor(G) must be
on the left
AC
E
HM
R
SX
AC
E
HM
R
SX
AC
E
HM
R
SX
AC
E
HM
R
SX
Computing the "oor function
Computing the floor
20
floor(G)in leftsubtree is null
result
!nding floor(G)
G is greater than E so floor(G) could be
on the right
G is less than S so floor(G) must be
on the left
AC
E
HM
R
SX
AC
E
HM
R
SX
AC
E
HM
R
SX
AC
E
HM
R
SX
Computing the "oor function
public Key floor(Key key){ Node x = floor(root, key); if (x == null) return null; return x.key;}private Node floor(Node x, Key key){ if (x == null) return null; int cmp = key.compareTo(x.key);
if (cmp == 0) return x;
if (cmp < 0) return floor(x.left, key);
Node t = floor(x.right, key); if (t != null) return t; else return x;
}
In each node, we store the number of nodes in the subtree rooted at that node.To implement size(), return the count at the root.
Remark. This facilitates efficient implementation of rank() and select().21
Subtree counts
A
A C E H M R S X
C
E
HM
R
SX
A
A C E H M R S X
CE
HM
R
SX
2
6
5
8
8
1
1
1
1
1 1
3
2
22
2
node count N
Two BSTs that representthe same set of keys
A
A C E H M R S X
C
E
HM
R
SX
A
A C E H M R S X
CE
HM
R
SX
2
6
5
8
8
1
1
1
1
1 1
3
2
22
2
node count N
Two BSTs that representthe same set of keys
public int size() { return size(root); }
private int size(Node x) { if (x == null) return 0; return x.N; }
22
BST implementation: subtree counts
private class Node{ private Key key; private Value val; private Node left; private Node right; private int N;}
private Node put(Node x, Key key, Value val) { if (x == null) return new Node(key, val); int cmp = key.compareTo(x.key); if (cmp < 0) x.left = put(x.left, key, val); else if (cmp > 0) x.right = put(x.right, key, val); else if (cmp == 0) x.val = val; x.N = 1 + size(x.left) + size(x.right); return x; }
nodes in subtree
23
Rank
Rank. How many keys < k?
Easy recursive algorithm (4 cases!)
public int rank(Key key) { return rank(key, root); }
private int rank(Key key, Node x) { if (x == null) return 0; int cmp = key.compareTo(x.key); if (cmp < 0) return rank(key, x.left); else if (cmp > 0) return 1 + size(x.left) + rank(key, x.right); else return size(x.left); }
A
A C E H M R S X
C
E
HM
R
SX
A
A C E H M R S X
CE
HM
R
SX
2
6
5
8
8
1
1
1
1
1 1
3
2
22
2
node count N
Two BSTs that representthe same set of keys
• Traverse left subtree.
• Enqueue key.
• Traverse right subtree.
Property. Inorder traversal of a BST yields keys in ascending order.
key
key
val
BST with smaller keys
smaller keys, in order larger keys, in order
all keys, in order
BST with larger keys
left right
BST
Inorder traversal
24
public Iterable<Key> keys() { Queue<Key> q = new Queue<Key>(); inorder(root, queue); return q;}
private void inorder(Node x, Queue<Key> q) { if (x == null) return; inorder(x.left, q); q.enqueue(x.key); inorder(x.right, q); }
• Traverse left subtree.
• Enqueue key.
• Traverse right subtree.
Inorder traversal
25
function call stack
inorder(S) inorder(E) inorder(A) enqueue A inorder(C) enqueue C enqueue E inorder(R) inorder(H) enqueue H inorder(M) enqueue M print R enqueue S inorder(X) enqueue X
A C E H
M R S X
SS ES E A
S E A C
S E RS E R H
S E R H M
S X
queuerecursive calls
A
A C E H M R S X
C
E
HM
R
SX
26
BST: ordered symbol table operations summary
sequentialsearch
binarysearch
BST
search
insert
min / max
floor / ceiling
rank
select
ordered iteration
N lg N h
1 N h
N 1 h
N lg N h
N lg N h
N 1 h
N log N N N
h = height of BST(proportional to log N
if keys inserted in random order)
worst-case running time of ordered symbol table operations
27
‣ BSTs‣ ordered operations‣ deletion
28
ST implementations: summary
Next. Deletion in BSTs.
implementation
guaranteeguarantee average caseaverage caseaverage caseordered
iteration?operations
on keysimplementation
search insert delete search hit
insert delete
orderediteration?
operationson keys
sequential search(linked list)
N N N N/2 N N/2 no equals()
binary search(ordered array)
lg N N N lg N N/2 N/2 yes compareTo()
BST N N N 1.39 lg N 1.39 lg N ??? yes compareTo()
29
BST deletion: lazy approach
To remove a node with a given key:
• Set its value to null.
• Leave key in tree to guide searches (but don't consider it equal to search key).
Cost. O(log N') per insert, search, and delete (if keys in random order),where N' is the number of key-value pairs ever inserted in the BST.
Unsatisfactory solution. Tombstone overload.
delete I
S
E
C
A
N
RH
I
S
E
C
A
N
RH
☠ tombstone
To delete the minimum key:
• Go left until finding a node with a null left link.
• Replace that node by its right link.
• Update subtree counts.
30
Deleting the minimum
public void deleteMin() { root = deleteMin(root); }
private Node deleteMin(Node x) { if (x.left == null) return x.right; x.left = deleteMin(x.left); x.N = 1 + size(x.left) + size(x.right); return x; }
go left untilreaching null
left link
return thatnode’s right link
available forgarbage collection
5
7
update links and countsafter recursive calls
AC
E
HM
R
SX
AC
E
HM
R
SX
CE
HM
R
SX
Deleting the minimum in a BST
node to delete
replace withnull link
available forgarbage
collection
update counts afterrecursive calls
5
1
7
AC
E
HM
C
R
SX
AE
HM
R
SX
AE
HM
R
SX
deleting C
To delete a node with key k: search for node t containing key k.
Case 0. [0 children] Delete t by setting parent link to null.
31
Hibbard deletion
To delete a node with key k: search for node t containing key k.
Case 1. [1 child] Delete t by replacing parent link.
32
Hibbard deletion
node to deletereplace with
child link available forgarbage
collection
AC C C
E
HM
R
R
SX
AE
HM
SX
AE
HM
SX
deleting Rupdate counts after
recursive calls
5
7
To delete a node with key k: search for node t containing key k.
Case 2. [2 children]
• Find successor x of t.
• Delete the minimum in t's right subtree.
• Put x in t's spot.
33
Hibbard deletion
x has no left child
but don't garbage collect x
still a BST
search for key E
node to delete
deleteMin(t.right)
t
5
7
x
successor min(t.right)
t.left
x
update links andnode counts after
recursive calls
AC
E
HM
R
SX
AC
E
HM
R
SX
AC
H
AC
H
MR
MR
SX
ES
X
deleting E
Deletion in a BST
go right, thengo left until
reaching nullleft link
search for key E
node to delete
deleteMin(t.right)
t
5
7
x
successor min(t.right)
t.left
x
update links andnode counts after
recursive calls
AC
E
HM
R
SX
AC
E
HM
R
SX
AC
H
AC
H
MR
MR
SX
ES
X
deleting E
Deletion in a BST
go right, thengo left until
reaching nullleft link
34
Hibbard deletion: Java implementation
public void delete(Key key) { root = delete(root, key); }
private Node delete(Node x, Key key) { if (x == null) return null; int cmp = key.compareTo(x.key); if (cmp < 0) x.left = delete(x.left, key); else if (cmp > 0) x.right = delete(x.right, key); else { if (x.right == null) return x.left;
Node t = x; x = min(t.right); x.right = deleteMin(t.right); x.left = t.left; } x.N = size(x.left) + size(x.right) + 1; return x; }
no right child
replace with successor
search for key
update subtree counts
35
Hibbard deletion: analysis
Unsatisfactory solution. Not symmetric.
Surprising consequence. Trees not random (!) ⇒ sqrt(N) per op.Longstanding open problem. Simple and efficient delete for BSTs.
Next lecture. Guarantee logarithmic performance for all operations.36
ST implementations: summary
implementation
guaranteeguarantee average caseaverage caseaverage caseordered
iteration?operations
on keysimplementation
search insert deletesearch
hit insert delete
orderediteration?
operationson keys
sequential search(linked list)
N N N N/2 N N/2 no equals()
binary search(ordered array)
lg N N N lg N N/2 N/2 yes compareTo()
BST N N N 1.39 lg N 1.39 lg N √N yes compareTo()
other operations also become √Nif deletions allowed