35
CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued Aaron Bauer Winter 2014 Winter 2014 CSE373: Data Structures & Algorithms 1

CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

  • Upload
    kane

  • View
    47

  • Download
    0

Embed Size (px)

DESCRIPTION

CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued. Aaron Bauer Winter 2014. Announcements. HW2 out, due beginning of class Wednesday Two TA sessions next week Asymptotic analysis on Tuesday AVL Trees on Thursday MLK day Monday. Previously on CSE 373. - PowerPoint PPT Presentation

Citation preview

Page 1: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

CSE373: Data Structures & Algorithms 1

CSE373: Data Structures & Algorithms

Lecture 6: Binary Search Trees continued

Aaron BauerWinter 2014

Winter 2014

Page 2: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

CSE373: Data Structures & Algorithms 2

Announcements

• HW2 out, due beginning of class Wednesday• Two TA sessions next week

– Asymptotic analysis on Tuesday– AVL Trees on Thursday

• MLK day Monday

Winter 2014

Page 3: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

CSE373: Data Structures & Algorithms 3

Previously on CSE 373

• Dictionary ADT– stores (key, value) pairs– find, insert, delete

• Trees– terminology– traversals

Winter 2014

Page 4: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

More on traversals

void inOrderTraversal(Node t){ if(t != null) { inOrderTraversal(t.left); process(t.element); inOrderTraversal(t.right); }}

Sometimes order doesn’t matter• Example: sum all elements

Sometimes order matters• Example: print tree with parent above

indented children (pre-order)• Example: evaluate an expression tree

(post-order)

A

B

D

E

C

F

G

A

B

D E

C

F G

Winter 2014 4CSE373: Data Structures & Algorithms

A

B

D E

C

F G

✓ ✓

✓ ✓

Page 5: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Binary Search Tree

4

121062

115

8

14

13

7 9

• Structure property (“binary”)– Each node has 2 children– Result: keeps operations simple

• Order property– All keys in left subtree smaller

than node’s key– All keys in right subtree larger

than node’s key– Result: easy to find any given key

Winter 2014 5CSE373: Data Structures & Algorithms

Page 6: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Are these BSTs?

3

1171

84

5

4

181062

115

8

20

21

7

15

Winter 2014 6CSE373: Data Structures & Algorithms

Page 7: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Are these BSTs?

3

1171

84

5

4

181062

115

8

20

21

7

15

Winter 2014 7CSE373: Data Structures & Algorithms

Page 8: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Find in BST, Recursive

2092

155

12

307 1710

Data find(Key key, Node root){ if(root == null) return null; if(key < root.key) return find(key,root.left); if(key > root.key) return find(key,root.right); return root.data;}

Winter 2014 8CSE373: Data Structures & Algorithms

Page 9: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Find in BST, Iterative

2092

155

12

307 1710

Data find(Key key, Node root){ while(root != null && root.key != key) { if(key < root.key) root = root.left; else(key > root.key) root = root.right; } if(root == null) return null; return root.data;}

Winter 2014 9CSE373: Data Structures & Algorithms

Page 10: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Other “Finding” Operations

• Find minimum node– “the liberal algorithm”

• Find maximum node– “the conservative algorithm”

• Find predecessor• Find successor

2092

155

12

307 1710

Winter 2014 10CSE373: Data Structures & Algorithms

Page 11: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Insert in BST

2092

155

12

307 17

insert(13)insert(8)insert(31)

(New) insertions happen only at leaves – easy!10

8 31

13

Winter 2014 11CSE373: Data Structures & Algorithms

Page 12: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Deletion in BST

2092

155

12

307 17

Why might deletion be harder than insertion?

10

Winter 2014 12CSE373: Data Structures & Algorithms

Page 13: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Deletion

• Removing an item disrupts the tree structure

• Basic idea: find the node to be removed, then “fix” the tree so that it is still a binary search tree

• Three cases:– Node has no children (leaf)– Node has one child– Node has two children

Winter 2014 13CSE373: Data Structures & Algorithms

Page 14: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Deletion – The Leaf Case

2092

155

12

307 17

delete(17)

10

Winter 2014 14CSE373: Data Structures & Algorithms

Page 15: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Deletion – The One Child Case

2092

155

12

307 10

Winter 2014 15CSE373: Data Structures & Algorithms

delete(15)

Page 16: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Deletion – The Two Child Case

3092

205

12

7

What can we replace 5 with?

10

Winter 2014 16CSE373: Data Structures & Algorithms

delete(5)

Page 17: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Deletion – The Two Child Case

Idea: Replace the deleted node with a value guaranteed to be between the two child subtrees

Options:• successor from right subtree: findMin(node.right)• predecessor from left subtree: findMax(node.left)

– These are the easy cases of predecessor/successor

Now delete the original node containing successor or predecessor• Leaf or one child case – easy cases of delete!

Winter 2014 17CSE373: Data Structures & Algorithms

Page 18: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Lazy Deletion

• Lazy deletion can work well for a BST– Simpler– Can do “real deletions” later as a batch– Some inserts can just “undelete” a tree node

• But– Can waste space and slow down find operations– Make some operations more complicated:

• How would you change findMin and findMax?

Winter 2014 18CSE373: Data Structures & Algorithms

Page 19: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

BuildTree for BST• Let’s consider buildTree

– Insert all, starting from an empty tree

• Insert keys 1, 2, 3, 4, 5, 6, 7, 8, 9 into an empty BST

– If inserted in given order, what is the tree?

– What big-O runtime for this kind of sorted input?

– Is inserting in the reverse order

any better?

1

2

3

O(n2)Not a happy place

Winter 2014 19CSE373: Data Structures & Algorithms

Page 20: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

BuildTree for BST• Insert keys 1, 2, 3, 4, 5, 6, 7, 8, 9 into an empty BST

• What we if could somehow re-arrange them– median first, then left median, right median, etc.– 5, 3, 7, 2, 1, 4, 8, 6, 9

– What tree does that give us?

– What big-O runtime?

842

73

5

9

6

1

O(n log n), definitely better

Winter 2014 20CSE373: Data Structures & Algorithms

Page 21: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Unbalanced BST

• Balancing a tree at build time is insufficient, as sequences of operations can eventually transform that carefully balanced tree into the dreaded list

• At that point, everything isO(n) and nobody is happy– find– insert– delete

1

2

3

Winter 2014 21CSE373: Data Structures & Algorithms

Page 22: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Balanced BST

Observation• BST: the shallower the better!• For a BST with n nodes inserted in arbitrary order

– Average height is O(log n) – see text for proof– Worst case height is O(n)

• Simple cases, such as inserting in key order, lead tothe worst-case scenario

Solution: Require a Balance Condition that

1. Ensures depth is always O(log n) – strong enough!2. Is efficient to maintain – not too strong!

Winter 2014 22CSE373: Data Structures & Algorithms

Page 23: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Potential Balance Conditions

1. Left and right subtrees of the roothave equal number of nodes

2. Left and right subtrees of the roothave equal height

Too weak!Height mismatch example:

Too weak!Double chain example:

Winter 2014 23CSE373: Data Structures & Algorithms

Page 24: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

Potential Balance Conditions

3. Left and right subtrees of every nodehave equal number of nodes

4. Left and right subtrees of every nodehave equal height

Too strong!Only perfect trees (2n – 1 nodes)

Too strong!Only perfect trees (2n – 1 nodes)

Winter 2014 24CSE373: Data Structures & Algorithms

Page 25: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

25

The AVL Balance Condition

Left and right subtrees of every node

have heights differing by at most 1

Definition: balance(node) = height(node.left) – height(node.right)

AVL property: for every node x, –1 balance(x) 1

• Ensures small depth– Will prove this by showing that an AVL tree of height

h must have a number of nodes exponential in h

• Efficient to maintain– Using single and double rotations

Winter 2014 CSE373: Data Structures & Algorithms

Page 26: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

CSE373: Data Structures & Algorithms 26

The AVL Tree Data Structure

4

131062

115

8

14127 9

Structural properties

1. Binary tree property

2. Balance property:balance of every node isbetween -1 and 1

Result:

Worst-case depth isO(log n)

Ordering property– Same as for BST

15

Winter 2014

Page 27: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

27CSE373: Data Structures & Algorithms

111

84

6

10 12

70

0 0

0

1

1

2

3

An AVL tree?

Winter 2014

Page 28: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

28CSE373: Data Structures & Algorithms

3

1171

84

6

2

5

0

0 0 0

1

1

2

3

4

An AVL tree?

Winter 2014

Page 29: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

29CSE373: Data Structures & Algorithms

The shallowness bound

Let S(h) = the minimum number of nodes in an AVL tree of height h– If we can prove that S(h) grows exponentially in h, then a tree

with n nodes has a logarithmic height

• Step 1: Define S(h) inductively using AVL property– S(-1)=0, S(0)=1, S(1)=2– For h 1, S(h) = 1+S(h-1)+S(h-2)

• Step 2: Show this recurrence grows really fast– Can prove for all h, S(h) > h – 1 where

is the golden ratio, (1+5)/2, about 1.62– Growing faster than 1.6h is “plenty exponential”

• It does not grow faster than 2h

h-1h-2

h

Winter 2014

Page 30: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

30CSE373: Data Structures & Algorithms

Before we prove it

• Good intuition from plots comparing:– S(h) computed directly from the definition– ((1+5)/2) h

• S(h) is always bigger, up to trees with huge numbers of nodes– Graphs aren’t proofs, so let’s prove it

Winter 2014

Page 31: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

31CSE373: Data Structures & Algorithms

The Golden Ratio

62.12

51

This is a special number

• Aside: Since the Renaissance, many artists and architects have proportioned their work (e.g., length:height) to approximate the golden ratio: If (a+b)/a = a/b, then a = b

• We will need one special arithmetic fact about : 2= ((1+51/2)/2)2

= (1 + 2*51/2 + 5)/4 = (6 + 2*51/2)/4 = (3 + 51/2)/2

= 1 + (1 + 51/2)/2 = 1 +

Winter 2014

Page 32: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

32CSE373: Data Structures & Algorithms

The proof

Theorem: For all h 0, S(h) > h – 1

Proof: By induction on h

Base cases:

S(0) = 1 > 0 – 1 = 0 S(1) = 2 > 1 – 1 0.62

Inductive case (k > 1):

Show S(k+1) > k+1 – 1 assuming S(k) > k – 1 and S(k-1) > k-1 – 1

S(k+1) = 1 + S(k) + S(k-1) by definition of S

> 1 + k – 1 + k-1 – 1 by induction

= k + k-1 – 1 by arithmetic (1-1=0)

= k-1 ( + 1) – 1 by arithmetic (factor k-1 )

= k-1 2 – 1 by special property of = k+1 – 1 by arithmetic (add exponents)

Winter 2014

S(-1)=0, S(0)=1, S(1)=2For h 1, S(h) = 1+S(h-1)+S(h-2)

Page 33: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

33CSE373: Data Structures & Algorithms

Good news

Proof means that if we have an AVL tree, then find is O(log n)– Recall logarithms of different bases > 1 differ by only a

constant factor

But as we insert and delete elements, we need to:

1. Track balance

2. Detect imbalance

3. Restore balance

Winter 2014

Is this AVL tree balanced?How about after insert(30)?

92

5

10

7

15

20

Page 34: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

CSE373: Data Structures & Algorithms 34

An AVL Tree

20

92 15

5

10

30

177

0

0 0

011

2 2

3 …

3

value

height

children

Track height at all times!

10 key

Winter 2014

Page 35: CSE373: Data Structures & Algorithms Lecture 6: Binary Search Trees continued

35CSE373: Data Structures & Algorithms

AVL tree operations

• AVL find: – Same as BST find

• AVL insert: – First BST insert, then check balance and potentially “fix”

the AVL tree– Four different imbalance cases

• AVL delete: – The “easy way” is lazy deletion– Otherwise, do the deletion and then have several imbalance

cases (we will likely skip this but post slides for those interested)

Winter 2014