18
CSE 5311 Notes 2: Binary Search Trees (Last updated 5/27/13 8:37 AM) ROTATIONS A B C a b c d Single right rotation at B (AKA rotating edge BA) Single left rotation at B (AKA rotating edge BC) B C c d A a b B A b a C c d A B C D E F G A B C D E F G Double right rotation at F What two single rotations are equivalent?

CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

CSE 5311 Notes 2: Binary Search Trees

(Last updated 5/27/13 8:37 AM) ROTATIONS

A

B

C

a b c d

Single right rotation at B(AKA rotating edge BA)Single left rotation at B

(AKA rotating edge BC)

B

C

c d

A

a

b

B

A

ba

C

c

d

A

B

C

D

E

F

G

A

B

C

D

E

F

G

Double right rotation at F

What two single rotations are equivalent?

Page 2: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

2 (BOTTOM-UP) RED-BLACK TREES A red-black tree is a binary search tree whose height is logarithmic in the number of keys stored. 1. Every node is colored red or black. (Colors are only examined during insertion and deletion) 2. Every “leaf” (the sentinel) is colored black. 3. Both children of a red node are black. 4. Every simple path from a child of node X to a leaf has the same number of black nodes. This number is known as the black-height of X (bh(X)). Example:

= red = black

10

20

30

40

50

60

70

80

90

100

110

120

130

140

150

160

170

0 0 0 0

0 0

0 0 0 0

0 0

0 0

0 0 0 0

1 1

1

1 1

1

1

1 1

1

2

2

2

2

32

3

Observations: 1. A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If a node X is not a leaf and its sibling is a leaf, then X must be red. 3. There may be many ways to color a binary search tree to make it a red-black tree. 4. If the root is colored red, then it may be switched to black without violating structural properties.

Page 3: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

3 Insertion 1. Start with unbalanced insert of a “data leaf” (both children are the sentinel). 2. Color of new node is _________. 3. May violate structural property 3. Leads to three cases, along with symmetric versions. The x pointer points at a red node whose parent might also be red. Case 1: (in 2320 Notes 12, using Sedgewick’s book, this is done top-down before attaching new leaf)

= red = black

A

A’

C

B

x

a b

c dk

k

k

k

k+1

A

A’

C

B

x

a b

c dk

k+1

k k

k+1

x

Case 2:

= red = black

A

B

D

C

x

b c

d ek

k

k

k-1

k+1

a

D

C

d e

k

k-1

k+1

B

Ax

ba

k

k

c

Page 4: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

4 Case 3:

= red = black

D

C

d e

k

k-1

k+1

B

Ax

ba

k

k

c

A

B

ba

k

k

k+1

C

D

d e

k-1

k

c

Example:

= red = black

40

20 60

10 30 50 70 Insert 15

40

20 60

10 30 50 70

15x

40

20 60

10 30 50 70

15

x1

Insert 13

40

20 60

10 30 50 70

15

13

2

x

40

20 60

10 30 50 70

13

15x

3

40

20 60

13 30 50 70

1510

Page 5: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

5 Insert 75

1

40

20 60

13 30 50 70

1510 75x

40

20 60

13 30 50 70

1510 75

x

Insert 14

1

40

20 60

13 30 50 70

1510 75

14x

40

20 60

13 30 50 70

1510 75

14

x

1

40

20 60

13 30 50 70

1510 75

14

x Reset to black

Example:

140

40

20

10 30

100

60 120

50 110 13080

70 90

150

160

170

Insert 75

140

40

20

10 30

100

60 120

50 110 13080

70 90

150

160

170

75x

140

40

20

10 30

100

60 120

50 110 13080

70 90

150

160

170

75

x

1

Page 6: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

6 140

40

20

10 30

100

60 120

50 110 13080

70 90

150

160

170

75

x

12

140

40

20

10 30

100

60

120

50

110 130

80

70 90

150

160

170

75

x

3

40

20

10 30

100

60

50 80

70 90

75

140

150

160

170

120

110 130

Deletion Start with one of the unbalanced deletion cases:

1. Deleted node is a “data leaf”.

a. Splice around to sentinel. b. Color of deleted node? Red ⇒ Done Black ⇒ Set “double black” pointer at sentinel. Determine which of four rebalancing cases applies.

2. Deleted node is parent of one “data leaf”.

a. Splice around to “data leaf” b. Color of deleted node? Red ⇒ Not possible Black ⇒ “data leaf” must be red. Change its color to black.

3. Node with key-to-delete is parent of two “data nodes”.

a. “Steal” key and data from successor (but not the color). b. “Delete” successor using the appropriate one of the previous two cases.

Page 7: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

7 Case 1:

= red = black

A

B

Dx

b

c d e

k-2

k

k

k+1

a

A

C E

f

k-1 k-1

D

B

x

b c d

e

k-1

k

k

k+1

E

CA

fk-1k-2

a Case 2:

= red = 0 = black = 1

A

B

Dx

b

c d e

k-2

k

k-1

k+color(B)

a

A

C E

f

k-2 k-2

= either

A

B

D

x

b

c d e

k-2

k-1

k-1

k+color(B)

a

A

C E

f

k-2 k-2

Case 3:

= red = 0 = black = 1

A

B

Dx

b

c d e

k-2

k

k-1

k+color(B)

a

A

C E

f

k-1 k-2

= either

A

B

D

x

b c

d

e

k-2

k

k-1

k+color(B)

a

A C

E

f

k-1

k-2

Page 8: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

8 Case 4:

= red = 0 = black = 1

A

B

Dx

b

c d e

k-2+color(C)

k+color(C)

k-1+color(C)

k+color(B)+color(C)

a

A

C E

f

k-1 k-1+color(C)

= either

A

B

D

b

e

k+color(D)+color(C)

a

A

E

f

c d

Ck-1k-2

+color(C)

k-1+color(C)

k+color(C)

k-1+color(C)

(At most three rotations occur while processing the deletion of one key) Example:

40

20 60

10 30 50 70

120

100 140

90 110 130 150

80

Delete 50

40

20 60

10 30 70

120

100 140

90 110 130 150

80

x

40

20 60

10 30 70

120

100 140

90 110 130 150

80

x

2

Page 9: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

9

2 40

20 60

10 30 70

120

100 140

90 110 130 150

80

x

2 40

20 60

10 30 70

120

100 140

90 110 130 150

80x

If x reaches the root, then done. Only place in tree where this happens. Delete 60

40

20

10 30

70

120

100 140

90 110 130 150

80

Delete 70

40

20

10 30

120

100 140

90 110 130 150

80

x

1

40

20

10

30

120

100 140

90 110 130 150

80

x

2

40

20

10

30

120

100 140

90 110 130 150

80

x

If x reaches a red node, then change color to black and done.

Page 10: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

10 Delete 10

40

20

30

120

100 140

90 110 130 150

80

x

3

40

20

30

120

100 140

90 110 130 150

80

x

4

40

30 120

100 140

90 110 130 150

80

20

Delete 40

30 120

100 140

90 110 130 150

80

20x

2 30 120

100 140

90 110 130 150

80

20

x

1

30

120

100

140

90 110

130 150

80

20

x

2

30

120

100

140

90 110

130 150

80

20

x

Delete 120

30 100

140

90 110

130

150

80

20

x2

30 100

140

90 110

130

150

80

20

x

3

30

100 140

90

110

130

15080

20

x 4

140110

130

150

30

100

90

80

20

Delete 100

140

130

150

30 90

80

20

4

110

x

140

15030 90

80

20

110

130

Page 11: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

11 AVL TREES An AVL tree is a binary search tree whose height is logarithmic in the number of keys stored. 1. Each node stores the difference of the heights (known as the balance factor) of the right and left

subtrees rooted by the children:

heightright - heightleft

50+1

20+1

80+1

100

40-1

60+1

100+1

300

700

900

120-1

1100

2. A balance factor must be +1, 0, -1 (leans right, “balanced”, leans left). 3. An insertion is implemented by:

a. Attaching a leaf b. Rippling changes to balance factor:

1. Right child ripple Parent.Bal = 0 ⇒ +1 and ripple to parent Parent.Bal = -1 ⇒ 0 to complete insertion Parent.Bal = +1 ⇒ +2 and ROTATION to complete insertion 2. Left child ripple Parent.Bal = 0 ⇒ -1 and ripple to parent Parent.Bal = +1 ⇒ 0 to complete insertion Parent.Bal = -1 ⇒ -2 and ROTATION to complete insertion

Page 12: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

12 4. Rotations

a. Single (LL) - right rotation at D

D-2

B-1

Ah C

h

Eh

B0

Ah

D0

Eh

Ch

Rest ofTree

Rest ofTree

Restores height of subtree to pre-insertion number of levels RR case is symmetric

b. Double (LR)

B+1

F-2

D-1

Rest ofTree

Ch-1

Gh

Ah

Eh-1

B0

F+1

D0

Rest ofTree

Gh

Ah

Eh-1

Ch-1

Insert on eithersubtree

Restores height of subtree to pre-insertion number of levels RL case is symmetric

Page 13: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

13 Deletion - Still have RR, RL, LL, and LR, but two addditional (symmetric) cases arise. Suppose 70 is deleted from this tree. Either LL or LR may be applied.

60

30 80

7010

20 40

50

0 0

0

0

+1 -1

-1

-1

Fibonacci Trees - special case of AVL trees exhibiting two worst-case behaviors - 1. Maximally skewed. (max height is roughly log1.618 n =1.44 lg n, expected height is lg n +.25) 2. θ(log n) rotations for a single deletion.

0

(empty)

1

(empty)

2 3 4

5 6

7

Page 14: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

14 TREAPS (CLRS, p. 333) Hybrid of BST and min-heap ideas

Gives code that is clearer than RB or AVL (but comparable to skip lists) Expected height of tree is logarithmic (2.5 lg n)

Keys are used as in BST Tree also has min-heap property based on each node having a priority: Randomized priority - generated when a new key is inserted Virtual priority - computed (when needed) using a function similar to a hash function

104

193

1213

1814

1522

2315

275

3116

341

3817

416

5321

5750

622

6512

6720

717

7518

778

829

8811

Asides: the first published such hybrid were the cartesian trees of J. Vuillemin, “A Unifying Look

at Data Structures”, C. ACM 23 (4), April 1980, 229-239. A more complete explanation appears in E.M. McCreight, “Priority Search Trees”, SIAM J. Computing 14 (2), May 1985, 257-276 and chapter 10 of M. de Berg et.al. These are also used in the elegant implementation in M.A. Babenko and T.A. Starikovskaya, “Computing Longest Common Substrings” in E.A. Hirsch, Computer Science - Theory and Applications, LNCS 5010, 2008, 64-75.

Insertion Insert as leaf Generate random priority (large range to minimize duplicates) Single rotations to fix min-heap property

Page 15: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

15 Example: Insert 16 with a priority of 2

104

193

1213

1814

1522

2315

275

3116

341

3817

416

5321

5750

622

6512

6720

717

7518

778

829

8811

162

After rotations:

104

193

1213

1814

1522

2315

275

3116

341

3817

416

5321

5750

622

6512

6720

717

7518

778

829

8811

162

Deletion Find node and change priority to ∞ Rotate to bring up child with lower priority. Continue until min-heap property holds. Remove leaf.

Page 16: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

16 Delete key 2:

21

12

43

34

55

2∞

12

43

34

55

12

2∞

43

34

55

12

43

2∞

34

55

12

43

2∞

34

55

12

43

34

55

AUGMENTING DATA STRUCTURES Read CLRS, section 14.1 on using RB tree with ranking information for order statistics. Retrieving an element with a given rank Determine the rank of an element Problem: Maintain summary information to support an aggregate operation on the k smallest (or largest)

keys in O(log n) time. Example: Prefix Sum Given a key, determine the sum of all keys ≤ given key (prefix sum). Solution: Store sum of all keys in a subtree at the root of the subtree.

1 20

15 172

10 19

3 9

2 2 4 4

26 137

30 3020 81

16 16 24 45

21 21

Key Prefix Sum 1 1 2 3 3 6 4 10 10 20 15 35 16 51 20 71 21 92 24 116 26 142 30 172

Page 17: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

To compute prefix sum for a key: Initialize sum to 0 Search for key, modifying total as search progresses: Search goes left - leave total alone Search goes right or key has been found - add present node’s key and left child’s sum to

total

Key is 24: (15 + 20) + (20 + 16) + (24 + 21) = 116 Key is 10: (1 + 0) + (10 + 9) = 20 Key is 16: (15 + 20) + (16 + 0) = 51 Variation: Determine the smallest key that has a prefix sum ≥ a specified value.

Updates to tree:

Non-structural (attach/remove node) - modify node and every ancestor Single rotation (for prefix sum)

D ΣD

B ΣB E ΣE

A ΣA C ΣC

B ΣD

A ΣA D ΣC+ΣE+D

C ΣC E ΣE

(Similar for double rotation)

General case - see CLRS 14.2, especially “Theorem” 14.1 Interval trees (CLRS 14.3) - a more significant application Set of (closed) intervals [low, high] - low is the key, but duplicates are allowed Each subtree root contains the max value appearing in any interval in that subtree Aggregate operation to support - find any interval that overlaps a given interval [low’, high’] Modify BST search . . .

Page 18: CSE 5311 Notes 2: Binary Search Trees - Rangerranger.uta.edu/~weems/NOTES5311/notes02.pdf · A red-black tree with n internal nodes (“keys”) has height at most 2 lg(n+1). 2. If

18 if ptr == nil no interval in tree overlaps [low’, high’] if high’ ≥ ptr->low and ptr->high ≥ low’ return ptr as an answer if ptr->left != nil and ptr->left->max ≥ low’ ptr := ptr->left else ptr := ptr->right Updates to tree - similar to prefix sum, but replace additions with maximums