20
Week 13: Balanced Tree Tae-Hyuk (Ted) Ahn Department of Computer Science Program of Bioinformatics and Computational Biology Saint Louis University CSCI 2100 Data Structures Fall 2019

csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

  • Upload
    others

  • View
    8

  • Download
    0

Embed Size (px)

Citation preview

Page 1: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

Week 13: Balanced Tree

Tae-Hyuk (Ted) AhnDepartment of Computer Science

Program of Bioinformatics and Computational BiologySaint Louis University

CSCI 2100 Data Structures Fall 2019

Page 2: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

2CSCI 2100

Learning Objectives● Discuss the balanced property of a binary search tree

Page 3: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

3CSCI 21003

Parts of a binary tree● A binary tree is composed of zero or more nodes

● Each node contains:§ A value (some sort of data item)§ A reference or pointer to a left child (may be null), and§ A reference or pointer to a right child (may be null)

● A binary tree may be empty (contain no nodes)

● If not empty, a binary tree has a root node§ Every node in the binary tree is reachable from the root node by a unique path

● A node with neither a left child nor a right child is called a leaf§ In some binary trees, only the leaves contain a value

Page 4: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

4CSCI 2100

Types of Binary Tree

● A Binary Tree is Full Binary Tree if every node has 0 or 2 children.

● Following are examples of full binary tree.

● We can also say a full binary tree is a binary tree in which all nodes except leaves have two children.

● In a Full Binary, number of leaf nodes is number of internal nodes plus 1 (L = I + 1) where L = Number of leaf nodes and I = Number of internal nodes.

Page 5: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

5CSCI 2100

Types of Binary Tree

● A Binary Tree is Complete Binary Tree if all levels are completely filled except possibly the last level and the last level has all keys as left as possible

● A complete binary tree is one where all the levels are full with exception to the last level and it is filled from left to right.

Page 6: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

6CSCI 2100

Types of Binary Tree

● A Binary tree is Perfect Binary Tree in which all internal nodes have two children and all leaves are at same level.

● Following are examples of Perfect Binary Trees.

Page 7: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

7CSCI 2100

Recall Binary Search TreeBinary Search Tree is a node-based binary tree data structure which has the following properties:

● The left subtree of a node contains only nodes with keys lesser than the node’s key.

● The right subtree of a node contains only nodes with keys greater than the node’s key.

● The left and right subtree each must also be a binary search tree.

Page 8: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

8CSCI 2100

Binary Search Tree

• Stored keys must satisfy the binary search treeproperty.– " y in left subtree of x, then

key[y] £ key[x].– " y in right subtree of x,

then key[y] ³ key[x].

56

26 200

18 28 190 213

12 24 27

Page 9: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

9CSCI 2100

Balance● A balanced tree is a tree in which difference between heights of sub-trees of any

node in the tree is not greater than one.

Page 10: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

10CSCI 2100

Binary Search Tree - Best Time• All BST operations are O(d), where d is tree depth

› What is the best case tree?› What is the worst case tree?

• So, best case running time of BST operations is O(log N)

https://en.wikipedia.org/wiki/Binary_search_tree

Page 11: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

11CSCI 2100

Binary Search Tree - Worst Time

• Worst case running time is O(N) › What happens when you Insert elements in ascending order?• Insert: 2, 4, 6, 8, 10, 12 into the root.› Problem: Lack of “balance”: • compare depths of left and right subtree› Unbalanced degenerate tree

Page 12: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

12CSCI 2100

Balanced and unbalanced BST

4

2 5

1 3

1

5

2

4

3

7

6

4

2 6

5 71 3

Is this “balanced”?

Page 13: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

13CSCI 2100

Perfect Balance

• Want a complete tree after every operation› tree is full except possibly in the lower right

• This is expensive› For example, insert 2 in the tree on the left and then rebuild as a complete tree

Insert 2 &complete tree

6

4 9

81 5

5

2 8

6 91 4

Page 14: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

14CSCI 2100

Balanced binary tree● The disadvantage of a binary search tree is that its height can be as large as N-1● This means that the time needed to perform insertion and deletion and many other

operations can be O(N) in the worst case● We want a tree with small height● A binary tree with N node has height at least O(log N) ● Thus, our goal is to keep the height of a binary search tree O(log N)● Such trees are called balanced binary search trees. Examples are AVL tree, red-

black tree.

Page 15: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

15CSCI 2100

AVL - Good but not Perfect Balance

• AVL trees are height-balanced binary search trees

• Balance factor of a node› height(left subtree) - height(right subtree)

• An AVL tree has balance factor calculated at every node› For every node, heights of left and right subtree can differ by no more than 1› Store current heights in each node

Page 16: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

16CSCI 2100

Height of an AVL Tree

h-1 h-2

h

Page 17: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

17CSCI 2100

Height of an AVL Tree● Fact: The height of an AVL tree storing n keys is O(log n).● Proof: Let us bound n(h): the minimum number of internal nodes of an AVL tree of height h.● We easily see that n(1) = 1 and n(2) = 2

● For n > 2, an AVL tree of height h contains the root node, one AVL subtree of height n-1 and another of height n-2.

● That is, n(h) = 1 + n(h-1) + n(h-2)● Knowing n(h-1) > n(h-2), we get n(h) > 2n(h-2). So

n(h) > 2n(h-2), n(h) > 4n(h-4), n(h) > 8n(n-6), … (by induction),n(h) > 2in(h-2i)

● Solving the base case we get: n(h) > 2 h/2-1

● Taking logarithms: h < 2log n(h) +2● Thus the height of an AVL tree is O(log n)

3

4 n(1)

n(2)

Page 18: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

18CSCI 2100

Rotations

● When the tree structure changes (e.g., insertion or deletion), we need to transform the tree to restore the AVL tree property.

● This is done using single rotations or double rotations.

x

y

AB

C

y

x

A B C

Before Rotation After Rotation

e.g. Single Rotation

Page 19: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

19CSCI 2100

Rotations

● Since an insertion/deletion involves adding/deleting a single node, this can only increase/decrease the height of some subtree by 1

● Thus, if the AVL tree property is violated at a node x, it means that the heights of left(x) ad right(x) differ by exactly 2.

● Rotations will be applied to x to restore the AVL tree property.

Page 20: csci2100 week13 1 - biohpc.github.io · CSCI 2100 7 Recall Binary Search Tree Binary Search Tree is a node-based binary tree data structure which has the following properties: The

20CSCI 2100

ZyBooks

● 6.2 AVL Rotations