Trees!

Hash Tables, Binary Search Trees · cs.boisestate.edu/~scutchin/cs321/lectures/0313_trees_spring2019.…


Tree Definition

• A tree is a set of Nodes connected by Branches.

• A tree has exactly one Root node, which has no incoming branches.

• In a general tree a Node can have any number of child nodes.

• A child node with no outgoing branches is called a Leaf.

Nodes

Samples of Trees

[Figure: sample trees with nodes labeled A through I and 1 through 5, including a Nearly Complete Binary Tree and a Skewed Binary Tree.]

Tree Terms

• The number of branches touching a node is called its degree.

• The # of branches coming into a node is called its ‘indegree’.

• The # of branches leaving a node is called its ‘outdegree’.

• The root has indegree = ?


• The Parent of a node is the Node at the other end of its incoming (indegree) branch.

• A Node may only have one Parent Node.

• A Node is a Parent if it has at least one outdegree branch.

• Any node that is not a parent is a Leaf.

• A Node with a parent is a child.

• An internal node has a parent and children.

• All Leaves are children.


• A path is a sequence of nodes in which each node is connected to the next by a branch.

• Equivalently, a path is a sequence of nodes in which every consecutive pair is a parent and its child.

• The level of a node is the number of branches between it and the root.

• The root has level 0.


• Two nodes with same parent are siblings.

• An ancestor is any node in the path to the root.

• A descendant is any node in the path below the Node.


• The height of the tree is the length, in branches, of the longest path from the root to a leaf.

• A subtree is any connected structure below the root: a node together with all of its descendants.
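The terms above can be made concrete with a small sketch. The `Node` class and the helpers `is_leaf`, `level`, `height`, and `ancestors` are illustrative names, not part of the lecture. Height is counted in branches here, so a single-node tree has height 0.

```python
class Node:
    """A general tree node: one parent link and any number of children."""
    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

def is_leaf(node):
    """A leaf has outdegree 0."""
    return not node.children

def level(node):
    """Number of branches between the node and the root (the root is level 0)."""
    count = 0
    while node.parent is not None:
        node = node.parent
        count += 1
    return count

def height(node):
    """Length in branches of the longest downward path from node to a leaf."""
    if is_leaf(node):
        return 0
    return 1 + max(height(child) for child in node.children)

def ancestors(node):
    """Labels of all nodes on the path from node up to the root."""
    result = []
    while node.parent is not None:
        node = node.parent
        result.append(node.label)
    return result

# A small tree: A is the root, B and C are siblings, D is a child of B.
a = Node("A")
b = Node("B", a)
c = Node("C", a)
d = Node("D", b)
print(level(d), height(a), ancestors(d))
```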


Tree ADT

• Objects: any type of objects can be stored in a tree

• Methods:

• accessor methods

– root() – return the root of the tree

– parent(p) – return the parent of a node

– children(p) – returns the children of a node

• query methods

– size() – returns the number of nodes in the tree

– isEmpty() – returns true if the tree is empty

– elements() – returns all elements

– isRoot(p), isInternal(p), isExternal(p)
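One possible shape for this ADT is sketched below. The method names follow the slide in snake_case. The `add` method and the inner `_Node` class are assumptions added so the tree can be built, and `is_internal` is taken to mean "has at least one child".

```python
class Tree:
    """Minimal general tree; positions are plain node objects."""

    class _Node:
        def __init__(self, element, parent=None):
            self.element = element
            self.parent = parent
            self.children = []

    def __init__(self):
        self._root = None
        self._size = 0

    # accessor methods
    def root(self):
        return self._root

    def parent(self, p):
        return p.parent

    def children(self, p):
        return list(p.children)

    # query methods
    def size(self):
        return self._size

    def is_empty(self):
        return self._size == 0

    def elements(self):
        """All stored elements, each parent before its children."""
        result = []
        stack = [] if self._root is None else [self._root]
        while stack:
            node = stack.pop()
            result.append(node.element)
            stack.extend(reversed(node.children))
        return result

    def is_root(self, p):
        return p is self._root

    def is_internal(self, p):
        return len(p.children) > 0

    def is_external(self, p):
        return len(p.children) == 0

    # update method (assumed; not listed on the slide)
    def add(self, element, parent=None):
        """Add element under parent, or as the root when parent is None."""
        node = self._Node(element, parent)
        if parent is None:
            self._root = node
        else:
            parent.children.append(node)
        self._size += 1
        return node

t = Tree()
a = t.add("A")
b = t.add("B", a)
c = t.add("C", a)
d = t.add("D", b)
print(t.size(), t.elements())
```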

A Tree Node

• Every tree node:

– object – useful information

– children – pointers to its children nodes


Left Child - Right Sibling

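[Figure: a general tree drawn in left child - right sibling form. Its levels, top to bottom, hold A; B C D; E F G H I J; K L M. Each node stores three fields: data, left child, right sibling.]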
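A minimal sketch of the left child - right sibling layout follows. The field names mirror the three node fields named above. `LCRSNode`, `add_child`, and `children` are hypothetical names, and placing E under B is an assumption, since the branches of the figure do not survive in the text.

```python
class LCRSNode:
    """Left child - right sibling node: data plus two links."""
    def __init__(self, data):
        self.data = data
        self.left_child = None      # first (leftmost) child
        self.right_sibling = None   # next child of the same parent

def add_child(parent, child):
    """Append child as the last child of parent and return it."""
    if parent.left_child is None:
        parent.left_child = child
    else:
        last = parent.left_child
        while last.right_sibling is not None:
            last = last.right_sibling
        last.right_sibling = child
    return child

def children(parent):
    """Collect the data of all children by walking the sibling chain."""
    result = []
    node = parent.left_child
    while node is not None:
        result.append(node.data)
        node = node.right_sibling
    return result

a = LCRSNode("A")
b, c, d = (add_child(a, LCRSNode(x)) for x in "BCD")
e = add_child(b, LCRSNode("E"))   # assumed parent for illustration
print(children(a), children(b))
```

Any number of children fits in two links per node, at the cost of walking the sibling chain to reach the k-th child.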

Parenthetical Listing

A (B (C D) E F (G H I) )
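The listing above can be generated recursively: print a node, then its children inside parentheses. The sketch below assumes a tree is given as a `(label, children)` pair. The function names are illustrative, and the output omits the space before the final parenthesis.

```python
def leaf(label):
    """Build a node with no children."""
    return (label, [])

def parenthetical(tree):
    """Return the parenthetical listing of a (label, children) tree."""
    label, children = tree
    if not children:
        return label
    inner = " ".join(parenthetical(child) for child in children)
    return f"{label} ({inner})"

tree = ("A", [
    ("B", [leaf("C"), leaf("D")]),
    leaf("E"),
    ("F", [leaf("G"), leaf("H"), leaf("I")]),
])
print(parenthetical(tree))   # A (B (C D) E F (G H I))
```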

Binary Trees

• A binary tree is a tree in which no node can have more than two subtrees; the maximum outdegree for a node is two.

• In other words, a node can have zero, one, or two subtrees.

• These subtrees are designated as the left subtree and the right subtree.

• Also often simply: left child, right child.

Binary Tree Example

[Figure: an example binary tree with nodes labeled A through M.]

If a complete binary tree with n nodes (depth = $\lfloor \log_2 n \rfloor + 1$) is represented sequentially, then for any node with index i, 1 <= i <= n, we have:

parent(i) is at $\lfloor i/2 \rfloor$ if i != 1. If i = 1, i is at the root and has no parent.

leftChild(i) is at 2i if 2i <= n. If 2i > n, then i has no left child.

rightChild(i) is at 2i + 1 if 2i + 1 <= n. If 2i + 1 > n, then i has no right child.

What Structure have we used this for already?
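The index rules translate directly into code. This sketch keeps the 1-based indexing of the slide by leaving slot 0 of a Python list unused. The function names and the sample tree A through I are illustrative.

```python
def parent(i):
    """Index of the parent of node i, or None for the root (i = 1)."""
    return i // 2 if i != 1 else None

def left_child(i, n):
    """Index of the left child of node i in a tree of n nodes, or None."""
    return 2 * i if 2 * i <= n else None

def right_child(i, n):
    """Index of the right child of node i in a tree of n nodes, or None."""
    return 2 * i + 1 if 2 * i + 1 <= n else None

# Nodes A..I stored at indices 1..9; slot 0 is unused padding.
tree = [None] + list("ABCDEFGHI")
n = len(tree) - 1

print(tree[parent(5)], tree[left_child(2, n)], right_child(5, n))
```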

Binary Tree Structure

Binary Trees in Arrays

[Figure: binary trees stored sequentially in arrays with slots [1] to [9]. The nearly complete tree A through I fills every slot, while the skewed tree A, B, C, D, E leaves most slots empty (shown as --).]

(1) waste space

Efficient Linked Structure

• Node: data, left, right

• A null tree is a tree with no nodes.

Collection of Binary Trees

Properties of Binary Trees

• The height of binary trees can be mathematically determined

• Given that we need to store N nodes in a binary tree, the maximum height is

$H_{max} = N - 1$

A tree with maximum height is rare. It occurs when every node in the tree has at most one subtree.

• The minimum height of a binary tree is determined as follows:

$H_{min} = \lceil \log_2(N + 1) \rceil - 1$

For instance, if there are three nodes to be stored in the binary tree (N = 3), then $H_{min} = 1$.

• Given a height of the binary tree, H, the minimum number of nodes in the tree is given as follows:

$N_{min} = H + 1$

• The formula for the maximum number of nodes is derived from the fact that each node can have only two children. Given a height of the binary tree, H, the maximum number of nodes in the tree is given as follows:

$N_{max} = 2^{H+1} - 1$
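The four bounds can be checked numerically. The sketch below assumes height is counted in branches, so a single node has height 0. The function names are illustrative.

```python
import math

def max_height(n):
    """Tallest binary tree on n nodes: a chain."""
    return n - 1

def min_height(n):
    """Shortest binary tree on n nodes: every level as full as possible."""
    return math.ceil(math.log2(n + 1)) - 1

def min_nodes(h):
    """Fewest nodes in a binary tree of height h: one node per level."""
    return h + 1

def max_nodes(h):
    """Most nodes in a binary tree of height h: every level full."""
    return 2 ** (h + 1) - 1

for n in (1, 3, 7, 8):
    print(n, min_height(n), max_height(n))
```

The bounds are consistent with each other: for any N, the value N lies between `min_nodes(h)` and `max_nodes(h)` for h = `min_height(N)`.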

Properties of Binary Trees

• Any node in a tree can be accessed from the root by following exactly one branch path, the one that leads to the desired node.

• The nodes at level 1, which are children of the root, can be accessed by following only one branch; the nodes of level 2 of a tree can be accessed by following only two branches from the root, etc.

• The balance factor of a binary tree is the difference in height between its left and right subtrees:

$B = H_L - H_R$

[Figure: Balance of the tree. Example trees annotated with balance factors B = 0, B = 1, B = -1, B = -2, and B = 2.]

Properties of Binary Trees

• In a balanced binary tree (as defined by the Russian mathematicians Adelson-Velskii and Landis), the heights of the two subtrees differ by no more than one (the balance factor is -1, 0, or 1), and both subtrees are also balanced.
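The balance factor and the balance condition can be sketched as follows. The convention that an empty tree has height -1 is an assumption, chosen so that a single node has height 0. The class and function names are illustrative.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(node):
    """Height in branches; the empty (null) tree has height -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))

def balance_factor(node):
    """Left subtree height minus right subtree height."""
    return height(node.left) - height(node.right)

def is_balanced(node):
    """True if |B| <= 1 at this node and both subtrees are balanced."""
    if node is None:
        return True
    return (abs(balance_factor(node)) <= 1
            and is_balanced(node.left)
            and is_balanced(node.right))

balanced = Node(2, Node(1), Node(3))
left_chain = Node(3, Node(2, Node(1)))
print(balance_factor(balanced), balance_factor(left_chain))
```

This version recomputes heights at every node, which is simple but quadratic in the worst case. Balanced-tree implementations usually store the height in each node instead.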

Complete binary Trees

• A complete tree has the maximum number of entries for its height. The maximum number is reached when the last level is full.

• A tree is considered nearly complete if it has the minimum height for its nodes and all nodes in the last level are found on the left.

[Figure: a Complete Binary Tree with 15 nodes (A through O) and a Nearly Complete Binary Tree with 9 nodes (A through I).]

Binary Tree Traversal

• A binary tree traversal requires that each node of the tree be processed once and only once in a predetermined sequence.

• In a depth-first traversal, processing proceeds along a path from the root through one child to the most distant descendant of that first child before processing a second child.

PreOrder Traversal

InOrder Traversal

PostOrder Traversal

Breadth First Traversal

Infix Expression Tree

Infix Traversal = ? Traversal

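[Figure: expression tree with root +. The children of + are * and E; the children of that * are a second * and D; the children of the second * are / and C; the children of / are A and B.]

inorder traversal: A / B * C * D + E (infix expression)

preorder traversal: + * * / A B C D E (prefix expression)

postorder traversal: A B / C * D * E + (postfix expression)

level order traversal: + * E * D / C A B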
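The four traversals of this expression tree can be reproduced with a short sketch. The `Node` class and function names are illustrative. The recursive versions build lists for clarity rather than speed.

```python
from collections import deque

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def preorder(n):
    """Node, then left subtree, then right subtree."""
    return [] if n is None else [n.value] + preorder(n.left) + preorder(n.right)

def inorder(n):
    """Left subtree, then node, then right subtree."""
    return [] if n is None else inorder(n.left) + [n.value] + inorder(n.right)

def postorder(n):
    """Left subtree, then right subtree, then node."""
    return [] if n is None else postorder(n.left) + postorder(n.right) + [n.value]

def level_order(root):
    """Breadth-first: visit nodes level by level using a queue."""
    result = []
    queue = deque() if root is None else deque([root])
    while queue:
        n = queue.popleft()
        result.append(n.value)
        queue.extend(child for child in (n.left, n.right) if child is not None)
    return result

# The tree for ((((A / B) * C) * D) + E)
tree = Node("+",
            Node("*",
                 Node("*", Node("/", Node("A"), Node("B")), Node("C")),
                 Node("D")),
            Node("E"))

for traversal in (inorder, preorder, postorder, level_order):
    print(traversal.__name__, " ".join(traversal(tree)))
```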

Evaluation of Expressions

Tree Uses

• Unix / Windows file structure

Uses Of Trees!

• Games: Doom

• CG Modeling: Maya

• Video Compositing: Shake

Intermission

Binary Search Trees

http://www.visualcomplexity.com/vc/project.cfm?id=347

Taxonomy of Some Trees

General Trees – any number of children per node.

Binary Trees

Heaps, Binary Search Trees

Binary Search Trees

• Efficient data structure for storing data for later retrieval:

– Both Time and Space efficient.

• Why not just use an array?

– What can we say about space use in arrays?

• Why not just use a hash table?

– Space efficiency?

– Unbounded growth at minimal cost.

• Structure of Binary Search Tree

– Every element has a unique key.

– The keys in a left subtree (right subtree) are smaller (larger) than the key in the root of the subtree.

– The left and right subtrees are also binary search trees.

How is this different from a Heap?


A Binary Search Tree

• A Binary Tree where the data is organized in special ways:

1. The values of the nodes are sorted.

2. And the topology of the nodes is organized for efficiency.

• Just keeping the key values sorted leads to O(h) searches. h can be O(n) in this case.

• Keeping the topology organized (Balanced) leads to O(log n) searches.

Binary Search Trees

General Advantages of BST

• Fast searches for large data sets.

• Extremely space efficient – no wasted space.

• Algorithms are comparatively simple.

• Algorithms are ‘greedy’.

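[Figure: binary search on the sorted array 34 41 56 63 72 89 95 (indices 0 to 6). After comparing with the middle item 63, the search continues in 34 41 56 (indices 0 to 2) or in 72 89 95 (indices 4 to 6), and then in a single item.]

The binary search algorithm on an array of sorted items reduces the search space by one half after each comparison.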

Binary Search Algorithm
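For reference, a minimal iterative binary search over the sample array might look like this. The function name and the choice to return -1 for a missing item are assumptions.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1      # discard the left half
        else:
            high = mid - 1     # discard the right half
    return -1

data = [34, 41, 56, 63, 72, 89, 95]
print(binary_search(data, 63), binary_search(data, 50))
```

The first probe lands on index 3 (the value 63), which is also the root of the corresponding binary search tree.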

[Figure: binary search tree with root 63, children 41 and 89, and leaves 34, 56, 72, 95.]

• The values in all nodes in the left subtree of a node are less than the node value.

• The values in all nodes in the right subtree of a node are greater than the node value.

Node Organization Rule for BST

Binary Search Tree Methods

• Search(value) – find a value in the tree.

• Insert(value) – insert a value in the tree.

• Remove(value) – remove a value from the tree.

• What is the worst case performance for all of these? Hint: they are all the same!

Searching in the BST

method search(key)

• implements the binary search based on comparison of the items in the tree

• the items in the BST must be comparable (e.g. integers, strings, etc.)

The search starts at the root. It probes down, comparing the values in each node with the target, until it finds the first item equal to the target. It returns this item, or null if there is none.

BST method Search

if the tree is empty
    return NULL
else if the item in the node equals the target
    return the node value
else if the item in the node is greater than the target
    return the result of searching the left subtree
else if the item in the node is smaller than the target
    return the result of searching the right subtree

Search in BST - Pseudocode

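BSTNode search(BSTNode root, int key)
{
    if (root == null) return null;        // empty subtree: key is not in the tree
    if (key == root.key) return root;     // found the key
    if (key < root.key)
        return search(root.left, key);    // smaller keys are in the left subtree
    return search(root.right, key);       // larger keys are in the right subtree
}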
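The same search can be written without recursion. The sketch below is an iterative Python variant, not the lecture's own code. It assumes a `BSTNode` class with `key`, `left`, and `right` fields and builds the sample tree with root 63.

```python
class BSTNode:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def search(root, key):
    """Return the node holding key, or None; takes O(h) steps for height h."""
    node = root
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node

root = BSTNode(63,
               BSTNode(41, BSTNode(34), BSTNode(56)),
               BSTNode(89, BSTNode(72), BSTNode(95)))

found = search(root, 72)
print(found.key if found else None, search(root, 50))
```

Each loop iteration moves one level down, so the number of comparisons is bounded by the height of the tree plus one.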

Search in BST

What is the running time?

method insert(key)

• places a new item near the frontier of the BST while retaining its organization of data:

– starting at the root, it traverses down the tree until it finds a node whose left or right pointer is empty and is a logical place for the new value

• uses a binary search to locate the insertion point

• is based on comparisons of the new item and values of nodes in the BST

Elements in nodes must be comparable!

BST method: Insert

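[Figure: BST with root 7, children 5 and 9, and leaves 4, 6, 8.]

Case 1: The Tree is Empty

Set the root to a new node containing the item

Case 2: The Tree is Not Empty

Call a recursive helper method to insert the item

Example: inserting 10. 10 > 7, so go right; 10 > 9, so 10 becomes the right child of 9.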

if tree is empty
    create a root node with the new key
else
    compare key with the top node
    if key = node key
        replace the node with the new value // what is odd about this?
    else if key > node key
        compare key with the right subtree:
        if subtree is empty create a leaf node
        else add key in right subtree
    else (key < node key)
        compare key with the left subtree:
        if the subtree is empty create a leaf node
        else add key to the left subtree

Insertion in BST - Pseudocode

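BSTNode insert(BSTNode root, int key)
{
    BSTNode ptr = new BSTNode();          // new leaf to attach
    ptr.key = key; ptr.left = ptr.right = null;
    if (root == null) return ptr;         // empty tree: the new node becomes the root
    BSTNode ipoint = searchI(root, key);  // returns closest node: the future parent
    if (key < ipoint.key) ipoint.left = ptr;
    else ipoint.right = ptr;
    return root;                          // caller stores the result: root = insert(root, key)
}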

Insertion in BST Tree V1

BSTNode insert(BSTNode root, int key)
{
    if (root == null) {                        // empty subtree: create the leaf here
        BSTNode ptr = new BSTNode();
        ptr.key = key; ptr.left = ptr.right = null;
        return ptr;
    }
    if (key > root.key)
        root.right = insert(root.right, key);  // larger keys go right
    else if (key < root.key)
        root.left = insert(root.left, key);    // smaller keys go left
    return root;                               // equal keys leave the tree unchanged
}

Insertion in BST Tree V2
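For comparison, here is a runnable Python sketch of recursive insertion. It returns the subtree root so that an empty tree is handled uniformly. Ignoring duplicate keys is a design assumption. The keys are those of the example tree with root 7.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key and return the (possibly new) root of the subtree."""
    if root is None:
        return BSTNode(key)              # empty spot found: create the leaf
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root                          # duplicate keys are ignored

def inorder(root):
    """Keys in sorted order."""
    return [] if root is None else inorder(root.left) + [root.key] + inorder(root.right)

root = None
for key in [7, 5, 9, 4, 6, 8, 10]:
    root = insert(root, key)
print(inorder(root), root.right.right.key)
```

The key 10 ends up as the right child of 9, matching the comparisons 10 > 7 and 10 > 9.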

The order in which the data is supplied determines where each item is placed in the BST, which in turn determines the shape of the BST.

Create BSTs from the same set of data presented each time in a different order:

a) 17 4 14 19 15 7 9 3 16 10

b) 9 10 17 4 3 7 14 16 15 19

c) 19 17 16 15 14 10 9 7 4 3 can you guess this shape?

BST Shapes
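A quick way to compare the three shapes is to build each tree and measure its height. The sketch below uses a simple unbalanced insert and counts height in branches, with an empty tree counted as -1. All names are illustrative.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    """Plain BST insert without rebalancing; returns the subtree root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def height(root):
    """Height in branches; an empty tree counts as -1."""
    if root is None:
        return -1
    return 1 + max(height(root.left), height(root.right))

def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root

orders = {
    "a": [17, 4, 14, 19, 15, 7, 9, 3, 16, 10],
    "b": [9, 10, 17, 4, 3, 7, 14, 16, 15, 19],
    "c": [19, 17, 16, 15, 14, 10, 9, 7, 4, 3],
}
heights = {name: height(build(keys)) for name, keys in orders.items()}
print(heights)   # {'a': 5, 'b': 5, 'c': 9}
```

Order c) is sorted in descending order, so every node has only a left child and the tree reaches the maximum height N - 1 = 9.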

• removes a specified item from the BST and adjusts the tree

• uses a binary search to locate the target item:

– starting at the root, it probes down the tree until it finds the target or reaches a leaf node (target not in the tree)

• removal of a node must not leave a ‘gap’ in the tree.

BST Operations: Removal

method remove (key)
I  if the tree is empty return false
II Attempt to locate the node containing the target using the binary search algorithm
   if the target is not found return false
   else the target is found, so remove its node:
     Case 1: if the node has 2 empty subtrees
       - replace the link in the parent with null
     Case 2: if the node has a left and a right subtree
       - replace the node's value with the max value in the left subtree
       - delete the max node in the left subtree
     Case 3: if the node has no left child
       - link the parent of the node to the right (non-empty) subtree
     Case 4: if the node has no right child
       - link the parent of the target to the left (non-empty) subtree

Removal in BST - Pseudocode

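Case 1: removing a node with 2 EMPTY SUBTREES

Removing 4: replace the link in the parent with null.

[Figure: before, a BST with root 7, children 5 and 9, and leaves 4, 6, 8, 10; after, the same tree without node 4. The cursor and parent pointers mark the node being removed and its parent.]

Removal in BST: Example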

Case 2: removing a node with 2 SUBTREES

Removing 7:

- replace the node's value with the max value in the left subtree

- delete the max node in the left subtree

[Figure: before, a BST with root 7, children 5 and 9, and nodes 4, 6, 8, 10 below them; after, the root holds 6 and the old node 6 is gone. The cursor marks the root.]

Removal in BST: Example

What other element can be used as replacement?

Case 3: removing a node with 1 EMPTY SUBTREE

The node has no left child: link the parent of the node to the right (non-empty) subtree.

[Figure: before and after trees with nodes 5, 6, 7, 8, 9, 10; the cursor and parent pointers mark the node being removed and its parent.]

Removal in BST: Example

Case 4: removing a node with 1 EMPTY SUBTREE

Removing 5: the node has no right child, so link the parent of the node to the left (non-empty) subtree.

[Figure: before and after trees with nodes 4, 5, 7, 8, 9, 10; the cursor and parent pointers mark the node being removed and its parent.]

Removal in BST: Example

The complexity of the operations get, insert and remove in a BST is O(h), where h is the height.

This is O(log n) when the tree is balanced. The updating operations can cause the tree to become unbalanced.

The tree can degenerate to a linear shape, and the operations then become O(n).

Analysis of BST Operations

BST tree = new BST();
tree.insert("E");
tree.insert("C");
tree.insert("D");
tree.insert("A");
tree.insert("H");
tree.insert("F");
tree.insert("K");

Output:
>>>> Items in advantageous order:
K
H
F
E
D
C
A

Best Case

BST tree = new BST();
for (int i = 1; i <= 8; i++)
    tree.insert(i);

Output:
>>>> Items in worst order:
8
7
6
5
4
3
2
1

Worst Case

tree = new BST();
for (int i = 1; i <= 8; i++)
    tree.insert(random());

Output:
>>>> Items in random order:
X
U
P
O
H
F
B

Random Case

Applications for BST

• Sorting with binary search trees

– Input: unsorted array

– Output: sorted array

• Algorithm ?

• Running time ?
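One possible answer to the algorithm question: insert every item into a BST, then read the tree back with an inorder traversal. The sketch below uses a plain unbalanced BST, so it takes O(n log n) time on average but O(n^2) when the input is already sorted. Sending duplicates to the right is an assumption, and the helper names are illustrative.

```python
class Node:
    def __init__(self, value):
        self.value, self.left, self.right = value, None, None

def _insert(node, value):
    """Unbalanced BST insert; equal values go to the right subtree."""
    if node is None:
        return Node(value)
    if value < node.value:
        node.left = _insert(node.left, value)
    else:
        node.right = _insert(node.right, value)
    return node

def _inorder(node, out):
    """Append the values of the subtree to out in sorted order."""
    if node is not None:
        _inorder(node.left, out)
        out.append(node.value)
        _inorder(node.right, out)

def tree_sort(values):
    """Return a sorted copy of values using a binary search tree."""
    root = None
    for value in values:
        root = _insert(root, value)
    result = []
    _inorder(root, result)
    return result

print(tree_sort([17, 4, 14, 19, 15, 7, 9, 3, 16, 10]))
```

Because the helpers are recursive, a long already-sorted input would also hit Python's recursion limit, which is another symptom of the degenerate linear shape.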

Prevent the degeneration of the BST:

• A BST can be set up to maintain balance during updating operations (insertions and removals)

• Types of search trees which maintain the optimal performance:

– splay trees

– AVL trees

– 2-4 Trees

– Red-Black trees

– B-trees

Better Search Trees

Balanced Search Trees

• Keep all subtrees within some fixed balance, 0, 1, -1

– Point is to divide each search in half all the time.

• Provide a guaranteed search performance of O(log n).

• Increase the complexity of insert and delete.

• Primary Technique is ‘Rotation’.

Rotation in Binary Trees

Picture comes to us courtesy of Wikipedia.

Rotation Examples
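A rotation rearranges three links and keeps the inorder sequence of keys unchanged. The sketch below shows both directions on a plain node class. The names `rotate_right` and `rotate_left` are conventional but are assumptions here, since the slides show rotations only as pictures.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(y):
    """Lift y's left child x above y; returns x, the new subtree root."""
    x = y.left
    y.left = x.right      # x's right subtree moves under y
    x.right = y
    return x

def rotate_left(x):
    """Inverse of rotate_right: lift x's right child y above x."""
    y = x.right
    x.right = y.left      # y's left subtree moves under x
    y.left = x
    return y

def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)

chain = Node(3, Node(2, Node(1)))   # left-leaning chain 3 -> 2 -> 1
root = rotate_right(chain)
print(root.key, inorder(root))
```

After the rotation the root is 2 with children 1 and 3, so the height drops from 2 to 1 while the sorted order 1, 2, 3 is preserved.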


AVL Trees

[Figure: AVL tree with root 6, children 3 and 8, and node 4 below 3; v and z label nodes in the figure.]

Balanced Binary Search Tree

• A Balanced Binary Search Tree is a binary search tree with a height of Θ(log n), where n is the number of nodes in the tree.

• Is a Max-heap balanced?

AVL Tree

AVL Insertion: Two Steps

Potentially Breaks Balance!

AVL Insertion: Example

AVL Deletion: Two Steps

Potentially Breaks Balance!

AVL Deletion: Example

Height of AVL Trees


Running Times for AVL Trees

• a single restructure is O(1)

– using a linked-structure binary tree

• find is O(log n)

– height of tree is O(log n), no restructures needed

• insert is O(log n)

– initial find is O(log n)

– Restructuring up the tree, maintaining heights is O(log n)

• remove is O(log n)

– initial find is O(log n)

– Restructuring up the tree, maintaining heights is O(log n)
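To illustrate where these costs come from, here is a compact sketch of AVL insertion. The find walks down one path, and the rebalancing on the way back up performs O(1) work per level. It stores a height in each node and counts the empty tree as height -1. All names are illustrative, and this is one common formulation rather than the lecture's own code.

```python
class AVLNode:
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 0

def h(node):
    """Stored height of a subtree; the empty tree counts as -1."""
    return node.height if node else -1

def update(node):
    node.height = 1 + max(h(node.left), h(node.right))

def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    update(y)
    update(x)
    return x

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    update(x)
    update(y)
    return y

def rebalance(node):
    """Restore |balance| <= 1 at node with at most two rotations."""
    update(node)
    balance = h(node.left) - h(node.right)
    if balance > 1:                                    # left-heavy
        if h(node.left.left) < h(node.left.right):     # left-right case
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if balance < -1:                                   # right-heavy
        if h(node.right.right) < h(node.right.left):   # right-left case
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node

def insert(node, key):
    """BST insert followed by rebalancing on the way back up."""
    if node is None:
        return AVLNode(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return rebalance(node)

def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)

root = None
for key in range(1, 9):       # the worst-case order for a plain BST
    root = insert(root, key)
print(root.key, root.height, inorder(root))
```

Inserting 1 through 8 in order gives a plain BST of height 7, while this version ends with root 4 and height 3.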